wasm + ffmpeg: capturing video frames on the front end

Is it possible to process audio and video in a front-end page? For example: the user selects a video and can set any frame of it as the cover, without uploading the entire video to the backend for processing. After some exploration, I basically got this working; there is a complete demo: ffmpeg wasm video frame capture. It supports mp4/mov/mkv/avi and other files.

The basic idea is this: use a file input to let the user select a video file, read it as an ArrayBuffer, and pass it to the wasm build of ffmpeg for processing. The output RGB data is then drawn to a canvas, or converted to base64 and used as the src attribute of an img tag to form a picture. (A canvas can also grab frames by passing a video DOM element directly to drawImage, but the set of formats the video tag can play is fairly limited. This article focuses on the ffmpeg approach, because ffmpeg can do much more than this; frame capture is just one example.)

A fair question here: why use ffmpeg instead of writing this directly in JS? Because the mature multimedia-processing libraries are C libraries, ffmpeg being one of them, and it is open source; wasm can convert it into a format usable in a web page. There are few JS libraries for multimedia processing, the complexity of writing your own demultiplexer (demux) and video decoder can be imagined, and encoding/decoding directly in JS would also be slow. So we use what already exists.

The first step is compilation (if you are not interested in the compilation process, you can skip to step 2).

1. Compile ffmpeg to the wasm version

At first I thought this would be very difficult, but it turned out not to be that bad, because videoconverter.js had already done the conversion (it uses a compiled ffmpeg to transcode audio and video in the web page). The key is to disable some unneeded features when running configure, otherwise compilation reports syntax errors.

emsdk is used here to compile to wasm. Its installation tutorial explains the setup clearly: a script detects your system and downloads the matching prebuilt files. After installation there are several executables, including the emcc, em++ and emar commands. emcc is the C compiler, em++ is the C++ compiler, and emar packs multiple .o object files into one .a archive.
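For reference, a typical emsdk setup looks like the following (the repository URL and the "latest" version tag are assumptions on my part; follow the official tutorial for your platform):

```
git clone https://github.com/emscripten-core/emsdk.git
cd emsdk
./emsdk install latest     # downloads prebuilt toolchain binaries for your OS
./emsdk activate latest
source ./emsdk_env.sh      # puts emcc / em++ / emar on the PATH
emcc -v                    # verify the compiler runs
```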
First download the source code from ffmpeg's official website.

(1) configure

Unzip it, enter the directory, and then execute the following command:

```
emconfigure ./configure --cc="emcc" --enable-cross-compile --target-os=none --arch=x86_32 --cpu=generic \
    --disable-ffplay --disable-ffprobe --disable-asm --disable-doc --disable-devices --disable-pthreads \
    --disable-w32threads --disable-network --disable-hwaccels --disable-parsers --disable-bsfs \
    --disable-debug --disable-protocols --disable-indevs --disable-outdevs --enable-protocol=file
```

The usual role of configure is to generate a Makefile: the configure stage probes the build environment and parameters, then writes the resulting compile commands into the Makefile. The emconfigure wrapper in front mainly specifies the compiler as emcc, but by itself that is not enough, because ffmpeg has sub-modules and emconfigure cannot redirect all of their compilers to emcc. Fortunately, ffmpeg's configure lets you specify a custom compiler through the --cc parameter. On Mac the C compiler is normally /usr/bin/clang; here it is set to emcc.

The --disable options that follow turn off features that do not work under wasm. For example, --disable-asm disables the hand-written assembly code, whose syntax emcc cannot handle; without this flag the compiler reports syntax errors. --disable-hwaccels disables hardware decoding: some graphics cards can decode video directly instead of the application decoding it in software, and hardware decoding performs noticeably better, but it is unavailable here. Disabling it causes a warning later at runtime:

```
[swscaler @ 0x105c480] No accelerated colorspace conversion found from yuv420p to rgb24.
```

which does not affect the result. (Running configure reports a segmentation fault, but it has no effect on the subsequent steps.) Once the configure command finishes, the Makefile and related configuration files are generated.

(2) make

make is the stage that actually compiles. Execute:

```
emmake make
```

When run on Mac, you will find that at the end, when multiple .o files are assembled into .a archives, it fails:

```
AR libavdevice/libavdevice.a
fatal error: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ar: fatal error in /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib
```

To solve this, change the archiving command from ar to emar and remove the ranlib step by modifying the ffbuild/config.mak file:

```
# change ar to emar
- AR=ar
+ AR=emar
# remove ranlib
- RANLIB=ranlib
+ #RANLIB=ranlib
```

Then run make again. After compilation completes, an overall ffmpeg file is generated in the ffmpeg directory, and libavcodec.a and similar files are generated in the libavcodec and other subdirectories. These are the bitcode files we will use later; bitcode is an intermediate representation produced by the compiler. (At the very end, the strip -o ffmpeg ffmpeg_g command fails, but it does not matter: change the strip to cp ffmpeg_g ffmpeg.)

2. Use ffmpeg

ffmpeg is mainly organized into several lib directories:

- libavcodec: encoding and decoding
- libavformat: demultiplexing (demux) and multiplexing (mux)
- libswscale: image scaling and pixel format conversion

Take an mp4 file as an example. mp4 is a container format, so the libavformat API is used first to demultiplex it and find out where the audio and video sit inside the file. The video is generally encoded with h264 or similar, so libavcodec is needed to decode it into yuv-format images, which are finally converted to rgb format with the help of libswscale. A condensed sketch of this call sequence is shown below.
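This is only an illustration written against the current ffmpeg API, not the simple.c from the repo discussed next; the function name grab_first_frame and the trimmed error handling are mine:

```c
#include <libavformat/avformat.h>
#include <libavcodec/avcodec.h>
#include <libswscale/swscale.h>

// Decode the first video frame of `path` into RGB24
// (sketch; most error handling and cleanup is elided).
int grab_first_frame(const char *path) {
    AVFormatContext *fmt = NULL;
    // 1. libavformat: open the container and locate the video stream
    if (avformat_open_input(&fmt, path, NULL, NULL) < 0) return -1;
    if (avformat_find_stream_info(fmt, NULL) < 0) return -1;
    int vs = av_find_best_stream(fmt, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    if (vs < 0) return -1;

    // 2. libavcodec: open a decoder (h264 etc.) for that stream
    const AVCodec *codec =
        avcodec_find_decoder(fmt->streams[vs]->codecpar->codec_id);
    AVCodecContext *dec = avcodec_alloc_context3(codec);
    avcodec_parameters_to_context(dec, fmt->streams[vs]->codecpar);
    if (avcodec_open2(dec, codec, NULL) < 0) return -1;

    AVPacket pkt;
    AVFrame *yuv = av_frame_alloc(), *rgb = av_frame_alloc();
    while (av_read_frame(fmt, &pkt) >= 0) {           // demux one packet
        if (pkt.stream_index == vs &&
            avcodec_send_packet(dec, &pkt) >= 0 &&
            avcodec_receive_frame(dec, yuv) >= 0) {   // got a yuv frame
            // 3. libswscale: convert yuv420p (etc.) to rgb24
            struct SwsContext *sws = sws_getContext(
                dec->width, dec->height, dec->pix_fmt,
                dec->width, dec->height, AV_PIX_FMT_RGB24,
                SWS_BILINEAR, NULL, NULL, NULL);
            rgb->format = AV_PIX_FMT_RGB24;
            rgb->width  = dec->width;
            rgb->height = dec->height;
            av_frame_get_buffer(rgb, 0);
            sws_scale(sws, (const uint8_t * const *)yuv->data,
                      yuv->linesize, 0, dec->height,
                      rgb->data, rgb->linesize);
            // rgb->data[0] now holds width*height*3 bytes of RGB pixels
            sws_freeContext(sws);
            av_packet_unref(&pkt);
            break;
        }
        av_packet_unref(&pkt);
    }
    av_frame_free(&yuv);
    av_frame_free(&rgb);
    avcodec_free_context(&dec);
    avformat_close_input(&fmt);
    return 0;
}
```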
From here there are two ways to use ffmpeg. The first is to compile the ffmpeg file obtained in the first step directly to wasm:

```
# copy to a .bc suffix, because emcc distinguishes file formats by extension
cp ffmpeg_g ffmpeg.bc
emcc ffmpeg.bc -o ffmpeg.html
```

This generates an ffmpeg.js and an ffmpeg.wasm. ffmpeg.js loads and compiles the wasm file and provides a global Module object through which JS can drive the ffmpeg APIs inside the wasm. With this, ffmpeg's API is called from JS via Module. But this way feels cumbersome: the data types of JS and C differ in many respects, the C API has to be called from JS constantly, and shuttling data back and forth is troublesome, because the frame-capture logic has to be assembled out of individual ffmpeg API calls.

So I use the second way: write the functionality in C first, and finally expose a single interface to JS, so that JS and wasm only need to communicate through one interface API rather than the frequent calls of the first way. The problem then turns into two steps:

- Step one: use C to write a program that calls ffmpeg to save a video frame as an image
- Step two: compile it to wasm and wire up the data interaction with JS

The implementation of step one mainly follows an ffmpeg tutorial. The code there is ready-made and can essentially be copied over; there are some small problems because the ffmpeg version it uses is slightly old, and some API parameters need to be adjusted. The code has been uploaded to github and can be seen at cfile/simple.c. Usage is described in the readme; compile it into an executable named simple with the following command:

```
gcc simple.c -lavutil -lavformat -lavcodec \
    `pkg-config --libs --cflags libavutil` \
    `pkg-config --libs --cflags libavformat` \
    `pkg-config --libs --cflags libavcodec` \
    `pkg-config --libs --cflags libswscale` -o simple
```

Then run it with the path of a video file:

```
./simple mountain.mp4
```

A picture in ppm format is generated in the current directory. This simple.c calls the ffmpeg API that reads the file from the hard disk by itself; it needs to be changed to read the file content from memory, i.e. from a buffer we fill ourselves, so that later the data can come from a buffer handed over by JS. That implementation can be seen in simple-from-memory.c, and its gist is sketched below. I won't analyze the C code in detail here: it is mostly a matter of calling the right APIs, which is simple once you know how to use them, although ffmpeg's online development documentation is rather sparse.
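The core trick for reading from memory is a custom read callback attached through an AVIOContext. Below is a minimal sketch of the idea under my own names (MemSource, mem_read, open_from_memory are illustrations, not the repo's actual code):

```c
#include <libavformat/avformat.h>
#include <libavutil/mem.h>
#include <string.h>

// The in-memory "file" handed over from JS.
typedef struct {
    uint8_t *base;   // start of the buffer
    size_t size;     // total size in bytes
    size_t pos;      // current read offset
} MemSource;

// Read callback invoked by the demuxer instead of disk reads.
static int mem_read(void *opaque, uint8_t *buf, int buf_size) {
    MemSource *m = opaque;
    size_t left = m->size - m->pos;
    if (left == 0) return AVERROR_EOF;
    int n = buf_size < (int)left ? buf_size : (int)left;
    memcpy(buf, m->base + m->pos, n);
    m->pos += n;
    return n;
}

AVFormatContext *open_from_memory(MemSource *src) {
    // ffmpeg requires the IO buffer to come from av_malloc
    int io_size = 32 * 1024;
    uint8_t *io_buf = av_malloc(io_size);
    AVIOContext *avio = avio_alloc_context(io_buf, io_size,
                                           0 /* read-only */, src,
                                           mem_read, NULL, NULL);
    // a real implementation should also supply a seek callback,
    // since mp4 demuxing often needs to seek (e.g. moov at the end)
    AVFormatContext *fmt = avformat_alloc_context();
    fmt->pb = avio;  // demux through our callbacks instead of a file
    if (avformat_open_input(&fmt, NULL, NULL, NULL) < 0) return NULL;
    return fmt;  // from here the pipeline is the same as before
}
```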
With that, step one is done; in step two the input is changed to come from JS and the output is returned to JS.

3. Interaction between JS and wasm

The wasm-specific implementation is in web.c (there is also a process.c that takes out some functions of simple.c). web.c exposes one function for JS to call; let's name it setFile:

```
EMSCRIPTEN_KEEPALIVE // this macro marks the function as an export
ImageData *setFile(uint8_t *buff, const int buffLength, int timestamp) {
    // process
    ...
    return result;
}
```

Three parameters need to be passed:

- buff: the raw video data (passed in through a JS ArrayBuffer)
- buffLength: the total size of the video buff, in bytes
- timestamp: the second of the video at which to capture a frame

After processing, it returns an ImageData data structure:

```
typedef struct {
    uint32_t width;
    uint32_t height;
    uint8_t *data;
} ImageData;
```

with three fields: the width and height of the picture, and the RGB data. After writing these C files, compile:

```
emcc web.c process.c ../lib/libavformat.bc ../lib/libavcodec.bc ../lib/libswscale.bc \
    ../lib/libswresample.bc ../lib/libavutil.bc \
    -Os -s WASM=1 -o index.html -s EXTRA_EXPORTED_RUNTIME_METHODS='["ccall", "cwrap"]' \
    -s ALLOW_MEMORY_GROWTH=1 -s TOTAL_MEMORY=16777216
```

This links against the libavcodec.bc and other files generated in the first step. These libraries have a dependency order that must not be reversed: a library has to be listed after the libraries that depend on it.

Some of the parameters deserve explanation:

- -o index.html exports an html file, and index.js and index.wasm are exported at the same time. The latter two are what we mainly use; the generated index.html itself is useless.
- -s EXTRA_EXPORTED_RUNTIME_METHODS='["ccall", "cwrap"]' exports the two runtime functions ccall and cwrap, which are used to call the setFile function written in C above.
- -s TOTAL_MEMORY=16777216 sets the total wasm memory to 16MB, which is also the default; the value must be a multiple of 64KB (the WebAssembly page size).
- -s ALLOW_MEMORY_GROWTH=1 grows the memory automatically when usage exceeds the total size.

After compiling, write a main.html, add an input[type=file] and other controls, and include the index.js generated above. It loads index.wasm and provides a global Module object for driving the wasm APIs, including the function exported at compile time, as shown in the code below.
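The following is a minimal sketch of that glue, using standard Emscripten runtime pieces (ccall, HEAPU8, _malloc). Only setFile and the ImageData layout come from the text above; the element ids, the 5-second timestamp, and the RGB-to-RGBA copy are assumptions of mine:

```html
<input type="file" id="file">
<canvas id="canvas"></canvas>
<script src="index.js"></script>
<script>
document.getElementById('file').onchange = function () {
  // (real code should also wait for Module.onRuntimeInitialized)
  var reader = new FileReader();
  reader.onload = function () {
    var bytes = new Uint8Array(reader.result);
    // copy the video bytes into wasm memory
    var buff = Module._malloc(bytes.length);
    Module.HEAPU8.set(bytes, buff);
    // call the exported C function; capture the frame at second 5
    var ptr = Module.ccall('setFile', 'number',
        ['number', 'number', 'number'], [buff, bytes.length, 5]);
    // read the returned ImageData struct: width, height, data pointer
    var width  = Module.HEAPU32[ptr >> 2];
    var height = Module.HEAPU32[(ptr >> 2) + 1];
    var rgb    = Module.HEAPU32[(ptr >> 2) + 2];
    // expand the RGB bytes to the RGBA layout the canvas expects
    var canvas = document.getElementById('canvas');
    canvas.width = width;
    canvas.height = height;
    var ctx = canvas.getContext('2d');
    var img = ctx.createImageData(width, height);
    for (var i = 0, j = 0; i < width * height; i++) {
      img.data[j++] = Module.HEAPU8[rgb + i * 3];
      img.data[j++] = Module.HEAPU8[rgb + i * 3 + 1];
      img.data[j++] = Module.HEAPU8[rgb + i * 3 + 2];
      img.data[j++] = 255; // opaque alpha
    }
    ctx.putImageData(img, 0, 0);
    Module._free(buff);
  };
  reader.readAsArrayBuffer(this.files[0]);
};
</script>
```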