A fast and safe dive into VVC (H.266) development

In this short note I want to show how to build the VVC (VTM) encoder, because I could not find any documentation on the Internet that explains how to start working with it from scratch.

What is the VVC encoder?

VVC (H.266) is the successor to the HEVC (H.265) codec and offers roughly 40% better compression efficiency than HEVC. As a reminder, HEVC itself offers 25-50% better data compression than AVC (H.264) at the same level of video quality. For a more in-depth explanation of VVC, I suggest watching this video.

Prerequisites

  • any Linux machine (I used Ubuntu 17.10)
  • git
  • cmake
  • gcc
  • any source stream (.yuv)
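
A raw .yuv file has no header, which is why the encoder has to be told the width and height later on. As a quick sanity check on a source stream, the sketch below computes the size of one frame and the number of frames in a file. It assumes 8-bit 4:2:0 content with even dimensions (one full-resolution luma plane plus two quarter-resolution chroma planes); the function names are made up for this note.

```shell
# Bytes in one 8-bit 4:2:0 frame: width * height luma samples plus
# two chroma planes of (width/2) * (height/2) samples each.
yuv420_frame_bytes() {
  echo $(( $1 * $2 * 3 / 2 ))
}

# Number of whole frames in a file of the given size (in bytes).
# Arguments: file size, width, height.
yuv420_frame_count() {
  echo $(( $1 / $(yuv420_frame_bytes "$2" "$3") ))
}

yuv420_frame_bytes 432 240          # 155520
yuv420_frame_count 777600 432 240   # 5
```

For a real file, the size can be taken from `wc -c < file.yuv` and passed as the first argument of yuv420_frame_count.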

Step 1: Building

We will use Fraunhofer's VTM (VVC Test Model) encoder:

git clone https://vcgit.hhi.fraunhofer.de/jvet/VVCSoftware_VTM.git

After that we can build it:

cd VVCSoftware_VTM/
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
make -j4
cd ../bin  # all binaries end up here

We get the following binaries:

ls
DecoderAnalyserAppStatic DecoderAppStatic EncoderAppStatic parcatStatic SEIRemovalAppStatic StreamMergeAppStatic umake
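
To confirm that the build produced everything listed above, a small check like the following can help. It is only a sketch: the function name is hypothetical, and the list of binaries is copied from the ls output above, so it may differ between VTM versions.

```shell
# Print the name of every expected binary that is missing or not
# executable in the given directory; print nothing if all are present.
missing_vtm_binaries() {
  dir=$1
  for name in EncoderAppStatic DecoderAppStatic DecoderAnalyserAppStatic \
              parcatStatic SEIRemovalAppStatic StreamMergeAppStatic; do
    [ -x "$dir/$name" ] || echo "$name"
  done
}

# Run from the bin directory after the build.
missing_vtm_binaries .
```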

Step 2: Encoding & decoding

Now we have to check that the binaries work. We will do that by encoding and then decoding a stream. You can run ./EncoderAppStatic --help to see all available parameters and pick the ones you need, but here is a simple example:

./EncoderAppStatic -i vicue_test_432x240_420_8_500.yuv -wdt 432 -hgt 240 -c encoder_randomaccess_vtm.cfg -f 5 -fr 30

Where

  • encoder_randomaccess_vtm.cfg is a file from VVCSoftware_VTM/cfg
  • vicue_test_432x240_420_8_500.yuv is a simple source stream

Parameters

  • -i: input stream
  • -wdt: width of the stream
  • -hgt: height of the stream
  • -c: configuration file to use
  • -f: number of frames to encode (we use only 5 to keep the run short)
  • -fr: frame rate
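
The width and height in the command above are simply the dimensions embedded in the file name. If your streams follow the same name_WIDTHxHEIGHT_... naming (an assumption about your files, not something VTM requires), a sketch like this can pull them out and print the matching encoder command. The helper names are hypothetical, and the command is only printed, not executed.

```shell
# Extract "WIDTH HEIGHT" from a name such as foo_432x240_420_8.yuv.
# Prints nothing if the name does not contain _<digits>x<digits>_.
yuv_dimensions() {
  echo "$1" | sed -n 's/.*_\([0-9][0-9]*\)x\([0-9][0-9]*\)_.*/\1 \2/p'
}

# Print (dry run) an encoder command using only the options listed above.
# Arguments: input file, cfg file, number of frames, frame rate.
print_encoder_command() {
  dims=$(yuv_dimensions "$1")
  width=${dims% *}    # text before the space
  height=${dims#* }   # text after the space
  echo "./EncoderAppStatic -i $1 -wdt $width -hgt $height -c $2 -f $3 -fr $4"
}

print_encoder_command vicue_test_432x240_420_8_500.yuv encoder_randomaccess_vtm.cfg 5 30
```

If the file name does not match the pattern, the width and height come out empty, so check the printed command before running it.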

To decode the stream back (str.bin is the bitstream written by the encoder; its name is set in the configuration file):

./DecoderAppStatic -b str.bin -o rec_decoded.yuv

Parameters

  • -b: input bitstream (the encoded file)
  • -o: reconstructed output (.yuv)
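
A useful round-trip check is to compare the decoder output with the reconstruction the encoder wrote during encoding; for a correctly working encoder and decoder pair the two files should be byte-identical. The sketch below assumes the encoder's reconstruction is named rec.yuv (the name comes from the configuration file, so check yours) and uses a hypothetical helper name.

```shell
# Succeed only if both files exist and have identical contents.
same_reconstruction() {
  [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

if same_reconstruction rec.yuv rec_decoded.yuv; then
  echo "decoder output matches the encoder reconstruction"
else
  echo "files differ or are missing"
fi
```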



VTM project

File organization of the VTM project:

  • build directory: contains the project file NextSoftware.sln (a Visual Studio solution) for VTM;
  • cfg directory: contains the VTM encoding configuration files (.cfg), which are plain text and can be edited directly in a text editor;
  • source directory: contains the C++ source files of VTM, for both the encoder and the decoder;
  • bin directory: contains the build results of the VTM project, such as EncoderApp.exe and DecoderApp.exe.

VTM 3.0 currently contains 13 projects. The most important ones are:
  • DecoderAnalyserApp:
  • DecoderApp: the decoder application
  • EncoderApp: the encoder application
  • CommonLib: library functions shared by the encoder and the decoder
  • DecoderLib: the decoder library
  • EncoderLib: the encoder library

Cfg configuration files

Two kinds of cfg files need to be set up: the encoder configuration of the VTM project itself and the configuration of the test sequence. Both live in the ./cfg directory.

  • VTM project configuration: there are 4 types of project configuration files: encoder_intra_vtm.cfg (All Intra, i.e. every frame is coded with intra prediction only), encoder_lowdelay_P_vtm.cfg, encoder_lowdelay_vtm.cfg and encoder_randomaccess_vtm.cfg.
    These files hold parameters such as QP and the on/off switches for tools such as deblocking and SAO. They also set the names of the bitstream file written by the encoder and of the reconstructed yuv file.
  • Test sequence configuration: the directory ./cfg/per-sequence holds the configurations of the test sequences used under the standard test conditions.
    The parameters are the sequence name, bit depth, chroma format, frame rate, the frame at which encoding starts, picture width, picture height, and the number of frames to encode.
    To use a different sequence, change the sequence name, width and height; set the starting frame and the number of frames to encode according to your needs.
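
As an illustration of the per-sequence parameters listed above, the sketch below prints a configuration for the 432x240 test stream used earlier in this note. The key names are meant to mirror the files shipped in ./cfg/per-sequence; treat them as an assumption and compare against one of those files before relying on them. The function name is hypothetical.

```shell
# Print a per-sequence cfg to standard output.
# Arguments: input file, width, height, frame rate, frames to encode.
print_sequence_cfg() {
  cat <<EOF
InputFile          : $1
InputBitDepth      : 8      # bit depth of the input samples
InputChromaFormat  : 420    # chroma format
FrameRate          : $4
FrameSkip          : 0      # number of frames to skip at the start
SourceWidth        : $2
SourceHeight       : $3
FramesToBeEncoded  : $5
EOF
}

print_sequence_cfg vicue_test_432x240_420_8_500.yuv 432 240 30 5
```

Redirecting the output to a file gives a sequence configuration that can be passed to the encoder with its own -c option, next to the encoder configuration.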

Project command settings

Encoder

Open the property page and fill in the command arguments and the working directory. The command argument format is "-c" followed by each of the two cfg files, plus ">> log file", for example:
-c encoder_intra_vtm_qp32.cfg -c SlideEditing.cfg >> log_file.txt
Here >> appends to the log file without overwriting it, while > overwrites it.
The working directory is a subdirectory of ./bin, chosen according to whether the selected build configuration is Debug or Release; it is the directory that contains the EncoderApp.exe produced by the build.
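
The difference between >> and > is easy to see on a scratch file. This is a generic shell sketch, not something specific to VTM; the temporary file stands in for log_file.txt.

```shell
log_file=$(mktemp)

echo "first run"  >> "$log_file"
echo "second run" >> "$log_file"     # >> appends: both lines are kept
lines_after_append=$(grep -c '' "$log_file")

echo "third run" > "$log_file"       # > truncates first: one line is left
lines_after_overwrite=$(grep -c '' "$log_file")

echo "$lines_after_append $lines_after_overwrite"   # prints "2 1"
rm -f "$log_file"
```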

Decoder

Open the property page and fill in the command arguments and the working directory. The command arguments on the decoder side differ from those on the encoder side: there is no configuration file; instead, a reconstructed yuv is produced from the bitstream file generated by the encoder.
The command format is "-b bitstream file" + "-o decoded reconstruction file" + ">> decoder log file", for example:
-b str.bin -o res_decoder.yuv >> log_decoder.txt

Running the project

Encoder

  • Set the encoder project "EncoderApp" as the startup project and run it. The final encoder performance results are recorded in the log file named in the command arguments. The names of the generated bitstream file and reconstructed video file are set in the cfg file, and both are written to the working directory set for the encoder.

Decoder

  • Set the decoder project "DecoderApp" as the startup project and run it. The decoding performance results are recorded in the log file named in the command arguments. The reconstructed video file given with -o is written to the working directory set for the decoder.

Result output

Encoder side

The reference software prints general information to the log file, including the on/off status of some tools, the encoding results, and the total encoding time.

Coding information is printed after each frame is encoded. The important fields are: POC (picture order count, the position of the picture in display order, where POC 0 is the first displayed frame), TId (temporal ID, the index of the temporal layer; a frame can reference frames in lower temporal layers but never frames in a higher layer), the total number of bits, and the PSNR of each of the Y, U and V components.
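
For readers new to the metric, PSNR is derived from the mean squared error (MSE) between the source and the reconstructed samples of a component: PSNR = 10 * log10(MAX^2 / MSE), where MAX is 255 for 8-bit video. The sketch below uses this conventional definition; it is not taken from the VTM source, and the function name is hypothetical.

```shell
# PSNR in dB for a given MSE; the peak sample value defaults to 255 (8-bit).
# awk only has the natural logarithm, so log10(x) is log(x) / log(10).
psnr_db() {
  awk -v mse="$1" -v max="${2:-255}" \
    'BEGIN { printf "%.2f\n", 10 * log(max * max / mse) / log(10) }'
}

psnr_db 65.025   # 30.00
psnr_db 6.5025   # 40.00
```

A tenfold drop in MSE therefore shows up as a 10 dB gain in the log.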
