[Libav-user] Problem Generating B-Frames Using libavcodec/libx264

Discussion:

Robert Schmidt

2013-10-21 13:31:42 UTC

Hello all,

I am writing an application that encodes incoming video in H.264 using libavcodec and libx264. However, I noticed that the output data was much larger than I expected. When I examined the output, I discovered that the encoder was producing only I- and P-frames and never B-frames.

I created a standalone utility in an attempt to isolate my encoding logic and determine what I am doing wrong. The utility reads in H.264 video from a file stored using the ITU H.264 Annex B format, decodes it, re-encodes it, and writes the resulting packets to another file in the Annex B format. Like my application, the test utility fails to generate any B-frames.
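
One way to check an Annex B file for B-frames without extra tools is sketched below in C. It scans for 00 00 01 start codes, picks out coded slice NAL units (types 1 and 5), and reads the slice_type field at the start of the slice header. The names BitReader, SliceCounts, and count_slices are made up for this illustration and are not part of libavcodec or libx264. As simplifications, it counts slices rather than frames, and it does not strip emulation prevention bytes, which should not matter for the first few header bits of typical streams.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: MSB-first bit reader over a byte buffer. */
typedef struct {
    const uint8_t *data;
    size_t size; /* length in bytes */
    size_t pos;  /* current position in bits */
} BitReader;

/* Returns the next bit, or -1 when the buffer is exhausted. */
static int read_bit(BitReader *br)
{
    if (br->pos >= br->size * 8)
        return -1;
    int bit = (br->data[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Reads an unsigned Exp-Golomb value, ue(v); returns -1 on bad input. */
static long long read_ue(BitReader *br)
{
    int zeros = 0, bit;
    while ((bit = read_bit(br)) == 0) {
        if (++zeros > 31)
            return -1;
    }
    if (bit < 0)
        return -1;
    long long value = 1;
    for (int i = 0; i < zeros; i++) {
        bit = read_bit(br);
        if (bit < 0)
            return -1;
        value = (value << 1) | bit;
    }
    return value - 1;
}

/* Hypothetical result type: number of slices seen per slice type. */
typedef struct {
    unsigned i, p, b, other;
} SliceCounts;

/* Counts I, P, and B slices in an Annex B byte stream. */
static SliceCounts count_slices(const uint8_t *buf, size_t size)
{
    SliceCounts c = {0, 0, 0, 0};
    assert(buf != NULL || size == 0);
    for (size_t k = 0; k + 3 < size; k++) {
        /* Look for the 3-byte start code 00 00 01. */
        if (buf[k] != 0 || buf[k + 1] != 0 || buf[k + 2] != 1)
            continue;
        const uint8_t *nal = buf + k + 3;
        size_t len = size - (k + 3);
        int nal_type = nal[0] & 0x1F;
        if (nal_type != 1 && nal_type != 5)
            continue; /* not a coded slice (e.g. SPS, PPS, SEI) */
        BitReader br = { nal + 1, len - 1, 0 };
        if (read_ue(&br) < 0) /* first_mb_in_slice, value not needed */
            continue;
        long long slice_type = read_ue(&br);
        if (slice_type < 0)
            continue;
        switch (slice_type % 5) { /* 0 = P, 1 = B, 2 = I, 3 = SP, 4 = SI */
        case 0: c.p++; break;
        case 1: c.b++; break;
        case 2: c.i++; break;
        default: c.other++; break;
        }
    }
    return c;
}
```

Loading the contents of out.264 into memory and passing them to count_slices would give a b count of zero if the encoder emitted no B slices.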

I ran the input file through ffmpeg using the same settings that I was providing in my utility and found that ffmpeg does produce B-frames. I have been trying to determine what I am doing differently than ffmpeg, but I have not been able to figure it out yet. I'm hoping that perhaps someone on this list can spot my oversight.

I use the following command line with ffmpeg:

    $ ffmpeg -v debug -i ~/annexb.264 -codec:v libx264 -preset superfast -g 30 -f h264 ./out.264

As far as I can tell, ffmpeg should simply be using the "superfast" libx264 preset and a group of pictures setting of 30 frames. Beyond that, it should be using the defaults.

My code to set up the encoder looks like the following:

static AVStream *add_video_stream(AVFormatContext *output_ctx, AVCodec **output_codec, enum AVCodecID codec_id)
{
    *output_codec = avcodec_find_encoder(codec_id);
    if (*output_codec == NULL) {
        printf("Could not find encoder for '%s' (%d)\n", avcodec_get_name(codec_id), codec_id);
        return NULL;
    }

    AVStream *output_stream = avformat_new_stream(output_ctx, *output_codec);
    if (output_stream == NULL) {
        printf("Could not create video stream.\n");
        return NULL;
    }
    output_stream->id = output_ctx->nb_streams - 1;
    AVCodecContext *codec_ctx = output_stream->codec;

    avcodec_get_context_defaults3(codec_ctx, *output_codec);

    codec_ctx->width = 1280;
    codec_ctx->height = 720;

    codec_ctx->time_base.den = 15000;
    codec_ctx->time_base.num = 1001;

/*    codec_ctx->gop_size = 30;*/
    codec_ctx->pix_fmt = AV_PIX_FMT_YUVJ420P;

    // try to force B-frame output
/*    codec_ctx->max_b_frames = 3;*/
/*    codec_ctx->b_frame_strategy = 2;*/

    output_stream->sample_aspect_ratio.num = 1;
    output_stream->sample_aspect_ratio.den = 1;

    codec_ctx->sample_aspect_ratio.num = 1;
    codec_ctx->sample_aspect_ratio.den = 1;

    codec_ctx->chroma_sample_location = AVCHROMA_LOC_LEFT;

    codec_ctx->bits_per_raw_sample = 8;

    if ((output_ctx->oformat->flags & AVFMT_GLOBALHEADER) != 0) {
        codec_ctx->flags |= CODEC_FLAG_GLOBAL_HEADER;
    }

    return output_stream;
}

int main(int argc, char **argv)
{
    // ... open input file

    avformat_alloc_output_context2(&output_ctx, NULL, "h264", output_path);
    if (output_ctx == NULL) {
        fprintf(stderr, "Unable to allocate output context.\n");
        return 1;
    }

    AVCodec *output_codec = NULL;
    output_stream = add_video_stream(output_ctx, &output_codec, output_ctx->oformat->video_codec);
    if (output_stream == NULL) {
        fprintf(stderr, "Error adding video stream to output context.\n");
        return 1;
    }
    encode_ctx = output_stream->codec;

    // seems to have no effect
#if 0
    if (decode_ctx->extradata_size != 0) {
        size_t extradata_size = decode_ctx->extradata_size;
        printf("extradata_size: %zu\n", extradata_size);
        encode_ctx->extradata = av_mallocz(extradata_size + FF_INPUT_BUFFER_PADDING_SIZE);
        memcpy(encode_ctx->extradata, decode_ctx->extradata, extradata_size);
        encode_ctx->extradata_size = extradata_size;
    }
#endif // 0

    AVDictionary *opts = NULL;
    av_dict_set(&opts, "preset", "superfast", 0);
    // av_dict_set(&opts, "threads", "auto", 0); // seems to have no effect

    ret = avcodec_open2(encode_ctx, output_codec, &opts);
    if (ret < 0) {
        fprintf(stderr, "Unable to open output video codec: %s\n", av_err2str(ret));
        return 1;
    }

    // ... decoding/encoding loop, clean up, etc.

    return 0;
}
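
As a side note on the time_base values in the setup code, the following is a minimal sketch of the arithmetic they imply, assuming one time_base tick per frame. The Rational type and both helper functions are hypothetical stand-ins written for this illustration; they are not libavutil's AVRational or any FFmpeg API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a num/den rational such as a codec time base. */
typedef struct {
    int num;
    int den;
} Rational;

/* Frame rate implied by a time base, assuming one tick per frame. */
static double frames_per_second(Rational time_base)
{
    assert(time_base.num != 0);
    return (double)time_base.den / time_base.num;
}

/* Converts a timestamp counted in time_base ticks to seconds. */
static double pts_to_seconds(int64_t pts, Rational time_base)
{
    assert(time_base.den != 0);
    return (double)pts * time_base.num / time_base.den;
}
```

With num = 1001 and den = 15000, frames_per_second gives roughly 14.985, which is half of the common 29.97 fps rate.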

I have tried manually specifying the B-frame parameters in the AVCodecContext structure to no avail.

I've also tried debugging both my utility and ffmpeg under gdb so that I can compare the values of the AVCodecContext, X264Context, and AVStream structures. I have tried to make sure they are identical, but I still am not getting any B-frames.

For a while I thought perhaps the issue was that I was mishandling timestamps, so I replicated ffmpeg's processing chain and output timestamp debugging information similar to what ffmpeg produces. My debugging output appears identical to ffmpeg's.

Does anyone have any ideas as to what I may be doing wrong? I asked the same question on StackOverflow last week (http://stackoverflow.com/q/19456745/2895838) where I also included some of the logging output from both ffmpeg and my test utility. I've omitted that here in the interest of brevity. I also thought that the entire source code for the test utility would be too long for an e-mail, but I'm happy to provide more if that is helpful.

For what it's worth, I'm currently using ffmpeg 1.2 and its libraries.

Any assistance is greatly appreciated. I've been banging my head against this one for a while now.

Thanks in advance.

Rob

Francesco Damato

2013-10-23 10:23:05 UTC

Hi Robert,

I have the same goal: I am also writing an application that encodes
incoming video in H.264 using libavcodec and libx264. I don't know the
solution to your problem, because I do not know how to encode.
I have an AVI or MP4 file as input. The *avcodec_encode_video2* function
takes raw video data from a frame as input, so I demux my file and obtain a
raw video file. Now I want to encode it, so how can I open the raw file, read
the raw frames, and encode them into H.264? Is this the right way to encode
in H.264? Are there other solutions? How do you encode? Can you help me,
please?

Thanks in advance.

Regards

--
Francesco Damato

Robert Schmidt

2013-10-25 15:50:38 UTC

Post by Francesco Damato
I have the same goal: I am also writing an application that encodes
incoming video in H.264 using libavcodec and libx264. I don't know the
solution to your problem, because I do not know how to encode.
I have an AVI or MP4 file as input. The *avcodec_encode_video2* function
takes raw video data from a frame as input, so I demux my file and obtain a
raw video file. Now I want to encode it, so how can I open the raw file, read
the raw frames, and encode them into H.264? Is this the right way to encode
in H.264? Are there other solutions? How do you encode? Can you help me,
please?

Hi Francesco,

I'm not entirely certain what you are looking for, but I basically
do the following in my transcoding test utility:

* Read a raw packet from the input file using av_read_frame (demux)
* Decode the packet into a frame using avcodec_decode_video2 (decode)
* Encode the frame using avcodec_encode_video2 (encode)
* Write the resulting packets using av_interleaved_write_frame (mux)

I found the examples at http://ffmpeg.org/doxygen/trunk/examples.html
helpful for understanding how to properly use the API calls. Also, if
you look up the relevant API calls in the FFmpeg Doxygen documentation
(http://ffmpeg.org/doxygen/trunk/index.html), there are
cross-references to where they are used in the examples that help
provide context and show how they're used.

If you want to use H.264 for encoding, you can specify "h264" as the
format if you call avformat_alloc_output_context2 to set up your
output muxer, or use AV_CODEC_ID_H264 when setting up your encoder.

My application receives its video data from a separate source, so I
have to manually set up the AVFrame structures to pass to the encoder
rather than rely on av_read_frame/avcodec_decode_video2, but otherwise
the process is basically the same.

I hope this helps.

Rob