Discussion:
[Libav-user] How to determine frame sizes in msec
Bob Kirnum
2018-10-02 15:15:15 UTC
Permalink
We are using the libavformat APIs to support various containers
including WebM, MKV, MP4, and MOV. We do not use the libavcodec APIs for
encoding or decoding; we have our own implementations. Our implementation
is a real-time media server which can play from or record to these
containers, and we also offer DVR-like controls for skipping ahead or behind.
I am having some difficulty finding a consistent way, across all
containers, to determine the frame times for the audio and video frames we
read. With some MP4 files, the cur_dts value is always
AV_NOPTS_VALUE, so it can't be used. When playing some MKV files, the
audio and video codec context time_base values are inconsistent. For video
the time_base is (1001 / 24000), which can be used to determine the frame
rate and duration (23.976 fps, or 41.7 msec per frame). However, the audio
time_base is (1 / 8000), which reflects the sample rate, not the frame size.
It is certainly possible that I am not understanding this correctly. Can
someone recommend a consistent calculation that will work for audio and
video in these various containers?

Thanks,
Bob
Yurii Monakov
2018-10-03 07:58:32 UTC
Permalink
Bob, you can simply multiply the number of samples by the audio time_base.

Yurii
_______________________________________________
Libav-user mailing list
http://ffmpeg.org/mailman/listinfo/libav-user
Strahinja Radman
2018-10-03 12:36:53 UTC
Permalink
In my experience, only after decoding an audio frame can one tell how many samples it actually contains.
One can guess based on the codec, but that is not very reliable.

From: Yurii Monakov
Sent: 03 October 2018 09:58
To: This list is about using libavcodec, libavformat, libavutil, libavdevice and libavfilter.
Subject: Re: [Libav-user] How to determine frame sizes in msec

Bob, you can simply multiply number of samples and audio time_base.

Yurii

Bob Kirnum
2018-10-03 10:56:15 UTC
Permalink
Thanks, Yurii. The problem is consistency. I tried using the AVPacket
dts and duration values. These proved to be more consistent, until I played
an MP4 file. For MKV and WebM, the values were in msec; for MP4, they
were not. Looking at the AVStream time_base values for that MP4, they were
1/16000 for both the audio and video streams.
Yurii Monakov
2018-10-04 08:05:23 UTC
Permalink
Bob, you can convert timestamps to any time base with the av_rescale_q function:

AVRational ms_time_base = { 1, 1000 };

/* assuming you've just got a packet from the demuxer */
int64_t time_ms = av_rescale_q(packet.pts, stream.time_base, ms_time_base);
int64_t duration_ms = av_rescale_q(packet.duration, stream.time_base, ms_time_base);

Yurii