MPEG1 Single file C library (phoboslab.org)
96 points by phoboslab 2019-08-08 10:52:50+00:00 | 27 comments




Do you have the encoder side of this also?

Excellent stuff, really excellent. Thanks for the additional links and cool tangents too!

Love to see stuff like this. I wonder why he put all of the code in a header file, though... I've never seen that done before; it seems like it would make it impossible to invoke this from two separate source files?

Here's a rationale for this from the author of many very popular single-header libraries: https://github.com/nothings/stb#why-single-file-headers

As for how to use it from two separate translation units: you need to define PL_MPEG_IMPLEMENTATION to actually include the implementation, so you do that in exactly one of your .c files. The rest of the includes will pull in only the declarations.
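Roughly, the pattern looks like this (the file names here are just for illustration):

    /* decoder.c -- exactly one translation unit defines the macro,
       which pulls in the implementation: */
    #define PL_MPEG_IMPLEMENTATION
    #include "pl_mpeg.h"

    /* player.c -- every other file includes the header plainly
       and gets only the declarations: */
    #include "pl_mpeg.h"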


It's somewhat trendy these days, but the main reason I've found is that Windows developers struggle with adding third-party dependencies to their projects because their development environment sucks.

I think it's mostly game devs backporting C++ flaws into C.

Although this is getting downvoted because it is somewhat inflammatory (it stops just short of "winblowz" and "M$"), there is some truth to it. Visual Studio does have a package manager, but it's not widely used, and in practice it can't be used in a lot of environments that C/C++ Windows developers (often game developers) work in. That said, I've seen stb_image used widely in environments that are not Windows, because even if you have a package manager and a build system that handles dependencies better than vanilla MSBuild, it's still lower friction to include a single-header library. Last time I used Xcode, dependency management was about as painful as in Visual Studio, but I might have been doing it wrong.

I like it too: one file rather than many large and complicated files. Some programs I have seen do put everything in the header file. I prefer a separate header file and implementation file, but having only the header file still works. It isn't impossible to invoke it from two separate source files, because there is a macro you can define to specify whether you want to include the implementation or not.

Single headers (take it from a non-C/C++ expert here, just observing):

1. They can speed up compilation through a single compilation unit. As far as I understand, the compiler compiles one "huge" pile of code, and because computers are extremely fast, this can beat figuring out dependencies with a build tool.

2. They make dependency management easier (no make, no CMake, no SCons, no Ninja, etc.).

3. They make customization easier: just define a macro and you include only what you need (see the sketch below).

I love the idea and empathize with the need for it.
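For example, stb_image (mentioned elsewhere in this thread) lets you swap in your own allocator with macros defined before including the implementation. STBI_MALLOC, STBI_REALLOC and STBI_FREE are real stb_image hooks (define all three or none); my_alloc, my_realloc and my_free are hypothetical stand-ins for your own functions:

    /* In the one .c file that compiles the implementation: */
    #define STBI_MALLOC(sz)        my_alloc(sz)
    #define STBI_REALLOC(p,newsz)  my_realloc(p, newsz)
    #define STBI_FREE(p)           my_free(p)
    #define STB_IMAGE_IMPLEMENTATION
    #include "stb_image.h"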


How long until someone emscripten's this, so it runs directly in JS in the browser?

And how would that compare to others?

https://jsmpeg.com/


I'm guessing there won't be much difference since jsmpeg already has a small chunk of C code that gets run through emscripten and both libraries are written by the same guy.

https://github.com/phoboslab/jsmpeg/blob/master/build.sh


jsmpeg is by the same author!


MPEG2 is mostly patent-free too now, though I'm not sure how much larger its decoder would be.

The core of libmpeg2 is 3,810 lines of C (not including any systems-stream demultiplexer or audio decoder) versus about 2,400 for pl_mpeg.h. So not dramatically larger.

On the other hand, for progressive-scan 30fps video on computers without a VBV constraint and with a deterministic known decoder, I'm not sure any of the extra features in the MPEG-2 spec are very helpful compared with MPEG-1.


According to MPEG-LA [1]:

> Please note that the last US patent expired February 13, 2018, and patents remain active in Philippines and Malaysia after that date.

[1] https://www.mpegla.com/programs/mpeg-2/patent-list/


> This gross oversight in the overengineered (especially for its time) MPEG-PS and MPEG-TS container formats just leaves me dumbfounded. If anybody knows why the MPEG standard doesn't just provide a byte size in the header of each frame or even just a FRAME_END code, or if you have a solution for this problem, let me know!

Because the video encoding was created in 1988 and the mux format in 1995, when large amounts of fast RAM were incredibly expensive and recording/transcoding/processing devices didn't always even have a framebuffer to store a full frame. Many, many MPEG-1, MPEG-2 and even MPEG-4 AVC Baseline limitations become very obvious when you consider that streams were encoded on CPUs that might run slower than 150 MHz and decoded on devices that may have only a few macroblocks' worth of storage for decoded frames.

> Interestingly, if I interpret the source correctly, ffmpeg chose the second option (waiting for the next PICTURE_START_CODE) even for the MPEG-TS container format, which is meant for streaming. So demuxing MPEG-TS with ffmpeg always introduces a frame of latency.

I think the confusion here is because MPEG-TS was created for broadcast TV streaming, not realtime streaming. Broadcast TV can easily be seconds behind the source these days and has probably travelled at least once from geostationary orbit so one frame really isn't something anyone cares about. The more modern HLS/DASH formats tend to be even worse at this, with many sources waiting for a full several-second long chunk to be complete before transmitting it to the viewer's device.


Even if you had enough RAM, storing the frame at the encoder just to calculate its length would force a frame of latency at the encoder.

The way it is, the decoder has more flexibility in implementation.


Buffering a several second chunk allows the transmitter to include forward error correction (FEC).

This allows operators to trade off latency for stream integrity which is important when it's going to be transmitted over a lossy medium such as terrestrial broadcast.

This differs from the size problem because the amount of FEC (if any) is a parameter that can be configured between encoder and decoder. A size header would be a built in part of the protocol: hence my assertion that it would force a frame of latency.

i.e. FEC is an operational concern whereas the size header is a protocol design concern.


I think the MPEG people and the authors of pl_mpeg probably just differ on how "very hairy" it is to represent the byte-by-byte state of a single-threaded decoder. In other words, how hard is it to create a continuation object that lets the decoder run out of buffer at any byte location, return to the caller, and later resume from the same place when more bytes are available? The MPEG people were mostly thinking about hardware decoders, but this is not that hard in software -- and it's been done successfully by every major decoder implementation afaik.

The page writes: "if we're in the middle of decoding a video frame and the buffer doesn't have any more bytes yet (e.g. because it's streaming from the net) we would need to pause the decoder, save its exact state and later, when enough bytes are available, resume it again. Of course this isn't particularly difficult to achieve using threads, but if we want to stay single threaded it gets very hairy."

But I'm pretty sure the MPEG people would just say, "look, this is not that hard and was a solved problem 20 years ago. We're not going to introduce a 1-frame delay at the encoder (which is what you want us to do by putting a frame length in the PES header) but we're also not making you introduce a 1-frame delay at the decoder. Just do what libmpeg2 has been doing since 1999: make a state object that represents the byte-by-byte evolving state of the decoder. Update the state object as you go. If you run out of bytes in the buffer, return STATE_BUFFER to the caller. When the caller gives you more bytes later, use the state object to resume from where you left off."

"Here's what that object looks like for MPEG-2: https://github.com/cisco-open-source/libmpeg2/blob/master/li...

Yes, it's not beautiful, and yes, it would be more elegant if these variables were broken out into different scopes (current block, current slice, current picture, current sequence), but this is basically what you're in for and it's not that hard. And it avoids a 1-frame delay at either end. (And no, you don't need a 1-frame delay just because you're demuxing from a transport stream either -- just throw each TS packet into the video ES decoder as soon as you get it.)"
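A minimal toy sketch of that state-object pattern (hypothetical names, not libmpeg2's actual API): all parsing state lives in a struct instead of on the call stack, so running out of input at any byte is just a return value, and the next call resumes at the saved step.

    #include <stdint.h>
    #include <stddef.h>

    typedef enum { STATE_BUFFER, STATE_PICTURE } mpeg_state;

    typedef struct {
        const uint8_t *buf, *end; /* current input window          */
        int      phase;           /* which parsing step we were in */
        uint32_t accum;           /* partially assembled value     */
        /* ... per-sequence / per-picture / per-slice fields ...   */
    } mpeg_dec;

    /* Consume bytes from [buf, buf+len); return STATE_BUFFER when more
       input is needed, or STATE_PICTURE when (in this toy) a four-byte
       "header" has been assembled across however many calls it took. */
    mpeg_state mpeg_feed(mpeg_dec *d, const uint8_t *buf, size_t len) {
        d->buf = buf;
        d->end = buf + len;
        while (d->phase < 4) {           /* resume at the saved step */
            if (d->buf == d->end)
                return STATE_BUFFER;     /* state stays in *d        */
            d->accum = (d->accum << 8) | *d->buf++;
            d->phase++;
        }
        d->phase = 0;                    /* ready for the next unit  */
        return STATE_PICTURE;
    }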


Thanks for the explanation! It hadn't occurred to me that an encoder would spit out packets for part of a frame, while it's still encoding. For what it's worth, ffmpeg's public API - at least the one I used in a previous project[1] - only produces full frames.

The MPEG-2 state object you linked to looks a lot like the private data of my decoder already[2]. I wonder if there's any restriction on when a packet may be concluded. I.e. do MPEG-PS packets have to contain full slices, or can they be cut off in the middle of a slice?

The "hairy" part with my current design would be to reproduce the call stack. Again, if the decoder would live in its own thread, it would be a no-brainer.

> and it's been done successfully by every major decoder implementation afaik

As far as I can tell, ffmpeg's decoder does not allow for this. It always searches for the next picture's START_CODE before starting to decode the frame. Similarly, the libmpeg2 source you linked to doesn't seem to provide any functionality to resume decoding from anywhere in the stream either!? The NEEDBITS and DUMPBITS macros just assume there's always more data.

[1] https://github.com/phoboslab/jsmpeg-vnc/blob/master/source/e...

[2] https://github.com/phoboslab/pl_mpeg/blob/master/pl_mpeg.h#L...


Well.... now that I've gone to read the code again, you're absolutely right that I was too hasty in calling it the "byte-by-byte" state. It's more like the "slice-by-slice" state. libmpeg2 goes header-by-header, so, it can process each individual slice without waiting for the next picture start code, but it does buffer up a whole slice before starting any work.

If you just give it a single byte (or any number of bytes that doesn't include some subsequent start code, including a sequence_end_code for the end of the whole video), it just copies it to an internal buffer and then asks for more until it sees the beginning of the next slice or some other header. That's why NEEDBITS and DUMPBITS don't have to bail out in the middle -- by the time you get there, you know they have a whole slice to play with. So, yes, libmpeg2 does go start-code (or sequence_end_code) by start-code -- but not a picture start code.
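In code, that accumulate-until-the-next-start-code behavior might look something like this rough sketch (hypothetical names, not libmpeg2's actual code); you can feed it arbitrary-sized reads and it saves its state between calls:

    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
        uint8_t  chunk[4096]; /* in-progress chunk (a real decoder    */
        size_t   len;         /*   would grow this instead of capping) */
        uint32_t window;      /* last four bytes seen                  */
    } accum_t;

    /* Feed bytes from in[*pos..n); returns 1 when a complete chunk
       (everything up to, but not including, the next 00 00 01 start
       code) is ready in a->chunk/a->len, or 0 when the input ran dry
       and the state is saved for the next call. */
    int accum_feed(accum_t *a, const uint8_t *in, size_t n, size_t *pos) {
        while (*pos < n) {
            uint8_t b = in[(*pos)++];
            a->window = (a->window << 8) | b;
            if (a->len < sizeof a->chunk)
                a->chunk[a->len++] = b;
            /* 00 00 01 xx just went by: the previous chunk is done */
            if ((a->window & 0xFFFFFF00u) == 0x00000100u && a->len > 4) {
                a->len -= 4; /* trim the start code off this chunk */
                return 1;
            }
        }
        return 0;
    }

    /* After handling a chunk, re-seed the accumulator with the start
       code that terminated it, so it begins the next chunk. */
    void accum_next(accum_t *a) {
        a->chunk[0] = (uint8_t)(a->window >> 24);
        a->chunk[1] = (uint8_t)(a->window >> 16);
        a->chunk[2] = (uint8_t)(a->window >> 8);
        a->chunk[3] = (uint8_t)(a->window);
        a->len = 4;
    }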

ffmpeg/libavcodec is a wrapper around like 75+ different decoders, so I'm not too surprised if they have to go with a least-common-denominator interface.

In general an MPEG-2 TS or PS packet is just a fixed size packet and doesn't have to be aligned with any ES syntax element. Typically the PES packets (the much larger packets encapsulated in PS/TS packets) do contain exactly one video picture (i.e. the data_alignment_indicator is set on every video PES packet), but even this isn't formally required. Note that the PES packet header also includes an optional length field that would do what you want (but it's optional, in part to accommodate encoders that don't want to buffer the whole image before starting to encode pixels).
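For what it's worth, if I remember the spec right, that length field is the 16-bit PES_packet_length sitting right after the start-code prefix and stream_id; a value of 0 means "unbounded", which is what lets an encoder start emitting a picture before knowing its final size. A minimal reader (a hypothetical helper, not from any particular library):

    #include <stdint.h>
    #include <stddef.h>

    /* Returns PES_packet_length, or -1 if the buffer doesn't begin
       with a PES start code (00 00 01 stream_id). A result of 0 means
       the length is unspecified, which video streams inside a
       transport stream are allowed to use. */
    int pes_packet_length(const uint8_t *p, size_t n) {
        if (n < 6) return -1;
        if (p[0] != 0x00 || p[1] != 0x00 || p[2] != 0x01) return -1;
        /* p[3] is the stream_id; bytes 4-5 hold PES_packet_length. */
        return (p[4] << 8) | p[5];
    }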

You might be interested in our TS/PES demuxing code that wraps libmpeg2/liba52 and tries to maintain a/v sync in the presence of arbitrary corruption -- it's more than half the size of your entire decoder!

https://github.com/StanfordSNR/puffer/blob/master/src/atsc/d...


Replace the call stack with a state machine. Then you don't have to monkey around with coroutines to get back to where you left off.

I have been wondering if we could build something on top of MPEG-2, AC3 and MP3: codecs whose patents have expired, and something that is truly patent-free. Which reminds me of Musepack, based on MPEG-1 Layer 2 [1]. Truly amazing quality at the time, even compared to high-bitrate AAC.

[1] https://www.musepack.net


As an aside, it makes me happy that Bink is still around. I remember using their Bink video codec for VJing way back in 2000 or whatever.
