> they added .mov in their list of file types and had aliased it to .mp4 format
That’s weird. Why’d they do that? They should make a separate entry for mov and associate it with video/quicktime.
Guess it might be something related to https://stackoverflow.com/a/44785870 but, as they point out, mov is a container format that can hold any of many different codecs. And isn’t mp4 just a container too? Referring to mov files as video/mp4 seems straight-up incorrect to me.
> I believe the reasoning is that the codec should ideally be invisible and irrelevant information to the average user.
I don’t think anyone decided to do it like that. It’s just been the way it always was.
Anyway, early on container formats were actually correlated with the codec, or at least the multimedia stack you needed.
Let’s remember the early file extensions:
.avi .rmvb .wmv .mov .flv
Everyone knew what they needed to install to play one of these files. It was only later, starting with .mkv really, that container formats stopped having anything to do with codecs.
Even so, for a long time .mkv just meant h.264 to most users. If you had a device that could play mkvs, it could play h.264 as well.
The confusion started full on with HEVC. To many users, mkvs suddenly just stopped being playable.
I don’t see any reason to continue this trend. AV1 should just use “.av1”. Any device/program that can play av1 can also handle mkv/webm. And no one will be confused.
> It turns out that within the silos of Apple, they do not communicate back and forth with each other. So the iTunes team read the white paper on the MOV file format and wrote all of their automation expecting these values to be present. Unfortunately, the team writing FCP had not updated their exporter to use these new features of the MOV.
I was somewhat involved in this at Apple at the time! That's actually a decent description of the reality on the ground. The following may, however, have some misremembered details, so don't take it as gospel.
At the time, QuickTime was internally migrating from the older MacOS-era subsystems to newer ones under the hood. The low-level frameworks were private to QuickTime; Apple didn't ship public headers for them. Both iTMS and FCP, however, used the private headers to call those frameworks directly.
With iTMS they had a bespoke media ingest and export pipeline. They were calling QuickTime's private frameworks but doing their own thing with the raw samples between ingest and export. IIRC they had QuickTime branches/builds of their own because their encoder fleet didn't just jump to the latest OS and QuickTime version.
FCP/Compressor were also directly using QuickTime's private frameworks but, as I recall, shipped their own private copies of some of them. I believe this was because those revved with the OS rather than with QuickTime's package, but FCP had to support older OSes.
Even QuickTime's behavior wasn't entirely consistent. Videos captured on iPhones had an assumed color space (Rec.709 IIRC), but at the time the tags weren't written into the files. QuickTime had to recognize iPhone-originated files, play them as if they were tagged, and then tag them if it exported those files (pass-through or otherwise). For other untagged input files it assumed tags based on the frame size: SD sizes were assumed to be Rec.601 and HD sizes Rec.709. Tagged files, of course, needed no guessing and got the tagged behavior on playback and export.
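For illustration, that size-based fallback boils down to something like this (a minimal sketch; the exact frame-size thresholds QuickTime used aren't public, so the 720-line cutoff here is an assumption):

    def assumed_colorspace(width, height, tag=None):
        # Tagged files need no guessing: honor the tag.
        if tag is not None:
            return tag
        # Untagged: guess from frame size. HD-ish sizes -> Rec.709,
        # SD sizes -> Rec.601 (the cutoff is illustrative).
        return "bt709" if height >= 720 else "bt601"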
It's not necessarily that the three teams didn't talk to each other; their schedules just ran at wildly different cadences. While the QuickTime group was writing all the frameworks being used, it was concerned mostly with low-level issues. It was also supporting iLife, iTunes (the app), and QuickTime Player/plugin releases, with a good portion of the group working on iOS as well. FCP and iTMS were just internal customers writing their own stuff on top of the low-level frameworks.
> Video codec encoding was such a mess for such a long time due to the issues with MP4 and all the patent nightmare mess that unfortunately took way longer than it should have.
Dude, this isn't even wrong. It's just nonsensical and ahistorical.
> but I failed to see how it warrants a new "format". You should be able to do that with any existing video format
It's about support.
The .zip format supports LZMA/Zstandard compression and files larger than 4 GB. But if you use those features, a lot of software with .zip support will fail to decompress the result.
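You can see this with Python's zipfile, which can already write such entries (a sketch; which third-party extractors choke on it depends on what you have installed):

    import zipfile

    # Write an entry with LZMA (method 14) -- legal per the ZIP spec,
    # unreadable by many extractors that only implement method 8 (deflate).
    with zipfile.ZipFile("lzma_test.zip", "w", compression=zipfile.ZIP_LZMA) as zf:
        zf.writestr("hello.txt", "hello world")

    # Python round-trips it fine; plenty of other .zip-capable tools won't.
    with zipfile.ZipFile("lzma_test.zip") as zf:
        print(zf.read("hello.txt").decode())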
The same goes for log. While in theory you could probably make .mp4 or .mkv files with H264 encoded in log, I bet a lot of apps would not display them correctly, if at all.
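To make that concrete, here's a sketch of producing such a file by driving ffmpeg from Python (assumes ffmpeg on PATH; "log100" is one of ffmpeg's logarithmic transfer-characteristic tags):

    import subprocess

    # Tag the H.264 stream's transfer characteristic as logarithmic.
    # The resulting .mp4 is perfectly legal, but many players will ignore
    # the tag and render the footage flat and washed out.
    subprocess.run([
        "ffmpeg", "-i", "input.mov",
        "-c:v", "libx264",
        "-color_trc", "log100",
        "output.mp4",
    ], check=True)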
> since open implementations exist, open source developers know how the formats function.
These formats are way more complicated than you think they are. Just because an open source implementation decodes/encodes something doesn't mean the implementation is standard-compliant. Mine certainly is not.
> I wouldn't expect all codecs to come with their own container format!
Huh? That seems totally wrong to me.
I wouldn't expect a codec project to develop its own unique, incompatible container format, and I would hope very much that it would not (!!), but it's a real mistake to develop a codec and not specify at least some sort of preferred container format.
Otherwise, people are going to do what is often done with FLAC: not put it into a container format and just treat the raw output from the codec as a distribution format. And then complain that the whole thing sucks because there's no support for multiple streams, metadata, lyrics, subtitles, album covers, cuesheets, whatever.
The way to avoid that is not to set users up for failure by releasing the bare codec without some sort of preferred container format that's used by default. If the FLAC encoder had always used OGG containers (or Matroska or Quicktime or AVI or whatever) by default there'd be a whole lot fewer bare FLAC files floating around, convincing everyone that "FLAC sucks because there's no metadata".
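The muxing itself is trivial; e.g., with ffmpeg (a sketch, file names illustrative):

    import subprocess

    # Bare codec output: a .flac file is essentially the raw stream plus
    # FLAC's own limited metadata blocks.
    subprocess.run(["ffmpeg", "-i", "in.wav", "-c:a", "flac", "bare.flac"], check=True)

    # The same stream wrapped in Ogg (.oga): now it's a proper container,
    # with room for multiple streams, chapters, and richer metadata.
    subprocess.run(["ffmpeg", "-i", "in.wav", "-c:a", "flac", "wrapped.oga"], check=True)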
That said, there are some issues with container formats as they are frequently implemented, which tend to drive users towards not using them and just exchanging bare codec output: if you separate the user from the codec with an intermediate container layer, you can create a ton of frustration when users get files that they think they can use, based on the file extension, but then get an error because they don't have the codec du jour... and there was no obvious way to tell that when they were looking at the file. The codec used is far more important to the average user than the type of container format, most of the time.

(And yeah, the ideal solution would be for filesystems to stop sucking so badly at metadata; MacOS and HFS were better at this stuff in 1996 than most modern computers are today -- at least they had the idea of both a file "type" and its "creator" as things distinct from the extension.)

But in our world, the result is some containers being perceived as "unreliable" or "fussy" because they're used for a diversity of codecs that not all implementations have.
I don't have a great solution to that second problem, but at the very least I think that the file extension should follow the codec combination used inside the file, and not the container format. E.g. an AVI container with AVC compressed video inside shouldn't have the same file extension as an AVI container with Sorenson Video inside it; those two things are not interchangeable as far as a user is concerned. Since file extensions are the only metadata users get, they need to somehow represent the combination of both codec+container.
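As a sketch of what that could look like in practice, you can derive a codec-aware "extension" from ffprobe's JSON output (the ffprobe flags are real; the naming scheme itself is made up for illustration):

    import json
    import subprocess

    def codec_aware_ext(path):
        # Probe the container and streams with ffprobe.
        out = subprocess.run(
            ["ffprobe", "-v", "quiet", "-print_format", "json",
             "-show_format", "-show_streams", path],
            capture_output=True, text=True, check=True).stdout
        info = json.loads(out)
        container = info["format"]["format_name"].split(",")[0]
        vcodec = next(s["codec_name"] for s in info["streams"]
                      if s["codec_type"] == "video")
        return f".{vcodec}.{container}"

    # codec_aware_ext("movie.avi") -> ".h264.avi" vs. ".svq3.avi", say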
(Bit of context: the QuickTime plugin now has to be enabled on every site you want to use it with; there's no global way around it. This comes in the wake of Google's dumping of h.264, which QuickTime handles natively.)
> The backend part of it is just ffmpeg, which can do pretty much all of this for you.
Really? It takes in a movie and spits out your thumbnails, sans spoilers, sans family-unfriendly content, into your data store, already indexed? Across multiple data centers?
Seems like you're making it out to be much more trivial than it is.
EDIT: To be clear, the feature itself has trivial components. But rolling out a new feature requires far, far more than just the frontend code and an ffmpeg script.
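For scale, the ffmpeg half genuinely is a one-liner (a sketch, assuming ffmpeg on PATH), which is exactly why it's the easy half:

    import subprocess

    # Extract one candidate thumbnail every 10 seconds.
    subprocess.run([
        "ffmpeg", "-i", "movie.mp4",
        "-vf", "fps=1/10",
        "thumb_%04d.jpg",
    ], check=True)
    # Spoiler filtering, content moderation, indexing, and multi-datacenter
    # replication are the parts ffmpeg doesn't do for you.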
> And the logistical costs of transcoding it to N different resolutions times M different codecs.
If only we had easy access to scalable video coding (SVC). If a container format supported it, the web browser could perform range queries to get the interesting bits as needed; no additional code required.
And the uploader would need to transcode only once. This doesn't solve the codecs issue, but I think you could get away with offering one or two common codecs.
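The client side of that idea is nothing exotic; it's an ordinary HTTP range request (a sketch; the URL and byte offsets are hypothetical, since no mainstream container exposes SVC layers this way today):

    import urllib.request

    # Suppose the base layer occupied a known byte range: fetch just that.
    req = urllib.request.Request(
        "https://example.com/video.svc",
        headers={"Range": "bytes=0-1048575"},  # hypothetical base-layer range
    )
    with urllib.request.urlopen(req) as resp:
        base_layer = resp.read()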
The problem is NOT THE FORMAT, the problem is the lack of tooling: links and w3m are among the rare text browsers that can display images in the console.
It's just a matter of the browser sending the image to the terminal in some format it can understand, but if that hasn't been thought about as a possibility, it's going to be far more complicated than just adding a new format: you will have to work on the text reflow issues (e.g. how do you select the size of the placeholder when it's expressed in characters?) on top of the picture display issues.
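The placeholder question at least has a mechanical answer: divide pixel dimensions by the terminal's cell size (a sketch; real cell sizes would come from the terminal, e.g. TIOCGWINSZ's ws_xpixel/ws_ypixel, not hardcoded defaults):

    def placeholder_cells(img_w, img_h, cell_w=8, cell_h=16):
        # Ceiling division: round up to whole character cells.
        cols = -(-img_w // cell_w)
        rows = -(-img_h // cell_h)
        return cols, rows

    print(placeholder_cells(640, 480))  # -> (80, 30)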
Said differently, it would be easier to have a console IDE that supported graphics if any format whatsoever (sixel, kitty...) were supported by one; we could then argue about the ideal format.
Arguing about the ideal format BEFORE letting the ecosystem grow using whatever solution there is only results in a negative loop.
It's like a startup arguing about the ideal technology stack before even trying to find product-market fit!!
Personally, I do not care much about the sixel, kitty, or iTerm formats -- all I want is to see some kind of support for a format that's popular enough for tools using it to emerge.
Yes, it would be better if the supported format were the option with the greatest chance of succeeding, but right now that is a very remote concern: first we need tools; then, if in the worst case they target a "bad" format, we can write transcoders to whatever format people prefer!
Right now, there is rarely any "input" to transcode (how many console tools support, say, the iTerm format?), so we have a much bigger bootstrapping problem.
> an off the shelf ASCII plotting library probably involves less custom tooling
With a sixel-capable terminal like xterm or msys2's mintty, no custom tooling is required: just use regular gnuplot after setting the export for the desired resolution, font, and font size.
gnuplot is far more standard than ASCII plotting libraries, which often require special Unicode fonts on top of requiring you to use their specific format.
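A sketch of that workflow, driving stock gnuplot from Python (assumes a gnuplot build that includes the sixelgd terminal, and a sixel-capable terminal emulator):

    import subprocess

    # Emit the plot as sixels straight to the terminal.
    script = b"set terminal sixelgd size 800,480\nplot sin(x)\n"
    subprocess.run(["gnuplot"], input=script, check=True)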
> This gross oversight in the overengineered (especially for its time) MPEG-PS and MPEG-TS container formats just leaves me dumbfounded. If anybody knows why the MPEG standard doesn't just provide a byte size in the header of each frame or even just a FRAME_END code, or if you have a solution for this problem, let me know!
Because the video encoding was created in 1988 and the mux format in 1995, when large amounts of fast RAM were incredibly expensive and recording/transcoding/processing devices didn't always even have a framebuffer to store a full frame. Many MPEG-1, MPEG-2, and even MPEG-4 AVC Baseline limitations become very obvious when you consider that they were encoded on CPUs that might be slower than 150 MHz and decoded on devices that might have only a few macroblocks' worth of storage for decoded frames.
> Interestingly, if I interpret the source correctly, ffmpeg chose the second option (waiting for the next PICTURE_START_CODE) even for the MPEG-TS container format, which is meant for streaming. So demuxing MPEG-TS with ffmpeg always introduces a frame of latency.
I think the confusion here is because MPEG-TS was created for broadcast TV streaming, not realtime streaming. Broadcast TV can easily be seconds behind the source these days and has probably travelled at least once from geostationary orbit so one frame really isn't something anyone cares about. The more modern HLS/DASH formats tend to be even worse at this, with many sources waiting for a full several-second long chunk to be complete before transmitting it to the viewer's device.
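For what it's worth, the mechanics behind that frame of latency look roughly like this (a sketch; real demuxers work incrementally on arriving packets rather than on a whole buffer):

    PICTURE_START_CODE = b"\x00\x00\x01\x00"

    def split_frames(bitstream: bytes):
        # With no per-frame size field, a frame's end is only known once
        # the *next* picture start code shows up -- hence the latency.
        starts = []
        i = bitstream.find(PICTURE_START_CODE)
        while i != -1:
            starts.append(i)
            i = bitstream.find(PICTURE_START_CODE, i + 1)
        # The final frame stays buffered until more data (or EOF) arrives.
        return [bitstream[a:b] for a, b in zip(starts, starts[1:])]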