Ffmprovisr – Making FFmpeg Easier (amiaopensource.github.io)
414 points by nice__two | 2023-07-30 04:44:14 | 106 comments




This is a great idea! Trying to get FFmpeg to do what I want it to do is always daunting. ChatGPT has been helpful, but not perfect. Thanks for this :)

ChatGPT totally made ffmpeg very accessible to me. The minor issues I run into are mostly intricacies that differ between operating systems.

ChatGPT has made a lot of CLI tool arguments and options easier to use. I always had a hard time remembering OpenSSL options and arguments. Now I just ask GPT.

> Transcode to an H.264 access file

You use "access" several times but I don't know what you mean by it. I'm going to guess that it is some non-English usage slipping in. Nothing else to complain about at this time. [EDIT] I should say "is used" and "they mean" because I don't know if the author is also the poster.


I think it might just be autocorrect translating "AAC" into "access".

'Access' copies are versions of originals intended for viewing, in contrast to the original ('preservation') video which may be impractically large to use, or in an unusual format.

(At least that is how the term is used by collections librarians. Even there terminology may vary)


vectorscope is a great command to bookmark.

Nice idea, but needs a better name!

I like the name

Name is perfect.

See also:

* https://ffmpeg.guide/ — create complex FFmpeg filtergraphs quickly and correctly

* https://www.hadet.dev/ffmpeg-cheatsheet/ — clipping, adding fade in/out, scaling, concat, etc


ffmpeg.guide is really awesome. Do we have something similar for ImageMagick?

I'm not aware of one. I checked my bookmarks and found https://imagemagick.org/Usage/basics/ which has examples, discusses philosophy and methodology

The second link's clipping command is not ideal in my experience. For some god known reason, ffmpeg behaves differently depending on whether you put the -ss and -t/-to flags before or after the -i flag. And for me, before worked better.

It's also an issue in the original post.


my understanding is options in the cli are tied to either global options, input file options, or output file options; so placing options after -i option(s) would tie it to the output file... i'm definitely not an ffmpeg expert though.

Effectively, placing ss before the input filename seeks the file (the container?) without decoding it. Placing it after will decode the streams while skipping to the point you specify.

Seeking the container is usually much faster than decoding and then throwing away what you don't need, but it has a fatal flaw: most videos use P-frames, so a frame can only be decoded after the frames it depends on.

So, say you want to skip to 60 seconds in. The solution is to do "-ss 50 -i input.mkv -ss 10", which is fast and should get the keyframes you need.
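A minimal Python sketch of the two-stage seek described above, building the argument list without running anything. The function name `split_seek_args` and the 10-second default margin are assumptions for illustration. Whether the second `-ss` is needed at all depends on `-accurate_seek`, which the quoted documentation further down the thread describes as the default when transcoding.

```python
def split_seek_args(input_path: str, target_s: float, margin_s: float = 10.0) -> list[str]:
    """Build ffmpeg arguments for a coarse input seek followed by a fine output seek.

    The first -ss (before -i) jumps close to the target at container level.
    The second -ss (after -i) decodes and discards the remaining margin.
    """
    coarse = max(target_s - margin_s, 0.0)  # never seek before the start of the file
    fine = target_s - coarse                # remainder handled by decoding
    return ["-ss", f"{coarse:g}", "-i", input_path, "-ss", f"{fine:g}"]
```

Calling `split_seek_args("input.mkv", 60)` reproduces the `-ss 50 -i input.mkv -ss 10` example from the comment.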


Speed was never my concern. Judging by what you said, I think putting the -ss after tried to cut things between p-frames and that often caused problems with the cut, while putting it before started the cut at the last p-frame? Something like that; I don't use ffmpeg enough to remember what problems I was having, but I remember what fixed it.

That trick (-ss -i -ss) should not be necessary by default (unless you use -noaccurate_seek) according to current documentation. But I haven't verified it.

https://ffmpeg.org/ffmpeg-all.html#toc-Main-options

  -ss position (input/output)

     When used as an input option (before -i), seeks in this input file to 
 position. Note that in most formats it is not possible to seek exactly, so 
 ffmpeg will seek to the closest seek point before position. When transcoding 
 and -accurate_seek is enabled (the default), this extra segment between the 
 seek point and position will be decoded and discarded. When doing stream copy 
 or when -noaccurate_seek is used, it will be preserved.
 
    When used as an output option (before an output url), decodes but discards 
 input until the timestamps reach position.

> For some god known reason

...which becomes obvious once you notice that options apply either to one of the inputs or output.


ffmpeg.guide looks amazing, but it feels like ffmpeg should really have something better itself. It's crazy trying to shoehorn a graph (non-linear) into a commandline (flat, linear). Even just a more verbose json config would be great.

Or just use ffmpeg-python.

if you want to improve a page like this include a screenshot

The page is a collection of commands to perform specific actions like transcoding, syncing video/audio/subs, etc. Screenshot(s) won't offer any additional information/help.

[flagged]

Conversely, I do not know any developers that use AI instead of google. And considering the horrific garbage it has given me the few times I tried, I’d be worried about what you’re doing with it.

I do use ChatGPT, and while it helped me with things like obscure issues with bash syntax, it hallucinated some CLI tools options often enough that I never trust it with anything important. (And I tried to ask it even fairly easy questions.)

While it works pretty well when I forget some CLI tool option, in cases when some tool does not have an option doing what I want, ChatGPT is worse than useless - instead of telling me that it doesn't know an option doing something exists, it just tries to "think something up".

In a world where the right answer is "no, there is no option doing what you want" even a small fraction of the time, ChatGPT is actively harmful.


To be fair, googling so many things nowadays will lead you to dogshit blog sites with a million CTA modals and needless prose

You could input your opinions directly into AIs rather than here, all developers I know don't want to read such ludicrous unsubstantiated claims.

You and the developers you know don't encompass all of humanity.

A nice collection. I wish those examples, or some of them, were included in the ffmpeg manual. The ImageMagick section could easily be expanded into a site of its own.

Very useful!

I started using chatgpt to come up with ffmpeg commands, just so much faster and easier to find what I need.

I made a small tool so I can do it right in the cli: https://github.com/alexkrkn/help-cli

and made a video about it: https://www.youtube.com/watch?v=pOda6TDBqcY


[dead]

All I've ever wanted was to convert mp4 to gif.

Why?

Perhaps it's a form of hipster irony, since by and large all GIF sites and social media sites now convert source GIFs to mp4.

There are still plenty of contexts where you can embed images but not videos (or at least not with automatic looping playback) - forums, github README.md, other markdown-based comment/post systems.

GIF compression limitations also encourage you to cut down to the essential parts. Too many videos waste the viewer's time with delays and irrelevant bits.


> irrelevant bits

Hmm.


What's more confounding about this than translating between any other two formats?

GIF is a terrible format for video

But it's the only common video format that can be treated like a static image in most cases. Like any other format, it's got its ideal use case.

Animated webp is commonly supported at least on the web these days.

Of course, it too is a horrible video format, which is impressive considering it is based on a not nearly as horrible video format. If only browsers would support silent looping video in <img> and CSS image contexts...


> Animated webp is commonly supported at least on the web these days.

Sure-- if it's a) a website that b) you're making. Tons of websites that allow user uploads only allow common static image formats-- jpg, gif... maybe png, maybe bmp, etc. I can't imagine anywhere that allows users to upload profile pictures, for example, would allow them to upload a webp, but I could imagine users wanting an animated profile picture. I've done it myself.

The whole point is that there are instances where using an animated gif is the only option if you want an animated image, and people want to convert videos to animated gif because of that. That's why FFmpeg does it. I'm not really sure why people find this so weird.


You can do that with ffmpeg, but the output isn't ideal

I'd use gifski: https://gif.ski/
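For anyone staying with ffmpeg, a commonly cited way to improve GIF output is to generate a custom palette with the `palettegen` and `paletteuse` filters in a single filtergraph. The sketch below only assembles the command. The function name `gif_command` and the default frame rate and width are assumptions, not values from the thread.

```python
def gif_command(src: str, dst: str, fps: int = 10, width: int = 480) -> list[str]:
    """Build an ffmpeg command that converts a video to GIF using a generated palette."""
    graph = (
        f"fps={fps},scale={width}:-1:flags=lanczos,"  # reduce frame rate and size first
        "split[a][b];"                                # one copy for the palette, one for output
        "[a]palettegen[p];"                           # compute a palette from the frames
        "[b][p]paletteuse"                            # map frames onto that palette
    )
    return ["ffmpeg", "-i", src, "-vf", graph, "-loop", "0", dst]
```

The list can be passed to `subprocess.run` once ffmpeg is installed. Lower `fps` and `width` values shrink the file considerably.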


I use ffmpeg often because it's so powerful, but its api cannot fit in my head. It should have a LLM frontend.

The API doesn't fit in my head either, but grepping through https://ffmpeg.org/ffmpeg-all.html usually gives me the option I need.

Probably the biggest barrier to ffmpeg adoption is all the offline 'freemium' and web frontends, the host sites for which have been SEO'd for phrases people commonly put into Google like "avi to mp4", "mp3 to wav", etc..

It took me more time than I wish it did to become open to using CLI apps, the Windows world had taught me to expect a GUI for everything.


I thought most of those file conversion sites were just ffmpeg on top of nginx or something?

They almost certainly are, cli is not always the best interface but ffmpeg covers such a wide swath of video utilities that I’m not sure how you could include all of its functionality in a GUI.

This guide recommends "yadif" as a deinterlacing filter. I find "w3fdif" looks better. Like yadif, it does not do motion tracking, so it's reasonably fast and avoids the distracting artifacts that motion tracking sometimes causes (I'd rather have consistently mediocre results than sometimes great and sometimes bad), but it considers three fields at a time instead of yadif's two, which lets it hide the interlacing artifacts better.

There is an extension of that: https://github.com/HomeOfVapourSynthEvolution/VapourSynth-Bw...

Though if you are reencoding, you might as well go whole hog and use QTGMC.


"bwdif" is a hybrid of "yadif" and "w3fdif"

As previously suggested, w3fdif has mostly been supplanted by bwdif. w3fdif can produce shimmering, whereas yadif does not, which is why bwdif operates like yadif but uses the better field matching of w3fdif.

A bit off topic, but IMO ffmpeg is one of the best pieces of software ever written. Fabian Fabrice (ff) is one talented engineer and people such as him are a gift to the FOSS community.

I used to work in a ~2 bil unicorn in which a big part of the products we worked on relied on ffmpeg.


And you gave a sizable donation, right?

https://ffmpeg.org/donations.html


I have a feeling the ~$2b wasn't his

Still, good reminder to speak up about the products to leadership and see if we can get them to donate.


Absolutely. For the first startup that I worked for, I earned a position with budgeting authority. Once we achieved profitability, I added donations to the open source tools we relied upon. I received some pushback and defended it. Looking back, I am proud of that choice. I encourage you all to do the same.

Can you share what you chose as your points and what convinced them at the end?

What likely works better in conservative environments is not donations, but (generous) support contracts. This is something business people can understand.

Oh, this goes w/o question, but not all F/OSS projects are incorporated and/or can provide a contract. Even more trouble if the entities are in different countries.

While I'm sure you are right, I can't help but to be irked by this strange environment we live in where we have to treat business people like infants who need things mashed and mushed into something they can digest and understand.

Nobody had to explain to me the concept of donating to support somebody working on something my business relies on, it's just common sense.

If I was really reliant on books documenting tolls and laws in regards to international trades, and you wanted to give a donation to support this work, I wouldn't be all "me no understand, please explain it in programmer-speak for me".


If you relied on a set of books and found out the author and publisher were head of the 'Defund STEM; Make Everyone Get Lit Degrees (DSMEGLD, pronounced De-smegol'd, cause they are LOTR nerds)' lobbying organization and actively spent a good portion of their day convincing education organizations from kindergartens to universities to remove all their math, science, and programming classes and teach crystal healing instead, would you donate to them?

You love programming and wouldn't help a group of people try to destroy it, right?

Well, business people love money and don't want to help people not give it to them or not let them keep it.

(tongue in cheek but 60% serious...)


Have you spent much time around business people? I’ve spent years around them and open source software truly confuses them. Why would you ever use your time to make something and then give it away? I once suggested we make a donation and they laughed at me. Their response was if people are dumb enough to make something and give it away for free, then they get nothing.

One argument I made was that sponsoring a project buys you support and developer relations. For example, we replaced a commercial PKZIP license with 7-zip. I was able to point out that one of the projects we donated to, 7-zip, implemented a feature request that helped in this transition. That, combined with the fact that these donations were insignificant relative to our proprietary software costs, made it an easy sell to my boss.

Thanks for the reply.

When you have a direct involvement in the project it surely helps.

And thanks for supporting 7z, from a guy who uses it daily! *cheers*


How about you just give a donation for all three of us?

>Fabian Fabrice

*Fabrice Bellard. Also creator of QEMU, TCC, QuickJS, and others.


Yeah, the etymology suggested by the top-level comment is completely wrong. Wikipedia says

> The name of the project is inspired by the MPEG video standards group, together with "FF" for "fast forward"

Corresponding source (Fabrice Bellard himself): https://ffmpeg.org/pipermail/ffmpeg-devel/2006-February/0103...


FYI the “FF” in “FFmpeg” stands for “fast forward.”

[dead]

There are no people such as Him. There’s nobody else in his league, heck, there’s nobody even playing the same game as Fabrice Bellard :-)

In all seriousness though, the sheer amount of devices running code he wrote at any given moment is just ridiculous.


Fabian Fabrice has created many amazing software projects.

Comparable only to famous film director Alan Smithee who has credit for so many films.


I love ffmpeg, but its performance has got to improve on ARM!

Great stuff!

ffmpeg is infrastructure-level important, and tools like this keep it going.


No they don't. There's maybe 20 people that keep it going

Fair point

People like to say that ffmpeg is complicated but when you make it into a nice gui it doesn't get any easier -- it is video compression itself that is complicated. No software could make it easier without making the decisions for you, like handbrake or some other click-through interface.

I'm not certain, but I highly suspect that if I sat down and learned about digital video encoding and compression on a granular enough level then figuring out how to do things in ffmpeg would be rather intuitive. Does anyone have experience doing this?


I've written DirectShow filters around 17 years back for WindowsMobile (not Windows Phone) so I've a decent understanding of codecs and containers.

Formats like mkv or codecs like HEVC didn't exist back then but the concept of manipulating audio/video through a bunch of filters is a wonderful one and most (all?) a/v transforming software does it. When I started looking into FFmpeg's man pages I could connect the dots and start using it after a day of fooling around.

I'm a CLI lover and man page reader so perhaps it worked to my advantage.


Since ffmpeg CLI still makes me pull my hair out, even with excellent guides, I am going to plug vapoursynth:

https://www.vapoursynth.com/

It's optimized Pythonic video filtering... But also so much more: https://vsdb.top/

And Staxrip, which makes such good use of ffmpeg, vapoursynth, and dozens of other encoders and tools that I reboot from linux to Windows just to use it: https://github.com/staxrip/staxrip


I would really appreciate just an ffmpeg wrapper with better CLI. It is unnecessarily convoluted, and while I don't know if there's a point of view from which it actually all makes sense, it is just inadequate in performing all sorts of extremely common tasks it is perfectly able to perform, if one knows which magic words it needs to hear. I probably have dozens of bash-aliases, that are nothing more than encoding 150-character ffmpeg commands into 2 simple words.

It is also incredibly stupid how 99% of time ffprobe is used without any arguments to just quickly see something as mundane as duration, resolution, framerate, MAYBE number of audio-tracks, yet 99% of its output is some completely irrelevant bullshit like compiling options.
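One way to get just those fields is to ask ffprobe for specific entries as JSON and summarize them. The sketch below is an illustration under assumptions: `probe_command` and `summarize` are hypothetical helper names, and the JSON layout follows ffprobe's `-of json` output with `format` and `streams` sections.

```python
import json


def probe_command(path: str) -> list[str]:
    """Build an ffprobe command that prints only a few fields as JSON, with no banner."""
    entries = "format=duration:stream=codec_type,width,height,r_frame_rate"
    return ["ffprobe", "-v", "error", "-show_entries", entries, "-of", "json", path]


def summarize(probe_json: str) -> dict:
    """Reduce ffprobe's JSON output to duration, resolution, frame rate, and audio track count."""
    data = json.loads(probe_json)
    streams = data.get("streams", [])
    # Use the first video stream, if any, for resolution and frame rate.
    video = next((s for s in streams if s.get("codec_type") == "video"), {})
    return {
        "duration_s": float(data.get("format", {}).get("duration", 0.0)),
        "resolution": (video.get("width"), video.get("height")),
        "frame_rate": video.get("r_frame_rate"),
        "audio_tracks": sum(1 for s in streams if s.get("codec_type") == "audio"),
    }
```

Running the command with `subprocess.run(..., capture_output=True, text=True)` and passing its stdout to `summarize` gives a four-field summary instead of the full build configuration dump.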


>> I would really appreciate just an ffmpeg wrapper with better CLI.

There were (are?) tons of them on GitHub. But many are still obscure, or are single dev efforts that fizzled out.

Some focused, purpose built CLI frontends (like Av1an, specifically for transcoding) are excellent at what they do. Perhaps that is the better way than an all encompassing wrapper.


https://ffmpeg.org/ffmpeg-filters.html makes ffmpeg easier, not stuff like this.

The navigation layout of this is not ideal yet.

Specific recipes that should be added:

Removing audio: ffmpeg -i $input_file -c copy -an $output_file

Halving resolution: ffmpeg -i $input_file -vf "scale=iw/2:ih/2" $output_file
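The two recipes above can be wrapped as small command builders. This is a sketch with hypothetical function names. The `even` option is an addition not in the comment: it rounds each halved dimension down to an even number, since some encoder and pixel format combinations reject odd sizes.

```python
def remove_audio_cmd(src: str, dst: str) -> list[str]:
    """Copy all streams without re-encoding, dropping audio (-an)."""
    return ["ffmpeg", "-i", src, "-c", "copy", "-an", dst]


def halve_resolution_cmd(src: str, dst: str, even: bool = False) -> list[str]:
    """Scale video to half its input width and height.

    With even=True, each halved dimension is rounded down to an even number.
    """
    if even:
        scale = "scale=trunc(iw/4)*2:trunc(ih/4)*2"
    else:
        scale = "scale=iw/2:ih/2"
    return ["ffmpeg", "-i", src, "-vf", scale, dst]
```

Because the arguments are passed as a list, no shell quoting is needed around the filter expression.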


Those anchors don't work on Firefox on Android. The author is losing... I dunno.... 1e-20% or whatever of the browser market share among my fellow Android FF users.

They do work, just poorly. The text opens, but it scrolls you down to the end of the text.

There must be something else happening on my install, then.

Related: I have a small library of personal videos, including from my wedding, and I'd like to compress it as much as I can to reduce its storage footprint. I don't care much about codec compatibility, as long as I can watch them on my (ARM) MacBook, it's good.

In the past (over 10 years ago), I used to work with H.264, but I remember fiddling with parameters was a pain. I wonder if nowadays there are some promising new codecs based on ML. Again, as long as it works in my machine it's good, so anything from GitHub, HuggingFace and so on is acceptable, as long as it doesn't need too much effort and specialized knowledge to run it.


There are some promising codecs based on neural networks, however they are all very much research projects and have major limitations. Additionally, the compression ratios are only marginally higher than state-of-the-art engineered codecs. I think for your use case a more modern engineered codec such as VVC (H.266) or AV1 is perhaps more suitable.

Define 'small'? In GB/TB

I'd recommend not re-encoding as you'll have irrevocably lost data. Whatever its size, in the future it won't be large.

Yea can get wonky, -c:v copy is the correct flag to not re-encode

This depends on how much time you want to spend. If you want the transcode to take less time than the playtime of your videos, it'll probably be best to just use the best hardware encoder you have with high quality settings.

If you have more time, then AV1 is good. Read through the trac page [1] and do test encodes with 5-10 seconds of video to determine what settings give you your desired quality. Note that low `-cpu-used` or `-preset` values will give great quality but take incredibly long. Then, encode a few minutes of video with various settings to determine what settings give you your desired file size.

For human time usage, keep track of the commands and options you use and how those affect the output. If the job will take more than a few hours, write your script to be cancellable and resumable.
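A rough sketch of what "cancellable and resumable" could look like: skip files whose output already exists, and encode to a temporary name that is renamed only on success, so an interrupted run never leaves a half-written file under the final name. The helper names, the `.partial` naming, and the `encode_args` parameter are all assumptions for illustration.

```python
import subprocess
from pathlib import Path


def pending_jobs(src_dir: Path, dst_dir: Path, suffix: str = ".mkv") -> list:
    """Return (source, destination) pairs whose destination does not exist yet."""
    jobs = []
    for src in sorted(src_dir.iterdir()):
        if not src.is_file():
            continue
        dst = dst_dir / (src.stem + suffix)
        if not dst.exists():  # finished outputs are skipped on the next run
            jobs.append((src, dst))
    return jobs


def run_job(src: Path, dst: Path, encode_args: list) -> None:
    """Encode to a temporary file, then rename it into place on success."""
    # Keep the real extension last so ffmpeg can still infer the container.
    tmp = dst.with_name(dst.stem + ".partial" + dst.suffix)
    subprocess.run(["ffmpeg", "-y", "-i", str(src), *encode_args, str(tmp)], check=True)
    tmp.replace(dst)  # only completed encodes get the final name
```

A driver loop would call `run_job` for each pair from `pending_jobs`, passing whatever encoder settings the test encodes settled on. Cancelling with Ctrl-C and rerunning then picks up at the first unfinished file.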

[1]: https://trac.ffmpeg.org/wiki/Encode/AV1


Huge ffmpeg fan, though I could never remember the magical incantations I've used.

Nowadays I wrapped them all with Emacs functions. This makes them easily accessible as a "right-click" menu of sorts via M-x.

https://xenodium.com/emacs-ffmpeg-and-macos-alias-commands


In the spirit of making FFmpeg and video processing easier, we open-sourced a chat-based agent to assist video production - https://github.com/remyxai/FFMPerative

I am a bit confused. This seems to be about the command line programme "ffmpeg" that comes with the library, but not the library itself. That programme seems already very well documented, with a help option, a man page and everything. It is usually the library that no one knows how to properly use, due to a lack of documentation :-)

As someone who uses ffmpeg daily (mostly basic functions), I now rely on ChatGPT to approximate the command and fine-tune from there. I haven't used too many of the advanced features of ffmpeg, so I'm glad someone seems to be covering those use cases, as most tutorials don't cover them.

I had a whole folder of videos I wanted to convert to 720p, asked chatGPT and it gave me this:

find . -maxdepth 1 -type f -name "*" -exec sh -c 'pv "$1" | ffmpeg -i pipe:0 -filter:v scale=720:-2 -c:a copy "${1%.*}.mp4" 2> /dev/null' _ {} \;

Not sure if it can be improved but it works well
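One possible variation, sketched in Python. Note that `scale=720:-2` sets the width to 720, while "720p" usually refers to a height of 720, which would be `scale=-2:720`. Writing into a separate output directory also avoids the case where an input that is already `.mp4` would share its name with the output. The function name and directory layout are assumptions, and `-c:a copy` may fail if the source audio codec is not allowed in MP4.

```python
from pathlib import Path


def build_720p_cmd(src: str, out_dir: str) -> list[str]:
    """Build an ffmpeg command that scales video to 720 pixels high and copies audio."""
    out = Path(out_dir) / (Path(src).stem + ".mp4")
    return [
        "ffmpeg", "-i", src,
        "-vf", "scale=-2:720",  # height 720, width chosen automatically and kept even
        "-c:a", "copy",         # leave the audio stream untouched
        str(out),
    ]
```

Looping over `Path(".").iterdir()` and running each command with `subprocess.run` would cover a whole folder.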


I was looking for a ffmpeg UI recently and came across Shutter Encoder. It's open source, mac/windows, very good software.

I've finally started compressing my 15 year, 300GB personal video collection..

https://www.shutterencoder.com/

I'm compressing everything using H.265, and videos are shrinking to sometimes 1/10th the size. Is there anyone who can give me reasons why I would not want to do this? I've read that it takes more processing power to watch these compressed videos, but I'm not sure that will cause much trouble in the future...


I tried this automated across a large video collection and the quality was subpar because my CRF settings were weak despite looking fine on the test videos. Consider that a word of warning: validate many videos before removing the masters.

But with 300gb, storage is cheap enough that you could just keep the masters.


One reason for me to still pick h.264 is that a lot of aging (or budget?) hardware doesn't have hardware decoding for h.265.

Also it's just easier on my homelab to use Plex without having to transcode


Ffmpeg and ChatGPT are a match made in Heaven.

This is something a GPT would do way better.
