Running FFmpeg on AWS Lambda for 1.9% the Cost of AWS Elastic Transcoder (intoli.com)
214 points by foob on May 2, 2018 | 53 comments


Similarly, you can compile ffmpeg on Lambda, in 0.5 minutes, for 9 cents.[1] Versus 10 min on one core, for ~free. And while -j200 of ffmpeg is nice, -j1000 of the linux kernel is... wow, like seeing the future.

[1] demo in a talk: https://www.youtube.com/watch?v=O9qqSZAny3I&t=55m15s (the actual run (sans uploading) is at https://www.youtube.com/watch?v=O9qqSZAny3I&t=1h2m58s ); code: https://github.com/StanfordSNR/gg ; some slides (page 24): http://www.serverlesscomputing.org/wosc2/presentations/s2-wo...


This is impossibly amazing to me! Thank you so much, what an excellent lecture.


Very cool. I hope more of the "value add" stuff from cloud providers ($$$) can be replaced with open source running on their cloud functions. My suggestions:

* FFmpeg supports http/https as input protocols if compiled with the options enabled. See `ffmpeg -protocols`

* You can parallelize or chunk FFmpeg to enable longer inputs, e.g. I found https://github.com/nergdron/dve/blob/9f1ca516b18f50d1d99d15e...

* Try with larger memory sizes. Larger memory = more CPU for Lambda which may result in shorter transcodes. You might even pay the same amount if the transcodes are CPU bound and finish in roughly linear time wrt CPUs
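
A rough illustration of the last suggestion: Lambda bills by allocated memory multiplied by duration, so if a CPU-bound transcode speeds up in proportion to the memory setting, the compute charge stays flat. The sketch below assumes a rate of $0.00001667 per GB-second (check current AWS pricing), ignores the per-request fee and billing granularity, and uses made-up durations.

```python
PRICE_PER_GB_SECOND = 0.00001667  # assumed rate; verify against current AWS pricing


def lambda_compute_cost(memory_mb, duration_s, price=PRICE_PER_GB_SECOND):
    """Compute charge for one invocation, ignoring the per-request fee."""
    return (memory_mb / 1024) * duration_s * price


# Hypothetical CPU-bound job: doubling memory (and CPU share) halves the run time.
cost_small = lambda_compute_cost(1024, 120)
cost_large = lambda_compute_cost(2048, 60)
print(cost_small, cost_large)
```

If the speedup is less than linear, the larger setting costs more but still finishes sooner, which can matter when a job is close to the Lambda timeout.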


This is not "cool" - this is either doing less than Amazon's encoding service or just exposing the pricing model. If it's cheaper they could charge less, unless they are terrible programmers who can't even use their own lambda functions.


>If it's cheaper they could charge less,

You definitely misunderstand cloud pricing.


How so? It seems to me this is a perfect example of it.


They have no interest in charging you less. The strategy is to lock you in and get the most they can from you.


Of course - that's why this "exposes their pricing model", which is charging for services.


Licensing costs are pretty high for the different encoders.


Next installment: "Running FFmpeg on a tower under my desk for 1% the cost of AWS Lambda".

Is there any effort to on-prem lambda stuff yet? I know it's a moving target, but I wouldn't recommend getting into cloud stuff you can't migrate out of.


Yes! If one uses something like Kubeless[1], you have something like AWS Lambda, but where the backend is Kubernetes rather than AWS. Yes you're still on a framework, but it's a vendor-agnostic, open-source one. There are some other attempts at similar things, too. I am partial to this one for now.

[1]: https://github.com/kubeless/kubeless


This is what I see as the true power of kubernetes -- once people start developing high quality (hopefully open source) applications for platforms like kubernetes, providers like AWS should lose their "value-added" benefits, and be reduced to more like colo providers, maybe offering 24/7 support as well.

That will be when the ubiquitous cloud truly arrives -- run on whatever provider in the sky, and as long as they run kubernetes you can run your workloads there.


At a pure level, Lambda "integration" is essentially an interface for passing in runtime arguments+environment to your application. If you're concerned about lock-in, you can write or integrate another entry point that executes your function on on-prem Tomcat or whatever other environment (Cloud Functions?) you want to run in.

The much harder challenge would be provisioning thousands of on-prem servers to handle the load, but I wouldn't necessarily qualify a dependency on cloud-like autoscaling as lock-in.

I guess the lock-in might come as you unwittingly couple yourself to the intricacies of the rest of AWS's offerings as your app architecture grows more complex.
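
One way to picture the "another entry point" idea from this comment: keep the real work in a platform-neutral function and make each entry point a thin adapter. This is only a sketch; `transcode_job`, the event keys, and the placeholder body are hypothetical names chosen for illustration, not part of any AWS API beyond the usual `handler(event, context)` shape.

```python
import json


def transcode_job(source_url, output_format="mp3"):
    """Platform-neutral core logic (placeholder body for illustration)."""
    return {"source": source_url, "format": output_format, "status": "queued"}


def lambda_handler(event, context):
    """AWS Lambda entry point: unpack the event and call the core function."""
    return transcode_job(event["source_url"], event.get("output_format", "mp3"))


def main(argv):
    """Command-line entry point for on-prem or container use.

    Expects argv[1] to be the source URL and an optional argv[2] format.
    """
    result = transcode_job(*argv[1:3])
    print(json.dumps(result))
    return result
```

A script would call `main(sys.argv)`; a servlet or Cloud Functions adapter would follow the same pattern as `lambda_handler`.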


Well there's iron.io (which just got bought by Oracle) and there are several FaaS projects for Kubernetes such as: OpenFaaS and Kubeless.

Edit: OpenFaaS can run on Nomad as well apparently: https://www.hashicorp.com/blog/functions-as-a-service-with-n...


https://openwhisk.apache.org/ backed by IBM AFAIK, but not only.


Next installment: "Running a private cloud under my desk for 1% the cost of AWS".


I'm thinking of writing a book about tower-under-desk computing. It's surprisingly common.



Check out https://github.com/nuclio/nuclio: high-performance serverless that runs on (standalone) Docker, Kubernetes, or cloud services.


Their tool which facilitates the packaging and relocation of dynamically linked binaries is interesting: https://github.com/intoli/exodus

"Painless relocation of Linux binaries–and all of their dependencies–without containers."


Yeah, seriously. This sounds great.

Also, if you found exodus interesting, you may find the following interesting too.

https://github.com/endrazine/wcc


Wow this is amazing, how have I never heard of this before? Thank you for sharing this.


Wow! Thanks for flagging this. This tool has so many useful applications; mind blown.


Yes, I think this is the buried lede - that could be incredibly useful.


Very misleading title. Elastic Transcoder pricing applies primarily to video. This tutorial only covers audio transcoding, which is much, much less resource-intensive.


Nothing misleading here, they compared their project's cost to the Elastic Transcoder audio pricing.

Elastic Transcoder audio is $0.00450 per minute [1]; the article says that with Lambda it cost "$0.00008273 per minute of audio, a full factor of 54 times less than Elastic Transcoder."

0.00450 / 0.00008273 ≈ 54.4

[1] https://aws.amazon.com/elastictranscoder/pricing/
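
For anyone who wants to re-check the arithmetic, here is the same comparison as a few lines of Python. Both prices are the ones quoted in this comment; nothing else is assumed.

```python
elastic_transcoder_per_min = 0.00450  # USD per minute of audio, from the pricing page [1]
lambda_per_min = 0.00008273           # USD per minute of audio, as quoted from the article

ratio = elastic_transcoder_per_min / lambda_per_min
percent = 100 * lambda_per_min / elastic_transcoder_per_min
print(f"{ratio:.1f}x cheaper, i.e. {percent:.2f}% of the Elastic Transcoder price")
```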


Misleading title. Article is about audio encoding, not video. Better: “Using FFmpeg on AWS Lambda for audio encoding at 1.9% of cost for AWS Elastic Transcoder”


What's the benefit of using Exodus over just using the official-ish static builds of FFmpeg? https://johnvansickle.com/ffmpeg/

This works just fine on AWS Lambda. The `ffmpeg` binary there weighs in at 46 MB. Unless you need something not bundled with that build, it seems like this is sufficient and is easier to set up.


It looks like the youtube download has to complete before ffmpeg can start; is there a way to start processing the head while the tail is still being written?

This problem comes up a lot with storage blobs. The bigger they are the worse it is to serialize write/reads.


I don't see why not with I/O redirection, or named pipes. It probably wasn't done for simplicity.
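
A minimal sketch of the pipe approach, assuming the downloader can write to stdout and the converter can read from stdin (youtube-dl accepts `-o -` and ffmpeg accepts `pipe:0` as an input). Whether a given container format can actually be decoded while still arriving is a separate question that this sketch does not answer.

```python
import subprocess


def pipe_through(producer_cmd, consumer_cmd):
    """Run producer | consumer so the consumer starts reading before the producer ends."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close our copy so the producer gets SIGPIPE if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return consumer.returncode, output


# Example invocation (not executed here; the URL is a placeholder):
# pipe_through(["youtube-dl", "-o", "-", "https://example.com/video"],
#              ["ffmpeg", "-i", "pipe:0", "-vn", "out.mp3"])
```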


This is great. I wouldn't have thought to run FFmpeg on Lambda.

I'm going to stick with Elastic Transcoder for now though. I like that I have no upgrades to maintain and very little code. I feel like if I did this, it would take me years to recoup the cost even with a 99% savings.

But that is only because I only have a few videos a month. Roughly $1.00 on Elastic Transcoder. If I had thousands or even hundreds of videos this seems like a great and worthwhile project. Especially since this article appears to take a lot of the trial and error and proof of concept out of the mix.

I worked for a large Internet company that had a Netflix like product back in 2007. The transcoders were literally just plugged in underneath people's cubicles. Kept things nice and warm in the winter and I'm sure the costs were pretty low.


Cool but only for up to 8 minute videos. Unless you found a way to parallelize the lambda tasks.


I'm far from an FFmpeg expert, but I believe it's possible to segment the input video, transcode the segments one by one, and then concatenate them. Not sure how the segmentation and concatenation steps perform, but if that's fast, this might even improve your overall transcoding speed due to the parallelization.


Media companies are already taking this approach using ffmpeg, AWS Lambda, and AWS Step Functions. I heard from two companies using such approaches at AWS re:Invent in October 2017, so it's definitely possible.

Rolling your own approach like this is certainly more complex to build/maintain than using Elastic Transcoder though.


If you know that you'll need more than 8 minutes, why wouldn't you just run ffmpeg on EC2? EC2 is now pay-per-second. I haven't looked at the prices recently, is AWS lambda so much cheaper that it's worth jumping through all these extra hoops?


An EC2 medium instance can encode about one video at a time while still keeping up with real time (1:1 encoding speed or better). It's horrendously expensive.


Also not an expert, but since videos are transcoded as key-frames and changes applied to those key frames, I don't think it's as simple as segmenting something like a CSV. Transcoding is probably required just for the segmentation process. Putting it back together might be easier, but the final output file might also be larger because of overhead.


The segmentation code is keyframe-aware, so it only splits along keyframe edges. In other words: requesting segments of 30 seconds each probably won't get you segments that are exactly 30 seconds long. Still, there could be plenty of other obstacles I'm not aware of.
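
As a sketch of what keyframe-aligned splitting and rejoining could look like with stream copy, the helpers below only build the ffmpeg argument lists (segment muxer for splitting, concat demuxer for joining). File names and the 30-second target are illustrative assumptions, and nothing here has been benchmarked.

```python
def split_command(source, segment_seconds, pattern="segment_%03d.mp4"):
    """ffmpeg arguments that cut `source` at keyframes near each interval, without re-encoding."""
    return [
        "ffmpeg", "-i", source, "-c", "copy", "-map", "0",
        "-f", "segment", "-segment_time", str(segment_seconds),
        "-reset_timestamps", "1", pattern,
    ]


def concat_list(segment_paths):
    """Contents of a list file for ffmpeg's concat demuxer."""
    return "".join(f"file '{path}'\n" for path in segment_paths)


def join_command(list_file, output):
    """ffmpeg arguments that stitch the listed segments back together without re-encoding."""
    return ["ffmpeg", "-f", "concat", "-safe", "0", "-i", list_file, "-c", "copy", output]
```

Each segment could then be handed to its own worker for transcoding before the join step.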


Neither am I. It's pretty simple to do though, and the steps that aren't encoding are a lot quicker, as they mainly just copy the encoded files to an intermediate format and then concatenate them.

    rem Cut and re-encode two clips from the source.
    ffmpeg -i "file.mp4" -ss 01:16 -to 02:16 -c:v libx264 -crf 32 "newFile.mp4"
    ffmpeg -i "file.mp4" -ss 02:20 -to 02:45 -c:v libx264 -crf 32 "newFile1.mp4"

    rem Remux each clip to MPEG-TS without re-encoding.
    ffmpeg -i "newFile.mp4" -c copy -bsf:v h264_mp4toannexb -f mpegts temp.ts
    ffmpeg -i "newFile1.mp4" -c copy -bsf:v h264_mp4toannexb -f mpegts temp1.ts

    timeout /t 5 /nobreak

    rem Concatenate the TS files and remux the result back to MP4.
    ffmpeg -i "concat:temp.ts|temp1.ts" -c copy -bsf:a aac_adtstoasc "Finished.mp4"
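
The commands above cut two hand-picked clips. A hedged generalization, assuming you instead want contiguous chunks that separate workers can encode in parallel, is to generate one such command per time range. The chunk length, output names, and total duration below are illustrative; the encoder flags are copied from the commands above.

```python
def time_ranges(total_seconds, chunk_seconds):
    """Split [0, total_seconds) into consecutive (start, end) ranges."""
    ranges = []
    start = 0
    while start < total_seconds:
        end = min(start + chunk_seconds, total_seconds)
        ranges.append((start, end))
        start = end
    return ranges


def encode_commands(source, total_seconds, chunk_seconds):
    """One ffmpeg argument list per chunk, using the same flags as the commands above."""
    commands = []
    for index, (start, end) in enumerate(time_ranges(total_seconds, chunk_seconds)):
        commands.append([
            "ffmpeg", "-i", source, "-ss", str(start), "-to", str(end),
            "-c:v", "libx264", "-crf", "32", f"part{index}.mp4",
        ])
    return commands


for command in encode_commands("file.mp4", 100, 30):
    print(" ".join(command))
```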



Doubling the memory allocation for a lambda task also doubles the CPU allocation. It's possible it could go much faster than this.


Don't you also have to pay for traffic going to/from lambda? In that case raw audio and video would be very expensive!


Traffic to Lambda from the internet and between Lambda and S3 is free. The only thing you pay for are the transfer costs from S3 (at cents per GB).


Assuming resulting audio of 3 minutes, then 1000 uses would result in 9 GB, or about 81 cents. As long as you can get ads for $1 per mille, you should be good. That said, you'd probably need to implement something to prevent abuse (single user bypassing the frontend and just spamming your backend).

Looking forward to the next post in the series.
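
The estimate above can be reproduced with a short calculation. The 9 MB per file and $0.09 per GB figures are not stated outright; they are simply what the comment's numbers imply (9 GB over 1000 downloads, 81 cents for 9 GB), so treat them as assumptions.

```python
def egress_cost_usd(downloads, mb_per_file, usd_per_gb=0.09):
    """Transfer-out cost, using decimal gigabytes (1 GB = 1000 MB)."""
    gigabytes = downloads * mb_per_file / 1000
    return gigabytes * usd_per_gb


cost = egress_cost_usd(downloads=1000, mb_per_file=9)
print(f"${cost:.2f} per 1000 downloads")
```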


What if you use an S3 upload hook? Are you charged for bandwidth S3 -> lambda? Also, wouldn't you use the same bandwidth with elastic transcoder anyway?


When I worked at Panasonic we did this exact thing. It's remarkably easy for the cost savings.


I used FFmpeg static build to transcode WAV to mp3, but the latest 64-bit build gave me corrupt files, so I had to hunt down an archived version. Works well though!


Is it downloading the whole Youtube video just to pull out the audio? Why not just download the audio to begin with?


What is the API to download only the audio from a youtube video? How would you do your proposed solution?


Let's take a video I recently linked in another HN comment. https://www.youtube.com/watch?v=r_fxB6yrDVo If I run youtube-dl -F https://www.youtube.com/watch?v=r_fxB6yrDVo then I get a bunch of options marked "audio only DASH audio", e.g.

251 webm audio only DASH audio 143k , opus @160k, 78.96MiB

Then if I run youtube-dl -f 251 -g https://www.youtube.com/watch?v=r_fxB6yrDVo then I get this horrible URL: https://r1---sn-hxugvj5nu-cvnl.googlevideo.com/videoplayback...

However if I wget "[horrible_url]" -O audio, it still takes forever to download, so I guess rate-limiting might be the issue. But if download time is the problem, you could have one server that just downloads the data slowly to S3 and then kicks off the lambda job on the completed file.
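
To automate the manual step of reading the `-F` listing, a small parser can pull out the audio-only format codes. This sketch assumes the listing keeps the layout shown above, with the format code as the first column and the words "audio only" on the line; youtube-dl's `-f bestaudio` selector is a simpler alternative when you don't need a specific format.

```python
def audio_only_format_ids(listing):
    """Return the format codes of lines marked 'audio only' in `youtube-dl -F` output."""
    ids = []
    for line in listing.splitlines():
        if "audio only" in line:
            ids.append(line.split()[0])
    return ids


# The sample line is the one quoted in the comment above.
sample = "251 webm audio only DASH audio 143k , opus @160k, 78.96MiB"
print(audio_only_format_ids(sample))
```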


Because it's in the video container.


Most YouTube videos use the DASH protocol to serve video and audio separately, so the same audio stream can be paired with videos of different resolutions. The "youtube-dl" script can download just the audio file without downloading the video data.


Yes well, now you just have to pay the fee for licensing the codecs FFmpeg gives you ;). What was it? One million dollars for MP4? Good luck with that.



