Similarly, you can compile ffmpeg on Lambda, in 0.5 minutes, for 9 cents.[1] Versus 10 minutes on one core, for ~free. And while -j200 on ffmpeg is nice, -j1000 on the Linux kernel is... wow, like seeing the future.
Very cool. I hope more of the "value add" stuff from cloud providers ($$$) can be replaced with open source running on their cloud functions. My suggestions:
* FFmpeg supports http/https as input protocols if compiled with the options enabled. See `ffmpeg -protocols`
* Try larger memory sizes. On Lambda, larger memory also means more CPU, which may mean shorter transcodes. You might even pay about the same amount if the transcodes are CPU-bound and finish in roughly linear time with respect to CPU
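For example, a minimal sketch of transcoding straight from an HTTPS URL, assuming a build with the HTTPS protocol enabled (the input URL is a placeholder):

```shell
# Read directly over HTTPS -- no separate download step needed
ffmpeg -i "https://example.com/input.wav" -codec:a libmp3lame -b:a 128k out.mp3
```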
This is not "cool" - this is either doing less than Amazon's encoding service or just exposing the pricing model. If it's cheaper they could charge less, unless they are terrible programmers who can't even use their own lambda functions.
Next installment: "Running FFmpeg on a tower under my desk for 1% the cost of AWS Lambda".
Is there any effort toward on-prem Lambda-style stuff yet? I know it's a moving target, but I wouldn't recommend getting into cloud stuff you can't migrate out of.
Yes! If one uses something like Kubeless[1], you have something like AWS Lambda, but where the backend is Kubernetes rather than AWS. Yes you're still on a framework, but it's a vendor-agnostic, open-source one. There are some other attempts at similar things, too. I am partial to this one for now.
This is what I see as the true power of kubernetes -- once people start developing high quality (hopefully open source) applications for platforms like kubernetes, providers like AWS should lose their "value-added" benefits, and be reduced to more like colo providers, maybe offering 24/7 support as well.
That will be when the ubiquitous cloud truly arrives -- run on whatever provider in the sky, and as long as they run kubernetes you can run your workloads there.
At a pure level, Lambda "integration" is essentially an interface for passing in runtime arguments+environment to your application. If you're concerned about lock-in, you can write or integrate another entry point that executes your function on on-prem Tomcat or whatever other environment (Cloud Functions?) you want to run in.
The much harder challenge would be provisioning thousands of on-prem servers to handle the load, but I wouldn't necessarily qualify a dependency on cloud-like autoscaling as lock-in.
I guess the lock-in might come as you unwittingly couple yourself to the intricacies of the rest of AWS's offerings as your app architecture grows more complex.
Very misleading title. Elastic Transcoder pricing applies primarily to video. This tutorial only covers audio transcoding, which is much, much less resource-intensive.
Nothing misleading here, they compared their project's cost to the Elastic Transcoder audio pricing.
Elastic Transcoder Audio is $0.00450 per minute [1], this article says that with Lambda it cost "$0.00008273 per minute of audio, a full factor of 54 times less than Elastic Transcoder."
Misleading title. The article is about audio encoding, not video. Better: “Using FFmpeg on AWS Lambda for audio encoding at 1.9% of the cost of AWS Elastic Transcoder”
This works just fine on AWS Lambda. The `ffmpeg` binary there weighs in at 46 MB. Unless you need something not bundled with that build, it seems like this is sufficient and is easier to set up.
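For reference, a rough sketch of how one might bundle a static build into a deployment package (johnvansickle.com is one commonly used source of static builds; `handler.py` is a placeholder for your own handler):

```shell
# Fetch a static ffmpeg build and extract just the binary
curl -L -o ffmpeg.tar.xz \
  https://johnvansickle.com/ffmpeg/releases/ffmpeg-release-amd64-static.tar.xz
tar -xf ffmpeg.tar.xz --strip-components=1 --wildcards '*/ffmpeg'

# Bundle it next to your handler in the Lambda deployment package
zip -r lambda-package.zip ffmpeg handler.py
```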
It looks like the youtube download has to complete before ffmpeg can start; is there a way to start processing the head while the tail is still being written?
This problem comes up a lot with storage blobs. The bigger they are, the more it hurts to serialize the write and the read.
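One possible workaround is streaming the download straight into ffmpeg over a pipe, so encoding starts as soon as the first bytes arrive. A sketch, assuming youtube-dl is available (the video ID is a placeholder, and the input container has to be streamable, e.g. not an MP4 with its moov atom at the end):

```shell
# Pipe the download into ffmpeg's stdin; encoding begins before the
# download finishes instead of waiting for the whole file.
youtube-dl -f bestaudio -o - 'https://www.youtube.com/watch?v=VIDEO_ID' \
  | ffmpeg -i pipe:0 -codec:a libmp3lame -b:a 128k out.mp3
```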
This is great. I wouldn't have thought to run FFmpeg on Lambda.
I'm going to stick with Elastic Transcoder for now though. I like that I have no upgrades to maintain and very little code. I feel like if I did this, it would take me years to recoup the cost even with a 99% savings.
But that is only because I only have a few videos a month. Roughly $1.00 on Elastic Transcoder. If I had thousands or even hundreds of videos this seems like a great and worthwhile project. Especially since this article appears to take a lot of the trial and error and proof of concept out of the mix.
I worked for a large Internet company that had a Netflix-like product back in 2007. The transcoders were literally just plugged in underneath people's cubicles. Kept things nice and warm in the winter, and I'm sure the costs were pretty low.
I'm far from an FFmpeg expert, but I believe it's possible to segment the input video, transcode the segments one by one, and then concatenate them. Not sure how the segmentation and concatenation steps perform, but if that's fast, this might even improve your overall transcoding speed due to the parallelization.
Media companies are already taking this approach using ffmpeg, AWS Lambda, and AWS Step Functions. I heard from two companies using such approaches at AWS re:Invent in October 2017, so it's definitely possible.
Rolling your own approach like this is certainly more complex to build/maintain than using Elastic Transcoder though.
If you know that you'll need more than 8 minutes, why wouldn't you just run ffmpeg on EC2? EC2 is now pay-per-second. I haven't looked at the prices recently, is AWS lambda so much cheaper that it's worth jumping through all these extra hoops?
Also not an expert, but since video is encoded as keyframes plus changes applied to those keyframes, I don't think it's as simple as segmenting something like a CSV. Transcoding is probably required just for the segmentation process. Putting it back together might be easier, but the final output file might also be larger because of overhead.
The segmentation code is keyframe-aware, so it only splits along keyframe edges. In other words: requesting segments of 30 seconds each probably won't get you segments that are exactly 30 seconds long. Still, there could be plenty of other obstacles I'm not aware of.
Neither am I. It's pretty simple to do, though, and the steps other than encoding are a lot quicker, since they mainly just copy the encoded streams to an intermediate format and then concatenate them.
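Not an authoritative recipe, but a minimal sketch of the split/transcode/join approach using ffmpeg's segment and concat (de)muxers (filenames are placeholders):

```shell
# 1. Split along keyframes into ~30 s chunks without re-encoding:
ffmpeg -i input.mp4 -codec copy -f segment -segment_time 30 \
  -reset_timestamps 1 chunk_%03d.mp4

# 2. Transcode each chunk in parallel (e.g. one Lambda invocation per
#    chunk), producing chunk_000.out.mp4, chunk_001.out.mp4, ...

# 3. Stitch the transcoded chunks back together with the concat demuxer:
for f in chunk_*.out.mp4; do echo "file '$f'"; done > list.txt
ffmpeg -f concat -safe 0 -i list.txt -codec copy output.mp4
```

Because `-codec copy` splits only at keyframes, the chunks won't be exactly 30 seconds each, matching the caveat above.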
Assuming resulting audio of 3 minutes, then 1000 uses would result in 9 GB, or about 81 cents. As long as you can get ads for $1 per mille, you should be good. That said, you'd probably need to implement something to prevent abuse (single user bypassing the frontend and just spamming your backend).
What if you use an S3 upload hook? Are you charged for bandwidth S3 -> lambda? Also, wouldn't you use the same bandwidth with elastic transcoder anyway?
I used FFmpeg static build to transcode WAV to mp3, but the latest 64-bit build gave me corrupt files, so I had to hunt down an archived version. Works well though!
However if I wget "[horrible_url]" -O audio, it still takes forever to download, so I guess rate-limiting might be the issue. But if download time is the problem, you could have one server that just downloads the data slowly to S3 and then kicks off the lambda job on the completed file.
Most YT videos use the DASH protocol to serve you video and audio separately. That's because the same audio stream can be reused with different resolution videos. The youtube-dl script can download just the audio file, without downloading video data.
Yes well, now you just have to pay the fee for licensing the codecs FFmpeg gives you ;). What was it? One million dollars for MP4? Good luck with that.
[1] demo in a talk: https://www.youtube.com/watch?v=O9qqSZAny3I&t=55m15s (the actual run (sans uploading) is at https://www.youtube.com/watch?v=O9qqSZAny3I&t=1h2m58s ); code: https://github.com/StanfordSNR/gg ; some slides (page 24): http://www.serverlesscomputing.org/wosc2/presentations/s2-wo...