I know a lot of people look down on ChatGpt. But I have been using it for creati...

amadvance · on Feb 28, 2023

It really works! With a few interactions, ChatGPT was able to create a not-so-obvious -filter_complex pipeline, like:

  ffmpeg -ss 00:01:00 -t 00:02:00 -i input1.mp4 -ss 00:03:00 -t 00:02:30 -i input2.mp4 -filter_complex "[0:v][0:a][1:v][1:a]concat=n=2:v=1:a=1[v][a];[v]eq=brightness=0.3[outv]" -map "[outv]" output.mp4

m3kw9 · on Feb 28, 2023

I asked gpt3 and here is what it said:

This command uses FFmpeg to create a single output video file from two input video files. It starts by specifying a start and duration for each of the two files (input1.mp4 and input2.mp4). It then applies a filter complex to the two files, which combines the two videos and audio into one stream, and adds a brightness filter with a value of 0.3. Finally, it maps the output video stream to the output file (output.mp4).

pjc50 · on Feb 28, 2023

And what does all that filter_complex do?

suzumer · on Feb 28, 2023

-filter_complex specifies a series of filters that accept inputs and return outputs. Any value contained in brackets ([]) is a value that can be input or output by a filter. [0:v], [0:a], [1:v], and [1:a] are values supplied by ffmpeg representing the video and audio streams of the 1st and 2nd inputs, in this case input1.mp4 and input2.mp4.

The first filter, concat, takes in a set of synchronized audio and video segments, concatenates them, and returns the resulting audio and video clips. n specifies the number of segments, v specifies the number of output video clips, and a specifies the number of output audio clips. The results are saved to the values of [v] and [a] for video and audio respectively.

The eq filter then takes the [v] video returned by concat, applies a brightness of 0.3, and returns the result as [outv]. For reference, 0 represents no change to the brightness.

The resulting [outv] value is then mapped to the output video using -map.

That being said, this filter isn't correct, as the [a] value is never used or mapped, so the filter would fail. The correct way to write the filter, if the intended use is to discard the audio, would be:

  -filter_complex "[0:v][1:v]concat[v];[v]eq=brightness=0.3[outv]"

I omitted the n, v, and a values in the concat filter, as they are by default 2, 1, and 0 respectively.

Another way to visualize this filter in an imperative style would look like this:

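  # concat() and eq() are hypothetical stand-ins for the ffmpeg filters
  def filter(input0, input1):
      v = concat(input0.v, input1.v)   # join the two video streams
      outv = eq(v, brightness=0.3)     # brighten the joined video
      return outv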
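
If you would rather assemble the filter string than type it by hand, here is a rough Python sketch. The helper name build_graph and its keep_audio parameter are made up for illustration; it only formats strings and does not call ffmpeg. Only the ffmpeg syntax itself (the concat and eq options, and -map) comes from the command above. With keep_audio=True it also maps [a], which is the other way to fix the original command if you want to keep the audio.

```python
def build_graph(n_inputs=2, brightness=0.3, keep_audio=False):
    """Return (filter_complex string, -map arguments) for concat followed by eq.

    Hypothetical helper, not part of ffmpeg: it only builds strings.
    """
    pads = []
    for i in range(n_inputs):
        pads.append(f"[{i}:v]")
        if keep_audio:
            # concat takes each segment's streams together: video, then audio
            pads.append(f"[{i}:a]")
    audio_streams = 1 if keep_audio else 0
    concat_out = "[v][a]" if keep_audio else "[v]"
    graph = (
        f"{''.join(pads)}concat=n={n_inputs}:v=1:a={audio_streams}{concat_out};"
        f"[v]eq=brightness={brightness}[outv]"
    )
    # Every labelled output of the graph gets its own -map
    maps = ["-map", "[outv]"]
    if keep_audio:
        maps += ["-map", "[a]"]
    return graph, maps


graph, maps = build_graph(keep_audio=True)
print(graph)
print(" ".join(maps))
```

The graph string goes after -filter_complex, and the maps list goes before the output filename.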

pancrufty · on Feb 28, 2023

I think you can ask ChatGPT that in a follow-up question ;)

nolok · on Feb 28, 2023

I understand what you mean.

But I also understand my sister doesn't need to know how her phone does any of what it does to play Candy Crush or read her emails.

Just like she doesn't need to know how a microwave works to reheat her meal.

If you want to know how things are done, of course get yourself involved in the details, but for most things in life you just want to use them without bothering with the details, so you can focus on the parts that are of interest to you.

(I know some people like to know the details of everything, and maybe you are one of them, and that's great, but the vast majority of people do not)

nicky0 · on Feb 28, 2023

Yes, ChatGPT excels at comprehending and explaining things that have a consistent structure, at restructuring them, and at synthesising variations. If you keep it in its lane, it’s an excellent tool.

It’s really really bad at counting though. For example, try asking it to produce a line of 40 asterisks.

gamegoblin · on Feb 28, 2023

It’s bad at counting because counting relies on a stateful O(N) algorithm you run in your brain.

GPT is trained to reproduce human text, which tends to simply have the output of this O(N) counting process, but not the process itself. So GPT “thinks” it should be able to just spit out the number just like human text implies we do. It doesn’t know we are relying on an offline O(N) algorithm.

If you have it emit a numbered list of 40 elements, it will succeed, because producing a numbered list embeds the O(N) process and state into the text, which is the only thing it can see and reason about.
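
A toy way to picture this in Python. To be clear, this is not how GPT works internally; next_chunk is a made-up stand-in for a stateless step that can only read the text produced so far. The point is just that the counter lives in the text, so each step can recover it without any hidden memory.

```python
from typing import Optional


def next_chunk(text_so_far: str, target: int) -> Optional[str]:
    """Stateless step: everything it knows comes from text_so_far."""
    lines = text_so_far.splitlines()
    # The counter is read back from the last numbered line, not from memory.
    last = int(lines[-1].split(".")[0]) if lines else 0
    if last >= target:
        return None  # the text itself says we are done
    return f"{last + 1}. *\n"


def numbered_list(target: int) -> str:
    """Call the stateless step repeatedly, feeding it its own output."""
    text = ""
    while True:
        chunk = next_chunk(text, target)
        if chunk is None:
            return text
        text += chunk


out = numbered_list(40)
print(out.count("*"))
```

Take the numbers out of the text and next_chunk has nothing to read the count from, which is roughly the situation with a bare line of asterisks.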

nicky0 · on Feb 28, 2023

That’s very interesting. I assumed it was something about the fact that it is a language model rather than a calculating machine. So printing 44 asterisks instead of 40 is kind of close.

I wonder whether it would be possible to teach the machine to recognise the situations it’s better at and be less confident about other answers. Or does it need to be confident about everything in order to produce good answers where it does?

It’s kind of funny how confident ChatGPT is about giving out bullshit, and then even when you correct it, it says oh I’m terribly sorry, here is definitely the correct answer this time and then it gives you another wrong answer. Just an observation; I realise it is just a tool whose limitations you have to understand.

OmarAssadi · on Feb 28, 2023

> here is definitely the correct answer this time and then it gives you another wrong answer.

My favorite is when it gets into some weird context loop, apologizes and claims to have corrected an issue, but gives you literally, character-for-character, the same answer it gave before.

Fortunately, it mostly happens to me when I am asking particularly ambiguous or weird questions -- e.g., asking for any assembly in AT&T/GAS syntax seems to always go wrong, not necessarily in terms of the logic itself, but rather that it ends up mixing Intel and AT&T, or asking explicitly for POSIX-compliant shell often gives weird Bash/GNUisms, presumably since so many StackOverflow posts seem to conflate all shells with Bash and always expect GNU coreutils.

jack_pp · on Feb 28, 2023

We can check our answers: we can spit out bullshit like it does, but then take the time to check it. It has no process for checking its answers or analyzing them, and I'd rather not ask it how confident it is because that's just not what I care about.

I find it amazing that it can actually sort of run code "in its head": none of the code output it produces is actually run through an interpreter, but it's still pretty close, if not perfect, each time. But trying to run code with it is mostly for kicks. Rather, I asked it to produce a simple API for me and then produce a Python script that tests it. It had no bugs and I could check it myself fairly fast; certainly faster than it would've taken me to write all that code without any bugs. I'd have had to check my own code for bugs anyway.

So if you accept that ChatGPT is sort of like a guy who looked over millions of programmers' shoulders but never actually communicated with any of them to understand the code, and who has a perfect memory but can't compute much in his head, then it can still be a great tool. Just understand its limitations and its advantages. Just because it can't reverse a string in its head doesn't mean it's "dumb" or not useful for everyday tasks.

nicky0 · on Feb 28, 2023

I code with GitHub Copilot. I liken it to pair programming with a brilliant, insightful & more experienced colleague who is always slightly drunk.

robin_reala · on Feb 28, 2023

So basically a chat routine that’s been designed to hit the Ballmer peak.

cubefox · on March 1, 2023

Note that language models get much better at pretty much any reasoning task when they are prompted to use chain-of-thought (CoT) reasoning. The difference between "Solve x" and "Solve x, let's think step by step" comes from the language model using the context window as short-term memory in some sense. Perhaps your explanation in terms of complexity is better, but I'm not sure whether it explains the effectiveness of CoT in general.

Wowfunhappy · on Feb 28, 2023

Shouldn't RLHF help with this? So it learns that when people specify a number, they mean something very specific.

gamegoblin · on Feb 28, 2023

You cannot RL learn an O(N) algorithm in an O(1) feed forward neural network.

You could RL learn that when someone specifies a number, the appropriate thing to say is "Ok, 40 asterisks, let's count them, 1, *, 2, *, 3, *, ..." and then it would indeed produce 40 asterisks. But not as a single string. Because producing them as a single contiguous string requires some offline state/memory/processing, and all the neural network has access to is the last ~page of text.

Embedding the counting process into the text itself kind of embeds the state of the O(N) algorithm in the O(N) text itself, that is, "unrolling the loop" externally.
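
Sketching that in Python (toy code, the function names are mine, and nothing here calls a model): the counted transcript carries its own loop counter, and collapsing it into a single contiguous string is a plain post-processing step done outside the network.

```python
def counted_transcript(n: int) -> str:
    """Build '1, *, 2, *, ..., n, *' with the loop counter written into the text."""
    parts = []
    for i in range(1, n + 1):
        parts.append(str(i))  # the state, made visible
        parts.append("*")     # the payload
    return ", ".join(parts)


def strip_counters(transcript: str) -> str:
    """Keep only the asterisks, dropping the visible counters."""
    return "".join(tok for tok in transcript.split(", ") if tok == "*")


stars = strip_counters(counted_transcript(40))
print(len(stars))
```

The for loop in counted_transcript plays the role of the "unrolled" process; the model would have to write those counters out itself, one token at a time.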

brookst · on Feb 28, 2023

It doesn’t have any logic; it just tries to complete strings in the most plausible way. Its training material probably did not have a lot of “write five at signs: @@@@@”. RLHF might help steer it in the right direction, but probably wouldn’t produce the concept of counting or loops.

Wowfunhappy · on Feb 28, 2023

So, this is where I guess I just don't understand. I've had ChatGPT produce code for me that there is absolutely no way it already had in its training set. I realize it can't actually "think", but then I also don't know how to describe what I'm seeing.

Sunspark · on Feb 28, 2023

It gave me 40 short Gaulish warriors..

beepbooptheory · on Feb 28, 2023

Just be sure to thank all those stackoverflow repliers for this!

quijoteuniv · on Feb 28, 2023

Agree! I used ChatGPT to explain some (uncommented) ffmpeg scripts I wrote a few years ago. The scripts were created by going through many websites and adapting what I found to my needs. The explanation from ChatGPT was spot on.