LLMs and image AIs are the opposite of self-driving cars. For at least half a decade, "everybody" had concrete expectations that the moment self-driving cars would surpass human ability was imminent, yet the tech hasn't lived up to it (yet). Meanwhile, practically nobody was expecting AI to do the jobs of artists, programmers, or poets at anywhere near human level anytime soon, yet here we are.
Great work, congratulations! One question: if I understood it right, you based your demo on GPT-2. What has your experience been working with open-source language models, in terms of computational requirements and performance?
I'm really fascinated by all the tools the open-source community is building on top of Stable Diffusion (like OP's), which compare favourably with the latest closed-source models like DALL-E or Midjourney and can run reasonably well on a high-end home computer or a very reasonably sized cloud instance. For language models, the requirements seem substantially higher, and it's hard to match the latest GPT versions in terms of quality.
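The gap in requirements is mostly about parameter count. A rough sketch (my own numbers, not from the thread): weight memory is roughly parameters times bytes per parameter, ignoring activations and framework overhead. Stable Diffusion v1 is on the order of 1B parameters, GPT-2's largest variant is 1.5B, while GPT-3-class models are around 175B.

```python
# Back-of-envelope estimate: GPU memory needed just to hold model weights.
# Ignores activations, KV caches, and framework overhead, so real usage
# is higher. Parameter counts are approximate public figures.

def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory in GB for the weights alone (default fp16)."""
    return n_params * bytes_per_param / 1024**3

models = {
    "Stable Diffusion v1 (~1B params)": 1.0e9,
    "GPT-2 XL (1.5B params)": 1.5e9,
    "GPT-3-class LLM (175B params)": 175e9,
}

for name, n_params in models.items():
    print(f"{name}: ~{weight_memory_gb(n_params):.1f} GB at fp16")
```

At fp16, the image model and GPT-2 fit comfortably on a consumer GPU, while a 175B model needs hundreds of GB spread across many accelerators, which is roughly why the open-source story looks so different for the two.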
If LLMs (etc.) were held to the same requirements and business models as AVs, they'd still be considered a failure. Nobody expects Stable Diffusion to have a six-sigma accuracy rate*, nor do we expect ChatGPT to seamlessly integrate into a human community. The AV business model discourages individual or small-scale participation, so we wouldn't even have SD (would anyone allow a single OSS developer to drive or develop an AV? OK, there's Comma, but that's about all there is on the OSS side).
* The number of times I've seen an 'impressive' selection of AI images that I consider a critical failure deserves its own word. The AIs are impressive for even getting that far; it's just that some people have bad taste and pick the bad outputs.