
Which is a deduplicated version of this: https://www.bigcode-project.org/docs/about/the-stack/

And probably, yes. While it contains 358 programming languages, there's obviously a long tail after the 20 most-represented ones. What some people might not expect, without thinking about it for a bit, is that many of the most-represented "languages" are actually things like JSON, XML, HTML, CSV, text, Markdown, YAML, and SVG.

Also note that it won't be able to parse natural language nearly as well without additionally being trained on something like the LAION dataset, so this version will be more of an autocomplete like Copilot than something that can conjure high-level business logic out of whole cloth like ChatGPT.


