GLM-4.7-Flash was the first local coding model that I felt was intelligent enough to be useful. It feels something like Claude 4.5 Haiku, at a parameter size where other coding models are still getting into loops and making bewilderingly stupid tool calls. It also has very clear reasoning traces that feel like Claude's, which makes it possible to inspect its reasoning and figure out why it made certain decisions.
So far I haven't managed to get comparably good results out of any other local model, including Devstral 2 Small and the more recent Qwen-Coder-Next.
Slightly off topic. I had a hard time getting models to run with ollama, and I thought that my computer (32 GB RAM, GTX 4070 with 12 GB VRAM) just couldn't do it. Then I tried LM Studio, and after fiddling with some settings, I got models running, and quite fast. I didn't try GLM-4.7 Flash, but I did try GLM-4.6v Flash, and it was amazing to see it analyze all kinds of images (since it has vision support). I was simply stunned. I can't believe that a simple gaming machine can do many of the things I used cloud models for. It was absolutely strikingly good at guessing the locations of photos, even vague ones: deducing landmarks, writing, types of traffic signs. I need to try 4.7 Flash. Hopefully it can run fast on my machine.
I'm not sure what it is about GLM 4.7 Flash, but it definitely seems to nail a sweet spot. Even the supposedly frontier models make a mess of large requests, so small, well-scoped requests are the way, IMO; and in that space, 4.7 Flash holds its own better than it has any right to.