Hacker News

Who closes the gate? Is it Claude itself after it runs the verification? Who makes sure the verification did in fact run?


I usually have Claude confirm with me, but I've seen it close the gate on its own, for example when it's a unit test that passed.


You can't trust it 100%. Sometimes it will just refuse to fix a compiler or lint warning (often saying "This was a pre-existing issue...") or write a trivial test that does nothing and always passes.


> writes code with a lot of warnings
> compacts
> "This was a pre-existing issue..."

I still take this over writing code myself though.


I'm not saying you shouldn't. I'd say 70% of my work code is written by Claude Code or Codex. But this is something you should be aware of when interacting with agents.


Point being that there are multiple gates to one story, including human testing as one of them.
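To make the "multiple gates" idea concrete, here is a rough Python sketch, not anyone's actual setup. `Gate`, `try_close`, `story_done`, and `example_story` are hypothetical names invented for illustration. The assumption is that an automated gate is closed only by a runner that executes the check itself and reads the exit code, never by the agent reporting that it passed, and that the human-testing gate closes only on explicit approval.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gate:
    """One check on a story (hypothetical structure, for illustration only)."""
    name: str
    command: Optional[List[str]] = None  # None marks a human sign-off gate
    closed: bool = False

    def try_close(self, human_approved: bool = False) -> bool:
        if self.command is None:
            # Human gate: only an explicit approval closes it.
            self.closed = human_approved
        else:
            # Automated gate: run the check here and trust only the exit code,
            # not whatever the agent claims about the result.
            result = subprocess.run(self.command, capture_output=True)
            self.closed = result.returncode == 0
        return self.closed


def story_done(gates: List[Gate]) -> bool:
    """A story is done only when every one of its gates is closed."""
    return all(gate.closed for gate in gates)


def example_story() -> List[Gate]:
    # Stand-in commands; a real story would run its test suite or linter here.
    return [
        Gate("unit tests", [sys.executable, "-c", "raise SystemExit(0)"]),
        Gate("human testing"),
    ]
```

Calling `try_close()` on each gate from `example_story()` closes the unit-test gate but leaves the story open until someone calls `try_close(human_approved=True)` on the human gate. This does not catch a trivial test that always passes; that is the part the human gate is for.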



