
This is ALWAYS broken on Unicode. It only makes sense in ASCII and with alphabets similar to the Latin alphabet.


It's not broken if "character" means "extended grapheme cluster".

Though in that case it becomes very hard to index by column.
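
To make the indexing problem concrete, here is a rough Python sketch. The helper `approx_clusters` is hypothetical and only an approximation: it glues combining marks onto the preceding code point, which is not the full extended grapheme cluster algorithm from UAX #29 (it ignores ZWJ emoji sequences, Hangul jamo, regional indicators, and so on).

```python
import unicodedata

def approx_clusters(text):
    """Group each code point with the combining marks that follow it.

    Hypothetical helper: a simplified stand-in for real extended
    grapheme cluster segmentation, good enough for this example only.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch  # attach the mark to the previous cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters

s = "cafe\u0301 ok"  # "café ok" with a decomposed é (e + U+0301)

print(len(s.encode("utf-8")))   # 9 bytes
print(len(s))                   # 8 code points
print(len(approx_clusters(s)))  # 7 user-visible "characters"

# Column 4 (0-based) as a reader sees it is the space,
# but code point index 4 is the bare combining accent.
print(repr(approx_clusters(s)[4]))  # ' '
print(repr(s[4]))                   # '\u0301'
```

Note that finding cluster N means scanning from the start of the line, so a "column" lookup is linear rather than a constant-time array index.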


Nobody cares about them, I am afraid. Most people _really_ can't accept that you can't work on Unicode like you did with ASCII, and do not want to let go of their old C habits (like iterating char by char and doing something with each one).
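
A small Python sketch of what that C habit does to UTF-8, assuming "char" means a byte as it does in C. The helper name `decodes_cleanly` is made up for the example.

```python
text = "na\u00efve"          # "naïve" with a precomposed ï (U+00EF)
data = text.encode("utf-8")  # what a C char* would actually hold

def decodes_cleanly(raw):
    """Return True if raw is valid UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(len(text))  # 5 code points
print(len(data))  # 6 bytes, because ï takes two bytes in UTF-8

# "Keep the first 3 chars", done byte by byte, cuts ï in half.
print(data[:3])                   # b'na\xc3'
print(decodes_cleanly(data[:3]))  # False
print(decodes_cleanly(data[:4]))  # True, the cut lands after ï
```

Cutting on code point boundaries avoids the invalid bytes, but it can still split a base letter from its combining marks, so it is not a complete fix either.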



