UTF-16 is the internal encoding of ICU to this day. If you're using ICU, you're using UTF-16: the library treats UTF-8 as a conversion target rather than a native representation. If you ever see a new project pick UTF-16 and you don't know why, it's probably because of ICU; any other choice forces a round-trip conversion on nearly every ICU call. If you pick UTF-16, you can just use icu::UnicodeString as your string representation and life is easy.
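To make the "round trip" concrete, here is a rough sketch of the transcoding a UTF-8 codebase ends up doing around a UTF-16 API. It uses only the standard library so it can be read on its own; `utf8_to_utf16` and `utf16_to_utf8` are hypothetical helpers written for illustration, not ICU functions, and they assume well-formed input with no error handling. In real ICU code the equivalent steps would be `icu::UnicodeString::fromUTF8` on the way in and `toUTF8String` on the way out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of ICU): decode well-formed UTF-8 into UTF-16.
// Assumes valid input; malformed sequences are not detected.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char lead = static_cast<unsigned char>(in[i++]);
        char32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (lead < 0x80)      { cp = lead;        extra = 0; }
        else if (lead < 0xE0) { cp = lead & 0x1F; extra = 1; }
        else if (lead < 0xF0) { cp = lead & 0x0F; extra = 2; }
        else                  { cp = lead & 0x07; extra = 3; }
        for (int k = 0; k < extra && i < in.size(); ++k, ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(in[i]) & 0x3F);
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}

// Hypothetical helper (not part of ICU): encode well-formed UTF-16 as UTF-8.
std::string utf16_to_utf8(const std::u16string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        // Recombine a high/low surrogate pair into one code point.
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size()) {
            char32_t low = in[++i];
            cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
        }
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

Each direction is a full pass over the string plus a fresh allocation, which is the cost you pay per call if your strings live in UTF-8 and the API wants UTF-16. Keeping everything in UTF-16 skips both passes.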