Short post: On Windows, UTF-16 was the dominant locale, and UTF-8 was something only to convert to and from. (Microsoft jumped the gun before Unicode expanded the address space.) While it got better (Windows 10 can use UTF-8 as an MBCS locale with ANSI APIs), it was historically a lot worse.
For converting, you’d use the MultiByteToWideChar
and its opposite WideCharToMultiByte
. On legacy Windows, they have slightly confusing semantics. Specifically, with flags. While Vista on introduced many flags that can be used with the UTF-8 codepage (to deal with the quirks of conversion, like invalid characters), previously only MB_ERR_INVALID_CHARS
was allowed, and only if you were running XP or 2000 SP4. Before that, you can’t have any flags if you’re converting to or from UTF-16 and UTF-8. It’s unfortunately a little dangerous, but that’s the rub.