not really. according to https://en.wikipedia.org/wiki/UTF-8 and https://en.wikipedia.org/wiki/File:Roadmap_to_Unicode_BMP.sv..., any codepoint at or above 0x80 (second half of the first box) takes 2 bytes per character, and any codepoint at or above 0x800 (anything past the 8th box) takes 3 bytes per character. so while it might be fair for CJK languages, it's even less fair for languages that don't mostly use the Latin alphabet.
??? = 9 bytes
how are you = 11 bytes
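
if you want to check the byte counts yourself, here's a rough python sketch of the UTF-8 length rule described on the wikipedia page. `utf8_len` and `byte_count` are names i made up for illustration, and the cyrillic and chinese sample strings are my own picks, not taken from anything above.

```python
def utf8_len(cp: int) -> int:
    """Number of bytes UTF-8 uses for one code point."""
    if cp < 0x80:       # ASCII range
        return 1
    if cp < 0x800:      # e.g. Latin extensions, Greek, Cyrillic, Hebrew, Arabic
        return 2
    if cp < 0x10000:    # rest of the BMP, including most CJK
        return 3
    return 4            # supplementary planes (emoji, rare CJK, ...)


def byte_count(text: str) -> int:
    """Total UTF-8 size of a string, summed per code point."""
    return sum(utf8_len(ord(ch)) for ch in text)


# hypothetical sample strings, chosen only to hit each length class
for sample in ["how are you", "привет", "你好吗"]:
    # cross-check the hand-written rule against Python's own encoder
    assert byte_count(sample) == len(sample.encode("utf-8"))
    print(f"{sample} = {byte_count(sample)} bytes")
# how are you = 11 bytes
# привет = 12 bytes
# 你好吗 = 9 bytes
```

the per-character cost is fixed by the codepoint range, so the total only depends on which script the text is written in and how many characters it has.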