February 21, 2020 ~ 1 min read

UTF-8 bit sequences


How does UTF-8 do multi-byte encoding? With bit sequences!

If a byte starts with 0, it's a complete single-byte (ASCII) character.
If a byte starts with 110, it's the first byte of a two-byte character.
If a byte starts with 1110, it's the first byte of a three-byte character.
If a byte starts with 11110, it's the first byte of a four-byte character.
If a byte starts with 10, it's a continuation byte inside a multi-byte character sequence.
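
The prefix rules above can be sketched in a few lines of Python. This is only an illustration, not a full decoder: the names classify_byte and describe, and the labels they return, are made up for this example, and the code does not check that a sequence is well-formed.

```python
def classify_byte(byte: int) -> str:
    """Classify one byte (0-255) by its leading bits.

    The labels returned here are hypothetical names chosen for this sketch.
    """
    if byte >> 7 == 0b0:
        return "single"        # 0xxxxxxx: one-byte (ASCII) character
    if byte >> 6 == 0b10:
        return "continuation"  # 10xxxxxx: continues a multi-byte character
    if byte >> 5 == 0b110:
        return "lead-2"        # 110xxxxx: starts a two-byte character
    if byte >> 4 == 0b1110:
        return "lead-3"        # 1110xxxx: starts a three-byte character
    if byte >> 3 == 0b11110:
        return "lead-4"        # 11110xxx: starts a four-byte character
    return "invalid"           # 11111xxx: matches none of the prefixes


def describe(text: str) -> list:
    """Return (bit string, label) pairs for each UTF-8 byte of text."""
    return [(f"{b:08b}", classify_byte(b)) for b in text.encode("utf-8")]


# "\u00f1" is the letter n with a tilde, which UTF-8 encodes as two bytes.
for bits, kind in describe("\u00f1"):
    print(bits, kind)
# 11000011 lead-2
# 10110001 continuation
```

Checking the prefixes from shortest to longest keeps each test to a single shift and compare, and anything that matches none of them falls through to "invalid".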

Sebastian Bolaños

Hi, I'm Sebastian. I'm a software developer from Costa Rica. You can follow me on Twitter. I enjoy working on distributed systems.