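EF BB BF UTF-8 with BOM (variable length, 1-4 bytes per character, ASCII-compatible)
FE FF UTF-16/UCS-2, big endian (UCS-2 is fixed at 2 bytes; UTF-16 uses 2 or 4 bytes)
FF FE UTF-16/UCS-2, little endian (UCS-2 is fixed at 2 bytes; UTF-16 uses 2 or 4 bytes)
FF FE 00 00 UTF-32/UCS-4, little endian (fixed 4 bytes)
00 00 FE FF UTF-32/UCS-4, big endian (fixed 4 bytes)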
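
As a minimal sketch of how the byte order marks listed above could be checked in practice, the Python snippet below compares the start of a byte string against the BOM constants from the standard `codecs` module. The function name `detect_bom` and its return shape are illustrative assumptions, not part of any standard API. Note that the UTF-32 little-endian mark begins with the same two bytes as the UTF-16 little-endian mark, so the 4-byte marks have to be tested first.

```python
import codecs

# Order matters: FF FE 00 00 (UTF-32 LE) starts with FF FE (UTF-16 LE),
# so the 4-byte marks are tested before the 2-byte ones.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (codec_name, bom_length), or (None, 0) if no BOM is present.

    Hypothetical helper for illustration only.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_with_bom(data: bytes, default: str = "utf-8") -> str:
    """Strip a leading BOM, if any, and decode with the matching codec."""
    name, skip = detect_bom(data)
    return data[skip:].decode(name or default)
```

A file with no BOM returns `(None, 0)`, in which case the caller has to pick an encoding some other way; the `default` parameter above is one assumed fallback. A 2-byte little-endian text whose first character is U+0000 would also be read as UTF-32 LE by this check, which is a known ambiguity of BOM sniffing.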
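UTF-8 encoding scheme:
One byte: 0xxxxxxx (U+0000 to U+007F; identical to ASCII for compatibility)
Two bytes: 110xxxxx 10xxxxxx (U+0080 to U+07FF)
Three bytes: 1110xxxx 10xxxxxx 10xxxxxx (U+0800 to U+FFFF)
Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx (U+10000 to U+10FFFF)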
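
The bit patterns above can be sketched as a small hand-written encoder in Python. This is for illustration only; the function name `utf8_encode` is an assumption, and real code would simply call `str.encode("utf-8")`. The `x` positions in each pattern are filled with the bits of the code point, most significant bits first, 6 bits per continuation byte.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 (illustrative sketch)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        # Out of range, or a surrogate, which UTF-8 does not encode.
        raise ValueError(f"not an encodable code point: {cp!r}")
    if cp <= 0x7F:
        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

For example, U+4E2D (中) falls in the three-byte range and comes out as the bytes E4 B8 AD, the same result as Python's built-in encoder.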