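Chinese text entered through RF arrives as a unicode object (here 文件到达成功, "file arrived successfully"): u'\u6587\u4ef6\u5230\u8fbe\u6210\u529f'
Chinese text queried from the orc database arrives as a GBK-encoded byte string (a str object; the bytes are not ASCII): '\xce\xc4\xbc\xfe\xb5\xbd\xb4\xef\xb3\xc9\xb9\xa6'
Calling decode('gbk') on the byte string yields the unicode object;
calling encode('gbk') on the unicode object yields the GBK-encoded byte string.
For more details, see: http://www.2cto.com/kf/201407/317866.html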
GBK encoding and decoding:
>>> a='\xce\xc4\xbc\xfe\xb5\xbd\xb4\xef\xb3\xc9\xb9\xa6'
>>> a
'\xce\xc4\xbc\xfe\xb5\xbd\xb4\xef\xb3\xc9\xb9\xa6'
>>> b=a.decode('gbk') # GBK decoding; the result is a unicode object
>>> b
u'\u6587\u4ef6\u5230\u8fbe\u6210\u529f'
>>> print u'\u6587\u4ef6\u5230\u8fbe\u6210\u529f'
文件到达成功
>>> c=b.encode('gbk') # GBK encoding; the result is a str (byte string) object, not ASCII
>>> c
'\xce\xc4\xbc\xfe\xb5\xbd\xb4\xef\xb3\xc9\xb9\xa6'
>>> a==c
True
>>> print c
文件到达成功
>>>
>>> print a.decode('utf-8')
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    print a.decode('utf-8')
  File "D:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xce in position 0: invalid continuation byte
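The traceback above shows what happens when GBK bytes are decoded with the wrong codec. A minimal sketch of one way to guard against that, written for Python 3 (where bytes plays the role of Python 2's str and str the role of unicode), might look like the following. The helper name decode_with_fallback and the candidate order are assumptions made for illustration, not something taken from the session above.

```python
# Python 3 sketch. GBK_BYTES is the same byte sequence used in the session above.
GBK_BYTES = b'\xce\xc4\xbc\xfe\xb5\xbd\xb4\xef\xb3\xc9\xb9\xa6'


def decode_with_fallback(raw, encodings=('utf-8', 'gbk')):
    """Hypothetical helper: try each candidate codec in order.

    Returns (text, encoding_used) for the first codec that decodes
    without error; raises ValueError if none of them works.
    """
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the candidate encodings could decode the data')


text, used = decode_with_fallback(GBK_BYTES)
print(used)                                            # codec that succeeded
print(text == '\u6587\u4ef6\u5230\u8fbe\u6210\u529f')  # same code points as above
print(text.encode('gbk') == GBK_BYTES)                 # round trip back to bytes
```

Trying UTF-8 first is only a heuristic: some GBK byte sequences are also valid UTF-8, so when the source encoding is known (as with the database here), passing it explicitly is the safer choice.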
UTF-8 encoding and decoding:
>>> a=u'10'
>>> b=10
>>> a
u'10'
>>> b
10
>>> c='10'
>>> c
'10'
>>> b==c
False
>>> a==c
True
>>> a==b
False
>>> aint=int(a)
>>> aint
10
>>> astr=a.encode('utf8')
>>> astr
'10'
>>>
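In the Python 2 session above, a==c is True because an ASCII-only str is implicitly decoded when compared with a unicode object. As a point of comparison, here is a small sketch of the same checks under Python 3, where that implicit conversion no longer exists. It assumes Python 3 and reuses the variable names from the session.

```python
# Python 3 sketch: text and bytes are distinct types and never compare equal.
a = '10'     # text (what Python 2 writes as u'10')
b = 10       # integer
c = b'10'    # bytes (what Python 2 writes as '10')

text_equals_int = (a == b)        # False, as in Python 2
text_equals_bytes = (a == c)      # False here, unlike Python 2
aint = int(a)                     # parse the text as an integer
astr = a.encode('utf8')           # explicit encode produces bytes

print(text_equals_int, text_equals_bytes)
print(aint)
print(astr == c)                  # equal only after the explicit encode
print(c.decode('utf8') == a)      # or after an explicit decode
```

The practical consequence is that values read from different sources have to be brought to the same type explicitly, with encode or decode, before they are compared.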