python decode unicode encode
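Python represents strings internally as unicode. Encoding conversions therefore normally go through unicode as the intermediate form: first decode the string from its original encoding into unicode, then encode the unicode into the target encoding.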
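The decode-then-encode path can be sketched as below. This is a minimal illustration written for Python 3, where `bytes` plays the role of Python 2's `str` and `str` plays the role of Python 2's `unicode`; the helper name `transcode` is hypothetical and not part of the original post.

```python
def transcode(data, src_encoding, dst_encoding):
    """Convert encoded bytes from one encoding to another via unicode."""
    text = data.decode(src_encoding)   # encoded bytes -> unicode text
    return text.encode(dst_encoding)   # unicode text -> encoded bytes


gb_bytes = "你好".encode("gb2312")            # sample GB2312 input
utf8_bytes = transcode(gb_bytes, "gb2312", "utf-8")
print(len(gb_bytes), len(utf8_bytes))         # byte lengths differ per encoding
```

The same text occupies a different number of bytes in each encoding, which is why the bytes cannot be reinterpreted directly and have to pass through unicode.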
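By default, a string literal in source code has the same encoding as the source file itself. Compare the following two cases, the first of which does not follow the file encoding:
1. s = u'你好'
   The u prefix makes the string unicode, Python's internal representation, regardless of the encoding of the source file. To convert it, call encode directly with the target encoding. (Separately, the default codec that Python 2 uses for implicit str/unicode conversion can be checked with import sys; print('hello', sys.getdefaultencoding()), which reports ascii, and changed with import sys; reload(sys); sys.setdefaultencoding('utf-8').)
2. # -*- coding: utf-8 -*-
   s = '你好'
   Here s is a UTF-8 encoded byte string (str). The ascii codec cannot represent Chinese characters.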
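The difference between the two cases can be made concrete with a short sketch. It is written for Python 3 as an assumption (the original targets Python 2), so the unicode text is a plain `str` and the UTF-8 form is produced explicitly with `encode`; the variable names are illustrative only.

```python
text = "你好"                        # unicode text (u'你好' in Python 2)
utf8_bytes = text.encode("utf-8")    # the bytes a UTF-8 source file stores for '你好'

# The ascii codec has no representation for Chinese characters.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(utf8_bytes)
print(ascii_ok)
```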
isinstance(s, unicode)  # checks whether s is unicode: True if it is, False otherwise
unicode(str, 'gb2312') and str.decode('gb2312') are equivalent: both convert a gb2312-encoded str into unicode.
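Use s.__class__ to see whether a string s is a str or a unicode object. Note that this shows the type only, not which encoding a byte string uses.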
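These checks can be tried out as follows. The sketch assumes Python 3, so `str(raw, 'gb2312')` stands in for Python 2's `unicode(raw, 'gb2312')` and `bytes` stands in for Python 2's `str`; the variable names are illustrative.

```python
raw = "你好".encode("gb2312")           # GB2312-encoded bytes

via_constructor = str(raw, "gb2312")    # Python 2: unicode(raw, 'gb2312')
via_method = raw.decode("gb2312")       # same result through the method

print(via_constructor == via_method)
print(isinstance(raw, str), isinstance(via_method, str))
print(raw.__class__.__name__, via_method.__class__.__name__)
```

The type check tells text from bytes, but nothing in a byte string records which encoding produced it, so the source encoding still has to be known from context.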
That is enough theory. Here is a cure-all snippet :)
#!/usr/bin/env python
# coding=utf-8
s = "中文"
if isinstance(s, unicode):
    # s = u"中文"
    print s.encode('gb2312')
else:
    # s = "中文"
    print s.decode('utf-8').encode('gb2312')
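The same branch can be wrapped in a reusable function. The following is a sketch for Python 3 rather than the author's Python 2 code; the name `to_gb2312` and the `assumed_encoding` parameter are hypothetical, and the function simply assumes that any byte input is UTF-8 unless told otherwise.

```python
def to_gb2312(s, assumed_encoding="utf-8"):
    """Return s as GB2312 bytes, accepting either text or encoded bytes."""
    if isinstance(s, str):
        # Already unicode text: encode directly.
        return s.encode("gb2312")
    # Encoded bytes: decode to unicode first, then encode.
    return s.decode(assumed_encoding).encode("gb2312")


from_text = to_gb2312("中文")
from_bytes = to_gb2312("中文".encode("utf-8"))
print(from_text == from_bytes)
```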
Speech module code:

# -*- coding: utf-8 -*-
import sys

print('hello', sys.getdefaultencoding())


def xfs_frame_info(words):
    # decode utf-8 to Python's internal unicode
    wordu = words.decode('utf-8')
    # encode Python unicode to gbk
    data = wordu.encode('gbk')
    length = len(data) + 2
    frame_info = bytearray(5)
    frame_info[0] = 0xfd
    frame_info[1] = (length >> 8)
    frame_info[2] = (length & 0x00ff)
    frame_info[3] = 0x01
    frame_info[4] = 0x01
    buf = frame_info + data
    print("buf:", buf)
    return buf


if __name__ == "__main__":
    print("hello world")
    words1 = u'你好'
    # encodetype = isinstance(words1, unicode)
    # print("encodetype", encodetype)
    print("origin unicode", words1)
    words = words1.encode('utf-8')
    print("utf-8 encoded", words)
    a = xfs_frame_info(words)
    print('a', a)

When 你好 is written without the u'' prefix, one extra step is needed to decode it into unicode first:

if __name__ == "__main__":
    print("hello world")
    words1 = '你好'
    print("original utf-8 encoded:", words1)
    encodetype = isinstance(words1, unicode)
    wordu = words1.decode('utf-8')
    print("unicode from utf-8 decode:", wordu)
    # print("encodetype", encodetype)
    word_utf8 = wordu.encode('utf-8')
    print("utf-8 encoded", word_utf8)
    a = xfs_frame_info(word_utf8)
    print('a', a)
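The frame layout used by xfs_frame_info (a 0xFD start byte, a two-byte big-endian length, two one-byte fields, then the GBK payload) can also be packed with the standard struct module. This is a sketch for Python 3 that takes unicode text directly; the function name `build_frame` and the parameter names `command` and `encoding_flag` are assumptions, since the original does not say what bytes 3 and 4 mean.

```python
import struct


def build_frame(text, command=0x01, encoding_flag=0x01):
    """Pack text into a frame: 0xFD, 2-byte length, two 1-byte fields, GBK data."""
    data = text.encode("gbk")
    # As in the original, the length counts the payload plus the two
    # one-byte fields that follow the length itself.
    length = len(data) + 2
    # ">BHBB": big-endian, no padding: 1 + 2 + 1 + 1 = 5 header bytes.
    header = struct.pack(">BHBB", 0xFD, length, command, encoding_flag)
    return header + data


frame = build_frame("你好")
print(frame.hex())
```

Packing the length as a single big-endian `H` field replaces the manual shift and mask, and struct raises an error if the length does not fit in two bytes.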
Reposted from: https://www.cnblogs.com/cj2014/p/4236114.html