Python requests garbled text: fixing garbled Chinese when using requests
I ran into garbled Chinese text when using requests, so here is a summary.
Requests will automatically decode content from the server. Most unicode charsets are seamlessly decoded.
When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text. You can find out what encoding Requests is using, and change it, using the r.encoding property:
>>> r.encoding
'utf-8'
>>> r.encoding = 'ISO-8859-1'
If you change the encoding, Requests will use the new value of r.encoding whenever you call r.text. You might want to do this in any situation where you can apply special logic to work out what the encoding of the content will be. For example, HTML and XML have the ability to specify their encoding in their body. In situations like this, you should use r.content to find the encoding, and then set r.encoding. This will let you use r.text with the correct encoding.
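As a rough illustration of "use r.content to find the encoding", the sketch below looks for a charset declaration in the first few kilobytes of an HTML body. The helper name sniff_html_charset, the regular expression, and the 4096-byte scan limit are assumptions made for this example and are not part of Requests; the sample bytes stand in for r.content.

```python
import re
from typing import Optional

# Matches both <meta charset="utf-8"> and
# <meta http-equiv="Content-Type" content="text/html; charset=gbk">.
_META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def sniff_html_charset(body: bytes, scan_limit: int = 4096) -> Optional[str]:
    """Return the charset declared in a <meta> tag, or None (hypothetical helper)."""
    match = _META_CHARSET.search(body[:scan_limit])
    return match.group(1).decode("ascii") if match else None


# Stand-in for r.content: a utf-8 page that declares its own encoding.
sample = '<html><head><meta charset="utf-8"><title>中文</title></head></html>'.encode("utf-8")
charset = sniff_html_charset(sample)
text = sample.decode(charset)
```

With a real response, the same idea would be r.encoding = sniff_html_charset(r.content) or r.encoding, followed by reading r.text.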
Requests will also use custom encodings in the event that you need them. If you have created your own encoding and registered it with the codecs module, you can simply use the codec name as the value of r.encoding and Requests will handle the decoding for you.
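To make the custom-codec case concrete, here is a minimal sketch that registers a codec with the codecs module. The codec name demo_utf8 is hypothetical and simply reuses the built-in utf-8 encoder and decoder; a real custom encoding would supply its own functions.

```python
import codecs


def _demo_search(name):
    """Codec search function: answers only for the hypothetical 'demo_utf8'."""
    if name != "demo_utf8":
        return None
    utf8 = codecs.lookup("utf-8")
    # Reuse the built-in utf-8 functions; a real codec would provide its own.
    return codecs.CodecInfo(name="demo_utf8", encode=utf8.encode, decode=utf8.decode)


codecs.register(_demo_search)

# Once registered, the name works anywhere a codec name is accepted.
decoded = b"\xe4\xb8\xad\xe6\x96\x87".decode("demo_utf8")
```

After registration, setting r.encoding = 'demo_utf8' on a response should make r.text decode with that codec.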
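Cause of the garbling: suppose the page is encoded as utf-8. If the Content-Type in the server's response header is a text type but carries no charset=utf-8 parameter, requests falls back to decoding with ISO-8859-1.
Example
Baidu's page is encoded as utf-8 (the declaration can be found in the page itself), but Baidu's response header does not set charset=utf-8, so the requests response defaults to ISO-8859-1 and the Chinese text comes out garbled.
Taobao's page is also encoded as utf-8 (again declared in the page itself), and Taobao's response header does set charset=utf-8, so the requests response uses the utf-8 encoding from the header and the Chinese text is not garbled.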
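The difference between the two cases can be reproduced without any network access. The sketch below is a simplified approximation of the header rule described above, not the actual Requests implementation; guess_encoding is a hypothetical helper and the Content-Type strings are made up for the example.

```python
from email.message import Message
from typing import Optional


def guess_encoding(content_type: str) -> Optional[str]:
    """Approximate the header rule: explicit charset, else ISO-8859-1 for text/*."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased charset, or None if absent
    if charset:
        return charset
    if msg.get_content_maintype() == "text":
        return "ISO-8859-1"
    return None


body = "百度一下".encode("utf-8")  # the page bytes are utf-8 in both cases

# Header without charset (the Baidu case): decoded as ISO-8859-1, so garbled.
garbled = body.decode(guess_encoding("text/html"))

# Header with charset (the Taobao case): decoded as utf-8, so readable.
clean = body.decode(guess_encoding("text/html; charset=utf-8"))
```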
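Simple fix 1: re-encode and then decode again: baidu.text.encode('ISO-8859-1').decode('utf-8')
Simple fix 2: set the encoding before reading the text: baidu.encoding = 'utf-8'
Simple fix 3: baidu.content is a byte string, i.e. the still-encoded bytes; decoding it with utf-8 gives Chinese text without garbling: baidu.content.decode('utf-8')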
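Fixes 1 and 3 can be checked on in-memory bytes. In this sketch, raw stands in for baidu.content and garbled stands in for baidu.text under the ISO-8859-1 default; both are sample values made up for the example, not real Baidu output. Fix 2 needs a real response object, so it appears only as a comment.

```python
raw = "百度一下,你就知道".encode("utf-8")  # stand-in for baidu.content
garbled = raw.decode("ISO-8859-1")          # stand-in for the garbled baidu.text

# Fix 1: undo the wrong decode, then decode correctly.
# This is lossless because ISO-8859-1 maps every byte value to one character.
fix1 = garbled.encode("ISO-8859-1").decode("utf-8")

# Fix 2 (needs a real response): baidu.encoding = 'utf-8', then read baidu.text.

# Fix 3: skip the text property and decode the raw bytes directly.
fix3 = raw.decode("utf-8")
```

Fix 2 is usually the tidiest of the three, because it decodes the bytes only once and leaves baidu.text usable afterwards.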