[Repost] urllib, urllib2, httplib
httplib implements the client side of the HTTP and HTTPS protocols. It is not usually used directly; Python's higher-level modules (urllib, urllib2) build on its HTTP implementation.

    import httplib

    conn = httplib.HTTPConnection("google.com")
    conn.request('GET', '/')
    print conn.getresponse().read()
    conn.close()

httplib.HTTPConnection(host[, port[, strict[, timeout]]])

This is the constructor of the HTTPConnection class. An instance represents one interaction with the server, that is, a request/response exchange. host is the server's host name, such as www.csdn.net (a bare host name, without the http:// scheme); port is the port number, 80 by default; strict, False by default, controls whether a BadStatusLine exception is raised when the status line returned by the server cannot be parsed (a typical status line is HTTP/1.0 200 OK); the optional timeout sets the timeout in seconds.

Calling the request method sends one request to the server. method is the request method, most commonly GET or POST; url is the URL of the requested resource; body is the data submitted to the server and must be a string (when method is POST, think of body as the data of an HTML form); headers holds the HTTP headers of the request.

    conn = httplib.HTTPConnection("www.g.com", 80, False)
    conn.request('GET', '/', headers={
        "Host": "www.google.com",
        "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 5.1; zh-CN; rv:1.9.1) Gecko/20090624 Firefox/3.5",
        "Accept": "text/plain"})
    res = conn.getresponse()
    print 'version:', res.version
    print 'reason:', res.reason
    print 'status:', res.status
    print 'msg:', res.msg
    print 'headers:', res.getheaders()
    # html
    # print '\n' + '-' * 50 + '\n'
    # print res.read()
    conn.close()

The httplib module also defines many constants, for example HTTP status codes such as httplib.OK and httplib.NOT_FOUND.

With urllib2, a URL is fetched by passing a Request object to urlopen:

    import urllib2

    # URL assumed here, by analogy with the FTP example below
    req = urllib2.Request('http://pythoneye.com')
    response = urllib2.urlopen(req)
    the_page = response.read()

FTP works the same way:

    req = urllib2.Request('ftp://pythoneye.com')

The response object returned by urlopen has two very useful methods, info() and geturl().

POST: encode the form values (a dict named values here) with urllib.urlencode and pass the result as the second argument of Request:

    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)

GET: append the encoded values to the URL as a query string:

    import urllib
    import urllib2

    data = {}
    data['name'] = 'Somebody Here'
    data['location'] = 'Northampton'
    data['language'] = 'Python'
    url_values = urllib.urlencode(data)
    url = 'http://pythoneye.com/example.cgi'
    full_url = url + '?' + url_values
    data = urllib2.urlopen(full_url)

Using Basic HTTP Authentication:

    import urllib2

    # Create an OpenerDirector with support for Basic HTTP Authentication...
    auth_handler = urllib2.HTTPBasicAuthHandler()
    auth_handler.add_password(realm='PDQ Application',
                              uri='https://pythoneye.com/vecrty.py',
                              user='user',
                              passwd='pass')
    opener = urllib2.build_opener(auth_handler)
    # ...and install it globally so it can be used with urlopen.
    urllib2.install_opener(opener)
    urllib2.urlopen('http://www.pythoneye.com/app.html')

Using a proxy with ProxyHandler:

    proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})
    proxy_auth_handler = urllib2.HTTPBasicAuthHandler()
    proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
    opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
    # This time, rather than install the OpenerDirector, we use it directly:
    opener.open('http://www.example.com/login.html')

URLError and HTTPError:

    from urllib2 import Request, urlopen, URLError, HTTPError

    req = Request(someurl)
    try:
        response = urlopen(req)
    except HTTPError, e:
        print 'Error code: ', e.code
    except URLError, e:
        print 'Reason: ', e.reason
    else:
        pass  # .............

Or:

    from urllib2 import Request, urlopen, URLError

    req = Request(someurl)
    try:
        response = urlopen(req)
    except URLError, e:
        if hasattr(e, 'reason'):
            print 'Reason: ', e.reason
        elif hasattr(e, 'code'):
            print 'Error code: ', e.code
    else:
        pass  # .............

URLError is usually raised when there is no network connection (no route to the specified server) or when the server does not exist.

    try:
        urllib2.urlopen(req)
    except URLError, e:
        print e.reason
        # code and read() exist only on HTTPError, not on a plain URLError
        if hasattr(e, 'code'):
            print e.code
            print e.read()

One last point: when handling both URLError and HTTPError, handle HTTPError first and URLError second. HTTPError is a subclass of URLError, so an except URLError clause placed first would also catch every HTTPError.

In the urllib2 source, HTTPHandler is defined as:

    class HTTPHandler(AbstractHTTPHandler):

        def http_open(self, req):
            return self.do_open(httplib.HTTPConnection, req)

        http_request = AbstractHTTPHandler.do_request_

HTTPHandler is one of the default handlers of an opener. This code confirms that urllib2 is implemented on top of httplib, and it also shows how openers and handlers relate to each other.
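The advice to handle HTTPError before URLError can be checked without any network access. The following is only a sketch under stated assumptions, not code from the original post: it is written for Python 3, where urllib2's exceptions live in urllib.error, and the helper describe(), the example.invalid URL, and the error messages are all made up for illustration.

```python
import io
from email.message import Message
from urllib.error import HTTPError, URLError


def describe(exc):
    """Raise exc and report which except clause handled it (hypothetical helper)."""
    try:
        raise exc
    except HTTPError as e:
        # Must come first: HTTPError is a subclass of URLError.
        return "HTTP error %d" % e.code
    except URLError as e:
        return "URL error: %s" % e.reason


# Hand-built exceptions stand in for real network failures (assumption).
http_exc = HTTPError("http://example.invalid/", 404, "Not Found",
                     Message(), io.BytesIO(b""))
url_exc = URLError("no route to host")

print(issubclass(HTTPError, URLError))  # True
print(describe(http_exc))               # HTTP error 404
print(describe(url_exc))                # URL error: no route to host
```

If the two except clauses were swapped, the first one would catch both exceptions and the HTTP status code would never be reported separately.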
Feedback
#1 [original poster]
2011-05-13 12:50 by Morya

    import urllib

    import socks
    import httplib2

    url2 = r'http://www.cnblogs.com/'
    url = r'http://passport.cnblogs.com/login.aspx'

    # Login form fields
    body = [
        ('tbUserName', 'xxx'),
        ('tbPassword', 'xxx'),
        ('txtReturnUrl', 'http://home.cnblogs.com/'),
    ]
    headers = {'Content-type': 'application/x-www-form-urlencoded',
               'User-Agent' : 'Mozilla/5.0 (Windows; U; Windows NT 6.1; zh-CN; rv:1.9.2.17) Gecko/20110420 Firefox/3.6.17'}

    #proxy = httplib2.ProxyInfo(socks.PROXY_TYPE_HTTP, 'http://xxx.xxx.com', 8080)
    http = httplib2.Http(
        # proxy_info = proxy
    )

    # First request: fetch the login page to obtain the session cookie
    response, content = http.request(url, 'GET',
        headers=headers,
        #connection_type='http',
    )

    # set cookie
    headers = {'Cookie': response['set-cookie']}
    response, content = http.request(
        url, 'POST',
        headers=headers,
        body=urllib.urlencode(body)
    )
    print type(content)

    # The remaining lines belong inside a method of the commenter's class
    # (self.view is not defined in this snippet):
    # try:
    #     content = unicode(content)
    #     self.view.setHtml(content)
    # except:
    #     return
    # return
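The form-encoding step in the comment above can be tried on its own, without sending anything. This is a rough sketch in Python 3, which is an assumption on my part: there urllib.urlencode has moved to urllib.parse.urlencode and urllib2.Request to urllib.request.Request. It reuses the field names and URL from the comment and only builds the request object.

```python
from urllib.parse import urlencode
from urllib.request import Request

url = 'http://passport.cnblogs.com/login.aspx'
body = [
    ('tbUserName', 'xxx'),
    ('tbPassword', 'xxx'),
    ('txtReturnUrl', 'http://home.cnblogs.com/'),
]

# A list of pairs keeps the field order; reserved characters are percent-encoded.
encoded = urlencode(body)
print(encoded)
# tbUserName=xxx&tbPassword=xxx&txtReturnUrl=http%3A%2F%2Fhome.cnblogs.com%2F

# In Python 3 the request body must be bytes. Supplying data turns the
# request into a POST; nothing is sent until the request is opened.
req = Request(url, data=encoded.encode('ascii'),
              headers={'Content-Type': 'application/x-www-form-urlencoded'})
print(req.get_method())  # POST
```

Passing a list of tuples rather than a dict is a reasonable choice when the server expects the fields in a particular order.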
Reposted from: https://www.cnblogs.com/cheungjustin/archive/2012/01/05/2313509.html