Stay strong, Wuhan! Scraping Baidu Migration map data + city travel intensity
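I recently needed data from Baidu Migration (百度迁徙) to build a model for predicting how the virus spreads, so here is a quick write-up of how I scraped it.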
1. Finding the pattern
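First, open http://qianxi.baidu.com/ and look up any city, for example Beijing's data for January 29:
The data turns out to be stored in this response. Next, look at the request headers.
There we find: Request URL: http://huiyan.baidu.com/migration/cityrank.jsonp?dt=province&id=110000&type=move_out&date=20200129&callback=jsonp_1580386984204_4388923
Ignoring the last & parameter (callback), only two variables, id and date, need to be filled in to scrape the data in bulk. The date is something we choose ourselves, so what remains is finding the id.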
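The pattern above can be sketched as a small URL builder. This is only an illustration: the function name `build_cityrank_url` and the `direction` argument are hypothetical names introduced here, and the parameter values are the ones visible in the Request URL above.

```python
from urllib.parse import urlencode

# Endpoint taken from the Request URL shown above.
BASE_URL = "http://huiyan.baidu.com/migration/cityrank.jsonp"


def build_cityrank_url(city_id, date, direction="move_out"):
    """Build the request URL for one region id and one day (YYYYMMDD).

    Hypothetical helper: the callback parameter is left out, as described above.
    """
    params = {"dt": "province", "id": city_id, "type": direction, "date": date}
    return BASE_URL + "?" + urlencode(params)


# Beijing (id 110000) on 29 January 2020.
print(build_cityrank_url("110000", "20200129"))
```

Only `city_id` and `date` change from request to request, which is what makes bulk scraping possible.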
Next, find http://qianxi.cdn.bcebos.com/app/index.js?b8e016517dc0b92ce531 in the page source:
Opening it reveals the id of every region:
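Once the ids are collected as "name|id" pairs, turning them into a lookup table takes a few lines. The sketch below uses a three-entry sample rather than the full list, and `parse_city_table` is a hypothetical helper name.

```python
# A short sample in the same "name|id" format as the full table used below.
SAMPLE = "北京|110000,天津|120000,武漢|420100"


def parse_city_table(table):
    """Turn 'name|id,name|id,...' into a dict mapping id -> name."""
    id_to_name = {}
    for entry in table.split(","):
        name, city_id = entry.split("|")
        # If an id occurs more than once, the later name replaces the earlier one.
        id_to_name[city_id] = name
    return id_to_name


cities = parse_city_table(SAMPLE)
print(cities["420100"])
```

Keying the dict by id means that aliases sharing one id collapse into a single entry, so each region is requested only once.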
2. Code:
# coding: utf-8
import json
import os
import urllib.request

import pandas as pd

# Region names and their ids, as "name|id" pairs.
CODE = (
    "北京|110000,天津|120000,興安盟|152200,巢湖|340181,定安|469021,屯昌|469022,澄邁|469023,臨高|469024,"
    "海東地區|630200,香港|810000,澳門|820000,昌都地區|540300,山南地區|540500,日喀則地區|540200,"
    "那曲地區|540600,林芝地區|540400,吐魯番地區|650400,銅仁地區|520600,畢節地區|520500,"
    "石家莊|130100,唐山|130200,秦皇島|130300,邯鄲|130400,邢臺|130500,保定|130600,張家口|130700,"
    "承德|130800,滄州|130900,廊坊|131000,衡水|131100,"
    "太原|140100,大同|140200,陽泉|140300,長治|140400,晉城|140500,朔州|140600,晉中|140700,"
    "運城|140800,忻州|140900,臨汾|141000,呂梁|141100,"
    "呼和浩特|150100,包頭|150200,烏海|150300,赤峰|150400,通遼|150500,鄂爾多斯|150600,"
    "呼倫貝爾|150700,巴彥淖爾|150800,烏蘭察布|150900,"
    "沈陽|210100,大連|210200,鞍山|210300,撫順|210400,本溪|210500,丹東|210600,錦州|210700,"
    "營口|210800,阜新|210900,遼陽|211000,盤錦|211100,鐵嶺|211200,朝陽|211300,葫蘆島|211400,"
    "長春|220100,四平|220300,遼源|220400,通化|220500,白山|220600,松原|220700,白城|220800,"
    "哈爾濱|230100,齊齊哈爾|230200,雞西|230300,鶴崗|230400,雙鴨山|230500,大慶|230600,"
    "伊春|230700,佳木斯|230800,七臺河|230900,牡丹江|231000,黑河|231100,綏化|231200,"
    "上海|310000,南京|320100,無錫|320200,徐州|320300,常州|320400,蘇州|320500,南通|320600,"
    "連云港|320700,淮安|320800,鹽城|320900,揚州|321000,鎮江|321100,泰州|321200,宿遷|321300,"
    "杭州|330100,寧波|330200,溫州|330300,嘉興|330400,湖州|330500,紹興|330600,金華|330700,"
    "衢州|330800,舟山|330900,臺州|331000,麗水|331100,"
    "合肥|340100,蕪湖|340200,蚌埠|340300,淮南|340400,馬鞍山|340500,淮北|340600,銅陵|340700,"
    "安慶|340800,黃山|341000,滁州|341100,阜陽|341200,宿州|341300,六安|341500,亳州|341600,"
    "池州|341700,宣城|341800,"
    "福州|350100,廈門|350200,莆田|350300,三明|350400,泉州|350500,漳州|350600,南平|350700,"
    "龍巖|350800,寧德|350900,"
    "南昌|360100,景德鎮|360200,萍鄉|360300,九江|360400,新余|360500,鷹潭|360600,贛州|360700,"
    "吉安|360800,宜春|360900,撫州|361000,上饒|361100,"
    "濟南|370100,青島|370200,淄博|370300,棗莊|370400,東營|370500,煙臺|370600,濰坊|370700,"
    "濟寧|370800,泰安|370900,威海|371000,日照|371100,萊蕪|370100,臨沂|371300,德州|371400,"
    "聊城|371500,濱州|371600,菏澤|371700,"
    "鄭州|410100,開封|410200,洛陽|410300,平頂山|410400,安陽|410500,鶴壁|410600,新鄉|410700,"
    "焦作|410800,濮陽|410900,許昌|411000,漯河|411100,三門峽|411200,南陽|411300,商丘|411400,"
    "信陽|411500,周口|411600,駐馬店|411700,"
    "武漢|420100,黃石|420200,十堰|420300,宜昌|420500,襄陽|420600,鄂州|420700,荊門|420800,"
    "孝感|420900,荊州|421000,黃岡|421100,咸寧|421200,隨州|421300,仙桃|429004,潛江|429005,"
    "天門|429006,"
    "長沙|430100,株洲|430200,湘潭|430300,衡陽|430400,邵陽|430500,岳陽|430600,常德|430700,"
    "張家界|430800,益陽|430900,郴州|431000,永州|431100,懷化|431200,婁底|431300,"
    "廣州|440100,韶關|440200,深圳|440300,珠海|440400,汕頭|440500,佛山|440600,江門|440700,"
    "湛江|440800,茂名|440900,肇慶|441200,惠州|441300,梅州|441400,汕尾|441500,河源|441600,"
    "陽江|441700,清遠|441800,東莞|441900,中山|442000,潮州|445100,揭陽|445200,云浮|445300,"
    "南寧|450100,柳州|450200,桂林|450300,梧州|450400,北海|450500,防城港|450600,欽州|450700,"
    "貴港|450800,玉林|450900,百色|451000,賀州|451100,河池|451200,來賓|451300,崇左|451400,"
    "海口|460100,三亞|460200,五指山|469001,瓊海|469002,儋州|460400,文昌|469005,萬寧|469006,"
    "東方|469007,"
    "重慶|500000,成都|510100,自貢|510300,攀枝花|510400,瀘州|510500,德陽|510600,綿陽|510700,"
    "廣元|510800,遂寧|510900,內江|511000,樂山|511100,南充|511300,眉山|511400,宜賓|511500,"
    "廣安|511600,達州|511700,雅安|511800,巴中|511900,資陽|512000,"
    "貴陽|520100,六盤水|520200,遵義|520300,安順|520400,"
    "昆明|530100,曲靖|530300,玉溪|530400,保山|530500,昭通|530600,麗江|530700,臨滄|530900,"
    "普洱|530800,拉薩|540100,"
    "西安|610100,銅川|610200,寶雞|610300,咸陽|610400,渭南|610500,延安|610600,漢中|610700,"
    "榆林|610800,安康|610900,商洛|611000,"
    "蘭州|620100,嘉峪關|620200,金昌|620300,白銀|620400,天水|620500,武威|620600,張掖|620700,"
    "平涼|620800,酒泉|620900,慶陽|621000,定西|621100,隴南|621200,"
    "西寧|630100,銀川|640100,石嘴山|640200,吳忠|640300,固原|640400,中衛|640500,"
    "烏魯木齊|650100,克拉瑪依|650200,石河子|659001,阿拉爾|659002,圖木舒克|659003,五家渠|659004,"
    "恩施|422800,恩施土家族苗族自治州|422800,延邊|222400,延邊朝鮮族自治州|222400,"
    "神農架地區|429021,神農架林區|429021,湘西州|433100,湘西土家族苗族自治州|433100,"
    "大興安嶺地區|232700,"
    "白沙縣|469025,白沙黎族自治縣|469025,昌江黎族自治縣|469026,樂東黎族自治縣|469027,"
    "陵水黎族自治縣|469028,保亭黎族苗族自治縣|469029,瓊中黎族苗族自治縣|469030,"
    "阿壩州|513200,阿壩藏族羌族自治州|513200,甘孜州|513300,甘孜藏族自治州|513300,"
    "涼山州|513400,涼山彝族自治州|513400,"
    "黔西南布依族苗族自治州|522300,黔東南苗族侗族自治州|522600,黔南布依族苗族自治州|522700,"
    "楚雄州|532300,楚雄彝族自治州|532300,紅河州|532500,紅河哈尼族彝族自治州|532500,"
    "文山州|532600,文山壯族苗族自治州|532600,西雙版納傣族自治州|532800,"
    "大理州|532900,大理白族自治州|532900,德宏州|533100,德宏傣族景頗族自治州|533100,"
    "怒江州|533300,怒江傈僳族自治州|533300,迪慶州|533400,迪慶藏族自治州|533400,"
    "阿里地區|542500,臨夏回族自治州|622900,甘南藏族自治州|623000,海北藏族自治州|632200,"
    "黃南藏族自治州|632300,海南藏族自治州|632500,果洛藏族自治州|632600,玉樹藏族自治州|632700,"
    "海西蒙古族藏族自治州|632800,"
    "昌吉回族自治州|652300,博爾塔拉蒙古自治州|652700,巴音郭楞蒙古自治州|652800,"
    "哈密地區|650500,哈密|650500,阿克蘇地區|652900,克孜勒蘇柯爾克孜自治州|653000,"
    "伊犁哈薩克自治州|654000,喀什地區|653100,和田地區|653200,塔城地區|654200,"
    "阿勒泰地區|654300,錫林郭勒盟|152500,阿拉善盟|152900"
)
name = [x.split("|")[0] for x in CODE.split(',')]
number = [x.split("|")[1] for x in CODE.split(',')]
# Map id -> name; when an id appears more than once, the last name wins.
code = dict(zip(number, name))
# Dates to fetch, as YYYYMMDD integers (10 to 25 January 2020).
time = list(range(20200110, 20200126))

# Save everything in a folder named '武漢' and skip cities that already have a file.
if not os.path.exists('武漢'):
    os.mkdir('武漢')
os.chdir('武漢')
exist = {i[:-5] for i in os.listdir()}  # strip the '.xlsx' suffix


def Open(url):
    """Fetch url and return the JSON object inside the JSONP wrapper."""
    heads = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    req = urllib.request.Request(url, headers=heads)
    # Pass req (not url) so the User-Agent header is actually sent.
    response = urllib.request.urlopen(req)
    text = response.read().decode('utf-8')
    # The body looks like callback({...}); keep only what is inside the parentheses.
    # json.loads replaces eval so that remote text is never executed as code.
    return json.loads(text[text.index('(') + 1:text.rindex(')')])


def conserve(data, date, work):
    """Write one day's ranking to its own sheet of the city's workbook."""
    city = []
    value = []
    for i in data['list']:
        city.append(i['province_name'] + i['city_name'])
        value.append(i['value'])
    res = pd.DataFrame({'城市': city, '比例': value})
    res.to_excel(excel_writer=work, sheet_name=date)


def main():
    for num, name in code.items():
        if name in exist:
            continue
        # Create a placeholder file first so that a rerun skips this city.
        pd.DataFrame().to_excel(name + '.xlsx')
        work = pd.ExcelWriter(name + '.xlsx')
        for t in time:
            try:
                print(name, t)
                # type=move_in requests inbound flows; use move_out for outbound.
                url = ('http://huiyan.baidu.com/migration/cityrank.jsonp?dt=province&id='
                       + num + '&type=move_in&date=' + str(t))
                conserve(Open(url)['data'], str(t), work)
            except (ValueError, KeyError, IndexError, TypeError):
                # Empty or malformed response for this date: skip it.
                pass
        # close() writes the workbook; ExcelWriter.save() is gone in recent pandas.
        work.close()


main()

3. Results:
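If the script stops with an error, just run it again: some locations simply have no data. To fetch other dates, edit the time list, and remember to update the headers as well.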
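One caveat when editing the time list: a plain integer range only works inside a single month, because a range that crosses a month boundary also produces numbers such as 20200132 that are not dates. A minimal sketch of an alternative, using a hypothetical helper `date_strings` built on the standard `datetime` module:

```python
from datetime import date, timedelta


def date_strings(start, end):
    """Return YYYYMMDD strings for every day from start to end, inclusive."""
    days = (end - start).days
    return [(start + timedelta(days=n)).strftime("%Y%m%d") for n in range(days + 1)]


# 25 January to 3 February 2020, crossing the month boundary correctly.
time = date_strings(date(2020, 1, 25), date(2020, 2, 3))
print(time)
```

Because the script converts each entry with str(t) before using it, a list of strings like this one can stand in for the integer list unchanged.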
To scrape the migration index instead:
# coding: utf-8
import json
import os
import urllib.request

import pandas as pd

# Region names and their ids, as "name|id" pairs.
CODE = (
    "北京|110000,天津|120000,興安盟|152200,巢湖|340181,定安|469021,屯昌|469022,澄邁|469023,臨高|469024,"
    "海東地區|630200,香港|810000,澳門|820000,昌都地區|540300,山南地區|540500,日喀則地區|540200,"
    "那曲地區|540600,林芝地區|540400,吐魯番地區|650400,銅仁地區|520600,畢節地區|520500,"
    "石家莊|130100,唐山|130200,秦皇島|130300,邯鄲|130400,邢臺|130500,保定|130600,張家口|130700,"
    "承德|130800,滄州|130900,廊坊|131000,衡水|131100,"
    "太原|140100,大同|140200,陽泉|140300,長治|140400,晉城|140500,朔州|140600,晉中|140700,"
    "運城|140800,忻州|140900,臨汾|141000,呂梁|141100,"
    "呼和浩特|150100,包頭|150200,烏海|150300,赤峰|150400,通遼|150500,鄂爾多斯|150600,"
    "呼倫貝爾|150700,巴彥淖爾|150800,烏蘭察布|150900,"
    "沈陽|210100,大連|210200,鞍山|210300,撫順|210400,本溪|210500,丹東|210600,錦州|210700,"
    "營口|210800,阜新|210900,遼陽|211000,盤錦|211100,鐵嶺|211200,朝陽|211300,葫蘆島|211400,"
    "長春|220100,四平|220300,遼源|220400,通化|220500,白山|220600,松原|220700,白城|220800,"
    "哈爾濱|230100,齊齊哈爾|230200,雞西|230300,鶴崗|230400,雙鴨山|230500,大慶|230600,"
    "伊春|230700,佳木斯|230800,七臺河|230900,牡丹江|231000,黑河|231100,綏化|231200,"
    "上海|310000,南京|320100,無錫|320200,徐州|320300,常州|320400,蘇州|320500,南通|320600,"
    "連云港|320700,淮安|320800,鹽城|320900,揚州|321000,鎮江|321100,泰州|321200,宿遷|321300,"
    "杭州|330100,寧波|330200,溫州|330300,嘉興|330400,湖州|330500,紹興|330600,金華|330700,"
    "衢州|330800,舟山|330900,臺州|331000,麗水|331100,"
    "合肥|340100,蕪湖|340200,蚌埠|340300,淮南|340400,馬鞍山|340500,淮北|340600,銅陵|340700,"
    "安慶|340800,黃山|341000,滁州|341100,阜陽|341200,宿州|341300,六安|341500,亳州|341600,"
    "池州|341700,宣城|341800,"
    "福州|350100,廈門|350200,莆田|350300,三明|350400,泉州|350500,漳州|350600,南平|350700,"
    "龍巖|350800,寧德|350900,"
    "南昌|360100,景德鎮|360200,萍鄉|360300,九江|360400,新余|360500,鷹潭|360600,贛州|360700,"
    "吉安|360800,宜春|360900,撫州|361000,上饒|361100,"
    "濟南|370100,青島|370200,淄博|370300,棗莊|370400,東營|370500,煙臺|370600,濰坊|370700,"
    "濟寧|370800,泰安|370900,威海|371000,日照|371100,萊蕪|370100,臨沂|371300,德州|371400,"
    "聊城|371500,濱州|371600,菏澤|371700,"
    "鄭州|410100,開封|410200,洛陽|410300,平頂山|410400,安陽|410500,鶴壁|410600,新鄉|410700,"
    "焦作|410800,濮陽|410900,許昌|411000,漯河|411100,三門峽|411200,南陽|411300,商丘|411400,"
    "信陽|411500,周口|411600,駐馬店|411700,"
    "武漢|420100,黃石|420200,十堰|420300,宜昌|420500,襄陽|420600,鄂州|420700,荊門|420800,"
    "孝感|420900,荊州|421000,黃岡|421100,咸寧|421200,隨州|421300,仙桃|429004,潛江|429005,"
    "天門|429006,"
    "長沙|430100,株洲|430200,湘潭|430300,衡陽|430400,邵陽|430500,岳陽|430600,常德|430700,"
    "張家界|430800,益陽|430900,郴州|431000,永州|431100,懷化|431200,婁底|431300,"
    "廣州|440100,韶關|440200,深圳|440300,珠海|440400,汕頭|440500,佛山|440600,江門|440700,"
    "湛江|440800,茂名|440900,肇慶|441200,惠州|441300,梅州|441400,汕尾|441500,河源|441600,"
    "陽江|441700,清遠|441800,東莞|441900,中山|442000,潮州|445100,揭陽|445200,云浮|445300,"
    "南寧|450100,柳州|450200,桂林|450300,梧州|450400,北海|450500,防城港|450600,欽州|450700,"
    "貴港|450800,玉林|450900,百色|451000,賀州|451100,河池|451200,來賓|451300,崇左|451400,"
    "海口|460100,三亞|460200,五指山|469001,瓊海|469002,儋州|460400,文昌|469005,萬寧|469006,"
    "東方|469007,"
    "重慶|500000,成都|510100,自貢|510300,攀枝花|510400,瀘州|510500,德陽|510600,綿陽|510700,"
    "廣元|510800,遂寧|510900,內江|511000,樂山|511100,南充|511300,眉山|511400,宜賓|511500,"
    "廣安|511600,達州|511700,雅安|511800,巴中|511900,資陽|512000,"
    "貴陽|520100,六盤水|520200,遵義|520300,安順|520400,"
    "昆明|530100,曲靖|530300,玉溪|530400,保山|530500,昭通|530600,麗江|530700,臨滄|530900,"
    "普洱|530800,拉薩|540100,"
    "西安|610100,銅川|610200,寶雞|610300,咸陽|610400,渭南|610500,延安|610600,漢中|610700,"
    "榆林|610800,安康|610900,商洛|611000,"
    "蘭州|620100,嘉峪關|620200,金昌|620300,白銀|620400,天水|620500,武威|620600,張掖|620700,"
    "平涼|620800,酒泉|620900,慶陽|621000,定西|621100,隴南|621200,"
    "西寧|630100,銀川|640100,石嘴山|640200,吳忠|640300,固原|640400,中衛|640500,"
    "烏魯木齊|650100,克拉瑪依|650200,石河子|659001,阿拉爾|659002,圖木舒克|659003,五家渠|659004,"
    "恩施|422800,恩施土家族苗族自治州|422800,延邊|222400,延邊朝鮮族自治州|222400,"
    "神農架地區|429021,神農架林區|429021,湘西州|433100,湘西土家族苗族自治州|433100,"
    "大興安嶺地區|232700,"
    "白沙縣|469025,白沙黎族自治縣|469025,昌江黎族自治縣|469026,樂東黎族自治縣|469027,"
    "陵水黎族自治縣|469028,保亭黎族苗族自治縣|469029,瓊中黎族苗族自治縣|469030,"
    "阿壩州|513200,阿壩藏族羌族自治州|513200,甘孜州|513300,甘孜藏族自治州|513300,"
    "涼山州|513400,涼山彝族自治州|513400,"
    "黔西南布依族苗族自治州|522300,黔東南苗族侗族自治州|522600,黔南布依族苗族自治州|522700,"
    "楚雄州|532300,楚雄彝族自治州|532300,紅河州|532500,紅河哈尼族彝族自治州|532500,"
    "文山州|532600,文山壯族苗族自治州|532600,西雙版納傣族自治州|532800,"
    "大理州|532900,大理白族自治州|532900,德宏州|533100,德宏傣族景頗族自治州|533100,"
    "怒江州|533300,怒江傈僳族自治州|533300,迪慶州|533400,迪慶藏族自治州|533400,"
    "阿里地區|542500,臨夏回族自治州|622900,甘南藏族自治州|623000,海北藏族自治州|632200,"
    "黃南藏族自治州|632300,海南藏族自治州|632500,果洛藏族自治州|632600,玉樹藏族自治州|632700,"
    "海西蒙古族藏族自治州|632800,"
    "昌吉回族自治州|652300,博爾塔拉蒙古自治州|652700,巴音郭楞蒙古自治州|652800,"
    "哈密地區|650500,哈密|650500,阿克蘇地區|652900,克孜勒蘇柯爾克孜自治州|653000,"
    "伊犁哈薩克自治州|654000,喀什地區|653100,和田地區|653200,塔城地區|654200,"
    "阿勒泰地區|654300,錫林郭勒盟|152500,阿拉善盟|152900"
)
name = [x.split("|")[0] for x in CODE.split(',')]
number = [x.split("|")[1] for x in CODE.split(',')]
# Map id -> name; when an id appears more than once, the last name wins.
code = dict(zip(number, name))
# Not used below: the date range is set by startDate/endDate in the URL.
time = list(range(20200110, 20200126))

if not os.path.exists('武漢1'):
    os.mkdir('武漢1')
os.chdir('武漢1')


def Open(url):
    """Fetch url and return the JSON object inside the JSONP wrapper."""
    heads = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    req = urllib.request.Request(url, headers=heads)
    # Pass req (not url) so the User-Agent header is actually sent.
    response = urllib.request.urlopen(req)
    text = response.read().decode('utf-8')
    # The body looks like callback({...}); keep only what is inside the parentheses.
    return json.loads(text[text.index('(') + 1:text.rindex(')')])


def conserve(data, name):
    """Write the date -> migration index mapping of one city to name.xlsx."""
    times = []
    value = []
    for i in data['list']:
        times.append(i)
        value.append(data['list'][i])
    res = pd.DataFrame({'時間': times, '遷徙規模指數': value})
    res.to_excel(name + '.xlsx')


def main():
    for num, name in code.items():
        pd.DataFrame().to_excel(name + '.xlsx')
        try:
            print(name)
            url = ('http://huiyan.baidu.com/migration/historycurve.jsonp?dt=province&id='
                   + num + '&type=move_out&startDate=20200110&endDate=20200125')
            conserve(Open(url)['data'], name)
        except Exception as err:
            # Some regions return no data; report the problem and move on.
            print('skipped', name, err)


main()

For inbound data, just change the url to 'http://huiyan.baidu.com/migration/historycurve.jsonp?dt=province&id=' + num + '&type=move_in&startDate=20200110&endDate=20200125'.
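The scripts above all have to cut the JSON out of the JSONP wrapper before they can read the list under data. A minimal offline sketch of that step follows. The sample text and its values are made up for illustration, the wrapper name cb is an assumption, and `unwrap_jsonp` is a hypothetical helper name; only the data / list layout is taken from the script above.

```python
import json


def unwrap_jsonp(text):
    """Return the JSON object wrapped in a JSONP call such as cb({...})."""
    start = text.index("(") + 1   # first opening parenthesis
    end = text.rindex(")")        # last closing parenthesis
    return json.loads(text[start:end])


# Made-up sample in the shape the migration-index script expects:
# data -> list -> {date: index}.
SAMPLE = 'cb({"errno":0,"errmsg":"SUCCESS","data":{"list":{"20200111":5.2,"20200110":4.8}}})'

payload = unwrap_jsonp(SAMPLE)
rows = sorted(payload["data"]["list"].items())  # sort by date string
for day, value in rows:
    print(day, value)
```

Slicing between the first "(" and the last ")" keeps the payload intact even if a value inside it happens to contain a parenthesis.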
Added on March 17: code for city travel intensity
# coding: utf-8
import json
import os
import urllib.request

import pandas as pd

# Region names and their ids, as "name|id" pairs.
CODE = (
    "北京|110000,天津|120000,興安盟|152200,巢湖|340181,定安|469021,屯昌|469022,澄邁|469023,臨高|469024,"
    "海東地區|630200,香港|810000,澳門|820000,昌都地區|540300,山南地區|540500,日喀則地區|540200,"
    "那曲地區|540600,林芝地區|540400,吐魯番地區|650400,銅仁地區|520600,畢節地區|520500,"
    "石家莊|130100,唐山|130200,秦皇島|130300,邯鄲|130400,邢臺|130500,保定|130600,張家口|130700,"
    "承德|130800,滄州|130900,廊坊|131000,衡水|131100,"
    "太原|140100,大同|140200,陽泉|140300,長治|140400,晉城|140500,朔州|140600,晉中|140700,"
    "運城|140800,忻州|140900,臨汾|141000,呂梁|141100,"
    "呼和浩特|150100,包頭|150200,烏海|150300,赤峰|150400,通遼|150500,鄂爾多斯|150600,"
    "呼倫貝爾|150700,巴彥淖爾|150800,烏蘭察布|150900,"
    "沈陽|210100,大連|210200,鞍山|210300,撫順|210400,本溪|210500,丹東|210600,錦州|210700,"
    "營口|210800,阜新|210900,遼陽|211000,盤錦|211100,鐵嶺|211200,朝陽|211300,葫蘆島|211400,"
    "長春|220100,四平|220300,遼源|220400,通化|220500,白山|220600,松原|220700,白城|220800,"
    "哈爾濱|230100,齊齊哈爾|230200,雞西|230300,鶴崗|230400,雙鴨山|230500,大慶|230600,"
    "伊春|230700,佳木斯|230800,七臺河|230900,牡丹江|231000,黑河|231100,綏化|231200,"
    "上海|310000,南京|320100,無錫|320200,徐州|320300,常州|320400,蘇州|320500,南通|320600,"
    "連云港|320700,淮安|320800,鹽城|320900,揚州|321000,鎮江|321100,泰州|321200,宿遷|321300,"
    "杭州|330100,寧波|330200,溫州|330300,嘉興|330400,湖州|330500,紹興|330600,金華|330700,"
    "衢州|330800,舟山|330900,臺州|331000,麗水|331100,"
    "合肥|340100,蕪湖|340200,蚌埠|340300,淮南|340400,馬鞍山|340500,淮北|340600,銅陵|340700,"
    "安慶|340800,黃山|341000,滁州|341100,阜陽|341200,宿州|341300,六安|341500,亳州|341600,"
    "池州|341700,宣城|341800,"
    "福州|350100,廈門|350200,莆田|350300,三明|350400,泉州|350500,漳州|350600,南平|350700,"
    "龍巖|350800,寧德|350900,"
    "南昌|360100,景德鎮|360200,萍鄉|360300,九江|360400,新余|360500,鷹潭|360600,贛州|360700,"
    "吉安|360800,宜春|360900,撫州|361000,上饒|361100,"
    "濟南|370100,青島|370200,淄博|370300,棗莊|370400,東營|370500,煙臺|370600,濰坊|370700,"
    "濟寧|370800,泰安|370900,威海|371000,日照|371100,萊蕪|370100,臨沂|371300,德州|371400,"
    "聊城|371500,濱州|371600,菏澤|371700,"
    "鄭州|410100,開封|410200,洛陽|410300,平頂山|410400,安陽|410500,鶴壁|410600,新鄉|410700,"
    "焦作|410800,濮陽|410900,許昌|411000,漯河|411100,三門峽|411200,南陽|411300,商丘|411400,"
    "信陽|411500,周口|411600,駐馬店|411700,"
    "武漢|420100,黃石|420200,十堰|420300,宜昌|420500,襄陽|420600,鄂州|420700,荊門|420800,"
    "孝感|420900,荊州|421000,黃岡|421100,咸寧|421200,隨州|421300,仙桃|429004,潛江|429005,"
    "天門|429006,"
    "長沙|430100,株洲|430200,湘潭|430300,衡陽|430400,邵陽|430500,岳陽|430600,常德|430700,"
    "張家界|430800,益陽|430900,郴州|431000,永州|431100,懷化|431200,婁底|431300,"
    "廣州|440100,韶關|440200,深圳|440300,珠海|440400,汕頭|440500,佛山|440600,江門|440700,"
    "湛江|440800,茂名|440900,肇慶|441200,惠州|441300,梅州|441400,汕尾|441500,河源|441600,"
    "陽江|441700,清遠|441800,東莞|441900,中山|442000,潮州|445100,揭陽|445200,云浮|445300,"
    "南寧|450100,柳州|450200,桂林|450300,梧州|450400,北海|450500,防城港|450600,欽州|450700,"
    "貴港|450800,玉林|450900,百色|451000,賀州|451100,河池|451200,來賓|451300,崇左|451400,"
    "海口|460100,三亞|460200,五指山|469001,瓊海|469002,儋州|460400,文昌|469005,萬寧|469006,"
    "東方|469007,"
    "重慶|500000,成都|510100,自貢|510300,攀枝花|510400,瀘州|510500,德陽|510600,綿陽|510700,"
    "廣元|510800,遂寧|510900,內江|511000,樂山|511100,南充|511300,眉山|511400,宜賓|511500,"
    "廣安|511600,達州|511700,雅安|511800,巴中|511900,資陽|512000,"
    "貴陽|520100,六盤水|520200,遵義|520300,安順|520400,"
    "昆明|530100,曲靖|530300,玉溪|530400,保山|530500,昭通|530600,麗江|530700,臨滄|530900,"
    "普洱|530800,拉薩|540100,"
    "西安|610100,銅川|610200,寶雞|610300,咸陽|610400,渭南|610500,延安|610600,漢中|610700,"
    "榆林|610800,安康|610900,商洛|611000,"
    "蘭州|620100,嘉峪關|620200,金昌|620300,白銀|620400,天水|620500,武威|620600,張掖|620700,"
    "平涼|620800,酒泉|620900,慶陽|621000,定西|621100,隴南|621200,"
    "西寧|630100,銀川|640100,石嘴山|640200,吳忠|640300,固原|640400,中衛|640500,"
    "烏魯木齊|650100,克拉瑪依|650200,石河子|659001,阿拉爾|659002,圖木舒克|659003,五家渠|659004,"
    "恩施|422800,恩施土家族苗族自治州|422800,延邊|222400,延邊朝鮮族自治州|222400,"
    "神農架地區|429021,神農架林區|429021,湘西州|433100,湘西土家族苗族自治州|433100,"
    "大興安嶺地區|232700,"
    "白沙縣|469025,白沙黎族自治縣|469025,昌江黎族自治縣|469026,樂東黎族自治縣|469027,"
    "陵水黎族自治縣|469028,保亭黎族苗族自治縣|469029,瓊中黎族苗族自治縣|469030,"
    "阿壩州|513200,阿壩藏族羌族自治州|513200,甘孜州|513300,甘孜藏族自治州|513300,"
    "涼山州|513400,涼山彝族自治州|513400,"
    "黔西南布依族苗族自治州|522300,黔東南苗族侗族自治州|522600,黔南布依族苗族自治州|522700,"
    "楚雄州|532300,楚雄彝族自治州|532300,紅河州|532500,紅河哈尼族彝族自治州|532500,"
    "文山州|532600,文山壯族苗族自治州|532600,西雙版納傣族自治州|532800,"
    "大理州|532900,大理白族自治州|532900,德宏州|533100,德宏傣族景頗族自治州|533100,"
    "怒江州|533300,怒江傈僳族自治州|533300,迪慶州|533400,迪慶藏族自治州|533400,"
    "阿里地區|542500,臨夏回族自治州|622900,甘南藏族自治州|623000,海北藏族自治州|632200,"
    "黃南藏族自治州|632300,海南藏族自治州|632500,果洛藏族自治州|632600,玉樹藏族自治州|632700,"
    "海西蒙古族藏族自治州|632800,"
    "昌吉回族自治州|652300,博爾塔拉蒙古自治州|652700,巴音郭楞蒙古自治州|652800,"
    "哈密地區|650500,哈密|650500,阿克蘇地區|652900,克孜勒蘇柯爾克孜自治州|653000,"
    "伊犁哈薩克自治州|654000,喀什地區|653100,和田地區|653200,塔城地區|654200,"
    "阿勒泰地區|654300,錫林郭勒盟|152500,阿拉善盟|152900"
)
name = [x.split("|")[0] for x in CODE.split(',')]
number = [x.split("|")[1] for x in CODE.split(',')]
# Map id -> name; when an id appears more than once, the last name wins.
code = dict(zip(number, name))
# Not used below: the request takes a single date parameter set in the URL.
time = list(range(20200101, 20200130))

# Note: '武漢1' is also the folder used by the migration-index script. Files that
# already exist there are skipped, so pick another folder if that script was run.
if not os.path.exists('武漢1'):
    os.mkdir('武漢1')
os.chdir('武漢1')
exist = {i[:-5] for i in os.listdir()}  # strip the '.xlsx' suffix


def Open(url):
    """Fetch url and return the JSON object inside the JSONP wrapper."""
    heads = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
    req = urllib.request.Request(url, headers=heads)
    # Pass req (not url) so the User-Agent header is actually sent.
    response = urllib.request.urlopen(req)
    text = response.read().decode('utf-8')
    # The body looks like callback({...}); keep only what is inside the parentheses.
    return json.loads(text[text.index('(') + 1:text.rindex(')')])


def conserve(data, name):
    """Write the date -> travel intensity mapping of one city, sorted by date."""
    times = []
    value = []
    for i in data['list']:
        times.append(i)
        value.append(data['list'][i])
    res = pd.DataFrame({'時間': times, '出行強度': value})
    res = res.sort_values(by='時間')
    res.to_excel(name + '.xlsx')


def main():
    for num, name in code.items():
        if name in exist:
            continue
        pd.DataFrame().to_excel(name + '.xlsx')
        try:
            print(name)
            url = ('http://huiyan.baidu.com/migration/internalflowhistory.jsonp?dt=city&id='
                   + num + '&date=20200301')
            conserve(Open(url)['data'], name)
        except Exception as err:
            # Some regions return no data; report the problem and move on.
            print('skipped', name, err)


main()

The result is shown in the figure below, already sorted by date: