pachong2
<!--
XPath tutorial: https://blog.csdn.net/li6727975/article/details/46126079
JSON parsing tutorial: https://blog.csdn.net/luxideyao/article/details/77802389
-->
<module name="招聘" type="51job">
  <!-- Set the value of the keyword input below to whatever search term you need -->
  <select>
    <input name="keyword" type="text" value="java" label="Search keyword: a job title, company name, etc., same as on the 51job website"/>
  </select>
  <webSite>https://www.51job.com/</webSite>
  <result>職位,地點,薪資,公司名稱,地址,公司性質,規模,分類,招聘要求,發布時間,公司福利,職位信息,公司信息</result>
  <!-- The site has anti-crawler protection, so proxy IPs must be rotated; this only takes effect with a plan that includes proxy IPs -->
  <proxyInfo />
  <!-- Variable substitution rule for this engine: ${variableName} -->
  <operator name="category" desc="Get the total number of pages">
    <request charset="gbk">
      <url>http://search.51job.com/list/000000,000000,0000,00,9,99,${keyword},2,1.html?lang=c&amp;stype=1&amp;postchannel=0000&amp;workyear=99&amp;cotype=99&amp;degreefrom=99&amp;jobterm=99&amp;companysize=99&amp;lonlat=0%2C0&amp;radius=-1&amp;ord_field=0&amp;confirmdate=9&amp;fromType=&amp;dibiaoid=0&amp;address=&amp;line=&amp;specialarea=00&amp;from=&amp;welfare=</url>
      <header>
        Connection: keep-alive
        Upgrade-Insecure-Requests: 1
        User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36
        Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
        Referer: https://search.51job.com
        Accept-Encoding: gzip, deflate, br
        Accept-Language: zh-CN,zh;q=0.9
      </header>
      <output>
        <field name="total_pages" desc="Total pages">
          <parser>//*[@class="p_in"]/span[1]</parser>
          <script>NumberUtil;getNumber;${total_pages}</script>
        </field>
      </output>
    </request>
  </operator>
  <operator name="pagination" desc="Pagination (pagination is a system-reserved name)">
    <page for="1 &lt;= pageNo &lt;= ${total_pages}">
      <request charset="gbk">
        <url>http://search.51job.com/list/000000,000000,0000,00,9,99,${keyword},2,${pageNo}.html?lang=c&amp;stype=1&amp;postchannel=0000&amp;workyear=99&amp;cotype=99&amp;degreefrom=99&amp;jobterm=99&amp;companysize=99&amp;lonlat=0%2C0&amp;radius=-1&amp;ord_field=0&amp;confirmdate=9&amp;fromType=&amp;dibiaoid=0&amp;address=&amp;line=&amp;specialarea=00&amp;from=&amp;welfare=</url>
        <header>
          Connection: keep-alive
          Upgrade-Insecure-Requests: 1
          User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36
          Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
          Referer: https://search.51job.com
          Accept-Encoding: gzip, deflate, br
          Accept-Language: zh-CN,zh;q=0.9
        </header>
        <output>
          <table for="4 &lt;= i">
            <field name="listUrl">
              <parser>//*[@id="resultList"]/div[${i}]/p/span/a/@href</parser>
            </field>
            <field name="發布時間">
              <parser>//*[@id="resultList"]/div[${i}]/span[4]</parser>
            </field>
          </table>
        </output>
      </request>
    </page>
    <criteria>
      <request charset="gbk" desc="Follow each list entry and scrape the detail page">
        <url>${listUrl}</url>
        <header>
          Host: jobs.51job.com
          Connection: keep-alive
          Upgrade-Insecure-Requests: 1
          User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.87 Safari/537.36
          Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
          Referer: https://search.51job.com/list/000000,000000,0000,00,9,99,${keyword},2,${pageNo}.html?lang=c&amp;stype=&amp;postchannel=0000&amp;workyear=99&amp;cotype=99&amp;degreefrom=99&amp;jobterm=99&amp;companysize=99&amp;providesalary=99&amp;lonlat=0%2C0&amp;radius=-1&amp;ord_field=0&amp;confirmdate=9&amp;fromType=&amp;dibiaoid=0&amp;address=&amp;line=&amp;specialarea=00&amp;from=&amp;welfare=
          Accept-Encoding: gzip, deflate, br
          Accept-Language: zh-CN,zh;q=0.9
        </header>
        <output>
          <field name="職位">
            <parser>//*[@class="tHeader tHjob"]/div/div[1]/h1</parser>
          </field>
          <field name="地點">
            <parser>/html/body/div[3]/div[2]/div[2]/div/div[1]/span</parser>
          </field>
          <field name="薪資">
            <parser>/html/body/div[3]/div[2]/div[2]/div/div[1]/strong</parser>
          </field>
          <field name="公司名稱">
            <parser>/html/body/div[3]/div[2]/div[2]/div/div[1]/p[1]/a</parser>
          </field>
          <!-- Intermediate field: a "|"-separated string that the next three fields split apart -->
          <field name="value">
            <parser>/html/body/div[3]/div[2]/div[2]/div/div[1]/p[2]</parser>
          </field>
          <field name="公司性質">
            <script>"${value}".split("|")[0];</script>
          </field>
          <field name="規模">
            <script>"${value}".split("|")[1];</script>
          </field>
          <field name="分類">
            <script>"${value}".split("|")[2];</script>
          </field>
          <field name="招聘要求">
            <parser>/html/body/div[3]/div[2]/div[3]/div[1]/div/div</parser>
          </field>
          <field name="公司福利">
            <parser>/html/body/div[3]/div[2]/div[3]/div[1]/div/p</parser>
          </field>
          <field name="職位信息">
            <parser>/html/body/div[3]/div[2]/div[3]/div[2]/div/p[1]</parser>
          </field>
          <field name="地址">
            <parser>/html/body/div[3]/div[2]/div[3]/div[3]/div/p/text()</parser>
          </field>
          <field name="公司信息">
            <parser>/html/body/div[3]/div[2]/div[3]/div[4]/div/text()[1]</parser>
          </field>
        </output>
      </request>
    </criteria>
  </operator>
</module>
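
The config relies on two small mechanisms: the `${variableName}` substitution rule and the `1 <= pageNo <= ${total_pages}` page loop, plus the `split("|")` scripts that break the intermediate `value` field into three columns. The Python sketch below only illustrates those ideas outside the engine; it is not the engine's actual implementation. The function names `substitute`, `page_urls`, and `split_company_info`, the shortened `SEARCH_URL` (query string omitted), and the sample `value` string are all assumptions made for illustration.

```python
from string import Template

# Shortened form of the search URL from the config; the long query string is
# omitted here for readability (assumption, not the full URL).
SEARCH_URL = "http://search.51job.com/list/000000,000000,0000,00,9,99,${keyword},2,${pageNo}.html"


def substitute(template, variables):
    """Replace each ${name} placeholder with its value.

    Placeholders with no matching variable are left untouched.
    """
    return Template(template).safe_substitute(variables)


def page_urls(keyword, total_pages):
    """Mimic the page loop 1 <= pageNo <= total_pages: one URL per page."""
    return [
        substitute(SEARCH_URL, {"keyword": keyword, "pageNo": page_no})
        for page_no in range(1, total_pages + 1)
    ]


def split_company_info(value):
    """Mimic the three split("|") scripts that read the `value` field."""
    parts = [part.strip() for part in value.split("|")]
    # Pad so a shorter string yields empty fields instead of an IndexError.
    parts += [""] * (3 - len(parts))
    return {"公司性質": parts[0], "規模": parts[1], "分類": parts[2]}


if __name__ == "__main__":
    for url in page_urls("java", 3):
        print(url)
    # Hypothetical sample of the "|"-separated text on a detail page.
    print(split_company_info("private company | 50-150 people | software"))
```

Note that the scripts in the config index the split result directly, so a detail page with fewer than three `|`-separated parts would behave differently there; the padding above is a choice made for this sketch only.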
Reposted from: https://www.cnblogs.com/sky-ai/p/9839095.html