Web Scraper - HDU Problem Information
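Handy for writing blog posts in Markdown: the script fetches an HDU problem page and saves the statement as Markdown text.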
import requests
from bs4 import BeautifulSoup


# Get the section headings
def get_title(soup):
    return soup.find_all(name='div', attrs='panel_title')


# Get the section text
def get_content(soup):
    return soup.find_all(name='div', attrs='panel_content')


# Get the sample input/output
def get_sample(soup):
    return soup.find_all(name='pre')


# url = "http://acm.hdu.edu.cn/showproblem.php?pid=1004"
print("Please input url:")
url = input()
html = requests.get(url)
soup = BeautifulSoup(html.text, "lxml")

title_list = get_title(soup)[0:5]
content_list = get_content(soup)[0:3]
sample_list = get_sample(soup)[0:2]

title = list()
content = list()
sample = list()
sections = list()  # renamed from `all`, which shadowed the built-in

# Process the text
for i in title_list:
    title.append('#### ' + i.text + '\r\n')
for i in content_list:
    content.append(i.text + '\r\n')
for i in sample_list:
    sample.append('```\n' + i.text + '\n```' + '\r\n')
content += sample

# zip stops at the shorter list, so a page with fewer than five sections
# does not raise IndexError
for t, c in zip(title, content):
    sections.append(t + c)

content_title = soup.title.text + soup.h1.text
before = '```\n' + "title: " + content_title + '\n' + "time: \n" + 'tags: \n' + 'categories: ACM\n' + '\n```\n'
# print(before)

# Write as UTF-8 so the output does not depend on the platform default encoding
with open('HDU.txt', 'w', encoding='utf-8') as f:
    f.write(before)
    for i in sections:
        f.write(i)
    f.write("#### AC\n- ")
print("Done!")
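The text-processing step of the script can be tried without network access. The following is only a minimal sketch: `build_markdown`, its parameter names, and the `FENCE` constant are hypothetical and do not appear in the original script, and the sample strings are made up for illustration. It pairs each heading with its body the same way the script does, wrapping the sample input/output in code fences.

```python
FENCE = "`" * 3  # built at runtime so this listing contains no literal fence


def build_markdown(titles, contents, samples):
    """Pair each section title with its body and return one Markdown string.

    Hypothetical helper: the function and its argument names are
    illustrative and are not part of the original script.
    """
    # Sample input/output blocks are wrapped in code fences, as in the script.
    fenced = [f"{FENCE}\n{s}\n{FENCE}" for s in samples]
    bodies = list(contents) + fenced
    # zip stops at the shorter list, so a page with fewer sections than
    # expected does not raise IndexError.
    sections = [f"#### {t}\r\n{b}\r\n" for t, b in zip(titles, bodies)]
    return "".join(sections)


if __name__ == "__main__":
    # Made-up strings standing in for the text scraped from a problem page.
    print(build_markdown(
        ["Problem Description", "Input", "Output", "Sample Input", "Sample Output"],
        ["Find the most popular color.", "N colors.", "One color."],
        ["3\nred\nblue\nred\n0", "red"],
    ))
```

Keeping the formatting in a pure function like this means it can be checked with plain strings, while the `requests` and `BeautifulSoup` calls stay in a thin outer layer.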