1: Install the libraries
pip install requests beautifulsoup4
pip install pandas openpyxl
2: Scrape the data
We use https://cuiqingcai.com/archives/ as the example site and scrape its article titles.
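import requests
from bs4 import BeautifulSoup
import pandas as pd
import openpyxl  # Excel engine that pandas uses for .xlsx files

# Request the page and stop early on an HTTP error
res = requests.get("https://cuiqingcai.com/archives/", timeout=10)
res.raise_for_status()
soup = BeautifulSoup(res.text, "html.parser")

# Collect the text of every <div class="post-title">
data = []
for div in soup.find_all("div", class_="post-title"):
    data.append(div.get_text(strip=True))

# Save to Excel
df = pd.DataFrame(data, columns=["Data"])
df.to_excel("data.xlsx", index=False)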
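The parsing step can be tried without touching the network. The sketch below is only an illustration: `SAMPLE_HTML`, `extract_titles`, and `titles_to_frame` are hypothetical names, and the sample markup is an assumption about what a page using the `post-title` class might look like, not a copy of the real site.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample markup; the real page structure may differ.
SAMPLE_HTML = """
<div class="post-title"><a href="/1.html">  First post </a></div>
<div class="post-title"><a href="/2.html">Second post</a></div>
<div class="sidebar">Not a title</div>
"""


def extract_titles(html):
    """Return the stripped text of every <div class="post-title">."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        div.get_text(strip=True)
        for div in soup.find_all("div", class_="post-title")
    ]


def titles_to_frame(titles):
    """Wrap the titles in a one-column DataFrame named "Data"."""
    return pd.DataFrame(titles, columns=["Data"])


titles = extract_titles(SAMPLE_HTML)
df = titles_to_frame(titles)
print(df)
```

Keeping the parsing in its own function makes it easy to check the selector against saved HTML before running it on the live page. The resulting `df` can then be written out with `df.to_excel("data.xlsx", index=False)`, as in the script above.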
From: https://www.cnblogs.com/xlei/p/17095623.html