From the search page, we open at random one of the result pages that contains data.
www.shujujidi.com
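Inspect the elements that hold the data we need and note their characteristics (here, each data row is a <tr> containing two <td> cells), then write the code: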
from bs4 import BeautifulSoup
import requests

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36 Edg/126.0.0.0"
}
# A timeout keeps the script from hanging if the site does not respond
response = requests.get("https://www.shujujidi.com/caijing/544.html", headers=headers, timeout=10).text

# Parse the HTML with BeautifulSoup
soup = BeautifulSoup(response, 'html.parser')

# Find all <tr> tags
rows = soup.find_all('tr')

# Store the extracted data
data = []
for row in rows:
    cols = row.find_all('td')
    if len(cols) == 2:  # Make sure the row has exactly two cells
        year = cols[0].text.strip()  # Extract the year
        value = float(cols[1].text.strip())  # Extract the value and convert it to a float
        data.append((year, value))

# Print the extracted data
for item in data:
    print(f"Year: {item[0]}, Value: {item[1]}")
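One fragile spot in the script above is the direct float(...) call: it raises ValueError if a two-cell row holds anything other than a plain number. The exact cell formatting on the page is not shown here, so the following is only a minimal sketch under assumptions: the helper names parse_number and clean_rows are hypothetical, the comma-stripping rule is a guess about how values might be formatted, and the sample cell texts are placeholders rather than real figures.

```python
def parse_number(text):
    """Return text as a float, or None if it is not numeric."""
    # Assumption: thousands separators, if any, are plain commas.
    cleaned = text.strip().replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


def clean_rows(raw_rows):
    """Keep only (year, value) pairs whose value parses as a number."""
    cleaned = []
    for year_text, value_text in raw_rows:
        value = parse_number(value_text)
        if value is not None:
            cleaned.append((year_text.strip(), value))
    return cleaned


# Placeholder cell texts for illustration only, not figures from the page.
sample = [("Year", "GDP"), (" 2001 ", "1,234.5"), ("2002", "n/a"), ("2003", "42")]
print(clean_rows(sample))
```

In the scraping loop, the pairs passed to clean_rows would be (cols[0].text, cols[1].text), so rows with non-numeric cells are skipped instead of stopping the run.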
Run the program to get Jiangsu Province's GDP for each year.
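Once the (year, value) pairs are collected, it may be convenient to save them rather than only print them. The sketch below uses Python's standard csv module; the function name write_gdp_csv and the column names are hypothetical, and the data list holds placeholder values, not the site's figures.

```python
import csv
import io


def write_gdp_csv(data, fileobj):
    """Write (year, value) pairs to fileobj as CSV with a header row."""
    writer = csv.writer(fileobj)
    writer.writerow(["year", "value"])
    writer.writerows(data)


# Placeholder pairs standing in for the scraped `data` list.
data = [("2001", 1.0), ("2002", 2.0)]

# An in-memory buffer keeps the example self-contained.
buffer = io.StringIO()
write_gdp_csv(data, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with open("gdp.csv", "w", newline="", encoding="utf-8"); the file name here is only an example.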
From: https://blog.csdn.net/YaaYaa_/article/details/140248349