
Scraping a Novel Site with Python (Storing the Data in a Database)


1. Parsing with BeautifulSoup:

# parse the response HTML, then drill down to the <li> items of the ranking list
bs = BeautifulSoup(resp.text, 'html.parser')
book_img_text = bs.find('div', class_='book-img-text')
ul = book_img_text.find('ul')
li_list = ul.find_all('li')
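
find() returns None whenever the page markup does not match, so a slightly more defensive version of the per-item extraction can be useful. The following is only a sketch built on the li_list above, assuming the same Qidian class names (book-img-box, book-mid-info, intro, update); entries with missing elements are simply skipped.

lst = []
for item in li_list:
    box = item.find('div', class_='book-img-box')
    info = item.find('div', class_='book-mid-info')
    if box is None or info is None:
        continue  # markup changed for this entry; skip it
    rank_tag = box.find('span')
    title = info.find('h2')
    intro = item.find('p', class_='intro')
    update = item.find('p', class_='update')
    if None in (rank_tag, title, intro, update):
        continue
    # strip() removes the surrounding whitespace BeautifulSoup keeps from the HTML
    lst.append([rank_tag.text.strip(), title.text.strip(),
                intro.text.strip(), update.text.strip()])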


2. MySQL storage:

my_db = mysql.connector.connect(host='localhost', user='root', password='root', database='python_db',
                                auth_plugin='mysql_native_password')
my_cursor = my_db.cursor()

# print(my_db)
# SQL statement
sql = 'insert into tbl_qidian (rank_tag_no1, name, intro, updata) values (%s,%s,%s,%s)'
# run the batch insert
my_cursor.executemany(sql, lst)
# commit the transaction
my_db.commit()

print("保存完毕")  # "save complete"
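
Once the commit succeeds, it is also good practice to close the cursor and the connection, and a quick row count confirms that the batch actually landed. A minimal sketch, assuming the my_db / my_cursor objects and the tbl_qidian table shown above:

# sanity check: how many rows are now in the table?
my_cursor.execute('select count(*) from tbl_qidian')
print('rows in tbl_qidian:', my_cursor.fetchone()[0])

# release the database resources
my_cursor.close()
my_db.close()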


3. Table creation statement (SQL):

create table tbl_qidian(
    id int(4) primary key auto_increment,
    rank_tag_no1 varchar(255),
    name varchar(255),
    intro varchar(2550),
    updata varchar(255)
);
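
If you would rather not run the DDL by hand, the same statement can be issued from the Python script before the insert. This is just a sketch reusing the my_cursor connection from section 2; "if not exists" makes it safe to run repeatedly, and the column names (including updata) match the table above.

ddl = '''
create table if not exists tbl_qidian(
    id int(4) primary key auto_increment,
    rank_tag_no1 varchar(255),
    name varchar(255),
    intro varchar(2550),
    updata varchar(255)
)
'''
# create the table once; later runs are a no-op thanks to "if not exists"
my_cursor.execute(ddl)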


4. Complete code:

import requests
from bs4 import BeautifulSoup
import mysql.connector


url = "https://www.qidian.com/rank/yuepiao/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) "
                  "Chrome/86.0.4240.198 Safari/537.36"
}
resp = requests.get(url=url, headers=headers)
# print(resp.status_code)
# print(resp.text)
bs = BeautifulSoup(resp.text, 'html.parser')
book_img_text = bs.find('div', class_='book-img-text')
ul = book_img_text.find('ul')
li_list = ul.find_all('li')
lst = []
for item in li_list:
    box = item.find('div', class_='book-img-box')
    rank_tag_no1 = box.find('span').text
    names = item.find('div', class_='book-mid-info')
    name = names.find('h2').text
    intro = item.find('p', class_='intro').text
    update = item.find('p', class_='update').text
    # print(rank_tag_no1, name, intro, update)
    lst.append([rank_tag_no1, name, intro, update])
#
# for i in lst:
# print(i)

# lst = [['小说名称', '作者']]
# for i in range(0, len(names)):
# lst.append([names[i], author[i]])
# # for item in lst:
# # print(item)
# wk = openpyxl.Workbook()
# sheet = wk.active
# for item in lst:
# sheet.append(item)
# wk.save("11-某某小说月票榜.xlsx")


my_db = mysql.connector.connect(host='localhost', user='root', password='root', database='python_db',
                                auth_plugin='mysql_native_password')
my_cursor = my_db.cursor()

# print(my_db)
# SQL statement
sql = 'insert into tbl_qidian (rank_tag_no1, name, intro, updata) values (%s,%s,%s,%s)'
# run the batch insert
my_cursor.executemany(sql, lst)
# commit the transaction
my_db.commit()

print("保存完毕")  # "save complete"
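
Because the batch insert runs against live scraped data, it can fail partway through (for example if an intro exceeds the column width). Wrapping the insert in a try/except with a rollback keeps the table consistent; this is only a sketch layered on the code above, not part of the original script.

try:
    my_cursor.executemany(sql, lst)
    my_db.commit()
    print("保存完毕")
except mysql.connector.Error as err:
    my_db.rollback()  # undo the partial batch so the table stays consistent
    print("insert failed:", err)
finally:
    my_cursor.close()
    my_db.close()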


5. Result screenshot:

[Screenshot: the batch-inserted rows in the python_db database]


From: https://blog.51cto.com/u_14012524/5890361
