Scraping Novels and Saving Them in xls Format

Time: 2022-10-29 22:47:02 | Views: 52

Tags: __, url, text, scraping, soup, dict, copy to, xls, response

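import requests
import bs4
import pandas as pd


def l():
    # Reads the module-level `soup` (current page) and appends rows to `li`.
    # Query each column once per page rather than once per row.
    books = soup.find_all('a', attrs={'class': 'jt'})
    authors = soup.find_all('td', attrs={'class': 'td6'})
    # At most 30 rows per page; zip() stops at the shorter list,
    # so a page with fewer rows does not raise IndexError.
    for book, author in zip(books[:30], authors[:30]):
        row = {}
        row['名称'] = book.text    # column header: title
        row['作者'] = author.text  # column header: author
        li.append(row)


if __name__ == '__main__':
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36'
    }
    base_url = 'https://www.17k.com/all/book/2_29_179_0_0_0_0_0_'
    li = []
    for i in range(10):
        # Build the listing-page URL: base_url followed by "<i>.html"
        url = base_url + '{parame}.html'.format(parame=i)
        response = requests.get(url=url, headers=headers)
        # Decode the page HTML as UTF-8
        response.encoding = 'utf-8'
        soup = bs4.BeautifulSoup(response.text, 'html.parser')
        l()
    # Build the table and write the file once, after all pages are collected
    df = pd.DataFrame(li)
    print(df)
    # Writing .xls relies on the xlwt engine, which pandas 2.0 and later no longer
    # support; on those versions a .xlsx file name (with openpyxl installed) may be needed.
    df.to_excel('linz.xls')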
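
The parsing step above can also be tried without any network access. The following is a minimal sketch, not the author's method: the function name `parse_books` and the `sample_html` snippet are hypothetical, and the markup only imitates the class names (`jt`, `td6`) used in the script; the real page layout on 17k.com may differ.

```python
import bs4


def parse_books(html):
    """Return a list of {'名称': title, '作者': author} dicts from one listing page."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    titles = soup.find_all('a', attrs={'class': 'jt'})
    authors = soup.find_all('td', attrs={'class': 'td6'})
    # zip() stops at the shorter list, so a short or empty page yields fewer rows
    # instead of raising IndexError.
    return [
        {'名称': t.get_text(strip=True), '作者': a.get_text(strip=True)}
        for t, a in zip(titles, authors)
    ]


# Hypothetical markup for illustration only (assumed structure, not copied from the site)
sample_html = """
<table>
  <tr><td class="td3"><a class="jt" href="#">Book A</a></td><td class="td6">Author A</td></tr>
  <tr><td class="td3"><a class="jt" href="#">Book B</a></td><td class="td6">Author B</td></tr>
</table>
"""

rows = parse_books(sample_html)
print(rows)
```

Because the function takes the HTML as an argument and returns the rows, it does not depend on the module-level `soup` and `li` variables, which makes it easier to check the selectors against a saved copy of a page before running the full crawl.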

From: https://www.cnblogs.com/JK8395/p/16840077.html
