
Downloading free resume templates from 站长素材 (sc.chinaz.com) with Python (XPath)

Date: 2022-10-19 18:34:04
Tags: xpath jianli python free down url path template headers

import os

import requests
from lxml import etree

if __name__ == '__main__':
    if not os.path.exists('./jianli'):
        os.mkdir('./jianli')

    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/106.0.0.0 Safari/537.36'
    }
    # Download the templates on the first two listing pages
    for i in range(1, 3):
        if i == 1:
            url = 'https://sc.chinaz.com/jianli/free.html'
        else:
            url = 'https://sc.chinaz.com/jianli/free_' + str(i) + '.html'
        page = requests.get(url=url, headers=headers)
        page.encoding = 'utf-8'
        tree = etree.HTML(page.text)
        free_jianli = tree.xpath('//div[@id="main"]/div/div')

        for free in free_jianli:
            # Link to the template's detail page
            free_url = free.xpath('./a/@href')[0]
            # Template title (taken from the thumbnail's alt text)
            free_title = free.xpath('./a/img/@alt')[0]
            # Fetch the detail page that the link points to
            free_content = requests.get(url=free_url, headers=headers).text
            free_content_tree = etree.HTML(free_content)
            # Locate the download link (first entry in the download list)
            down_path = free_content_tree.xpath('//div[@class="down_wrap"]/div[2]/ul/li[1]/a/@href')[0]
            # File name: template title plus the extension taken from the download URL
            down_path_title = free_title + '.' + down_path.split('.')[-1]
            # Download the file as binary data
            down_path_content = requests.get(url=down_path, headers=headers).content
            # Write it to disk
            with open('./jianli/' + down_path_title, 'wb') as fp:
                fp.write(down_path_content)
                print(down_path_title, "downloaded")
        print("Page {0} downloaded".format(i))
    print('Download complete')
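
The script above builds the listing URL with an if/else on the page number and passes each scraped href straight to requests.get. A minimal sketch of how both steps could be factored into helpers is shown below. The names list_page_url and absolute_url are hypothetical and not part of the original post, and the idea that the site may serve relative or protocol-relative hrefs is an assumption, not something the post confirms.

```python
from urllib.parse import urljoin

# Base of the free-template listing, taken from the URLs in the script above.
BASE = 'https://sc.chinaz.com/jianli/'


def list_page_url(page):
    """Return the listing URL for a 1-based page number (hypothetical helper)."""
    if page < 1:
        raise ValueError('page numbers start at 1')
    # Page 1 is free.html; later pages follow the free_<n>.html pattern.
    name = 'free.html' if page == 1 else 'free_{0}.html'.format(page)
    return urljoin(BASE, name)


def absolute_url(page_url, href):
    """Resolve a scraped href against the page it was found on.

    Absolute URLs pass through unchanged; relative or protocol-relative
    hrefs (an assumption about the site) are completed from page_url.
    """
    return urljoin(page_url, href)
```

With these in place, the loop could call list_page_url(i) instead of the if/else, and wrap free_url and down_path in absolute_url(url, ...) before requesting them.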

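The file name in the script is the template title plus whatever follows the last dot in the download URL. That can fail if the title contains characters that are illegal in file names, or if the URL carries a query string. The following is a rough sketch of a more defensive version, not the author's method: safe_filename is a hypothetical helper, and the '.rar' fallback extension is an assumption made only for illustration.

```python
import os
import re
from urllib.parse import urlparse


def safe_filename(title, download_url, default_ext='.rar'):
    """Build a file name from a template title and its download URL.

    default_ext is an assumed fallback for URLs without an extension.
    """
    # Take the extension from the URL path only, ignoring any query string.
    path = urlparse(download_url).path
    ext = os.path.splitext(path)[1] or default_ext
    # Replace whitespace and characters that are invalid on common
    # file systems with a single underscore.
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', '_', title).strip('_')
    return (cleaned or 'untitled') + ext
```

In the loop, down_path_title could then be computed as safe_filename(free_title, down_path).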
 

From: https://www.cnblogs.com/zhh-blogs/p/16807325.html
