[Python] Basic usage of pyppeteer

Posted: 2024-07-16 17:30:25

Tags: pyppeteer input Python random simple import page await asyncio

Scrape the titles on the first page of Baidu search results for "python"

import sys
sys.path.append("/home/user/.local/lib/python3.9/site-packages")  # add the package directory to the module search path (sys.path)
import asyncio
from pyppeteer import launch
from pyppeteer_stealth import stealth  # anti-detection module; hides browser automation fingerprints
import random

width,height = 1366,768

async def main():
    browser = await launch(
        {"executablePath": "/opt/browser360/browser360-cn"},  # use an already-installed browser
        headless=False,  # turn headless mode off so the browser window is shown
        userDataDir="./userdata",  # user data directory; restores browsing history and many sites' login state
        args=["--disable-infobars", f"--window-size={width},{height}"],  # hide the "Chrome is being controlled by automated test software" notice; set the window size
        # devtools=True,  # open DevTools; when True, the headless option is ignored and forced to False
    )
    page = await browser.newPage()  # open a new tab
    await stealth(page)  # anti-detection: hide automation fingerprints
    await page.setViewport({"width": width, "height": height})  # viewport size

    await page.goto("https://www.baidu.com/")
    # Use JavaScript to make navigator.webdriver return false, to evade webdriver detection
    await page.evaluate("""() =>{ Object.defineProperties(navigator,{ webdriver:{ get: () => false } }) }""")
    await page.type('input#kw.s_ipt', "python")  # type the search keyword "python"
    await asyncio.sleep(random.random() * random.randint(3, 5))  # random pause, under 5 seconds
    await page.click('input#su')  # click the search button
    await asyncio.sleep(random.random() * random.randint(3, 5))
    await page.evaluate('window.scrollBy(0, window.innerHeight)')  # scroll down by one viewport height
    title_elements = await page.Jx('//h3/a')  # collect the title links on the first results page

    # Clear the keyword from the search box so keywords do not accumulate
    await page.evaluate('document.querySelector("input[id=kw]").value=""')
    await asyncio.sleep(random.random() * random.randint(3, 5))

    for element in title_elements:
        # print the text content of every matched node
        print(await (await element.getProperty('textContent')).jsonValue())

    await asyncio.sleep(10)
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
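
The pauses in the script come from `random.random() * random.randint(3, 5)`, which can be anywhere from 0 to just under 5 seconds, so a pause is occasionally close to zero. Below is a minimal sketch of a helper with a guaranteed lower bound. The names `pick_delay` and `human_pause` and the default bounds are hypothetical; they are not part of pyppeteer or of the original script.

```python
import asyncio
import random


def pick_delay(low: float = 1.0, high: float = 5.0) -> float:
    """Return a random delay in seconds between low and high."""
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    return random.uniform(low, high)


async def human_pause(low: float = 1.0, high: float = 5.0) -> float:
    """Sleep for a random delay and return how long the pause was."""
    delay = pick_delay(low, high)
    await asyncio.sleep(delay)
    return delay
```

Inside `main()`, `await human_pause()` could then stand in for each `await asyncio.sleep(random.random() * random.randint(3, 5))` line.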

Titles from the first page:

Part of the search results page
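
To see what the `h3/a` XPath and the `textContent` lookup pick out without launching a browser, here is a small offline sketch using only the standard library. The markup in `SAMPLE` is made up for illustration and is not Baidu's real HTML, and `extract_titles` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup standing in for a results page (an assumption).
SAMPLE = """
<div id="content_left">
  <div class="result"><h3><a href="https://example.com/1">Welcome to Python.org</a></h3></div>
  <div class="result"><h3><a href="https://example.com/2">Python <em>tutorial</em></a></h3></div>
  <div class="result"><p><a href="https://example.com/3">not a title</a></p></div>
</div>
"""


def extract_titles(markup: str) -> list[str]:
    """Return the text of every <a> that is a direct child of an <h3>."""
    root = ET.fromstring(markup.strip())
    # ".//h3/a" is ElementTree's spelling of the "//h3/a" XPath.
    # itertext() joins nested text, similar to textContent in the browser.
    return ["".join(a.itertext()).strip() for a in root.findall(".//h3/a")]


print(extract_titles(SAMPLE))
```

Only links nested directly under an `h3` are returned, so the third link in the sample is skipped.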

From: https://www.cnblogs.com/shan-gui-yao/p/18305743
