Trouble capturing XHR requests with Selenium

Tags: python selenium-webdriver xmlhttprequest

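First of all, I am not a developer, so I used AI to generate code that captures an XHR request from a web page, for example: https://www.oddsportal.com/football/brazil/serie-a/bragantino-athletico-pr-xx0ujiJ5/ (this is just an example). I want to get the score from the XHR request on that page rather than by other means, such as locating it by class name.

The most interesting part for me is that when you open the page, the network request I want does not appear until you click on another tab or the desktop and then return to the target page. Only then does the request I need show up. The response to that request gives me the match score data.

Here is the Selenium code. It captures some requests, but not the one I am looking for. The code should show you my situation. Because of how that site can be accessed from my country, I have to use the Opera browser with the Chrome DevTools. I mention this for completeness, but I don't believe it is the cause, because the code does capture some requests and lists them in the terminal. If anyone can reproduce this, please help me.

Below is the code of my .py file. I expect it to capture the request named 1-xx0ujiJ5-yj3ae.dat, whose URL is: https://www.oddsportal.com/feed/postmatch-score/1-xx0ujiJ5-yj3ae.dat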

    import json
    import time
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.chrome.options import Options

    # Paths
    driver_path = r'C:\Users\tugbe\Desktop\basit_oddsportal\drivers\chromedriver-win64\chromedriver.exe'
    binary_path = r'C:\Users\tugbe\AppData\Local\Programs\Opera\opera.exe'

    # Set up Chrome options for Opera
    options = Options()
    options.binary_location = binary_path
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--ignore-ssl-errors')
    options.add_argument("--auto-open-devtools-for-tabs")  # Automatically open DevTools
    options.debugger_address = "127.0.0.1:9222"  # Connect to the existing Opera instance

    # Enable Performance logging
    options.set_capability('goog:loggingPrefs', {'performance': 'ALL'})

    # Initialize the WebDriver service
    service = Service(driver_path)

    # Initialize the WebDriver with options and capabilities
    driver = webdriver.Chrome(service=service, options=options)

    # Enable Network logging
    driver.execute_cdp_cmd('Network.enable', {})

    def capture_network_logs(duration=60):
        start_time = time.time()
        captured_urls = set()
        print(f"Capturing network logs for {duration} seconds...")
        while time.time() - start_time < duration:
            logs = driver.get_log('performance')
            for entry in logs:
                log = json.loads(entry['message'])['message']
                if log['method'] == 'Network.requestWillBeSent':
                    request = log['params']['request']
                    if 'url' in request and request['url'] not in captured_urls:
                        print(f"Request URL: {request['url']}")
                        print(f"Request Method: {request['method']}")
                        print(f"Request Headers: {json.dumps(request['headers'], indent=2)}")
                        print("=" * 80)
                        captured_urls.add(request['url'])
            time.sleep(1)  # Sleep for a short while before capturing logs again

    try:
        # Open a new tab in the existing Opera window
        driver.execute_script("window.open('about:blank', '_blank');")
        driver.switch_to.window(driver.window_handles[-1])

        # Open the target URL in the new tab
        url = 'https://www.oddsportal.com/football/brazil/serie-a/bragantino-athletico-pr-xx0ujiJ5/'
        driver.get(url)

        # Wait until the page is loaded
        time.sleep(10)  # Wait for 10 seconds to ensure the page is fully loaded

        # Inspect an element to trigger the DevTools
        driver.execute_script("document.querySelector('body').click();")

        # Wait a bit to ensure the element is inspected
        time.sleep(5)

        # Refresh the page to ensure all requests are captured
        driver.refresh()

        # Wait until the page is loaded again
        time.sleep(10)

        # Click on the second opened tab
        driver.switch_to.window(driver.window_handles[1])

        # Wait a bit in the second tab
        time.sleep(5)

        # Switch back to the target tab
        driver.switch_to.window(driver.window_handles[-1])

        # Wait a bit to ensure the switch is complete
        time.sleep(5)

        # Capture and print network logs for 1 minute (60 seconds)
        capture_network_logs(duration=60)

    finally:
        # Close the WebDriver
        driver.quit()

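The problem is the timing of the capture relative to the request. You have correctly identified that the XHR request containing the score data is not loaded until you click away from the page and then come back.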

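Your current code opens the page, waits, switches tabs, and only then starts listening for network requests. The code needs to be restructured so that it is already listening when the page interaction triggers the request.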
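As a rough illustration of keeping the listener responsive around the interaction, here is a minimal sketch of a polling helper that returns as soon as a matching request shows up instead of blocking for a fixed time. The names `iter_request_urls` and `wait_for_request` are hypothetical helpers, not part of Selenium, and `get_entries` stands for any callable that returns performance log entries, for example `lambda: driver.get_log('performance')`.

```python
import json
import time


def iter_request_urls(entries):
    """Yield the URL of every Network.requestWillBeSent event in `entries`."""
    for entry in entries:
        message = json.loads(entry["message"])["message"]
        if message.get("method") == "Network.requestWillBeSent":
            yield message["params"]["request"]["url"]


def wait_for_request(get_entries, needle, timeout=30.0, poll_interval=0.5):
    """Poll `get_entries()` until a request URL contains `needle`.

    Hypothetical helper: returns the matching URL, or None if `timeout`
    seconds pass without a match.
    """
    deadline = time.monotonic() + timeout
    while True:
        for url in iter_request_urls(get_entries()):
            if needle in url:
                return url
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)
```

One possible use is to call `wait_for_request(lambda: driver.get_log('performance'), 'postmatch-score')` right after switching tabs and back. This assumes the driver keeps buffering performance log entries between `get_log` calls, which is worth verifying when attaching to an existing Opera instance.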

Here is how the code can be modified to capture the desired XHR request:

import json
import time
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options

# ... (existing path, options, and service setup) ...

driver = webdriver.Chrome(service=service, options=options)
driver.execute_cdp_cmd('Network.enable', {})

def capture_network_logs(duration=60):
    start_time = time.time()
    captured_urls = set()
    print(f"Capturing network logs for {duration} seconds...")
    while time.time() - start_time < duration:
        logs = driver.get_log('performance')
        for entry in logs:
            log = json.loads(entry['message'])['message']
            if log['method'] == 'Network.requestWillBeSent':
                request = log['params']['request']
                if 'url' in request and request['url'] not in captured_urls:
                    print(f"Request URL: {request['url']}")
                    print(f"Request method: {request['method']}")
                    print(f"Request headers: {json.dumps(request['headers'], indent=2)}")
                    print("=" * 80)
                    captured_urls.add(request['url'])

                    # Check whether this is the target request
                    if '1-xx0ujiJ5-yj3ae.dat' in request['url']:
                        # Handle the request here, e.g. extract the score data
                        print("Found the target request!")

        time.sleep(1)

try:
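    # ... (existing code that opens the tab and loads the URL) ...

    # Start capturing network logs *before* switching tabs.
    # Note: this call returns only after `duration` seconds have passed.
    capture_network_logs()

    # ... (existing code that switches tabs and interacts with the page) ...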

finally:
    driver.quit()

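The changes in this code are:

capture_network_logs() is started before the page interaction that triggers the XHR request. This ensures the code is listening when the request happens.
A condition was added inside capture_network_logs() to check whether a captured request is the one you are looking for. If it is, you can go on to extract the score data from its response.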
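The check for the target request could also be written so that it does not depend on one hard-coded file name. The sketch below derives the match ID from the page URL and tests whether a request URL points at the score feed. It assumes, based only on the single example in the question, that the last hyphen-separated token of the page URL (`xx0ujiJ5`) is the match ID and that the feed lives under `/feed/postmatch-score/`. Both function names are hypothetical.

```python
from urllib.parse import urlparse

# Path prefix taken from the example feed URL in the question (an assumption
# that it is the same for other matches).
FEED_PATH_PREFIX = "/feed/postmatch-score/"


def match_id_from_page_url(page_url):
    """Return the last hyphen-separated token of the final path segment.

    For the example page URL this is 'xx0ujiJ5'. Treating this token as the
    match ID is an assumption based on one example.
    """
    segments = [s for s in urlparse(page_url).path.split("/") if s]
    if not segments:
        return None
    return segments[-1].rsplit("-", 1)[-1]


def is_score_feed(request_url, match_id):
    """Return True if `request_url` looks like the score feed for `match_id`."""
    path = urlparse(request_url).path
    if not path.startswith(FEED_PATH_PREFIX):
        return False
    filename = path[len(FEED_PATH_PREFIX):]
    return filename.endswith(".dat") and match_id in filename.split("-")
```

Inside the capture loop, `is_score_feed(request['url'], match_id_from_page_url(url))` could then replace the literal `'1-xx0ujiJ5-yj3ae.dat' in request['url']` test.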

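By starting to listen for network requests before the page interaction that triggers them, you should be able to capture the desired XHR request with Selenium.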
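Since the score data is in the response rather than in the request itself, a further step would be to read the response body. The sketch below is one possible way to do that: it picks the `requestId` from `Network.responseReceived` events in the performance log and passes it to the DevTools command `Network.getResponseBody`. The helper names are hypothetical, the format of the `.dat` payload is not known from the question, and the call can fail if the browser has not finished loading the body or no longer holds it.

```python
import base64
import json


def find_response_request_ids(entries, needle):
    """Return requestIds of Network.responseReceived events whose URL contains `needle`."""
    request_ids = []
    for entry in entries:
        message = json.loads(entry["message"])["message"]
        if message.get("method") != "Network.responseReceived":
            continue
        params = message["params"]
        if needle in params["response"]["url"]:
            request_ids.append(params["requestId"])
    return request_ids


def read_response_body(driver, request_id):
    """Fetch a response body through the DevTools protocol and return it as text."""
    result = driver.execute_cdp_cmd("Network.getResponseBody", {"requestId": request_id})
    if result.get("base64Encoded"):
        # Binary bodies come back base64-encoded; decode them to text.
        return base64.b64decode(result["body"]).decode("utf-8", errors="replace")
    return result["body"]
```

A possible usage is `ids = find_response_request_ids(driver.get_log('performance'), 'postmatch-score')` followed by `read_response_body(driver, ids[0])` when the list is not empty.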

From: 78790854
