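I'm trying to scrape CS2 props from PrizePicks, but my script gets stuck at one part of the code and I don't know how to fix it. I tried using the API, but that didn't work well for me, so I'm trying to pull the data from app.prizepicks.com instead. Any advice would be greatly appreciated, because I really don't know what else to try.
Here is the code: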
from path import Path
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import pandas as pd
import undetected_chromedriver as uc
############################################################################
# Initialize the WebDriver
driver = uc.Chrome()
driver.maximize_window() # Open the window in full screen mode
############################################################################
# Scraping PrizePicks
driver.get("https://app.prizepicks.com/")
time.sleep(5)
# Waiting and closes popup
WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "close")))
time.sleep(5)
driver.find_element(By.XPATH, "/html/body/div[3]/div[3]/div/div/button").click()
time.sleep(3)
# Creating tables for players
ppPlayers = []
# It will click CS2 tab, if you want to scrape a different sport, change 'CS2' to the sport of your liking.
driver.find_element(By.XPATH, "//div[@class='name'][normalize-space()='CS2']").click()
time.sleep(3)
# Waits until stat container element is viewable
stat_container = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.CLASS_NAME, "stat-container")))
# Finding all the stat elements within the stat-container. (In this case, it's Map 1-2 Kills, Headshots, etc.)
categories = driver.find_element(By.XPATH, "//div[@role='button'][normalize-space()='MAPS 1-2 Headshots']").click()
projectionsPP = (By.CSS_SELECTOR, ".projection")
wait = WebDriverWait(driver, 25)
element_wait = wait.until(EC.presence_of_all_elements_located(projectionsPP))
# Creating tables for players
ppPlayers = []
############################################################################
# Data Scraping Automation begins here
for projection in projectionsPP:
    names = projection.find_element_by_xpath('.//div[@class="name"]').text
    value = projection.find_element_by_xpath('.//div[@class="presale-score"]').get_attribute('innerHTML')
    proptype = projection.find_element(By.CLASS_NAME, "text").get_attribute('innerHTML')
    players = {
        'Name': names,
        'Value': value,
        'Prop': proptype.replace("<wbr>", "")
    }
    ppPlayers.append(players)
dfProps = pd.DataFrame(ppPlayers)
dfProps.to_csv('CS2 Props.csv')
print("PrizePicks Props Offered: ", '\n')
print(dfProps)
print('\n')
driver.quit()
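I expected it to pull the prop information from the CS2 tab and write it to an Excel sheet, but when I run it in the debugger, it goes to the CS2 tab and gets stuck at this specific part. It times out every time, no matter how much I increase the wait: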
projectionsPP = (By.CSS_SELECTOR, ".projection")
wait = WebDriverWait(driver, 25)
element_wait = wait.until(EC.presence_of_all_elements_located(projectionsPP))
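Any help would be greatly appreciated. Also, I'm still learning how to use Stack Overflow effectively, so if I've gotten something basic wrong, please bear with me. Thanks!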
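There appear to be a few problems in the code. Here is how to address them.
- Targeting problem: the code tries to locate the MAPS 1-2 Headshots element, but clicking that element does not necessarily reveal the .projection elements you are trying to retrieve. This may be the cause of the TimeoutException, because the expected elements never load where the wait is looking for them.
- Loop logic error: the for loop (for projection in projectionsPP:) is wrong because projectionsPP is a locator tuple, not the list of elements you want to iterate over.
- Missing category iteration: the code is meant to extract CS2 props, but it only targets a single category (MAPS 1-2 Headshots). To scrape every category, you need to add a loop that goes through each one.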
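To see why the original loop cannot work, it helps to look at what a locator actually is. The sketch below needs no browser. It relies on the fact that Selenium's By.CSS_SELECTOR constant is the plain string "css selector", so the tuple is written out with that literal here. Iterating the tuple yields two strings, not page elements:

```python
# In Selenium, By.CSS_SELECTOR is just the string "css selector",
# so a locator is an ordinary tuple of two strings.
projectionsPP = ("css selector", ".projection")

# This is what `for projection in projectionsPP:` actually walks over.
iterated = [item for item in projectionsPP]

for item in iterated:
    # Each item is a str, so it has no find_element method to call.
    print(type(item).__name__, repr(item), hasattr(item, "find_element"))
```

The elements themselves are the return value of wait.until(...), which is why the loop has to run over that result rather than over the locator.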
Here is a revised version of the code that addresses these issues:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import pandas as pd
import undetected_chromedriver as uc
driver = uc.Chrome()
driver.maximize_window()
driver.get("https://app.prizepicks.com/")
time.sleep(5)
WebDriverWait(driver, 15).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "close")))
time.sleep(5)
driver.find_element(By.XPATH, "/html/body/div[3]/div[3]/div/div/button").click()
time.sleep(3)
driver.find_element(By.XPATH, "//div[@class='name'][normalize-space()='CS2']").click()
time.sleep(3)
ppPlayers = []
# Find all categories and iterate over them
categories = WebDriverWait(driver, 10).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@class='stat-container']//div[@role='button']")))
for category in categories:
    category.click()  # Click the category to load its props
    time.sleep(3)  # Adjust the wait; increase it if needed
    # Find all props in the category
    projections = WebDriverWait(driver, 25).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, ".projection")))
    for projection in projections:
        name = projection.find_element(By.CLASS_NAME, "name").text
        value = projection.find_element(By.CLASS_NAME, "presale-score").get_attribute('innerHTML')
        prop_type = projection.find_element(By.CLASS_NAME, "text").get_attribute('innerHTML').replace("<wbr>", "")
        players = {
            'Name': name,
            'Value': value,
            'Prop': prop_type
        }
        ppPlayers.append(players)
dfProps = pd.DataFrame(ppPlayers)
dfProps.to_csv('CS2 Props.csv')
print("PrizePicks Props Offered: ", '\n')
print(dfProps)
print('\n')
driver.quit()
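The changes are:
- We now iterate over all categories and click each one to load its props.
- We wait for the props to load in each category.
- We corrected the for loop to iterate over the elements that were found.
This should resolve the TimeoutException you are hitting and scrape the data correctly. Note that the site's structure may change, so if the code stops working again, you may need to adjust the locators.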
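The per-card extraction can also be factored into a small helper, which makes it easier to check without opening a browser. The following is a minimal sketch, not part of the original answer: projection_to_row, FakeChild, FakeProjection, and the sample values are hypothetical names and made-up data. The class names ("name", "presale-score", "text") are taken from the code above and may not match the live site.

```python
from dataclasses import dataclass

# Locator strategy string; in Selenium this is the value of By.CLASS_NAME.
CLASS_NAME = "class name"


def projection_to_row(projection):
    """Turn one projection card into a dict of name, line value, and prop type.

    `projection` only needs find_element(by, value); the children need .text
    and get_attribute(name), which Selenium's WebElement provides.
    """
    name = projection.find_element(CLASS_NAME, "name").text
    value = projection.find_element(CLASS_NAME, "presale-score").get_attribute("innerHTML")
    prop = projection.find_element(CLASS_NAME, "text").get_attribute("innerHTML")
    return {
        "Name": name.strip(),
        "Value": value.strip(),
        # The prop label can contain <wbr> line-break hints; drop them.
        "Prop": prop.replace("<wbr>", "").strip(),
    }


# --- Hypothetical stand-ins so the helper can be tried without a browser ---
@dataclass
class FakeChild:
    text: str

    def get_attribute(self, name):
        # The fake only knows innerHTML and returns its text for it.
        return self.text if name == "innerHTML" else None


class FakeProjection:
    def __init__(self, children):
        self._children = children  # maps class name -> FakeChild

    def find_element(self, by, value):
        return self._children[value]


card = FakeProjection({
    "name": FakeChild("Player A"),
    "presale-score": FakeChild("27.5"),
    "text": FakeChild("MAPS 1-2 <wbr>Headshots"),
})
row = projection_to_row(card)
print(row)
```

With a real driver, the same helper could be called on each element returned by the wait for ".projection", and the resulting dicts appended to ppPlayers as before.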