The following is a Python template for crawling through a tunnel proxy:
```python
import requests
# Proxy server settings
proxy_host = "your_proxy_host"
proxy_port = "your_proxy_port"
proxy_username = "your_proxy_username"
proxy_password = "your_proxy_password"
# Target URL
target_url = "your_target_url"
# Build the proxy configuration. The credentials are embedded in the proxy
# URL, so a separate HTTPProxyAuth object is unnecessary (passing auth=
# would send the credentials to the target site, not the proxy). Note that
# the "https" entry also uses the http:// scheme: it names the protocol
# used to talk to the proxy itself, which tunnels HTTPS traffic via CONNECT.
proxy = {
    "http": f"http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}",
    "https": f"http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}",
}
# Send the request
response = requests.get(target_url, proxies=proxy, timeout=10)
# Handle the response
if response.status_code == 200:
    # Parse the page content
    html = response.text
    # Further processing
    ...
else:
    print(f"Request failed with status {response.status_code}")
```
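As an alternative to passing `proxies=` explicitly, `requests` also honours the standard `HTTP_PROXY`/`HTTPS_PROXY` environment variables, which keeps credentials out of the source code. A minimal sketch; the host, port, and credentials here are placeholders:

```python
import os
import urllib.request

# Point the standard proxy environment variables at the tunnel proxy.
# requests (and most HTTP clients) pick these up automatically.
os.environ["HTTP_PROXY"] = "http://user:pass@proxy.example.com:8080"
os.environ["HTTPS_PROXY"] = "http://user:pass@proxy.example.com:8080"

# urllib's getproxies() shows what clients will resolve from the environment
print(urllib.request.getproxies()["http"])
```

With the variables set, a plain `requests.get(target_url)` will route through the proxy with no `proxies=` argument.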
Note that when crawling through a proxy server, you must make sure you have legitimate access rights and authorization. You also need to replace `your_proxy_host`, `your_proxy_port`, `your_proxy_username`, and `your_proxy_password` in the code with your actual proxy server details.
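One subtlety when filling in real credentials: characters such as `@` or `:` in the username or password will break the proxy URL unless they are percent-encoded. A small helper sketch; the function name and example values are illustrative:

```python
from urllib.parse import quote

def build_proxy_url(host, port, username, password):
    """Build an authenticated proxy URL, percent-encoding the
    credentials so reserved characters like '@' or ':' in the
    password cannot be confused with URL delimiters."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"

url = build_proxy_url("proxy.example.com", 8080, "user", "p@ss:w0rd")
# The encoded URL can then be used for both the "http" and "https" entries
proxies = {"http": url, "https": url}
```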
A second, equivalent template builds the proxy URL with %-formatting (the original was Python 2; the `print` calls are updated for Python 3):
```python
# -*- coding: utf-8 -*-
import requests

# Target page to fetch
targetUrl = "http://ip.hahado.cn/ip"

# Proxy server
proxyHost = "ip.hahado.cn"
proxyPort = "39010"

# Tunnel proxy credentials
proxyUser = "username"
proxyPass = "password"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

resp = requests.get(targetUrl, proxies=proxies)
print(resp.status_code)
print(resp.text)
```
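Tunnel proxies typically assign a fresh exit IP per request, so transient connection failures are common; wrapping the call in a small retry loop with a timeout makes either template more robust. A sketch, assuming the helper name is our own; the `get` parameter exists only to make the function easy to test with a stub and is not part of the requests API:

```python
import time
import requests

def fetch_with_retry(url, proxies, attempts=3, timeout=10, backoff=1.0,
                     get=requests.get):
    """GET `url` through `proxies`, retrying on network errors or
    non-200 responses, with exponential backoff between attempts."""
    last_err = None
    for i in range(attempts):
        try:
            resp = get(url, proxies=proxies, timeout=timeout)
            if resp.status_code == 200:
                return resp
            last_err = RuntimeError("HTTP %d" % resp.status_code)
        except requests.RequestException as err:
            last_err = err
        if i < attempts - 1:
            time.sleep(backoff * (2 ** i))  # wait 1x, 2x, 4x ... backoff
    raise last_err
```

Each retry goes back through the tunnel, so under a rotating proxy it will usually exit from a different IP than the failed attempt.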
From: https://blog.51cto.com/u_15822686/6579807