Fighting Web Crawlers with Rust: Here Are the Tutorial and Code You Wanted

Date: 2024-04-06 10:33:57

Tags: website, tutorial, HTTP, like, crawler, HellPot, login, your, Rust

Ever wanted to mess with people scanning the web for vulnerabilities? I certainly did. This is the story of how I found a way to punish them, then used Rust to improve it, and then killed my web server using a van.

Step 0: Getting Annoyed

Alright, so if you’ve ever run a website at any scale and happened to look at the access logs, you will soon find that a lot of the incoming requests have nothing to do with your website. Many of them instead request paths like /wp-login.php, /.env and /.git/config. It turns out that a lot of different people want to either steal your database password or log in to your WordPress site. While not surprising, it is a bit annoying when you are trying to check the stats for your site.
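To keep such probes out of the site statistics, the request paths can be filtered before counting. The following is only a minimal sketch: the names `PROBE_PATHS`, `is_probe` and `real_visits` are hypothetical, and the list contains just the three paths quoted above, whereas a real list would be much longer.

```rust
/// Probe targets quoted in the text. This short list is an assumption made
/// for illustration; real scanners try many more paths.
const PROBE_PATHS: [&str; 3] = ["/wp-login.php", "/.env", "/.git/config"];

/// Returns true when `path` (with or without a query string) is one of the
/// known probe targets.
fn is_probe(path: &str) -> bool {
    // Drop everything after the first '?' so "/x?y=1" matches "/x".
    let bare = path.split('?').next().unwrap_or(path);
    PROBE_PATHS.contains(&bare)
}

/// Keeps only the request paths that do not look like scanner probes.
fn real_visits<'a>(paths: &[&'a str]) -> Vec<&'a str> {
    paths.iter().copied().filter(|p| !is_probe(p)).collect()
}
```

Calling `real_visits` on the paths extracted from an access log leaves only the requests worth counting. How the paths are extracted depends on the log format, which is not covered here.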

This is of course an automated process (or, well, some maniac might do this manually; it’s a big internet after all). Updating your /robots.txt (a file describing how bots are allowed to crawl your website) won’t help, because no self-respecting password-stealing bot would ever bother to read it. However, big companies like Google do respect this file (with some exceptions). Could we somehow use this to our advantage?
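For readers who have not seen one, a robots.txt is a short text file of `User-agent` and `Disallow` lines. As a rough illustration, the hypothetical helper `robots_txt` below renders a file that asks every compliant crawler to stay away from a given list of paths. The function name and the choice of a single `User-agent: *` group are assumptions made for this sketch.

```rust
/// Renders a robots.txt body asking all compliant crawlers to skip `paths`.
/// Bots that ignore robots.txt are, of course, unaffected.
fn robots_txt(paths: &[&str]) -> String {
    // One group that applies to every crawler.
    let mut out = String::from("User-agent: *\n");
    for path in paths {
        out.push_str("Disallow: ");
        out.push_str(path);
        out.push('\n');
    }
    out
}
```

For example, `robots_txt(&["/wp-login.php", "/.env"])` yields a `User-agent: *` line followed by one `Disallow:` line per path.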

Step 1: Finding the Gates of Hell

While looking into ways to mess with our annoying bot friends, I stumbled upon HellPot, an HTTP honeypot designed to crash bots attempting to scrape a website by simply giving them what they asked for. Any HTTP request to HellPot on specified paths (like the aforementioned /wp-login.php) will be met with an eternal stream of data from The Birth of Tragedy (Hellenism and Pessimism) by Friedrich Nietzsche that kind of looks like a website. We just make sure to put the same paths in our robots.txt, so that bingbot does not end up experiencing Nietzsche at several MB/s.
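The core idea of an endless response can be sketched in a few lines of Rust using only the standard library. This is not HellPot's actual code. `RESPONSE_HEAD`, `stream_forever` and `serve_trap` are hypothetical names, and matching the request against the trap paths is left out. The sketch assumes the caller has already decided that the connection belongs to a bot.

```rust
use std::io::Write;

/// Response head sent once before the body. With no Content-Length header,
/// the client has to keep reading until the connection closes.
const RESPONSE_HEAD: &[u8] =
    b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nConnection: close\r\n\r\n";

/// Writes `chunk` to `out` over and over until a write fails, which for a
/// socket usually means the client hung up or crashed.
/// Returns the number of bytes sent in complete chunks before the failure.
fn stream_forever<W: Write>(out: &mut W, chunk: &[u8]) -> u64 {
    if chunk.is_empty() {
        return 0; // nothing to send; avoid spinning forever
    }
    let mut sent = 0u64;
    while out.write_all(chunk).is_ok() {
        sent += chunk.len() as u64;
    }
    sent
}

/// Sends the response head once, then the endless body.
fn serve_trap<W: Write>(out: &mut W, chunk: &[u8]) -> u64 {
    if out.write_all(RESPONSE_HEAD).is_err() {
        return 0;
    }
    stream_forever(out, chunk)
}
```

In a real server, `out` would be something like a `std::net::TcpStream` accepted from a `TcpListener`, and `chunk` would be a block of HTML-looking text. Each trapped connection keeps its handler busy until the bot gives up, so it should run on its own thread or task.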

From： https://blog.csdn.net/yu101994/article/details/137383625
