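Web Crawlers 1 (Beginner Basics)
I. What Is a Web Crawler
A web crawler is code that mimics how a real user operates a browser, so that data can be collected from the Internet automatically.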
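II. The HTTP Protocol
III. What Is a URL
URL: Uniform Resource Locator, a way of identifying the complete address of a web page or other resource on the Internet.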
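To make the parts of an address concrete, here is a minimal sketch that splits a URL with Python's standard `urllib.parse` module. The URL itself is a made-up example, not one taken from these notes.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL used only to show the parts of a typical address.
url = "https://www.example.com:8080/search/index.html?q=python&page=2#results"

parts = urlsplit(url)
print(parts.scheme)    # https
print(parts.hostname)  # www.example.com
print(parts.port)      # 8080
print(parts.path)      # /search/index.html
print(parts.query)     # q=python&page=2
print(parts.fragment)  # results

# parse_qs turns the query string into a dict that maps each key to a list of values.
query = parse_qs(parts.query)
print(query)           # {'q': ['python'], 'page': ['2']}
```

The scheme names the protocol, the host and port locate the server, and the path, query, and fragment locate the resource on that server.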
IV. Request Headers
V. What the Request Header Fields Mean
![image-20240423161804279](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150519.png)
![image-20240423161819002](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150520.png)
![image-20240423161943982](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150521.png)
![image-20240423162004304](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150522.png)
![image-20240423162112716](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150523.png)
![image-20240423162242693](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150524.png)
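As a rough illustration of how header fields are attached to a request, the sketch below builds a request object with the standard `urllib.request` module and prints its headers. The URL, the header values, and the choice of these three fields are assumptions made for the example; they are not read from the screenshots above.

```python
import urllib.request

# Hypothetical target URL and header values, for illustration only.
url = "https://www.example.com/"
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # identifies the client software
    "Referer": "https://www.example.com/home",  # page the request claims to come from
    "Accept-Language": "en-US,en;q=0.9",  # preferred response languages
}

# Building the Request object does not send anything over the network.
request = urllib.request.Request(url, headers=headers)

# urllib stores header names via str.capitalize(), so "User-Agent" becomes "User-agent".
for name, value in request.header_items():
    print(f"{name}: {value}")
```

Servers often inspect fields such as `User-Agent` to decide how to respond, which is why crawlers usually set them to browser-like values.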
VI. The requests Library
![image-20240423162541549](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150525.png)
1. GET Requests
![image-20240425165458517](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150526.png)
![image-20240423164842076](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150527.png)
![image-20240425165620025](https://gitee.com/drenched-with-snow/pic-go/raw/master/202404261150528.png)
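A GET request with `requests` might look like the sketch below. It assumes the `requests` package is installed. The target URL (the public httpbin.org test service), the query parameters, and the helper names `build_get` and `fetch` are all chosen for illustration and do not come from these notes. The request is prepared without being sent, so the final URL can be inspected offline.

```python
import requests

# Hypothetical target, query parameters, and headers, chosen only for illustration.
BASE_URL = "https://httpbin.org/get"
PARAMS = {"q": "python", "page": 1}
HEADERS = {"User-Agent": "Mozilla/5.0 (example-crawler)"}


def build_get(url, params=None, headers=None):
    """Prepare a GET request without sending it, so the final URL can be inspected."""
    return requests.Request("GET", url, params=params, headers=headers).prepare()


def fetch(url, params=None, headers=None, timeout=10):
    """Send the GET request and return the response; raises on an HTTP error status."""
    response = requests.get(url, params=params, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response


# requests encodes the params dict into the query string of the URL.
prepared = build_get(BASE_URL, PARAMS, HEADERS)
print(prepared.method)  # GET
print(prepared.url)     # https://httpbin.org/get?q=python&page=1
```

Calling `fetch(BASE_URL, PARAMS, HEADERS)` would actually send the request; the returned object exposes `status_code`, `text`, and `json()` for reading the result.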