
Daily Summary 2024/1/15 (Web Scraping Study)

Posted: 2024-01-15 23:34:05
Tags: baidu, BFESS, 15, web scraping, accept, 2024, application, com, MSA

Reference article:

"python爬虫 - Python3.x+Fiddler抓取APP数据" (Python scraping: capturing app data with Python 3.x and Fiddler), 学习分享, SegmentFault 思否

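What I captured was browser traffic rather than app traffic.

The capture succeeded and the browser's request is visible, but reproducing that request in Python ran into a number of version problems and missing-configuration problems. The captured request also uses HTTP/2, and I am not sure whether that has any further effect. The code so far is: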

"""
GET https://m.baidu.com/s?word=%E7%9F%B3%E5%AE%B6%E5%BA%84&opfc=1 HTTP/2
host: m.baidu.com
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (Linux; Android 7.1.2; SM-G977N Build/LMY48Z; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/75.0.3770.143 Mobile Safari/537.36
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3
referer: https://m.baidu.com/?from=844b&vit=fps
accept-encoding: gzip, deflate
accept-language: zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7
x-requested-with: com.android.browser
cookie: BAIDUID=3ED233718A42B05529E739BC5B06191C:FG=1; BAIDUID_BFESS=3ED233718A42B05529E739BC5B06191C:FG=1; bd_af=1; kleck=4e9a1215601bba92d8209d1d02da24a2; BDSVRTM=6; BDORZ=AE84CDB3A529C0F8A2B9DCDD1D18B695; SE_LAUNCH=5%3A1705329123; POLYFILL=0; BA_HECTOR=a4al8580a4282ga4000k0ka4v2ttic1iqagfc1s; ZFY=aZwTfnzUTuuEpLiW5Hy5F3zJcwRrFA4RMpopOVBZqQI:C; MSA_WH=450_724; MSA_PBT=148; MSA_ZOOM=1000; MSA_PHY_WH=900_1600; shifen[680740050152_15861]=1705329137; H_WISE_SIDS=282632_286879_110085_287513_287837_287168_280169_284912_288373_283782_287982_288710_288714_288718_288601_288742_288746_288749_287620_284816_269050_265881_281890_289949_289950_289956_290237_290370_290498_286491_290555_290562_282553_282815_289431_290987_287977_286234_291051_203517_291151_277936_290425_288871_256739_290667_288252_291510_281879_291398_286910_291727_290567_283016_291868_291956_291989_292027_292135_292224_292167_292249_292247_292251_284551_292073_292089_289527_291192_292327_292314_292357_292363_287174_287718_282466_292506_292345_292614_292710_292773_292414_292459_292453_292822_292804_287703_289094_292583_292893_291327_8000065_8000107_8000126_8000142_8000143_8000149_8000151_8000163_8000175_8000185; PSINO=1; H_WISE_SIDS_BFESS=282632_286879_110085_287513_287837_287168_280169_284912_288373_283782_287982_288710_288714_288718_288601_288742_288746_288749_287620_284816_269050_265881_281890_289949_289950_289956_290237_290370_290498_286491_290555_290562_282553_282815_289431_290987_287977_286234_291051_203517_291151_277936_290425_288871_256739_290667_288252_291510_281879_291398_286910_291727_290567_283016_291868_291956_291989_292027_292135_292224_292167_292249_292247_292251_284551_292073_292089_289527_291192_292327_292314_292357_292363_287174_287718_282466_292506_292345_292614_292710_292773_292414_292459_292453_292822_292804_287703_289094_292583_292893_291327_8000065_8000107_8000126_8000142_8000143_8000149_8000151_8000163_8000175_8000185; BAIDULOC=11975304.608602_4139954.0751319_1_359_1705329581618; H5LOC=1; 
BAIDULOC_BFESS=11975304.608602_4139954.0751319_1_359_1705329581618; seClickID=52f3551186394218; wpr=0; COOKIE_SESSION=446_0_0_2_0_t2_11_2_1_0_0_1_9_1705329568%7C2%23446_t2_0_1_1_2_5_2_1705329568%7C2; FC_MODEL=0_0_0_0_700.68_0_0_0_0_0_720.21_0_2_11_1_5_0_0_1705329568%7C2%23700.68_0_0_2_1_0_1705329568%7C2%230_ax_1_0_0_0_0_1705329568; shifen[730377201155_68091]=1705330314; BCLID=11154939026636247212; BCLID_BFESS=11154939026636247212; BDSFRCVID=jhkOJeC62GYZcUJq6tUy5PqNP2K2rBJTH6amWV8ifk80i3kPJfb2EG0PdU8g0KuhSv7IogKKBmOTHg-F_2uxOjjg8UtVJeC6EG0Ptf8g0x5; BDSFRCVID_BFESS=jhkOJeC62GYZcUJq6tUy5PqNP2K2rBJTH6amWV8ifk80i3kPJfb2EG0PdU8g0KuhSv7IogKKBmOTHg-F_2uxOjjg8UtVJeC6EG0Ptf8g0x5; H_BDCLCKID_SF=JnPjVI-2JK83qJTph47hqR-8MxrK2JT3KC_X3b7Ef-bVsh7_bf--D60HyHDO-J_qbNQX_xJhBfJHolTg2Rjxy5K_htJjt4CfKH4e24jc2POkqC3HQT3mXlQbbN3i3xrwBKJuWb3cWhoV8UbSbIcPBTD02-nBat-OQ6npaJ5nJq5nhMJmb67JDbv0eG_DqT_OtbC8V-35b5rWjJjvM-n_bntJ5eT22-us2DbW2hcH0KLKsUTL-pokKjK33qutQnowLnriotoPWfb1MRjz3pDWMtKfea30J-nTWDcaoq5TtUJfSDnTDMRhMhtBjPryKMni0Dj9-pnjHlQrh459XP68bTkA5bjZKxtq3mkjbPbDfn02JKKu-n5jHjoWDG_f3H; H_BDCLCKID_SF_BFESS=JnPjVI-2JK83qJTph47hqR-8MxrK2JT3KC_X3b7Ef-bVsh7_bf--D60HyHDO-J_qbNQX_xJhBfJHolTg2Rjxy5K_htJjt4CfKH4e24jc2POkqC3HQT3mXlQbbN3i3xrwBKJuWb3cWhoV8UbSbIcPBTD02-nBat-OQ6npaJ5nJq5nhMJmb67JDbv0eG_DqT_OtbC8V-35b5rWjJjvM-n_bntJ5eT22-us2DbW2hcH0KLKsUTL-pokKjK33qutQnowLnriotoPWfb1MRjz3pDWMtKfea30J-nTWDcaoq5TtUJfSDnTDMRhMhtBjPryKMni0Dj9-pnjHlQrh459XP68bTkA5bjZKxtq3mkjbPbDfn02JKKu-n5jHjoWDG_f3H; __bsi=11480918622058509743_00_19_N_N_0_0303_c02f_Y


"""


import requests

# Search URL copied from the captured request; the word parameter is the URL-encoded query "石家庄"
url = 'https://m.baidu.com/s?word=%E7%9F%B3%E5%AE%B6%E5%BA%84&opfc=1'
headers = {
    'host': 'm.baidu.com',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Linux; Android 7.1.2; SM-G977N Build/LMY48Z; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/75.0.3770.143 Mobile Safari/537.36',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'referer': 'https://m.baidu.com/?from=844b&vit=fps',
    'accept-encoding': 'gzip, deflate',
    'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
    'x-requested-with': 'com.android.browser',
    # The cookie value is abridged in this post; paste the full cookie from the capture before running
    'cookie': 'BAIDUID=3ED233718A42B05529E739BC5B06191C:FG=1; BAIDUID_BFESS=3ED233718A42B05529E739BC5B06191C:FG=1; bd_af=1; kleck=4e9a1215601bba92d8209d1d02da24a2; BDSVRTM=6; BDORZ=AE84CDB3A529C0F8A2B9DCDD1D18B695; SE_LAUNCH=5%3A1705329123; POLYFILL=0; BA_HECTOR=a4al8580a4282ga4000k0ka4v2ttic1iqagfc1s; ZFY=aZwTfnzUTuuEpLiW5Hy5F3zJcwRrFA4RMpopOVBZqQI:C; MSA_WH=450_724; MSA_PBT=148; MSA_ZOOM=1000; MSA_PHY_WH=900_1600; shifen[680740050152_15861]=1705329137; H_WISE_SIDS=...(a large amount of content omitted here)... __bsi=11480918622058509743_00_19_N_N_0_0303_c02f_Y'
}
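# Named "response" rather than "re" so the standard-library re module is not shadowed.
# The timeout keeps the script from hanging if the server never answers.
response = requests.get(url=url, headers=headers, timeout=10)
print(response.status_code)
print(response.text)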
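
Copying every header from the capture into a dict by hand is easy to get wrong, especially with a cookie this long. As a minimal sketch (not part of the original script), the raw request text could be parsed instead. The function names `parse_raw_request` and `parse_cookie_header` are made up for illustration, and the cookie values `a=1; b=2; c=3` in the sample are placeholders, not real Baidu cookies. The sketch assumes the capture is pasted as plain text and that a line which does not look like `name: value` is a wrapped continuation of the previous header.

```python
import re
from typing import Dict, Tuple

# A header line looks like "name: value"; anything else is treated as a wrapped value.
HEADER_LINE = re.compile(r"^([A-Za-z0-9-]+):\s*(.*)$")


def parse_raw_request(raw: str) -> Tuple[str, str, Dict[str, str]]:
    """Split a raw captured request into (method, url, headers)."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    # Request line, e.g. "GET https://... HTTP/2"; the version is not needed here.
    method, url, _version = lines[0].split(" ", 2)
    headers: Dict[str, str] = {}
    last_name = None
    for ln in lines[1:]:
        match = HEADER_LINE.match(ln)
        if match:
            last_name = match.group(1).lower()
            headers[last_name] = match.group(2)
        elif last_name is not None:
            # Wrapped line: append it to the previous header's value.
            headers[last_name] += " " + ln
    return method, url, headers


def parse_cookie_header(value: str) -> Dict[str, str]:
    """Split a Cookie header value into a name -> value dict."""
    cookies: Dict[str, str] = {}
    for pair in value.split(";"):
        name, sep, val = pair.strip().partition("=")
        if sep:
            cookies[name] = val
    return cookies


# Shortened sample; the cookie values are placeholders for illustration only.
SAMPLE = """
GET https://m.baidu.com/s?word=%E7%9F%B3%E5%AE%B6%E5%BA%84&opfc=1 HTTP/2
host: m.baidu.com
accept-encoding: gzip, deflate
cookie: a=1; b=2;
c=3
"""

method, url, headers = parse_raw_request(SAMPLE)
print(method, url)
print(headers)
print(parse_cookie_header(headers["cookie"]))
```

The resulting `headers` dict can be passed to `requests.get(url, headers=headers)` as in the script above. On the HTTP/2 question: as far as I know, `requests` sends the request over HTTP/1.1 whatever version appears in the capture, so the `HTTP/2` on the first line only describes what the browser negotiated.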

 

From: https://www.cnblogs.com/azwz/p/17966645
