
scrapy ja3 tls
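
The snippet below makes Scrapy shuffle the order of the TLS cipher suites it offers, so the client's JA3 fingerprint (an MD5 over the TLS version, cipher list, extensions, curves and point formats in the ClientHello) varies instead of staying constant across requests. It works by subclassing HTTP11DownloadHandler and building a fresh ScrapyClientContextFactory with a randomly reordered tls_ciphers string for every download. Note that Scrapy pools persistent connections, so the new cipher order only takes effect when a new TLS connection is actually established.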

# -*- coding:utf-8 -*-
import random

from scrapy.core.downloader.contextfactory import ScrapyClientContextFactory
from scrapy.core.downloader.handlers.http11 import HTTP11DownloadHandler, ScrapyAgent

# Base OpenSSL cipher string; the order of these entries feeds into the JA3 hash.
ORIGIN_CIPHERS = 'TLS13-AES-256-GCM-SHA384:TLS13-CHACHA20-POLY1305-SHA256:TLS13-AES-128-GCM-SHA256:ECDH+AESGCM:ECDH+CHACHA20:DH+AESGCM:DH+CHACHA20:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES'


def shuffle_ciphers():
    """Return ORIGIN_CIPHERS in a random order, with weak suites excluded."""
    ciphers = ORIGIN_CIPHERS.split(":")
    random.shuffle(ciphers)
    ciphers = ":".join(ciphers)

    # Debug output: show the cipher order chosen for this request.
    print("________")
    print(ciphers)
    print("________")

    return ciphers + ":!aNULL:!MD5:!DSS"


class MyHTTPDHandler(HTTP11DownloadHandler):

    def download_request(self, request, spider):
        """Return a deferred for the HTTP download."""
        # Build a fresh TLS context factory per request, so every new connection
        # offers the ciphers in a different order and thus a different JA3 hash.
        tls_ciphers = shuffle_ciphers()
        _contextFactory = ScrapyClientContextFactory(tls_ciphers=tls_ciphers)

        agent = ScrapyAgent(
            contextFactory=_contextFactory,
            pool=self._pool,
            maxsize=getattr(spider, 'download_maxsize', self._default_maxsize),
            warnsize=getattr(spider, 'download_warnsize', self._default_warnsize),
            fail_on_dataloss=self._fail_on_dataloss,
            crawler=self._crawler,
        )
        return agent.download_request(request)


class MyHTTPDownloadHandler(MyHTTPDHandler):
    # The class referenced from DOWNLOAD_HANDLERS; it simply reuses the logic above.
    def download_request(self, request, spider):
        return super().download_request(request, spider)
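
As a quick sanity check, the shuffled string can be handed straight to pyOpenSSL to confirm OpenSSL accepts it as a cipher list. This is a minimal sketch meant to run in the same module as shuffle_ciphers() above; it is not the exact code path Scrapy/Twisted takes, and the TLS13-* aliases OpenSSL does not recognize are skipped during parsing as long as other entries match.

# Standalone check that the shuffled string is a usable OpenSSL cipher list.
# SSLv23_METHOD is the same default method ScrapyClientContextFactory uses.
from OpenSSL import SSL

ctx = SSL.Context(SSL.SSLv23_METHOD)
# set_cipher_list raises SSL.Error if nothing in the string matches a cipher.
ctx.set_cipher_list(shuffle_ciphers().encode("ascii"))
print("cipher string accepted")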

To enable the handler, override DOWNLOAD_HANDLERS for both schemes, e.g. in the spider's custom_settings (the dotted path assumes the code above is saved as middlewares/sc_middlewares.py inside the project):

'DOWNLOAD_HANDLERS': {
    'http': 'middlewares.sc_middlewares.MyHTTPDownloadHandler',
    'https': 'middlewares.sc_middlewares.MyHTTPDownloadHandler',
},
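
For an end-to-end check, a minimal spider sketch (not part of the original post) could look like the following; the module path matches the settings above, and the request URL is a placeholder for any service that echoes the caller's JA3 hash:

import scrapy


class Ja3CheckSpider(scrapy.Spider):
    name = 'ja3_check'

    custom_settings = {
        'DOWNLOAD_HANDLERS': {
            'http': 'middlewares.sc_middlewares.MyHTTPDownloadHandler',
            'https': 'middlewares.sc_middlewares.MyHTTPDownloadHandler',
        },
    }

    def start_requests(self):
        # Every request that opens a new TLS connection offers a freshly shuffled
        # cipher list; requests reusing a pooled connection report the same hash.
        for _ in range(3):
            yield scrapy.Request(
                'https://example.com/ja3-echo',  # placeholder JA3 echo endpoint
                dont_filter=True,
            )

    def parse(self, response):
        self.logger.info('JA3 echo response: %s', response.text[:200])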

From: https://blog.51cto.com/angdh/8015540
