
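Compressing JSON for transmission and splitting it into chunks: several implementations, using zlib compression and fixed-length bytes slicing for large payloads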

Published: 2023-09-26 14:31:38


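This post shows several ways to compress JSON data with zlib before transmission and then split the resulting bytes into chunks of a given length, so that a large payload can be sent piece by piece. The code below assumes a 1 MiB per-message limit (KAFKA_MAX_SIZE) and aims for chunks of about 90% of that limit.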

import sys
import zlib
import json
import math

KAFKA_MAX_SIZE = 1024 * 1024
CONTENT_MIN_MAX_SIZE = KAFKA_MAX_SIZE * 0.9


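def split_data(data):
    """
    Compress a JSON-serializable object and split it into chunks.

    :param data: JSON-serializable object
    :return: list of bytes chunks, each at most KAFKA_MAX_SIZE (as measured by sys.getsizeof)
    """
    # Compress first
    data = zlib.compress(json.dumps(data).encode('utf-8'))
    part_content = b""
    content_list = []

    data_length = len(data)
    # sys.getsizeof reports the in-memory size of the bytes object,
    # i.e. the payload length plus a small fixed overhead
    data_size = sys.getsizeof(data)
    step = math.ceil(data_length / ((data_size / CONTENT_MIN_MAX_SIZE) + 1))  # slice length

    for i in range(0, len(data), step):
        default_content = part_content
        part_content += data[i:i + step]
        cur_part_content_size = sys.getsizeof(part_content)
        # Emit a chunk once its size falls between 0.9 MB and 1 MB
        if CONTENT_MIN_MAX_SIZE <= cur_part_content_size <= KAFKA_MAX_SIZE:
            content_list.append(part_content)
            part_content = b""
        # Adding this slice would exceed the limit: emit what was
        # accumulated so far and start a new chunk with the slice
        elif KAFKA_MAX_SIZE < cur_part_content_size:
            content_list.append(default_content)
            part_content = data[i:i + step]

    if part_content:
        content_list.append(part_content)
    return content_list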


def split_data2(data):
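    """
    Compress a JSON-serializable object and slice it into fixed-length chunks.

    :param data: JSON-serializable object
    :return: list of bytes chunks of at most int(CONTENT_MIN_MAX_SIZE) bytes
    """
    # Compress first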
    data = zlib.compress(json.dumps(data).encode('utf-8'))
    data_length = len(data)
    step = int(CONTENT_MIN_MAX_SIZE)
    content_list = [data[i:i + step] for i in range(0, data_length, step)]
    return content_list


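def split_data3(data):
    """
    Compress a JSON-serializable object and split it with a regular expression.

    :param data: JSON-serializable object
    :return: list of bytes chunks of at most int(CONTENT_MIN_MAX_SIZE) bytes
    """
    # Compress first
    data = zlib.compress(json.dumps(data).encode('utf-8'))
    import re
    step = int(CONTENT_MIN_MAX_SIZE)
    # Bytes left over after the last full-length chunk
    last = data[(len(data) // step) * step:]
    # [\s\S] matches any byte, including newlines; a raw bytes pattern
    # avoids invalid-escape warnings
    content_list = re.findall(rb"[\s\S]{%d}" % step, data)
    # Skip the remainder when the data length is an exact multiple of step
    if last:
        content_list.append(last)
    return content_list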


def re_j_data(content_list):
    # Join the chunks (they must be in their original order) and restore the JSON object
    cur_content = b"".join(content_list)
    data = json.loads(zlib.decompress(cur_content))
    return data


if __name__ == '__main__':
    test_data = {
        "domain": "uoowoo.cn",
        "ip": "122.10.52.140",
        "ip_version": 4,
        "port": 443,
        "path": "/",
        "url": "https://uoowoo.cn/",
        "protocol": "https",
        "transport": "tcp",
        "date": 1624346530364,
        "status_code": 200,
        "header": 'HTTP/1.1 200 OK\\r\\nServer: nginx\\r\\nDate: Tue, 22 Jun 2021 07:22:09 GMT\\r\\nContent-Type: text/html\\r\\nLast-Modified: Fri, 11 Jun 2021 10:25:32 GMT\\r\\nTransfer-Encoding: chunked\\r\\nVary: Accept-Encoding\\r\\nETag: W/\\"60c33a1c-1ba9\\"\\r\\nContent-Encoding: gzip\\r\\nConnection: close\\r\\n\\r\\n',
        "body": "PCFkb2N0eXBlIGh0bWw+DQo8aHRtbCBsYW5nPSJ6aC1DTiI+DQogICA8aGVhZD4NCiAgICAgIDxtZXRhIGNoYXJzZXQ9IlVURi04Ii8+DQogICAgICA8bWV0YSBuYW1lPSJ2aWV3cG9ydCIgY29udGVudD0id2lkdGg9ZGV2aWNlLXdpZHRoLGluaXRpYWwtc2NhbGU9MSxtYXhpbXVtLXNjYWxlPTEsdXNlci1zY2FsYWJsZT1ubyIvPg0KICAgICAgPG1ldGEgbmFtZT0iZm9ybWF0LWRldGVjdGlvbiIgY29udGVudD0idGVsZXBob25lPW5vIj4NCiAgICAgIDx0aXRsZT5PROS9k+iCsjwvdGl0bGU+DQogICAgICA8bWV0YSBuYW1lPSJrZXl3b3JkcyIgY29udGVudD0iT0TkvZPogrIiLz4NCiAgICAgIDxtZXRhIG5hbWU9ImRlc2NyaXB0aW9uIiBjb250ZW50PSJPROS9k+iCsiIvPg0KICAgICAgPGxpbmsgcmVsPSJzaG9ydGN1dCBpY29uIiBocmVmPSIuL2ltYWdlcy9mYXZpY29uLmljbyIvPg0KICAgICAgPHNjcmlwdCBzcmM9Ii4vanMvaW5kZXguanMiPjwvc2NyaXB0Pg0KICAgICAgPGxpbmsgaHJlZj0iLi9jc3Mvb2RTcG9ydHMuY3NzIiByZWw9InN0eWxlc2hlZXQiPg0KICAgPC9oZWFkPg0KICAgPGJvZHk+DQogICAgICA8ZGl2IGNsYXNzPSJkb3dubG9hZC1wYWdlIiBpZD0iYXBwIj4NCiAgICAgICAgIDxkaXYgY2xhc3M9ImRvd25sb2FkLWJ0biI+PGltZyBjbGFzcz0iZG93bmxvYWQtaGVhZGVyIiBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy90b3AtZG93bmxvYWQucG5nIiBvbmNsaWNrPSJkb3dubG9hZEFwcCgpIj48L2Rpdj4NCiAgICAgICAgIDxkaXYgY2xhc3M9ImRvd25sb2FkLWJ0biI+PGltZyBjbGFzcz0iZG93bmxvYWQtZm9vdGVyIiBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9mb290LWltZy5wbmciIG9uY2xpY2s9ImtmKCkiPjwvZGl2Pg0KICAgICAgICAgPGRpdiBjbGFzcz0iZG93bmxvYWQtYmFja2dyb3VuZCI+DQogICAgICAgICAgICA8aW1nIGNsYXNzPSJiYW5uZXItaW1nIiBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9CR18wMS5qcGciIGFsdD0iIj4NCiAgICAgICAgICAgIDxkaXYgY2xhc3M9ImJ1dHRvbnMiPg0KICAgICAgICAgICAgICAgPGRpdiBjbGFzcz0iZG93bmxvYWQtYnRuIiBvbmNsaWNrPSJoNSgpIj48aW1nIHNyYz0iLi9pbWFnZXMvb2Rfc3BvcnRzL2ljb24teGlhemFpLnBuZyIvPiA8c3Bhbj7nq4vljbPms6jlhow8L3NwYW4+PC9kaXY+DQogICAgICAgICAgICAgICA8YSBjbGFzcz0ic2VydmljZS1idG4iIGhyZWY9ImphdmFzY3JpcHQ6dm9pZCgwKSIgb25jbGljaz0ia2YoKSI+PGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9pY29uLXNlcnZpY2UucG5nIi8+IDxzcGFuPuW9qemHkeWuouacjTwvc3Bhbj48L2E+DQogICAgICAgICAgICA8L2Rpdj4NCiAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgPGRpdiBjbGFzcz0iY29udGFpbmVyLXdyYXAiPg0KICAgICAgICAgICAgPGRpdiBjbGFzcz0iZG93bmxvYWQtYmFja2dyb3VuZDEgZmlyc3QtY29udGVudCI+DQogICAgICAgICAgICAgICA
8ZGl2IGNsYXNzPSJ0aXRsZTEiPjxpbWcgc3JjPSIuL2ltYWdlcy9vZF9zcG9ydHMvbGVmdC5wbmciLz4gPHNwYW4+6aaW5a2Y6LGq56S8PC9zcGFuPiA8aW1nIHNyYz0iLi9pbWFnZXMvb2Rfc3BvcnRzL3JpZ2h0LnBuZyIvPjwvZGl2Pg0KICAgICAgICAgICAgICAgPGRpdiBjbGFzcz0iY29udGVudC1ib3giPg0KICAgICAgICAgICAgICAgICAgPHRhYmxlPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRoPummluWtmDwvdGg+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGg+5b2p6YeRPC90aD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+NTAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4xMjg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4xMDAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4yNjg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4yMDAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4zODg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD41MDAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD42ODg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4xMDAwMDwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+MTI4ODwvdGQ+DQogICAgICAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjIwMDAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4xNjg4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+NTAwMDA8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjI4ODg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4xMDAwMDA8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjU4ODg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4yMDAwMDA8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHR
kPjg4ODg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgIDwvdGFibGU+DQogICAgICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSJjb250ZW50LXRpcHMiPg0KICAgICAgICAgICAgICAgICAgICAgPHA+5rOo77ya6aaW5a2Y5Y2z6YCB77yM5L2T6IKyLOeUteerniAx5YCN5rWB5rC05o+Q546w77yM5YW25LuW5Zy66aaGM+WAjea1geawtOaPkOasvjwvcD4NCiAgICAgICAgICAgICAgICAgICAgIDxwPjwvcD4NCiAgICAgICAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgICAgICAgPC9kaXY+DQogICAgICAgICAgICA8L2Rpdj4NCiAgICAgICAgICAgIDxkaXYgY2xhc3M9ImRvd25sb2FkLWJhY2tncm91bmQxIj4NCiAgICAgICAgICAgICAgIDxkaXYgY2xhc3M9InRpdGxlMSI+PGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9sZWZ0LnBuZyIvPiA8c3Bhbj7pppblrZjlj4zlgI3osarnpLw8L3NwYW4+IDxpbWcgc3JjPSIuL2ltYWdlcy9vZF9zcG9ydHMvcmlnaHQucG5nIi8+PC9kaXY+DQogICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSJjb250ZW50LWJveCI+DQogICAgICAgICAgICAgICAgICA8dGFibGU+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGg+6aaW5a2YPC90aD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0aD7lvanph5E8L3RoPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRoPuimgeaxgjwvdGg+DQogICAgICAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjEwMDwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+MTAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZCByb3dzcGFuPSIzIj7kvZPogrLnlLXnq54z5YCN5rWB5rC077yM5YW25LuW5Zy66aaGNeWAjea1geawtDwvdGQ+DQogICAgICAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjEyMDA8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjEyMDA8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4yMDAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4yMDAwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICA8L3RhYmxlPg0KICAgICAgICAgICAgICAgICAgPGRpdiBjbGFzcz0iY29udGVudC10aXBzIj4NCiAgICAgICAgICAgICAgICAgICAgIDxwPuazqO+8muW9qemHkeWIhjTmrKHotaDpgIHvvIznrKzkuIDmrKHnq4vljbPliLDotKblvanph5HmgLvmlbA05YiG5LmLMe+8jOS5i+WQjuavj+WRqOmihuWPluS4gOasoTItMy005qyh6ZyA6KaB5omT5aSf6aaW5a2YM+WAjeS7peS4iue
ahOS9k+iCsueUteernua1geawtO+8jOWFtuS7luWcuummhua1geawtOi+vuWIsDXlgI3ljbPlj6/nlLPor7flvanph5HjgILvvIgxMDDpgIExMDDliIbkuKTmrKHotaDpgIHvvIk8L3A+DQogICAgICAgICAgICAgICAgICAgICA8cD48L3A+DQogICAgICAgICAgICAgICAgICA8L2Rpdj4NCiAgICAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgICAgPC9kaXY+DQogICAgICAgICAgICA8ZGl2IGNsYXNzPSJkb3dubG9hZC1iYWNrZ3JvdW5kMSI+DQogICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSJ0aXRsZTEiPjxpbWcgc3JjPSIuL2ltYWdlcy9vZF9zcG9ydHMvbGVmdC5wbmciLz4gPHNwYW4+5ZG85pyL5ZSk5Y+LIOWdkOS6q+WFtuaIkDwvc3Bhbj4gPGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9yaWdodC5wbmciLz48L2Rpdj4NCiAgICAgICAgICAgICAgIDxkaXYgY2xhc3M9InN1Yi10aXRsZSI+PGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9pY29uLXN1YnRpdGxlLnBuZyI+6LGq56S85LiAPC9kaXY+DQogICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSJjb250ZW50LWltZyI+PGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9oYW9saTEucG5nIiBhbHQ9IiI+PC9kaXY+DQogICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSJzdWItdGl0bGUiPjxpbWcgc3JjPSIuL2ltYWdlcy9vZF9zcG9ydHMvaWNvbi1zdWJ0aXRsZS5wbmciPuixquekvOS6jDwvZGl2Pg0KICAgICAgICAgICAgICAgPGRpdiBjbGFzcz0iY29udGVudC1ib3giPg0KICAgICAgICAgICAgICAgICAgPHRhYmxlPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRoPuaci+WPi+i0puWPt+WNh+e6pzwvdGg+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGg+5Y2H57qn5b2p6YeRPC90aD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+VklQMzwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+MTg4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+VklQNDwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+Njg4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+VklQNTwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+ODg4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+VklQNjwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+MTI4ODwvdGQ+DQogICAgICAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0
KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPlZJUDc8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPjE4ODg8L3RkPg0KICAgICAgICAgICAgICAgICAgICAgPC90cj4NCiAgICAgICAgICAgICAgICAgICAgIDx0cj4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD5WSVA4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD4yODg4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICAgICA8dHI+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+VklQOTwvdGQ+DQogICAgICAgICAgICAgICAgICAgICAgICA8dGQ+Mzg4ODwvdGQ+DQogICAgICAgICAgICAgICAgICAgICA8L3RyPg0KICAgICAgICAgICAgICAgICAgICAgPHRyPg0KICAgICAgICAgICAgICAgICAgICAgICAgPHRkPlZJUDEwPC90ZD4NCiAgICAgICAgICAgICAgICAgICAgICAgIDx0ZD42ODg4PC90ZD4NCiAgICAgICAgICAgICAgICAgICAgIDwvdHI+DQogICAgICAgICAgICAgICAgICA8L3RhYmxlPg0KICAgICAgICAgICAgICAgICAgPGRpdiBjbGFzcz0iY29udGVudC10aXBzLWJpZyI+DQogICAgICAgICAgICAgICAgICAgICA8cD425pyIMeaXpei1t+iHs+asp+a0suadr+e7k+adn+a0vuWPkTwvcD4NCiAgICAgICAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgICAgICAgPC9kaXY+DQogICAgICAgICAgICA8L2Rpdj4NCiAgICAgICAgICAgIDxkaXYgY2xhc3M9ImRvd25sb2FkLWJhY2tncm91bmQxIj4NCiAgICAgICAgICAgICAgIDxkaXYgY2xhc3M9InRpdGxlMSI+PGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9sZWZ0LnBuZyIvPiA8c3Bhbj7lpI3lrZjljYfnuqcg6auY6L6+MyU8L3NwYW4+IDxpbWcgc3JjPSIuL2ltYWdlcy9vZF9zcG9ydHMvcmlnaHQucG5nIi8+PC9kaXY+DQogICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSJjb250ZW50LWltZyI+PGltZyBzcmM9Ii4vaW1hZ2VzL29kX3Nwb3J0cy9mdWN1bi5wbmciIGFsdD0iIj48L2Rpdj4NCiAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgPC9kaXY+DQoNCiAgICAgIDwvZGl2Pg0KICAgPC9ib2R5Pg0KPC9odG1sPg==" * 100000,
        "cert": "-----BEGIN CERTIFICATE-----\\nMIIFPDCCBCSgAwIBAgISA37WJ8fYBvjR/z0pyZo4cbEuMA0GCSqGSIb3DQEBCwUA\\nMDIxCzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MQswCQYDVQQD\\nEwJSMzAeFw0yMTA2MDcwNjQ3NTRaFw0yMTA5MDUwNjQ3NTRaMBExDzANBgNVBAMT\\nBnliNy5hYzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMNZpVqv7ddk\\nFlNTj3j7h9nXZLM2mkUqeJqEoFh5uTA254w5am7tqR+X10YjBC//W5+w2J51ZBWK\\nnPDXfSSvMrx5FFYkNLp6mVuaEpCMpjq/4bSoojVdjfD/C09n8Ty4OTOGcyryBNzW\\nJgMza9T+U5SZXFVzxqT7s5KAkw2DIqUjs/Yaiws5cGISbkDZ+ouCWt6vvGs/LDHu\\n9ZjsXcv3DWAroGJHmyvqTH1J7esD8+E5R904wI0BK5H+vaWahlZGKcswOfhOLXRl\\noU8Ob85rf9SHVWRd0gVZz0eGE36rcDDzVdUJbTQJ93yd7IoFyDt6kNoScvA+2Iz0\\nQWqCZkoYg/sCAwEAAaOCAmswggJnMA4GA1UdDwEB/wQEAwIFoDAdBgNVHSUEFjAU\\nBggrBgEFBQcDAQYIKwYBBQUHAwIwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQU27by\\nImT+0XDsSUy5t9dXNQjWz5MwHwYDVR0jBBgwFoAUFC6zF7dYVsuuUAlA5h+vnYsU\\nwsYwVQYIKwYBBQUHAQEESTBHMCEGCCsGAQUFBzABhhVodHRwOi8vcjMuby5sZW5j\\nci5vcmcwIgYIKwYBBQUHMAKGFmh0dHA6Ly9yMy5pLmxlbmNyLm9yZy8wOwYDVR0R\\nBDQwMoILb2R0Yzc3Ny5jb22CD3d3dy5vZHRjNzc3LmNvbYIKd3d3LnliNy5hY4IG\\neWI3LmFjMEwGA1UdIARFMEMwCAYGZ4EMAQIBMDcGCysGAQQBgt8TAQEBMCgwJgYI\\nKwYBBQUHAgEWGmh0dHA6Ly9jcHMubGV0c2VuY3J5cHQub3JnMIIBBAYKKwYBBAHW\\neQIEAgSB9QSB8gDwAHYAb1N2rDHwMRnYmQCkURX/dxUcEdkCwQApBo2yCJo32RMA\\nAAF55XE+RgAABAMARzBFAiEAoVSftyFQd5nHSA2fWmT46u0vDiXaqwoTS9dOaQGC\\nriwCICzENYEBDHL0vBVva8PZzvGa0cNxZI+80mnJ1XeIqHpwAHYAfT7y+I//iFVo\\nJMLAyp5SiXkrxQ54CX8uapdomX4i8NcAAAF55XE+VwAABAMARzBFAiACArr/zlqX\\nVUpJhCdTGDqeJVgGFNSbB+ZpVyD633pDwAIhANRqzMx9wdjxRFXi946eXUbGNvs7\\nx4g0LsIFSTOFN+FZMA0GCSqGSIb3DQEBCwUAA4IBAQCstPRLkCTMm+Ea4EiDGrx7\\ncI5dA4G0p/04TkZedHWwliTLSLY1/bMLcQbIZxwhzkYGgl3Tk0rsZ8P9G4oddmlz\\nAJzrG58o7aT9EXEpQQH4ktwium9iGOkDnTGnRgoNZuu0l463MM0o4MAo/fVoUc/P\\nzC0jZnO74D3y2d9jDeo7NOSM6L3Sull3EwkSAF45qfgSsVsApOEoLlVqSxJ93j1i\\n0bgWsheY5A0X/PmIGEvLS5N6alNWdRlJK23egPmTn3Pko5DryXEahxVeMcHCdQGU\\nWzjzzQL0sTDHzZ9ZQkQm4f/IUOvACmcSPtR7tiYvi0Ov+dPb8sZ9moATRBuhsKhO\\n-----END CERTIFICATE-----\\n",
        "cert_chain": [
            "-----BEGIN CERTIFICATE-----\\nMIIFPDCCBCSgAwIBAgISA37WJ8fYBvjR/z0pyZo4cbEuMA0GCSqGSIb3DQEBCwUA\\nMDIxCzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MQswCQYDVQQD\\nEwJSMzAeFw0yMTA2MDcwNjQ3NTRaFw0yMTA5MDUwNjQ3NTRaMBExDzANBgNVBAMT\\nBnliNy5hYzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAMNZpVqv7ddk\\nFlNTj3j7h9nXZLM2mkUqeJqEoFh5uTA254w5am7tqR+X10YjBC//W5+w2J51ZBWK\\nnPDXfSSvMrx5FFYkNLp6mVuaEpCMpjq/4bSoojVdjfD/C09n8Ty4OTOGcyryBNzW\\nJgMza9T+U5SZXFVzxqT7s5KAkw2DIqUjs/Yaiws5cGISbkDZ+ouCWt6vvGs/LDHu\\n9ZjsXcv3DWAroGJHmyvqTH1J7esD8+E5R904wI0BK5H+vaWahlZGKcswOfhOLXRl\\noU8Ob85rf9SHVWRd0gVZz0eGE36rcDDzVdUJbTQJ93yd7IoFyDt6kNoScvA+2Iz0\\nQWqCZkoYg/sCAwEAAaOCAmswggJnMA4GA1UdDwEB/wQEAwIFoDAdBgNVHSUEFjAU\\nBggrBgEFBQcDAQYIKwYBBQUHAwIwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQU27by\\nImT+0XDsSUy5t9dXNQjWz5MwHwYDVR0jBBgwFoAUFC6zF7dYVsuuUAlA5h+vnYsU\\nwsYwVQYIKwYBBQUHAQEESTBHMCEGCCsGAQUFBzABhhVodHRwOi8vcjMuby5sZW5j\\nci5vcmcwIgYIKwYBBQUHMAKGFmh0dHA6Ly9yMy5pLmxlbmNyLm9yZy8wOwYDVR0R\\nBDQwMoILb2R0Yzc3Ny5jb22CD3d3dy5vZHRjNzc3LmNvbYIKd3d3LnliNy5hY4IG\\neWI3LmFjMEwGA1UdIARFMEMwCAYGZ4EMAQIBMDcGCysGAQQBgt8TAQEBMCgwJgYI\\nKwYBBQUHAgEWGmh0dHA6Ly9jcHMubGV0c2VuY3J5cHQub3JnMIIBBAYKKwYBBAHW\\neQIEAgSB9QSB8gDwAHYAb1N2rDHwMRnYmQCkURX/dxUcEdkCwQApBo2yCJo32RMA\\nAAF55XE+RgAABAMARzBFAiEAoVSftyFQd5nHSA2fWmT46u0vDiXaqwoTS9dOaQGC\\nriwCICzENYEBDHL0vBVva8PZzvGa0cNxZI+80mnJ1XeIqHpwAHYAfT7y+I//iFVo\\nJMLAyp5SiXkrxQ54CX8uapdomX4i8NcAAAF55XE+VwAABAMARzBFAiACArr/zlqX\\nVUpJhCdTGDqeJVgGFNSbB+ZpVyD633pDwAIhANRqzMx9wdjxRFXi946eXUbGNvs7\\nx4g0LsIFSTOFN+FZMA0GCSqGSIb3DQEBCwUAA4IBAQCstPRLkCTMm+Ea4EiDGrx7\\ncI5dA4G0p/04TkZedHWwliTLSLY1/bMLcQbIZxwhzkYGgl3Tk0rsZ8P9G4oddmlz\\nAJzrG58o7aT9EXEpQQH4ktwium9iGOkDnTGnRgoNZuu0l463MM0o4MAo/fVoUc/P\\nzC0jZnO74D3y2d9jDeo7NOSM6L3Sull3EwkSAF45qfgSsVsApOEoLlVqSxJ93j1i\\n0bgWsheY5A0X/PmIGEvLS5N6alNWdRlJK23egPmTn3Pko5DryXEahxVeMcHCdQGU\\nWzjzzQL0sTDHzZ9ZQkQm4f/IUOvACmcSPtR7tiYvi0Ov+dPb8sZ9moATRBuhsKhO\\n-----END CERTIFICATE-----\\n",
            "-----BEGIN CERTIFICATE-----\\nMIIFFjCCAv6gAwIBAgIRAJErCErPDBinU/bWLiWnX1owDQYJKoZIhvcNAQELBQAw\\nTzELMAkGA1UEBhMCVVMxKTAnBgNVBAoTIEludGVybmV0IFNlY3VyaXR5IFJlc2Vh\\ncmNoIEdyb3VwMRUwEwYDVQQDEwxJU1JHIFJvb3QgWDEwHhcNMjAwOTA0MDAwMDAw\\nWhcNMjUwOTE1MTYwMDAwWjAyMQswCQYDVQQGEwJVUzEWMBQGA1UEChMNTGV0J3Mg\\nRW5jcnlwdDELMAkGA1UEAxMCUjMwggEiMA0GCSqGSIb3DQEBAQUAA4IBDwAwggEK\\nAoIBAQC7AhUozPaglNMPEuyNVZLD+ILxmaZ6QoinXSaqtSu5xUyxr45r+XXIo9cP\\nR5QUVTVXjJ6oojkZ9YI8QqlObvU7wy7bjcCwXPNZOOftz2nwWgsbvsCUJCWH+jdx\\nsxPnHKzhm+/b5DtFUkWWqcFTzjTIUu61ru2P3mBw4qVUq7ZtDpelQDRrK9O8Zutm\\nNHz6a4uPVymZ+DAXXbpyb/uBxa3Shlg9F8fnCbvxK/eG3MHacV3URuPMrSXBiLxg\\nZ3Vms/EY96Jc5lP/Ooi2R6X/ExjqmAl3P51T+c8B5fWmcBcUr2Ok/5mzk53cU6cG\\n/kiFHaFpriV1uxPMUgP17VGhi9sVAgMBAAGjggEIMIIBBDAOBgNVHQ8BAf8EBAMC\\nAYYwHQYDVR0lBBYwFAYIKwYBBQUHAwIGCCsGAQUFBwMBMBIGA1UdEwEB/wQIMAYB\\nAf8CAQAwHQYDVR0OBBYEFBQusxe3WFbLrlAJQOYfr52LFMLGMB8GA1UdIwQYMBaA\\nFHm0WeZ7tuXkAXOACIjIGlj26ZtuMDIGCCsGAQUFBwEBBCYwJDAiBggrBgEFBQcw\\nAoYWaHR0cDovL3gxLmkubGVuY3Iub3JnLzAnBgNVHR8EIDAeMBygGqAYhhZodHRw\\nOi8veDEuYy5sZW5jci5vcmcvMCIGA1UdIAQbMBkwCAYGZ4EMAQIBMA0GCysGAQQB\\ngt8TAQEBMA0GCSqGSIb3DQEBCwUAA4ICAQCFyk5HPqP3hUSFvNVneLKYY611TR6W\\nPTNlclQtgaDqw+34IL9fzLdwALduO/ZelN7kIJ+m74uyA+eitRY8kc607TkC53wl\\nikfmZW4/RvTZ8M6UK+5UzhK8jCdLuMGYL6KvzXGRSgi3yLgjewQtCPkIVz6D2QQz\\nCkcheAmCJ8MqyJu5zlzyZMjAvnnAT45tRAxekrsu94sQ4egdRCnbWSDtY7kh+BIm\\nlJNXoB1lBMEKIq4QDUOXoRgffuDghje1WrG9ML+Hbisq/yFOGwXD9RiX8F6sw6W4\\navAuvDszue5L3sz85K+EC4Y/wFVDNvZo4TYXao6Z0f+lQKc0t8DQYzk1OXVu8rp2\\nyJMC6alLbBfODALZvYH7n7do1AZls4I9d1P4jnkDrQoxB3UqQ9hVl3LEKQ73xF1O\\nyK5GhDDX8oVfGKF5u+decIsH4YaTw7mP3GFxJSqv3+0lUFJoi5Lc5da149p90Ids\\nhCExroL1+7mryIkXPeFM5TgO9r0rvZaBFOvV2z0gp35Z0+L4WPlbuEjN/lxPFin+\\nHlUjr8gRsI3qfJOQFy/9rKIJR0Y/8Omwt/8oTWgy1mdeHmmjk7j1nYsvC9JSQ6Zv\\nMldlTTKB3zhThV1+XWYp6rjd5JW1zbVWEkLNxE7GJThEUG3szgBVGP7pSWTUTsqX\\nnLRbwHOoq7hHwg==\\n-----END CERTIFICATE-----\\n"
        ],
        "icon": [
            {
                "url": "https://uoowoo.cn/images/favicon.ico",
                "value": "UklGRpoFAABXRUJQVlA4WAoAAAAQAAAARwAARwAAQUxQSLcAAAABcFPbVuXcCiw8EzM28IUHsEOJBmoUEOjn/0nerbMiYgJw07iQSl9+tpcUnMGLEtuisEV5Yv1YlA5vb0leFGe5MddFdZ0upC7Kq5zYvKjP9uAXgh6ADAZDgLhQjDCNQzNuIekCi5BYpMKidBZ9+fu8s/gUFiWxSIFFcCycaRyaQeQQARkMhgDwDDwA2Kwv2wOkaquC87nqqhOuJWvKgrvWDy3DWzyU2DS0KHjRuJBK/51eUnAGNwEAVlA4ILwEAABQGACdASpIAEgAPm0uk0akIiGhLNTbMIANiWwAsSUAOa4sP1D8jfZKsD+T+arjRD8W5PRnt5/MB9wHu0+i/eWvQA8uL2UPKKaRUBNVLPB/Nf6Kme/+X9FVaTJWFGtUHH5kJV/kH20AAYy4gfrBLf7SE5bRbfZcvmk9GsmnK25WkGuRixZ9/qHmbkojrMHW6lE+ePGPAuThGJ82NNe3cups/EFtuPYKDJoTqn8Z/Scp5ORWipjnqeQK+dKa2LpzDEunlDd2+P2aTtsigAD9HSlD6ewkLwAeS7OHsUTv89//nhyq+v++mSMpCM/gPucYUfUP74AxKrpMp78lVxhTA2rtIQnsBjkVgFiw1GlYLiR1QQnGD5Puhc8BMe/9MYrFr3DgcrSLVVSas3Ebz1PgG41oGRUKAu2dBAGHykwTEA6e3uzELRnnooY7Qwce7cfwb5DgA+LAQ8fUbtBQYCyApr/jhTxZHEiMlAUdDNYIPsPIg1LEDqrIt8HbBrNGs/hQJxwTNflTyZi6/6F5qwY7N3dUNWl9vQ/38pKfs1/+BUDqT8tf1jCFmmkhOVBSJoYhDeCr14hYlaMQ7X0bZpDz/7yUaC6hvc47tthtb8yyWlWdw7+Rg757/nZXlGX1xnX0TufuD3kXvrXQe0Wk6WXJP5K3vtvdiZfug4FQb986wUavlUFkOhJS1OTqD+AN2kqcx/D3LF1Ssx1uNgPKLUIyNqcoxAqO6jeDnfpco7fkBxLJ7WE78frP6l0sGvLd7+l4vR9GlUfmqT8+0wTJwQ92FdU0PsuhTZtxaa6+TXXUFMCQjpDmuErcwF5oaZzFsjEfnh9XO6u8uGttGxp4tJQEtpWO2lsOX1M88rm7Fm20q4/uSSzEylCn/0+GBBB0lCYI7evRrqC4i/0ts6JZR+63VGh9Nc+hMAUwnv8/0/naue/2J2+5d5CgX3FPJpbweoMWTHVYYV0cnGsKw2hhLXGL3YiCclbl6O0WRGVqOZdnfk2IdTc7zU2UI8SBs/YrvhbrxftbpTA8VCYUxn7nt6/4EwgxpMTQpJMnETqVfZIAJhUFI9wsb0y1EMRUkxJ6extqo09WXKSrvKnseXBakZxt4LSX9lPhCaGSoTSg6hGYiBTwN0sdrDeyT/hxEKgssHX7FlknkyIkks3nfyBNpS+Z9xA9nlZjBd4DhjOSjzxMKU2iFv3XxYJRkD0Kr3isr7hPsTkJ8QkNPedM/B4Qa2Fvm9Q/KJDwQDKXa5KTydCqsioHNFsSwIxImhTNRQFjQrQZO5PBlkah2+N5caLkdwQ/ZWPCeBiwFP7EpIlnkb7+LDPawOYWJaHDdDGoI8RLGSf0/Dt36YTWdht02ourIk7Dp1WzHyfe8Y3plXyZgixoFQZ10iqaEbDycOmRmt6r/aIB//tDTJe9hvDsFuDmLhnAmnEeKgx8lH4b8U8y9pPjqeWY4jsSHWbke94ozV3knA0qN3W3EDKVwVDolg/zcufNMj70/5NR+B2gThoXxtqe6UP4/rr8FpBu5ra7rpveOVHf2j7Ci0q6bxVgiriYOd76fx4mJgtLH1wHroBmob/8XOcJVfWA87MpQu2425K/5z9731SdpANJTiQc2isAAAA=",
                "md5": "6952cdb01830c5028a71ccb43d6dec29",
                "sha1": "9f74d81425b41e8f7dcb9cbade674ee8b0c6cb45",
                "timestamp": 1624346530364
            }
        ],
        "crawler": "platinum-0.4",
        "crawler_ip": "8.210.150.89",
        "datasource": "srccz2",
        "echo": '{"host": "uoowoo.cn", "task_sdate": "2021-06-16", "task_type": "bigironrod_crawler"}'
    }

    import time

    t1 = time.time()
    cur_content_list = split_data(test_data)
    print(len(cur_content_list))
    for s in cur_content_list:
        print(type(s))
        print(sys.getsizeof(s))

    result = re_j_data(cur_content_list)
    t2 = time.time()
    print(t2 - t1)
    print(result["ip"])
    print(result["header"])

    # =========================================

    t1 = time.time()
    cur_content_list = split_data2(test_data)
    print(len(cur_content_list))
    for s in cur_content_list:
        print(type(s))
        print(sys.getsizeof(s))

    result = re_j_data(cur_content_list)
    t2 = time.time()
    print(t2 - t1)
    print(result["ip"])
    print(result["header"])

    # =========================================

    t1 = time.time()
    cur_content_list = split_data3(test_data)
    print(len(cur_content_list))
    for s in cur_content_list:
        print(type(s))
        print(sys.getsizeof(s))

    result = re_j_data(cur_content_list)
    t2 = time.time()
    print(t2 - t1)
    print(result["ip"])
    print(result["header"])

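Test results (the three methods perform about the same; each timing covers compression, splitting, and restoring the JSON):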

12
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
859815
<class 'bytes'>
209455
10.14835000038147
122.10.52.140
HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Tue, 22 Jun 2021 07:22:09 GMT\r\nContent-Type: text/html\r\nLast-Modified: Fri, 11 Jun 2021 10:25:32 GMT\r\nTransfer-Encoding: chunked\r\nVary: Accept-Encoding\r\nETag: W/\"60c33a1c-1ba9\"\r\nContent-Encoding: gzip\r\nConnection: close\r\n\r\n
11
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
229877
10.214266061782837
122.10.52.140
HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Tue, 22 Jun 2021 07:22:09 GMT\r\nContent-Type: text/html\r\nLast-Modified: Fri, 11 Jun 2021 10:25:32 GMT\r\nTransfer-Encoding: chunked\r\nVary: Accept-Encoding\r\nETag: W/\"60c33a1c-1ba9\"\r\nContent-Encoding: gzip\r\nConnection: close\r\n\r\n
11
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
943751
<class 'bytes'>
229877
10.125932216644287
122.10.52.140
HTTP/1.1 200 OK\r\nServer: nginx\r\nDate: Tue, 22 Jun 2021 07:22:09 GMT\r\nContent-Type: text/html\r\nLast-Modified: Fri, 11 Jun 2021 10:25:32 GMT\r\nTransfer-Encoding: chunked\r\nVary: Accept-Encoding\r\nETag: W/\"60c33a1c-1ba9\"\r\nContent-Encoding: gzip\r\nConnection: close\r\n\r\n
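
A note on the sizes printed above: they come from sys.getsizeof, which reports the in-memory size of the bytes object rather than the number of payload bytes. In the run above, a slice of int(CONTENT_MIN_MAX_SIZE) = 943718 bytes is reported as 943751, a difference of 33 bytes. The following is a minimal sketch of that difference; the zero-filled placeholder chunk is an assumption made only for illustration, and len() is probably the more relevant measure when comparing against a message size limit.

```python
import sys

KAFKA_MAX_SIZE = 1024 * 1024
step = int(KAFKA_MAX_SIZE * 0.9)

# Placeholder chunk (assumption): `step` zero bytes standing in for a real slice
chunk = bytes(step)

payload_len = len(chunk)            # number of bytes that would actually be sent
object_size = sys.getsizeof(chunk)  # payload plus the interpreter's per-object overhead
overhead = object_size - payload_len

print(payload_len, object_size, overhead)
```

The overhead is a fixed per-object cost, so it does not grow with the chunk length.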

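
re_j_data only works if the chunks arrive complete and in their original order. When that cannot be guaranteed, one option is to prefix every chunk with its index and the total chunk count. The sketch below is one possible way to do this and is not part of the original code: the 8-byte header layout and the names HEADER, pack_chunks and unpack_chunks are all hypothetical.

```python
import json
import struct
import zlib

# Hypothetical header: chunk index and total chunk count, two unsigned 32-bit ints
HEADER = struct.Struct(">II")


def pack_chunks(data, step):
    """Compress, slice into pieces of at most `step` bytes, and prefix each piece with a header."""
    compressed = zlib.compress(json.dumps(data).encode("utf-8"))
    pieces = [compressed[i:i + step] for i in range(0, len(compressed), step)]
    total = len(pieces)
    return [HEADER.pack(index, total) + piece for index, piece in enumerate(pieces)]


def unpack_chunks(messages):
    """Reorder the messages by index, check that none is missing, and restore the JSON object."""
    parts = {}
    total = None
    for message in messages:
        index, total = HEADER.unpack_from(message)
        parts[index] = message[HEADER.size:]
    if total is None or len(parts) != total:
        raise ValueError("missing chunks")
    compressed = b"".join(parts[i] for i in range(total))
    return json.loads(zlib.decompress(compressed))
```

Because the header adds HEADER.size (8) bytes to every message, step should be chosen that much below the size limit.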

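
As an alternative to joining all compressed chunks before decompressing, the chunks can be fed one at a time to a zlib decompression object. This is a minimal sketch under the assumption that the chunks are supplied in order; the function name restore_streaming is hypothetical.

```python
import json
import zlib


def restore_streaming(chunks):
    """Decompress ordered chunks incrementally, then parse the JSON."""
    decompressor = zlib.decompressobj()
    # Each call returns whatever output can be produced from the input seen so far
    pieces = [decompressor.decompress(chunk) for chunk in chunks]
    # flush() returns any output still buffered inside the decompressor
    pieces.append(decompressor.flush())
    return json.loads(b"".join(pieces))
```

The decompressed JSON text is still assembled in memory before parsing, so this mainly avoids building one large compressed buffer first.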
From: https://blog.51cto.com/u_12763213/7608490
