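Here is one approach:
dump the font's code point to glyph name mapping, then fill in the real character for each glyph one by one.
Code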
import json
import re

from fontTools.ttLib import TTFont  # imported in the original, not used yet
from lxml import etree

# t.xml: XML dump of the font; build a map of code point -> glyph name
with open('t.xml', 'r', encoding='utf-8') as fp:
    x = fp.read().encode('utf-8')
xl = etree.HTML(x)
k = {}
for i in xl.xpath('//cmap/*/map'):
    k[int(i.xpath('./@code')[0], 16)] = i.xpath('./@name')[0]


def check_word(kw):
    # placeholder: glyph name -> real character lookup, still to be filled in
    pass


# 1.html: saved chapter page; the text sits in window.__INITIAL_STATE__
with open('1.html', 'r', encoding='utf-8') as fp:
    x = fp.read()
d_js = re.findall(r'window\.__INITIAL_STATE__=({.+});', x)[0]
content = json.loads(d_js)['reader']['chapterData']['content']
xl = etree.HTML(content)
for i in xl.xpath('//p/text()'):
    row = list(i)
    txt = ''
    for kw in row:
        # swap each obfuscated character for its glyph name; keep the rest
        if k.get(ord(kw)):
            txt += k.get(ord(kw))
        else:
            txt += kw
    print(txt, row)
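The "fill in one by one" step might look roughly like the sketch below. It assumes a hand-maintained table from glyph name to real character; the function name `decode`, the table `GLYPH_TO_CHAR`, and the glyph names and code points in it are hypothetical placeholders, not values taken from the real font. Glyphs that have no entry yet are collected so it is clear what is still left to fill in.

```python
# Hypothetical, hand-filled table: glyph name -> real character.
# These entries are placeholders, not values read from the actual font.
GLYPH_TO_CHAR = {
    "gid58344": "的",
    "gid58345": "一",
}


def decode(text, code_to_glyph, glyph_to_char):
    """Replace obfuscated characters via two lookups:
    code point -> glyph name -> real character.
    Returns the decoded text and the glyph names still missing from the table.
    """
    out = []
    missing = set()
    for ch in text:
        glyph = code_to_glyph.get(ord(ch))
        if glyph is None:
            out.append(ch)                    # ordinary character, keep as-is
        elif glyph in glyph_to_char:
            out.append(glyph_to_char[glyph])  # already filled in
        else:
            out.append(ch)                    # not filled in yet
            missing.add(glyph)
    return "".join(out), sorted(missing)


if __name__ == "__main__":
    # Made-up code points standing in for the `k` dict built from t.xml
    code_to_glyph = {0xE3E8: "gid58344", 0xE3E9: "gid58345", 0xE3EA: "gid58346"}
    sample = "这是" + chr(0xE3E8) + chr(0xE3E9) + chr(0xE3EA)
    decoded, todo = decode(sample, code_to_glyph, GLYPH_TO_CHAR)
    print(decoded)
    print("still to fill in:", todo)
```

Returning the missing glyph names alongside the text makes it easy to extend the table a few entries at a time and rerun.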
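I think the glyphs could be recognised with OCR, but I have no concrete idea yet of how to do it. I hope someone more experienced can point the way.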
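One possible way to try the OCR idea, offered only as an untested sketch: render each obfuscated code point with the encrypted font and pass the image to an OCR engine. It assumes Pillow and pytesseract are installed along with Tesseract's `chi_sim` language data, and that the font is available as a file Pillow can load; `build_table`, `make_ocr_recogniser` and the font path are hypothetical names. OCR on single characters is error-prone, so the results would still need checking by hand.

```python
def build_table(code_to_glyph, recognise):
    """Map each glyph name to the character returned by recognise(code)."""
    table = {}
    for code, glyph in code_to_glyph.items():
        guess = recognise(code).strip()
        if len(guess) == 1:       # keep only single-character results
            table[glyph] = guess
    return table


def make_ocr_recogniser(font_path, size=64):
    """Return a function that renders chr(code) with the font and OCRs it."""
    # Third-party imports are local so build_table works without them.
    from PIL import Image, ImageDraw, ImageFont
    import pytesseract

    font = ImageFont.truetype(font_path, size)

    def recognise(code):
        # White canvas with some margin around the glyph
        img = Image.new("L", (size * 2, size * 2), 255)
        ImageDraw.Draw(img).text(
            (size // 2, size // 2), chr(code), font=font, fill=0
        )
        # --psm 10 asks Tesseract to treat the image as a single character
        return pytesseract.image_to_string(
            img, lang="chi_sim", config="--psm 10"
        )

    return recognise
```

With the `k` dict from the code above, this could be used as `build_table(k, make_ocr_recogniser("font.ttf"))`, where `font.ttf` stands for wherever the downloaded font was saved. Passing the recogniser in as a parameter means a different OCR engine can be swapped in without touching `build_table`.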
Second method
The mobile version has no font encryption.
Tags: xpath, fp, encryption, xl, font, import, 番茄 (Fanqie), kw, txt
From: https://www.cnblogs.com/inkser/p/17612901.html