I ran into an error while using XPath with lxml, so I'm writing it down here.
(The Python code itself is fine.)
from lxml import etree
import requests
from constants import headers


def run():
    # url = "http://www.gushiju.net/shici/guanyu/%E4%B8%89%E5%9B%BD%E6%BC%94%E4%B9%89"
    # page_text = requests.get(url=url, headers=headers).text
    tree = etree.parse("2.html")
    res = tree.xpath('/html/head/title')
    print(res)


if __name__ == '__main__':
    run()
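It turned out the problem was in the HTML document itself. etree.parse uses lxml's XML parser by default, so a tag that is never closed, such as the <meta> below, is a syntax error.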
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <h1>hello world</h1>
</body>
</html>
The fix that worked:
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
// Add a "/" at the end of every tag that has no closing tag (such as <meta>)
After that, it ran successfully!
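If editing the HTML is not an option (for example, when the page is downloaded rather than written by hand), another approach is to tell lxml to use its HTML parser, which tolerates unclosed tags like <meta>. The sketch below is only an illustration, not part of the original fix: the HTML_TEXT constant and the two helper function names are made up for the example, and it parses an in-memory string instead of 2.html so that it is self-contained.

```python
from lxml import etree

# Same markup as the failing file: <meta> is left unclosed,
# which is valid HTML but not well-formed XML.
HTML_TEXT = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <h1>hello world</h1>
</body>
</html>"""


def title_via_xml_parser(text):
    """Parse with lxml's default XML parser; return the error text if parsing fails."""
    try:
        root = etree.fromstring(text)
    except etree.XMLSyntaxError as exc:
        return f"XMLSyntaxError: {exc}"
    return root.xpath('/html/head/title')[0].text


def title_via_html_parser(text):
    """Parse with lxml's lenient HTML parser and return the <title> text."""
    root = etree.fromstring(text, parser=etree.HTMLParser())
    return root.xpath('/html/head/title')[0].text


if __name__ == '__main__':
    print(title_via_xml_parser(HTML_TEXT))   # reports the XML syntax error
    print(title_via_html_parser(HTML_TEXT))  # prints the title text
```

For a file on disk, the same idea should apply by passing the parser to etree.parse, i.e. etree.parse("2.html", etree.HTMLParser()).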
From: https://www.cnblogs.com/shaoyishi/p/16971974.html