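I. Parsing JSON data
JsonPath is used to parse deeply nested JSON data. It is an information-extraction library: a tool for pulling specific pieces of information out of a JSON document. Implementations are available in several languages, including JavaScript, Python, PHP, and Java.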
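To make the idea of path-based extraction concrete, here is a minimal sketch in plain Python. The helper `get_by_path` and its dotted path format are hypothetical and exist only for illustration; they are not part of the JsonPath library, which supports a much richer syntax.

```python
from typing import Any


def get_by_path(data: Any, path: str) -> Any:
    """Walk a dotted path such as 'store.book.0.title' through nested dicts and lists."""
    current = data
    for part in path.split("."):
        if isinstance(current, list):
            current = current[int(part)]  # list step: treat the part as an index
        else:
            current = current[part]       # dict step: treat the part as a key
    return current


# A small nested document (values borrowed from the bookstore sample in this article)
doc = {
    "store": {
        "book": [{"title": "Moby Dick", "price": 8.99}],
        "bicycle": {"color": "red"},
    }
}

print(doc["store"]["book"][0]["title"])        # manual indexing, one level at a time
print(get_by_path(doc, "store.book.0.title"))  # the same lookup expressed as a path
```

Both lines fetch the same value. The point of a path language is that the second form stays readable as the nesting gets deeper.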
Sample data for the syntax examples:
{
    "store": {
        "book": [
            {
                "category": "reference",
                "author": "Nigel Rees",
                "title": "Sayings of the Century",
                "price": 8.95
            },
            {
                "category": "fiction",
                "author": "Evelyn Waugh",
                "title": "Sword of Honour",
                "price": 12.99
            },
            {
                "category": "fiction",
                "author": "Herman Melville",
                "title": "Moby Dick",
                "isbn": "0-553-21311-3",
                "price": 8.99
            },
            {
                "category": "fiction",
                "author": "J. R. R. Tolkien",
                "title": "The Lord of the Rings",
                "isbn": "0-395-19395-8",
                "price": 22.99
            }
        ],
        "bicycle": {
            "color": "red",
            "price": 19.95
        }
    }
}
Practice
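import json
import jsonpath

# json_data is assumed to hold the sample document above as a JSON string.
# Convert it into Python objects (dicts and lists) before querying.
python_data = json.loads(json_data)

# jsonpath returns the matches as a list
print(jsonpath.jsonpath(python_data, '$.store.book[*].author'))  # authors of all books in the store
print(jsonpath.jsonpath(python_data, '$..author'))               # every author, at any depth
print(jsonpath.jsonpath(python_data, '$.store.*'))               # everything directly under store
print(jsonpath.jsonpath(python_data, '$.store..price'))          # every price under store
print(jsonpath.jsonpath(python_data, '$..book[(@.length-1)]'))   # the last book
print(jsonpath.jsonpath(python_data, '$..book[:2]'))             # the first two books
print(jsonpath.jsonpath(python_data, '$..book[?(@.isbn)]'))      # books that have an isbn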
If you are curious, run these and print the results yourself; I won't go into more detail here.
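For readers who want to see what these expressions select without installing anything, the following is a rough plain-Python sketch of the same queries. The helper `find_all` is a hypothetical stand-in for the `..` recursive-descent operator, not the library's implementation, and the document is a trimmed copy of the sample above (only the fields the queries touch).

```python
from typing import Any, Iterator


def find_all(node: Any, key: str) -> Iterator[Any]:
    """Yield every value stored under `key`, at any depth, in document order."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_all(v, key)  # keep descending into the value
    elif isinstance(node, list):
        for item in node:
            yield from find_all(item, key)


store = {
    "store": {
        "book": [
            {"author": "Nigel Rees", "price": 8.95},
            {"author": "Evelyn Waugh", "price": 12.99},
            {"author": "Herman Melville", "isbn": "0-553-21311-3", "price": 8.99},
            {"author": "J. R. R. Tolkien", "isbn": "0-395-19395-8", "price": 22.99},
        ],
        "bicycle": {"color": "red", "price": 19.95},
    }
}

books = store["store"]["book"]
authors = list(find_all(store, "author"))         # roughly $..author
prices = list(find_all(store["store"], "price"))  # roughly $.store..price
last_book = books[-1]                             # roughly $..book[(@.length-1)]
first_two = books[:2]                             # roughly $..book[:2]
with_isbn = [b for b in books if "isbn" in b]     # roughly $..book[?(@.isbn)]

print(authors)
print(prices)
print(last_book, first_two, with_isbn, sep="\n")
```

The ordering of results from a real JSONPath implementation may differ from this sketch; the set of matched values is what the comparison is meant to show.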
II. Worked examples
1. Fetching and parsing the data
import requests
import json
import jsonpath

if __name__ == '__main__':
    url_ = 'https://www.lagou.com/lbs/getAllCitySearchLabels.json'
    headers_ = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36 Edg/122.0.0.0'
    }
    response_ = requests.get(url_, headers=headers_)
    json_data = response_.text           # the response body as a str
    print(type(json_data))
    python_data = json.loads(json_data)  # str -> Python dicts and lists
    # name of the first entry under every key "A"
    res = jsonpath.jsonpath(python_data, '$..A[0].name')
    print(res)
2. Simplifying the code
Prerequisite: the response body must be JSON.
import requests
import jsonpath

if __name__ == '__main__':
    url_ = 'https://www.lagou.com/lbs/getAllCitySearchLabels.json'
    headers_ = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36 Edg/122.0.0.0'
    }
    response_ = requests.get(url_, headers=headers_)
    # json() here is a method of the response object, not the imported json module;
    # it parses the body and returns Python objects directly.
    python_data = response_.json()
    res = jsonpath.jsonpath(python_data, '$..A[0].name')
    print(res)
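The prerequisite above matters in practice: if the site returns an HTML error or anti-scraping page instead of JSON, parsing fails with an exception. The sketch below shows one defensive pattern using only the standard `json` module. The helper name `parse_json_or_none` is hypothetical, and the same `try`/`except ValueError` shape should also work around `response_.json()`, on the assumption that it raises a `ValueError` subclass for an invalid body.

```python
import json
from typing import Any, Optional


def parse_json_or_none(text: str) -> Optional[Any]:
    """Return the parsed document, or None when `text` is not valid JSON.

    Note: a body that is literally `null` also parses to None, so this
    sketch only suits endpoints that return an object or an array.
    """
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


good = parse_json_or_none('{"A": [{"name": "example"}]}')
bad = parse_json_or_none("<html>Access denied</html>")

print(good)  # parsed dict
print(bad)   # None, so the caller can skip the jsonpath step
```

Checking for `None` before calling `jsonpath.jsonpath` keeps a blocked request from turning into a confusing traceback further down.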
3. Parsing Douban JSON data
import requests
import jsonpath

if __name__ == '__main__':
    url_ = 'https://movie.douban.com/j/chart/top_list?type=11&interval_id=100%3A90&action=&start=0&limit=20'
    headers_ = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36 Edg/122.0.0.0'
    }
    response_ = requests.get(url_, headers=headers_)
    py_data = response_.json()
    # Extract the movie titles and scores
    title_list = jsonpath.jsonpath(py_data, '$..title')
    print('title', title_list)
    score_list = jsonpath.jsonpath(py_data, '$..score')
    print('score', score_list)
    # Pair them up into a dict of title -> score
    movie_dict = dict(zip(title_list, score_list))
    print(movie_dict)
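One caveat with the approach above: `$..title` and `$..score` are collected independently, so if any entry lacks one of the two fields the lists fall out of step and titles get paired with the wrong scores. A minimal sketch of a safer pairing follows, assuming the response is a list of objects that each carry `title` and `score` keys. The function name `build_movie_dict` and the sample entries are made up for illustration and are not real Douban data.

```python
from typing import Any, Dict, List


def build_movie_dict(movies: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Map title -> score, reading both fields from the same entry."""
    result: Dict[str, Any] = {}
    for movie in movies:
        title = movie.get("title")
        score = movie.get("score")
        if title is None or score is None:
            continue  # skip incomplete entries instead of misaligning the rest
        result[title] = score
    return result


# Hypothetical sample shaped like the assumed response; the second entry has no score
sample = [
    {"title": "Movie A", "score": "9.0"},
    {"title": "Movie B"},
    {"title": "Movie C", "score": "8.5"},
]

print(build_movie_dict(sample))
```

With two separate lists, "Movie B" would have been paired with the score of "Movie C"; reading both fields from one entry avoids that.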
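Summary
The syntax for extracting data from JSON with JsonPath is fairly involved, and it takes some time and practice to learn.
The value of a life should be measured not by its length but by its depth.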