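I tried using the requests library to scrape a few Bilibili videos.
I borrowed code and ideas from other bloggers and made some small changes. For now the script cannot download multi-part (paginated) videos, though a small modification would make that possible. The reference I worked from: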
"Scraping Bilibili videos with Python: download any video from just its URL" (Python爬取B站视频,只需一个B站视频地址,即可任意下载), Tencent Cloud Developer Community (tencent.com)
```python
import json
import os
import pprint
import re  # regular expressions
import subprocess

import requests

# Create the output directory if it does not exist yet
os.makedirs("./video", exist_ok=True)

ID = r"BV1Vx4y177ew"
fileName = "./video/test"
url = f'https://www.bilibili.com/video/{ID}/'
headers = {
    'referer': f'https://www.bilibili.com/video/{ID}?spm_id_from=333.337.search-card.all.click',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36'
}


def send_request(url):
    response = requests.get(url=url, headers=headers)
    return response


def get_video_data(html_data):
    """Parse the video data."""
    # Extract the video title
    # title = re.findall('<span class="tit">(.*?)</span>', html_data)
    # print(title)
    # Extract the JSON data embedded in the page (raw string so \. is a regex escape)
    json_data = re.findall(r'<script>window\.__playinfo__=(.*?)</script>', html_data)[0]
    # print(json_data)
    # json_data is still a string here, so parse it
    json_data = json.loads(json_data)
    pprint.pprint(json_data)
    # Extract the URL of the audio stream
    audio_url = json_data['data']['dash']['audio'][0]['backupUrl'][0]
    print('Parsed audio URL:', audio_url)
    # Extract the URL of the video (picture) stream
    video_url = json_data['data']['dash']['video'][0]['backupUrl'][0]
    print('Parsed video URL:', video_url)
    video_data = [audio_url, video_url]
    return video_data


def save_data(file_name, audio_url, video_url):
    # Request the data
    print('Requesting audio data')
    audio_data = send_request(audio_url).content
    print('Requesting video data')
    video_data = send_request(video_url).content
    print('Saving audio data')
    with open(file_name + '.mp3', mode='wb') as f:
        f.write(audio_data)
    print('Saving video data')
    with open(file_name + '.mp4', mode='wb') as f:
        f.write(video_data)


def merge_data(video_name):
    print('Merging started:', video_name)
    # ffmpeg -i video.mp4 -i audio.wav -c:v copy -c:a aac -strict experimental output.mp4
    # Pass the arguments as a list so that paths containing spaces are handled correctly
    command = ['ffmpeg', '-i', f'{video_name}.mp4', '-i', f'{video_name}.mp3',
               '-vcodec', 'copy', '-acodec', 'copy', f'{video_name}_out.mp4']
    subprocess.run(command)
    print('Merging finished:', video_name)


rep = send_request(url).text
video = get_video_data(rep)
save_data(fileName, *video)
merge_data(fileName)
print("success")
```
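The key step in `get_video_data` is a regex capture followed by `json.loads`. When the script fails on a page, it may help to try that step in isolation. The following is only a minimal offline sketch: `SAMPLE_HTML`, the `example.com` URLs, and the helper name `extract_stream_urls` are made up for illustration and merely mimic the structure the script relies on. A real page is far larger and its layout may differ or change.

```python
import json
import re

# Hypothetical page fragment that only mimics the structure the script expects.
SAMPLE_HTML = (
    '<html><script>window.__playinfo__='
    '{"data": {"dash": {'
    '"audio": [{"backupUrl": ["https://example.com/audio.m4s"]}], '
    '"video": [{"backupUrl": ["https://example.com/video.m4s"]}]}}}'
    '</script></html>'
)

# Same pattern as in the script, compiled once; re.S lets "." match newlines too.
PLAYINFO_RE = re.compile(r'<script>window\.__playinfo__=(.*?)</script>', re.S)


def extract_stream_urls(html):
    """Return (audio_url, video_url), or raise ValueError if no play info is found."""
    match = PLAYINFO_RE.search(html)
    if match is None:
        raise ValueError('window.__playinfo__ not found in page')
    dash = json.loads(match.group(1))['data']['dash']
    return dash['audio'][0]['backupUrl'][0], dash['video'][0]['backupUrl'][0]


audio_url, video_url = extract_stream_urls(SAMPLE_HTML)
print(audio_url)
print(video_url)
```

Raising a `ValueError` with a clear message, instead of indexing `[0]` on an empty `re.findall` result, makes it easier to tell "the page has no play info" apart from other failures.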
If the code is hard to follow, I recommend watching https://www.bilibili.com/video/BV1ha4y1H7sx/?spm_id_from=333.337.search-card.all.click
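On the multi-part limitation mentioned at the top: one possible small modification is to build one page URL per part and run the same download steps on each. The sketch below assumes that parts are selected with a 1-based `?p=<n>` query parameter on the video page, and that the number of parts is supplied by hand. Neither assumption comes from the original script, and `part_urls` is a hypothetical helper.

```python
def part_urls(video_id, part_count):
    """Build one page URL per part, assuming parts are selected with ?p=<n> (1-based)."""
    base = f'https://www.bilibili.com/video/{video_id}/'
    return [f'{base}?p={n}' for n in range(1, part_count + 1)]


# part_count is given manually here; this sketch does not detect it from the page.
for page_url in part_urls('BV1Vx4y177ew', 3):
    print(page_url)
```

Each URL could then be passed through `send_request`, `get_video_data`, `save_data`, and `merge_data` with a distinct `fileName` per part. Whether every part page embeds its play info the same way has not been verified here.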
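A possible error: the console prints a pile of garbled text mentioning ffmpeg. Likely causes:
1. ffmpeg is not installed, or its environment variable (PATH) is not configured correctly.
2. After the environment variable is configured, a restart is needed before PyCharm picks up ffmpeg.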
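A quick way to check both causes is to ask Python whether it can see ffmpeg on PATH before the merge step runs. This is a minimal sketch using the standard library's `shutil.which`. The helper name `find_ffmpeg` is made up for illustration.

```python
import shutil


def find_ffmpeg(executable='ffmpeg'):
    """Return the full path of the executable found on PATH, or None if it is missing."""
    return shutil.which(executable)


ffmpeg_path = find_ffmpeg()
if ffmpeg_path is None:
    print('ffmpeg not found on PATH; install it or fix PATH, then restart the IDE')
else:
    print('ffmpeg found at', ffmpeg_path)
```

If this prints a path in a fresh terminal but reports "not found" inside PyCharm, the IDE is probably still running with the old environment and needs a restart.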
From: https://www.cnblogs.com/xmds/p/17072317.html