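Foreword
Ever since Sora came out, video generation has taken off. Tencent's AI team recently open-sourced its Hunyuan video generation model, HunyuanVideo, and I deployed the code on AutoDL right away. Come and try it out.
AutoDL algorithm community image: https://www.codewithgpu.com/i/Tencent/HunyuanVideo/HunyuanVideo-Configured
Or pull the image with Docker: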
docker pull registry.cn-zhangjiakou.aliyuncs.com/codewithgpu3/tencent-hunyuanvideo:0gGLhlnr6T
Respect! The original project is here:
https://github.com/Tencent/HunyuanVideo
Base environment
- CUDA: 11.8
- PyTorch: 2.1.2
- flash-attn: 2.5.9.post1
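To confirm that a machine matches the versions listed above, a quick check along these lines may help. It is only a sketch: `version_matches` and `report_version` are hypothetical helper names, the comparison is a simple prefix match, and it assumes `python` on the PATH is the interpreter the project runs with.

```shell
# Hypothetical helper: succeeds when the actual version string starts with
# the expected one (PyTorch often reports a suffix such as "2.1.2+cu118").
version_matches() {
    expected="$1"
    actual="$2"
    case "$actual" in
        "$expected"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print "<label>: <actual> (ok)" or "<label>: <actual> (expected <expected>)".
report_version() {
    label="$1"; expected="$2"; actual="$3"
    if version_matches "$expected" "$actual"; then
        echo "$label: $actual (ok)"
    else
        echo "$label: ${actual:-not found} (expected $expected)"
    fi
}

# Only query Python if it is available; a failed import yields an empty string.
if command -v python >/dev/null 2>&1; then
    torch_ver=$(python -c 'import torch; print(torch.__version__)' 2>/dev/null)
    cuda_ver=$(python -c 'import torch; print(torch.version.cuda)' 2>/dev/null)
    flash_ver=$(python -c 'import flash_attn; print(flash_attn.__version__)' 2>/dev/null)
    report_version "PyTorch" "2.1.2" "$torch_ver"
    report_version "CUDA" "11.8" "$cuda_ver"
    report_version "flash-attn" "2.5.9.post1" "$flash_ver"
fi
```

Inside the prebuilt image all three lines should end in "(ok)"; a mismatch is worth investigating before downloading the models.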
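Downloading the pretrained models
The pretrained models are too large to ship inside the image (about 65 GB on disk), so please run the commands below in order to download them from Hugging Face.
- Install huggingface_hub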
python -m pip install "huggingface_hub[cli]"
- Create a folder for the pretrained models on the AutoDL SSD data disk
cd /root/autodl-tmp/
mkdir ckpts
- Download the HunyuanVideo pretrained model
huggingface-cli download tencent/HunyuanVideo --local-dir ./ckpts
- Download the MLLM text encoder model
cd ckpts
huggingface-cli download xtuner/llava-llama-3-8b-v1_1-transformers --local-dir ./llava-llama-3-8b-v1_1-transformers
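- Preprocess the text encoder (saves GPU memory). Absolute paths are used here because the ckpts symlink inside /root/HunyuanVideo/ is only created in the last step
cd /root/HunyuanVideo/
python hyvideo/utils/preprocess_text_encoder_tokenizer_utils.py --input_dir /root/autodl-tmp/ckpts/llava-llama-3-8b-v1_1-transformers --output_dir /root/autodl-tmp/ckpts/text_encoder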
- Download the CLIP pretrained model
cd /root/autodl-tmp/ckpts
huggingface-cli download openai/clip-vit-large-patch14 --local-dir ./text_encoder_2
- Symlink /root/autodl-tmp/ckpts to /root/HunyuanVideo/ckpts
cd /root/
ln -s /root/autodl-tmp/ckpts/ HunyuanVideo/
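Before generating anything, it can be worth confirming that the steps above left the expected folders in place. The following is a minimal sketch, not part of the original project: `check_ckpts` is a hypothetical helper, and it only checks the three subfolders named in the commands above, not the contents of the main HunyuanVideo download.

```shell
# Hypothetical helper: check that each folder the steps above should have
# produced exists under the given ckpts root.
# Prints one line per folder and returns non-zero if any is missing.
check_ckpts() {
    root="$1"
    missing=0
    for sub in llava-llama-3-8b-v1_1-transformers text_encoder text_encoder_2; do
        if [ -d "$root/$sub" ]; then
            echo "ok       $root/$sub"
        else
            echo "missing  $root/$sub"
            missing=1
        fi
    done
    return "$missing"
}

# Checking through the symlink also confirms that the link itself resolves.
if [ -d /root/HunyuanVideo ]; then
    check_ckpts /root/HunyuanVideo/ckpts
fi
```

If any line reads "missing", rerun the corresponding download or preprocessing step.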
Usage :)
- First change into the HunyuanVideo directory
cd /root/HunyuanVideo/
- Run one of the following commands to generate a video:
- 720 x 1280 resolution
python3 sample_video.py \
--video-size 720 1280 \
--video-length 129 \
--infer-steps 50 \
--prompt "A cat walks on the grass, realistic style." \
--flow-reverse \
--use-cpu-offload \
--save-path ./results
- 544 x 960 resolution
python3 sample_video.py \
--video-size 544 960 \
--video-length 129 \
--infer-steps 30 \
--prompt "A sexy beauty lying on the beach, realistic style." \
--flow-reverse \
--use-cpu-offload \
--save-path ./results
- View the generated videos in the results folder (.mp4 format)
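The two commands above differ only in resolution and step count, so they can be folded into one small wrapper. This is a sketch under assumptions: `generate_video`, the preset names "720" and "544", and the `DRY_RUN` variable are all hypothetical; every flag and value is copied from the commands above.

```shell
# Hypothetical wrapper around the two commands above.
# $1: preset ("720" or "544"), $2: prompt text.
# Prints the command first; set DRY_RUN=1 to stop after printing it.
generate_video() {
    preset="$1"
    prompt="$2"
    case "$preset" in
        720) height=720; width=1280; steps=50 ;;
        544) height=544; width=960;  steps=30 ;;
        *) echo "unknown preset: $preset (use 720 or 544)" >&2; return 2 ;;
    esac
    echo "python3 sample_video.py --video-size $height $width --video-length 129 --infer-steps $steps --prompt \"$prompt\" --flow-reverse --use-cpu-offload --save-path ./results"
    [ "${DRY_RUN:-0}" = "1" ] && return 0
    python3 sample_video.py \
        --video-size "$height" "$width" \
        --video-length 129 \
        --infer-steps "$steps" \
        --prompt "$prompt" \
        --flow-reverse \
        --use-cpu-offload \
        --save-path ./results
}

# Example, run from /root/HunyuanVideo/:
#   generate_video 544 "A cat walks on the grass, realistic style."
```

Setting `DRY_RUN=1` first is a cheap way to review the exact command before committing GPU time to it.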
My own demos
- [544 x 960] Prompt: A sexy beauty lying on the beach, realistic style.
(Video pending review.)
- [720 x 1280] Prompt: A sexy beauty lying on the beach, realistic style.
(Video pending review.)
- [720 x 1280] Prompt: On the streets of modern cities, the camera first follows a motorcycle speeding through skyscrapers. The wheels left deep marks on the wet ground, and the surrounding buildings swayed in the constantly moving wind. The camera quickly pans to the right, capturing a police car following closely behind with flashing lights, reflecting the neon lights and traffic on the street in brilliant colors. The bustling city scene is blurred into a flowing light and shadow in the background, and the movements of motorcycles and police cars appear unusually rapid and tense.
(Video: AI text-to-video 3)
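Closing notes
- Because this is a video generation task, it needs a lot of GPU memory. Of the two resolutions above, the larger needs roughly 76 GB and the smaller roughly 43 GB, so rent your GPU accordingly.
- There is an art to writing prompts; I will share more on that later.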
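The memory figures above can be turned into a quick check of which resolution a rented card is likely to handle. This is a rough sketch: `pick_preset` is a hypothetical helper, the thresholds are simply the approximate 76 GB and 43 GB figures quoted above treated as GiB, and only the first GPU reported by `nvidia-smi` is considered.

```shell
# Hypothetical helper: map total GPU memory (whole GiB) to a resolution.
# Thresholds are the approximate figures quoted above, not hard limits.
pick_preset() {
    mem_gib="$1"
    if [ "$mem_gib" -ge 76 ]; then
        echo "720"
    elif [ "$mem_gib" -ge 43 ]; then
        echo "544"
    else
        echo "none"
    fi
}

# nvidia-smi reports memory.total in MiB; take the first GPU only.
if command -v nvidia-smi >/dev/null 2>&1; then
    total_mib=$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits 2>/dev/null | head -n 1)
    case "$total_mib" in
        ''|*[!0-9]*) echo "could not read GPU memory" ;;
        *) echo "suggested preset: $(pick_preset $((total_mib / 1024)))" ;;
    esac
fi
```

A result of "none" means the card is below both figures; actual usage will vary, so treat the output as a hint rather than a guarantee.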