Background papers
A Text-to-Speech Transformer in TensorFlow 2
- Neural Speech Synthesis with Transformer Network
- FastSpeech: Fast, Robust and Controllable Text to Speech
- FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
- FastPitch: Parallel Text-to-speech with Pitch Prediction
Model repository and documentation
https://github.com/as-ideas/TransformerTTS
Installation
1. Python 3.6 environment
conda create -n TTS36 python==3.6
conda activate TTS36
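Before installing anything, it can help to confirm that the activated environment really runs Python 3.6. The following is a minimal sketch, not part of TransformerTTS; the helper name `is_expected_python` is hypothetical.

```python
import sys


def is_expected_python(version_info, expected=(3, 6)):
    """True when the major.minor part of `version_info` equals `expected`."""
    return tuple(version_info[:2]) == tuple(expected)


if __name__ == "__main__":
    ok = is_expected_python(sys.version_info)
    status = "OK" if ok else "unexpected for the TTS36 environment"
    print("Python", sys.version.split()[0], status)
```

Run it with the `python` from the activated environment; a mismatch usually means the environment was not activated in the current shell.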
2. GitHub repository
git clone https://github.com/as-ideas/TransformerTTS.git
3. pip packages
cd TransformerTTS
pip install -r requirements.txt
Running the prediction script at this point fails with the following error:
(TTS36) user@ubuntu:~/model/model3/TransformerTTS$ python predict_tts.py -t "Please, say something."
2023-06-22 23:18:30.259622: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-06-22 23:18:30.259658: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "predict_tts.py", line 7, in <module>
    from data.audio import Audio
  File "/home/user/model/model3/TransformerTTS/data/audio.py", line 6, in <module>
    import librosa.display
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/librosa/display.py", line 23, in <module>
    from matplotlib.cm import get_cmap
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/matplotlib/__init__.py", line 139, in <module>
    from . import cbook, rcsetup
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/matplotlib/rcsetup.py", line 27, in <module>
    from matplotlib.fontconfig_pattern import parse_fontconfig_pattern
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/matplotlib/fontconfig_pattern.py", line 18, in <module>
    from pyparsing import (Literal, ZeroOrMore, Optional, Regex, StringEnd,
  File "/home/user/anaconda3/envs/TTS36/lib/python3.6/site-packages/pyparsing/__init__.py", line 130, in <module>
    __version__ = __version_info__.__version__
AttributeError: 'version_info' object has no attribute '__version__'
The traceback ends inside pyparsing, so change that package's version (pin pyparsing to 2.4.7):
pip install pyparsing==2.4.7
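To confirm the pin took effect, you can ask Python which version is installed. This is a rough sketch rather than anything shipped with TransformerTTS: the names `installed_version` and `matches_pin` are hypothetical, and the fallback import assumes the `importlib_metadata` backport is available, since Python 3.6 has no `importlib.metadata` in its standard library.

```python
def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        import importlib_metadata as metadata  # backport for Python 3.6/3.7
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, expected):
    """True when `package` is installed at exactly the `expected` version."""
    return installed_version(package) == expected


if __name__ == "__main__":
    print("pyparsing:", installed_version("pyparsing"))
    print("pinned to 2.4.7:", matches_pin("pyparsing", "2.4.7"))
```

If the second line prints `False`, pip probably installed into a different environment than the one running the script.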
For the record, these are the installed package versions at this point:
(TTS36) user@ubuntu:~/model/model3/TransformerTTS$ pip list
Package Version
----------------------- ---------
absl-py 0.15.0
astunparse 1.6.3
attrs 22.2.0
audioread 3.0.0
cached-property 1.5.2
cachetools 4.2.4
certifi 2023.5.7
cffi 1.15.1
charset-normalizer 2.0.12
clang 5.0
clldutils 3.12.0
colorlog 6.7.0
csvw 2.0.0
cycler 0.11.0
Cython 0.29.35
dataclasses 0.8
decorator 5.1.1
dill 0.3.4
flatbuffers 1.12
gast 0.4.0
google-auth 1.35.0
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
grpcio 1.48.2
h5py 3.1.0
idna 3.4
importlib-metadata 4.8.3
isodate 0.6.1
joblib 1.1.1
keras 2.6.0
Keras-Preprocessing 1.1.2
kiwisolver 1.3.1
librosa 0.7.1
llvmlite 0.31.0
Markdown 3.3.7
matplotlib 3.2.2
multiprocess 0.70.12.2
numba 0.48.0
numpy 1.19.5
oauthlib 3.2.2
opt-einsum 3.3.0
p-tqdm 1.3.3
pathos 0.2.8
phonemizer 2.2.2
pip 21.3.1
pox 0.3.0
ppft 1.6.6.4
protobuf 3.19.6
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycparser 2.21
pyparsing 2.4.7
python-dateutil 2.8.2
pyworld 0.3.3
regex 2023.6.3
requests 2.27.1
requests-oauthlib 1.3.1
resampy 0.3.1
rfc3986 1.5.0
rsa 4.9
ruamel.yaml 0.17.32
ruamel.yaml.clib 0.2.7
scikit-learn 0.24.2
scipy 1.5.4
segments 2.2.1
setuptools 59.6.0
six 1.15.0
soundfile 0.12.1
tabulate 0.8.10
tensorboard 2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.6.2
tensorflow-estimator 2.6.0
termcolor 1.1.0
threadpoolctl 3.1.0
tqdm 4.40.1
typing-extensions 3.7.4.3
uritemplate 4.1.1
urllib3 1.26.16
webrtcvad 2.0.10
Werkzeug 2.0.3
wheel 0.37.1
wrapt 1.12.1
zipp 3.6.0
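A list like the one above is most useful when it can be compared against another machine. The sketch below parses the two-column text printed by `pip list` and reports packages whose versions differ. The function names `parse_pip_list` and `diff_versions` are hypothetical, and the `3.0.9` value in the demo is only an illustrative mismatch, not a version taken from this setup.

```python
def parse_pip_list(text):
    """Parse the two-column output of `pip list` into {name: version}."""
    versions = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, shell prompts, the header row and the dashed rule.
        if len(parts) != 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        name, version = parts
        versions[name.lower()] = version
    return versions


def diff_versions(reference, current):
    """Return {name: (reference_version, current_version)} for mismatches."""
    return {
        name: (ref, current.get(name))
        for name, ref in reference.items()
        if current.get(name) != ref
    }


if __name__ == "__main__":
    sample = """Package Version
----------------------- ---------
librosa 0.7.1
pyparsing 2.4.7
tensorflow 2.6.2
"""
    reference = parse_pip_list(sample)
    # Illustrative "other machine": tensorflow missing, pyparsing differs.
    other = {"librosa": "0.7.1", "pyparsing": "3.0.9"}
    print(diff_versions(reference, other))
```

Package names are lower-cased so that `Keras-Preprocessing` and `keras-preprocessing` compare equal; a missing package shows up with `None` as its current version.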
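4. Pretrained model
Download the model archive.
Extract it.
Run the prediction script, passing the extracted model directory with -p: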
(TTS36) user@ubuntu:~/model/model3/TransformerTTS$ python predict_tts.py -t "Please, say something." -p /home/user/model/model3/TransformerTTS/model/mobel/bdf06b9_ljspeech_step_90000/bdf06b9_ljspeech_step_90000
2023-06-23 03:48:08.648046: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2023-06-23 03:48:08.648086: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Loading model from /home/user/model/model3/TransformerTTS/model/mobel/bdf06b9_ljspeech_step_90000/bdf06b9_ljspeech_step_90000
2023-06-23 03:48:18.661162: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2023-06-23 03:48:18.661202: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2023-06-23 03:48:18.661224: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (ubuntu): /proc/driver/nvidia/version does not exist
2023-06-23 03:48:18.687299: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
WARNING: git_hash mismatch: bdf06b9(config) vs 3638055(local).
Output wav under outputs/custom_text
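The last log line says the audio was written under `outputs/custom_text`. To check what was actually produced, a short sketch like the following lists the WAV files there and prints their durations. It assumes the files are uncompressed PCM WAV, which is what the standard-library `wave` module can read; the helper names `list_wavs` and `wav_duration_seconds` are hypothetical.

```python
import wave
from pathlib import Path


def list_wavs(output_dir):
    """Return the .wav files found anywhere under `output_dir`, sorted by path."""
    return sorted(Path(output_dir).rglob("*.wav"))


def wav_duration_seconds(path):
    """Duration of a PCM WAV file, computed as frame count / sample rate."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    for path in list_wavs("outputs/custom_text"):
        try:
            print(path, f"{wav_duration_seconds(path):.2f} s")
        except wave.Error:
            # The stdlib reader only handles uncompressed PCM files.
            print(path, "(not readable by the wave module)")
```

Run it from the TransformerTTS directory so that the relative path resolves; an empty result means no WAV file was found under that folder.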
From: https://www.cnblogs.com/WordDealer/p/17499503.html