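Preface
Kaldi is one of the most popular open-source speech recognition toolkits, and it uses WFSTs (weighted finite-state transducers) to implement its decoding algorithms. Kaldi's core code is written in C++, with bash and Python scripts layered on top as tools. The quality of a real-time recognition system depends on the performance of the speech recognizer, which consists of feature extraction, an acoustic model, a language model, a decoder, and other components. The Kaldi toolkit bundles almost every tool needed to build a speech recognizer.
Step 1: Download the source code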
git clone https://github.com/kaldi-asr/kaldi.git
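Step 2: Preprocess the data and prepare the input files
1. wav.scp
id  storage location
001 /tmp/dataset/wav_file1.wav
2. text (that's right, the file has no extension)
id  transcript
001 我爱我的祖国
3. utt2spk
id  speaker
001 speaker1
4. spk2utt (utterance ids are separated by spaces, not commas)
speaker  ids
speaker1 001 002 003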
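Kaldi ships a tool (utils/utt2spk_to_spk2utt.pl) that generates spk2utt (file 4) automatically once you have utt2spk (file 3).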
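To illustrate what that conversion does, here is a minimal Python sketch. It is not Kaldi's script; the function name is made up for this example and it only mirrors the idea of grouping utterance ids by speaker.

```python
from collections import defaultdict


def utt2spk_to_spk2utt(utt2spk_lines):
    """Group utterance ids by speaker (hypothetical helper, not Kaldi's script)."""
    spk2utt = defaultdict(list)
    for line in utt2spk_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        utt, spk = parts
        spk2utt[spk].append(utt)
    # One line per speaker: the speaker id followed by space-separated utterance ids.
    return [spk + ' ' + ' '.join(sorted(utts))
            for spk, utts in sorted(spk2utt.items())]


print(utt2spk_to_spk2utt(['001 speaker1', '002 speaker1', '003 speaker2']))
```

In a real project the Kaldi script is the safer choice, since it is what the rest of the recipe scripts expect.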
Reference preprocessing code:
import argparse
import os


def prepare_inf(txt_path, wav_path, output_path):
    '''
    Generate wav.scp (id, wav_path), text (id, transcript),
    utt2spk (id, speaker) and spk2utt (speaker, ids).
    The id is the file name without its extension.
    :param txt_path: directory containing the transcript (.txt) files
    :param wav_path: directory containing the wav files
    :param output_path: output directory for wav.scp, text, utt2spk, spk2utt
    :return: None
    '''
    wav_scp = []
    text = []
    utt2spk = []
    spk2utt = {}
    # Iterate in sorted order so the output is stable. Kaldi expects these
    # files sorted by id; utils/fix_data_dir.sh can re-sort them if needed.
    for file in sorted(os.listdir(wav_path)):
        # splitext also copes with file names that contain several dots
        file_name, ext = os.path.splitext(file)
        if ext != '.wav':
            continue
        wav_file = os.path.join(wav_path, file)
        txt_file = os.path.join(txt_path, file_name + '.txt')
        # wav.scp
        wav_scp.append(file_name + ' ' + wav_file + '\n')
        # text (only written when a transcript file exists)
        if os.path.exists(txt_file):
            with open(txt_file, 'r', encoding='utf-8') as reader:
                txt = reader.read().replace('\n', ' ').strip()
            text.append(file_name + ' ' + txt + '\n')
        # The first 8 characters of the file name are used as the speaker id
        spk = file_name[:8]
        # utt2spk
        utt2spk.append(file_name + ' ' + spk + '\n')
        # spk2utt
        spk2utt.setdefault(spk, []).append(file_name)
    with open(os.path.join(output_path, 'wav.scp'), 'w', encoding='utf-8') as writer:
        writer.writelines(wav_scp)
    with open(os.path.join(output_path, 'text'), 'w', encoding='utf-8') as writer:
        writer.writelines(text)
    with open(os.path.join(output_path, 'utt2spk'), 'w', encoding='utf-8') as writer:
        writer.writelines(utt2spk)
    with open(os.path.join(output_path, 'spk2utt'), 'w', encoding='utf-8') as writer:
        for spk in sorted(spk2utt):
            writer.write(spk + ' ' + ' '.join(spk2utt[spk]) + '\n')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--txt_path', type=str, default='/home/nfs/datasets/mediaspeech-es')
    parser.add_argument('--wav_path', type=str, default='/home/nfs/datasets/mediaspeech-es')
    parser.add_argument('--output_path', type=str, default='/home/wangyuke_i/asr_test/mykaldi/inf')
    args = parser.parse_args()
    prepare_inf(args.txt_path, args.wav_path, args.output_path)
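Before handing the generated files to Kaldi, it can help to check that they agree with each other. The sketch below is only an illustration of the kind of checks involved (sorted ids, no duplicates, matching id sets); the helper names are hypothetical, and Kaldi's own utils/validate_data_dir.sh and utils/fix_data_dir.sh remain the authoritative tools.

```python
def read_ids(lines):
    """Return the first whitespace-separated field of each non-empty line."""
    return [line.split(None, 1)[0] for line in lines if line.strip()]


def check_data_dir(wav_scp, text, utt2spk):
    """Return a list of human-readable problems (hypothetical helper)."""
    problems = []
    wav_ids = read_ids(wav_scp)
    text_ids = read_ids(text)
    utt_ids = read_ids(utt2spk)
    for name, ids in (('wav.scp', wav_ids), ('text', text_ids), ('utt2spk', utt_ids)):
        if ids != sorted(ids):
            problems.append(name + ' is not sorted by id')
        if len(ids) != len(set(ids)):
            problems.append(name + ' has duplicate ids')
    # Every recording should have a transcript.
    missing = sorted(set(wav_ids) - set(text_ids))
    if missing:
        problems.append('no transcript for: ' + ' '.join(missing))
    if set(wav_ids) != set(utt_ids):
        problems.append('wav.scp and utt2spk list different ids')
    return problems


print(check_data_dir(['001 /a.wav', '002 /b.wav'], ['001 hi'], ['001 s', '002 s']))
```

An empty list means none of these particular checks failed; it does not guarantee that Kaldi's validation will pass.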
Step 3: Configure the environment
Copy kaldi/egs/aishell/s5/path.sh into your own project directory and change the export KALDI_ROOT line to point at your own Kaldi path.
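If you would rather script that edit than do it by hand, the following Python sketch rewrites the KALDI_ROOT line in the text of a path.sh file. The function name is hypothetical, and it assumes the file contains a line that starts with "export KALDI_ROOT=".

```python
import re


def set_kaldi_root(path_sh_text, kaldi_root):
    """Replace the value on the 'export KALDI_ROOT=' line (hypothetical helper)."""
    pattern = re.compile(r'^export KALDI_ROOT=.*$', flags=re.MULTILINE)
    # A function is used as the replacement so that backslashes in the
    # path are inserted literally instead of being read as escapes.
    new_text, count = pattern.subn(
        lambda match: 'export KALDI_ROOT=' + kaldi_root, path_sh_text)
    if count == 0:
        raise ValueError('no "export KALDI_ROOT=" line found')
    return new_text


sample = 'export KALDI_ROOT=`pwd`/../../..\nexport LC_ALL=C\n'
print(set_kaldi_root(sample, '/kaldi'))
```

To apply it, read your copied path.sh, pass its contents through the function, and write the result back.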
Source it:
. path.sh
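prepare_lang.sh then becomes reachable from the shell (type pre and press Tab twice to see it). No new file is created; path.sh only extends PATH, and the script itself lives in the utils/ directory.
At this point the path is configured.
PS: if prepare_lang.sh does not show up, run: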
. /kaldi/egs/wsj/s5/path.sh
If prepare_lang.sh now shows up from /kaldi/egs/wsj/s5/ (type pre and press Tab twice), the setup succeeded.
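Tab completion is just a PATH lookup, so the same check can be done programmatically. This is a small sketch using Python's shutil.which; the function name is hypothetical, and it should be run from a shell in which path.sh has already been sourced.

```python
import os
import shutil


def find_script(name, search_path):
    """Return the full path of an executable `name` found on `search_path`, or None."""
    return shutil.which(name, path=search_path)


# Prints the location of prepare_lang.sh, or None if PATH does not contain it.
print(find_script('prepare_lang.sh', os.environ.get('PATH', '')))
```

A result of None means the directory that holds the script is not on PATH in the current shell.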
Step 4: Build the tools
vim /kaldi/INSTALL
This opens the official installation guide. Option 1 has two steps (tools/ first, then src/):
This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[Option 1 in the following does not apply to native Windows install, see windows/INSTALL or following Option 2]
Option 1 (bash + makefile):
Steps:
(1)
go to tools/ and follow INSTALL instructions there.
(2)
go to src/ and follow INSTALL instructions there.
Option 2 (cmake):
Go to cmake/ and follow INSTALL.md instructions there.
Note, it may not be well tested and some features are missing currently.
Run the following commands in order:
cd kaldi/tools/
extras/check_dependencies.sh
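Errors
If no errors are reported, skip this section.
If errors are reported, install the missing dependencies as prompted. The official script can install MKL directly: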
extras/install_mkl.sh
You can also install the missing packages yourself:
yum install gcc-c++ make automake autoconf patch bzip2 unzip wget sox gcc-gfortran libtool subversion python3 zlib-devel
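For a quick look at which command-line tools are still missing, a sketch like the following can help. It is not a replacement for extras/check_dependencies.sh. The command names are my assumption of what the packages in the yum line provide (for example gcc-c++ provides g++ and subversion provides svn), and library packages such as zlib-devel cannot be checked this way.

```python
import shutil

# Assumed command names for the packages listed in the yum line.
REQUIRED = ['g++', 'make', 'automake', 'autoconf', 'patch', 'bzip2', 'unzip',
            'wget', 'sox', 'gfortran', 'libtool', 'svn', 'python3']


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on PATH."""
    return [command for command in commands if which(command) is None]


print(missing_commands(REQUIRED))
```

The `which` parameter exists only so the lookup can be swapped out when testing; by default the real PATH is searched.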
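PS: MKL may still be reported as missing even after it has been installed. The script installs the latest version by default, which some users report as problematic, so download the 2017 or 2018 release from the official site instead.
Once that is done, continue with the official instructions: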
cd /kaldi/tools/
make -j 4
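The build checks automatically whether the environment is complete. An error means something is not installed properly; install whatever the error message names.
How do you check that the build succeeded?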
cd /kaldi/tools/
Type fst and press Tab repeatedly
If many files starting with fst are listed, fstcompile in particular, the installation has most likely succeeded.
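The same check can be done without relying on tab completion. The sketch below lists the files whose names start with fst in a given directory. The directory /kaldi/tools/openfst/bin is an assumption based on the /kaldi install path used in this post, so adjust it to your setup; the function name is hypothetical.

```python
import os


def list_fst_tools(bin_dir):
    """Return the sorted names in bin_dir that start with 'fst'."""
    if not os.path.isdir(bin_dir):
        return []
    return sorted(name for name in os.listdir(bin_dir) if name.startswith('fst'))


# Assumed location of the OpenFst binaries for a Kaldi checkout at /kaldi.
tools = list_fst_tools('/kaldi/tools/openfst/bin')
print(len(tools), 'fst tools found; fstcompile present:', 'fstcompile' in tools)
```

An empty result means either the directory does not exist or the OpenFst build did not finish.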
Build src
cd /kaldi/src
./configure --shared
make depend -j 8
make -j 8
Error encountered:
***configure failed: CUDA 10_1 does not support c++ (g++-9). Only versions strictly older than 9.0 are supported. ***
Solutions:
1. CUDA path: see this blog post: https://blog.csdn.net/accumulating_mocai/article/details/110006111
2. Downgrade gcc: https://zhuanlan.zhihu.com/p/453542931
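To see in advance whether your compiler will trigger this error, you can compare its major version with the limit quoted in the message (strictly older than 9.0 for CUDA 10.1). This is a minimal sketch with hypothetical function names; it parses the output of g++ -dumpversion and only encodes the rule stated in that one error message.

```python
import subprocess


def gcc_major(dumpversion_output):
    """Extract the major version from `g++ -dumpversion` output such as '9' or '8.4.0'."""
    return int(dumpversion_output.strip().split('.')[0])


def ok_for_cuda_10_1(dumpversion_output):
    """Apply the rule from the configure error: only versions older than 9.0 pass."""
    return gcc_major(dumpversion_output) < 9


if __name__ == '__main__':
    try:
        out = subprocess.run(['g++', '-dumpversion'], capture_output=True,
                             text=True, check=True).stdout
        print('g++', out.strip(), 'accepted for CUDA 10.1:', ok_for_cuda_10_1(out))
    except (FileNotFoundError, subprocess.CalledProcessError):
        print('g++ could not be run')
```

Other CUDA releases have different compiler limits, so the threshold here should not be reused for them.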
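Copyright notice: this is an original article by CSDN blogger 「飞扬々岁月」, licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/weixin_42264992/article/details/125395239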