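First, install and deploy the PaddleSpeech speech system in your local environment; see "Deploying PaddleSpeech on Windows 10".
Once it is deployed locally, start the streaming TTS service as described in the official documentation; see "Enabling the streaming service in PaddleSpeech".
1. Starting the services
1.1 Starting the TTS API service locally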
paddlespeech_server start --config_file ./demos/speech_server/conf/application.yaml
[2024-09-10 18:08:16,571] [ INFO] - cls : python engine.
[2024-09-10 18:08:30,175] [ INFO] - Initialize CLS server engine successfully on device: cpu.
[2024-09-10 18:08:30,175] [ INFO] - text : python engine.
[2024-09-10 18:08:34,068] [ INFO] - loading configuration file C:\Users\HIAPAD\.paddlespeech\models\ernie_linear_p3_wudao-punc-zh\1.0\ernie_linear_p3_wudao-punc-zh.tar\ckpt\config.json
[2024-09-10 18:08:34,069] [ INFO] - Model config ErnieConfig {
"architectures": [
"ErnieForTokenClassification"
],
"attention_probs_dropout_prob": 0.1,
"enable_recompute": false,
"fuse": false,
"hidden_act": "relu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"id2label": {
"0": "LABEL_0",
"1": "LABEL_1",
"2": "LABEL_2",
"3": "LABEL_3"
},
"initializer_range": 0.02,
"intermediate_size": 3072,
"label2id": {
"LABEL_0": 0,
"LABEL_1": 1,
"LABEL_2": 2,
"LABEL_3": 3
},
"layer_norm_eps": 1e-12,
"max_position_embeddings": 513,
"model_type": "ernie",
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pad_token_id": 0,
"paddlenlp_version": null,
"pool_act": "tanh",
"task_id": 0,
"task_type_vocab_size": 3,
"type_vocab_size": 2,
"use_task_id": true,
"vocab_size": 18000
}
[2024-09-10 18:08:51,924] [ INFO] - All model checkpoint weights were used when initializing ErnieForTokenClassification.
[2024-09-10 18:08:51,925] [ WARNING] - Some weights of ErnieForTokenClassification were not initialized from the model checkpoint at C:\Users\HIAPAD\.paddlespeech\models\ernie_linear_p3_wudao-punc-zh\1.0\ernie_linear_p3_wudao-punc-zh.tar\ckpt and are newly initialized: ['ernie.embeddings.task_type_embeddings.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2024-09-10 18:08:51,940] [ INFO] - Already cached C:\Users\HIAPAD\.paddlenlp\models\ernie-1.0\vocab.txt
[2024-09-10 18:08:51,950] [ INFO] - tokenizer config file saved in C:\Users\HIAPAD\.paddlenlp\models\ernie-1.0\tokenizer_config.json
[2024-09-10 18:08:51,950] [ INFO] - Special tokens file saved in C:\Users\HIAPAD\.paddlenlp\models\ernie-1.0\special_tokens_map.json
[2024-09-10 18:08:51,951] [ INFO] - Using model: ernie_linear_p3_wudao.
[2024-09-10 18:08:51,953] [ INFO] - Initialize Text server engine successfully on device: cpu.
[2024-09-10 18:08:51,953] [ INFO] - vector : python engine.
[2024-09-10 18:08:54,461] [ INFO] - Initialize Vector server engine successfully on device: cpu.
Building prefix dict from the default dictionary ...
[2024-09-10 18:08:54] [DEBUG] [__init__.py:113] Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\HIAPAD\AppData\Local\Temp\jieba.cache
[2024-09-10 18:08:54] [DEBUG] [__init__.py:132] Loading model from cache C:\Users\HIAPAD\AppData\Local\Temp\jieba.cache
Loading model cost 0.592 seconds.
[2024-09-10 18:08:55] [DEBUG] [__init__.py:164] Loading model cost 0.592 seconds.
Prefix dict has been built successfully.
[2024-09-10 18:08:55] [DEBUG] [__init__.py:166] Prefix dict has been built successfully.
INFO: Started server process [4704]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)
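Output like the above means the service started successfully. On the first start the required models are not yet available locally and are downloaded automatically, so startup takes noticeably longer.
1.2 Starting the TTS streaming service locally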
paddlespeech_server start --config_file ./demos/streaming_tts_server/conf/tts_online_application.yaml
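INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
This log line indicates that the service started successfully.
1.3 Service notes
- The configuration files ship with the demos in the project repository, so clone the project with git to get them; a pip installation does not include them.
- The API service exposes both TTS and ASR endpoints, but the streaming TTS and streaming ASR services each have to be started separately.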
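Before running any client test it can help to confirm that both services are actually listening. The following is a minimal sketch and not part of PaddleSpeech: the class name ServicePortCheck and the method isPortOpen are hypothetical, and ports 8090 and 8092 are simply the ones shown in the startup logs above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not a PaddleSpeech API): checks whether a TCP port accepts connections. */
final class ServicePortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs milliseconds. */
    static boolean isPortOpen(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, host unreachable, or timed out
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed ports: 8090 for the API service, 8092 for the streaming TTS service
        System.out.println("API service (8090): " + isPortOpen("127.0.0.1", 8090, 1000));
        System.out.println("Streaming TTS (8092): " + isPortOpen("127.0.0.1", 8092, 1000));
    }
}
```

An open port only shows that a process is listening; it does not prove that the models finished loading, so the startup log remains the authoritative signal.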
2. Measuring speed from Java
2.1 Writing the HTTP connection class
import java.io.*;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
/**
 *
 * Java test of the speed of the speech server's API and streaming TTS endpoints
 *
 * @author 赵海洋
 * @date 2024-09-10
 */
public class HttpURLConnectionTest {
    /**
     * HTTP POST request
     */
    public static String doPost(String httpUrl, String param) {
        HttpURLConnection connection = null;
        OutputStream os = null;
        InputStream is = null;
        BufferedReader br = null;
        StringBuffer result = new StringBuffer();
        try {
            // 1.1 Create the URL object
            URL url = new URL(httpUrl);
            // 1.2 Open the connection
            connection = (HttpURLConnection) url.openConnection();
            // 2.1 Set the request method
            connection.setRequestMethod("POST");
            // 2.2 Connect timeout, in milliseconds
            connection.setConnectTimeout(30000);
            // 2.3 Read timeout, in milliseconds
            connection.setReadTimeout(60000);
            // 2.4 Defaults to false; must be true to send a request body to the server
            connection.setDoOutput(true);
            // Defaults to true; required for reading the response, so this call is optional
            connection.setDoInput(true);
            // 2.5 Common request headers
            connection.setRequestProperty("accept", "*/*");
            connection.setRequestProperty("connection", "Keep-Alive");
            connection.setRequestProperty("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
            connection.setRequestProperty("Content-Type", "application/json;charset=utf-8");
            // 3. Write the request body
            if (null != param && !param.equals("")) {
                byte[] paramData = param.getBytes("UTF-8");
                connection.setRequestProperty("Content-Length", "" + paramData.length);
                // Obtain the output stream
                os = connection.getOutputStream();
                // Send the JSON parameters
                os.write(paramData);
            }
            // 4. Read the response
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                is = connection.getInputStream();
                if (null != is) {
                    br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
                    String temp = null;
                    while (null != (temp = br.readLine())) {
                        result.append(temp);
                        result.append("\r\n");
                    }
                }
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (br != null) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (os != null) {
                try {
                    os.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (is != null) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Close the connection (it is still null if the URL was malformed)
            if (connection != null) {
                connection.disconnect();
            }
        }
        return result.toString();
    }
    /**
     * HTTP GET request
     */
    public static String doGet(String httpUrl) {
        HttpURLConnection connection = null;
        InputStream is = null;
        BufferedReader br = null;
        StringBuffer result = new StringBuffer();
        try {
            // 1. Create the connection object
            URL url = new URL(httpUrl);
            connection = (HttpURLConnection) url.openConnection();
            // 2.1 Set the request method
            connection.setRequestMethod("GET");
            // 2.2 Read timeout, in milliseconds
            connection.setReadTimeout(15000);
            // 3. Connect
            connection.connect();
            // 4. Read the response data
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                // Get the returned data
                is = connection.getInputStream();
                if (null != is) {
                    br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
                    String temp = null;
                    while (null != (temp = br.readLine())) {
                        result.append(temp);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != br) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (null != is) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Close the remote connection (it is still null if the URL was malformed)
            if (connection != null) {
                connection.disconnect();
            }
        }
        return result.toString();
    }
    /**
     * HTTP GET request that downloads an image
     */
    public static void downLoad(String httpUrl) {
        HttpURLConnection connection = null;
        InputStream is = null;
        BufferedInputStream fis = null;
        BufferedOutputStream fio = null;
        try {
            // 1. Create the connection object
            URL url = new URL(httpUrl);
            connection = (HttpURLConnection) url.openConnection();
            // 2.1 Set the request method
            connection.setRequestMethod("GET");
            // 2.2 Read timeout, in milliseconds
            connection.setReadTimeout(15000);
            // 3. Connect
            connection.connect();
            // 4. Read the response data
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                // Get the returned data
                is = connection.getInputStream();
                if (null != is) {
                    fis = new BufferedInputStream(is);
                    fio = new BufferedOutputStream(new FileOutputStream("e:\\demo\\test.jpg"));
                    // Read one byte array at a time
                    int len = 0;
                    byte[] bys = new byte[1024];
                    while ((len = fis.read(bys)) != -1) {
                        System.out.print(len);
                        fio.write(bys, 0, len);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != fis) {
                try {
                    fis.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (null != fio) {
                try {
                    fio.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Close the remote connection (it is still null if the URL was malformed)
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}
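The tests that follow pass a hand-escaped JSON string as the request body. As a possible alternative, the sketch below builds an equivalent body from plain values and escapes the text, so that quotes or line breaks in the input do not produce invalid JSON. TtsRequestBody, escapeJson and build are hypothetical names; the field names (text, spk_id, speed, volume, sample_rate, save_path) are the ones used in the requests in this article.

```java
/** Hypothetical helper: builds the JSON body for the TTS requests used in this article. */
final class TtsRequestBody {

    /** Escapes characters that must not appear raw inside a JSON string literal. */
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become a 4-digit hex escape
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Assembles the request body; numeric fields are written unquoted. */
    static String build(String text, int spkId, double speed, double volume,
                        int sampleRate, String savePath) {
        return "{\"text\": \"" + escapeJson(text) + "\""
                + ", \"spk_id\": " + spkId
                + ", \"speed\": " + speed
                + ", \"volume\": " + volume
                + ", \"sample_rate\": " + sampleRate
                + ", \"save_path\": \"" + escapeJson(savePath) + "\"}";
    }
}
```

A call such as TtsRequestBody.build(text, 0, 1.0, 1.0, 0, "./tts.wav") could then replace the literal param string. In a larger project a JSON library would be the more robust choice; string concatenation is used here only to keep the sketch dependency-free.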
2.2 Speed in API mode
    public static void main(String[] args) throws IOException {
        String param = "{\"text\": \"重要提醒:写博客要负责人,最好不要发没有质量保证的东西!以免耽误大家的时间!\",\"spk_id\": 0,\"speed\": 1.0, \"volume\": 1.0,\"sample_rate\": 0,\"save_path\": \"./tts.wav\"}";
        long time = System.currentTimeMillis();
        String post = doPost("http://127.0.0.1:8090/paddlespeech/tts", param);
        System.out.println(post);
        System.out.println("Elapsed (ms): " + (System.currentTimeMillis() - time));
    }
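Client output:
Server output:
The model took 1485 ms in total on the server, while the client measured 1552 ms from sending the request to receiving the response; the 67 ms difference is the transfer overhead between the two.
Result returned by the server: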
{"success":true,"code":200,"message":{"description":"success."},"result":{"lang":"zh","spk_id":0,"speed":1.0,"volume":1.0,"sample_rate":24000,"duration":6.9,"save_path":"./tts.wav","audio":"UklGRuQNBQBXQVZFZm10IBAAAAABAAEAwF0AAIC7AAACABAAZGF0YcANBQD+/f/+//z//v/7//7//v/+//3+/5//j/+//7//v/+//8//n/+//7//8AAP7/AQAAAAAAAAABAAIAAgACAAMAAgACAP//AgADAAQAAwAGAAMABAADAAQAAwADAAIAAwABAAAA/vz//v/8//7/+//6//r/+//5//z/+f/7//z9//7//f/9//3//P/7//3/+//6//j/+f/1//X/8//0//T/9P/w//T/9P/5//z//v/+/wIAAgADAAIAAAA="}}
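- audio: the synthesized audio as a Base64-encoded string
- save_path: the path of the audio file generated on the speech server
- result: the parameters the speech model used for synthesis (language, speaker, speed, volume, sample rate, duration), together with the two fields above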
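Since the audio field carries the whole file as Base64 (the sample above begins with UklGR, which decodes to the RIFF header of a WAV file), the client can write it to disk itself. The sketch below assumes the response has the shape shown above and that the Base64 text contains no escaped characters; TtsAudioDecoder, decodeAudio and saveAudio are hypothetical names, and the regular expression stands in for a real JSON parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts and decodes the "audio" field of a TTS response. */
final class TtsAudioDecoder {

    // Matches "audio":"<base64>"; assumes the value contains no escaped quotes
    private static final Pattern AUDIO_FIELD =
            Pattern.compile("\"audio\"\\s*:\\s*\"([^\"]*)\"");

    /** Returns the decoded audio bytes, or throws if the field is missing. */
    static byte[] decodeAudio(String responseJson) {
        Matcher m = AUDIO_FIELD.matcher(responseJson);
        if (!m.find()) {
            throw new IllegalArgumentException("no \"audio\" field in response");
        }
        return Base64.getDecoder().decode(m.group(1));
    }

    /** Decodes the audio and writes it to the given file. */
    static void saveAudio(String responseJson, Path target) throws IOException {
        Files.write(target, decodeAudio(responseJson));
    }
}
```

For example, TtsAudioDecoder.saveAudio(post, java.nio.file.Paths.get("tts_client.wav")) would store the result of the doPost call on the client side; the file name here is only an illustration.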
2.3 Speed in streaming mode
    public static void main(String[] args) throws IOException {
        String param = "{\"text\": \"重要提醒:写博客要负责人,最好不要发没有质量保证的东西!以免耽误大家的时间!\",\"spk_id\": 0,\"speed\": 1.0, \"volume\": 1.0,\"sample_rate\": 0,\"save_path\": \"./tts.wav\"}";
        long time = System.currentTimeMillis();
        String post = doPost("http://127.0.0.1:8092/paddlespeech/tts/streaming", param);
        System.out.println(post);
        System.out.println("Elapsed (ms): " + (System.currentTimeMillis() - time));
    }
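Client output:
Server output:
For the same input text, streaming mode is somewhat faster than API mode.
Now let us use a longer input text (Chinese, as sent to the model):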
类似于游戏制作,创作出一个虚拟场景供人体验,其核心是graphics的各项技术的发挥。
和我们接触最多的就是应用在游戏上,可以说是传统游戏娱乐设备的一个升级版,
主要关注虚拟场景是否有良好的体验。
而与真实场景是否相关,他们并不关心。
VR设备往往是浸入式的,典型的设备就是oculus rift
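Response time in streaming mode:
Response time in API mode:
As the text grows, the gap between the two response times widens, so streaming mode has a clear advantage on long texts.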
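Note that doPost only reports the total time after the whole response body has been read. To see how early a response starts arriving, one option is to record the time to the first byte as well as the total time. The sketch below is one way to do that under those assumptions; StreamTiming and measure are hypothetical names, and the input stream would come from connection.getInputStream() after the request body has been written.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper: measures time to first byte and total read time of a response stream. */
final class StreamTiming {
    final long firstByteMs; // -1 if the stream delivered no data
    final long totalMs;
    final long totalBytes;

    private StreamTiming(long firstByteMs, long totalMs, long totalBytes) {
        this.firstByteMs = firstByteMs;
        this.totalMs = totalMs;
        this.totalBytes = totalBytes;
    }

    /**
     * Reads the stream to the end. startNanos should be a System.nanoTime()
     * value taken just before the request was sent.
     */
    static StreamTiming measure(InputStream in, long startNanos) throws IOException {
        byte[] buffer = new byte[4096];
        boolean sawData = false;
        long firstByteNanos = 0L;
        long bytes = 0L;
        int n;
        while ((n = in.read(buffer)) != -1) {
            if (n > 0 && !sawData) {
                // First chunk of data has arrived
                sawData = true;
                firstByteNanos = System.nanoTime();
            }
            bytes += n;
        }
        long endNanos = System.nanoTime();
        long firstMs = sawData ? (firstByteNanos - startNanos) / 1_000_000L : -1L;
        return new StreamTiming(firstMs, (endNanos - startNanos) / 1_000_000L, bytes);
    }
}
```

Comparing firstByteMs and totalMs for the two endpoints would show whether the difference comes from audio arriving earlier or from the whole synthesis finishing sooner; the article's measurements cover total time only.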
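2.4 Comparison across runs
- 1. In API mode, response time grows as the text gets longer; over repeated runs with the same text, the elapsed time stays constant.
- 2. In streaming mode, response time also grows as the text gets longer; over repeated runs with the same text, the elapsed time varies from run to run.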
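Observations like these are easier to check with numbers from several runs. The following sketch times a task repeatedly and summarizes the spread; RepeatedRuns, time and summarize are hypothetical names, and the task would typically be a doPost call against one of the two endpoints.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.LongSummaryStatistics;

/** Hypothetical helper: times repeated runs of a task and summarizes the results. */
final class RepeatedRuns {

    /** Runs the task the given number of times and returns each duration in milliseconds. */
    static long[] time(Runnable task, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsedMs[i] = (System.nanoTime() - start) / 1_000_000L;
        }
        return elapsedMs;
    }

    /** Formats count, min, max, mean and spread (max - min) of the measured durations. */
    static String summarize(long[] elapsedMs) {
        LongSummaryStatistics s = Arrays.stream(elapsedMs).summaryStatistics();
        return String.format(Locale.ROOT, "runs=%d min=%d max=%d mean=%.1f spread=%d",
                s.getCount(), s.getMin(), s.getMax(), s.getAverage(), s.getMax() - s.getMin());
    }
}
```

For instance, RepeatedRuns.summarize(RepeatedRuns.time(() -> HttpURLConnectionTest.doPost(url, param), 10)) would give a spread close to zero for a mode with stable timing and a larger spread for one whose timing fluctuates; url and param stand for the values used in the tests above.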
Summary: writing this up took some effort, and I hope it is helpful to you!
From: https://blog.csdn.net/zhyooo123/article/details/142105354