PaddleSpeech TTS: API vs. Streaming Speed Comparison (Windows, Java)


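First, install and deploy PaddleSpeech in the local environment; see "Deploying PaddleSpeech on Windows 10".

Once it is deployed locally, start the TTS streaming service following the official documentation; see "Enabling streaming services in PaddleSpeech".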

1. Starting the services

1.1 Starting the TTS API service locally

paddlespeech_server start --config_file ./demos/speech_server/conf/application.yaml

[2024-09-10 18:08:16,571] [    INFO] - cls : python engine.
[2024-09-10 18:08:30,175] [    INFO] - Initialize CLS server engine successfully on device: cpu.
[2024-09-10 18:08:30,175] [    INFO] - text : python engine.
[2024-09-10 18:08:34,068] [    INFO] - loading configuration file C:\Users\HIAPAD\.paddlespeech\models\ernie_linear_p3_wudao-punc-zh\1.0\ernie_linear_p3_wudao-punc-zh.tar\ckpt\config.json
[2024-09-10 18:08:34,069] [    INFO] - Model config ErnieConfig {
  "architectures": [
    "ErnieForTokenClassification"
  ],
  "attention_probs_dropout_prob": 0.1,
  "enable_recompute": false,
  "fuse": false,
  "hidden_act": "relu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 513,
  "model_type": "ernie",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "paddlenlp_version": null,
  "pool_act": "tanh",
  "task_id": 0,
  "task_type_vocab_size": 3,
  "type_vocab_size": 2,
  "use_task_id": true,
  "vocab_size": 18000
}

[2024-09-10 18:08:51,924] [    INFO] - All model checkpoint weights were used when initializing ErnieForTokenClassification.
[2024-09-10 18:08:51,925] [ WARNING] - Some weights of ErnieForTokenClassification were not initialized from the model checkpoint at C:\Users\HIAPAD\.paddlespeech\models\ernie_linear_p3_wudao-punc-zh\1.0\ernie_linear_p3_wudao-punc-zh.tar\ckpt and are newly initialized: ['ernie.embeddings.task_type_embeddings.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[2024-09-10 18:08:51,940] [    INFO] - Already cached C:\Users\HIAPAD\.paddlenlp\models\ernie-1.0\vocab.txt
[2024-09-10 18:08:51,950] [    INFO] - tokenizer config file saved in C:\Users\HIAPAD\.paddlenlp\models\ernie-1.0\tokenizer_config.json
[2024-09-10 18:08:51,950] [    INFO] - Special tokens file saved in C:\Users\HIAPAD\.paddlenlp\models\ernie-1.0\special_tokens_map.json
[2024-09-10 18:08:51,951] [    INFO] - Using model: ernie_linear_p3_wudao.
[2024-09-10 18:08:51,953] [    INFO] - Initialize Text server engine successfully on device: cpu.
[2024-09-10 18:08:51,953] [    INFO] - vector : python engine.
[2024-09-10 18:08:54,461] [    INFO] - Initialize Vector server engine successfully on device: cpu.
Building prefix dict from the default dictionary ...
[2024-09-10 18:08:54] [DEBUG] [__init__.py:113] Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\HIAPAD\AppData\Local\Temp\jieba.cache
[2024-09-10 18:08:54] [DEBUG] [__init__.py:132] Loading model from cache C:\Users\HIAPAD\AppData\Local\Temp\jieba.cache
Loading model cost 0.592 seconds.
[2024-09-10 18:08:55] [DEBUG] [__init__.py:164] Loading model cost 0.592 seconds.
Prefix dict has been built successfully.
[2024-09-10 18:08:55] [DEBUG] [__init__.py:166] Prefix dict has been built successfully.
INFO:     Started server process [4704]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)

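Once the line "INFO: Uvicorn running on http://0.0.0.0:8090 (Press CTRL+C to quit)" appears, the service has started successfully. On the first start the required models are not yet present locally, so they are downloaded automatically and startup takes noticeably longer.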

1.2 Starting the TTS Streaming service locally

paddlespeech_server start --config_file ./demos/streaming_tts_server/conf/tts_online_application.yaml

The line "INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)" indicates that the service has started successfully.
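
Before running any client code, it can help to confirm that both ports actually accept connections. The following is a minimal sketch, assuming the services listen on the ports shown in the logs (8090 for the API service, 8092 for the streaming service). The class name PortProbe and the 2-second timeout are illustrative choices, not part of PaddleSpeech.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

final class PortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is listening there (yet).
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports taken from the Uvicorn log lines; adjust if your config differs.
        System.out.println("TTS API (8090): " + isListening("127.0.0.1", 8090, 2000));
        System.out.println("TTS streaming (8092): " + isListening("127.0.0.1", 8092, 2000));
    }
}
```

A successful probe only shows that the port is open; it does not verify that the models finished loading.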

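1.3 Service notes

The configuration files used above live in the demos directory of the project, so the project has to be cloned with git; a pip installation does not include them.
The API service provides both TTS and ASR endpoints, but the streaming TTS and streaming ASR services each have to be started separately.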

2. Measuring speed with Java

2.1 Writing the network connection class

import java.io.*;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;

/**
 * Measures the response time of the PaddleSpeech TTS API and streaming TTS services from Java.
 *
 * @author 赵海洋
 * @date 2024-09-10
 */
public class HttpURLConnectionTest {
    /**
     * HTTP POST request
     */
    public static String doPost(String httpUrl, String param) {
        HttpURLConnection connection = null;
        OutputStream os = null;
        InputStream is = null;
        BufferedReader br = null;
        StringBuffer result = new StringBuffer();
        try {
            // 1.1 Build the URL object
            URL url = new URL(httpUrl);
            // 1.2 Open the connection
            connection = (HttpURLConnection) url.openConnection();

            // 2.1 Set the request method
            connection.setRequestMethod("POST");
            // 2.2 Connect timeout, in milliseconds
            connection.setConnectTimeout(30000);
            // 2.3 Read timeout, in milliseconds
            connection.setReadTimeout(60000);
            // 2.4 Defaults to false; must be true in order to send a request body
            connection.setDoOutput(true);
            // Defaults to true; needed to read the response, so this call is optional
            connection.setDoInput(true);
            // 2.5 Common request headers
            connection.setRequestProperty("accept", "*/*");
            connection.setRequestProperty("connection", "Keep-Alive");
            connection.setRequestProperty("user-agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)");
            connection.setRequestProperty("Content-Type", "application/json;charset=utf-8");

            // 3. Write the request body
            if (null != param && !param.equals("")) {
                byte[] paramData = param.getBytes("UTF-8");
                // HttpURLConnection computes Content-Length itself; fixed-length
                // streaming mode declares the body size explicitly.
                connection.setFixedLengthStreamingMode(paramData.length);

                // Open the request body stream
                os = connection.getOutputStream();
                // Send the JSON payload
                os.write(paramData);
            }

            // 4. Read the response
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                is = connection.getInputStream();
                if (null != is) {
                    br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
                    String temp = null;
                    while (null != (temp = br.readLine())) {
                        result.append(temp);
                        result.append("\r\n");
                    }
                }
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if(br!=null){
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if(os!=null){
                try {
                    os.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if(is!=null){
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Close the connection (it is still null if opening it failed)
            if (connection != null) {
                connection.disconnect();
            }
        }
        return result.toString();
    }

    /**
     * HTTP GET request
     */
    public static String doGet(String httpUrl){
        HttpURLConnection connection = null;
        InputStream is = null;
        BufferedReader br = null;
        StringBuffer result = new StringBuffer();

        try {
            // 1. Build the URL object and open the connection
            URL url = new URL(httpUrl);
            connection = (HttpURLConnection) url.openConnection();

            // 2.1 Set the request method
            connection.setRequestMethod("GET");
            // 2.2 Read timeout, in milliseconds
            connection.setReadTimeout(15000);

            // 3. Connect
            connection.connect();

            // 4. Read the response
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                // Get the response body stream
                is = connection.getInputStream();
                if (null != is) {
                    br = new BufferedReader(new InputStreamReader(is, "UTF-8"));
                    String temp = null;
                    while (null != (temp = br.readLine())) {
                        result.append(temp);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != br) {
                try {
                    br.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (null != is) {
                try {
                    is.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Close the remote connection (it is still null if opening it failed)
            if (connection != null) {
                connection.disconnect();
            }
        }
        return result.toString();
    }

    /**
     * Downloads an image with an HTTP GET request
     */
    public static void downLoad(String httpUrl){
        HttpURLConnection connection = null;
        InputStream is = null;
        BufferedInputStream fis = null;
        BufferedOutputStream fio = null;

        try {
            // 1. Build the URL object and open the connection
            URL url = new URL(httpUrl);
            connection = (HttpURLConnection) url.openConnection();

            // 2.1 Set the request method
            connection.setRequestMethod("GET");
            // 2.2 Read timeout, in milliseconds
            connection.setReadTimeout(15000);

            // 3. Connect
            connection.connect();

            // 4. Read the response
            if (connection.getResponseCode() == HttpURLConnection.HTTP_OK) {
                // Get the response body stream
                is = connection.getInputStream();
                if (null != is) {
                    fis = new BufferedInputStream(is);
                    fio = new BufferedOutputStream(new FileOutputStream("e:\\demo\\test.jpg"));

                    // Read one buffer of bytes at a time
                    int len = 0;
                    byte[] bys = new byte[1024];
                    while ( (len = fis.read(bys)) != -1 ){
                        System.out.print(len);
                        fio.write(bys, 0, len);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != fis) {
                try {
                    fis.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (null != fio) {
                try {
                    fio.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            // Close the remote connection (it is still null if opening it failed)
            if (connection != null) {
                connection.disconnect();
            }
        }
    }
}

2.2 Speed in API mode

    public static void main(String[] args) throws IOException {

        String param = "{\"text\": \"重要提醒：写博客要负责人，最好不要发没有质量保证的东西！以免耽误大家的时间！\",\"spk_id\": 0,\"speed\": 1.0, \"volume\": 1.0,\"sample_rate\": 0,\"save_path\": \"./tts.wav\"}";

        long time = System.currentTimeMillis();
        String post = doPost("http://127.0.0.1:8090/paddlespeech/tts", param);

        System.out.println(post);
        System.out.println("Elapsed (ms): " + (System.currentTimeMillis() - time));
    }
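
Hand-escaping the JSON body is easy to get wrong once the text contains quotes or backslashes. Below is a small sketch of a helper that builds the same request body. The field names come from the request above, while TtsRequest, build, and escape are hypothetical names introduced here for illustration.

```java
final class TtsRequest {

    /** Escapes the characters JSON does not allow unescaped inside a string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a \\uXXXX escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body with the same fields as the hand-written request. */
    static String build(String text, int spkId, double speed, double volume,
                        int sampleRate, String savePath) {
        return "{\"text\": \"" + escape(text) + "\""
                + ", \"spk_id\": " + spkId
                + ", \"speed\": " + speed
                + ", \"volume\": " + volume
                + ", \"sample_rate\": " + sampleRate
                + ", \"save_path\": \"" + escape(savePath) + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(build("He said \"hi\"", 0, 1.0, 1.0, 0, "./tts.wav"));
    }
}
```

The resulting string can be passed to doPost unchanged. For anything beyond a quick test, a JSON library would be the safer choice.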

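Client output:

Server output:

The model's total run time on the server was 1485 ms, and the total time from sending the request to receiving the response on the client was 1552 ms. The 67 ms difference is the network transfer overhead.

Result returned by the server: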

{"success":true,"code":200,"message":{"description":"success."},"result":{"lang":"zh","spk_id":0,"speed":1.0,"volume":1.0,"sample_rate":24000,"duration":6.9,"save_path":"./tts.wav","audio":"UklGRuQNBQBXQVZFZm10IBAAAAABAAEAwF0AAIC7AAACABAAZGF0YcANBQD+/f/+//z//v/7//7//v/+//3+/5//j/+//7//v/+//8//n/+//7//8AAP7/AQAAAAAAAAABAAIAAgACAAMAAgACAP//AgADAAQAAwAGAAMABAADAAQAAwADAAIAAwABAAAA/vz//v/8//7/+//6//r/+//5//z/+f/7//z9//7//f/9//3//P/7//3/+//6//j/+f/1//X/8//0//T/9P/w//T/9P/5//z//v/+/wIAAgADAAIAAAA="}}

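audio: the synthesized audio as a Base64-encoded string
save_path: path of the audio file written on the TTS server
result: the synthesis parameters reported by the model (lang, spk_id, speed, volume, sample_rate, duration)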
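
Because audio is a Base64 string, the client can rebuild the WAV bytes itself instead of relying on save_path on the server. This is a minimal sketch, assuming the response has the shape shown above. TtsAudioDecoder and its regex-based field extraction are illustrative stand-ins for a real JSON parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TtsAudioDecoder {

    // Base64 uses only [A-Za-z0-9+/=], so the value cannot contain an escaped quote.
    private static final Pattern AUDIO =
            Pattern.compile("\"audio\"\\s*:\\s*\"([A-Za-z0-9+/=]*)\"");

    /** Extracts the "audio" field from the response JSON and decodes it to raw bytes. */
    static byte[] decodeAudio(String responseJson) {
        Matcher m = AUDIO.matcher(responseJson);
        if (!m.find()) {
            throw new IllegalArgumentException("no \"audio\" field in response");
        }
        return Base64.getDecoder().decode(m.group(1));
    }

    public static void main(String[] args) {
        // Tiny stand-in response: "UklGRg==" is the Base64 form of the WAV magic "RIFF".
        String sample = "{\"result\":{\"audio\":\"UklGRg==\"}}";
        byte[] wav = decodeAudio(sample);
        System.out.println(new String(wav, StandardCharsets.US_ASCII)); // prints RIFF
    }
}
```

With a real response, the returned bytes could be written to disk with java.nio.file.Files.write(Paths.get("tts.wav"), wav).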

2.3 Speed in streaming mode

    public static void main(String[] args) throws IOException {

        String param = "{\"text\": \"重要提醒：写博客要负责人，最好不要发没有质量保证的东西！以免耽误大家的时间！\",\"spk_id\": 0,\"speed\": 1.0, \"volume\": 1.0,\"sample_rate\": 0,\"save_path\": \"./tts.wav\"}";

        long time = System.currentTimeMillis();
        String post = doPost("http://127.0.0.1:8092/paddlespeech/tts/streaming", param);
        System.out.println(post);
        System.out.println("Elapsed (ms): " + (System.currentTimeMillis() - time));
    }

Client output:

Server output:

For the same text, streaming mode is somewhat faster than API mode.
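
Note that doPost reads the whole response before returning, so the elapsed time it reports is the total time. If the server sends audio in chunks, as a streaming endpoint is expected to, the time until the first chunk arrives is often the more relevant figure, because playback can begin at that point. The sketch below measures both on any InputStream. StreamTimer and Timing are hypothetical names, and passing connection.getInputStream() to it is an assumption about how a caller would use it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamTimer {

    /** Timing result; firstByteMs is -1 when the stream was empty. */
    static final class Timing {
        final long firstByteMs;
        final long totalMs;
        final long bytes;

        Timing(long firstByteMs, long totalMs, long bytes) {
            this.firstByteMs = firstByteMs;
            this.totalMs = totalMs;
            this.bytes = bytes;
        }
    }

    /**
     * Reads the stream to the end and reports, relative to startNanos,
     * when the first bytes arrived and when the stream finished.
     */
    static Timing drain(InputStream in, long startNanos) throws IOException {
        byte[] buf = new byte[4096];
        long firstNanos = -1;
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            if (n > 0 && firstNanos < 0) {
                firstNanos = System.nanoTime();
            }
            total += n;
        }
        long endNanos = System.nanoTime();
        long firstMs = firstNanos < 0 ? -1 : (firstNanos - startNanos) / 1_000_000;
        return new Timing(firstMs, (endNanos - startNanos) / 1_000_000, total);
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for an HTTP response body.
        long start = System.nanoTime();
        Timing t = drain(new ByteArrayInputStream(new byte[10_000]), start);
        System.out.println("first byte: " + t.firstByteMs + " ms, total: "
                + t.totalMs + " ms, bytes: " + t.bytes);
    }
}
```

To compare the two services fairly, startNanos should be taken just before the request is sent, so that both figures include the server's processing time.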

If we switch to a longer text:

类似于游戏制作，创作出一个虚拟场景供人体验，其核心是graphics的各项技术的发挥。
和我们接触最多的就是应用在游戏上，可以说是传统游戏娱乐设备的一个升级版，
主要关注虚拟场景是否有良好的体验。
而与真实场景是否相关，他们并不关心。
VR设备往往是浸入式的，典型的设备就是oculus rift

Streaming response time:

API response time:

As the text grows, the gap between the two response times widens, so on long texts streaming has a clear advantage.

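2.4 Comparison across repeated runs

1. In API mode, response time increases with text length; across repeated runs on the same text, the elapsed time stays constant.
2. In streaming mode, response time also increases with text length, but across repeated runs on the same text the elapsed time varies from run to run.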
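
To make the run-to-run comparison concrete, each request can be repeated and summarized as minimum, average, and maximum. This is a minimal sketch. RunStats, measure, and summarize are hypothetical helpers, and in practice the Runnable would wrap a call such as doPost(url, param).

```java
import java.util.Arrays;
import java.util.Locale;

final class RunStats {

    /** Runs the task the given number of times and returns each elapsed time in ms. */
    static long[] measure(Runnable task, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            elapsed[i] = (System.nanoTime() - start) / 1_000_000;
        }
        return elapsed;
    }

    /** Formats min, average, and max of the measured times. */
    static String summarize(long[] ms) {
        long min = Arrays.stream(ms).min().orElse(0);
        long max = Arrays.stream(ms).max().orElse(0);
        double avg = Arrays.stream(ms).average().orElse(0);
        return String.format(Locale.ROOT, "runs=%d min=%dms avg=%.1fms max=%dms",
                ms.length, min, avg, max);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the HTTP call being measured.
        long[] times = measure(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        }, 5);
        System.out.println(summarize(times));
    }
}
```

A small spread between min and max would match the API-mode observation, while a wide spread would match the streaming-mode one.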

Summary: writing this up took some effort, and I hope it is helpful!

From: https://blog.csdn.net/zhyooo123/article/details/142105354