
funasr

Date: 2024-10-16 22:23:00
Tags: signal, FunASR, speech, https, com, funasr, recognition

https://www.funasr.com/#/

https://github.com/modelscope/FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

FunASR aims to bridge academic research and industrial applications of speech recognition. By supporting training and fine-tuning of industrial-grade speech recognition models, it lets researchers and developers study and productionize such models more conveniently, and helps the speech recognition ecosystem grow. ASR for Fun!

 

  • FunASR is a fundamental speech recognition toolkit that offers a variety of features, including speech recognition (ASR), Voice Activity Detection (VAD), Punctuation Restoration, Language Models, Speaker Verification, Speaker Diarization and multi-talker ASR. FunASR provides convenient scripts and tutorials, supporting inference and fine-tuning of pre-trained models.
  • We have released a large collection of academic and industrial pretrained models on ModelScope and Hugging Face, which can be accessed through our Model Zoo. The representative Paraformer-large, a non-autoregressive end-to-end speech recognition model, offers high accuracy, high efficiency, and convenient deployment, supporting the rapid construction of speech recognition services. For more details on service deployment, please refer to the service deployment document.

 

 https://cloud.baidu.com/article/3347080

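3. Multilingual support

As globalization advances, multilingual support has become an essential capability for speech recognition technology. FunASR supports Chinese, English, Japanese, and other major languages, and can be customized to user requirements to meet the speech recognition needs of different countries and regions. This makes FunASR broadly applicable in areas such as multinational enterprises and international communication.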

 

demo

https://www.funasr.com/static/offline/index.html

 

tutorial

https://www.cnblogs.com/LaoDie1/p/18183024

https://www.cnblogs.com/v3ucn/p/17956926

 

 

VAD

https://blog.ailemon.net/2021/02/18/introduction-to-vad-theory/

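First, let us pin down the basic concept. A Voice Activity Detection (VAD) algorithm detects whether human speech is present in the current audio signal. It classifies the input signal, separating speech segments from the various kinds of background-noise segments, so that the two kinds of signal can each be processed in a different way.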
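Once each frame has been labeled as speech or non-speech, the labels can be merged into contiguous segments so that the two kinds of signal can be handled separately. The sketch below shows one simple way to do that. `flags_to_segments` and the 10 ms hop are hypothetical names and values chosen for illustration; they are not part of FunASR or of any library mentioned here.

```python
def flags_to_segments(flags, hop_ms):
    """Group consecutive True frame flags into (start_ms, end_ms) speech segments.

    flags  -- one boolean per frame, True meaning "speech"
    hop_ms -- time step between frame starts, in milliseconds (assumed value)
    """
    segments = []
    start = None  # index of the first frame of the segment currently open
    for i, is_speech in enumerate(flags):
        if is_speech and start is None:
            start = i  # a speech segment begins here
        elif not is_speech and start is not None:
            segments.append((start * hop_ms, i * hop_ms))  # segment ends
            start = None
    if start is not None:
        # The signal ended while a speech segment was still open.
        segments.append((start * hop_ms, len(flags) * hop_ms))
    return segments


# Example with made-up frame labels and a 10 ms hop.
example_flags = [False, True, True, False, True]
print(flags_to_segments(example_flags, hop_ms=10))
```

Everything outside the returned intervals is treated as non-speech. Real systems often add smoothing (for example, ignoring very short gaps), which this sketch leaves out.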

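VAD can use many kinds of feature extraction. One of the simplest and most direct is to measure the short-time energy (STE) and the short-time zero-crossing rate (zero-crossing count, ZCC), that is, energy-based features. Short-time energy is the energy of one frame of the speech signal, and the zero-crossing rate is the number of times the time-domain signal of one frame crosses zero (the time axis). In general, a high-accuracy VAD extracts several kinds of features for its decision, including energy-based features, frequency-domain features, cepstral features, harmonic features, and long-term information features [1]. Finally, the features are compared against thresholds, or fed to statistical or machine-learning methods, to decide whether the signal is speech or non-speech.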
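The energy and zero-crossing approach described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the method of the linked article or of FunASR. The function names, the frame length and hop, both thresholds, and the decision rule (high energy together with few zero crossings) are all assumptions, chosen for a synthetic 8 kHz test signal.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames; an incomplete tail is dropped."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop_len)]


def short_time_energy(frame):
    """Energy of one frame: the sum of its squared samples."""
    return sum(x * x for x in frame)


def zero_crossing_count(frame):
    """Number of sign changes between consecutive samples in one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def simple_vad(samples, frame_len=160, hop_len=80,
               energy_thresh=1.0, zc_thresh=40):
    """Return one boolean per frame, True meaning "speech".

    Assumed rule: a frame counts as speech when its energy exceeds
    energy_thresh and it has fewer than zc_thresh zero crossings.
    Both thresholds are illustrative, not tuned values.
    """
    return [
        short_time_energy(f) > energy_thresh and zero_crossing_count(f) < zc_thresh
        for f in frame_signal(samples, frame_len, hop_len)
    ]


# Synthetic input: 0.1 s of silence, 0.1 s of a 200 Hz tone, 0.1 s of silence.
SAMPLE_RATE = 8000
silence = [0.0] * 800
tone = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(800)]
flags = simple_vad(silence + tone + silence)

# Print one character per frame: "S" for speech, "." for non-speech.
print("".join("S" if f else "." for f in flags))
```

Fixed thresholds like these only work when the noise level is known in advance. That is one reason practical detectors combine more features or use the statistical and learned models mentioned above.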

 

From: https://www.cnblogs.com/lightsong/p/18471053
