FunAudioLLM/SenseVoice

Posted: 2024-10-25 16:21:37 · Views: 7
Tags: ago, SenseVoice, FunAudioLLM, months, speech, model, recognition

Repository overview: 2 branches, 85 commits. Latest commit: "OmniSenseVoice" (de00f2b) by LauraGPT, last week.

Folders and files

Name                      Last commit message            Last updated
.github/ISSUE_TEMPLATE    docs                           3 months ago
data                      update                         3 months ago
deepspeed_conf            update                         3 months ago
image                     Third-Party Work               last week
utils                     update onnx codes              3 months ago
.gitignore                update                         3 months ago
LICENSE                   dingding                       3 months ago
README.md                 OmniSenseVoice                 last week
README_ja.md              OmniSenseVoice                 last week
README_zh.md              OmniSenseVoice                 last week
api.py                    fix api torchaudio.load bug    3 months ago
demo1.py                  docs                           3 months ago
demo2.py                  docs                           3 months ago
demo_libtorch.py          python runtime                 3 months ago
demo_onnx.py              python runtime                 3 months ago
export.py                 update onnx codes              3 months ago
export_meta.py            ONNX support                   3 months ago
finetune.sh               sensevoice                     3 months ago
model.py                  bugfix memory leak             last month
requirements.txt          Update requirements.txt        2 months ago
webui.py                  sensevoice                     3 months ago

Introduction

SenseVoice is a speech foundation model with multiple speech understanding capabilities, including automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED).
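
To make the four capabilities concrete, here is a minimal sketch of how a single result string could be split into language (LID), emotion (SER), audio event (AED), and transcript (ASR). It assumes the model returns text prefixed with `<|...|>` tags in that order. The tag layout, the sample string, and the names `RichTranscript` and `parse_rich_transcript` are illustrative assumptions, not taken from the text above, so check them against the repository's README before relying on them.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed tag syntax: each label is wrapped as <|label|> ahead of the transcript.
TAG_PATTERN = re.compile(r"<\|([^|]*)\|>")


@dataclass
class RichTranscript:
    language: Optional[str]  # LID label, e.g. "en"
    emotion: Optional[str]   # SER label, e.g. "HAPPY"
    event: Optional[str]     # AED label, e.g. "Speech"
    text: str                # ASR transcript with all tags removed


def parse_rich_transcript(raw: str) -> RichTranscript:
    """Split a tagged result string into its labels and plain text.

    Assumes the first three tags are language, emotion, and event, in
    that order. Any further tags are dropped from the text and ignored.
    """
    tags = TAG_PATTERN.findall(raw)
    text = TAG_PATTERN.sub("", raw).strip()
    first_three = tags[:3]
    # Pad with None so that missing tags do not raise an IndexError.
    padded = first_three + [None] * (3 - len(first_three))
    return RichTranscript(padded[0], padded[1], padded[2], text)


if __name__ == "__main__":
    # Hypothetical output string, written by hand for illustration.
    sample = "<|en|><|HAPPY|><|Speech|><|withitn|>Hello there."
    print(parse_rich_transcript(sample))
```

Keeping the parser separate from inference means it can be tested on hand-written strings without downloading the model.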

Model Zoo: modelscope, huggingface

Online Demo: modelscope demo, huggingface space

Highlights

From: https://www.cnblogs.com/flyingsir/p/18502804
