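The goal of this exercise: connect an INMP441 microphone to an ESP32-S3 development board, then use MicroPython to capture the audio signal and save it as a PCM file.
Reference example: "Audio transmission with an ESP32 and the INMP441 microphone module" (CSDN blog)
XIAO ESP32S3 board notes: "Microphone usage" | Seeed Studio Wiki
Example obtained by asking an AI: https://skywalk.blog.csdn.net/article/details/144154006
Background
First, a look at the I2S communication modes of the ESP32 family
Mode overview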
Chip | Standard I2S | PDM TX | PDM RX | TDM | ADC/DAC | LCD/Camera |
---|---|---|---|---|---|---|
ESP32 | I2S 0/1 | I2S 0 | I2S 0 | None | I2S 0 | I2S 0 |
ESP32-S2 | I2S 0 | None | None | None | None | I2S 0 |
ESP32-C3 | I2S 0 | I2S 0 | None | I2S 0 | None | None |
ESP32-C6 | I2S 0 | I2S 0 | None | I2S 0 | None | None |
ESP32-S3 | I2S 0/1 | I2S 0 | I2S 0 | I2S 0/1 | None | None |
ESP32-H2 | I2S 0 | I2S 0 | None | I2S 0 | None | None |
ESP32-P4 | I2S 0~2 | I2S 0 | I2S 0 | I2S 0~2 | None | None |
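MicroPython's I2S interface can be connected to a microphone. It transfers audio data over the I2S protocol, so it works with audio input devices such as I2S microphones. For example, the MicroPython I2S Examples project provides a set of code samples showing how to play and record audio over I2S on boards that run MicroPython, including recording from a microphone.
Steps and example code for connecting a microphone
- Install MicroPython: first make sure your microcontroller (ESP32, STM32, etc.) is running MicroPython firmware.
- Configure I2S: when creating the I2S object, set the pins, the mode, the sample rate, the bit depth and the channel format. For example:
i2s = machine.I2S(0, sck=machine.Pin(17), ws=machine.Pin(18), sd=machine.Pin(16), mode=machine.I2S.RX, bits=16, format=machine.I2S.MONO, rate=16000, ibuf=4096)
- Read microphone data: use the I2S readinto() method to read the microphone's audio into a pre-allocated buffer: num_read = i2s.readinto(buf)
Compatibility and use cases
The MicroPython I2S Examples project supports several MicroPython ports and boards, such as STM32, ESP32 and Raspberry Pi Pico. It documents the hardware configuration for each board in detail, which makes it easy to get started with audio recording and playback.
With the steps and example code above, you can connect a microphone to MicroPython's I2S interface and read and process its audio data.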
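Wiring the board to the microphone
- Connect the microphone module's VCC pin to the 3.3V power pin of the ESP32 board.
- Connect the microphone module's GND pin to a GND pin of the ESP32 board.
- Connect the microphone module's SCK (clock) pin to a GPIO pin of the ESP32 board (e.g. GPIO17).
- Connect the microphone module's WS (word select) pin to another GPIO pin of the ESP32 board (e.g. GPIO18).
- Connect the microphone module's SD (data) pin to another GPIO pin of the ESP32 board (e.g. GPIO16).
C example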
#include <Arduino.h>
#include <WiFi.h>        // needed for WiFi.begin() and WiFi.status()
#include <driver/i2s.h>
#include <WiFiUdp.h>
#define I2S_WS 18
#define I2S_SD 16
#define I2S_SCK 17
#define I2S_PORT I2S_NUM_0
#define bufferLen 1024
const char* ssid = "your-wifi-ssid";
const char* password = "your-wifi-password";
const char* host = "receiver-ip-address"; // IP address of the PC that receives the audio
const int port = 8888; // port the receiver listens on
WiFiUDP udp; // audio is sent over UDP
int16_t sBuffer[bufferLen];
void setup() {
  Serial.begin(115200);
  Serial.println("Setup I2S ...");
  // Connect to WiFi
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(600);
    Serial.print("-");
  }
  Serial.println("WiFi connected");
  Serial.println("IP address: ");
  Serial.println(WiFi.localIP());
  // Initialize I2S: master, receive, 16 kHz, 16-bit, left channel only
  i2s_config_t i2s_config = {
    .mode = (i2s_mode_t)(I2S_MODE_MASTER | I2S_MODE_RX),
    .sample_rate = 16000,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_ONLY_LEFT,
    .communication_format = (i2s_comm_format_t)(I2S_COMM_FORMAT_STAND_I2S),
    .intr_alloc_flags = 0,
    .dma_buf_count = 8,
    .dma_buf_len = bufferLen,
    .use_apll = false
  };
  i2s_driver_install(I2S_PORT, &i2s_config, 0, NULL);
  // Route the I2S signals to the pins wired to the microphone
  i2s_pin_config_t pin_config = {
    .bck_io_num = I2S_SCK,
    .ws_io_num = I2S_WS,
    .data_out_num = I2S_PIN_NO_CHANGE,
    .data_in_num = I2S_SD
  };
  i2s_set_pin(I2S_PORT, &pin_config);
  i2s_start(I2S_PORT);
}
void loop() {
  size_t bytesIn = 0;
  // Block until a full buffer of samples has been read
  esp_err_t result = i2s_read(I2S_PORT, &sBuffer, bufferLen * sizeof(int16_t), &bytesIn, portMAX_DELAY);
  if (result == ESP_OK && bytesIn > 0) {
    // Send the audio data to the receiver
    udp.beginPacket(host, port);
    udp.write((uint8_t*)sBuffer, bytesIn);
    udp.endPacket();
  }
}
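Python example
import machine
import network
import usocket as socket
import utime
# Configure WiFi
SSID = 'your-wifi-ssid'
PASSWORD = 'your-wifi-password'
# Receiver address (placeholders, replace with your own values)
RECEIVER_IP = 'receiver-ip-address'
RECEIVER_PORT = 8888
wlan = network.WLAN(network.STA_IF)
wlan.active(True)
wlan.connect(SSID, PASSWORD)
while not wlan.isconnected():
    pass
print('WiFi connected', wlan.ifconfig())
# Configure the I2S interface (adjust to match how your microphone module is wired to the ESP32)
i2s = machine.I2S(
    0,                         # I2S peripheral id
    sck=machine.Pin(17),       # SCK pin
    ws=machine.Pin(18),        # WS pin
    sd=machine.Pin(16),        # SD pin (microphone data in)
    mode=machine.I2S.RX,       # receive mode
    bits=16,                   # 16-bit samples
    format=machine.I2S.MONO,   # mono
    rate=16000,                # 16 kHz sample rate
    ibuf=2048                  # internal buffer size in bytes
)
# Buffer that readinto() fills with audio data
buf = bytearray(2048)
# Create the UDP socket
addr = socket.getaddrinfo('0.0.0.0', 8888)[0][-1]
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(addr)
print("Waiting for audio data stream...")
try:
    while True:
        # Read audio data from the I2S interface
        read_len = i2s.readinto(buf)
        if read_len:
            # This simple example sends the raw audio data as is.
            # Note: UDP is datagram-oriented, so packets may be lost or reordered.
            sock.sendto(buf[:read_len], (RECEIVER_IP, RECEIVER_PORT))
        # Optional short delay to keep the loop from spinning too fast
        utime.sleep_ms(10)
except KeyboardInterrupt:
    print("Program interrupted")
finally:
    # Release resources
    i2s.deinit()
    sock.close()
- Buffer: buf must be a bytearray large enough to hold the audio data read from the I2S interface. The snippet as originally given never defined it; it is defined above as buf = bytearray(2048), and you can size it to suit your needs.
- Network configuration: replace RECEIVER_IP and RECEIVER_PORT with the actual IP address and port of your receiver.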
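The sender above needs something on the PC side to catch the datagrams. The following is a minimal sketch of such a receiver in desktop Python, not part of the original examples: the function name receive_pcm, the 4096-byte receive size and the timeout are assumptions chosen for illustration. It simply appends each UDP payload to a file, so lost or reordered packets are not detected.

```python
import socket


def receive_pcm(sock, out_path, max_packets, timeout_s=2.0):
    """Append up to max_packets UDP payloads to out_path.

    Returns the number of bytes written. Stops early if no packet
    arrives within timeout_s seconds.
    """
    sock.settimeout(timeout_s)
    written = 0
    with open(out_path, "wb") as f:
        for _ in range(max_packets):
            try:
                data, _addr = sock.recvfrom(4096)
            except socket.timeout:
                break  # sender stopped, or the packets were lost
            written += f.write(data)
    return written


# Typical use on the PC (port 8888 matches the examples above):
#   rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
#   rx.bind(("0.0.0.0", 8888))
#   receive_pcm(rx, "stream.pcm", max_packets=500)
```

The resulting file is raw 16-bit mono PCM at whatever sample rate the sender used, so it has to be imported with those parameters given explicitly.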
Example that stores audio to a file
Source: GitCode (open-source code hosting platform)
It can store the recording on an SD card.
# The MIT License (MIT)
# Copyright (c) 2022 Mike Teachman
# https://opensource.org/licenses/MIT
# Purpose: Read audio samples from an I2S microphone and write to SD card
#
# - read 32-bit audio samples from I2S hardware, typically an I2S MEMS Microphone
# - convert 32-bit samples to specified bit size
# - write samples to a SD card file in WAV format
# - samples will be continuously written to the WAV file
# until a keyboard interrupt (ctrl-c) is detected
#
# Blocking version
# - the readinto() method blocks until
# the supplied buffer is filled
import os
from machine import Pin
from machine import I2S
if os.uname().machine.count("PYBv1"):
    # ======= I2S CONFIGURATION =======
    SCK_PIN = "Y6"
    WS_PIN = "Y5"
    SD_PIN = "Y8"
    I2S_ID = 2
    BUFFER_LENGTH_IN_BYTES = 40000
    # ======= I2S CONFIGURATION =======
elif os.uname().machine.count("PYBD"):
    import pyb
    pyb.Pin("EN_3V3").on()  # provide 3.3V on 3V3 output pin
    os.mount(pyb.SDCard(), "/sd")
    # ======= I2S CONFIGURATION =======
    SCK_PIN = "Y6"
    WS_PIN = "Y5"
    SD_PIN = "Y8"
    I2S_ID = 2
    BUFFER_LENGTH_IN_BYTES = 40000
    # ======= I2S CONFIGURATION =======
elif os.uname().machine.count("ESP32"):
    from machine import SDCard
    sd = SDCard(slot=2)  # sck=18, mosi=23, miso=19, cs=5
    os.mount(sd, "/sd")
    # ======= I2S CONFIGURATION =======
    SCK_PIN = 32
    WS_PIN = 25
    SD_PIN = 33
    I2S_ID = 0
    BUFFER_LENGTH_IN_BYTES = 40000
    # ======= I2S CONFIGURATION =======
elif os.uname().machine.count("Raspberry"):
    from sdcard import SDCard
    from machine import SPI
    # Pin and SPI are imported by name, so use them directly (the machine module itself is not imported)
    cs = Pin(13, Pin.OUT)
    spi = SPI(
        1,
        baudrate=1_000_000,  # this has no effect on spi bus speed to SD Card
        polarity=0,
        phase=0,
        bits=8,
        firstbit=SPI.MSB,
        sck=Pin(14),
        mosi=Pin(15),
        miso=Pin(12),
    )
    sd = SDCard(spi, cs)
    sd.init_spi(25_000_000)  # increase SPI bus speed to SD card
    os.mount(sd, "/sd")
    # ======= I2S CONFIGURATION =======
    SCK_PIN = 16
    WS_PIN = 17
    SD_PIN = 18
    I2S_ID = 0
    BUFFER_LENGTH_IN_BYTES = 60000  # larger buffer to accommodate slow SD card driver
    # ======= I2S CONFIGURATION =======
elif os.uname().machine.count("MIMXRT"):
    from machine import SDCard
    sd = SDCard(1)  # Teensy 4.1: sck=45, mosi=43, miso=42, cs=44
    os.mount(sd, "/sd")
    # ======= I2S CONFIGURATION =======
    SCK_PIN = 21
    WS_PIN = 20
    SD_PIN = 8
    I2S_ID = 1
    BUFFER_LENGTH_IN_BYTES = 40000
    # ======= I2S CONFIGURATION =======
else:
    print("Warning: program not tested with this board")
# ======= AUDIO CONFIGURATION =======
WAV_FILE = "mic.wav"
RECORD_TIME_IN_SECONDS = 10
WAV_SAMPLE_SIZE_IN_BITS = 16
FORMAT = I2S.MONO
SAMPLE_RATE_IN_HZ = 22_050
# ======= AUDIO CONFIGURATION =======
format_to_channels = {I2S.MONO: 1, I2S.STEREO: 2}
NUM_CHANNELS = format_to_channels[FORMAT]
WAV_SAMPLE_SIZE_IN_BYTES = WAV_SAMPLE_SIZE_IN_BITS // 8
RECORDING_SIZE_IN_BYTES = (
RECORD_TIME_IN_SECONDS * SAMPLE_RATE_IN_HZ * WAV_SAMPLE_SIZE_IN_BYTES * NUM_CHANNELS
)
def create_wav_header(sampleRate, bitsPerSample, num_channels, num_samples):
    datasize = num_samples * num_channels * bitsPerSample // 8
    o = bytes("RIFF", "ascii")  # (4byte) Marks file as RIFF
    o += (datasize + 36).to_bytes(
        4, "little"
    )  # (4byte) File size in bytes excluding this and RIFF marker
    o += bytes("WAVE", "ascii")  # (4byte) File type
    o += bytes("fmt ", "ascii")  # (4byte) Format Chunk Marker
    o += (16).to_bytes(4, "little")  # (4byte) Length of above format data
    o += (1).to_bytes(2, "little")  # (2byte) Format type (1 - PCM)
    o += (num_channels).to_bytes(2, "little")  # (2byte)
    o += (sampleRate).to_bytes(4, "little")  # (4byte)
    o += (sampleRate * num_channels * bitsPerSample // 8).to_bytes(4, "little")  # (4byte)
    o += (num_channels * bitsPerSample // 8).to_bytes(2, "little")  # (2byte)
    o += (bitsPerSample).to_bytes(2, "little")  # (2byte)
    o += bytes("data", "ascii")  # (4byte) Data Chunk Marker
    o += (datasize).to_bytes(4, "little")  # (4byte) Data size in bytes
    return o
wav = open("/sd/{}".format(WAV_FILE), "wb")
# create header for WAV file and write to SD card
wav_header = create_wav_header(
SAMPLE_RATE_IN_HZ,
WAV_SAMPLE_SIZE_IN_BITS,
NUM_CHANNELS,
SAMPLE_RATE_IN_HZ * RECORD_TIME_IN_SECONDS,
)
num_bytes_written = wav.write(wav_header)
audio_in = I2S(
I2S_ID,
sck=Pin(SCK_PIN),
ws=Pin(WS_PIN),
sd=Pin(SD_PIN),
mode=I2S.RX,
bits=WAV_SAMPLE_SIZE_IN_BITS,
format=FORMAT,
rate=SAMPLE_RATE_IN_HZ,
ibuf=BUFFER_LENGTH_IN_BYTES,
)
# allocate sample arrays
# memoryview used to reduce heap allocation in while loop
mic_samples = bytearray(10000)
mic_samples_mv = memoryview(mic_samples)
num_sample_bytes_written_to_wav = 0
print("Recording size: {} bytes".format(RECORDING_SIZE_IN_BYTES))
print("========== START RECORDING ==========")
try:
    while num_sample_bytes_written_to_wav < RECORDING_SIZE_IN_BYTES:
        # read a block of samples from the I2S microphone
        num_bytes_read_from_mic = audio_in.readinto(mic_samples_mv)
        if num_bytes_read_from_mic > 0:
            num_bytes_to_write = min(
                num_bytes_read_from_mic, RECORDING_SIZE_IN_BYTES - num_sample_bytes_written_to_wav
            )
            # write samples to WAV file
            num_bytes_written = wav.write(mic_samples_mv[:num_bytes_to_write])
            num_sample_bytes_written_to_wav += num_bytes_written
    print("========== DONE RECORDING ==========")
except (KeyboardInterrupt, Exception) as e:
    print("caught exception {} {}".format(type(e).__name__, e))
# cleanup
wav.close()
if os.uname().machine.count("PYBD"):
    os.umount("/sd")
elif os.uname().machine.count("ESP32"):
    os.umount("/sd")
    sd.deinit()
elif os.uname().machine.count("Raspberry"):
    os.umount("/sd")
    spi.deinit()
elif os.uname().machine.count("MIMXRT"):
    os.umount("/sd")
    sd.deinit()
audio_in.deinit()
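To check a recorded file on a PC, the 44-byte header that create_wav_header builds can be decoded again with the struct module. This is a minimal sketch, assuming the canonical layout used above (a 16-byte fmt chunk followed directly by the data chunk); parse_wav_header is a hypothetical helper name and is not part of the original example.

```python
import struct

# Little-endian layout of the canonical 44-byte PCM WAV header
_WAV_HEADER = "<4sI4s4sIHHIIHH4sI"


def parse_wav_header(header):
    """Decode the first 44 bytes of a PCM WAV file into a dict."""
    if len(header) < 44:
        raise ValueError("need at least 44 bytes")
    (riff, riff_size, wave_id, fmt_id, fmt_len, audio_format, channels,
     sample_rate, byte_rate, block_align, bits, data_id, data_size) = struct.unpack(
        _WAV_HEADER, header[:44]
    )
    if (riff, wave_id, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical PCM WAV header")
    return {
        "riff_size": riff_size,      # file size minus the first 8 bytes
        "audio_format": audio_format,  # 1 means PCM
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits": bits,
        "data_size": data_size,
    }
```

For the settings in this example (22,050 Hz, 16-bit, mono, 10 seconds) the data size should come out as 441,000 bytes and the RIFF size as 441,036.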
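Hands-on
First, look at the ESP32 board's pinout diagram
As I recall, some pins are reserved or have special functions, but in general you won't run into problems, so you can just use them. Even so, I went through a lot of material trying to decide which pins to use, and finally settled on pins 16, 17 and 18.
Part 1: wiring for the C and Python examples
Both use pins 17, 18 and 16, the same as in: "Audio transmission with an ESP32 and the INMP441 microphone module" (CSDN blog)
- Connect the microphone module's VCC pin to the 3.3V power pin of the ESP32 board.
- Connect the microphone module's GND pin to a GND pin of the ESP32 board.
- Connect the microphone module's SCK (clock) pin to a GPIO pin of the ESP32 board (e.g. GPIO17).
- Connect the microphone module's WS (word select) pin to another GPIO pin of the ESP32 board (e.g. GPIO18).
- Connect the microphone module's SD (data) pin to another GPIO pin of the ESP32 board (e.g. GPIO16).
Part 2: wiring for the Python file-recording example
It is wired to pins 32, 25 and 33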
# ======= I2S CONFIGURATION =======
SCK_PIN = 32
WS_PIN = 25
SD_PIN = 33
I2S_ID = 0
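Wiring found online
Well, this one is from my own post again, where I asked ERNIE Bot: "How can MicroPython record from a microphone and then call the Baidu speech recognition API to recognize the speech? Please write out the steps in detail."
https://skywalk.blog.csdn.net/article/details/144154006
ESP32 I2S pin example (in practice, follow your own board's pinout):
- SCK to ESP32 GPIO14
- WS to ESP32 GPIO15
- SD to ESP32 GPIO32
- VCC to ESP32 3.3V
- GND to ESP32 GND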
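The pins I decided on after much deliberation:
I simply could not find pins 32, 25 and 33 from the file-recording example on my board, so in the end I went with pins 16, 17 and 18:
sck=machine.Pin(17), # SCK pin
ws=machine.Pin(18), # WS pin
sd=machine.Pin(16), # SD pin (microphone data in)
Start coding
MicroPython examples for connecting a microphone to an ESP32 are hard to find online, so I used the example code that ERNIE Bot gave me: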
# ESP32 recording example
# Initialize I2S and record audio
from machine import I2S, Pin
import os
def init_i2s():
    i2s = I2S(
        I2S.NUM0,           # use I2S0
        sck=Pin(17),        # serial clock pin
        ws=Pin(18),         # word select (LRCLK) pin
        sd=Pin(16),         # serial data pin
        mode=I2S.RX,        # receive mode
        bits=16,            # 16-bit samples
        format=I2S.MONO,    # mono
        rate=16000,         # 16 kHz sample rate
        ibuf=4096           # buffer size
    )
    return i2s
# Audio recording function:
def record_audio(i2s, duration_seconds, file_name):
    wav_data = bytearray()
    total_bytes_read = 0
    samples_per_second = 16000
    bytes_per_sample = 2  # 16-bit
    total_samples = samples_per_second * duration_seconds
    total_bytes = total_samples * bytes_per_sample
    buffer = bytearray(1024)
    try:
        while total_bytes_read < total_bytes:
            num_bytes_read = i2s.readinto(buffer)
            if num_bytes_read:
                wav_data.extend(buffer[:num_bytes_read])
                total_bytes_read += num_bytes_read
    finally:
        i2s.deinit()
    # Save to a file
    with open(file_name, 'wb') as f:
        f.write(wav_data)
# Usage:
i2s = init_i2s()
record_audio(i2s, duration_seconds=5, file_name='audio.pcm')
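Running this code failed with: MemoryError: memory allocation failed, allocating 64512 byte
With the recording time set to 1 second, it recorded successfully. I also tried changing the I2S buffer size along the way, but that did not fix the error when recording 5 seconds.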
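A quick calculation shows why 1 second fits and 5 seconds does not. Raw PCM size is sample rate times bytes per sample times channels times duration. The sketch below is plain arithmetic; the helper name pcm_size_bytes is made up for illustration, and how much heap is actually free depends on the board and firmware.

```python
def pcm_size_bytes(rate_hz, bits, channels, seconds):
    """Size of uncompressed PCM audio in bytes."""
    return rate_hz * (bits // 8) * channels * seconds


one_second = pcm_size_bytes(16000, 16, 1, 1)
five_seconds = pcm_size_bytes(16000, 16, 1, 5)
print(one_second, five_seconds)  # 32000 160000
```

Holding 160,000 bytes in a single growing bytearray needs one contiguous block of at least that size, and the failed 64,512-byte request was presumably one of the intermediate growth steps on the way there.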
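When I later went back to ERNIE Bot, I found that its original answer had already anticipated a possible out-of-memory problem. At the time it said:
Possible problems and solutions
MicroPython does not support HTTPS requests
HTTPS request to the speech recognition API:
Out of memory:
Recording longer audio may exhaust the available memory
Solution: write the recorded data straight to a file instead of keeping all of it in memory.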
def record_audio(i2s, duration_seconds, file_name):
    with open(file_name, 'wb') as f:
        total_bytes_read = 0
        samples_per_second = 16000
        bytes_per_sample = 2  # 16-bit
        total_samples = samples_per_second * duration_seconds
        total_bytes = total_samples * bytes_per_sample
        buffer = bytearray(1024)
        try:
            while total_bytes_read < total_bytes:
                num_bytes_read = i2s.readinto(buffer)
                if num_bytes_read:
                    f.write(buffer[:num_bytes_read])
                    total_bytes_read += num_bytes_read
        finally:
            i2s.deinit()
See this post for details: https://skywalk.blog.csdn.net/article/details/144154006
Revising the code: the final version
Following that advice, the final code is:
from machine import I2S, Pin
import os
def init_i2s():
    i2s = I2S(
        0,                  # use I2S0
        sck=Pin(17),        # serial clock pin
        ws=Pin(18),         # word select (LRCLK) pin
        sd=Pin(16),         # serial data pin
        mode=I2S.RX,        # receive mode
        bits=16,            # 16-bit samples
        format=I2S.MONO,    # mono
        rate=16000,         # 16 kHz sample rate
        # ibuf=4096         # buffer size
        ibuf=2048
    )
    return i2s
# Audio recording function:
def record_audio(i2s, duration_seconds, file_name):
    with open(file_name, 'wb') as f:
        total_bytes_read = 0
        samples_per_second = 16000
        bytes_per_sample = 2  # 16-bit
        total_samples = samples_per_second * duration_seconds
        total_bytes = total_samples * bytes_per_sample
        buffer = bytearray(1024)
        try:
            while total_bytes_read < total_bytes:
                num_bytes_read = i2s.readinto(buffer)
                if num_bytes_read:
                    # Write each chunk to the file right away instead of accumulating it in RAM
                    f.write(buffer[:num_bytes_read])
                    total_bytes_read += num_bytes_read
        finally:
            i2s.deinit()
i2s = init_i2s()
record_audio(i2s, duration_seconds=5, file_name='audio.pcm')
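This version can finally record 5 seconds of audio! The recording is stored in the root directory of the ESP32-S3 board under the name audio.pcm
Playing back the recording
Once recording finishes, an audio.pcm file of about 157k is created on the ESP32 board. Connect the board to the PC through WebREPL and download the file. For details see: "ESP32-C3 board with MicroPython WebREPL for interactive programming from a browser" (CSDN blog). After startup you can connect to the board from a browser.
On the PC, open audio.pcm in an audio editor such as Audacity. Import it as "Raw Data" and click "Detect" so that the software tries to identify the encoding, sample rate and so on. If detection gets it wrong, set the values by hand: 16000 Hz sample rate, signed 16-bit PCM encoding, mono, and "default endianness" for the byte order. Then click the green triangular play button to hear the recording.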
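An alternative to entering these parameters by hand at every import is to wrap the raw PCM in a WAV container on the PC, so that the sample rate and sample format travel with the file. The following is a small sketch using Python's standard wave module. The function name pcm_to_wav and the file names are illustrative, and the defaults assume the recording settings used here (16 kHz, 16-bit, mono, little-endian).

```python
import wave


def pcm_to_wav(pcm_path, wav_path, rate_hz=16000, channels=1, sample_width=2):
    """Wrap raw little-endian PCM in a WAV container; return the frame count."""
    with open(pcm_path, "rb") as f:
        pcm = f.read()
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)  # bytes per sample, 2 = 16-bit
        w.setframerate(rate_hz)
        w.writeframes(pcm)
    return len(pcm) // (channels * sample_width)


# Typical use after downloading the file from the board:
#   pcm_to_wav("audio.pcm", "audio.wav")
```

The resulting audio.wav should open directly in ordinary media players.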
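If the sample rate is set wrong, for example to 44100, the audio plays back shorter than it really is. If the byte order is set to big-endian, the sound is louder but also noticeably noisier. If the sample encoding is wrong, all you hear is noise.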
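Both effects can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and assumes a 160,000-byte recording (5 seconds at 16 kHz, 16-bit, mono); the name playback_seconds is made up for this example.

```python
import struct


def playback_seconds(num_bytes, rate_hz, bytes_per_sample=2, channels=1):
    """Playback duration of raw PCM when interpreted at rate_hz."""
    return num_bytes / (rate_hz * bytes_per_sample * channels)


correct = playback_seconds(160000, 16000)   # 5.0 seconds
too_fast = playback_seconds(160000, 44100)  # about 1.81 seconds

# A quiet little-endian sample misread as big-endian becomes much larger:
quiet = struct.pack("<h", 3)                # bytes 03 00
as_little = struct.unpack("<h", quiet)[0]   # 3
as_big = struct.unpack(">h", quiet)[0]      # 768
```

Swapping the bytes moves the low-order byte, which mostly carries fine detail and noise, into the high-order position. That is consistent with the louder, noisier playback described above.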
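That wraps up this exercise: recording microphone audio on an ESP32 board with MicroPython worked. Next, I will try to implement speech recognition.
Debugging
i2s = init_i2s() fails because I2S.NUM0 does not exist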
>>> i2s = init_i2s()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in init_i2s
AttributeError: type object 'I2S' has no attribute 'NUM0'
Fix: change I2S.NUM0 to 0
Error: memory allocation failed
>>> record_audio(i2s, duration_seconds=5, file_name='audio.pcm')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 13, in record_audio
MemoryError: memory allocation failed, allocating 64512 byte
The recording probably needs too much memory, for example more than 64k, so try recording just 1 second:
i2s = init_i2s()
record_audio(i2s, duration_seconds=1, file_name='audio.pcm')
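1 second works! The file is now 32k in size.
I then tried shrinking the buffer in the I2S initialization, changing ibuf=4096 to ibuf=1024
That did not solve the problem. What eventually worked was ERNIE Bot's suggestion to read a little and write a little: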
def record_audio(i2s, duration_seconds, file_name):
    with open(file_name, 'wb') as f:
        total_bytes_read = 0
        samples_per_second = 16000
        bytes_per_sample = 2  # 16-bit
        total_samples = samples_per_second * duration_seconds
        total_bytes = total_samples * bytes_per_sample
        buffer = bytearray(1024)
        try:
            while total_bytes_read < total_bytes:
                num_bytes_read = i2s.readinto(buffer)
                if num_bytes_read:
                    f.write(buffer[:num_bytes_read])
                    total_bytes_read += num_bytes_read
        finally:
            i2s.deinit()
Problem solved: recording 5 seconds now works without any issue.
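One detail of the read-and-write loop: it always writes whole 1024-byte chunks, so if readinto fills the buffer on every call, a 5-second recording ends at 157 × 1024 = 160,768 bytes rather than exactly 160,000. The sketch below shows the same streaming idea with the last chunk clamped, in the way the SD card example earlier does with min(). It runs on a PC because the reader is passed in as a function; fake_readinto is a made-up stand-in for i2s.readinto that just delivers silence.

```python
import io


def record_to_file(readinto, f, total_bytes, chunk_size=1024):
    """Stream exactly total_bytes from readinto() into file object f."""
    buf = bytearray(chunk_size)
    view = memoryview(buf)  # slicing a memoryview avoids copying the buffer
    written = 0
    while written < total_bytes:
        n = readinto(view)
        if n:
            n = min(n, total_bytes - written)  # clamp the final chunk
            written += f.write(view[:n])
    return written


def fake_readinto(view):
    """Hypothetical stand-in for i2s.readinto: fills the buffer with zeros."""
    view[:] = bytes(len(view))
    return len(view)


out = io.BytesIO()
total = record_to_file(fake_readinto, out, 160000)
print(total, len(out.getvalue()))  # 160000 160000
```

On the board, passing i2s.readinto and an open file in place of the stand-ins should behave the same way, with memory use staying at one chunk no matter how long the recording is.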
From: https://blog.csdn.net/skywalk8163/article/details/144137782