How can I reproduce a Matlab file read in Python with NumPy to decode a .dat file?

Tags: python numpy matlab

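I have a Matlab script that reads an encoded .dat file, decodes it, and saves the result. I am trying to port it to Python using NumPy. For the same file, I get different output (the numbers from Python make no sense). The code originally ran as part of a script that read from a serial port, which is why the data is structured this way.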

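At first I thought the bit shifts were the problem, because of the indexing difference and because NumPy splits right and left shifts into separate functions, but that does not seem to be the case. I also tried separating the bit shift from the dot product to confirm they were done in the right order, but that was not the problem either. I then replaced Python's .read() with np.fromfile(), thinking that reading without specifying the uint8 format might corrupt the data. While debugging at that point, I noticed that packet_data contains different values from the ones I get in Matlab. I had assumed that reshaping the data was scrambling it, but the matrix actually contains entirely different values. Of course, this also means packet_data_bytes is completely wrong. I don't know why, but the same file gives different values when read in Python than when read in Matlab. I'm not sure what differs between the read functions, or whether it has to do with how I open the file in my script.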

Here is the code in Matlab and in Python.

Matlab:

% Define input and output file paths
inputFilePath = 'C:/Users/x/Downloads/Raw.dat'
outputFilePath = 'C:/Users/x/Downloads/decoded.dat'


% Open the input file
fileID = fopen(inputFilePath, 'r');

% Open the output file
outputFileID = fopen(outputFilePath, 'w');

% Check if files are successfully opened
if fileID == -1
    error('Cannot open the input file.');
end

if outputFileID == -1
    error('Cannot open the output file.');
end

% Constants
packetNumberBytesLength = 4;
packetDataBytesRows = 12;
packetDataBytesCols = 1024;
packetDataMasks = [1,2,4,8,16,32];
numSamples = 6;
numChannels = 1024;

% Read and process each packet
while ~feof(fileID)
    % Read packet number bytes
    packetNumberBytes = fread(fileID, packetNumberBytesLength, 'uint8');
    if numel(packetNumberBytes) < packetNumberBytesLength
        break;
    end
    
    % Read packet data bytes
    packetDataBytes = fread(fileID, [packetDataBytesRows, packetDataBytesCols], 'uint8');
    if numel(packetDataBytes) < packetDataBytesRows * packetDataBytesCols
        break;
    end
    
    % Decode packet number
    packetNumber = [1677216, 65536, 256, 1] * packetNumberBytes;
    
    % Decoding packet data
    Samples = zeros(numSamples, numChannels);
    for n = 1:numSamples
        Samples(n, :) = [2048,1024,512,256,128,64,32,16,8,4,2,1] * bitshift(bitand(packetDataBytes, packetDataMasks(n)), 1-n);
        Samples(n, numChannels) = 0; % Invalid sample in case of Single Cell scan mode
    end

    % Get current date and time
    currentDateTime = datestr(now, 'dd-mmm-yyyy HH:MM:SS.FFF');
    
    % Write decoded data to output file
    writematrix(currentDateTime, outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
    writematrix([packetNumber, 1], outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
    writematrix(Samples(1, :), outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
    writematrix([3], outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
    writematrix(Samples(3, :), outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
    writematrix([5], outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
    writematrix(Samples(5, :), outputFilePath, 'Delimiter', 'tab', 'WriteMode', 'append');
end

% Close the input and output files
fclose(fileID);
fclose(outputFileID);

Python:

import numpy as np
import datetime

# Define input and output file paths
input_file_path = 'C:/Users/x/Downloads/Raw.dat'
output_file_path = 'C:/Users/x/Downloads/decoded.dat'

# Constants
packet_number_bytes_length = 4
packet_data_bytes_rows = 12
packet_data_bytes_cols = 1024
num_samples = 6
num_channels = 1024
packet_data_masks = [1, 2, 4, 8, 16, 32]

# Open the input and output files
try:
    with open(input_file_path, 'rb') as input_file, open(output_file_path, 'w') as output_file:

        # Read and process each packet
        while True:
            # Read packet number bytes
            packet_number_bytes = np.fromfile(input_file, dtype=np.uint8, count=packet_number_bytes_length)
            if len(packet_number_bytes) < packet_number_bytes_length:
                break

            # Read packet data bytes
            packet_data_bytes = np.fromfile(input_file, dtype=np.uint8,
                                            count=packet_data_bytes_rows * packet_data_bytes_cols)
            if len(packet_data_bytes) < packet_data_bytes_rows * packet_data_bytes_cols:
                break

            # Reshape the packet data into a 2D array
            packet_data = packet_data_bytes.reshape((packet_data_bytes_rows, packet_data_bytes_cols))

            # Decode packet number
            packet_number = np.dot([1677216, 65536, 256, 1], np.frombuffer(packet_number_bytes, dtype=np.uint8))

            # Decode packet data
            Samples = np.zeros((num_samples, num_channels), dtype=int)
            for n in range(num_samples):
                Samples[n, :] = np.dot(
                    [2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1],
                    np.right_shift(np.bitwise_and(packet_data, packet_data_masks[n]), n)
                )
            Samples[:, num_channels - 1] = 0  # Invalid sample in case of Single Cell scan mode

            # Get current date and time
            current_date_time = datetime.datetime.now().strftime('%d-%b-%Y %H:%M:%S.%f')[:-3]

            # Write decoded data to output file
            output_file.write(current_date_time + '\n')
            output_file.write(f"{packet_number}\t1\n")
            np.savetxt(output_file, Samples[0, :].reshape(1, -1), delimiter='\t', fmt='%d')
            output_file.write('3\n')
            np.savetxt(output_file, Samples[2, :].reshape(1, -1), delimiter='\t', fmt='%d')
            output_file.write('5\n')
            np.savetxt(output_file, Samples[4, :].reshape(1, -1), delimiter='\t', fmt='%d')

except FileNotFoundError as e:
    print(f"Error: {e}")

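The mismatch when reading and decoding the .dat file is most likely caused by array layout rather than by the read itself: MATLAB's fread fills a [12, 1024] matrix column by column (column-major order), while NumPy's reshape fills it row by row (row-major, C order) by default.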

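Problem:

Array layout: fread(fileID, [packetDataBytesRows, packetDataBytesCols], 'uint8') stores the bytes in column-major order, so the first 12 bytes of a packet become column 1. packet_data_bytes.reshape((packet_data_bytes_rows, packet_data_bytes_cols)) uses row-major order, so the first 1024 bytes become row 0. Both scripts read the same bytes, but the bytes land in different positions, which is why packet_data does not match the MATLAB matrix.
Byte order: endianness is unlikely to be the cause. Both scripts read single bytes (uint8), which have no byte order, and both fread and NumPy default to the machine's native byte order for multi-byte types. The packet number is assembled by hand from four bytes, most significant first, in the same way in both scripts.
Bit shifts: np.right_shift(np.bitwise_and(packet_data, packet_data_masks[n]), n) is the direct counterpart of bitshift(bitand(packetDataBytes, packetDataMasks(n)), 1-n). It operates on individual bytes, so byte order does not affect it either.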
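One thing worth checking independently of byte order is where each byte of a packet lands after the reshape. The sketch below is a minimal illustration, assuming a 12 × 1024 data block as in the question. Instead of real data it uses each byte's offset within the packet (an illustrative stand-in), so the output shows which file position ends up in which matrix cell under NumPy's default C order and under Fortran order, the order in which MATLAB's fread fills a matrix.

```python
import numpy as np

rows, cols = 12, 1024

# Stand-in for real data: each element is its own offset within the packet.
offsets = np.arange(rows * cols)

# NumPy default (C order): the first 1024 bytes become row 0.
c_order = offsets.reshape((rows, cols))

# Fortran order, as used by MATLAB's fread: the first 12 bytes become column 0.
f_order = offsets.reshape((rows, cols), order='F')

print(c_order[:3, 0].tolist())  # [0, 1024, 2048]
print(f_order[:3, 0].tolist())  # [0, 1, 2]

# An equivalent way to get the MATLAB layout: reshape to (cols, rows), then transpose.
same_layout = np.array_equal(f_order, offsets.reshape((cols, rows)).T)
print(same_layout)  # True
```

In the question's script this corresponds to passing order='F' to the existing reshape call; the reads themselves can stay as they are.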

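Solution:

Reshape in column-major order: keep the uint8 reads as they are and pass order='F' to reshape, so that the matrix is filled the way fread fills it:

    packet_data = packet_data_bytes.reshape((packet_data_bytes_rows, packet_data_bytes_cols), order='F')

Packet number: optionally, the four header bytes can be read in one step with np.fromfile(input_file, dtype='>u4', count=1). Here '>u4' means an unsigned 32-bit integer ('u4') in big-endian byte order ('>'), which matches combining the bytes most significant first. Note that this weights the first byte by 16777216 (256^3), whereas both posted scripts use 1677216, which looks like a typo; the two results agree only while the first byte is 0.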
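To see what the dtype string changes, the sketch below decodes the same four bytes three ways: by hand with explicit weights, as '>u4', and as '<u4'. The header bytes and the helper name decode_three_ways are made up for illustration. It also shows that the weight 1677216 used in both posted scripts is not 256^3, so a '>u4' read and the posted dot product would disagree whenever the first byte is nonzero.

```python
import numpy as np

def decode_three_ways(header):
    """Decode 4 bytes by hand, as big-endian uint32, and as little-endian uint32."""
    as_bytes = np.frombuffer(header, dtype=np.uint8).astype(np.int64)
    manual = int(np.dot([16777216, 65536, 256, 1], as_bytes))
    big = int(np.frombuffer(header, dtype='>u4')[0])
    little = int(np.frombuffer(header, dtype='<u4')[0])
    return manual, big, little

# Hypothetical header bytes 00 01 02 03.
result = decode_three_ways(bytes([0, 1, 2, 3]))
print(result)  # (66051, 66051, 50462976)

# With only the first byte set, the posted weights give 1677216, not 256**3.
first_byte_set = np.array([1, 0, 0, 0], dtype=np.int64)
posted = int(np.dot([1677216, 65536, 256, 1], first_byte_set))
print(posted, 256 ** 3)  # 1677216 16777216
```

If the goal is bit-for-bit agreement with the existing MATLAB output, keeping the posted weights is the safer choice until the MATLAB script is corrected as well.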

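Simplifying the decode: NumPy's unpackbits function can replace the manual shift-and-mask step. This is optional; the original np.right_shift / np.bitwise_and loop is also correct once packet_data has the right layout.

```python
# bits[r, c, n] is bit n (least significant first) of packet_data[r, c]
packet_data_bits = np.unpackbits(packet_data[:, :, np.newaxis], axis=2, bitorder='little')

Samples = np.zeros((num_samples, num_channels), dtype=int)
for n in range(num_samples):
    Samples[n, :] = np.dot(
        [2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1],
        packet_data_bits[:, :, n]  # 12 x 1024 matrix of 0/1 values for sample n
    )
```

This code first unpacks every byte of packet_data into its bits along a new last axis, then uses a matrix product to compute Samples. It assumes packet_data was reshaped with order='F' and requires NumPy 1.17 or later for the bitorder argument.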
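As a sanity check on the bit extraction, the following sketch compares shift-and-mask decoding with np.unpackbits on random bytes. It assumes the bit layout implied by the question's loop (bit n of each byte belongs to sample n, and the 12 rows carry weights 2048 down to 1); the random input is illustrative, not real packet data.

```python
import numpy as np

rng = np.random.default_rng(0)
packet_data = rng.integers(0, 256, size=(12, 1024), dtype=np.uint8)
weights = np.array([2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1])

# Shift-and-mask, as in the question's loop: isolate bit n, then weight the rows.
by_shift = np.stack([weights @ ((packet_data >> n) & 1) for n in range(6)])

# unpackbits along a new last axis: bits[r, c, n] is bit n of packet_data[r, c].
bits = np.unpackbits(packet_data[:, :, np.newaxis], axis=2, bitorder='little')
by_unpack = np.stack([weights @ bits[:, :, n] for n in range(6)])

print(by_shift.shape)                      # (6, 1024)
print(np.array_equal(by_shift, by_unpack))  # True
```

Both variants give the same result, so the choice between them is a matter of readability rather than correctness.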

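Modified Python code:

import numpy as np
import datetime

# ... (paths and constants unchanged)

with open(input_file_path, 'rb') as input_file, open(output_file_path, 'w') as output_file:
    while True:
        # Read the packet number as one big-endian uint32.
        # Note: this weights the first byte by 16777216, not by the 1677216 used in the posted scripts.
        packet_number_bytes = np.fromfile(input_file, dtype='>u4', count=1)
        if len(packet_number_bytes) < 1:
            break
        packet_number = int(packet_number_bytes[0])  # extract the integer value

        # ... (read packet_data_bytes as before)

        # Fill the matrix column by column, as MATLAB's fread does
        packet_data = packet_data_bytes.reshape(
            (packet_data_bytes_rows, packet_data_bytes_cols), order='F')

        # Decode packet data with unpackbits: bits[r, c, n] is bit n of packet_data[r, c]
        packet_data_bits = np.unpackbits(packet_data[:, :, np.newaxis], axis=2, bitorder='little')

        Samples = np.zeros((num_samples, num_channels), dtype=int)
        for n in range(num_samples):
            Samples[n, :] = np.dot(
                [2048, 1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1],
                packet_data_bits[:, :, n]
            )

        # ... (rest unchanged)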

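With these changes, the Python code should read and decode the .dat file and produce the same Samples values as the MATLAB code. Comparing the first few columns of packet_data against packetDataBytes in MATLAB is a quick way to confirm that the layouts now match.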
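Without access to the original file, one way to gain confidence in a port is a round trip on synthetic data. The sketch below is an illustration under stated assumptions, not the original acquisition code: encode_packet and read_packets are hypothetical helpers, and the packet layout (4 header bytes, most significant first, followed by 12 × 1024 data bytes stored column by column, with bit n of each byte belonging to sample n) is inferred from the question's decoding loop. The zeroing of the last channel and the text output are left out for brevity.

```python
import os
import tempfile
import numpy as np

ROWS, COLS, NUM_SAMPLES = 12, 1024, 6
WEIGHTS = 2 ** np.arange(ROWS - 1, -1, -1, dtype=np.int64)  # 2048, 1024, ..., 1

def encode_packet(number, samples):
    """Hypothetical encoder, used only to build test data."""
    header = np.array([number], dtype='>u4').tobytes()
    data = np.zeros((ROWS, COLS), dtype=np.uint8)
    for n in range(NUM_SAMPLES):
        for r in range(ROWS):
            # Bit (11 - r) of sample n goes into bit n of the byte in row r.
            bit = (samples[n] >> (ROWS - 1 - r)) & 1
            data[r] |= (bit << n).astype(np.uint8)
    # Column-major byte stream: 12 bytes for channel 0, then channel 1, ...
    return header + data.tobytes(order='F')

def read_packets(path):
    """Decode every complete packet in the file."""
    packets = []
    with open(path, 'rb') as f:
        while True:
            header = f.read(4)
            body = f.read(ROWS * COLS)
            if len(header) < 4 or len(body) < ROWS * COLS:
                break
            head = np.frombuffer(header, dtype=np.uint8).astype(np.int64)
            number = int(np.dot([16777216, 65536, 256, 1], head))
            # order='F' mirrors how MATLAB's fread fills a [12, 1024] matrix.
            data = np.frombuffer(body, dtype=np.uint8).reshape((ROWS, COLS), order='F')
            samples = np.stack([WEIGHTS @ ((data >> n) & 1) for n in range(NUM_SAMPLES)])
            packets.append((number, samples))
    return packets

# Round trip: two packets of random 12-bit samples.
rng = np.random.default_rng(0)
expected = [(7, rng.integers(0, 4096, size=(NUM_SAMPLES, COLS))),
            (66051, rng.integers(0, 4096, size=(NUM_SAMPLES, COLS)))]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'Raw.dat')
    with open(path, 'wb') as f:
        for number, samples in expected:
            f.write(encode_packet(number, samples))
    decoded = read_packets(path)

matches = all(n_exp == n_got and np.array_equal(s_exp, s_got)
              for (n_exp, s_exp), (n_got, s_got) in zip(expected, decoded))
print(len(decoded), matches)
```

Replacing order='F' with the default order in read_packets makes the round trip fail, which is the same symptom described in the question.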

From: 78785498