EMD-KPCA-LSTM: Multivariate Time Series Forecasting with a Long Short-Term Memory Network Based on Empirical Mode Decomposition and Kernel Principal Component Analysis, in MATLAB (with a comparison of three models: LSTM, EMD-LSTM, and EMD-KPCA-LSTM)
Reference: a photovoltaic power forecasting model based on EMD-PCA-LSTM
Scope: the dataset in this example is Northern Hemisphere photovoltaic power, with four input features (solar irradiance, air temperature, air pressure, atmospheric humidity) and one predicted output (PV power). The forecasting target can be any time series, such as electric load, wind speed, or PV power. The code is clearly commented and reads data from a local Excel file.
**The following code is for reference only.**
This example implements the EMD-KPCA-LSTM multivariate time series forecasting model and compares it with plain LSTM and EMD-LSTM. The detailed steps and code follow.
Environment Setup
Make sure MATLAB is installed together with the following toolboxes:
- Statistics and Machine Learning Toolbox
- Signal Processing Toolbox
- Deep Learning Toolbox (formerly Neural Network Toolbox)
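If you are unsure whether these toolboxes are available, a quick check such as the following can be run first (a minimal sketch, not part of the original file set; it simply compares the product names reported by ver):
% Optional: verify that the required toolboxes are installed
required = {'Statistics and Machine Learning Toolbox', ...
            'Signal Processing Toolbox', ...
            'Deep Learning Toolbox'};
v = ver;                      % list of installed MathWorks products
installed = {v.Name};
for k = 1:numel(required)
    if any(strcmp(installed, required{k}))
        fprintf('Found: %s\n', required{k});
    else
        warning('Missing toolbox: %s', required{k});
    end
end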
Data Loading and Preprocessing
First, the data are loaded and preprocessed. The data are assumed to be stored in an Excel file with four input features (solar irradiance, air temperature, air pressure, atmospheric humidity) and one output feature (PV power). The data loading and preprocessing script data_preprocessing.m follows:
[<title="Data Preprocessing Script for Solar Power Prediction">]
% Define paths
excelFilePath = 'datasets/solar_power_data.xlsx';
% Load data from Excel
rawData = readtable(excelFilePath);
% Convert table to matrix
dataMatrix = table2array(rawData);
% Normalize the features
[numRows, numCols] = size(dataMatrix);
X = dataMatrix(:, 1:numCols-1); % Input features
y = dataMatrix(:, numCols); % Output feature (Solar Power)
% Min-Max normalization
minVals = min(X);
maxVals = max(X);
X_normalized = (X - minVals) ./ (maxVals - minVals);
% Split data into training, validation, and test sets
trainRatio = 0.7;
valRatio = 0.15;
testRatio = 0.15;
numTrainSamples = floor(trainRatio * numRows);
numValSamples = floor(valRatio * numRows);
numTestSamples = numRows - numTrainSamples - numValSamples;
X_train = X_normalized(1:numTrainSamples, :);
y_train = y(1:numTrainSamples);
X_val = X_normalized(numTrainSamples+1:numTrainSamples+numValSamples, :);
y_val = y(numTrainSamples+1:numTrainSamples+numValSamples);
X_test = X_normalized(numTrainSamples+numValSamples+1:end, :);
y_test = y(numTrainSamples+numValSamples+1:end);
% Save preprocessed data (create the output folder if needed)
if ~exist('preprocessed_datasets', 'dir'), mkdir('preprocessed_datasets'); end
save('preprocessed_datasets/preprocessed_data.mat', 'X_train', 'y_train', 'X_val', 'y_val', 'X_test', 'y_test');
disp('Preprocessed data saved to: preprocessed_datasets/preprocessed_data.mat');
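Note that the script above computes the min-max statistics over the full dataset. To keep information from the validation and test periods out of the scaling, a common variant derives the statistics from the training rows only and reuses them for the other splits. The sketch below is an optional replacement for the min-max block in the script above:
% Optional variant: min-max scaling with statistics from the training rows only
numTrainSamples = floor(0.7 * numRows);                      % matches trainRatio in the script
minVals = min(X(1:numTrainSamples, :), [], 1);               % training-set minima
maxVals = max(X(1:numTrainSamples, :), [], 1);               % training-set maxima
X_normalized = (X - minVals) ./ (maxVals - minVals + eps);   % eps guards against constant columns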
EMD Decomposition
Next, Empirical Mode Decomposition (EMD) is used to decompose each input signal. The EMD decomposition script emd_decomposition.m follows:
[<title="EMD Decomposition Script">]
% Load preprocessed data
load('preprocessed_datasets/preprocessed_data.mat');
% Apply EMD to each input feature
[X_emd_train, imfsTrain] = emd_decompose_features(X_train);
[X_emd_val, imfsVal] = emd_decompose_features(X_val);
[X_emd_test, imfsTest] = emd_decompose_features(X_test);
% Save decomposed data
save('preprocessed_datasets/emd_decomposed_data.mat', 'X_emd_train', 'imfsTrain', 'X_emd_val', 'imfsVal', 'X_emd_test', 'imfsTest');
disp('EMD decomposed data saved to: preprocessed_datasets/emd_decomposed_data.mat');
function [X_emd, imfs] = emd_decompose_features(X)
% Decompose each feature column with EMD and keep its IMFs.
[~, numFeatures] = size(X);
X_emd = zeros(size(X));
imfs = cell(numFeatures, 1);
for i = 1:numFeatures
    % emd expects a vector and returns the IMFs as columns (numSamples-by-numIMF)
    % plus a residual trend term
    [imfComponents, residual] = emd(X(:, i));
    imfs{i} = imfComponents;
    % Reconstruct the feature from its IMFs and residual
    X_emd(:, i) = sum(imfComponents, 2) + residual;
end
end
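To inspect the decomposition of an individual feature, the IMFs stored in imfsTrain can be plotted in a stacked layout. This is a small optional sketch that assumes emd_decomposition.m has already been run:
% Optional: visualize the IMFs of the first input feature (e.g., solar irradiance)
load('preprocessed_datasets/emd_decomposed_data.mat', 'imfsTrain');
imf1 = imfsTrain{1};                 % numSamples-by-numIMF matrix
numIMF = size(imf1, 2);
figure;
for k = 1:numIMF
    subplot(numIMF, 1, k);
    plot(imf1(:, k));
    ylabel(sprintf('IMF %d', k));
end
xlabel('Time step');
sgtitle('EMD of the first input feature');   % sgtitle needs R2018b or later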
KPCA Dimensionality Reduction
Next, Kernel Principal Component Analysis (KPCA) is used to reduce the dimensionality of the EMD-processed features. MATLAB has no built-in kpca function, so the script below uses a small self-contained RBF-kernel PCA: the model is fitted on the training features and then used to project the validation and test features. Note that the n-by-n kernel matrix can become memory heavy for very long training sets, and that the kernel width and the number of retained components are tuning parameters. The KPCA reduction script kpca_reduction.m follows:
[<title="KPCA Reduction Script">]
% Load EMD decomposed data
load('preprocessed_datasets/emd_decomposed_data.mat');
% Fit an RBF-kernel PCA on the training features and project every split with it
numComponents = 2;   % number of kernel principal components to keep (tuning parameter)
sigma = 1;           % RBF kernel width (tuning parameter)
[X_kpca_train, kpcaModel] = kpca_fit(X_emd_train, numComponents, sigma);
X_kpca_val  = kpca_project(kpcaModel, X_emd_val);
X_kpca_test = kpca_project(kpcaModel, X_emd_test);
% Save reduced data
save('preprocessed_datasets/kpca_reduced_data.mat', 'X_kpca_train', 'X_kpca_val', 'X_kpca_test');
disp('KPCA reduced data saved to: preprocessed_datasets/kpca_reduced_data.mat');

function [Z, model] = kpca_fit(X, numComponents, sigma)
% Kernel PCA with an RBF kernel, fitted on the rows of X (samples-by-features)
n = size(X, 1);
K = rbf_kernel(X, X, sigma);
oneN = ones(n) / n;
Kc = K - oneN*K - K*oneN + oneN*K*oneN;              % center the kernel matrix
[V, D] = eig((Kc + Kc') / 2);                        % symmetrize for numerical safety
[lambda, idx] = sort(real(diag(D)), 'descend');
alpha = real(V(:, idx(1:numComponents))) ./ sqrt(lambda(1:numComponents)');
Z = Kc * alpha;                                      % projected training samples
model = struct('X', X, 'K', K, 'alpha', alpha, 'sigma', sigma, 'lambda', lambda);
end

function Z = kpca_project(model, Xnew)
% Project new samples with a previously fitted KPCA model
n = size(model.X, 1);
m = size(Xnew, 1);
Knew = rbf_kernel(Xnew, model.X, model.sigma);
oneN = ones(n) / n;
oneMN = ones(m, n) / n;
Kc = Knew - oneMN*model.K - Knew*oneN + oneMN*model.K*oneN;
Z = Kc * model.alpha;
end

function K = rbf_kernel(A, B, sigma)
% Gaussian (RBF) kernel matrix between the rows of A and the rows of B
K = exp(-pdist2(A, B).^2 / (2 * sigma^2));
end
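Because the number of retained components is a tuning choice, the eigenvalues stored in kpcaModel.lambda can be used to check how much kernel-space variance the leading components capture. This is an optional sketch, run in the same MATLAB session after kpca_reduction.m:
% Optional: cumulative eigenvalue ratio of the fitted KPCA model
lambda = kpcaModel.lambda(kpcaModel.lambda > 0);   % keep positive eigenvalues only
explained = cumsum(lambda) / sum(lambda);
figure;
plot(explained(1:min(10, numel(explained))), '-o');
xlabel('Number of kernel principal components');
ylabel('Cumulative eigenvalue ratio');
title('KPCA component selection');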
LSTM Model Training
Three models are trained separately: LSTM, EMD-LSTM, and EMD-KPCA-LSTM. The LSTM training script lstm_training.m follows:
[<title="LSTM Training Script">]
% Load preprocessed data
load('preprocessed_datasets/preprocessed_data.mat');
% Prepare sequences for LSTM
sequenceLength = 24; % Example sequence length of 24 time steps
[X_seq_train, y_seq_train] = prepare_sequences(X_train, y_train, sequenceLength);
[X_seq_val, y_seq_val] = prepare_sequences(X_val, y_val, sequenceLength);
[X_seq_test, y_seq_test] = prepare_sequences(X_test, y_test, sequenceLength);
% Train LSTM model
layers = [
    sequenceInputLayer(size(X_train, 2))        % input size = number of input features
    lstmLayer(100, 'OutputMode', 'last')        % sequence-to-one regression
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs', 50, ...
    'GradientThreshold', 1, ...
    'InitialLearnRate', 0.005, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.2, ...
    'LearnRateDropPeriod', 50, ...
    'ValidationData', {X_seq_val, y_seq_val}, ...
    'Verbose', 0, ...
    'Plots', 'training-progress');
net = trainNetwork(X_seq_train, y_seq_train, layers, options);
% Evaluate LSTM model
y_pred_lstm = predict(net, X_seq_test);   % one prediction per test sequence
mse_lstm = mean((y_pred_lstm - y_seq_test).^2);
rmse_lstm = sqrt(mse_lstm);
fprintf('LSTM Model RMSE: %.4f\n', rmse_lstm);
% Save trained model (create the output folder if needed)
if ~exist('trained_models', 'dir'), mkdir('trained_models'); end
save('trained_models/lstm_model.mat', 'net');
disp('LSTM model saved to: trained_models/lstm_model.mat');
function [X_seq, y_seq] = prepare_sequences(X, y, seqLen)
% Build overlapping input sequences and align each with the target at its last step.
numSamples = numel(y) - seqLen + 1;
X_seq = cell(numSamples, 1);
y_seq = zeros(numSamples, 1);
for i = 1:numSamples
    X_seq{i} = X(i:i+seqLen-1, :)';   % numFeatures-by-seqLen, as trainNetwork expects
    y_seq(i) = y(i+seqLen-1);
end
end
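Once the model is saved, it can be reloaded to forecast the next value from a fresh window of measurements. The sketch below is illustrative only: X_new is a hypothetical 24-by-4 matrix of min-max normalized inputs, replaced here by random placeholder data.
% Optional: one-step forecast with the saved LSTM model
load('trained_models/lstm_model.mat', 'net');
X_new = rand(24, 4);                 % placeholder for the last 24 normalized input rows
y_next = predict(net, {X_new'});     % transpose to numFeatures-by-seqLen
fprintf('Predicted PV power: %.4f\n', y_next);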
EMD-LSTM Model Training
The EMD-LSTM training script emd_lstm_training.m follows:
[<title="EMD-LSTM Training Script">]
% Load EMD decomposed features and the original targets
load('preprocessed_datasets/emd_decomposed_data.mat');
load('preprocessed_datasets/preprocessed_data.mat', 'y_train', 'y_val', 'y_test');
% Prepare sequences for EMD-LSTM
sequenceLength = 24; % Example sequence length of 24 time steps
[X_seq_train, y_seq_train] = prepare_sequences(X_emd_train, y_train, sequenceLength);
[X_seq_val, y_seq_val] = prepare_sequences(X_emd_val, y_val, sequenceLength);
[X_seq_test, y_seq_test] = prepare_sequences(X_emd_test, y_test, sequenceLength);
% Train EMD-LSTM model
layers = [
    sequenceInputLayer(size(X_emd_train, 2))    % input size = number of input features
    lstmLayer(100, 'OutputMode', 'last')        % sequence-to-one regression
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs', 50, ...
    'GradientThreshold', 1, ...
    'InitialLearnRate', 0.005, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.2, ...
    'LearnRateDropPeriod', 50, ...
    'ValidationData', {X_seq_val, y_seq_val}, ...
    'Verbose', 0, ...
    'Plots', 'training-progress');
net_emd = trainNetwork(X_seq_train, y_seq_train, layers, options);
% Evaluate EMD-LSTM model
y_pred_emd_lstm = predict(net_emd, X_seq_test);
mse_emd_lstm = mean((y_pred_emd_lstm - y_seq_test).^2);
rmse_emd_lstm = sqrt(mse_emd_lstm);
fprintf('EMD-LSTM Model RMSE: %.4f\n', rmse_emd_lstm);
% Save trained model (create the output folder if needed)
if ~exist('trained_models', 'dir'), mkdir('trained_models'); end
save('trained_models/emd_lstm_model.mat', 'net_emd');
disp('EMD-LSTM model saved to: trained_models/emd_lstm_model.mat');
function [X_seq, y_seq] = prepare_sequences(X, y, seqLen)
% Build overlapping input sequences and align each with the target at its last step.
numSamples = numel(y) - seqLen + 1;
X_seq = cell(numSamples, 1);
y_seq = zeros(numSamples, 1);
for i = 1:numSamples
    X_seq{i} = X(i:i+seqLen-1, :)';   % numFeatures-by-seqLen, as trainNetwork expects
    y_seq(i) = y(i+seqLen-1);
end
end
EMD-KPCA-LSTM Model Training
The EMD-KPCA-LSTM training script emd_kpca_lstm_training.m follows:
[<title="EMD-KPCA-LSTM Training Script">]
% Load KPCA reduced features and the original targets
load('preprocessed_datasets/kpca_reduced_data.mat');
load('preprocessed_datasets/preprocessed_data.mat', 'y_train', 'y_val', 'y_test');
% Prepare sequences for EMD-KPCA-LSTM
sequenceLength = 24; % Example sequence length of 24 time steps
[X_seq_train, y_seq_train] = prepare_sequences(X_kpca_train, y_train, sequenceLength);
[X_seq_val, y_seq_val] = prepare_sequences(X_kpca_val, y_val, sequenceLength);
[X_seq_test, y_seq_test] = prepare_sequences(X_kpca_test, y_test, sequenceLength);
% Train EMD-KPCA-LSTM model
layers = [
    sequenceInputLayer(size(X_kpca_train, 2))   % input size = number of KPCA components
    lstmLayer(100, 'OutputMode', 'last')        % sequence-to-one regression
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs', 50, ...
    'GradientThreshold', 1, ...
    'InitialLearnRate', 0.005, ...
    'LearnRateSchedule', 'piecewise', ...
    'LearnRateDropFactor', 0.2, ...
    'LearnRateDropPeriod', 50, ...
    'ValidationData', {X_seq_val, y_seq_val}, ...
    'Verbose', 0, ...
    'Plots', 'training-progress');
net_emd_kpca = trainNetwork(X_seq_train, y_seq_train, layers, options);
% Evaluate EMD-KPCA-LSTM model
y_pred_emd_kpca_lstm = predict(net_emd_kpca, X_seq_test);
mse_emd_kpca_lstm = mean((y_pred_emd_kpca_lstm - y_seq_test).^2);
rmse_emd_kpca_lstm = sqrt(mse_emd_kpca_lstm);
fprintf('EMD-KPCA-LSTM Model RMSE: %.4f\n', rmse_emd_kpca_lstm);
% Save trained model (create the output folder if needed)
if ~exist('trained_models', 'dir'), mkdir('trained_models'); end
save('trained_models/emd_kpca_lstm_model.mat', 'net_emd_kpca');
disp('EMD-KPCA-LSTM model saved to: trained_models/emd_kpca_lstm_model.mat');
function [X_seq, y_seq] = prepare_sequences(X, y, seqLen)
% Build overlapping input sequences and align each with the target at its last step.
numSamples = numel(y) - seqLen + 1;
X_seq = cell(numSamples, 1);
y_seq = zeros(numSamples, 1);
for i = 1:numSamples
    X_seq{i} = X(i:i+seqLen-1, :)';   % numFeatures-by-seqLen, as trainNetwork expects
    y_seq(i) = y(i+seqLen-1);
end
end
Result Comparison
Finally, the predictions of the three models are compared. Each model is evaluated on the test sequences built from its own input features. The comparison script result_comparison.m follows:
[<title="Result Comparison Script">]
% Load the feature sets used by each model, plus the shared targets
load('preprocessed_datasets/preprocessed_data.mat');
load('preprocessed_datasets/emd_decomposed_data.mat');
load('preprocessed_datasets/kpca_reduced_data.mat');
% Load trained models
load('trained_models/lstm_model.mat');          % net
load('trained_models/emd_lstm_model.mat');      % net_emd
load('trained_models/emd_kpca_lstm_model.mat'); % net_emd_kpca
% Prepare test sequences for each model from its own input features
sequenceLength = 24; % Example sequence length of 24 time steps
[X_seq_test,      y_seq_test] = prepare_sequences(X_test,      y_test, sequenceLength);
[X_seq_test_emd,  ~]          = prepare_sequences(X_emd_test,  y_test, sequenceLength);
[X_seq_test_kpca, ~]          = prepare_sequences(X_kpca_test, y_test, sequenceLength);
% Predict using LSTM model
y_pred_lstm = predict(net, X_seq_test);
% Predict using EMD-LSTM model
y_pred_emd_lstm = predict(net_emd, X_seq_test_emd);
% Predict using EMD-KPCA-LSTM model
y_pred_emd_kpca_lstm = predict(net_emd_kpca, X_seq_test_kpca);
% Calculate RMSE for each model
mse_lstm = mean((y_pred_lstm - y_seq_test).^2);
rmse_lstm = sqrt(mse_lstm);
mse_emd_lstm = mean((y_pred_emd_lstm - y_seq_test).^2);
rmse_emd_lstm = sqrt(mse_emd_lstm);
mse_emd_kpca_lstm = mean((y_pred_emd_kpca_lstm - y_seq_test).^2);
rmse_emd_kpca_lstm = sqrt(mse_emd_kpca_lstm);
fprintf('LSTM Model RMSE: %.4f\n', rmse_lstm);
fprintf('EMD-LSTM Model RMSE: %.4f\n', rmse_emd_lstm);
fprintf('EMD-KPCA-LSTM Model RMSE: %.4f\n', rmse_emd_kpca_lstm);
% Plot predictions vs actual values
figure;
plot(y_seq_test, 'b', 'DisplayName', 'Actual');
hold on;
plot(y_pred_lstm, 'r--', 'DisplayName', 'LSTM');
plot(y_pred_emd_lstm, 'g-.', 'DisplayName', 'EMD-LSTM');
plot(y_pred_emd_kpca_lstm, 'm:', 'DisplayName', 'EMD-KPCA-LSTM');
xlabel('Time Steps');
ylabel('Solar Power');
title('Comparison of Models');
legend show;
grid on;
function [X_seq, y_seq] = prepare_sequences(X, y, seqLen)
% Build overlapping input sequences and align each with the target at its last step.
numSamples = numel(y) - seqLen + 1;
X_seq = cell(numSamples, 1);
y_seq = zeros(numSamples, 1);
for i = 1:numSamples
    X_seq{i} = X(i:i+seqLen-1, :)';   % numFeatures-by-seqLen, as trainNetwork expects
    y_seq(i) = y(i+seqLen-1);
end
end
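Beyond RMSE, other common error metrics can be reported from the same predictions. This is a small optional sketch, run in the same MATLAB session after result_comparison.m has finished:
% Optional: additional error metrics for each model
models = {'LSTM', 'EMD-LSTM', 'EMD-KPCA-LSTM'};
preds  = {y_pred_lstm, y_pred_emd_lstm, y_pred_emd_kpca_lstm};
for k = 1:numel(models)
    err = preds{k} - y_seq_test;
    mae = mean(abs(err));                                           % mean absolute error
    r2  = 1 - sum(err.^2) / sum((y_seq_test - mean(y_seq_test)).^2); % coefficient of determination
    fprintf('%s: MAE = %.4f, R^2 = %.4f\n', models{k}, mae, r2);
end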
Usage Notes
- Configure paths:
  - Make sure datasets/solar_power_data.xlsx exists and contains the required columns.
  - Make sure all file names and paths match those used in the scripts.
- Run the scripts:
  - In the MATLAB command window, run the following scripts in order (see the driver sketch after this list):
    - data_preprocessing.m
    - emd_decomposition.m
    - kpca_reduction.m
    - lstm_training.m
    - emd_lstm_training.m
    - emd_kpca_lstm_training.m
    - result_comparison.m
- Notes:
  - Make sure all required toolboxes are installed, in particular the Statistics and Machine Learning Toolbox, the Signal Processing Toolbox, and the Deep Learning Toolbox.
  - Adjust parameters such as sequenceLength, MaxEpochs, and InitialLearnRate as needed.
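For convenience, the whole pipeline can also be driven by a single script. The sketch below is a hypothetical convenience file (run_all.m is a name chosen here, not part of the original file set); each script is simply invoked by name in order:
% run_all.m - hypothetical driver that executes the whole pipeline in order
data_preprocessing;        % load Excel data, normalize, split, save .mat
emd_decomposition;         % EMD per input feature
kpca_reduction;            % RBF-kernel PCA on the EMD-processed features
lstm_training;             % baseline LSTM
emd_lstm_training;         % EMD-LSTM
emd_kpca_lstm_training;    % EMD-KPCA-LSTM
result_comparison;         % metrics and plots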
Example
Assume your data folder is structured as follows:
datasets/
└── solar_power_data.xlsx
and that solar_power_data.xlsx contains five columns (solar irradiance, air temperature, air pressure, atmospheric humidity, PV power). After running the scripts above, you can inspect the resulting plots and predictions.
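If you want to dry-run the pipeline without real measurements, a synthetic Excel file with the expected column layout can be generated first. This is a hypothetical sketch; the column names and the toy relationship between the variables are purely illustrative:
% Optional: generate a synthetic solar_power_data.xlsx for a dry run
rng(0);                                                       % reproducible random data
n = 2000;                                                     % number of hourly samples
irradiance  = max(0, sin((1:n)'*2*pi/24) + 0.1*randn(n,1));   % crude daily cycle
temperature = 20 + 5*sin((1:n)'*2*pi/24) + randn(n,1);
pressure    = 1013 + 2*randn(n,1);
humidity    = min(100, max(0, 60 + 10*randn(n,1)));
power       = 3*irradiance + 0.05*temperature + 0.2*randn(n,1);  % toy relationship
T = table(irradiance, temperature, pressure, humidity, power);
if ~exist('datasets', 'dir'), mkdir('datasets'); end
writetable(T, 'datasets/solar_power_data.xlsx');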
Summary
With the scripts above, you can build a complete time series forecasting workflow covering data loading, preprocessing, EMD decomposition, KPCA dimensionality reduction, and the training and evaluation of the LSTM, EMD-LSTM, and EMD-KPCA-LSTM models. The code files are:
- Data preprocessing script (data_preprocessing.m)
- EMD decomposition script (emd_decomposition.m)
- KPCA reduction script (kpca_reduction.m)
- LSTM training script (lstm_training.m)
- EMD-LSTM training script (emd_lstm_training.m)
- EMD-KPCA-LSTM training script (emd_kpca_lstm_training.m)
- Result comparison script (result_comparison.m)