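I'm trying to build an LSTM model that detects anomalies in time series data. It takes 5 inputs and produces 1 boolean output (True/False depending on whether an anomaly is detected). The anomalous pattern usually spans 3 to 4 consecutive time steps. Unlike most LSTM examples, which either predict future data or classify a whole sequence, I'm trying to output a True/False detection flag at every time step (True at the last time step of the pattern, if one is detected).
Unfortunately, CrossEntropyLoss doesn't seem to allow an output tensor with more than 1 dimension, and in this case it would be 2D: [number of sequences, sequence length], holding boolean data.
Here is some example code showing what I'm trying to produce: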
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
# Define LSTM classifier model
class LSTMClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(LSTMClassifier, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size).to(x.device)
        out, _ = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])
        return out
# Input - 100 examples containing 5 data points per timestep (where there are 10 timesteps)
X_train = np.random.rand(100, 10, 5)
# Output - 100 examples containing 1 True/False output per timestep to match the input
y_train = np.random.choice(a=[True, False], size=(100, 10)) # Binary labels (True or False)
# Convert data to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.bool)
# Define model parameters
input_size = X_train.shape[2] # 5 inputs per timestep
hidden_size = 4 # Pattern we are trying to detect is usually 4 timesteps long
num_layers = 1
output_size = 1 # True/False
# Instantiate the model
model = LSTMClassifier(input_size, hidden_size, num_layers, output_size)
# Define loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Train the model
num_epochs = 10
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = model(X_train_tensor)
    loss = criterion(outputs, y_train_tensor)
    loss.backward()
    optimizer.step()
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item()}')
# Test the model
X_test = np.random.rand(10, 10, 5) # Generate some test data - same dimensions as input
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
with torch.no_grad():
    predictions = model(X_test_tensor)
    predicted_outputs = torch.argmax(predictions, dim=1)
print("Predicted Outputs:", predicted_outputs)
Do I need to reshape the output, use a different loss function, or maybe use a model other than an LSTM?
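The mismatch comes from two places. Your forward pass keeps only the last time step (out[:, -1, :]), so the model returns a tensor of shape [100, 1], while your labels hold one value per time step, shape [100, 10]. On top of that, CrossEntropyLoss is a multi-class loss: it expects one score per class and integer class indices (or float class probabilities) as targets, so a single output unit with boolean labels does not fit it. You have a few options to fix this: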
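1. Use BCELoss with a sigmoid output:
- Change the model so that it applies fc to every time step and outputs a value between 0 and 1. You can do this by passing the full LSTM output to fc and adding a sigmoid activation after it: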
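def forward(self, x):
    # ... your existing code up to and including the self.lstm(...) call ...
    out = self.fc(out)  # apply the linear layer to every time step: [batch, seq_len, 1]
    out = torch.sigmoid(out).squeeze(-1)  # probabilities, reshaped to [batch, seq_len] to match the labels
    return out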
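- Use BCELoss (binary cross-entropy loss) as your loss function. It is meant for binary classification and expects predictions and targets of the same shape: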
criterion = nn.BCELoss()
- For training, convert y_train_tensor to a float type, since BCELoss expects float targets:
y_train_tensor = torch.tensor(y_train, dtype=torch.float32)
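2. Compute the loss at each time step, then aggregate:
- Keep the model output as one prediction per time step.
- Use BCELoss with reduction='none' to get the loss at each time step.
- Average the losses over all time steps to get a single loss value for backpropagation. This gives the same value as the default reduction='mean'; keeping the per-step losses is mainly useful if you later want to weight or mask individual time steps.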
criterion = nn.BCELoss(reduction='none') # Do not reduce across time steps
# ... inside the training loop:
outputs = model(X_train_tensor)
loss = criterion(outputs, y_train_tensor.float()) # Calculate loss for each time step
loss = loss.mean() # Average the loss over all time steps
loss.backward()
# ...
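3. Use the LSTM for sequence classification:
- If you only care whether the whole sequence contains an anomaly, rather than a prediction at each time step, you can use the LSTM for sequence classification.
- Change the model to output a single prediction, taken from the last time step.
- Use BCELoss, or CrossEntropyLoss if the model outputs one logit per class, with a single label indicating whether the whole sequence contains an anomaly.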
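Option 3 could be sketched roughly as follows. This is only an illustration under a few assumptions: SequenceClassifier, the hidden size of 16, and the random data are made up for the example, the per-sequence label is derived with any() over the per-step labels, and BCEWithLogitsLoss (which applies the sigmoid internally) is swapped in for BCELoss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class SequenceClassifier(nn.Module):
    """Hypothetical model: one logit per sequence, read from the last time step."""
    def __init__(self, input_size, hidden_size, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)                      # [batch, seq_len, hidden_size]
        return self.fc(out[:, -1, :]).squeeze(-1)  # [batch], raw logits

# Random stand-in data: 100 sequences, 10 time steps, 5 features
X = torch.rand(100, 10, 5)
y_steps = torch.rand(100, 10) > 0.9   # per-step boolean labels
y_seq = y_steps.any(dim=1).float()    # 1.0 if any step in the sequence is anomalous

model = SequenceClassifier(input_size=5, hidden_size=16)
criterion = nn.BCEWithLogitsLoss()    # sigmoid + binary cross-entropy in one call
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for epoch in range(5):
    optimizer.zero_grad()
    logits = model(X)
    loss = criterion(logits, y_seq)
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch + 1}, Loss: {loss.item():.4f}")
```

Note that this drops the information about where in the sequence the anomaly occurs, so it only fits if a per-sequence flag is enough.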
Here is example code implementing option 2:
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
# ... (Your LSTMClassifier class stays the same, except for the per-time-step forward() from option 1) ...
# ... (Your data generation remains the same) ...
# ... (Your model parameters remain the same) ...
# Instantiate the model
model = LSTMClassifier(input_size, hidden_size, num_layers, output_size)
# Define loss function and optimizer
criterion = nn.BCELoss(reduction='none') # Don't reduce loss across time steps
optimizer = optim.Adam(model.parameters(), lr=0.001)
# Train the model
num_epochs = 10
for epoch in range(num_epochs):
    optimizer.zero_grad()
    outputs = model(X_train_tensor)
    loss = criterion(outputs, y_train_tensor.float())  # Calculate loss for each time step
    loss = loss.mean()  # Average the loss over all time steps
    loss.backward()
    optimizer.step()
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item()}')
# Test the model
# ... (Your test code remains similar, but you'll get predictions for each time step) ...
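For the test step, per-time-step inference might look like the sketch below. It assumes the forward pass returns sigmoid probabilities of shape [batch, seq_len]; PerStepClassifier, the hidden size of 16, and the 0.5 threshold are illustrative choices rather than part of the original code. The model here is untrained, so only the shapes are meaningful.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class PerStepClassifier(nn.Module):
    """Hypothetical model: one probability per time step."""
    def __init__(self, input_size, hidden_size, num_layers=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):
        out, _ = self.lstm(x)                           # [batch, seq_len, hidden_size]
        return torch.sigmoid(self.fc(out)).squeeze(-1)  # [batch, seq_len]

model = PerStepClassifier(input_size=5, hidden_size=16)
model.eval()

# Test data with the same layout as the training input: 10 sequences, 10 steps, 5 features
X_test_tensor = torch.rand(10, 10, 5)

with torch.no_grad():
    probabilities = model(X_test_tensor)

# Threshold the probabilities to get one True/False flag per time step
predicted_flags = probabilities > 0.5
print("Predicted flags shape:", tuple(predicted_flags.shape))
```

With a single probability per time step, thresholding replaces the torch.argmax call from the question; argmax over dim=1 would instead return the index of the most likely time step in each sequence.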
Keep in mind that the best approach for your particular problem depends on your goals and the characteristics of your data.
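If you would rather keep CrossEntropyLoss, it can also handle per-time-step targets, provided the model emits two logits per step and the class dimension comes second. The sketch below is one way to set that up; TwoClassPerStep, the hidden size of 16, and the random data are assumptions made for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class TwoClassPerStep(nn.Module):
    """Hypothetical model: two logits (normal, anomaly) per time step."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, 2)

    def forward(self, x):
        out, _ = self.lstm(x)  # [batch, seq_len, hidden_size]
        return self.fc(out)    # [batch, seq_len, 2]

X = torch.rand(100, 10, 5)
y = (torch.rand(100, 10) > 0.9).long()  # class indices 0/1, one per time step

model = TwoClassPerStep(input_size=5, hidden_size=16)
criterion = nn.CrossEntropyLoss()

logits = model(X)
# CrossEntropyLoss expects input [N, C, T] and target [N, T], so move the class dim to position 1
loss = criterion(logits.permute(0, 2, 1), y)
loss.backward()

# Here argmax over the class dimension is the right way to get per-step flags
predicted_flags = logits.argmax(dim=-1).bool()  # [batch, seq_len]
print("Loss:", loss.item())
```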
Tags: python, machine-learning, pytorch, neural-network, lstm From: 78781313