1. InfoNCE loss (source: Zhihu, https://zhuanlan.zhihu.com/p/506544456)
1. Introduction
Contrastive learning can be viewed as a dictionary look-up task: we train an encoder to perform this look-up. Suppose we have an encoded query q and a set of encoded samples k0, k1, k2, ..., which can be regarded as the keys of a dictionary. Assume that exactly one key k+ matches q; then q and k+ form a positive pair, and the remaining keys are negatives for q. Once the positive and negative pairs are defined, a contrastive loss function is needed to guide the model's learning.
2. Goal
When the query is similar to its unique positive key k+ and dissimilar to all the negative keys, the loss should be low. Conversely, if the query is dissimilar to k+, or similar to some negative key, the loss should be high, penalizing the model and driving a parameter update.
3. Formula
\[L_q = -\log \frac{\exp(q \cdot k_+ / \tau)}{\sum_{i=0}^{k} \exp(q \cdot k_i / \tau)}\]
Here \(k\) is the number of negative samples and \(\tau\) is a temperature hyperparameter.
The sum in the denominator runs over one positive and k negative samples, i.e. from index 0 to k, so over k+1 samples in total: all the keys in the dictionary.
InfoNCE loss is essentially a cross-entropy loss over a (k+1)-way classification task: the goal is to classify the query q into the class of its positive key \(k_+\).
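To make the cross-entropy view concrete, here is a minimal sketch (my own illustration, not MoCo's actual implementation; the function name info_nce_loss and the tensor shapes are assumptions):

import torch
import torch.nn.functional as F

def info_nce_loss(q, keys, tau=0.07):
    # q: (N, D) queries; keys: (N, 1+K, D), with keys[:, 0] being the positive k+.
    # Both are assumed L2-normalized, so dot products are cosine similarities.
    logits = torch.bmm(keys, q.unsqueeze(2)).squeeze(2) / tau  # (N, 1+K) similarity logits
    labels = torch.zeros(q.size(0), dtype=torch.long)          # the positive key is "class 0"
    return F.cross_entropy(logits, labels)

# Example: 8 queries, each with 1 positive and 16 negatives, 128-d features
q = F.normalize(torch.randn(8, 128), dim=1)
keys = F.normalize(torch.randn(8, 17, 128), dim=2)
loss = info_nce_loss(q, keys)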
2. Instance discrimination (the concepts below are from the Bilibili lecture: https://www.bilibili.com/video/BV1C3411s7t9/?spm_id_from=333.1007.top_right_bar_window_history.content.click&vd_source=4afdb0bf8f80389d3492b886b5277ddc)
Rough definition
The lecture by Zhu on Bilibili uses a pretext task as its example (that is, when the original task is too complex or abstract, a simpler task that the model can learn more easily is used to bootstrap learning for downstream tasks); here the example is image classification. Take one image from the ImageNet dataset and produce two versions of it via operations such as rotation, cropping, and interpolation; define these two images as a positive pair, and treat all other images as negatives. An encoder extracts features from each image, and contrastive learning pulls the features of the positive pair closer together while pushing the features of negative pairs apart. The original MoCo paper (https://arxiv.org/abs/1911.05722) uses exactly this unsupervised approach, and across different applications it performs nearly on par with supervised models, which is truly astonishing! I hope to finish Zhu's lecture series; I still feel I know very little. A short sketch of building such a positive pair follows.
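As a concrete sketch (my own illustration, not the exact augmentation recipe used in MoCo; torchvision and the file path example.jpg are assumptions):

import torch
from torchvision import transforms
from PIL import Image

# Two independent random augmentations of the same image form a positive pair
augment = transforms.Compose([
    transforms.RandomResizedCrop(224),       # random crop, interpolated back to 224x224
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=15),
    transforms.ToTensor(),
])

img = Image.open('example.jpg')              # hypothetical image path
view1, view2 = augment(img), augment(img)    # positive pair: two views of one image
# Augmented views of *different* images act as negatives; an encoder maps each
# view to a feature vector, and the contrastive loss pulls the two views of the
# same image together while pushing features of different images apart.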
3. Thoughts on a multi-label binary classification problem
While working through a Kaggle machine learning competition course today, I wanted to use a BP neural network to solve a steel rail flaw detection problem. Its test data and the CSV file to be submitted look as follows:
The targets to predict are all binary 0/1 values, so a BP neural network seemed a convenient way to implement this.
1. ChatGPT's approach
Below is the sample code given by ChatGPT, adapted to this problem:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split
import pandas as pd

# X, y, test_data and submission_id are assumed to have been loaded
# earlier from the competition CSV files
train_X, val_X, train_labels, val_labels = train_test_split(X, y, random_state=1)

# Convert the pandas DataFrames to float32 tensors
train_X_tensor = torch.from_numpy(train_X.values).float()
val_X_tensor = torch.from_numpy(val_X.values).float()
train_labels_tensor = torch.from_numpy(train_labels.values).float()
val_labels_tensor = torch.from_numpy(val_labels.values).float()

n_features = train_X_tensor.shape[1]
n_labels = train_labels_tensor.shape[1]
batch_size = 32
train_dataset = TensorDataset(train_X_tensor, train_labels_tensor)
val_dataset = TensorDataset(val_X_tensor, val_labels_tensor)
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False)
class MultiLabelClassifier(nn.Module):
    def __init__(self, n_features, n_labels):
        super(MultiLabelClassifier, self).__init__()
        self.layer1 = nn.Linear(n_features, 64)
        self.layer2 = nn.Linear(64, 128)
        self.layer3 = nn.Linear(128, 512)
        self.layer4 = nn.Linear(512, 512)
        self.layer5 = nn.Linear(512, n_labels)
        self.dropout = nn.Dropout(p=0.5)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.dropout(x)
        x = torch.relu(self.layer2(x))
        x = self.dropout(x)
        x = torch.relu(self.layer3(x))
        x = self.dropout(x)
        x = torch.relu(self.layer4(x))
        x = self.dropout(x)
        x = torch.sigmoid(self.layer5(x))  # sigmoid so each output is an independent per-label probability
        return x
model = MultiLabelClassifier(n_features, n_labels)
loss_fn = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
def binary_accuracy(output, target, threshold=0.5):
    """Compute per-sample accuracy for multi-label classification"""
    # Threshold the output probabilities into binary predictions
    preds = (output > threshold).float()
    # Compare predictions with the ground-truth labels
    correct = (preds == target).float()
    # Accuracy per sample = fraction of labels predicted correctly
    acc = correct.sum(1) / target.size(1)
    # Return the mean accuracy over the batch
    return acc.mean()
epochs = 20
best = 0
# Train and evaluate, saving the parameters of the best-performing model
for epoch in range(epochs):
    total_acc = 0
    total_loss = 0
    total_acc_val = 0
    total_loss_val = 0

    model.train()
    for batch_inputs, batch_labels in train_loader:
        optimizer.zero_grad()
        outputs = model(batch_inputs)
        loss = loss_fn(outputs, batch_labels)
        acc = binary_accuracy(outputs, batch_labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        total_acc += acc.item()
    train_loss, train_acc = total_loss / len(train_loader), total_acc / len(train_loader)
    if train_acc >= best:
        best = train_acc
        torch.save(model.state_dict(), 'best.pth')
        print("save best model param")
    print(f'Epoch {epoch+1}/{epochs}, train_Loss: {train_loss:.4f}, train_Accuracy: {train_acc:.4f}')

    model.eval()
    with torch.no_grad():
        for batch_inputs, batch_labels in val_loader:
            outputs = model(batch_inputs)
            loss_val = loss_fn(outputs, batch_labels)
            acc_val = binary_accuracy(outputs, batch_labels)
            total_loss_val += loss_val.item()
            total_acc_val += acc_val.item()
    average_loss = total_loss_val / len(val_loader)
    average_acc = total_acc_val / len(val_loader)
    print(f'val_loss:{average_loss:.4f}, val_acc:{average_acc:.4f}')
best_net = MultiLabelClassifier(n_features, n_labels)
best_net.load_state_dict(torch.load('best.pth'))
best_net.eval()  # disable dropout at inference time

test_tensor = torch.from_numpy(test_data.values).float()
with torch.no_grad():
    preds = (best_net(test_tensor) > 0.5).int()
preds = preds.numpy()
submission_new = pd.DataFrame(preds, columns=['Pastry',
                                              'Z_Scratch',
                                              'K_Scatch',
                                              'Stains',
                                              'Dirtiness',
                                              'Bumps',
                                              'Other_Faults'])
submission_new.insert(0, 'id', submission_id)
submission_new.to_csv('pytorch test.csv', index=False)
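One design note: the network applies torch.sigmoid in forward and pairs it with nn.BCELoss. A more numerically stable alternative (a sketch, not part of the original code) is to return raw logits from the last layer and use nn.BCEWithLogitsLoss, which fuses the sigmoid into the loss:

# Alternative sketch: have forward() return self.layer5(x) without the sigmoid, then:
loss_fn = nn.BCEWithLogitsLoss()
# At inference, apply the sigmoid explicitly before thresholding:
preds = (torch.sigmoid(best_net(test_tensor)) > 0.5).int()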