
Study Notes 13: Fine-Tuning a Model

Date: 2024-06-04 09:33:18
Tags: acc, loss, 13, torch, fine-tuning, notes, epoch, test, model

Reposted from: https://www.cnblogs.com/miraclepbc/p/14360807.html

Pretrained ResNet model

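Unlike the VGG model from the earlier notes, the ResNet model requires us to overwrite the final fully connected layer directly.
First, take a look at the structure of the ResNet model: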

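We first set requires_grad = False on all parameters,
then define a new fc layer and use it to overwrite the original one.
The parameters of the newly defined fc layer have requires_grad = True by default.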

 
for p in model.parameters():
    p.requires_grad = False  # freeze all pretrained parameters

in_f = model.fc.in_features
model.fc = nn.Linear(in_f, 4)
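It can be worth confirming that the freeze actually took effect by listing the parameters that are still trainable. The sketch below uses `TinyNet`, a hypothetical stand-in for the pretrained network (not part of the original notes), so that it runs quickly. Such a check is also a cheap guard against a misspelled attribute name, which Python would accept silently.

```python
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained backbone with an `fc` head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(16, 8), nn.ReLU())
        self.fc = nn.Linear(8, 10)

    def forward(self, x):
        return self.fc(self.features(x))

model = TinyNet()

# Freeze every existing parameter.
for p in model.parameters():
    p.requires_grad = False

# Replace the head; a freshly created layer is trainable by default.
model.fc = nn.Linear(model.fc.in_features, 4)

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['fc.weight', 'fc.bias']
```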

When defining the optimizer, note that the parameters passed in are those of the fc layer only, not those of all layers.

optimizer = torch.optim.Adam(model.fc.parameters(), lr = 0.001)
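To illustrate the effect of passing only the head's parameters, here is a small sketch with two hypothetical layers, `backbone` and `head`, standing in for the frozen pretrained layers and the new fc layer. After one optimisation step, the frozen weights should be unchanged while the head's weights have moved.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
backbone = nn.Linear(16, 8)  # hypothetical stand-in for the frozen pretrained layers
head = nn.Linear(8, 4)       # hypothetical stand-in for the new fc layer

for p in backbone.parameters():
    p.requires_grad = False

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=0.001)

before_backbone = backbone.weight.detach().clone()
before_head = head.weight.detach().clone()

# One training step on random data.
x = torch.randn(5, 16)
y = torch.randint(0, 4, (5,))
loss = nn.functional.cross_entropy(head(backbone(x)), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone.weight, before_backbone)
head_changed = not torch.equal(head.weight, before_head)
print(backbone_unchanged, head_changed)
```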

Fine-tuning

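The general steps of fine-tuning are:

  • Redefine the fully connected layer
  • Train the redefined fully connected layer
  • Unfreeze some of the other layers
  • Train the whole model
    Note: fine-tuning can only start after the new fully connected layer has been trained; otherwise the large gradients from a randomly initialised head would disturb the pretrained weights. In effect, the model is trained twice.
    At this point the optimizer takes the parameters of the whole model.
    Code (here every layer is unfrozen):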
for param in model.parameters():
    param.requires_grad = True

extend_epoch = 30
optimizer = torch.optim.Adam(model.parameters(), lr = 0.0001)
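The step list mentions unfreezing only some of the other layers. One possible way to do that is to select parameters by name prefix. The sketch below uses `TinyResNetLike`, a hypothetical toy model whose block names `layer3`, `layer4` and `fc` mirror those of torchvision's ResNet; the choice of which blocks to unfreeze is an assumption made for illustration.

```python
import torch
import torch.nn as nn

class TinyResNetLike(nn.Module):
    """Hypothetical toy model; block names mirror torchvision's ResNet."""
    def __init__(self):
        super().__init__()
        self.layer3 = nn.Linear(16, 16)
        self.layer4 = nn.Linear(16, 8)
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        x = torch.relu(self.layer3(x))
        x = torch.relu(self.layer4(x))
        return self.fc(x)

model = TinyResNetLike()

# Unfreeze only the last block and the head; everything else stays frozen.
for name, p in model.named_parameters():
    p.requires_grad = name.startswith(("layer4", "fc"))

# Hand the optimizer only the parameters that are meant to be updated.
params_to_update = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(params_to_update, lr=0.0001)

unfrozen = [name for name, p in model.named_parameters() if p.requires_grad]
print(unfrozen)  # ['layer4.weight', 'layer4.bias', 'fc.weight', 'fc.bias']
```

Keeping the earlier blocks frozen reduces memory use and training time, at the cost of less flexibility in adapting low-level features.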

Full code

import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import datasets, transforms, models
import os
import shutil
%matplotlib inline

train_transform = transforms.Compose([
    transforms.Resize(224),
    transforms.RandomCrop(192),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(0.2),
    transforms.ColorJitter(brightness = 0.5),
    transforms.ColorJitter(contrast = 0.5),
    transforms.ToTensor(),
    transforms.Normalize(mean = [0.5, 0.5, 0.5], std = [0.5, 0.5, 0.5])
])
test_transform = transforms.Compose([
    transforms.Resize((192, 192)),
    transforms.ToTensor(),
    transforms.Normalize(mean = [0.5, 0.5, 0.5], std = [0.5, 0.5, 0.5])
])
train_ds = datasets.ImageFolder(
    "E:/datasets2/29-42/29-42/dataset2/4weather/train",
    transform = train_transform
)
test_ds = datasets.ImageFolder(
    "E:/datasets2/29-42/29-42/dataset2/4weather/test",
    transform = test_transform
)
train_dl = torch.utils.data.DataLoader(train_ds, batch_size = 8, shuffle = True)
test_dl = torch.utils.data.DataLoader(test_ds, batch_size = 8)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

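model = models.resnet101(pretrained = True)
# Freeze all pretrained parameters
for p in model.parameters():
    p.requires_grad = False
# Replace the head with a new 4-class layer (trainable by default)
in_f = model.fc.in_features
model.fc = nn.Linear(in_f, 4)
# The inputs are moved to `device` in fit(), so the model must live there too
model = model.to(device)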

loss_func = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr = 0.001)
epochs = 30
exp_lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size = 7, gamma = 0.1)

def fit(epoch, model, trainloader, testloader):
    correct = 0
    total = 0
    running_loss = 0
    
    model.train()
    for x, y in trainloader:
        x, y = x.to(device), y.to(device)
        y_pred = model(x)
        loss = loss_func(y_pred, y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():
            y_pred = torch.argmax(y_pred, dim = 1)
            correct += (y_pred == y).sum().item()
            total += y.size(0)
            running_loss += loss.item()

    exp_lr_scheduler.step()
    
    epoch_acc = correct / total
    epoch_loss = running_loss / len(trainloader.dataset)
    
    test_correct = 0
    test_total = 0
    test_running_loss = 0
    
    model.eval()
    with torch.no_grad():
        for x, y in testloader:
            x, y = x.to(device), y.to(device)
            y_pred = model(x)
            loss = loss_func(y_pred, y)
            y_pred = torch.argmax(y_pred, dim = 1)
            test_correct += (y_pred == y).sum().item()
            test_total += y.size(0)
            test_running_loss += loss.item()
    epoch_test_acc = test_correct / test_total
    epoch_test_loss = test_running_loss / len(testloader.dataset)
    
    print('epoch: ', epoch, 
          'loss: ', round(epoch_loss, 3),
          'accuracy: ', round(epoch_acc, 3),
          'test_loss: ', round(epoch_test_loss, 3),
          'test_accuracy: ', round(epoch_test_acc, 3))
    
    return epoch_loss, epoch_acc, epoch_test_loss, epoch_test_acc

train_loss = []
train_acc = []
test_loss = []
test_acc = []
for epoch in range(epochs):
    epoch_loss, epoch_acc, epoch_test_loss, epoch_test_acc = fit(epoch, model, train_dl, test_dl)
    train_loss.append(epoch_loss)
    train_acc.append(epoch_acc)
    test_loss.append(epoch_test_loss)
    test_acc.append(epoch_test_acc)

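# Phase 2: unfreeze every layer and fine-tune the whole model with a smaller learning rate
for param in model.parameters():
    param.requires_grad = True
extend_epoch = 30
optimizer = torch.optim.Adam(model.parameters(), lr = 0.0001)
# exp_lr_scheduler is still bound to the previous optimizer, so this learning rate stays fixed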
train_loss = []
train_acc = []
test_loss = []
test_acc = []
for epoch in range(extend_epoch):
    epoch_loss, epoch_acc, epoch_test_loss, epoch_test_acc = fit(epoch, model, train_dl, test_dl)
    train_loss.append(epoch_loss)
    train_acc.append(epoch_acc)
    test_loss.append(epoch_test_loss)
    test_acc.append(epoch_test_acc)
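The `StepLR` scheduler used in the first training phase multiplies the learning rate by `gamma` every `step_size` epochs. The sketch below records that schedule for the same settings (initial rate 0.001, `step_size = 7`, `gamma = 0.1`) on a single dummy parameter, which is an assumption made so the example runs without a dataset.

```python
import torch

# Dummy parameter standing in for the fc layer's weights.
param = torch.nn.Parameter(torch.ones(1))
optimizer = torch.optim.Adam([param], lr=0.001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)

lrs = []
for epoch in range(30):
    # Learning rate in effect during this epoch.
    lrs.append(optimizer.param_groups[0]["lr"])

    # Minimal stand-in for one epoch of training.
    optimizer.zero_grad()
    (param ** 2).sum().backward()
    optimizer.step()

    # One scheduler step per epoch, after the optimizer step.
    scheduler.step()

# The rate drops by a factor of 10 at epochs 7, 14, 21 and 28.
for epoch in (0, 7, 14, 21, 28):
    print(epoch, lrs[epoch])
```

By the last epochs of a 30-epoch run the rate is about four orders of magnitude below its starting value, so most of the learning happens in the first two or three steps of the schedule.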

From: https://www.cnblogs.com/gongzb/p/18230156
