Putting Open-Source Models into Practice: qwen1.5-7b-chat LoRA Fine-Tuning (Part 2)

Tags: 7b, lora, model, chat, LoRA, qwen1.5

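1. Introduction

A pretrained model provides general-purpose capabilities and may fall short on problems from a specific domain. Fine-tuning adapts the model to that domain so that it handles those specific problems better.

This article is the follow-up to "Putting Open-Source Models into Practice: qwen-7b-chat LoRA Fine-Tuning (Part 1)" and covers how to fine-tune the latest 1.5 series of the Tongyi Qianwen (Qwen) models.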

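2. Terminology

2.1. LoRA fine-tuning

LoRA (Low-Rank Adaptation) is a method for fine-tuning large language models (LLMs). It freezes the pretrained weights and trains only small low-rank matrices added alongside them. This sharply reduces the number of trainable parameters for downstream tasks while preserving model quality, and because the trained matrices can be merged back into the original weights, it adds no inference latency.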
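The low-rank idea can be sketched in a few lines of plain Python. This is only a toy illustration, not the implementation used by peft: the 4 x 4 matrices are made up, and the 4096 x 4096 layer with rank 8 used for the parameter count is an assumed example size.

```python
# Toy sketch of LoRA: a frozen weight W (d x k) is adapted by a low-rank
# product B @ A, where B is d x r and A is r x k, with r much smaller than d and k.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def lora_param_counts(d, k, r):
    """Return (params of the full d x k weight, trainable params of a rank-r adapter)."""
    return d * k, r * (d + k)

# Made-up toy values: identity weight, rank-1 adapter, one input column vector.
W = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
B = [[1], [2], [0], [0]]   # d x r
A = [[0, 1, 0, 1]]         # r x k
x = [[1], [2], [3], [4]]

# During training the adapter runs as a side branch: W x + B (A x).
side_branch = add(matmul(W, x), matmul(B, matmul(A, x)))

# For deployment the update can be merged once: W' = W + B A, then W' x.
merged = matmul(add(W, matmul(B, A)), x)

print(side_branch == merged)            # True: merging does not change the output
print(lora_param_counts(4096, 4096, 8)) # (16777216, 65536)
```

For the assumed layer size, the adapter trains 65,536 values instead of 16,777,216, and the merged weight has the same shape as the original, which is why no extra work is needed at inference time.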

2.2. Qwen1.5

The official model card describes it as follows: Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data. In comparison with the previously released Qwen, the improvements include:

6 model sizes, including 0.5B, 1.8B, 4B, 7B, 14B, and 72B;
Significant performance improvement in human preference for chat models;
Multilingual support of both base and chat models;
Stable support of 32K context length for models of all sizes;
No need of trust_remote_code.

For more details, refer to the Qwen team's blog post and GitHub repo.

3. Setting Up the Environment

3.1. Base environment

Operating system: CentOS 7
GPU: Tesla V100-SXM2-32GB, CUDA Version: 12.2

3.2. Download the qwen1.5-7b-chat model

Option 1: download from Hugging Face

https://huggingface.co/Qwen/Qwen1.5-7B-Chat/tree/main

Option 2: download from ModelScope

git clone https://www.modelscope.cn/qwen/Qwen1.5-7B-Chat.git

Place the downloaded model in the /model directory and rename it to qwen1.5-7b-chat.

3.3. Download the qwen1.5 project

Option 1: download directly

URL: https://github.com/QwenLM/Qwen1.5

Option 2: clone the project with git

git clone https://github.com/QwenLM/Qwen1.5.git

Place the downloaded project in the /service directory and rename it to qwen1.5.

The file structure of the qwen1.5 project is as follows: [screenshot of the directory listing]

3.4. Install dependencies

Option 1: fresh installation

conda create --name qwen1.5 python=3.10
conda activate qwen1.5
pip install -r /service/qwen1.5/requirements.txt
pip install peft

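The contents of requirements.txt are listed below. Note that Qwen1.5 models need transformers>=4.37.0; older versions fail with KeyError: 'qwen2'.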

transformers>=4.32.0,<4.38.0
accelerate
tiktoken
einops
transformers_stream_generator==0.0.4
scipy

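Option 2: upgrade an existing virtual environment

This applies if you already built the virtual environment by following "Putting Open-Source Models into Practice: qwen-7b-chat LoRA Fine-Tuning (Part 1)".

conda create --name qwen1.5 --clone qwen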
conda activate qwen1.5
pip install --upgrade transformers==4.38.1

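After cloning the environment: [screenshot]

After a successful upgrade: [screenshot]

If installation is slow, you can specify a third-party package index, for example by passing pip's -i option with a mirror such as https://pypi.tuna.tsinghua.edu.cn/simple.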

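3.5. Prepare the training data

Sample data is shown below. The strings are left in Chinese because they are the actual training samples; they teach the model to introduce itself as "叮当猫" (Dingdang Cat).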

[
    {
        "instruction": "您是谁",
        "output": "我是人见人爱，车见车载的叮当猫，我非常乐意解决您的问题。"
    },
    {
        "instruction": "您身份是啥",
        "output": "我是人见人爱，车见车载的叮当猫。"
    },
    {
        "instruction": "你的名字是什么",
        "output": "我是叮当猫。"
    }
]

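Place the prepared data in the /service/qwen1.5/data directory (the path that DATA points to in the fine-tuning script) and rename it as needed. Here the file name is kept as lora_dataset.json.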
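Before training, it can help to check that every record has the two fields shown in the sample. The following is a minimal sketch; validate_records and save_records are hypothetical helpers written for this article, not part of the Qwen1.5 project, and the second sample record is deliberately broken to show the report.

```python
import json

REQUIRED_KEYS = ("instruction", "output")  # field names taken from the sample data

def validate_records(records):
    """Return a list of human-readable problems; an empty list means the data looks fine."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of records"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: expected an object, got {type(rec).__name__}")
            continue
        for key in REQUIRED_KEYS:
            value = rec.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {i}: missing or empty '{key}'")
    return problems

def save_records(records, path):
    """Write records as UTF-8 JSON, keeping Chinese characters readable in the file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f, ensure_ascii=False, indent=4)

sample = [
    {"instruction": "您是谁", "output": "我是叮当猫。"},
    {"instruction": "你的名字是什么", "output": ""},  # deliberately invalid
]
print(validate_records(sample))  # ["record 1: missing or empty 'output'"]
```

Once validate_records returns an empty list, save_records(records, "/service/qwen1.5/data/lora_dataset.json") would write the file to the path used in this article.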

3.6. Modify the fine-tuning script

Edit /service/qwen1.5/examples/sft/finetune.sh:

#!/bin/bash
export CUDA_DEVICE_MAX_CONNECTIONS=1
DIR=`pwd`

# Guide:
# This script supports distributed training on multi-gpu workers (as well as single-worker training).
# Please set the options below according to the comments.
# For multi-gpu workers training, these options should be manually set for each worker.
# After setting the options, please run the script on each worker.

# Number of GPUs per GPU worker
GPUS_PER_NODE=$(python -c 'import torch; print(torch.cuda.device_count())')

# Number of GPU workers, for single-worker training, please set to 1
NNODES=${NNODES:-1}

# The rank of this worker, should be in {0, ..., NNODES-1}, for single-worker training, please set to 0
NODE_RANK=${NODE_RANK:-0}

# The ip address of the rank-0 worker, for single-worker training, please set to localhost
MASTER_ADDR=${MASTER_ADDR:-localhost}

# The port for communication
MASTER_PORT=${MASTER_PORT:-6001}

MODEL="/model/qwen1.5-7b-chat" # Set the path if you do not want to load from huggingface directly
# ATTENTION: specify the path to your training data, which should be a json file consisting of a list of conversations.
# See the section for finetuning in README for more information.
DATA="/service/qwen1.5/data/lora_dataset.json"
DS_CONFIG_PATH="finetune/ds_config_zero3.json"
USE_LORA=True # enable LoRA; with False the script runs full-parameter fine-tuning
Q_LORA=False

function usage() {
    echo '
Usage: bash examples/sft/finetune.sh [-m MODEL_PATH] [-d DATA_PATH] [--deepspeed DS_CONFIG_PATH] [--use_lora USE_LORA] [--q_lora Q_LORA]
'
}

while [[ "$1" != "" ]]; do
    case $1 in
        -m | --model )
            shift
            MODEL=$1
            ;;
        -d | --data )
            shift
            DATA=$1
            ;;
        --deepspeed )
            shift
            DS_CONFIG_PATH=$1
            ;;
        --use_lora  )
            shift
            USE_LORA=$1
            ;;
        --q_lora    )
            shift
            Q_LORA=$1
            ;;
        -h | --help )
            usage
            exit 0
            ;;
        * )
            echo "Unknown argument ${1}"
            exit 1
            ;;
    esac
    shift
done

DISTRIBUTED_ARGS="
    --nproc_per_node $GPUS_PER_NODE \
    --nnodes $NNODES \
    --node_rank $NODE_RANK \
    --master_addr $MASTER_ADDR \
    --master_port $MASTER_PORT
"

torchrun $DISTRIBUTED_ARGS /service/qwen1.5/examples/sft/finetune.py \
    --model_name_or_path $MODEL \
    --data_path $DATA \
    --output_dir /model/fine-tuning/lora-qwen1.5-7b-chat \
    --num_train_epochs 2 \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 10 \
    --save_total_limit 10 \
    --learning_rate 3e-4 \
    --weight_decay 0.01 \
    --adam_beta2 0.95 \
    --warmup_ratio 0.01 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --report_to "none" \
    --model_max_length 512 \
    --lazy_preprocess True \
    --use_lora ${USE_LORA} \
    --gradient_checkpointing

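Pay particular attention to the following parameters:

num_train_epochs: the total number of training epochs. One epoch is one full pass of the training dataset through the model. Raising num_train_epochs lengthens training and gives the model more chances to learn the patterns and features in the training data. Setting it too high can cause overfitting, where the model performs well on the training data but poorly on unseen data.

per_device_train_batch_size: the training batch size on each device (such as a GPU), that is, the number of samples each device processes in one forward/backward pass.

gradient_accumulation_steps: the number of mini-batches whose gradients are accumulated before a single parameter update. With a value of 2, for example, gradients are computed on each of two mini-batches and summed, and the parameters are updated once afterwards. This is equivalent to training with twice the batch size while only one mini-batch has to fit in memory at a time.

model_max_length: the maximum input length, in tokens, that the model accepts; anything beyond it is truncated. The limit exists to control compute cost and memory consumption, so that training or inference stays efficient when resources are constrained. It also affects output quality: truncating text to a short length can drop context, which may make the model's output incomplete or inaccurate.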
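How the batch-related parameters combine can be worked out with a short calculation. The sketch below uses the values from finetune.sh (batch size 2, accumulation 8, 2 epochs); the single GPU and the 1,000-sample dataset are assumptions for illustration, and the step count is only approximate because trainers differ in how they round a partial final batch.

```python
import math

def effective_batch_size(per_device_batch, accumulation_steps, num_gpus):
    """Number of samples that contribute to one optimizer update."""
    return per_device_batch * accumulation_steps * num_gpus

def approx_optimizer_steps(num_samples, per_device_batch, accumulation_steps, num_gpus, epochs):
    """Rough count of optimizer updates over the whole run (at least one per epoch)."""
    batch = effective_batch_size(per_device_batch, accumulation_steps, num_gpus)
    per_epoch = max(1, math.ceil(num_samples / batch))
    return per_epoch * epochs

# per_device_train_batch_size=2, gradient_accumulation_steps=8, num_train_epochs=2
# come from the script; num_gpus=1 and 1000 samples are assumed.
ebs = effective_batch_size(per_device_batch=2, accumulation_steps=8, num_gpus=1)
steps = approx_optimizer_steps(1000, 2, 8, 1, epochs=2)
print(ebs, steps)  # 16 126
```

Halving per_device_train_batch_size while doubling gradient_accumulation_steps keeps the effective batch size at 16 but lowers the memory needed per forward/backward pass.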

4. Deploying the Service

Fine-tuning is used here to change the model's self-identity.

4.1. Start training

source /opt/anaconda3/bin/activate qwen1.5
nohup bash /service/qwen1.5/examples/sft/finetune.sh > output.txt 2>&1 &

After running the command: [screenshot of the training output]

GPU usage: [screenshot]
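Because the job runs in the background, the loss has to be read from output.txt. The sketch below assumes the trainer writes each logging step as a Python-style dict such as {'loss': ..., 'learning_rate': ..., 'epoch': ...}, which is how the Hugging Face Trainer usually prints metrics; the exact keys can vary by version. extract_losses is a hypothetical helper, and the example lines are made up for illustration, not taken from a real run.

```python
import ast
import re

# Matches a metrics dict that starts with the 'loss' key.
LOG_DICT = re.compile(r"\{'loss':.*?\}")

def extract_losses(lines):
    """Collect (epoch, loss) pairs from lines that contain a logged metrics dict."""
    points = []
    for line in lines:
        match = LOG_DICT.search(line)
        if not match:
            continue
        try:
            metrics = ast.literal_eval(match.group(0))
        except (ValueError, SyntaxError):
            continue  # skip text that only looks like a metrics dict
        if isinstance(metrics, dict) and "loss" in metrics:
            points.append((metrics.get("epoch"), metrics["loss"]))
    return points

# Made-up lines in the assumed format
example = [
    "  5%|#         | 1/20 [00:03<01:10,  3.7s/it]",
    "{'loss': 2.5, 'learning_rate': 0.0003, 'epoch': 0.1}",
    "{'loss': 1.25, 'learning_rate': 0.00029, 'epoch': 0.2}",
]
print(extract_losses(example))  # [(0.1, 2.5), (0.2, 1.25)]
```

To use it on the real log, open output.txt with encoding="utf-8" and errors="replace" and pass the file object to extract_losses; a steadily falling loss suggests training is progressing.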

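5. Additional Notes

5.1. Insufficient GPU resources

Reduce per_device_train_batch_size and model_max_length as far as you can. gradient_accumulation_steps has little effect on peak memory, so raise it when you lower the per-device batch size if you want to keep the effective batch size unchanged.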

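5.2. Problem: raise JSONDecodeError("Expecting value", s, err.value) from None

Cause: the data-loading code in the fine-tuning script (highlighted with a red box in the original screenshot) is flawed. It does not support the data format used for the Qwen1 series, and must be changed to load the whole file with json.load.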
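One way to make the loader tolerant of both layouts is sketched below. It assumes the failure comes from parsing the file one line at a time: a pretty-printed JSON array then hands the parser a lone "[" on the first line, which is consistent with the "Expecting value" message. parse_training_text and load_training_records are hypothetical helpers, not functions from the Qwen1.5 repository.

```python
import json

def parse_training_text(text):
    """Parse training data given either as one JSON array or as JSON Lines."""
    text = text.strip()
    if not text:
        return []
    try:
        data = json.loads(text)  # whole-document parse, the same thing json.load does
    except json.JSONDecodeError:
        # Fall back to one JSON object per non-empty line (JSON Lines).
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    return data if isinstance(data, list) else [data]

def load_training_records(path):
    """Read a UTF-8 file and parse it with parse_training_text."""
    with open(path, "r", encoding="utf-8") as f:
        return parse_training_text(f.read())

array_text = '[\n  {"instruction": "a", "output": "b"},\n  {"instruction": "c", "output": "d"}\n]'
jsonl_text = '{"instruction": "a", "output": "b"}\n{"instruction": "c", "output": "d"}\n'
print(parse_training_text(array_text) == parse_training_text(jsonl_text))  # True
```

Trying the whole-file parse first keeps the common case simple, and the line-by-line fallback only runs when the file is not a single JSON document.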

5.3. Problem: TypeError: Accelerator.__init__() got an unexpected keyword argument 'use_seedable_sampler'

Cause: the installed accelerate version is too old (0.23.0 in this environment).

Upgrade it:

pip install --upgrade accelerate==0.27.2
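To check which accelerate version is active before and after the upgrade, a small script along these lines may help. The 0.27.2 target is simply the version this article upgraded to, not a documented minimum, and version_tuple is a simplified hypothetical comparison that ignores suffixes such as "rc1".

```python
from importlib import metadata

def version_tuple(version):
    """Turn '0.27.2' into (0, 27, 2); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def is_older(installed, target):
    """True if the installed version sorts before the target version."""
    return version_tuple(installed) < version_tuple(target)

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

current = installed_version("accelerate")
if current is None:
    print("accelerate is not installed")
elif is_older(current, "0.27.2"):
    print(f"accelerate {current} is older than 0.27.2; consider upgrading")
else:
    print(f"accelerate {current} is at least 0.27.2")
```

Run it inside the activated qwen1.5 environment so that it inspects the same packages the training script will import.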

5.4. Problem: AttributeError: 'Qwen2Tokenizer' object has no attribute 'eod_id'

Cause: code from the Qwen1 series cannot be reused here:

https://github.com/QwenLM/Qwen

Use the code from the Qwen1.5 series instead: https://github.com/QwenLM/Qwen1.5

From: https://blog.csdn.net/qq839019311/article/details/137097181
