
[Basic Island · Stage 6] Evaluating InternLM-1.8B with OpenCompass

Posted: 2024-09-27 20:50:37
Tags: internlm2, py, 1.8, opencompass, InternLM, chat, OpenCompass, hf, internlm


1. Overview

Evaluating a model in OpenCompass typically involves the following stages: configure -> infer -> evaluate -> visualize.

Configure: the starting point of the whole workflow. You configure the entire evaluation run, selecting the models and datasets to evaluate; you can also choose the evaluation strategy and compute backend, and define how results are displayed.
Infer and evaluate: in this stage, OpenCompass runs inference and evaluation on the models and datasets in parallel. Inference has the model produce outputs from the dataset, while evaluation measures how well those outputs match the reference answers. Both processes are split into multiple concurrently running "tasks" for efficiency.
Visualize: once evaluation finishes, OpenCompass collects the results into readable tables and saves them as CSV and TXT files.
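The four stages above can be sketched as a toy pipeline in plain Python. This is purely illustrative (it is not the OpenCompass API; all function names and the tiny two-question dataset are made up) but it shows how the pieces hand off to each other:

```python
# Toy sketch of configure -> infer -> evaluate -> visualize (not OpenCompass API).

def configure():
    # Pick a model and a dataset of (question, reference answer) pairs.
    return {
        "model": "internlm2-chat-1_8b",
        "dataset": [("1+1=?", "2"), ("2+2=?", "4")],
    }

def infer(cfg):
    # Stand-in for model generation: echo the reference answer for the demo.
    return [gold for _, gold in cfg["dataset"]]

def evaluate(cfg, predictions):
    # Score predictions against the reference answers.
    golds = [g for _, g in cfg["dataset"]]
    return sum(p == g for p, g in zip(predictions, golds)) / len(golds)

def visualize(score):
    # OpenCompass writes tables to CSV/TXT; here we just format a line.
    return f"accuracy: {score:.1%}"

cfg = configure()
print(visualize(evaluate(cfg, infer(cfg))))  # accuracy: 100.0%
```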

2. Environment Setup

2.1 Create the dev machine and conda environment

On the dev-machine creation page, select the Cuda11.7-conda image, choose a 10% slice of an A100 GPU, and create the dev machine.

2.2 Installation (GPU-oriented environment)

conda create -n opencompass python=3.10
conda activate opencompass
conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia -y

# Note: be sure to cd to /root first
cd /root
git clone -b 0.2.4 https://github.com/open-compass/opencompass
cd opencompass
pip install -e .


apt-get update
apt-get install cmake
pip install -r requirements.txt
pip install protobuf

3. Data Preparation

3.1 Evaluation dataset

Copy the evaluation dataset into the opencompass directory and unzip it:

cp /share/temp/datasets/OpenCompassData-core-20231110.zip /root/opencompass/
unzip OpenCompassData-core-20231110.zip

3.2 InternLM and C-Eval configuration files

List all configurations related to InternLM and C-Eval:

python tools/list_configs.py internlm ceval

You should see:
+----------------------------------------+----------------------------------------------------------------------+
| Model | Config Path |
|----------------------------------------+----------------------------------------------------------------------|
| hf_internlm2_1_8b | configs/models/hf_internlm/hf_internlm2_1_8b.py |
| hf_internlm2_20b | configs/models/hf_internlm/hf_internlm2_20b.py |
| hf_internlm2_7b | configs/models/hf_internlm/hf_internlm2_7b.py |
| hf_internlm2_base_20b | configs/models/hf_internlm/hf_internlm2_base_20b.py |
| hf_internlm2_base_7b | configs/models/hf_internlm/hf_internlm2_base_7b.py |
| hf_internlm2_chat_1_8b | configs/models/hf_internlm/hf_internlm2_chat_1_8b.py |
| hf_internlm2_chat_1_8b_sft | configs/models/hf_internlm/hf_internlm2_chat_1_8b_sft.py |
| hf_internlm2_chat_20b | configs/models/hf_internlm/hf_internlm2_chat_20b.py |
| hf_internlm2_chat_20b_sft | configs/models/hf_internlm/hf_internlm2_chat_20b_sft.py |
| hf_internlm2_chat_20b_with_system | configs/models/hf_internlm/hf_internlm2_chat_20b_with_system.py |
| hf_internlm2_chat_7b | configs/models/hf_internlm/hf_internlm2_chat_7b.py |
| hf_internlm2_chat_7b_sft | configs/models/hf_internlm/hf_internlm2_chat_7b_sft.py |
| hf_internlm2_chat_7b_with_system | configs/models/hf_internlm/hf_internlm2_chat_7b_with_system.py |
| hf_internlm2_chat_math_20b | configs/models/hf_internlm/hf_internlm2_chat_math_20b.py |
| hf_internlm2_chat_math_20b_with_system | configs/models/hf_internlm/hf_internlm2_chat_math_20b_with_system.py |
| hf_internlm2_chat_math_7b | configs/models/hf_internlm/hf_internlm2_chat_math_7b.py |
| hf_internlm2_chat_math_7b_with_system | configs/models/hf_internlm/hf_internlm2_chat_math_7b_with_system.py |
| hf_internlm_20b | configs/models/hf_internlm/hf_internlm_20b.py |
| hf_internlm_7b | configs/models/hf_internlm/hf_internlm_7b.py |
| hf_internlm_chat_20b | configs/models/hf_internlm/hf_internlm_chat_20b.py |
| hf_internlm_chat_7b | configs/models/hf_internlm/hf_internlm_chat_7b.py |
| hf_internlm_chat_7b_8k | configs/models/hf_internlm/hf_internlm_chat_7b_8k.py |
| hf_internlm_chat_7b_v1_1 | configs/models/hf_internlm/hf_internlm_chat_7b_v1_1.py |
| internlm_7b | configs/models/internlm/internlm_7b.py |
| ms_internlm_chat_7b_8k | configs/models/ms_internlm/ms_internlm_chat_7b_8k.py |
+----------------------------------------+----------------------------------------------------------------------+
+--------------------------------+-------------------------------------------------------------------+
| Dataset | Config Path |
|--------------------------------+-------------------------------------------------------------------|
| ceval_clean_ppl | configs/datasets/ceval/ceval_clean_ppl.py |
| ceval_contamination_ppl_810ec6 | configs/datasets/contamination/ceval_contamination_ppl_810ec6.py |
| ceval_gen | configs/datasets/ceval/ceval_gen.py |
| ceval_gen_2daf24 | configs/datasets/ceval/ceval_gen_2daf24.py |
| ceval_gen_5f30c7 | configs/datasets/ceval/ceval_gen_5f30c7.py |
| ceval_ppl | configs/datasets/ceval/ceval_ppl.py |
| ceval_ppl_1cd8bf | configs/datasets/ceval/ceval_ppl_1cd8bf.py |
| ceval_ppl_578f8d | configs/datasets/ceval/ceval_ppl_578f8d.py |
| ceval_ppl_93e5ce | configs/datasets/ceval/ceval_ppl_93e5ce.py |
| ceval_zero_shot_gen_bd40ef | configs/datasets/ceval/ceval_zero_shot_gen_bd40ef.py |
| configuration_internlm | configs/datasets/cdme/internlm2-chat-7b/configuration_internlm.py |
| modeling_internlm2 | configs/datasets/cdme/internlm2-chat-7b/modeling_internlm2.py |
| tokenization_internlm | configs/datasets/cdme/internlm2-chat-7b/tokenization_internlm.py |
+--------------------------------+-------------------------------------------------------------------+

4. Run the Evaluation

4.1 Evaluating via command-line arguments

Open configs/models/hf_internlm/hf_internlm2_chat_1_8b.py under the opencompass folder and paste in the following code:

from opencompass.models import HuggingFaceCausalLM


models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='internlm2-1.8b-hf',
        path="/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b",
        tokenizer_path='/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b',
        model_kwargs=dict(
            trust_remote_code=True,
            device_map='auto',
        ),
        tokenizer_kwargs=dict(
            padding_side='left',
            truncation_side='left',
            use_fast=False,
            trust_remote_code=True,
        ),
        max_out_len=100,
        min_out_len=1,
        max_seq_len=2048,
        batch_size=8,
        run_cfg=dict(num_gpus=1, num_procs=1),
    )
]
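One detail worth noting in the config above: max_seq_len bounds the total sequence length while max_out_len reserves tokens for generation, so the space left for the few-shot prompt is roughly their difference (a simplification; the exact truncation behavior is OpenCompass's, this is just the arithmetic):

```python
# Prompt budget implied by the model config above.
max_seq_len = 2048   # total tokens the model will see
max_out_len = 100    # tokens reserved for the generated answer

prompt_budget = max_seq_len - max_out_len
print(prompt_budget)  # 1948 tokens left for the few-shot prompt
```

If C-Eval prompts were ever truncated, raising max_seq_len (or lowering max_out_len) would be the knobs to adjust here.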

Once OpenCompass is installed and the dataset is prepared as above, you can evaluate the InternLM2-Chat-1.8B model on the C-Eval dataset with the command below. Since OpenCompass launches evaluation tasks in parallel by default, it is worth starting the first run in --debug mode to check for problems: in --debug mode, tasks execute sequentially and print their output in real time.

# Environment variable configuration
export MKL_SERVICE_FORCE_INTEL=1
# or
export MKL_THREADING_LAYER=GNU

Run the evaluation:

python run.py --datasets ceval_gen --models hf_internlm2_chat_1_8b --debug

The run fails with the following error:

A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.0.1 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

Traceback (most recent call last):
  File "/root/opencompass/run.py", line 1, in <module>
    from opencompass.cli.main import main
  File "/root/opencompass/opencompass/cli/main.py", line 9, in <module>
    from opencompass.partitioners import MultimodalNaivePartitioner
  File "/root/opencompass/opencompass/partitioners/__init__.py", line 1, in <module>
    from .mm_naive import *  # noqa: F401, F403
  File "/root/opencompass/opencompass/partitioners/mm_naive.py", line 8, in <module>
    from .base import BasePartitioner
  File "/root/opencompass/opencompass/partitioners/base.py", line 9, in <module>
    from opencompass.utils import (dataset_abbr_from_cfg, get_logger,
  File "/root/opencompass/opencompass/utils/__init__.py", line 4, in <module>
    from .collect_env import *  # noqa
  File "/root/opencompass/opencompass/utils/collect_env.py", line 2, in <module>
    from mmengine.utils.dl_utils import collect_env as collect_base_env
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/mmengine/utils/dl_utils/__init__.py", line 3, in <module>
    from .collect_env import collect_env
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/mmengine/utils/dl_utils/collect_env.py", line 10, in <module>
    import torch
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/torch/__init__.py", line 1382, in <module>
    from .functional import *  # noqa: F403
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/torch/functional.py", line 7, in <module>
    import torch.nn.functional as F
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/torch/nn/__init__.py", line 1, in <module>
    from .modules import *  # noqa: F403
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
    from .transformer import TransformerEncoder, TransformerDecoder, \
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
    device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
/root/.conda/envs/opencompass/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /opt/conda/conda-bld/pytorch_1702400410390/work/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
Traceback (most recent call last):
  File "/root/opencompass/run.py", line 1, in <module>
    from opencompass.cli.main import main
  File "/root/opencompass/opencompass/cli/main.py", line 14, in <module>
    from opencompass.utils.run import (exec_mm_infer_runner, fill_eval_cfg,
  File "/root/opencompass/opencompass/utils/run.py", line 7, in <module>
    from opencompass.datasets.custom import make_custom_dataset_config
  File "/root/opencompass/opencompass/datasets/__init__.py", line 1, in <module>
    from .advglue import *  # noqa: F401, F403
  File "/root/opencompass/opencompass/datasets/advglue.py", line 4, in <module>
    from datasets import Dataset, concatenate_datasets
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/datasets/__init__.py", line 18, in <module>
    from .arrow_dataset import Dataset
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 59, in <module>
    import pandas as pd
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/pandas/__init__.py", line 22, in <module>
    from pandas.compat import is_numpy_dev as _is_numpy_dev  # pyright: ignore # noqa:F401
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/pandas/compat/__init__.py", line 18, in <module>
    from pandas.compat.numpy import (
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/pandas/compat/numpy/__init__.py", line 4, in <module>
    from pandas.util.version import Version
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/pandas/util/__init__.py", line 2, in <module>
    from pandas.util._decorators import (  # noqa:F401
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/pandas/util/_decorators.py", line 14, in <module>
    from pandas._libs.properties import cache_readonly
  File "/root/.conda/envs/opencompass/lib/python3.10/site-packages/pandas/_libs/__init__.py", line 13, in <module>
    from pandas._libs.interval import Interval
  File "pandas/_libs/interval.pyx", line 1, in init pandas._libs.interval
ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject.   

Following a hint from GPT, simply downgrade NumPy:
pip install numpy==1.24.3
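The root cause is that the PyTorch and pandas wheels in this environment were compiled against the NumPy 1.x ABI, so importing them under NumPy 2.x crashes. A tiny helper (illustrative only; the function name is made up) captures the rule the fix relies on:

```python
def numpy_is_compatible(version: str) -> bool:
    """Return True for NumPy 1.x, which matches the ABI these wheels expect."""
    major = int(version.split(".")[0])
    return major < 2

# The version the traceback complained about vs. the pinned fix:
print(numpy_is_compatible("2.0.1"))   # False -> triggers the crash above
print(numpy_is_compatible("1.24.3"))  # True  -> the pinned version works
```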

Evaluation results:
(screenshot of the evaluation results table omitted)

4.2 Evaluating by modifying parameters in a configuration file

//todo
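This section is left unfinished in the original. As a hedged sketch of what this approach usually looks like in OpenCompass 0.2.x (the filename is hypothetical, and the relative import paths assume the stock configs/ layout shown by list_configs above), you collect the model and dataset configs into one evaluation config file instead of passing them on the command line:

```python
# configs/eval_internlm2_chat_1_8b_ceval.py (hypothetical filename)
from mmengine.config import read_base

# read_base() lets an OpenCompass config import other configs as modules;
# import paths are relative to the configs/ directory.
with read_base():
    from .datasets.ceval.ceval_gen import ceval_datasets
    from .models.hf_internlm.hf_internlm2_chat_1_8b import models

# OpenCompass evaluates the cross product of `models` x `datasets`.
datasets = ceval_datasets
```

You would then launch it with something like `python run.py configs/eval_internlm2_chat_1_8b_ceval.py --debug`, editing parameters such as max_out_len or batch_size in the config file rather than on the command line.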

From: https://www.cnblogs.com/jchen2022/p/18433302

    FMC213V3-基于FMC兼容1.8VIO的FullCameraLink输入子卡 一、板卡概述   该板卡为了考虑兼容1.8V电平IO,适配Virtex7,Kintex Ultrascale,Virtex ultrasacle + FPGA而特制,如果要兼容原来的3.3V 也可以修改硬件参数。板卡支持1路Full Camera link输入,同时......