Straight to the code, taken from ModelScope's model page for Qwen1.5-0.5B-Chat-AWQ (modelscope.cn):
```python
from modelscope import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained(
    "qwen/Qwen1.5-0.5B-Chat-AWQ",
    device_map="cuda"
)
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen1.5-0.5B-Chat-AWQ")

prompt = "Give me a short introduction to large language model."
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)

generated_ids = model.generate(
    model_inputs.input_ids,
    max_new_tokens=512
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
Run it.
It promptly errors out:
```
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to:
https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so
False
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching in backup paths...
CUDA SETUP: WARNING! libcuda.so not found! Do you have a CUDA driver installed? If you are on a cluster, make sure you are on a CUDA machine!
CUDA SETUP: Loading binary E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA SETUP: Problem: The main issue seems to be that the main CUDA library was not detected.
CUDA SETUP: Solution 1): Your paths are probably not up-to-date. You can update them via: sudo ldconfig.
CUDA SETUP: Solution 2): If you do not have sudo rights, you can do the following:
CUDA SETUP: Solution 2a): Find the cuda library via: find / -name libcuda.so 2>/dev/null
CUDA SETUP: Solution 2b): Once the library is found add it to the LD_LIBRARY_PATH: export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:FOUND_PATH_FROM_2a
CUDA SETUP: Solution 2c): For a permanent solution add the export from 2b into your .bashrc file, located at ~/.bashrc
E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/hf-mirror.com'), WindowsPath('https')}
  warn(msg)
E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('module'), WindowsPath('/matplotlib_inline.backend_inline')}
  warn(msg)
E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:149: UserWarning: WARNING: The following directories listed in your path were found to be non-existent: {WindowsPath('/usr/local/cuda/lib64')}
  warn(msg)
E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:149: UserWarning: WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
  warn(msg)
E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\cuda_setup\main.py:149: UserWarning: WARNING: No GPU detected! Check your CUDA paths. Proceeding to load CPU-only library...
  warn(msg)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\utils\import_utils.py:1390, in _LazyModule._get_module(self, module_name)
   1389 try:
-> 1390     return importlib.import_module("." + module_name, self.__name__)
   1391 except Exception as e:

File ~\AppData\Local\Programs\Python\Python310\lib\importlib\__init__.py:126, in import_module(name, package)
    125         level += 1
--> 126 return _bootstrap._gcd_import(name[level:], package, level)

File <frozen importlib._bootstrap>:1050, in _gcd_import(name, package, level)
File <frozen importlib._bootstrap>:1027, in _find_and_load(name, import_)
File <frozen importlib._bootstrap>:1006, in _find_and_load_unlocked(name, import_)
File <frozen importlib._bootstrap>:688, in _load_unlocked(spec)
File <frozen importlib._bootstrap_external>:883, in exec_module(self, module)
File <frozen importlib._bootstrap>:241, in _call_with_frames_removed(f, *args, **kwds)

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\integrations\bitsandbytes.py:11
     10 if is_bitsandbytes_available():
---> 11     import bitsandbytes as bnb
     12     import torch

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\__init__.py:6
      1 # Copyright (c) Facebook, Inc. and its affiliates.
      2 #
      3 # This source code is licensed under the MIT license found in the
      4 # LICENSE file in the root directory of this source tree.
----> 6 from . import cuda_setup, utils, research
      7 from .autograd._functions import (
      8     MatmulLtState,
      9     bmm_cublas,
    (...)
     13     matmul_4bit
     14 )

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\research\__init__.py:1
----> 1 from . import nn
      2 from .autograd._functions import (
      3     switchback_bnb,
      4     matmul_fp8_global,
      5     matmul_fp8_mixed,
      6 )

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\research\nn\__init__.py:1
----> 1 from .modules import LinearFP8Mixed, LinearFP8Global

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\research\nn\modules.py:8
      7 import bitsandbytes as bnb
----> 8 from bitsandbytes.optim import GlobalOptimManager
      9 from bitsandbytes.utils import OutlierTracer, find_outlier_dims

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\optim\__init__.py:6
      1 # Copyright (c) Facebook, Inc. and its affiliates.
      2 #
      3 # This source code is licensed under the MIT license found in the
      4 # LICENSE file in the root directory of this source tree.
----> 6 from bitsandbytes.cextension import COMPILED_WITH_CUDA
      8 from .adagrad import Adagrad, Adagrad8bit, Adagrad32bit

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\bitsandbytes\cextension.py:20
     19     CUDASetup.get_instance().print_log_stack()
---> 20     raise RuntimeError('''
     21     CUDA Setup failed despite GPU being available. Please run the following command to get more information:
     22
     23     python -m bitsandbytes
     24
     25     Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
     26     to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
     27     and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues''')
     28 lib.cadam32bit_grad_fp32  # runs on an error if the library could not be found -> COMPILED_WITH_CUDA=False

RuntimeError: CUDA Setup failed despite GPU being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)
Cell In[1], line 4
      1 from modelscope import AutoModelForCausalLM, AutoTokenizer
      2 device = "cuda" # the device to load the model onto
----> 4 model = AutoModelForCausalLM.from_pretrained(
      5     "qwen/Qwen1.5-0.5B-Chat-AWQ",
      6     device_map="auto"
      7 )
      8 tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen1.5-0.5B-Chat-AWQ")
     10 prompt = "Give me a short introduction to large language model."

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\modelscope\utils\hf_util.py:111, in get_wrapped_class.<locals>.ClassWrapper.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    108 else:
    109     model_dir = pretrained_model_name_or_path
--> 111 module_obj = module_class.from_pretrained(model_dir, *model_args,
    112                                           **kwargs)
    114 if module_class.__name__.startswith('AutoModel'):
    115     module_obj.model_dir = model_dir

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\models\auto\auto_factory.py:561, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    559 elif type(config) in cls._model_mapping.keys():
    560     model_class = _get_model_class(config, cls._model_mapping)
--> 561     return model_class.from_pretrained(
    562         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    563     )
    564 raise ValueError(
    565     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    566     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    567 )

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\modelscope\utils\hf_util.py:74, in patch_model_base.<locals>.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
     72 else:
     73     model_dir = pretrained_model_name_or_path
---> 74 return ori_from_pretrained(cls, model_dir, *model_args, **kwargs)

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\modeling_utils.py:3389, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
   3386     keep_in_fp32_modules = []
   3388 if hf_quantizer is not None:
-> 3389     hf_quantizer.preprocess_model(
   3390         model=model, device_map=device_map, keep_in_fp32_modules=keep_in_fp32_modules
   3391     )
   3393     # We store the original dtype for quantized models as we cannot easily retrieve it
   3394     # once the weights have been quantized
   3395     # Note that once you have loaded a quantized model, you can't change its dtype so this will
   3396     # remain a single source of truth
   3397     config._pre_quantization_dtype = torch_dtype

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\quantizers\base.py:166, in HfQuantizer.preprocess_model(self, model, **kwargs)
    164 model.is_quantized = True
    165 model.quantization_method = self.quantization_config.quant_method
--> 166 return self._process_model_before_weight_loading(model, **kwargs)

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\quantizers\quantizer_awq.py:77, in AwqQuantizer._process_model_before_weight_loading(self, model, **kwargs)
     76 def _process_model_before_weight_loading(self, model: "PreTrainedModel", **kwargs):
---> 77     from ..integrations import get_keys_to_not_convert, replace_with_awq_linear
     79     self.modules_to_not_convert = get_keys_to_not_convert(model)
     81     if self.quantization_config.modules_to_not_convert is not None:

File <frozen importlib._bootstrap>:1075, in _handle_fromlist(module, fromlist, import_, recursive)

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\utils\import_utils.py:1380, in _LazyModule.__getattr__(self, name)
   1378     value = self._get_module(name)
   1379 elif name in self._class_to_module.keys():
-> 1380     module = self._get_module(self._class_to_module[name])
   1381     value = getattr(module, name)
   1382 else:

File E:\Prj\ChatGLM3-6B-32K\venv\lib\site-packages\transformers\utils\import_utils.py:1392, in _LazyModule._get_module(self, module_name)
   1390     return importlib.import_module("." + module_name, self.__name__)
   1391 except Exception as e:
-> 1392     raise RuntimeError(
   1393         f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
   1394         f" traceback):\n{e}"
   1395     ) from e

RuntimeError: Failed to import transformers.integrations.bitsandbytes because of the following error (look up to see its traceback):
CUDA Setup failed despite GPU being available. Please run the following command to get more information:

python -m bitsandbytes

Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
```

A massive wall of errors. Note the import chain in the traceback: loading the AWQ model makes the AWQ quantizer import transformers.integrations, which imports bitsandbytes simply because it is installed, and it is bitsandbytes' CUDA setup that actually fails.
First, check the environment. Run nvidia-smi (and nvcc) from the command line to inspect the GPU and CUDA configuration:
```
C:\Users\20116>nvidia-smi
Thu Apr  4 17:27:53 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.58                 Driver Version: 537.58       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3050 ...  WDDM  | 00000000:01:00.0 Off |                  N/A |
| N/A   56C    P8               3W /  69W |   1674MiB /  4096MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A    131684      C   ...rograms\Python\Python310\python.exe   N/A       |
+---------------------------------------------------------------------------------------+

C:\Users\20116>nvcc
nvcc fatal   : No input files specified; use option --help for more information
```
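Two asides here: bare `nvcc` only complains about missing input files (`nvcc --version` is what prints the installed toolkit version), and the UserWarnings in the log above show bitsandbytes parsing unrelated environment variables as candidate paths. So it can be worth inspecting the CUDA-related variables directly; a small sketch:

```python
import os

# Print the environment variables that CUDA tooling (and bitsandbytes'
# path-scanning setup code) commonly consults.
for var in ("CUDA_PATH", "CUDA_HOME", "LD_LIBRARY_PATH", "PATH"):
    print(f"{var} = {os.environ.get(var)}")
```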
Then run a few lines of code to check the versions of the relevant libraries:
```python
import torch
import modelscope

print(torch.__version__)
print(torch.cuda.is_available())
print(modelscope.__version__)
```
Output:

```
2.1.0+cu118
True
1.12.0
```
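For a fuller picture, it also helps to print the CUDA version the torch wheel was built against and the GPU it sees. A minimal sketch using standard torch APIs:

```python
import torch

# CUDA version this torch build was compiled against (e.g. "11.8")
print(torch.version.cuda)

# Name of the first visible GPU, if CUDA is usable at all
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```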
Next I looked at the files inside the bitsandbytes package named in the error, and found binaries built for cu116 and the like, but nothing matching my setup. You can list them without importing the package, as shown below.
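A small sketch for listing the native libraries bundled with the installed bitsandbytes (it locates the package via importlib rather than importing it, since importing is exactly what crashes in this environment):

```python
from importlib.util import find_spec
from pathlib import Path

# Locate the installed bitsandbytes package without importing it.
spec = find_spec("bitsandbytes")
pkg_dir = Path(spec.origin).parent

# Each native library's name encodes the CUDA version it was built for,
# e.g. libbitsandbytes_cuda116.so vs. libbitsandbytes_cpu.so.
for lib in sorted(pkg_dir.glob("libbitsandbytes*")):
    print(lib.name)
```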
I tried the fixes from the following posts:

大模型训练时,使用bitsandbytes报错的解决方法 - CSDN博客 (a CSDN post on fixing bitsandbytes errors during large-model training)
Linux下安装CUDA并配置环境变量 – 源码巴士 (code84.com) (installing CUDA on Linux and configuring the environment variables)
bitsandbytes-cuda118 on pypi.org? · Issue #866 · TimDettmers/bitsandbytes (github.com)
Most of them didn't work. It turns out bitsandbytes is quite picky about its environment and needs a build matching your exact CUDA version; unfortunately, there was no prebuilt bitsandbytes binary matching my cu118 setup. The error message itself suggests a built-in diagnostic, which can be run as shown below.
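A sketch of running that diagnostic from inside a notebook, equivalent to typing `python -m bitsandbytes` in a terminal; it reports which CUDA libraries the setup code searched for and which binary it picked:

```python
import subprocess
import sys

# Run bitsandbytes' self-diagnostic, as the error message suggests.
result = subprocess.run(
    [sys.executable, "-m", "bitsandbytes"],
    capture_output=True,
    text=True,
)
print(result.stdout)
print(result.stderr)
```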
Absurdly, I then discovered this machine originally had CUDA 11.7 installed, while torch was built against cu118, and nvidia-smi reports the driver supports CUDA up to 12.2. It's a small wonder that so many earlier projects ran at all. Reading the error message again carefully:
```
Inspect the output of the command and see if you can locate CUDA libraries. You might need to add them
to your LD_LIBRARY_PATH. If you suspect a bug, please take the information from python -m bitsandbytes
and open an issue at: https://github.com/TimDettmers/bitsandbytes/issues
```

Perhaps my CUDA environment variables were never fully configured. I set LD_LIBRARY_PATH to point at the lib folder under the CUDA installation directory, restarted the notebook kernel, and ran the code again. To my delight, it worked this time, and the script printed the model's response.
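For reference, the same effect can be had per-session by setting the variable before anything imports bitsandbytes. A minimal sketch, assuming a default CUDA 11.8 install location (the path below is a placeholder; point it at wherever your toolkit's lib directory actually lives):

```python
import os

# Placeholder path: point this at the lib directory of your CUDA install.
cuda_lib = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\lib\x64"

# bitsandbytes' setup code scans LD_LIBRARY_PATH for CUDA libraries,
# so it must be set before bitsandbytes (or transformers) is imported.
os.environ["LD_LIBRARY_PATH"] = (
    cuda_lib + os.pathsep + os.environ.get("LD_LIBRARY_PATH", "")
)

import bitsandbytes  # should now be able to locate the CUDA runtime
```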