MLPerf Steps
Install CM
python3 -m venv cm
source cm/bin/activate
pip install cm4mlops
Set up a virtual environment for Python
cm run script --tags=install,python-venv --name=mlperf
export CM_SCRIPT_EXTRA_CMD="--adr.python.name=mlperf"
Problem
The xz installed by Spack is broken: ldd and file both fail to open the binary.
(cm) [rocky@scc112-cpu2 ~]$ ldd /home/rocky/spack/opt/spack/linux-rocky9-zen3/gcc-11.4.1/xz-5.4.6-54q5irsngvod5psb7bhas6tklpiztmcz/bin/xz
ldd: /home/rocky/spack/opt/spack/linux-rocky9-zen3/gcc-11.4.1/xz-5.4.6-54q5irsngvod5psb7bhas6tklpiztmcz/bin/xz: No such file or directory
(cm) [rocky@scc112-cpu2 ~]$ file /home/rocky/spack/opt/spack/linux-rocky9-zen3/gcc-11.4.1/xz-5.4.6-54q5irsngvod5psb7bhas6tklpiztmcz/bin/xz
/home/rocky/spack/opt/spack/linux-rocky9-zen3/gcc-11.4.1/xz-5.4.6-54q5irsngvod5psb7bhas6tklpiztmcz/bin/xz: cannot open /home/rocky/spa
So we install xz with yum and put /usr/bin at the front of PATH so that the system copy is found first.
yum install xz
export PATH=/usr/bin:$PATH
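After changing PATH, it may be worth confirming which xz the shell now resolves. The following is a minimal sketch; check_tool is a hypothetical helper name, not part of CM or Spack.

```shell
# check_tool NAME: print the executable the shell would run for NAME,
# or return non-zero if nothing usable is found on PATH.
# (Hypothetical helper, introduced only for illustration.)
check_tool() {
    # command -v prints what the shell would run for this name, or fails.
    tool_path="$(command -v "$1" 2>/dev/null)" || return 1
    # Require a real executable file, not a builtin or function name.
    [ -x "$tool_path" ] || return 1
    echo "$tool_path"
}

check_tool xz || echo "xz is not usable from PATH" >&2
```

If the yum install succeeded and /usr/bin is first in PATH, this should print /usr/bin/xz rather than the Spack path.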
The script that generates the actual submission tree checks that test_query_count is no less than 10833, so we change it in the script.
Optimize
performance run:
taskset -c 0-31 cm run script --tags=run-mlperf,inference,_find-performance,_full,_r4.1-dev \
--model=bert-99 \
--implementation=reference \
--framework=deepsparse \
--category=edge \
--scenario=Offline \
--execution_mode=test \
--device=cpu \
--quiet \
--test_query_count=60833 \
--env.CM_MLPERF_NEURALMAGIC_MODEL_ZOO_STUB=zoo:nlp/question_answering/mobilebert-none/pytorch/huggingface/squad/base_quant-none \
--batch_size=64 \
--env.OMP_NUM_THREADS=32
accuracy run:
taskset -c 0-31 cm run script --tags=run-mlperf,inference,_r4.1-dev \
--model=bert-99 \
--implementation=reference \
--framework=deepsparse \
--category=edge \
--scenario=Offline \
--execution_mode=valid \
--device=cpu \
--quiet \
--env.CM_MLPERF_NEURALMAGIC_MODEL_ZOO_STUB=zoo:nlp/question_answering/mobilebert-none/pytorch/huggingface/squad/base_quant-none \
--batch_size=64 \
--env.OMP_NUM_THREADS=32 \
--test_query_count=10833
We use taskset -c 0-31 to bind the process to CPU cores 0 to 31, which avoids the performance loss caused by migrating between cores.
We chose deepsparse as the framework because it gives higher performance.
We tried different batch_size values and settled on 64, which gave the highest performance.
We set OMP_NUM_THREADS to 32 because our machine has 32 cores and Thread(s) per core is 1.
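The batch size comparison can be sketched as a loop over the performance run. This is only an illustration: print_sweep is a hypothetical helper, the candidate list 16 32 64 128 is an assumption (the notes only record that 64 won), and the core range is derived from nproc on the assumption that the shell is not already pinned and that Thread(s) per core is 1. The commands are printed rather than executed.

```shell
# print_sweep: print one find-performance command per candidate batch size.
# Hypothetical helper; the batch sizes listed here are assumed values.
print_sweep() {
    cores="$(nproc)"             # CPUs available to this process
    last_core=$((cores - 1))     # highest core index for taskset -c

    for bs in 16 32 64 128; do
        # echo keeps this a dry run; drop it to actually launch the runs.
        echo taskset -c "0-${last_core}" cm run script \
            --tags=run-mlperf,inference,_find-performance,_full,_r4.1-dev \
            --model=bert-99 --implementation=reference \
            --framework=deepsparse --category=edge --scenario=Offline \
            --execution_mode=test --device=cpu --quiet \
            --test_query_count=60833 \
            --env.CM_MLPERF_NEURALMAGIC_MODEL_ZOO_STUB=zoo:nlp/question_answering/mobilebert-none/pytorch/huggingface/squad/base_quant-none \
            --batch_size="$bs" --env.OMP_NUM_THREADS="$cores"
    done
}

print_sweep
```

On our 32-core machine this would produce the same taskset -c 0-31 and OMP_NUM_THREADS=32 settings as the commands above, with only batch_size varying.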
Submit
We set env.CM_FRAMEWORK to deepsparse.
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--run_style=valid \
--quiet \
--submitter=scc112 \
--env.CM_FRAMEWORK=deepsparse \
--hw_name="scc112-cpu2"
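Once the command finishes, a quick look inside the tarball can confirm that something was actually packed. This is a sketch under assumptions: summarize_tarball is a hypothetical helper, and the location of submission.tar.gz (here, the current directory) is not confirmed by these notes.

```shell
# summarize_tarball FILE: print the number of entries in a gzip tarball.
# Hypothetical helper; the tarball location is an assumption.
summarize_tarball() {
    tarball="$1"
    [ -f "$tarball" ] || { echo "not found: $tarball" >&2; return 1; }
    # -t lists entries without extracting; -z reads gzip; -f names the file.
    tar -tzf "$tarball" | wc -l
}

summarize_tarball submission.tar.gz || true
```

Running tar -tzf on the file directly lists the individual paths if the count looks wrong.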
From: https://www.cnblogs.com/linjiale/p/18492033