CUDA Toolkit v11.8 docs, link: https://docs.nvidia.com/cuda/cuda-c-best-practices-guide/index.html
preface
assess the application
heterogeneous computing
application profile
parallelize the application
get started
get the right answer
optimize CUDA applications
perf metrics
timing
bandwidth
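Bandwidth - the rate at which data can be transferred - is one of the most important gating factors for performance (this shows how much it matters)
bandwidth comes in two forms, theoretical and effective, and the effective value is generally lower than the theoretical one. Measure the effective value and compare it with the theoretical peak: a large gap means design or implementation details are holding bandwidth back, and raising the effective value is the optimization goal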
effective
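the unit is GB/s: add the bytes read and the bytes written, then divide by the elapsed time in seconds, i.e. effective bandwidth = ((bytes read + bytes written) / 10^9) / time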
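A minimal sketch of that calculation as a host-side helper. The function name and parameter names are hypothetical (they are not part of any CUDA API), and the elapsed time is assumed to have been measured separately, in seconds.

```cpp
#include <cassert>
#include <cstdint>

// Effective bandwidth in GB/s: ((bytes read + bytes written) / 10^9) / seconds.
// Illustrative helper only; the name and signature are assumptions.
double effective_bandwidth_gbps(std::uint64_t bytes_read,
                                std::uint64_t bytes_written,
                                double elapsed_seconds) {
    // A non-positive elapsed time would make the ratio meaningless.
    assert(elapsed_seconds > 0.0);
    const double total_bytes =
        static_cast<double>(bytes_read) + static_cast<double>(bytes_written);
    return (total_bytes / 1e9) / elapsed_seconds;
}
```

With made-up numbers, a kernel that reads 4e9 bytes and writes 4e9 bytes in 2 s would come out at 4 GB/s.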
for example