The underlying principles of VoxelNeXt are not covered in detail here; the official code is at https://github.com/dvlab-research/VoxelNeXt
To summarize VoxelNeXt's key characteristics:
- Instead of enlarging the convolution kernel to grow the receptive field, it applies two additional downsampling stages with sparse convolutions.
- For the BEV step, PointPillars uses batch_spatial_features.view(batch_size, self.num_bev_features * self.nz, self.ny, self.nx), folding the z axis into the feature channels, whereas VoxelNeXt sums the voxel features that land on the same BEV position (see the sketch after this list).
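To make the difference concrete, here is a minimal sketch of the two BEV-collapse strategies. It is my own illustration, not the library code: it assumes OpenPCDet's (batch_idx, z, y, x) layout for voxel_coords, the function names are mine, and the VoxelNeXt-style summation is shown in dense form for clarity even though the original operates on sparse tensors.

```python
import torch

def pointpillars_style(batch_spatial_features, batch_size, num_bev_features, nz, ny, nx):
    # PointPillars-style scatter: fold the z axis into the channel dimension.
    return batch_spatial_features.view(batch_size, num_bev_features * nz, ny, nx)

def voxelnext_style(voxel_features, voxel_coords, batch_size, ny, nx):
    # VoxelNeXt-style collapse: sum every voxel feature that lands on the same
    # BEV (y, x) cell, ignoring its z index entirely.
    b = voxel_coords[:, 0].long()
    y = voxel_coords[:, 2].long()
    x = voxel_coords[:, 3].long()
    flat_idx = (b * ny + y) * nx + x                        # (N,) flat BEV cell index
    bev = voxel_features.new_zeros(batch_size * ny * nx, voxel_features.shape[1])
    bev.index_add_(0, flat_idx, voxel_features)             # duplicate cells are summed
    return bev.view(batch_size, ny, nx, -1).permute(0, 3, 1, 2).contiguous()
```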
If you need a walkthrough of the code, or the optimized code itself, feel free to contact me via private message.
This CVPR 2023 detector has a simple model yet surprisingly strong results, but several issues showed up in actual use. On my own LiDAR point-cloud dataset its accuracy did not match what the paper reports, and detection of large objects in particular was only mediocre.
I therefore made the following improvements:
- Add the features from layers 4, 5, and 6 onto layer 3: layer 3 is feature-rich, and adding the larger-receptive-field features makes it better suited to fine-grained prediction of large objects.
- Attach an FPN after the backbone and replace the original single head with three output heads.
- Switch from anchor-free to anchor-based: keep the backbone essentially unchanged (dropping its BEV part), turn the sparse output features into dense ones, run BEV processing on the dense features (folding the z-axis height into the channel dimension), then append two extra residual blocks, so the final output head matches PointPillars. Note that the positive/negative sample assignment must also be switched to the anchor-based scheme.
- Run k-means clustering on the targets' positions, dimensions (length/width/height), and yaw so that the anchors can be configured more sensibly (a sketch follows this list).
- Temporal processing: add motion compensation between consecutive frames.
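As referenced in the k-means item, here is a minimal sketch of deriving anchor statistics from the ground-truth boxes. The column order (x, y, z, dx, dy, dz, yaw) follows the OpenPCDet box convention; the function name and the use of scikit-learn are my own choices rather than the exact code behind this post.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_anchor_config(gt_boxes: np.ndarray, num_sizes: int = 3, seed: int = 0):
    """gt_boxes: (N, 7) ground-truth boxes (x, y, z, dx, dy, dz, yaw) for one class."""
    # Cluster box dimensions -> candidate 'anchor_sizes'.
    km_size = KMeans(n_clusters=num_sizes, n_init=10, random_state=seed).fit(gt_boxes[:, 3:6])
    anchor_sizes = km_size.cluster_centers_.tolist()

    # Yaw is periodic, so cluster (sin, cos) instead of the raw angle -> 'anchor_rotations'.
    yaw = gt_boxes[:, 6]
    km_yaw = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(
        np.stack([np.sin(yaw), np.cos(yaw)], axis=1))
    anchor_rotations = np.arctan2(km_yaw.cluster_centers_[:, 0],
                                  km_yaw.cluster_centers_[:, 1]).tolist()

    # Box position: use the mean bottom height -> 'anchor_bottom_heights'.
    anchor_bottom = float(np.mean(gt_boxes[:, 2] - gt_boxes[:, 5] / 2.0))
    return anchor_sizes, anchor_rotations, [anchor_bottom]
```

The resulting values can be dropped into the ANCHOR_GENERATOR_CONFIG entries of the config below.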
In practice, improvements 3, 4, and 5 proved very effective. (Apparently anchor-free regression is still not accurate enough, which leads to poorly localized pred_boxes.)
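Of these, the motion compensation in improvement 5 boils down to warping the previous sweep into the current frame with the ego poses before concatenating points. A minimal sketch, assuming 4x4 lidar-to-global pose matrices are available from your dataset (the function name is illustrative):

```python
import numpy as np

def compensate_previous_frame(prev_points: np.ndarray,
                              prev_pose: np.ndarray,
                              cur_pose: np.ndarray) -> np.ndarray:
    """prev_points: (N, 3+) xyz plus extra features; poses: 4x4 lidar->global transforms."""
    xyz_h = np.concatenate([prev_points[:, :3],
                            np.ones((prev_points.shape[0], 1))], axis=1)  # homogeneous coords
    # previous lidar frame -> global -> current lidar frame
    transform = np.linalg.inv(cur_pose) @ prev_pose
    xyz_cur = (transform @ xyz_h.T).T[:, :3]
    return np.concatenate([xyz_cur, prev_points[:, 3:]], axis=1)
```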
In the end I converted the model to an anchor-based method while leaving the 3D backbone unchanged, since that is where the original paper gains most of its accuracy: two simple downsampling stages produce a large receptive field. After the 3D backbone I add a BEV layer and switch the sparse output to a dense one (trivially implemented by calling dense()), followed by a few 2D convolutions so that box regression in the BEV view becomes more accurate.
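A minimal sketch of that sparse-to-dense BEV step, assuming the 3D backbone returns an spconv SparseConvTensor; the module name and channel counts are illustrative rather than the exact code used here:

```python
import torch.nn as nn

class SparseToBEV(nn.Module):
    def __init__(self, in_channels: int, z_dim: int, out_channels: int = 256):
        super().__init__()
        bev_channels = in_channels * z_dim            # fold z into the channel dimension
        self.blocks = nn.Sequential(                  # extra 2D convs to sharpen BEV regression
            nn.Conv2d(bev_channels, out_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels), nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels), nn.ReLU(),
        )

    def forward(self, sparse_out):
        dense = sparse_out.dense()                    # (B, C, D, H, W) from spconv
        b, c, d, h, w = dense.shape
        bev = dense.view(b, c * d, h, w)              # HeightCompression-style fold
        return self.blocks(bev)
```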
The OpenPCDet configuration file is as follows:
CLASS_NAMES:  # the class names of your own dataset
DATA_CONFIG:
_BASE_CONFIG_:  # data path
POINT_CLOUD_RANGE: [-216.0, -216.0, -10, 216.0, 216.0, 20]
INFO_PATH: {
'train': [custom_infos_train.pkl],
'test': [custom_infos_val.pkl],
}
DATA_AUGMENTOR:
DISABLE_AUG_LIST: ['placeholder']
AUG_CONFIG_LIST:
- NAME: gt_sampling
DB_INFO_PATH:
- custom_dbinfos_train.pkl
USE_SHARED_MEMORY: False #True # set it to True to speed up (it costs about 15GB shared memory)
# DB_DATA_PATH:
# - nuscenes_dbinfos_10sweeps_withvelo_global.pkl.npy
PREPARE: {
filter_by_min_points: [
],
}
SAMPLE_GROUPS: [
]
NUM_POINT_FEATURES: 4
DATABASE_WITH_FAKELIDAR: False
REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
LIMIT_WHOLE_SCENE: False
- NAME: random_world_flip
ALONG_AXIS_LIST: ['x', 'y']
- NAME: random_world_rotation
WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]
- NAME: random_world_scaling
WORLD_SCALE_RANGE: [0.9, 1.1]
- NAME: random_world_translation
NOISE_TRANSLATE_STD: [0.5, 0.5, 0.5]
DATA_PROCESSOR:
- NAME: mask_points_and_boxes_outside_range
REMOVE_OUTSIDE_BOXES: True
- NAME: shuffle_points
SHUFFLE_ENABLED: {
'train': True,
'test': True
}
- NAME: transform_points_to_voxels
VOXEL_SIZE: [0.3, 0.3, 0.75]
MAX_POINTS_PER_VOXEL: 32
MAX_NUMBER_OF_VOXELS: {
'train': 160000,
'test': 160000
}
MODEL:
NAME: VoxelNeXt
# VFE:
# NAME: PillarVFE
# WITH_DISTANCE: False
# USE_ABSLOTE_XYZ: True
# USE_NORM: True
# NUM_FILTERS: [10]
VFE:
NAME: MeanVFE
BACKBONE_3D:
NAME: VoxelResBackBone8xVoxelNeXt
MAP_TO_BEV:
NAME: HeightCompression
NUM_BEV_FEATURES: 256
BACKBONE_2D:
NAME: BaseBEVBackbone
LAYER_NUMS: [5, 5]
LAYER_STRIDES: [1, 2]
NUM_FILTERS: [128, 256]
UPSAMPLE_STRIDES: [1, 2]
NUM_UPSAMPLE_FILTERS: [256, 256]
DENSE_HEAD:
NAME: AnchorHeadSingle
CLASS_AGNOSTIC: False # class-agnostic: consider only object shape and position, not the class
USE_DIRECTION_CLASSIFIER: True
DIR_OFFSET: 0.78539
DIR_LIMIT_OFFSET: 0.0
NUM_DIR_BINS: 2
ANCHOR_GENERATOR_CONFIG: [
{
'class_name': ,
'anchor_sizes': [[47.22923995, 10.11453907, 5.69901839]],
'anchor_rotations': [0, 1.57],
'anchor_bottom_heights': [-2.78],
'align_center': False,
'feature_map_stride': 8,
'matched_threshold': 0.6,
'unmatched_threshold': 0.45
},
{
'class_name': ,
'anchor_sizes': [[18.08827736, 4.41048397, 5.82307534]],
'anchor_rotations': [0, 1.57],
'anchor_bottom_heights': [-2.9],
'align_center': False,
'feature_map_stride': 8,
'matched_threshold': 0.6,
'unmatched_threshold': 0.45
},
{
'class_name': ,
'anchor_sizes': [[84.35426546, 25.39603818, 18.08518533],
[150.06519539, 33.33043893, 20.76281147],
[278.15729554, 40.14967236, 27.68076635],
[429.08814444, 38.37864438, 31.39757939]],
'anchor_rotations': [-1.57, 1.57],
'anchor_bottom_heights': [-1.5],
'align_center': False,
'feature_map_stride': 8,
'matched_threshold': 0.6,
'unmatched_threshold': 0.45
},
{
'class_name': ,
'anchor_sizes': [[32.64775322, 11.23301994, 11.88011639],
[13.38060326, 3.13617654, 6.84278373],
[22.06522481, 7.9552366, 10.13181668]],
'anchor_rotations': [0, 0.16],
'anchor_bottom_heights': [-2.7],
'align_center': False,
'feature_map_stride': 8,
'matched_threshold': 0.6,
'unmatched_threshold': 0.45
}
]
TARGET_ASSIGNER_CONFIG:
NAME: AxisAlignedTargetAssigner
POS_FRACTION: -1.0 # controls the fraction of positive samples
SAMPLE_SIZE: 512
NORM_BY_NUM_EXAMPLES: False
MATCH_HEIGHT: True # use 3D IoU when assigning positive/negative samples
BOX_CODER: ResidualCoder
LOSS_CONFIG:
LOSS_WEIGHTS: {
'cls_weight': 1.0,
'loc_weight': 2.0,
'dir_weight': 0.2,
'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
}
POST_PROCESSING:
RECALL_THRESH_LIST: [0.1, 0.3, 0.5]
SCORE_THRESH: 0.1
OUTPUT_RAW_SCORE: False
EVAL_METRIC: kitti
NMS_CONFIG:
MULTI_CLASSES_NMS: True
NMS_TYPE: nms_gpu
NMS_THRESH: 0.01
NMS_PRE_MAXSIZE: 4096
NMS_POST_MAXSIZE: 500
OPTIMIZATION:
BATCH_SIZE_PER_GPU: 6
NUM_EPOCHS: 500
OPTIMIZER: adam_onecycle
LR: 0.001
WEIGHT_DECAY: 0.01
MOMENTUM: 0.9
MOMS: [0.95, 0.85]
PCT_START: 0.4
DIV_FACTOR: 10
DECAY_STEP_LIST: [35, 45]
LR_DECAY: 0.1
LR_CLIP: 0.0000001
LR_WARMUP: False
WARMUP_EPOCH: 1
GRAD_NORM_CLIP: 10
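As a quick sanity check (my own snippet, not part of the post), the YAML above can be loaded with OpenPCDet's config utilities once the class names and data paths are filled in, before launching tools/train.py with --cfg_file pointing at it; the file path below is a placeholder:

```python
from pcdet.config import cfg, cfg_from_yaml_file

# Placeholder path: wherever you save the config above.
cfg_from_yaml_file('tools/cfgs/custom_models/voxelnext_anchor_based.yaml', cfg)
print(cfg.MODEL.DENSE_HEAD.NAME)                          # expect: AnchorHeadSingle
print(len(cfg.MODEL.DENSE_HEAD.ANCHOR_GENERATOR_CONFIG))  # one entry per class
```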