首页 > 其他分享 >kitti 数据集 可视化

kitti 数据集 可视化

时间:2024-10-08 10:10:45浏览次数:9  
标签:velo 数据 pts self 可视化 np kitti rect 3d

1. 网址

KITTI官网网址:https://www.cvlibs.net/datasets/kitti/index.php
下载数据集:https://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d
KITTI数据集论文:Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite
github可视化代码:https://github.com/kuixu/kitti_object_vis
其他: https://bbs.huaweicloud.com/blogs/371967

2. kitti简介

KITTI数据集是由德国卡尔斯鲁厄理工学院 Karlsruhe Institute of Technology (KIT) 和美国芝加哥丰田技术研究院 Toyota Technological Institute at Chicago (TTI-C) 于2012年联合创办,是目前国际上最为常用的自动驾驶场景下的计算机视觉算法评测数据集之一。该数据集用于评测立体图像(stereo),光流(optical flow),视觉测距(visual odometry),3D物体检测(object detection)和3D跟踪(tracking)等计算机视觉技术在车载环境下的性能。KITTI数据集包含市区、乡村和高速公路等场景采集的真实图像数据,每张图像中最多达15辆车和30个行人,还有各种程度的遮挡与截断。 KITTI数据集针对3D目标检测任务提供了14999张图像以及对应的点云,其中7481组用于训练,7518组用于测试,针对场景中的汽车、行人、自行车三类物体进行标注,共计80256个标记对象。

3. 采集车和传感器


KITTI数据集采集车的传感器布置平面如上图所示,车辆装配有2个灰度摄像机,2个彩色摄像机,一个Velodyne 64线3D激光雷达,4个光学镜头,以及1个GPS导航系统,在上图中使用了红色标记。

2个一百四十万像素的PointGray Flea2灰度相机
2个一百四十万像素的PointGray Flea2彩色相机
1个64线的Velodyne激光雷达,10Hz,角分辨率为0.09度,每秒约一百三十万个点,水平视场360°,垂直视场26.8°,至多120米的距离范围
4个Edmund的光学镜片,水平视角约为90°,垂直视角约为35°
1个OXTS RT 3003的惯性导航系统(GPS/IMU),6轴,100Hz,分别率为0.02米,0.1°

注意:双目相机之间的距离为0.54米,点云到相机之间的距离为0.27米,在上图中使用了蓝色标记。

4.坐标系

5. 数据文件介绍

5.1 图像:

training/image_2/000001.png

5.2 点云:

training/velodyne/000001.bin
激光点云数据是以浮点二进制文件格式存储,每行包含8个浮点数数据,其中每个浮点数数据由四位十六进制数表示且通过空格隔开。一个点云数据由4个浮点数数据构成,分别表示点云的x、y、z、r(其中xyz表示点云的坐标,r表示强度或反射值),点云的存储方式如下表所示:

calib文件:
training/calib/000001.txt

P0: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 0.000000000000e+00 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.875744000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 4.485728000000e+01 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.163791000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.745884000000e-03
P3: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.395242000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.199936000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.729905000000e-03
R0_rect: 9.999239000000e-01 9.837760000000e-03 -7.445048000000e-03 -9.869795000000e-03 9.999421000000e-01 -4.278459000000e-03 7.402527000000e-03 4.351614000000e-03 9.999631000000e-01
Tr_velo_to_cam: 7.533745000000e-03 -9.999714000000e-01 -6.166020000000e-04 -4.069766000000e-03 1.480249000000e-02 7.280733000000e-04 -9.998902000000e-01 -7.631618000000e-02 9.998621000000e-01 7.523790000000e-03 1.480755000000e-02 -2.717806000000e-01
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01



R0_rect矩阵的介绍:
Kitti 数据集是一个用于移动视觉和机器学习的著名数据集,经常用于自动驾驶汽车系统的研究。在 Kitti 数据集中,R0_rect表示的是矩形校正矩阵(rectifying rotation matrix)、即0号相机坐标系到矫正坐标系的旋转矩阵。
矩形校正是立体视觉计算中常用的一步,用于使得成对的立体相机的成像平面对齐,这样可以简化如立体匹配这样的后续处理步骤。实质上,它通过一个旋转将两个相机成像平面调整为共面,且使得它们的光轴平行。
R0_rect 矩阵用于将点从未校正的相机坐标系变换到校正后的坐标系。这一步是使用其他有关投影数据(如相机内部矩阵、深度信息等)之前必须进行的步骤。


5.3 label文件

training/label_2/000001.txt

Truck 0.00 0 -1.57 599.41 156.40 629.75 189.25 2.85 2.63 12.34 0.47 1.49 69.44 -1.56
Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57
Cyclist 0.00 3 -1.65 676.60 163.95 688.98 193.93 1.86 0.60 2.02 4.59 1.32 45.84 -1.55
DontCare -1 -1 -10 503.89 169.71 590.61 190.13 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 511.35 174.96 527.81 187.45 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 532.37 176.35 542.68 185.27 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 559.62 175.83 575.40 183.15 -1 -1 -1 -1000 -1000 -1000 -10

6. 可视化代码



import os
import sys
import numpy as np
import cv2

path_img = "/media/algo/data_1/everyday/20240928/kitti_object_vis-master/data/object/training/image_2/000001.png"
path_txt = "/media/algo/data_1/everyday/20240928/kitti_object_vis-master/data/object/training/label_2/000001.txt"
path_lidar = "/media/algo/data_1/everyday/20240928/kitti_object_vis-master/data/object/training/velodyne/000001.bin"
path_calib = "/media/algo/data_1/everyday/20240928/kitti_object_vis-master/data/object/training/calib/000001.txt"

def roty(t):
    """ Rotation about the y-axis. """
    c = np.cos(t)
    s = np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def compute_box_3d(obj, P):
    """ Takes an object and a projection matrix (P) and projects the 3d
        bounding box into the image plane.
        Returns:
            corners_2d: (8,2) array in left image coord.
            corners_3d: (8,3) array in in rect camera coord.
    """
    # compute rotational matrix around yaw axis
    R = roty(obj.ry) #[3,3]

    # 3d bounding box dimensions
    l = obj.l
    w = obj.w
    h = obj.h

    # 3d bounding box corners
    x_corners = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
    y_corners = [0, 0, 0, 0, -h, -h, -h, -h]
    z_corners = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]

    #corners_3d[3, 8] rotate and translate 3d bounding box [3,3] @ [3, 8]
    corners_3d = np.dot(R, np.vstack([x_corners, y_corners, z_corners]))
    # print corners_3d.shape
    corners_3d[0, :] = corners_3d[0, :] + obj.t[0]
    corners_3d[1, :] = corners_3d[1, :] + obj.t[1]
    corners_3d[2, :] = corners_3d[2, :] + obj.t[2]
    # print 'cornsers_3d: ', corners_3d
    # only draw 3d bounding box for objs in front of the camera
    if np.any(corners_3d[2, :] < 0.1):
        corners_2d = None
        return corners_2d, np.transpose(corners_3d)

    # project the 3d bounding box into the image plane
    corners_2d = project_to_image(np.transpose(corners_3d), P)
    # print 'corners_2d: ', corners_2d
    return corners_2d, np.transpose(corners_3d)

def project_to_image(pts_3d, P):
    """ Project 3d points to image plane.

    Usage: pts_2d = projectToImage(pts_3d, P)
      input: pts_3d: nx3 matrix
             P:      3x4 projection matrix
      output: pts_2d: nx2 matrix

      P(3x4) dot pts_3d_extended(4xn) = projected_pts_2d(3xn)
      => normalize projected_pts_2d(2xn)

      <=> pts_3d_extended(nx4) dot P'(4x3) = projected_pts_2d(nx3)
          => normalize projected_pts_2d(nx2)
    """
    n = pts_3d.shape[0]
    pts_3d_extend = np.hstack((pts_3d, np.ones((n, 1)))) #[8,4] <-- [8, 3]
    # print(('pts_3d_extend shape: ', pts_3d_extend.shape))
    pts_2d = np.dot(pts_3d_extend, np.transpose(P))  # nx3
    pts_2d[:, 0] /= pts_2d[:, 2]
    pts_2d[:, 1] /= pts_2d[:, 2]
    return pts_2d[:, 0:2]
def inverse_rigid_trans(Tr):
    """ Inverse a rigid body transform matrix (3x4 as [R|t])
        [R'|-R't; 0|1]
    """
    inv_Tr = np.zeros_like(Tr)  # 3x4
    inv_Tr[0:3, 0:3] = np.transpose(Tr[0:3, 0:3])
    inv_Tr[0:3, 3] = np.dot(-np.transpose(Tr[0:3, 0:3]), Tr[0:3, 3])
    return inv_Tr

class Object3d(object):
    """ 3d object label """

    def __init__(self, label_file_line):
        data = label_file_line.split(" ")
        data[1:] = [float(x) for x in data[1:]]

        # extract label, truncation, occlusion
        self.type = data[0]  # 'Car', 'Pedestrian', ...
        self.truncation = data[1]  # truncated pixel ratio [0..1]
        self.occlusion = int(
            data[2]
        )  # 0=visible, 1=partly occluded, 2=fully occluded, 3=unknown
        self.alpha = data[3]  # object observation angle [-pi..pi]

        # extract 2d bounding box in 0-based coordinates
        self.xmin = data[4]  # left
        self.ymin = data[5]  # top
        self.xmax = data[6]  # right
        self.ymax = data[7]  # bottom
        self.box2d = np.array([self.xmin, self.ymin, self.xmax, self.ymax])

        # extract 3d bounding box information
        self.h = data[8]  # box height
        self.w = data[9]  # box width
        self.l = data[10]  # box length (in meters)
        self.t = (data[11], data[12], data[13])  # location (x,y,z) in camera coord.
        self.ry = data[14]  # yaw angle (around Y-axis in camera coordinates) [-pi..pi]

    def estimate_diffculty(self):
        """ Function that estimate difficulty to detect the object as defined in kitti website"""
        # height of the bounding box
        bb_height = np.abs(self.xmax - self.xmin)

        if bb_height >= 40 and self.occlusion == 0 and self.truncation <= 0.15:
            return "Easy"
        elif bb_height >= 25 and self.occlusion in [0, 1] and self.truncation <= 0.30:
            return "Moderate"
        elif (
            bb_height >= 25 and self.occlusion in [0, 1, 2] and self.truncation <= 0.50
        ):
            return "Hard"
        else:
            return "Unknown"

    def print_object(self):
        print(
            "Type, truncation, occlusion, alpha: %s, %d, %d, %f"
            % (self.type, self.truncation, self.occlusion, self.alpha)
        )
        print(
            "2d bbox (x0,y0,x1,y1): %f, %f, %f, %f"
            % (self.xmin, self.ymin, self.xmax, self.ymax)
        )
        print("3d bbox h,w,l: %f, %f, %f" % (self.h, self.w, self.l))
        print(
            "3d bbox location, ry: (%f, %f, %f), %f"
            % (self.t[0], self.t[1], self.t[2], self.ry)
        )
        print("Difficulty of estimation: {}".format(self.estimate_diffculty()))


def draw_projected_box3d(image, qs, color=(0, 255, 0), thickness=2):
    """ Draw 3d bounding box in image
        qs: (8,3) array of vertices for the 3d box in following order:
            1 -------- 0
           /|         /|
          2 -------- 3 .
          | |        | |
          . 5 -------- 4
          |/         |/
          6 -------- 7
    """
    qs = qs.astype(np.int32)
    for k in range(0, 4):
        # Ref: http://docs.enthought.com/mayavi/mayavi/auto/mlab_helper_functions.html
        i, j = k, (k + 1) % 4
        # use LINE_AA for opencv3
        # cv2.line(image, (qs[i,0],qs[i,1]), (qs[j,0],qs[j,1]), color, thickness, cv2.CV_AA)
        cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
        i, j = k + 4, (k + 1) % 4 + 4
        cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)

        i, j = k, k + 4
        cv2.line(image, (qs[i, 0], qs[i, 1]), (qs[j, 0], qs[j, 1]), color, thickness)
    return image

def read_txt_to_objs(path_txt):
    lines = [line.rstrip() for line in open(path_txt)]
    objects = [Object3d(line) for line in lines]
    return objects

def get_lidar_in_image_fov(
    pc_velo, calib, xmin, ymin, xmax, ymax, return_more=False, clip_distance=2.0
):
    """ Filter lidar points, keep those in image FOV """
    pts_2d = calib.project_velo_to_image(pc_velo)
    fov_inds = (
        (pts_2d[:, 0] < xmax)
        & (pts_2d[:, 0] >= xmin)
        & (pts_2d[:, 1] < ymax)
        & (pts_2d[:, 1] >= ymin)
    )
    fov_inds = fov_inds & (pc_velo[:, 0] > clip_distance)
    imgfov_pc_velo = pc_velo[fov_inds, :]
    if return_more:
        return imgfov_pc_velo, pts_2d, fov_inds
    else:
        return imgfov_pc_velo

class Calibration(object):
    """ Calibration matrices and utils
        3d XYZ in <label>.txt are in rect camera coord.
        2d box xy are in image2 coord
        Points in <lidar>.bin are in Velodyne coord.

        y_image2 = P^2_rect * x_rect
        y_image2 = P^2_rect * R0_rect * Tr_velo_to_cam * x_velo
        x_ref = Tr_velo_to_cam * x_velo
        x_rect = R0_rect * x_ref

        P^2_rect = [f^2_u,  0,      c^2_u,  -f^2_u b^2_x;
                    0,      f^2_v,  c^2_v,  -f^2_v b^2_y;
                    0,      0,      1,      0]
                 = K * [1|t]

        image2 coord:
         ----> x-axis (u)
        |
        |
        v y-axis (v)

        velodyne coord:
        front x, left y, up z

        rect/ref camera coord:
        right x, down y, front z

        Ref (KITTI paper): http://www.cvlibs.net/publications/Geiger2013IJRR.pdf

        TODO(rqi): do matrix multiplication only once for each projection.
    """
    def __init__(self, calib_filepath, from_video=False):
        if from_video:
            calibs = self.read_calib_from_video(calib_filepath)
        else:
            calibs = self.read_calib_file(calib_filepath)
        # Projection matrix from rect camera coord to image2 coord
        self.P = calibs["P2"]
        self.P = np.reshape(self.P, [3, 4])
        # Rigid transform from Velodyne coord to reference camera coord
        self.V2C = calibs["Tr_velo_to_cam"]
        self.V2C = np.reshape(self.V2C, [3, 4])
        self.C2V = inverse_rigid_trans(self.V2C)
        # Rotation from reference camera coord to rect camera coord
        self.R0 = calibs["R0_rect"]
        self.R0 = np.reshape(self.R0, [3, 3])

        # Camera intrinsics and extrinsics
        self.c_u = self.P[0, 2]
        self.c_v = self.P[1, 2]
        self.f_u = self.P[0, 0]
        self.f_v = self.P[1, 1]
        self.b_x = self.P[0, 3] / (-self.f_u)  # relative
        self.b_y = self.P[1, 3] / (-self.f_v)

    def read_calib_file(self, filepath):
        """ Read in a calibration file and parse into a dictionary.
        Ref: https://github.com/utiasSTARS/pykitti/blob/master/pykitti/utils.py
        """
        data = {}
        with open(filepath, "r") as f:
            for line in f.readlines():
                line = line.rstrip()
                if len(line) == 0:
                    continue
                key, value = line.split(":", 1)
                # The only non-float values in these files are dates, which
                # we don't care about anyway
                try:
                    data[key] = np.array([float(x) for x in value.split()])
                except ValueError:
                    pass
        return data

    def read_calib_from_video(self, calib_root_dir):
        """ Read calibration for camera 2 from video calib files.
            there are calib_cam_to_cam and calib_velo_to_cam under the calib_root_dir
        """
        data = {}
        cam2cam = self.read_calib_file(
            os.path.join(calib_root_dir, "calib_cam_to_cam.txt")
        )
        velo2cam = self.read_calib_file(
            os.path.join(calib_root_dir, "calib_velo_to_cam.txt")
        )
        Tr_velo_to_cam = np.zeros((3, 4))
        Tr_velo_to_cam[0:3, 0:3] = np.reshape(velo2cam["R"], [3, 3])
        Tr_velo_to_cam[:, 3] = velo2cam["T"]
        data["Tr_velo_to_cam"] = np.reshape(Tr_velo_to_cam, [12])
        data["R0_rect"] = cam2cam["R_rect_00"]
        data["P2"] = cam2cam["P_rect_02"]
        return data

    def cart2hom(self, pts_3d):
        """ Input: nx3 points in Cartesian
            Oupput: nx4 points in Homogeneous by pending 1
        """
        n = pts_3d.shape[0]
        pts_3d_hom = np.hstack((pts_3d, np.ones((n, 1))))
        return pts_3d_hom

    # ===========================
    # ------- 3d to 3d ----------
    # ===========================
    def project_velo_to_ref(self, pts_3d_velo):
        pts_3d_velo = self.cart2hom(pts_3d_velo)  # nx4
        return np.dot(pts_3d_velo, np.transpose(self.V2C))

    def project_ref_to_velo(self, pts_3d_ref):
        pts_3d_ref = self.cart2hom(pts_3d_ref)  # nx4
        return np.dot(pts_3d_ref, np.transpose(self.C2V))

    def project_rect_to_ref(self, pts_3d_rect):
        """ Input and Output are nx3 points """
        return np.transpose(np.dot(np.linalg.inv(self.R0), np.transpose(pts_3d_rect)))

    def project_ref_to_rect(self, pts_3d_ref):
        """ Input and Output are nx3 points """
        return np.transpose(np.dot(self.R0, np.transpose(pts_3d_ref)))

    def project_rect_to_velo(self, pts_3d_rect):
        """ Input: nx3 points in rect camera coord.
            Output: nx3 points in velodyne coord.
        """
        pts_3d_ref = self.project_rect_to_ref(pts_3d_rect)
        return self.project_ref_to_velo(pts_3d_ref)

    def project_velo_to_rect(self, pts_3d_velo):
        pts_3d_ref = self.project_velo_to_ref(pts_3d_velo)
        return self.project_ref_to_rect(pts_3d_ref)

    # ===========================
    # ------- 3d to 2d ----------
    # ===========================
    def project_rect_to_image(self, pts_3d_rect):
        """ Input: nx3 points in rect camera coord.
            Output: nx2 points in image2 coord.
        """
        pts_3d_rect = self.cart2hom(pts_3d_rect)
        pts_2d = np.dot(pts_3d_rect, np.transpose(self.P))  # nx3
        pts_2d[:, 0] /= pts_2d[:, 2]
        pts_2d[:, 1] /= pts_2d[:, 2]
        return pts_2d[:, 0:2]

    def project_velo_to_image(self, pts_3d_velo):
        """ Input: nx3 points in velodyne coord.
            Output: nx2 points in image2 coord.
        """
        pts_3d_rect = self.project_velo_to_rect(pts_3d_velo)
        return self.project_rect_to_image(pts_3d_rect)

    def project_8p_to_4p(self, pts_2d):
        x0 = np.min(pts_2d[:, 0])
        x1 = np.max(pts_2d[:, 0])
        y0 = np.min(pts_2d[:, 1])
        y1 = np.max(pts_2d[:, 1])
        x0 = max(0, x0)
        # x1 = min(x1, proj.image_width)
        y0 = max(0, y0)
        # y1 = min(y1, proj.image_height)
        return np.array([x0, y0, x1, y1])

    def project_velo_to_4p(self, pts_3d_velo):
        """ Input: nx3 points in velodyne coord.
            Output: 4 points in image2 coord.
        """
        pts_2d_velo = self.project_velo_to_image(pts_3d_velo)
        return self.project_8p_to_4p(pts_2d_velo)

    # ===========================
    # ------- 2d to 3d ----------
    # ===========================
    def project_image_to_rect(self, uv_depth):
        """ Input: nx3 first two channels are uv, 3rd channel
                   is depth in rect camera coord.
            Output: nx3 points in rect camera coord.
        """
        n = uv_depth.shape[0]
        x = ((uv_depth[:, 0] - self.c_u) * uv_depth[:, 2]) / self.f_u + self.b_x
        y = ((uv_depth[:, 1] - self.c_v) * uv_depth[:, 2]) / self.f_v + self.b_y
        pts_3d_rect = np.zeros((n, 3))
        pts_3d_rect[:, 0] = x
        pts_3d_rect[:, 1] = y
        pts_3d_rect[:, 2] = uv_depth[:, 2]
        return pts_3d_rect

    def project_image_to_velo(self, uv_depth):
        pts_3d_rect = self.project_image_to_rect(uv_depth)
        return self.project_rect_to_velo(pts_3d_rect)

    def project_depth_to_velo(self, depth, constraint_box=True):
        depth_pt3d = get_depth_pt3d(depth)
        depth_UVDepth = np.zeros_like(depth_pt3d)
        depth_UVDepth[:, 0] = depth_pt3d[:, 1]
        depth_UVDepth[:, 1] = depth_pt3d[:, 0]
        depth_UVDepth[:, 2] = depth_pt3d[:, 2]
        # print("depth_pt3d:",depth_UVDepth.shape)
        depth_pc_velo = self.project_image_to_velo(depth_UVDepth)
        # print("dep_pc_velo:",depth_pc_velo.shape)
        if constraint_box:
            depth_box_fov_inds = (
                (depth_pc_velo[:, 0] < cbox[0][1])
                & (depth_pc_velo[:, 0] >= cbox[0][0])
                & (depth_pc_velo[:, 1] < cbox[1][1])
                & (depth_pc_velo[:, 1] >= cbox[1][0])
                & (depth_pc_velo[:, 2] < cbox[2][1])
                & (depth_pc_velo[:, 2] >= cbox[2][0])
            )
            depth_pc_velo = depth_pc_velo[depth_box_fov_inds]
        return depth_pc_velo

calib = Calibration(path_calib)

img = cv2.imread(path_img)
objects = read_txt_to_objs(path_txt)
img1 = np.copy(img)  # for 2d bbox
img2 = np.copy(img)  # for 3d bbox


for obj in objects:
    if obj.type == "DontCare":
        continue
    if obj.type == "Car":
        cv2.rectangle(
            img1,
            (int(obj.xmin), int(obj.ymin)),
            (int(obj.xmax), int(obj.ymax)),
            (0, 255, 0),
            2,
        )
    if obj.type == "Pedestrian":
        cv2.rectangle(
            img1,
            (int(obj.xmin), int(obj.ymin)),
            (int(obj.xmax), int(obj.ymax)),
            (255, 255, 0),
            2,
        )
    if obj.type == "Cyclist":
        cv2.rectangle(
            img1,
            (int(obj.xmin), int(obj.ymin)),
            (int(obj.xmax), int(obj.ymax)),
            (0, 255, 255),
            2,
        )

    box3d_pts_2d, _ = compute_box_3d(obj, calib.P)

    if box3d_pts_2d is None:
        print("something wrong in the 3D box.")
        continue
    if obj.type == "Car":
        img2 = draw_projected_box3d(img2, box3d_pts_2d, color=(0, 255, 0))
    elif obj.type == "Pedestrian":
        img2 = draw_projected_box3d(img2, box3d_pts_2d, color=(255, 255, 0))
    elif obj.type == "Cyclist":
        img2 = draw_projected_box3d(img2, box3d_pts_2d, color=(0, 255, 255))


cv2.namedWindow("img-2d", 0)
cv2.imshow("img-2d", img1)
cv2.namedWindow("img-3d", 0)
cv2.imshow("img-3d", img2)
cv2.waitKey()

## show_lidar_on_image

dtype = np.float32
n_vec = 4
scan = np.fromfile(path_lidar, dtype=dtype)
pc_velo = scan.reshape((-1, n_vec))
pc_velo = pc_velo[:, 0:3]

img_height, img_width, _ = img.shape
imgfov_pc_velo, pts_2d, fov_inds = get_lidar_in_image_fov(
    pc_velo, calib, 0, 0, img_width, img_height, True
)

imgfov_pts_2d = pts_2d[fov_inds, :]
imgfov_pc_rect = calib.project_velo_to_rect(imgfov_pc_velo)

import matplotlib.pyplot as plt

cmap = plt.cm.get_cmap("hsv", 256)
cmap = np.array([cmap(i) for i in range(256)])[:, :3] * 255

for i in range(imgfov_pts_2d.shape[0]):
    depth = imgfov_pc_rect[i, 2]
    color = cmap[int(640.0 / depth), :]
    cv2.circle(
        img,
        (int(np.round(imgfov_pts_2d[i, 0])), int(np.round(imgfov_pts_2d[i, 1]))),
        2,
        color=tuple(color),
        thickness=-1,
    )
cv2.imshow("projection", img)
cv2.waitKey(0)

标签:velo,数据,pts,self,可视化,np,kitti,rect,3d
From: https://www.cnblogs.com/yanghailin/p/18451076

相关文章

  • 网站域名无法连接数据库怎么回事
    网站域名无法连接数据库的问题可能涉及多个方面的原因,这里列举一些常见的原因及解决方法:数据库连接信息错误:检查数据库的用户名、密码、端口号、IP地址或主机名是否正确。网络问题:确保服务器与数据库之间的网络连通性正常,可以尝试ping数据库服务器地址来测试。检查是......
  • 宝塔面板数据库无法启动怎么办
    尝试手动启动数据库在命令行中尝试使用/etc/init.d/mysqldstart或servicemysqldstart命令手动启动MySQL服务。观察启动过程中是否有错误信息输出。重置或修复数据库如果上述方法都无法解决问题,可以考虑在宝塔面板中重置数据库密码或使用mysqladmin工具进行修复操作......
  • mysql数据库连接不上怎么办
    当遇到MySQL数据库连接不上的问题时,可以按照以下步骤逐一排查并尝试解决:检查网络连接确认客户端与MySQL服务器之间的网络连接是否正常。bash ping服务器IP或域名确认MySQL服务状态登录到服务器,检查MySQL服务是否已启动。bash systemctlstatu......
  • 网站数据库配置失败怎么办
    解决网站数据库配置失败的问题,可以按照以下步骤进行排查和修复:检查配置文件确认数据库连接信息是否正确,包括数据库地址、端口、用户名和密码。检查数据库名称是否正确。验证数据库服务状态确认数据库服务是否正在运行。使用命令行工具尝试连接数据库,确认连接是否成功......
  • 图像数据增强库综述:10个强大图像增强工具对比与分析
    在深度学习和计算机视觉领域,数据增强已成为提高模型性能和泛化能力的关键技术。本文旨在全面介绍当前广泛使用的图像数据增强库,分析其特点和适用场景,以辅助研究人员和开发者选择最适合其需求的工具。数据增强在深度学习模型训练中扮演着至关重要的角色,其重要性主要体现在以下几......
  • 网站数据库配置文件config.php
    在开发Web应用时,通常会有一个专门用于配置数据库连接信息的文件,例如config.php。这个文件主要用于存储数据库的连接参数,以便在整个应用程序中重用。下面是一个简单的config.php示例,展示了如何定义数据库连接的基本信息:<?php//数据库配置文件//数据库类型define('DB_TYPE......
  • 数据库连接错误的原因及解决方法
    数据库连接错误可能由多种原因引起。以下是一些常见的数据库连接错误及其解决方法:1.错误的数据库凭证原因错误的数据库用户名或密码。数据库用户没有足够的权限。解决方法检查用户名和密码:确认数据库用户名和密码是否正确。重新输入正确的用户名和密码。检查用......
  • 尽可能地恢复织梦CMS的数据库
    假设你有一个名为 backup_file.sql 的备份文件,可以使用以下命令恢复:使用phpMyAdmin登录phpMyAdmin:访问phpMyAdmin页面。选择要恢复的数据库。点击“导入”选项卡。选择 backup_file.sql 并点击“开始”。使用MySQL命令行工具创建新数据库:sh mysql......
  • 数据结构课程设计大项目————迷宫问题(邻接矩阵,prim生成算法,DFS寻路,BFS寻路,路径回溯
    一.前言迷宫问题是数据结构中最值得实践的大项目之一,本文主要讲解思路,提供的代码大部分都有注释(没有的就是太多了懒得写了QAQ)。为了更好的表现效果,该程序使用了easyx可视化,easyx简单易学(大概一天到两天就可以学会),上手简单。该程序由c语言实现,本人水平有限程序可优化空间很大。......
  • 最简单的示例:通过JDBC查询数据
    引言在现代企业级应用开发中,持久层框架(如MyBatis、Hibernate等)极大地简化了数据库操作,提高了开发效率和代码的可维护性。本文将通过一个最简单的示例,演示如何使用JDBC连接数据库、执行SQL语句以及处理结果,并与Mybatis源码做对比,为后续深入研究MyBatis源码打下基础......