手写Kd树(C++模板非递归实现)
本文实现的Kd树实现参考了高翔博士的书《自动驾驶与机器人中的slam技术从理论到实践》;高博士原书中是递归形式实现的,本文改写成了while循环的形式,避免了特大规模点云时的栈溢出问题,并且通过C++无类型模板参数,理论上可以实现广义上的无穷维点云的最近邻匹配。
1. Kd树
1.1 Kd树简介
K-d 树,最早由 Bentley Jon Louis 提出,是二分树的高维度版本,示意图如下图所示。K-d 树也是二叉树的一种,任意一个 K-d 树的节点由左右两侧组成。在二分树里,可以用单个维度的信息来区分左右,但在 K-d 树里,由于要分割高维数据,会用超平面(Hyperplane )来区分左右侧(不过,对于三维点,实际上的超平面就是普通的二维平面 )。在如何分割方面,则存在一些方法上的差异。当然,理论上,寻找超平面来分割两个高维点集可以看成一个SVM(支持向量机)的分类问题。本文为了追求算法的实时性,选用最简单的分割方法:沿轴超平面分割,具体到点云中就是求解其三个维度上的点云方差,选取最大方差的轴作为分割轴,并且选取均值作为分割阈值。
1.2 Kd树的建立
在 K-d 树的构建过程中,主要考虑如何对给定点云进行分割。不同分割方法的策略不同。传统的做法,或是以固定顺序来交替坐标轴陶,或是计算当前点云在各轴上的分散程度,取分散程度最大的轴作为分割轴。本文使用的是后一种方法,且采用BFS的方式构建Kd树。
Kd树的构建步骤如下:
- 初始化根节点,插入队列当中。
- 从队列中弹出所有节点,计算分割面分割点云序列。
- 小于分割阈值的点插入到左子树的序列当中;大于的插入到右子树序列当中。
- 如果此时点云序列中的点数量为1,则定义其为叶子节点;否则继续将其插入到队列中。
- 循环执行2-4。
代码实现如下:
template <int dim>
bool KdTree<dim>::build(const std::vector<PointType> &cloud)
{
if (cloud.empty())
return false;
m_cloud = cloud;
std::vector<size_t> cloud_idx(cloud.size());
std::for_each(cloud_idx.begin(), cloud_idx.end(), [idx = 0](size_t &i) mutable
{ i = idx++; });
// Initalize root node
m_root = std::make_shared<KdTreeNode>();
std::queue<std::pair<KdTreeNode *, std::vector<size_t>>> que_;
que_.push({m_root.get(), cloud_idx});
while (!que_.empty())
{
int size = que_.size();
for (int i = 0; i < size; ++i)
{
std::pair<KdTreeNode *, std::vector<size_t>> node_to_cloud = que_.front();
que_.pop();
KdTreeNode *node_temp = node_to_cloud.first;
std::vector<size_t> cloud_idx_temp = node_to_cloud.second;
std::vector<size_t> left_cloud_idx, right_cloud_idx;
compute_split_parameters_(cloud_idx_temp, node_temp->split_threval_,
node_temp->split_axis_,
left_cloud_idx, right_cloud_idx);
const auto create_leaf_node = [&que_](const std::vector<size_t> &index, KdTreeNode *&node)
{
if (!index.empty())
{
node = new KdTreeNode;
if (1 == index.size()) // leaf node
node->point_idx_ = index[0];
else
que_.push({node, index});
}
};
create_leaf_node(left_cloud_idx, node_temp->left_);
create_leaf_node(right_cloud_idx, node_temp->right_);
}
}
return true;
}
1.3 Kd树的查找
K-d 树的查找实际就是对二叉树的遍历过程。因此,与二叉树类似,可以用前序、中序、后序等方法来遍历。而 K-d 树的特点决定了我们可以对某些不必要的分枝进行剪枝,达到提高搜索效率的目的。下图展示了对某个查询点进行 K-d 树查找的过程。这里要当心的是,虽然查询点( 蓝色)落在了 K-d 树左侧,但它的最近邻是否一定落在左侧呢?事实上并不一定。分割面的位置是由 K-d树建立时期的点云分布决定的。而在查找时,这个最近邻既可以落在左侧,也可以落在右侧。然而,由于分割面的存在,右侧点与查询点会存在一个最小距离,左侧点则没有。这个最小距离就是查询点到分割面的垂直距离,记为 d。因此,如果在左侧找到了一个比 d 更近的点,那么右侧就
不可能存在更近的最近邻,遍历算法就不必再去右侧分枝搜索。反之,如果左侧的最近邻距离比 d要大,那么右侧还可能有更近的点,我们必须向右侧搜索。这就是 K-d 树遍历的基本原则,本文采用栈模拟的前序遍历实现Kd树的查找。
Kd树的前序遍历搜索过程如下:
- 将根节点入栈。
- 将栈中的所有节点依次出栈;如果出栈的是叶子节点,计算其到查询点的距离,如果小于堆中的元素,插入到结果堆中。
- 如果出栈的节点不是叶子节点。计算其在分割面的哪一侧,如果在左侧则搜索左子树;否则搜索右子树。
- 判断其是否需要剪枝,如果不需要则将右节点入栈;最后将左节点入栈。
- 循环执行2-4。
代码实现如下:
template <int dim>
void KdTree<dim>::search_(const PointType &point, std::priority_queue<NodeAndDistance> &knn_result)
{
std::stack<KdTreeNode *> st;
st.push(m_root.get());
while (!st.empty())
{
KdTreeNode *node_temp = st.top();
st.pop();
if (node_temp->is_leaf())
{
compute_leaf_distance(point, node_temp, knn_result);
}
else
{
KdTreeNode *this_side, *that_side;
if (point[node_temp->split_axis_] <
node_temp->split_threval_)
{
this_side = node_temp->left_;
that_side = node_temp->right_;
}
else
{
this_side = node_temp->right_;
that_side = node_temp->left_;
}
if (need_prune_branches_(point, node_temp, knn_result))
if (that_side)
st.push(that_side);
if (this_side)
st.push(this_side);
}
}
}
2. C++完整代码实现
#ifndef KDTREE_H
#define KDTREE_H
#include <Eigen/Dense>
#include <iostream>
#include <stack>
#include <queue>
#include <numeric>
#include <execution>
#include <ostream>
struct KdTreeNode
{
int idx_ = -1;
size_t point_idx_ = 0;
int split_axis_ = 0;
double split_threval_ = 0.f;
KdTreeNode *left_ = nullptr;
KdTreeNode *right_ = nullptr;
bool is_leaf() const { return left_ == nullptr && right_ == nullptr; };
};
struct NodeAndDistance
{
NodeAndDistance(KdTreeNode *node, double distance) : node_(node), distance_(distance){};
double distance_ = 0.f;
KdTreeNode *node_ = nullptr;
bool operator<(const NodeAndDistance &rhs) const { return distance_ < rhs.distance_; }
};
template <int dim>
class KdTree
{
public:
using PointType = Eigen::Matrix<double, dim, 1>;
explicit KdTree() = default;
~KdTree() { clear(); };
// BFS build kdtree
bool build(const std::vector<PointType> &cloud);
// DFS search
bool get_knn_points(const PointType &point, std::vector<size_t> &knn_idx, int k = 5);
// MT DFS search
bool get_knn_points_mt(const std::vector<PointType> &cloud,
std::vector<std::pair<size_t, size_t>> &matches, int k = 5);
// DFS search
void clear();
template <int dim>
friend std::ostream &operator<<(std::ostream &out, const KdTree<dim> &tree);
private:
void search_(const PointType &point, std::priority_queue<NodeAndDistance> &knn_result);
bool need_prune_branches_(const PointType &point, KdTreeNode *node,
std::priority_queue<NodeAndDistance> &knn_result);
void compute_split_parameters_(const std::vector<size_t> &points_index,
double &thre, int &axis,
std::vector<size_t> &left,
std::vector<size_t> &right);
void compute_leaf_distance(const PointType &point, KdTreeNode *node,
std::priority_queue<NodeAndDistance> &knn_result);
private:
std::shared_ptr<KdTreeNode> m_root = nullptr;
int m_k;
std::vector<PointType> m_cloud;
};
template <int dim>
bool KdTree<dim>::build(const std::vector<PointType> &cloud)
{
if (cloud.empty())
return false;
m_cloud = cloud;
std::vector<size_t> cloud_idx(cloud.size());
std::for_each(cloud_idx.begin(), cloud_idx.end(), [idx = 0](size_t &i) mutable
{ i = idx++; });
// Initalize root node
m_root = std::make_shared<KdTreeNode>();
std::queue<std::pair<KdTreeNode *, std::vector<size_t>>> que_;
que_.push({m_root.get(), cloud_idx});
while (!que_.empty())
{
int size = que_.size();
for (int i = 0; i < size; ++i)
{
std::pair<KdTreeNode *, std::vector<size_t>> node_to_cloud = que_.front();
que_.pop();
KdTreeNode *node_temp = node_to_cloud.first;
std::vector<size_t> cloud_idx_temp = node_to_cloud.second;
std::vector<size_t> left_cloud_idx, right_cloud_idx;
compute_split_parameters_(cloud_idx_temp, node_temp->split_threval_,
node_temp->split_axis_,
left_cloud_idx, right_cloud_idx);
const auto create_leaf_node = [&que_](const std::vector<size_t> &index, KdTreeNode *&node)
{
if (!index.empty())
{
node = new KdTreeNode;
if (1 == index.size()) // leaf node
node->point_idx_ = index[0];
else
que_.push({node, index});
}
};
create_leaf_node(left_cloud_idx, node_temp->left_);
create_leaf_node(right_cloud_idx, node_temp->right_);
}
}
return true;
}
template <int dim>
bool KdTree<dim>::get_knn_points(const PointType &point, std::vector<size_t> &knn_idx, int k)
{
m_k = k;
std::priority_queue<NodeAndDistance> knn_result;
search_(point, knn_result);
knn_idx.clear();
while (!knn_result.empty())
{
knn_idx.push_back(knn_result.top().node_->point_idx_);
knn_result.pop();
}
return true;
}
template <int dim>
bool KdTree<dim>::get_knn_points_mt(const std::vector<PointType> &cloud,
std::vector<std::pair<size_t, size_t>> &matches, int k)
{
if (cloud.empty())
return false;
matches.resize(cloud.size() * k);
std::vector<size_t> index(cloud.size());
std::for_each(index.begin(), index.end(), [idx = 0](size_t &i) mutable
{ i = idx++; });
std::for_each(std::execution::par_unseq, index.begin(), index.end(), [&](size_t &i)
{
std::vector<size_t> knn_idx;
get_knn_points(cloud[i], knn_idx, k);
for (size_t j = 0; j < k; ++j)
{
matches[i * k + j].second = knn_idx[j];
if(j>=knn_idx.size())
{
matches[i * k + j].first = std::numeric_limits<size_t>::max();
}
else
{
matches[i * k + j].first = i;
}
} });
return true;
}
template <int dim>
void KdTree<dim>::clear()
{
std::stack<KdTreeNode *> st;
if (m_root)
{
if (m_root->right_)
st.push(m_root->right_);
if (m_root->left_)
st.push(m_root->left_);
}
while (!st.empty())
{
KdTreeNode *node_temp = st.top();
st.pop();
if (node_temp->right_)
st.push(node_temp->right_);
if (node_temp->left_)
st.push(node_temp->left_);
delete node_temp;
}
}
template <int dim>
void KdTree<dim>::search_(const PointType &point, std::priority_queue<NodeAndDistance> &knn_result)
{
std::stack<KdTreeNode *> st;
st.push(m_root.get());
while (!st.empty())
{
KdTreeNode *node_temp = st.top();
st.pop();
if (node_temp->is_leaf())
{
compute_leaf_distance(point, node_temp, knn_result);
}
else
{
KdTreeNode *this_side, *that_side;
if (point[node_temp->split_axis_] <
node_temp->split_threval_)
{
this_side = node_temp->left_;
that_side = node_temp->right_;
}
else
{
this_side = node_temp->right_;
that_side = node_temp->left_;
}
if (need_prune_branches_(point, node_temp, knn_result))
if (that_side)
st.push(that_side);
if (this_side)
st.push(this_side);
}
}
}
template <int dim>
bool KdTree<dim>::need_prune_branches_(const PointType &point, KdTreeNode *node,
std::priority_queue<NodeAndDistance> &knn_result)
{
if (knn_result.size() < m_k)
return true;
double dist = point[node->split_axis_] - node->split_threval_;
if (dist * dist < knn_result.top().distance_)
{
return true;
}
return false;
}
template <int dim>
void KdTree<dim>::compute_split_parameters_(const std::vector<size_t> &points_index,
double &thre, int &axis,
std::vector<size_t> &left,
std::vector<size_t> &right)
{
// Calculate mean
Eigen::Matrix<double, dim, 1> means;
means = std::accumulate(points_index.begin(), points_index.end(), Eigen::Matrix<double, dim, 1>::Zero().eval(),
[&](const Eigen::Matrix<double, dim, 1> &sum, const int &index)
{ return sum + m_cloud[index]; }) /
points_index.size();
// Calculate variance
Eigen::Matrix<double, dim, 1> vars;
vars = std::accumulate(points_index.begin(), points_index.end(), Eigen::Matrix<double, dim, 1>::Zero().eval(),
[&](const Eigen::Matrix<double, dim, 1> &sum, const int &index)
{ return sum + (m_cloud[index] - means).cwiseAbs2().eval(); }) /
points_index.size();
// Find the axis with maximum variance
vars.maxCoeff(&axis);
thre = means[axis];
// Split point cloud to left and right
std::for_each(points_index.begin(), points_index.end(), [&](const size_t &idx)
{
if (m_cloud[idx][axis] < thre)
left.emplace_back(idx);
else
right.emplace_back(idx); });
}
template <int dim>
void KdTree<dim>::compute_leaf_distance(const PointType &point, KdTreeNode *node,
std::priority_queue<NodeAndDistance> &knn_result)
{
double distance = (m_cloud[node->point_idx_] - point).squaredNorm();
if (knn_result.size() < m_k)
{
knn_result.push({node, distance});
}
else
{
if (distance < knn_result.top().distance_)
{
knn_result.emplace(NodeAndDistance(node, distance));
knn_result.pop();
}
}
}
template <int dim>
std::ostream &operator<<(std::ostream &out, const KdTree<dim> &tree)
{
KdTreeNode *root = tree.m_root.get();
std::queue<KdTreeNode *> que_;
que_.push(root);
while (!que_.empty())
{
int size = que_.size();
for (int i = 0; i < size; ++i)
{
KdTreeNode *node = que_.front();
que_.pop();
if (node->is_leaf())
{
out << "Leaf Node index: " << node->point_idx_ << std::endl;
out << "Point Cloud: " << std::endl
<< tree.m_cloud[node->point_idx_] << std::endl;
}
if (node->left_)
que_.push(node->left_);
if (node->right_)
que_.push(node->right_);
}
}
return out;
}
#endif
3. 测试代码
3.1 代码实现
本文通过生成一个椭球形状的模拟点云数据,测试Kd树;并且为了验证Kd树的性能和准确性,本文也实现了一个简单的多线程暴力搜索算法做对比。
#include "kdtree.hpp"
template <typename FuncT>
void evaluate_and_call(FuncT &&func, int run_num)
{
double total_time = 0.f;
for (int i = 0; i < run_num; ++i)
{
auto t1 = std::chrono::high_resolution_clock::now();
func();
auto t2 = std::chrono::high_resolution_clock::now();
total_time += std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count() * 1000;;
}
std::cout << "run cost: " << total_time / 1000.f << std::endl;
}
void generateSphereCoordinates(std::vector<Eigen::Vector3d> &points, int numPoints)
{
points.clear();
double goldenRatio = (1 + std::sqrt(5.0)) / 2.0;
double angleIncrement = CV_PI * 2 * goldenRatio;
for (int i = 0; i < numPoints; ++i)
{
double t = double(i) / double(numPoints);
double inclination = std::acos(1 - 2 * t);
double azimuth = angleIncrement * i;
double x = std::sin(inclination) * std::cos(azimuth);
double y = std::sin(inclination) * std::sin(azimuth);
double z = std::cos(inclination);
points.push_back(Eigen::Vector3d(x, y, z));
}
}
void transformToEllipsoid(std::vector<Eigen::Vector3d> &points, double a, double b, double c)
{
for (auto &point : points)
{
point[0] *= a;
point[1] *= b;
point[2] *= c;
}
}
std::vector<Eigen::Vector3d> generate_ellipsoid()
{
// ellipsoid parameters
double a = 200;
double b = 100;
double c = 150;
int numPoints = 100000;
// Generate sphere coordinates and transform to ellipsoid
std::vector<Eigen::Vector3d> spherePoints;
generateSphereCoordinates(spherePoints, numPoints);
transformToEllipsoid(spherePoints, a, b, c);
return spherePoints;
}
void bf_nn(const std::vector<Eigen::Vector3d> &source,
const std::vector<Eigen::Vector3d> &target,
std::vector<std::pair<size_t, size_t>> &result)
{
std::vector<size_t> index(target.size());
std::for_each(index.begin(), index.end(), [idx = 0](size_t &i) mutable
{ i = idx++; });
const auto bfnn_points = [&source](const Eigen::Vector3d &p) -> int
{
return std::min_element(source.begin(), source.end(),
[&p](const Eigen::Vector3d &p1, const Eigen::Vector3d &p2)
{ return (p1 - p).norm() < (p2 - p).norm(); }) -
source.begin();
};
result.resize(target.size());
std::for_each(std::execution::par_unseq, index.begin(), index.end(), [&](size_t &i)
{
result[i].second = bfnn_points(target[i]);
result[i].first = i; });
}
void test_kdtree3d()
{
// Generate virtual point cloud
std::vector<Eigen::Vector3d> points = generate_ellipsoid();
Eigen::AngleAxisd rvec(CV_PI / 6, Eigen::Vector3d::UnitZ());
Eigen::Vector3d T(2, 2, 2);
std::vector<Eigen::Vector3d> source(points.size());
std::transform(points.begin(), points.end(), source.begin(), [&rvec, &T](const Eigen::Vector3d &p)
{ return rvec.toRotationMatrix() * p + T; });
// Build KD-tree
KdTree<3> kd_tree;
kd_tree.build(source);
// std::cout << kd_tree << std::endl;
std::vector<std::pair<size_t, size_t>> matches;
kd_tree.get_knn_points_mt(points, matches, 1);
// std::for_each(matches.begin(), matches.end(), [](const std::pair<size_t, size_t> &p_pair)
// { std::cout << "points index: " << p_pair.first << " source index: " << p_pair.second << std::endl; });
}
void test_bf_matches()
{
// Generate virtual point cloud
std::vector<Eigen::Vector3d> points = generate_ellipsoid();
Eigen::AngleAxisd rvec(CV_PI / 6, Eigen::Vector3d::UnitZ());
Eigen::Vector3d T(2, 2, 2);
std::vector<Eigen::Vector3d> source(points.size());
std::transform(points.begin(), points.end(), source.begin(), [&rvec, &T](const Eigen::Vector3d &p)
{ return rvec.toRotationMatrix() * p + T; });
std::vector<std::pair<size_t, size_t>> bf_matches;
bf_nn(source, points, bf_matches);
// std::for_each(bf_matches.begin(), bf_matches.end(), [](const std::pair<size_t, size_t> &p_pair)
// { std::cout << "points index: " << p_pair.first << " source index: " << p_pair.second << std::endl; });
}
void test_kdtree_bf_result()
{
// Generate virtual point cloud
std::vector<Eigen::Vector3d> points = generate_ellipsoid();
Eigen::AngleAxisd rvec(CV_PI / 6, Eigen::Vector3d::UnitZ());
Eigen::Vector3d T(2, 2, 2);
std::vector<Eigen::Vector3d> source(points.size());
std::transform(points.begin(), points.end(), source.begin(), [&rvec, &T](const Eigen::Vector3d &p)
{ return rvec.toRotationMatrix() * p + T; });
// BF match
std::vector<std::pair<size_t, size_t>> bf_matches;
bf_nn(source, points, bf_matches);
// Kdtree match
KdTree<3> kd_tree;
kd_tree.build(source);
// std::cout << kd_tree << std::endl;
std::vector<std::pair<size_t, size_t>> matches;
kd_tree.get_knn_points_mt(points, matches, 1);
int true_num = 0;
for (size_t i = 0; i < bf_matches.size(); ++i)
{
if (bf_matches[i].second == matches[i].second)
{
true_num++;
}
}
double true_p = double(true_num) / double(matches.size());
std::cout << "True positive rate: " << true_p * 100 << std::endl;
}
int main()
{
evaluate_and_call(test_kdtree3d, 1);
evaluate_and_call(test_bf_matches, 1);
evaluate_and_call(test_kdtree_bf_result, 1);
return 0;
}
3.2 测试结果
run cost: 0.284431
run cost: 5.92498
True positive rate: 100
run cost: 6.16525
4. 与PCL中的Kd树做对比
#include "kdtree.hpp"
#include <pcl/point_cloud.h>
#include <pcl/kdtree/kdtree_flann.h>
template <typename FuncT>
void evaluate_and_call(FuncT &&func, int run_num)
{
double total_time = 0.f;
for (int i = 0; i < run_num; ++i)
{
auto t1 = std::chrono::high_resolution_clock::now();
func();
auto t2 = std::chrono::high_resolution_clock::now();
total_time += std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1).count() * 1000;;
}
std::cout << "run cost: " << total_time / 1000.f << std::endl;
}
void test_pcl_kdtree()
{
std::vector<Eigen::Vector3d> points = generate_ellipsoid();
pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
// Generate pointcloud data
cloud->width = points.size();
cloud->height = 1;
cloud->points.resize(cloud->width * cloud->height);
for (std::size_t i = 0; i < cloud->size(); ++i)
{
(*cloud)[i].x = points[i][0];
(*cloud)[i].y = points[i][1];
(*cloud)[i].z = points[i][2];
}
pcl::KdTreeFLANN<pcl::PointXYZ> kdtree;
kdtree.setInputCloud(cloud);
pcl::PointXYZ searchPoint;
int K = 1;
std::vector<int> pointIdxKNNSearch(K);
std::vector<float> pointKNNSquaredDistance(K);
Eigen::AngleAxisd rvec(CV_PI / 6, Eigen::Vector3d::UnitZ());
Eigen::Vector3d T(2, 2, 2);
Eigen::Vector3d p = points[0];
p = rvec.toRotationMatrix() * p + T;
pcl::PointXYZ p_search;
p_search.x = p[0];
p_search.y = p[1];
p_search.z = p[2];
kdtree.nearestKSearch(p_search, K, pointIdxKNNSearch, pointKNNSquaredDistance);
}
void test_kdtree3d()
{
// Generate virtual point cloud
std::vector<Eigen::Vector3d> points = generate_ellipsoid();
Eigen::AngleAxisd rvec(CV_PI / 6, Eigen::Vector3d::UnitZ());
Eigen::Vector3d T(2, 2, 2);
std::vector<Eigen::Vector3d> source(points.size());
std::transform(points.begin(), points.end(), source.begin(), [&rvec, &T](const Eigen::Vector3d &p)
{ return rvec.toRotationMatrix() * p + T; });
// Build KD-tree
KdTree<3> kd_tree;
kd_tree.build(points);
std::vector<size_t> knn;
kd_tree.get_knn_points(source[0], knn, 1);
}
int main()
{
evaluate_and_call(test_pcl_kdtree, 10);
evaluate_and_call(test_kdtree3d, 10);
return 0;
}
run cost: 0.748115
run cost: 0.204421
- 本文实现的kd树参照的高翔博士的书中的实现,按照原书的与PCL做的对比,其性能是相当的。但本文实现的kdtree要慢不少,分析原因可能是建树和析构内存比较耗时;也可能是本文使用的是椭球点云,PCL中的FLANN库分割的更加合理。如果代码实现上有什么问题欢迎批评指正。