I want to build a system that uses the xArm to grasp objects guided by vision (YOLO). (For reference only; these are my own review notes as a beginner.)
1. First, set up the desired environment with xarm_ros (GitHub).
I plan to use the xarm6_beside_table.launch file from xarm_gazebo (but with xarm_camera_scene.world as the world). I want a D435i camera at the xArm end effector, plus another D435i mounted above the table.
The overhead D435i determines the target coordinates for grasping, while the end-effector D435i is there to reduce the positioning error of the end effector.
Start by modifying xarm6_beside_table.launch in xarm_gazebo: copy the contents of simulation_D435i_sdf.launch from realsense_ros_gazebo (GitHub) into xarm6_beside_table.launch (the xArm position was adjusted):
<?xml version="1.0"?>
<launch>
  <arg name="run_demo" default="false" doc="If true, perform the demo after launch"/>
  <arg name="paused" default="false" doc="Starts gazebo in paused mode" />
  <arg name="gui" default="true" doc="Starts gazebo gui" />
  <arg name="effort_control" default="false"/>
  <arg name="velocity_control" default="false"/>
  <arg name="add_gripper" default="false" />
  <arg name="add_vacuum_gripper" default="false" />
  <arg name="namespace" default="xarm"/>
  <arg name="model1300" default="false" />
  <arg name="add_realsense_d435i" default="false" />
  <arg name="add_d435i_links" default="false" />
  <arg name="robot_sn" default="" />
  <arg name="x" default="0"/>
  <arg name="y" default="0"/>
  <arg name="z" default="2"/>
  <arg name="R" default="1.57"/>
  <arg name="P" default="0"/>
  <arg name="Y" default="1.57"/>

  <!-- camera sdf model and world -->
  <arg name="sdf" default="$(find realsense_ros_gazebo)/sdf/D435i/model.sdf"/>

  <arg name="xarm_velocity_control" value="$(eval arg('velocity_control') and not arg('run_demo'))" />

  <rosparam file="$(find xarm6_moveit_config)/config/xarm6_params.yaml" command="load" ns="$(arg namespace)"/>
  <rosparam if="$(arg add_gripper)" file="$(find xarm_controller)/config/gripper/gripper_gazebo_ros_control.yaml" command="load"/>

  <!-- startup simulated world -->
  <include file="$(find gazebo_ros)/launch/empty_world.launch">
    <arg name="world_name" value="$(find xarm_gazebo)/worlds/xarm_camera_scene.world"/>
    <arg name="paused" value="$(arg paused)"/>
    <arg name="gui" value="$(arg gui)"/>
  </include>

  <!-- send robot urdf to param server, joint limit may be overwritten if use moveit planner -->
  <param name="robot_description" command="$(find xacro)/xacro
    --inorder '$(find xarm_description)/urdf/xarm_device.urdf.xacro' robot_type:=xarm dof:=6
    add_gripper:=$(arg add_gripper) add_vacuum_gripper:=$(arg add_vacuum_gripper)
    effort_control:=$(arg effort_control) velocity_control:=$(arg xarm_velocity_control)
    model1300:=$(arg model1300)
    add_realsense_d435i:=$(arg add_realsense_d435i)
    add_d435i_links:=$(arg add_d435i_links)
    robot_sn:=$(arg robot_sn)
    "/>

  <!-- spawn robot model in gazebo, located on the table -->
  <node name="spawn_gazebo_model" pkg="gazebo_ros" type="spawn_model"
        respawn="false" output="screen"
        args="-urdf -model xarm6 -x -0 -y -0 -z 0.775 -Y 0 -param robot_description"/>
  <node name="sdf_test" pkg="gazebo_ros" type="spawn_model" output="screen" args="-sdf -file $(arg sdf) -model realsense_camera -x $(arg x) -y $(arg y) -z $(arg z) -R $(arg R) -P $(arg P) -Y $(arg Y)"/>

  <!-- load the corresponding controllers -->
  <include file="$(find xarm_controller)/launch/xarm6_control.launch">
    <arg name="run_demo_traj" value="$(arg run_demo)"/>
    <arg name="effort_control" value="$(arg effort_control)"/>
    <arg name="velocity_control" value="$(arg xarm_velocity_control)"/>
    <arg name="add_gripper" value="$(arg add_gripper)" />
    <arg name="namespace" value="$(arg namespace)" />
  </include>
</launch>
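A note on the wrist camera (my understanding, not verified): the sdf_test node above only spawns the stand-alone D435i above the table. The end-effector D435i is governed by the add_realsense_d435i arg, which this launch file forwards to xarm_device.urdf.xacro, so launching with add_realsense_d435i:=true should attach a D435i to the xArm wrist.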
Result screenshot (remember to adjust the D435i's angle, position, and so on):
2. Running YOLO detection in ROS
Following this link you can see how to use YOLO in ROS, and it also provides basic starter code.
There are three goals to achieve:
2-1. Detection and localization
2-2. Obtaining the centroid
2-3. Coordinate transformation
Only when all three are done can the arm actually go and grasp.
For detection and localization I use segmentation. Start from the code provided by YOLO as a base:
import time

import ros_numpy
import rospy
from sensor_msgs.msg import Image

from ultralytics import YOLO

detection_model = YOLO("yolov8m.pt")
segmentation_model = YOLO("yolov8m-seg.pt")
rospy.init_node("ultralytics")
time.sleep(1)

det_image_pub = rospy.Publisher("/ultralytics/detection/image", Image, queue_size=5)
seg_image_pub = rospy.Publisher("/ultralytics/segmentation/image", Image, queue_size=5)


def callback(data):
    """Callback function to process image and publish annotated images."""
    array = ros_numpy.numpify(data)
    if det_image_pub.get_num_connections():
        det_result = detection_model(array)
        det_annotated = det_result[0].plot(show=False)
        det_image_pub.publish(ros_numpy.msgify(Image, det_annotated, encoding="rgb8"))

    if seg_image_pub.get_num_connections():
        seg_result = segmentation_model(array)
        seg_annotated = seg_result[0].plot(show=False)
        seg_image_pub.publish(ros_numpy.msgify(Image, seg_annotated, encoding="rgb8"))


rospy.Subscriber("/camera/color/image_raw", Image, callback)

while True:
    rospy.spin()
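Once this node is running, the annotated streams can be viewed with, for example, rqt_image_view on /ultralytics/detection/image or /ultralytics/segmentation/image. Note that because of the get_num_connections() checks, the models only run while something is actually subscribed to those topics.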
Then, according to the output format documented by YOLO:
Here is a table for the Masks class methods and properties, including their name, type, and description:

Name | Type | Description
---|---|---
cpu() | Method | Returns the masks tensor on CPU memory.
numpy() | Method | Returns the masks tensor as a numpy array.
cuda() | Method | Returns the masks tensor on GPU memory.
to() | Method | Returns the masks tensor with the specified device and dtype.
xyn | Property (torch.Tensor) | A list of normalized segments represented as tensors.
xy | Property (torch.Tensor) | A list of segments in pixel coordinates represented as tensors.
Get the mask coordinates of the detected object: seg_result[0].masks.xyn
Detection screenshot:
Detection output:
0: 480x640 1 stop sign, 899.3ms
Speed: 9.5ms preprocess, 899.3ms inference, 5.2ms postprocess per image at shape (1, 3, 480, 640)
resturt [array([[ 0.82187, 0.77083],
[ 0.82187, 0.85208],
[ 0.82656, 0.85208],
[ 0.82812, 0.85417],
[ 0.82969, 0.85417],
[ 0.83281, 0.85833],
[ 0.83438, 0.85833],
[ 0.84062, 0.86667],
[ 0.84219, 0.86667],
[ 0.84375, 0.86875],
[ 0.84375, 0.87083],
[ 0.84531, 0.87292],
[ 0.84688, 0.87292],
[ 0.84844, 0.875],
[ 0.85, 0.875],
[ 0.85156, 0.87708],
[ 0.85156, 0.88542],
[ 0.92031, 0.88542],
[ 0.92031, 0.78958],
[ 0.91406, 0.78958],
[ 0.90938, 0.78333],
[ 0.90625, 0.78333],
[ 0.90469, 0.78125],
[ 0.90312, 0.78125],
[ 0.9, 0.77708],
[ 0.9, 0.77083]], dtype=float32)]
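The values above are normalized xyn coordinates. As a cross-check, here is a minimal sketch (my own addition, assuming a 640x480 color frame; masks.xy would already give pixel coordinates) that scales them back to pixels and takes a 2D centroid with cv2.moments:

import cv2
import numpy as np

def mask_centroid_px(result, img_w=640, img_h=480):
    """Return the 2D pixel centroid of the first mask in an Ultralytics result."""
    contour_n = result.masks.xyn[0]                      # (N, 2) normalized contour
    scale = np.array([img_w, img_h], dtype=np.float32)   # back to pixel coordinates
    contour_px = (contour_n * scale).astype(np.int32)
    m = cv2.moments(contour_px.reshape(-1, 1, 2))
    if m["m00"] == 0:
        return None                                      # degenerate contour
    return m["m10"] / m["m00"], m["m01"] / m["m00"]

# usage inside the callback above: mask_centroid_px(seg_result[0])

This is only the centroid in the image plane; the 3D centroid still has to come from depth, which is what the next script tries.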
For obtaining the centroid: turn the masks into points, then compute the centroid of those points (which gives the x, y, z coordinates of that point).
Code (the centroid has already been verified to be wrong; the code below has problems):
import time

import ros_numpy
import rospy
from sensor_msgs.msg import Image, PointCloud2, CameraInfo
import cv2
from ultralytics import YOLO
import numpy as np
import open3d as o3d

# Load the YOLO models
detection_model = YOLO("yolov9t.pt")
segmentation_model = YOLO("yolov8m-seg.pt")

rospy.init_node("ultralytics")
time.sleep(1)

# Create the publisher
seg_image_pub = rospy.Publisher("/ultralytics/segmentation/image", Image, queue_size=5)

# Global variables: camera intrinsics and the latest point cloud
camera_intrinsics = None
latest_pointcloud = None


def camera_info_callback(msg):
    """Extract the camera intrinsics from a CameraInfo message."""
    global camera_intrinsics
    # The intrinsic matrix K is a flat list of length 9
    K = msg.K
    fx = K[0]  # focal length fx
    fy = K[4]  # focal length fy
    cx = K[2]  # principal point cx
    cy = K[5]  # principal point cy
    # Store the intrinsics in a dict for later use
    camera_intrinsics = {
        'fx': fx,
        'fy': fy,
        'cx': cx,
        'cy': cy
    }
    # rospy.loginfo(f"Camera intrinsics: fx={fx}, fy={fy}, cx={cx}, cy={cy}")


def get_mask_from_points(point_cloud):
    """Project 3D point(s) back to 2D pixel coordinates with the pinhole model."""
    global camera_intrinsics
    if camera_intrinsics is None:
        rospy.logwarn("Camera intrinsics not available yet.")
        return None
    fx, fy = camera_intrinsics['fx'], camera_intrinsics['fy']
    cx, cy = camera_intrinsics['cx'], camera_intrinsics['cy']
    # Extract the X, Y, Z coordinates
    X = point_cloud[0]
    Y = point_cloud[1]
    Z = point_cloud[2]
    # Projection: convert 3D points to 2D pixel coordinates
    u = (fx * X / Z) + cx
    v = (fy * Y / Z) + cy
    # Stack the result into an Nx2 array of pixel coordinates
    mask_coords = np.vstack((u, v)).T
    return mask_coords


def pointcloud_callback(data):
    """Save the latest point cloud data."""
    global latest_pointcloud
    latest_pointcloud = ros_numpy.point_cloud2.pointcloud2_to_array(data)
    # print("lastest_pointcloud:", latest_pointcloud)


def image_callback(data):
    """Run segmentation on the image and publish the annotated result."""
    global latest_pointcloud
    # Convert the ROS Image message to a numpy array
    array = ros_numpy.numpify(data)
    # Segmentation
    if seg_image_pub.get_num_connections():
        seg_result = segmentation_model(array)
        seg_annotated = seg_result[0].plot(show=False)
        segments = seg_result[0].masks.xy[0]  # mask contour (pixel coordinates from masks.xy)
        # Make sure segments is a 2D NumPy array of (u, v) pairs
        if isinstance(segments, np.ndarray) and segments.ndim == 2 and segments.shape[1] == 2:
            # Convert normalized coordinates to image pixel coordinates (left disabled)
            # height, width = array.shape[:2]
            # scale_factors = np.array([width, height])  # [640, 480]
            try:
                # Broadcast over the whole array directly
                scaled_segments = segments
                print("Scaled mask points:", scaled_segments)
                # Further process the scaled coordinates
                selected_points = get_points_from_mask(scaled_segments, latest_pointcloud)
                print("selected_points:", selected_points)
                mask_coords = get_mask_from_points(selected_points)
                if selected_points is not None and len(selected_points) > 0:
                    centroid = np.mean(selected_points, axis=0)
                    print(f"Centroid of the selected point-cloud cluster: {centroid}")
                    mask_mask = get_mask_from_points(centroid)
                    # a, b = (mask_mask[0]).astype(np.int), (mask_mask[1]).astype(np.int)
                    print(f"Centroid of the selected mask: {mask_mask}")
                else:
                    print("No matching point cloud data found")
            except ValueError as e:
                print(f"Error during scaling: {e}")
        else:
            print("Segments data is not in expected format or shape.")
        seg_image_pub.publish(ros_numpy.msgify(Image, seg_annotated, encoding="rgb8"))


def get_points_from_mask(mask_coords, point_cloud):
    """Get the corresponding point cloud points from the mask pixel coordinates."""
    selected_points = []
    for coord in mask_coords:
        u, v = int(coord[0]), int(coord[1])
        # Extract the 3D point at pixel (u, v) from the point cloud
        point = point_cloud[v, u]
        if not np.isnan(point['x']):  # keep only valid (non-NaN) points
            selected_points.append([point['x'], point['y'], point['z']])
    return np.array(selected_points)


# Subscribe to the RGB image, point cloud, and camera info
rospy.Subscriber("/camera/color/image_raw", Image, image_callback)
rospy.Subscriber("/camera/depth/color/points", PointCloud2, pointcloud_callback)
rospy.Subscriber("/camera/color/camera_info", CameraInfo, camera_info_callback)

# ROS main loop
while not rospy.is_shutdown():
    key = cv2.waitKey(1) & 0xFF  # poll the keyboard every 1 ms
    if key == ord('q'):  # exit when q is pressed
        print("Exiting...")
        break

rospy.signal_shutdown('User pressed q to quit')
cv2.destroyAllWindows()
Where the problem lies: my current guess is that it comes from how the point cloud is being indexed. (??)
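One thing I want to verify (a minimal sketch under my assumptions, not tested yet): indexing like point_cloud[v, u] only makes sense if the cloud on /camera/depth/color/points is organized (msg.height > 1) and registered to the color image, and ros_numpy only keeps the (height, width) layout in that case. A guarded lookup would look like this:

import numpy as np
import ros_numpy
from sensor_msgs.msg import PointCloud2

def lookup_xyz(cloud_msg, u, v):
    """Return (x, y, z) at pixel (u, v), or None if the cloud cannot be indexed that way."""
    if cloud_msg.height <= 1:
        return None  # unorganized cloud: pixel indexing is meaningless
    cloud = ros_numpy.point_cloud2.pointcloud2_to_array(cloud_msg)  # shape (height, width)
    point = cloud[v, u]
    if np.isnan(point['x']):
        return None
    return float(point['x']), float(point['y']), float(point['z'])

(If the callback keeps only the converted array, cloud.ndim == 2 can serve as the same check.)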
Trying a new approach: borrowing ideas from this article.
Use this SDK call to get the 3D point coordinates: rs.rs2_deproject_pixel_to_point
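A minimal sketch of that idea (my assumptions: the intrinsics come from the CameraInfo topic already subscribed above, the depth value is in metres, and distortion is ignored since this is simulation; only the math helper from pyrealsense2 is used, no real device is opened):

import pyrealsense2 as rs

def deproject_pixel(u, v, depth_m, cam_info):
    """Deproject pixel (u, v) with depth in metres into a 3D point in the camera frame."""
    intrin = rs.intrinsics()
    intrin.width = cam_info.width
    intrin.height = cam_info.height
    intrin.fx = cam_info.K[0]
    intrin.fy = cam_info.K[4]
    intrin.ppx = cam_info.K[2]
    intrin.ppy = cam_info.K[5]
    intrin.model = rs.distortion.none  # assume no lens distortion
    intrin.coeffs = [0.0] * 5
    # Returns [X, Y, Z] in metres in the camera coordinate system
    return rs.rs2_deproject_pixel_to_point(intrin, [float(u), float(v)], float(depth_m))

The depth value would come from the depth image aligned to the color frame, converted to metres.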
Things to do tomorrow:
- Implement the coordinate transformation (see the sketch after this list).
- Extra: besides the existing colored blocks, add a banana and other objects to the scene.
- Use moveit_commander to grasp based on the YOLO detection results.
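For the coordinate transformation item, a minimal tf2 sketch of what I have in mind (the frame names camera_color_optical_frame and link_base are my guesses and must be checked against the actual TF tree in the simulation):

import rospy
import tf2_ros
import tf2_geometry_msgs  # registers PointStamped support for tf2
from geometry_msgs.msg import PointStamped

rospy.init_node("camera_to_base")
tf_buffer = tf2_ros.Buffer()
tf_listener = tf2_ros.TransformListener(tf_buffer)

def camera_point_to_base(x, y, z,
                         camera_frame="camera_color_optical_frame",
                         base_frame="link_base"):
    """Transform one 3D point from the camera optical frame into the robot base frame."""
    pt = PointStamped()
    pt.header.frame_id = camera_frame
    pt.header.stamp = rospy.Time(0)  # use the latest available transform
    pt.point.x, pt.point.y, pt.point.z = x, y, z
    return tf_buffer.transform(pt, base_frame, rospy.Duration(1.0))

The returned PointStamped in the base frame would then be passed to MoveIt as the grasp target.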