A More Stable Gesture Recognition Method: Based on Hand Skeleton and Keypoint Detection


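Overview

This post introduces and demonstrates how to extract the hand skeleton and keypoints with MediaPipe, and how to build gesture recognition on top of them.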

Introduction

MediaPipe was covered in an earlier article; see the following link:

Google open-source gesture recognition: based on TF Lite/MediaPipe

What can it do, and which languages and platforms does it support? See the two figures below:

[Figure: what MediaPipe can do]

[Figure: supported languages and platforms]

We focus on hand skeleton and keypoint extraction; readers interested in the other features can explore them on their own. GitHub: https://github.com/google/mediapipe

Demo

Hand skeleton extraction and keypoint annotation:

Recognizing the gestures 0 to 6:

Implementation steps

For details, see the following link:

https://google.github.io/mediapipe/solutions/hands

(1) Install mediapipe by running pip install mediapipe

(2) Download the hand detection and skeleton extraction models from:

https://github.com/google/mediapipe/tree/master/mediapipe/modules/hand_landmark

(3) Test code (real-time webcam test):

import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

hands = mp_hands.Hands(
    min_detection_confidence=0.5, min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
  success, image = cap.read()
  if not success:
    print("Ignoring empty camera frame.")
    continue

  # Flip horizontally for a selfie-view display and convert BGR (OpenCV)
  # to RGB (MediaPipe).
  image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
  # To improve performance, mark the image as not writeable so it can be
  # passed by reference.
  image.flags.writeable = False
  results = hands.process(image)

  # Convert back to BGR and draw the skeleton of every detected hand.
  image.flags.writeable = True
  image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
  if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
      mp_drawing.draw_landmarks(
          image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  cv2.imshow('result', image)
  # Exit when Esc is pressed.
  if cv2.waitKey(5) & 0xFF == 27:
    break
cv2.destroyAllWindows()
hands.close()
cap.release()

Output and results:

Image detection (multiple hands supported):

import os

import cv2
import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils
mp_hands = mp.solutions.hands

# For static images:
hands = mp_hands.Hands(
    static_image_mode=True,
    max_num_hands=5,
    min_detection_confidence=0.2)
img_path = './multi_hands/'
save_path = './'
index = 0
for filename in sorted(os.listdir(img_path)):
  # cv2.imread returns None for files it cannot decode; skip those.
  image = cv2.imread(os.path.join(img_path, filename))
  if image is None:
    continue
  index += 1
  # Flip around the y-axis: MediaPipe's handedness output assumes a
  # mirrored (selfie-style) input image.
  image = cv2.flip(image, 1)
  # Convert the BGR image to RGB before processing.
  results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

  # Print handedness and draw hand landmarks on the image.
  print('Handedness:', results.multi_handedness)
  if not results.multi_hand_landmarks:
    continue
  image_height, image_width, _ = image.shape
  annotated_image = image.copy()
  for hand_landmarks in results.multi_hand_landmarks:
    print('hand_landmarks:', hand_landmarks)
    tip = hand_landmarks.landmark[mp_hands.HandLandmark.INDEX_FINGER_TIP]
    print(
        'Index finger tip coordinates: '
        f'({tip.x * image_width}, {tip.y * image_height})')
    mp_drawing.draw_landmarks(
        annotated_image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  # Flip back before saving so the output matches the original orientation.
  cv2.imwrite(
      os.path.join(save_path, str(index) + '.png'),
      cv2.flip(annotated_image, 1))
hands.close()


# For webcam input:
hands = mp_hands.Hands(
    min_detection_confidence=0.5, min_tracking_confidence=0.5)
cap = cv2.VideoCapture(0)
while cap.isOpened():
  success, image = cap.read()
  if not success:
    print("Ignoring empty camera frame.")
    # If loading a video, use 'break' instead of 'continue'.
    continue


  # Flip the image horizontally for a later selfie-view display, and convert
  # the BGR image to RGB.
  image = cv2.cvtColor(cv2.flip(image, 1), cv2.COLOR_BGR2RGB)
  # To improve performance, optionally mark the image as not writeable to
  # pass by reference.
  image.flags.writeable = False
  results = hands.process(image)


  # Draw the hand annotations on the image.
  image.flags.writeable = True
  image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
  if results.multi_hand_landmarks:
    for hand_landmarks in results.multi_hand_landmarks:
      mp_drawing.draw_landmarks(
          image, hand_landmarks, mp_hands.HAND_CONNECTIONS)
  cv2.imshow('result', image)
  if cv2.waitKey(5) & 0xFF == 27:
    break
cv2.destroyAllWindows()
hands.close()
cap.release()
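
The static-image script prints results.multi_handedness as a raw message, which is hard to read. As a minimal sketch, assuming each entry exposes a classification list whose first item carries label and score fields (this is how the mp.solutions.hands output is laid out, as far as we know), a small helper can turn it into readable pairs. The name summarize_handedness and the SimpleNamespace stand-ins are illustrative only and are not part of MediaPipe:

```python
from types import SimpleNamespace


def summarize_handedness(multi_handedness):
    """Return a list of (label, score) pairs, one per detected hand.

    Assumes each entry has a `classification` sequence whose first item
    has `label` ("Left"/"Right") and `score` attributes.
    """
    if not multi_handedness:
        return []
    summary = []
    for hand in multi_handedness:
        top = hand.classification[0]
        summary.append((top.label, round(top.score, 3)))
    return summary


if __name__ == "__main__":
    # Stand-in objects so the helper can be tried without a camera or model.
    fake_result = [
        SimpleNamespace(
            classification=[SimpleNamespace(label="Right", score=0.91)]),
    ]
    print(summarize_handedness(fake_result))
```

In the scripts above you would call it as summarize_handedness(results.multi_handedness). Keep in mind that the labels are only meaningful for mirrored input, which is why both scripts flip the image before processing.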

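Summary and further notes

Summary: MediaPipe's hand detection and skeleton extraction models are more stable than traditional methods, and they also provide 3D coordinates for the finger joints, which is very helpful for gesture recognition and for further development of gesture-based interaction.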

Other notes:

(1) The hand landmarks are numbered and ordered as shown in the figure below:

[Figure: hand landmark indices]
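
For reference, the numbering can also be written out in code. The list below is a sketch that mirrors the order of mp.solutions.hands.HandLandmark (21 points, wrist first, then four joints per finger from thumb to pinky); the names HAND_LANDMARK_NAMES and landmark_index are our own and are not part of MediaPipe:

```python
# Landmark names in index order, mirroring mp.solutions.hands.HandLandmark.
HAND_LANDMARK_NAMES = [
    "WRIST",
    "THUMB_CMC", "THUMB_MCP", "THUMB_IP", "THUMB_TIP",
    "INDEX_FINGER_MCP", "INDEX_FINGER_PIP",
    "INDEX_FINGER_DIP", "INDEX_FINGER_TIP",
    "MIDDLE_FINGER_MCP", "MIDDLE_FINGER_PIP",
    "MIDDLE_FINGER_DIP", "MIDDLE_FINGER_TIP",
    "RING_FINGER_MCP", "RING_FINGER_PIP",
    "RING_FINGER_DIP", "RING_FINGER_TIP",
    "PINKY_MCP", "PINKY_PIP", "PINKY_DIP", "PINKY_TIP",
]


def landmark_index(name):
    """Return the integer index of a landmark name (hypothetical helper)."""
    return HAND_LANDMARK_NAMES.index(name)


if __name__ == "__main__":
    for index, name in enumerate(HAND_LANDMARK_NAMES):
        print(index, name)
```

With MediaPipe installed, the same values are available directly, for example mp_hands.HandLandmark.INDEX_FINGER_TIP.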

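(2) The landmark coordinates (x, y, z) are output as normalized decimals: x and y lie in [0, 1] relative to the image width and height, so they must be scaled back to pixel coordinates before they can be drawn on the image. You can see how this is done by going to the definitions in the source code above; a demo is given here. According to the MediaPipe documentation, z is the landmark depth relative to the wrist, and a smaller value means the landmark is closer to the camera:

import mediapipe as mp

mp_drawing = mp.solutions.drawing_utils

def Normalize_landmarks(image, hand_landmarks):
  # Convert the normalized landmarks of one hand to pixel coordinates.
  height, width = image.shape[:2]
  new_landmarks = []
  for landmark in hand_landmarks.landmark:
    # landmark.z (depth relative to the wrist, smaller = closer to the
    # camera) is not needed for 2D drawing and is ignored here.
    # _normalized_to_pixel_coordinates is a private MediaPipe helper; it
    # returns None when the point falls outside the image.
    pt = mp_drawing._normalized_to_pixel_coordinates(
        landmark.x, landmark.y, width, height)
    new_landmarks.append(pt)
  return new_landmarks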
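
Because the helper used above is private and may change between MediaPipe versions, the conversion can also be written by hand. The following is a minimal sketch of the same idea (scale by the image size, clamp to the last valid pixel, return None for points outside the image); the function name to_pixel is hypothetical:

```python
import math


def to_pixel(x_norm, y_norm, width, height):
    """Map normalized (x, y) in [0, 1] to integer pixel coordinates.

    Returns None when the point lies outside the image.
    """
    if not (0.0 <= x_norm <= 1.0 and 0.0 <= y_norm <= 1.0):
        return None
    # Clamp so that a normalized value of exactly 1.0 maps to the last pixel.
    x_px = min(math.floor(x_norm * width), width - 1)
    y_px = min(math.floor(y_norm * height), height - 1)
    return x_px, y_px


if __name__ == "__main__":
    print(to_pixel(0.5, 0.5, 640, 480))
```

Landmarks of a partially visible hand can fall outside [0, 1], so callers should be prepared to handle None.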

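(3) On this basis you can build a simple gesture recognizer, or a small program that detects the hand moving toward or away from the screen. Of course, the landmark coordinates alone are not enough: you may also need to compute joint angles, take previous states into account, and so on, for example: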
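
One possible way to use joint angles is sketched below. It is not the code behind the demo; it assumes the landmarks have already been converted to a list of 21 (x, y) pixel points in MediaPipe's index order, treats a finger as extended when the angle at its middle joint is at least 160 degrees (an arbitrary threshold), and maps thumb plus pinky to 6 following the common Chinese hand sign. All function names here are hypothetical:

```python
import math

# (base joint, middle joint, tip) indices per finger, in MediaPipe's order.
FINGER_JOINTS = {
    "thumb": (2, 3, 4),
    "index": (5, 6, 8),
    "middle": (9, 10, 12),
    "ring": (13, 14, 16),
    "pinky": (17, 18, 20),
}


def joint_angle(a, b, c):
    """Angle in degrees at point b between the segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate input: treat the finger as not extended
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def extended_fingers(points, threshold_deg=160.0):
    """Names of the fingers whose middle joint is nearly straight."""
    return [
        name
        for name, (a, b, c) in FINGER_JOINTS.items()
        if joint_angle(points[a], points[b], points[c]) >= threshold_deg
    ]


def classify_digit(points, threshold_deg=160.0):
    """Map a hand pose to a digit 0..6 (assumed mapping, see lead-in)."""
    fingers = extended_fingers(points, threshold_deg)
    if set(fingers) == {"thumb", "pinky"}:
        return 6
    return len(fingers)
```

Angles are best computed on pixel coordinates, since normalized x and y use different scales when the image is not square. A real recognizer would also need to smooth the result over several frames.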

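
For the toward/away case, a simple proxy is the apparent size of the hand, compared against a smoothed value from previous frames. The sketch below is one way to do this under stated assumptions: the hand size is taken as the pixel distance from the wrist (index 0) to the middle finger MCP joint (index 9), the previous state is an exponential moving average, and the alpha and tolerance values are arbitrary. The class name ApproachDetector is hypothetical:

```python
import math


class ApproachDetector:
    """Report whether the hand appears to move toward or away from the camera."""

    def __init__(self, alpha=0.3, tolerance=0.05):
        self.alpha = alpha          # weight of the newest frame in the average
        self.tolerance = tolerance  # relative change treated as noise
        self.smoothed = None        # smoothed hand size from earlier frames

    def update(self, points):
        """Take 21 (x, y) pixel points; return 'approaching', 'receding' or 'steady'."""
        size = math.dist(points[0], points[9])
        if not self.smoothed:
            # First frame (or a degenerate previous size): nothing to compare to.
            self.smoothed = size
            return "steady"
        ratio = size / self.smoothed
        self.smoothed = self.alpha * size + (1 - self.alpha) * self.smoothed
        if ratio > 1 + self.tolerance:
            return "approaching"
        if ratio < 1 - self.tolerance:
            return "receding"
        return "steady"
```

Create one detector per tracked hand and call update once per frame with that hand's pixel landmarks. Rotating the hand also changes its apparent size, so this is only a rough indicator.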

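The other demos and related code are published in the author's Knowledge Planet (知识星球) topic; readers who need them can join to get access.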

Source: https://blog.51cto.com/stq054188/5836326
