Detecting objects and the gaps between them when analyzing shelf images with Azure Vision AI's pretrained model

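I am working on shelf product recognition: a web app (built with Flask) analyzes shelf images using Azure Vision AI's pretrained model. I am running it on an Azure VM instance.

I need to detect the objects as well as the empty areas (gaps) between them.

The following code, app.py, marks the detected objects and the gaps between them: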

import os
from flask import Flask, request, redirect, render_template
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials
import cv2
import numpy as np
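import matplotlib
matplotlib.use('Agg')  # non-interactive backend: the server only writes figures to disk
import matplotlib.pyplot as plt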

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = 'static/uploads/'

# Initialize Azure Computer Vision client
endpoint = os.getenv('AZURE_COMPUTER_VISION_ENDPOINT')
key = os.getenv('AZURE_COMPUTER_VISION_KEY')
computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(key))

def preprocess_image(image_path):
    """Preprocess the image to detect edges."""
    image = cv2.imread(image_path)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)
    return edges

def analyze_image(filepath):
    """Analyze the uploaded image using Azure Computer Vision and detect objects
    and empty areas."""
    with open(filepath, "rb") as image_contents:
        results = computervision_client.analyze_image_in_stream(image_contents, visual_features=[VisualFeatureTypes.objects])

    image = cv2.imread(filepath)
    height, width, _ = image.shape

    empty_areas = []
    bounding_boxes = []

    # Analyze detected objects
    confidence_threshold = 0.5
    shelves = {}
    num_shelves = 5
    for obj in results.objects:
        if obj.confidence > confidence_threshold:
            left = int(obj.rectangle.x)
            top = int(obj.rectangle.y)
            right = left + int(obj.rectangle.w)
            bottom = top + int(obj.rectangle.h)
            bounding_boxes.append((left, top, right, bottom))
            row_key = (top // (height // num_shelves))
            if row_key not in shelves:
                shelves[row_key] = []
            shelves[row_key].append((left, top, right, bottom))

    # Detect empty areas between objects
    gap_threshold = 50
    for row_key, objects in shelves.items():
        objects.sort(key=lambda x: x[0])
        for i in range(len(objects) - 1):
            _, _, right1, _ = objects[i]
            left2, _, _, _ = objects[i + 1]
            gap_width = left2 - right1
            if gap_width > gap_threshold:
                empty_areas.append((right1, row_key * (height // num_shelves), left2, (row_key + 1) * (height // num_shelves)))

    # Create an output image with bounding boxes
    fig, ax = plt.subplots()
    ax.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

    # Draw bounding boxes for objects
    for (left, top, right, bottom) in bounding_boxes:
        rect = plt.Rectangle((left, top), right - left, bottom - top, edgecolor='g', facecolor='none')
        ax.add_patch(rect)  # the rectangle is only drawn once it is added to the axes

    # Draw bounding boxes for empty areas
    for (left, top, right, bottom) in empty_areas:
        rect = plt.Rectangle((left, top), right - left, bottom - top, edgecolor='r', facecolor='none')
        ax.add_patch(rect)

    # Save the output image
    result_filepath = os.path.join(app.config['UPLOAD_FOLDER'], 'result_' + os.path.basename(filepath))
    fig.savefig(result_filepath)
    plt.close(fig)  # release the figure so repeated requests do not accumulate memory

    return {
        'bounding_boxes': bounding_boxes,
        'empty_areas': empty_areas,
        'image_url': result_filepath
    }

@app.route('/', methods=['GET', 'POST'])
def index():
    """Handle the upload of the image and display analysis results."""
    if request.method == 'POST':
        if 'file' not in request.files:
            return redirect(request.url)
        file = request.files['file']
        if file.filename == '':
            return redirect(request.url)
        if file:
            filepath = os.path.join(app.config['UPLOAD_FOLDER'], file.filename)
            file.save(filepath)  # write the upload to disk so analyze_image can open it
            results = analyze_image(filepath)
            return render_template('result.html', results=results)

    return render_template('index.html')

if __name__ == '__main__':
    app.run(host='', port=5000, debug=True)

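The other files are index.html and result.html.

Together with the updated app.py, these files now handle object detection and gap detection. However, I am unable to correctly identify and mark all of the objects and then detect the gaps between them.

The image used for analysis is attached. For reference: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/shelf-analyze

What additional changes can I make to the analyze_image function so that it correctly identifies the gaps between objects?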

Shelf Image Used for Analysis:


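  1. Preprocess for shelf detection: The current preprocessing step runs edge detection, which may not be the most effective way to find gaps. Consider adding a step that detects the shelf lines specifically, for example with cv2.HoughLinesP(). This gives a clearer picture of the shelf structure and makes it easier to locate the products on each shelf.

  2. Segment objects by shelf: The shelf information is currently underused. Once the shelf lines are detected, use them to split the image into individual shelves, then process the detected objects separately for each shelf, so that products that are horizontally close but sit on different shelves are never compared when looking for gaps.

  3. Use density-based clustering to identify products: Azure Vision AI may return several bounding boxes that belong to the same product. To handle this, you can group the bounding boxes with a density-based clustering algorithm such as DBSCAN. Treating each product as one coherent group makes the gap detection more accurate.

  4. Tune the gap detection parameters: The value of gap_threshold should be adjusted to your specific needs and to the typical product size in the shelf images. Try different thresholds, or implement an adaptive threshold, to get the best results.

  5. Consider depth information (if available): If you have depth information for the shelf images (for example, from an RGB-D camera), you can use it to improve gap detection. Depth makes it easier to separate the products from the shelf background, which gives more reliable gap identification.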
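
Points 3 and 4 can be sketched in plain Python without extra dependencies. This is only a minimal sketch under stated assumptions: instead of DBSCAN it uses a simpler greedy merge based on intersection over union (IoU), and it derives the gap threshold from the median box width. The helper names `iou`, `merge_duplicate_boxes`, and `adaptive_gap_threshold`, the 0.5 IoU cutoff, and the 0.5 width fraction are all hypothetical choices for illustration and would need tuning on real shelf images.

```python
from statistics import median

# Boxes are (left, top, right, bottom) tuples, as in analyze_image.


def iou(a, b):
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def merge_duplicate_boxes(boxes, iou_threshold=0.5):
    """Greedily merge heavily overlapping boxes into their common envelope."""
    merged = []
    for box in sorted(boxes):
        for i, kept in enumerate(merged):
            if iou(box, kept) >= iou_threshold:
                # Assumed policy: replace the kept box by the union rectangle
                merged[i] = (min(box[0], kept[0]), min(box[1], kept[1]),
                             max(box[2], kept[2]), max(box[3], kept[3]))
                break
        else:
            merged.append(box)
    return merged


def adaptive_gap_threshold(boxes, fraction=0.5, fallback=50):
    """Gap threshold as a fraction of the median box width (fallback if no boxes)."""
    widths = [right - left for left, _, right, _ in boxes]
    return fraction * median(widths) if widths else fallback
```

One way to use these would be to call `merge_duplicate_boxes` on each shelf's boxes before sorting them, and to replace the fixed `gap_threshold = 50` with `adaptive_gap_threshold(objects)` computed per shelf.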

Here is an updated analyze_image function that shows how these improvements can be implemented:

def analyze_image(filepath):
    """Analyze the uploaded image with Azure Computer Vision and detect objects and gaps."""
    with open(filepath, "rb") as image_contents:
        results = computervision_client.analyze_image_in_stream(
            image_contents, visual_features=[VisualFeatureTypes.objects]
        )

    image = cv2.imread(filepath)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    height, width, _ = image.shape

    # 1. Detect shelf lines
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100, minLineLength=100, maxLineGap=10)
    shelves = []
    if lines is not None:
        for line in lines:
            x1, y1, x2, y2 = line[0]
            if abs(y1 - y2) < 0.1 * height:  # keep only (near-)horizontal lines
                shelves.append(int(y1))
    shelves.sort()
    # Band i runs from the previous boundary (or the image top) down to boundaries[i];
    # the image bottom closes the last band.
    boundaries = shelves + [height]

    # 2. Segment objects by shelf
    confidence_threshold = 0.5
    shelf_objects = {}
    for obj in results.objects:
        if obj.confidence > confidence_threshold:
            left = int(obj.rectangle.x)
            top = int(obj.rectangle.y)
            right = left + int(obj.rectangle.w)
            bottom = top + int(obj.rectangle.h)
            for i, shelf_y in enumerate(boundaries):
                if top < shelf_y:
                    shelf_objects.setdefault(i, []).append((left, top, right, bottom))
                    break  # stop at the first boundary below the box, so it lands on one shelf only

    # 3. Detect gaps on each shelf
    empty_areas = []
    gap_threshold = 50
    for shelf_key, objects in shelf_objects.items():
        objects.sort(key=lambda x: x[0])
        band_top = boundaries[shelf_key - 1] if shelf_key > 0 else 0
        band_bottom = boundaries[shelf_key]
        for i in range(len(objects) - 1):
            _, _, right1, _ = objects[i]
            left2, _, _, _ = objects[i + 1]
            gap_width = left2 - right1
            if gap_width > gap_threshold:
                empty_areas.append((right1, band_top, left2, band_bottom))

    # ... (the rest of the code is the same as before)
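
One caveat with step 1: cv2.HoughLinesP typically returns several nearly identical segments for each physical shelf edge, so the raw y values tend to contain clusters of near-duplicates. A minimal sketch of collapsing them into one position per shelf follows. The helper name `cluster_shelf_lines` and the `min_separation` tolerance (in pixels) are hypothetical and not part of OpenCV; the right tolerance depends on the image resolution.

```python
def cluster_shelf_lines(y_values, min_separation=20):
    """Collapse nearby y coordinates into one averaged position per shelf."""
    clusters = []
    for y in sorted(int(v) for v in y_values):
        # Extend the current cluster while y stays close to its last member,
        # otherwise start a new cluster for the next shelf line
        if clusters and y - clusters[-1][-1] <= min_separation:
            clusters[-1].append(y)
        else:
            clusters.append([y])
    # Represent each cluster by its integer mean
    return [sum(c) // len(c) for c in clusters]
```

In analyze_image this could be applied as `shelves = cluster_shelf_lines(shelves)` right after the Hough loop, before the boxes are assigned to shelves.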


