Cropping YOLOv3 predictions out of an image

Date: 2024-07-27 04:46:39
Tags: python opencv computer-vision yolo

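I trained YOLOv3 on the German Traffic Sign Detection dataset, and prediction gives me results. With the code below, however, all I can do is draw bounding boxes around the detections, which is not what I am after. I want to crop the detections out of the image, but I am not sure how to convert the YOLOv3 predictions into coordinates on the image so that I can cut them out. Can you help me with this?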

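import numpy as np
from numpy import expand_dims
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
# The Keras import paths below are an assumption; they differ between Keras/TensorFlow versions.
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
# yolo_loss (passed to load_model further down) has to come from your own training code.

class BoundBox: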
    def __init__(self, xmin, ymin, xmax, ymax, objness = None, classes = None):
        self.xmin = xmin
        self.ymin = ymin
        self.xmax = xmax
        self.ymax = ymax
        self.objness = objness
        self.classes = classes
        self.label = -1
        self.score = -1
 
    def get_label(self):
        if self.label == -1:
            self.label = np.argmax(self.classes)
 
        return self.label
 
    def get_score(self):
        if self.score == -1:
            self.score = self.classes[self.get_label()]
 
        return self.score
 
def _sigmoid(x):
    return 1. / (1. + np.exp(-x))
 
def decode_netout(netout, anchors, obj_thresh, net_h, net_w):
    grid_h, grid_w = netout.shape[:2]
    nb_box = 3
    netout = netout.reshape((grid_h, grid_w, nb_box, -1))
    nb_class = netout.shape[-1] - 5
    boxes = []
    netout[..., :2]  = _sigmoid(netout[..., :2])
    netout[..., 4:]  = _sigmoid(netout[..., 4:])
    netout[..., 5:]  = netout[..., 4][..., np.newaxis] * netout[..., 5:]
    netout[..., 5:] *= netout[..., 5:] > obj_thresh
 
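    for i in range(grid_h*grid_w):
        # integer division, so that row stays a whole grid index in the y computation below
        row = i // grid_w
        col = i % grid_w
        for b in range(nb_box):
            # index 4 holds the objectness score
            objectness = netout[int(row)][int(col)][b][4]
            # skip boxes whose objectness does not exceed the threshold
            if objectness <= obj_thresh: continue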
            # first 4 elements are x, y, w, and h
            x, y, w, h = netout[int(row)][int(col)][b][:4]
            x = (col + x) / grid_w # center position, unit: image width
            y = (row + y) / grid_h # center position, unit: image height
            w = anchors[2 * b + 0] * np.exp(w) / net_w # unit: image width
            h = anchors[2 * b + 1] * np.exp(h) / net_h # unit: image height
            # last elements are class probabilities
            classes = netout[int(row)][col][b][5:]
            box = BoundBox(x-w/2, y-h/2, x+w/2, y+h/2, objectness, classes)
            boxes.append(box)
    return boxes
 
def correct_yolo_boxes(boxes, image_h, image_w, net_h, net_w):
    new_w, new_h = net_w, net_h
    for i in range(len(boxes)):
        x_offset, x_scale = (net_w - new_w)/2./net_w, float(new_w)/net_w
        y_offset, y_scale = (net_h - new_h)/2./net_h, float(new_h)/net_h
        boxes[i].xmin = int((boxes[i].xmin - x_offset) / x_scale * image_w)
        boxes[i].xmax = int((boxes[i].xmax - x_offset) / x_scale * image_w)
        boxes[i].ymin = int((boxes[i].ymin - y_offset) / y_scale * image_h)
        boxes[i].ymax = int((boxes[i].ymax - y_offset) / y_scale * image_h)
 
def _interval_overlap(interval_a, interval_b):
    x1, x2 = interval_a
    x3, x4 = interval_b
    if x3 < x1:
        if x4 < x1:
            return 0
        else:
            return min(x2,x4) - x1
    else:
        if x2 < x3:
             return 0
        else:
            return min(x2,x4) - x3
 
def bbox_iou(box1, box2):
    intersect_w = _interval_overlap([box1.xmin, box1.xmax], [box2.xmin, box2.xmax])
    intersect_h = _interval_overlap([box1.ymin, box1.ymax], [box2.ymin, box2.ymax])
    intersect = intersect_w * intersect_h
    w1, h1 = box1.xmax-box1.xmin, box1.ymax-box1.ymin
    w2, h2 = box2.xmax-box2.xmin, box2.ymax-box2.ymin
    union = w1*h1 + w2*h2 - intersect
    return float(intersect) / union
 
def do_nms(boxes, nms_thresh):
    if len(boxes) > 0:
        nb_class = len(boxes[0].classes)
    else:
        return
    for c in range(nb_class):
        sorted_indices = np.argsort([-box.classes[c] for box in boxes])
        for i in range(len(sorted_indices)):
            index_i = sorted_indices[i]
            if boxes[index_i].classes[c] == 0: continue
            for j in range(i+1, len(sorted_indices)):
                index_j = sorted_indices[j]
                if bbox_iou(boxes[index_i], boxes[index_j]) >= nms_thresh:
                    boxes[index_j].classes[c] = 0
 
# load and prepare an image
def load_image_pixels(filename, shape):
    # load the image to get its shape
    image = load_img(filename)
    width, height = image.size
    # load the image with the required size
    image = load_img(filename, target_size=shape)
    # convert to numpy array
    image = img_to_array(image)
    # scale pixel values to [0, 1]
    image = image.astype('float32')
    image /= 255.0
    # add a dimension so that we have one sample
    image = expand_dims(image, 0)
    return image, width, height
 
# get all of the results above a threshold
def get_boxes(boxes, labels, thresh):
    v_boxes, v_labels, v_scores = list(), list(), list()
    # enumerate all boxes
    for box in boxes:
        # enumerate all possible labels
        for i in range(len(labels)):
            # check if the threshold for this label is high enough
            if box.classes[i] > thresh:
                v_boxes.append(box)
                v_labels.append(labels[i])
                v_scores.append(box.classes[i]*100)
                # don't break, many labels may trigger for one box
    return v_boxes, v_labels, v_scores
 
# draw all results
def draw_boxes(filename, v_boxes, v_labels, v_scores):
    # load the image
    data = plt.imread(filename)
    # plot the image
    plt.imshow(data)
    # get the context for drawing boxes
    ax = plt.gca()
    # plot each box
    for i in range(len(v_boxes)):
        box = v_boxes[i]
        # get coordinates
        y1, x1, y2, x2 = box.ymin, box.xmin, box.ymax, box.xmax
        # calculate width and height of the box
        width, height = x2 - x1, y2 - y1
        # create the shape
        rect = Rectangle((x1, y1), width, height, fill=False, color='white')
        # draw the box
        ax.add_patch(rect)
        # draw text and score in top left corner
        label = "%s (%.3f)" % (v_labels[i], v_scores[i])
        plt.text(x1, y1, label, color='white')
    # show the plot
    plt.show()
 
# load yolov3 model
model = load_model('/content/drive/MyDrive/model.h5', custom_objects={'yolo_loss': yolo_loss})
# define the expected input shape for the model
input_w, input_h = 416, 416
# define our new photo
photo_filename = '/content/images/00008.jpg'
# load and prepare image
image, image_w, image_h = load_image_pixels(photo_filename, (input_w, input_h))
# make prediction
yhat = model.predict(image)
# summarize the shape of the list of arrays
print([a.shape for a in yhat])
# define the anchors
anchors = [[116,90, 156,198, 373,326], [30,61, 62,45, 59,119], [10,13, 16,30, 33,23]]
# define the probability threshold for detected objects
class_threshold = 0.6
boxes = list()
for i in range(len(yhat)):
    # decode the output of the network
    boxes += decode_netout(yhat[i][0], anchors[i], class_threshold, input_h, input_w)
# correct the sizes of the bounding boxes for the shape of the image
correct_yolo_boxes(boxes, image_h, image_w, input_h, input_w)
# suppress non-maximal boxes
do_nms(boxes, 0.5)
# define the labels
labels = ["1", "2", "3", "4"]
# get the details of the detected objects
v_boxes, v_labels, v_scores = get_boxes(boxes, labels, class_threshold)
# summarize what we found
for i in range(len(v_boxes)):
    print(v_labels[i], v_scores[i])
# draw what we found
draw_boxes(photo_filename, v_boxes, v_labels, v_scores)

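Your code already produces the bounding-box coordinates: correct_yolo_boxes scales them to pixel positions in the original image. All you need to do is use those coordinates to crop the detected objects out of the image. Here is an updated draw_boxes function that does this: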

import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image

def draw_boxes(filename, v_boxes, v_labels, v_scores):
    # load the image for plotting
    data = plt.imread(filename)
    # open the same image once with PIL for cropping
    image = Image.open(filename)
    # plot the image
    plt.imshow(data)
    # get the context for drawing boxes
    ax = plt.gca()
    # list that collects the cropped objects
    cropped_objects = []
    # loop over every box
    for i in range(len(v_boxes)):
        box = v_boxes[i]
        # get coordinates
        y1, x1, y2, x2 = box.ymin, box.xmin, box.ymax, box.xmax
        # calculate width and height of the box
        width, height = x2 - x1, y2 - y1
        # create the shape
        rect = Rectangle((x1, y1), width, height, fill=False, color='white')
        # draw the box
        ax.add_patch(rect)
        # draw text and score in top left corner
        label = "%s (%.3f)" % (v_labels[i], v_scores[i])
        plt.text(x1, y1, label, color='white')

        # crop the detected object with PIL; the box is (left, upper, right, lower)
        cropped_object = image.crop((x1, y1, x2, y2))
        cropped_objects.append(cropped_object)

    # show the plot
    plt.show()
    # return the cropped objects
    return cropped_objects

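# ... (the rest of your code)

# get the cropped detections
cropped_objects = draw_boxes(photo_filename, v_boxes, v_labels, v_scores)

# cropped_objects now holds one PIL.Image per detected object; save or process them as needed.
# For example, to save the first detection (if there is one):
if cropped_objects:
    cropped_objects[0].save("cropped_object.jpg")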

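This code makes the following changes:

  1. Import PIL's Image: from PIL import Image is added so that PIL can handle the cropping.
  2. Crop the objects inside draw_boxes:
    • A list named cropped_objects is created to hold the crops.
    • For each bounding box in the loop, PIL's crop((x1, y1, x2, y2)) cuts the object out of the original image, and the result is appended to cropped_objects.
  3. Return cropped_objects: draw_boxes now returns the cropped_objects list.
  4. Use the cropped objects: cropped_objects contains one PIL.Image per detected object, so you can loop over the list and save each object or process it further.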
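
Step 4 can be sketched as a small helper that writes every crop to disk. This is only an illustration: save_crops, the crops output directory, and the index_label.png naming scheme are assumptions, not part of the original code. PNG is used so that the save works regardless of the image mode.

```python
import os
from PIL import Image

def save_crops(cropped_objects, v_labels, out_dir="crops"):
    """Save each PIL crop as <index>_<label>.png and return the written paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, (crop, label) in enumerate(zip(cropped_objects, v_labels)):
        # zero-padded index keeps the files in detection order
        path = os.path.join(out_dir, f"{i:03d}_{label}.png")
        crop.save(path)
        paths.append(path)
    return paths
```

With the variables from the script above, the call would be save_crops(cropped_objects, v_labels).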

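The function still draws the bounding boxes on the plot, and it additionally returns every detected object as a PIL.Image in a list that you can save or use for further processing.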
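
If you would rather stay with NumPy arrays (for example when the image was loaded with plt.imread or OpenCV), the same crop can be done with array slicing. The sketch below is a hypothetical helper, crop_box, which is not in the original code. It assumes the image is an array of shape (height, width, channels) and that the box coordinates are already in pixels, as they are after correct_yolo_boxes. It also clamps the box to the image, since a predicted box can extend past the border.

```python
import numpy as np

def crop_box(image, xmin, ymin, xmax, ymax):
    """Return the region of `image` inside the box, clamped to the image bounds.

    Returns None when the clamped box is empty.
    """
    h, w = image.shape[:2]
    # clamp the corners so the slice never leaves the image
    x1, y1 = max(0, int(xmin)), max(0, int(ymin))
    x2, y2 = min(w, int(xmax)), min(h, int(ymax))
    if x2 <= x1 or y2 <= y1:
        return None
    # rows are indexed by y, columns by x
    return image[y1:y2, x1:x2].copy()
```

With the variables from the script above, this could be used as crops = [crop_box(plt.imread(photo_filename), b.xmin, b.ymin, b.xmax, b.ymax) for b in v_boxes]. Note the index order: the row (y) range comes first, then the column (x) range.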

From: 67059442
