OpenCV与AI深度学习 | Recognizing License Plates in Video with YOLO and EasyOCR

Date: 2024-12-15 21:59:21

Tags: plate AI frame YOLO cv2 EasyOCR number

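This article comes from the WeChat public account "OpenCV与AI深度学习" and is shared for academic purposes only; it will be removed on request if it infringes any rights.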

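In this article, we look at how to use YOLO (You Only Look Once) and EasyOCR (an optical character recognition library) in Python to detect license plates in a video file and read the text on them. The approach relies on deep learning models for both detection and recognition, with the aim of running in real time.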

Prerequisites

Before starting, make sure the following Python packages are installed:

pip install opencv-python ultralytics easyocr Pillow numpy
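
Note that two of these pip package names differ from the module names used in the import statements: opencv-python is imported as cv2 and Pillow as PIL. A minimal sketch for checking that everything is importable before running the pipeline, using only the standard library (the helper name `missing_modules` and the `REQUIRED` table are hypothetical, not part of the original article):

```python
import importlib.util

# pip package name -> module name used in `import` statements
REQUIRED = {
    "opencv-python": "cv2",
    "ultralytics": "ultralytics",
    "easyocr": "easyocr",
    "Pillow": "PIL",
    "numpy": "numpy",
}


def missing_modules(module_names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED.values())
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

If anything is reported as missing, rerunning the pip command above in the same environment is the first thing to try.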

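Implementation Steps

Step 1: Initialize the libraries

We start by importing the necessary libraries. We use OpenCV for video processing, YOLO for object detection, and EasyOCR to read the text on the detected license plates.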

import cv2
from ultralytics import YOLO
import easyocr
from PIL import Image
import numpy as np

# Initialize EasyOCR reader
reader = easyocr.Reader(['en'], gpu=False)

# Load your YOLO model (replace with your model's path)
model = YOLO('best_float32.tflite', task='detect')

# Open the video file (replace with your video file path)
video_path = 'sample4.mp4'
cap = cv2.VideoCapture(video_path)

# Create a VideoWriter object (optional, if you want to save the output)
output_path = 'output_video.mp4'
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, 30.0, (640, 480))  # Adjust frame size if necessary
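
The hard-coded (640, 480) above works because step 2 resizes every frame to exactly that size; cv2.VideoWriter generally expects each written frame to match the size given at construction. A fixed 4:3 size also stretches sources with a different aspect ratio, which may distort plate characters. As a rough sketch, an aspect-preserving target size could be computed like this (the helper `fit_size` and its `max_w` default are assumptions for illustration, not part of the original code):

```python
def fit_size(src_w, src_h, max_w=640):
    """Scale (src_w, src_h) down to at most max_w wide, keeping aspect ratio.

    Both dimensions are rounded down to even numbers, since some video
    codecs do not accept odd frame sizes. Sources narrower than max_w
    are not enlarged.
    """
    if src_w <= 0 or src_h <= 0:
        raise ValueError("source dimensions must be positive")
    new_w = min(src_w, max_w)
    new_h = src_h * new_w // src_w  # integer math avoids float rounding
    new_w = max(2, new_w - new_w % 2)
    new_h = max(2, new_h - new_h % 2)
    return new_w, new_h


print(fit_size(1920, 1080))  # (640, 360)
print(fit_size(640, 480))    # (640, 480)
```

The source dimensions can be read with cap.get(cv2.CAP_PROP_FRAME_WIDTH) and cap.get(cv2.CAP_PROP_FRAME_HEIGHT); the resulting tuple would then be passed to both cv2.VideoWriter and cv2.resize so that the two always agree.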

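Step 2: Process the video frames

We read the video file frame by frame, run detection on each processed frame to locate license plates, and then apply OCR to read the text on each plate. To improve performance, we process only one frame out of every three and skip the rest.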

# Frame skipping factor (adjust as needed for performance)
frame_skip = 3  # Process one frame out of every 3; the other two are skipped
frame_count = 0

while cap.isOpened():
    ret, frame = cap.read()  # Read a frame from the video
    if not ret:
        break  # Exit loop if there are no frames left

    # Skip frames
    if frame_count % frame_skip != 0:
        frame_count += 1
        continue  # Skip processing this frame

    # Resize the frame (optional, adjust size as needed)
    frame = cv2.resize(frame, (640, 480))  # Resize to 640x480

    # Make predictions on the current frame
    results = model.predict(source=frame)

    # Iterate over results and draw predictions
    for result in results:
        boxes = result.boxes  # Get the boxes predicted by the model
        for box in boxes:
            class_id = int(box.cls)  # Get the class ID
            confidence = box.conf.item()  # Get confidence score
            coordinates = box.xyxy[0]  # Get box coordinates as a tensor

            # Extract and convert box coordinates to integers
            x1, y1, x2, y2 = map(int, coordinates.tolist())  # Convert tensor to list and then to int

            # Draw the box on the frame
            cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)  # Draw rectangle

            # Try to apply OCR on detected region
            try:
                # Ensure coordinates are within frame bounds
                r0 = max(0, x1)
                r1 = max(0, y1)
                r2 = min(frame.shape[1], x2)
                r3 = min(frame.shape[0], y2)

                # Crop license plate region
                plate_region = frame[r1:r3, r0:r2]

                # Convert to format compatible with EasyOCR
                plate_image = Image.fromarray(cv2.cvtColor(plate_region, cv2.COLOR_BGR2RGB))
                plate_array = np.array(plate_image)

                # Use EasyOCR to read text from plate
                plate_number = reader.readtext(plate_array)

                # Only annotate when OCR returned text; np.mean of an
                # empty list would otherwise yield NaN with a warning
                if plate_number:
                    concat_number = ' '.join([number[1] for number in plate_number])
                    number_conf = np.mean([number[2] for number in plate_number])

                    # Draw the detected text just above the box,
                    # keeping the label inside the frame
                    cv2.putText(
                        img=frame,
                        text=f"Plate: {concat_number} ({number_conf:.2f})",
                        org=(r0, max(r1 - 10, 20)),
                        fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                        fontScale=0.7,
                        color=(0, 0, 255),
                        thickness=2
                    )

            except Exception as e:
                # Covers, for example, an empty crop passed to cvtColor
                print(f"OCR Error: {e}")

    # Show the frame with detections
    cv2.imshow('Detections', frame)

    # Write the frame to the output video (optional)
    out.write(frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break  # Exit loop if 'q' is pressed

    frame_count += 1  # Increment frame count

# Release resources
cap.release()
out.release()  # Release the VideoWriter object if used
cv2.destroyAllWindows()
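
In the loop above, only frames that pass the modulo check are processed and written, so with frame_skip = 3 the output file contains one third of the source frames. Written at a fixed 30 fps, it would play roughly three times faster than the source. The following sketch makes that arithmetic concrete; the helper names and the 30.0 fallback are assumptions for illustration:

```python
def processed_indices(total_frames, frame_skip):
    """Indices of frames that pass the `frame_count % frame_skip == 0` check."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be at least 1")
    return [i for i in range(total_frames) if i % frame_skip == 0]


def playback_fps(source_fps, frame_skip, fallback=30.0):
    """FPS to give the writer so the output plays at the source's speed."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be at least 1")
    if source_fps <= 0:  # the capture may report 0 when the rate is unknown
        source_fps = fallback
    return source_fps / frame_skip


print(processed_indices(10, 3))  # [0, 3, 6, 9]
print(playback_fps(30.0, 3))     # 10.0
```

The source frame rate can be read with cap.get(cv2.CAP_PROP_FPS). An alternative design is to write the skipped frames unannotated as well, in which case the writer can keep the source frame rate.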

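Code explanation:

Initializing EasyOCR: The EasyOCR reader is initialized for English text recognition.

Loading the YOLO model: The YOLO model is loaded from the specified path. Make sure to replace this path with the path to your own model.

Reading video frames: The video file is opened with OpenCV, and a VideoWriter is initialized in case the output should be saved.

Frame processing: Each processed frame (one in three) is read and resized, and the model predicts the license plate locations.

Drawing predictions: The detected bounding boxes are drawn on the frame. The region containing the license plate is cropped for OCR.

Applying OCR: EasyOCR reads the text from the cropped plate image. The recognized text and its confidence score are drawn on the frame.

Output video: The processed frames are shown in a window and can optionally be saved to an output video file.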
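
The OCR step joins all text fragments with spaces and averages their confidences. One possible refinement, sketched below, drops low-confidence fragments, orders the rest left to right, and keeps only letters and digits. It assumes results in the (bbox, text, confidence) form that reader.readtext returns by default, where bbox is a list of four [x, y] corner points; the helper name `summarize_plate` and the 0.3 threshold are hypothetical choices, not part of the original code:

```python
def summarize_plate(ocr_results, min_conf=0.3):
    """Combine EasyOCR-style results into (plate_text, mean_confidence).

    Each item is expected to look like (bbox, text, confidence).
    Returns None when no fragment reaches the confidence threshold.
    """
    kept = [r for r in ocr_results if r[2] >= min_conf]
    if not kept:
        return None
    # Order fragments left to right by the smallest x of their bounding box
    kept.sort(key=lambda r: min(pt[0] for pt in r[0]))
    # Keep only letters and digits, upper-cased
    text = "".join(ch for r in kept for ch in r[1] if ch.isalnum()).upper()
    mean_conf = sum(r[2] for r in kept) / len(kept)
    return text, mean_conf


# Synthetic results for illustration (not real OCR output)
sample = [
    ([[60, 5], [120, 5], [120, 30], [60, 30]], "1234", 0.75),
    ([[5, 5], [55, 5], [55, 30], [5, 30]], "ab-c", 0.5),
    ([[0, 32], [40, 32], [40, 40], [0, 40]], "x", 0.125),
]
print(summarize_plate(sample))  # ('ABC1234', 0.625)
```

Returning None for an empty or low-confidence result lets the caller skip the text overlay instead of drawing an empty label. Plate formats differ by region, so the character filter would need adjusting for plates that contain non-alphanumeric symbols.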

From: https://blog.csdn.net/csdn_xmj/article/details/144145668
