
OpenCV OCR Detection: EAST

Date: 2024-02-05 10:56:34
Tags: OCR dimensions geometry Invalid assert opencv shape scores EAST

Load the EAST model and run text detection.

Model download: https://codeload.github.com/oyyd/frozen_east_text_detection.pb/zip/refs/heads/master
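
The script below feeds the network a fixed 320 x 320 input. EAST is usually run with an input width and height that are multiples of 32, so for other image sizes a helper like the following could pick a valid size. This is only a sketch: the function name `east_input_size` is hypothetical and not part of the original script.

```python
def east_input_size(width, height):
    """Round (width, height) down to multiples of 32, with a floor of 32.

    Hypothetical helper for illustration; assumes EAST needs input
    dimensions that are multiples of 32.
    """
    new_w = max(32, (width // 32) * 32)
    new_h = max(32, (height // 32) * 32)
    return new_w, new_h


print(east_input_size(640, 480))   # already multiples of 32
print(east_input_size(1000, 300))  # rounded down to the nearest multiples
```

The returned tuple could be used in place of `inputsize`; the ratios `rW` and `rH` would then be computed from it in the same way.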

#coding:utf-8
import cv2
import math
############ Utility functions ############
def decode(scores, geometry, scoreThresh):
    """Convert EAST score and geometry maps into rotated boxes and their confidences."""
    detections = []
    confidences = []

    ############ CHECK DIMENSIONS AND SHAPES OF geometry AND scores ############
    assert len(scores.shape) == 4, "Incorrect dimensions of scores"
    assert len(geometry.shape) == 4, "Incorrect dimensions of geometry"
    assert scores.shape[0] == 1, "Invalid dimensions of scores"
    assert geometry.shape[0] == 1, "Invalid dimensions of geometry"
    assert scores.shape[1] == 1, "Invalid dimensions of scores"
    assert geometry.shape[1] == 5, "Invalid dimensions of geometry"
    assert scores.shape[2] == geometry.shape[2], "Invalid dimensions of scores and geometry"
    assert scores.shape[3] == geometry.shape[3], "Invalid dimensions of scores and geometry"
    height = scores.shape[2]
    width = scores.shape[3]
    for y in range(0, height):

        # Extract row y of the score map and of the five geometry channels
        scoresData = scores[0][0][y]
        x0_data = geometry[0][0][y]
        x1_data = geometry[0][1][y]
        x2_data = geometry[0][2][y]
        x3_data = geometry[0][3][y]
        anglesData = geometry[0][4][y]
        for x in range(0, width):
            score = scoresData[x]

            # Skip cells whose score is below the threshold
            if score < scoreThresh:
                continue

            # Map the feature-map cell back to input coordinates
            # (the output maps are 4x smaller than the network input)
            offsetX = x * 4.0
            offsetY = y * 4.0
            angle = anglesData[x]

            # Calculate cos and sin of angle
            cosA = math.cos(angle)
            sinA = math.sin(angle)
            h = x0_data[x] + x2_data[x]
            w = x1_data[x] + x3_data[x]

            # Reference corner of the rotated box, in network-input coordinates
            offset = (offsetX + cosA * x1_data[x] + sinA * x2_data[x], offsetY - sinA * x1_data[x] + cosA * x2_data[x])

            # Two opposite corners of the box; their midpoint is the box center
            p1 = (-sinA * h + offset[0], -cosA * h + offset[1])
            p3 = (-cosA * w + offset[0],  sinA * w + offset[1])
            center = (0.5*(p1[0]+p3[0]), 0.5*(p1[1]+p3[1]))
            # Rotated rect format: (center, (width, height), angle in degrees)
            detections.append((center, (w,h), -1*angle * 180.0 / math.pi))
            confidences.append(float(score))

    # Return detections and confidences
    return [detections, confidences]

modelpath = "d:/downloads/frozen_east_text_detection.pb"

net = cv2.dnn.readNetFromTensorflow(modelpath)
names = net.getLayerNames()
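# Output layers: the text score map and the box geometry map
outNames = ['feature_fusion/Conv_7/Sigmoid', 'feature_fusion/concat_3']
# Network input size as (width, height)
inputsize = (320,320)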

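# The network needs a 3-channel (BGR) image
img = cv2.imread('d:/ocr.png', cv2.IMREAD_COLOR)
if img is None:
    raise FileNotFoundError("Could not read image: d:/ocr.png")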
height = img.shape[0]
width = img.shape[1]
rW = width / float(inputsize[0])
rH = height /float(inputsize[1])
confThreshold = 0.5
nmsThreshold = 0.4 
scalefactor = 1.0
meanval = (123.68, 116.78, 103.94)
# Preprocess: resize to inputsize, subtract the mean, swap BGR to RGB, do not crop
blob = cv2.dnn.blobFromImage(img, scalefactor, inputsize, meanval, swapRB=True, crop=False)
net.setInput(blob)
out = net.forward(outNames)
t,_ = net.getPerfProfile()
label  = "inference time: %.2f ms"%(t*1000.0/cv2.getTickFrequency())
print(label)
print(out[0].shape, out[1].shape)
scores = out[0]
geometry = out[1]

[boxes, confidences] = decode(scores, geometry, confThreshold)

frame = img
# Apply non-maximum suppression to the rotated boxes
indices = cv2.dnn.NMSBoxesRotated(boxes, confidences, confThreshold, nmsThreshold)
print(indices)
ri = lambda v: int(round(v))
for i in indices:
    # Older OpenCV versions return an N x 1 array here, newer ones a flat array
    i = int(i[0]) if hasattr(i, "__len__") else int(i)
    # Get the 4 corners of the rotated rect
    vertices = cv2.boxPoints(boxes[i])
    print("vertices:", vertices)
    # Scale the corners from network-input size back to the original image size
    for j in range(4):
        vertices[j][0] *= rW
        vertices[j][1] *= rH
    # Draw the four edges of the box
    for j in range(4):
        p1 = (ri(vertices[j][0]), ri(vertices[j][1]))
        p2 = (ri(vertices[(j + 1) % 4][0]), ri(vertices[(j + 1) % 4][1]))
        cv2.line(frame, p1, p2, (0, 255, 0), 2, cv2.LINE_AA)
    # cv2.putText(frame, "{:.3f}".format(confidences[i]), (ri(vertices[0][0]), ri(vertices[0][1])), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 0, 0), 1, cv2.LINE_AA)

# Put efficiency information
cv2.putText(frame, label, (0, 15), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255))

# Display the frame for 3 seconds
cv2.imshow("result", frame)
cv2.waitKey(3000)
cv2.destroyAllWindows()
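
To make the arithmetic inside `decode` concrete, here is a small worked example for a single feature-map cell, written without OpenCV. The function `decode_cell` is a hypothetical helper that mirrors the per-cell formulas of the script. Reading the first four geometry channels as distances to the top, right, bottom and left edges is an assumption based on the usual EAST convention.

```python
import math


def decode_cell(x, y, d0, d1, d2, d3, angle):
    """Mirror the per-cell arithmetic of decode() for one feature-map cell.

    d0..d3 are the first four geometry channels (assumed here to be the
    distances to the top, right, bottom and left edges); angle is in radians.
    """
    # Each feature-map cell corresponds to a step of 4 pixels in the input
    offset_x, offset_y = x * 4.0, y * 4.0
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    h = d0 + d2
    w = d1 + d3
    # Reference corner of the rotated box
    ox = offset_x + cos_a * d1 + sin_a * d2
    oy = offset_y - sin_a * d1 + cos_a * d2
    # Two opposite corners; their midpoint is the box center
    p1 = (-sin_a * h + ox, -cos_a * h + oy)
    p3 = (-cos_a * w + ox, sin_a * w + oy)
    center = (0.5 * (p1[0] + p3[0]), 0.5 * (p1[1] + p3[1]))
    return center, (w, h), -1 * angle * 180.0 / math.pi


# Cell (x=10, y=5) with an unrotated box: 8 px up, 20 px right, 8 px down, 12 px left
center, size, angle_deg = decode_cell(10, 5, 8.0, 20.0, 8.0, 12.0, 0.0)
print(center)  # (44.0, 20.0)
print(size)    # (32.0, 16.0)
```

The cell maps to pixel (40, 20) in the network input. The box spans x from 28 to 60 and y from 12 to 28, so its center is (44, 20) and its size is 32 x 16. These coordinates are still in network-input space; the script multiplies them by `rW` and `rH` to get back to the original image.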

 

From: https://www.cnblogs.com/hakula/p/18007548
