Building a simple Chinese OCR service with tesseract-wasm + fastify

Date: 2023-11-03 20:47:35

Tags: ocr, const, app, wasm, fastify, import, tesseract

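I previously gave a brief introduction to tesseract-wasm. With this wasm package we can call tesseract directly from Node.js to perform OCR. Below is a simple demo:
a small API built on fastify, plus a simple web page for testing it.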

Project structure

package.json

{
  "name": "tesseract",
  "version": "1.0.0",
  "main": "index.js",
  "license": "MIT",
  "dependencies": {
    "@fastify/static": "^6.12.0",
    "fastify": "^4.24.3",
    "fastify-file-upload": "^4.0.0",
    "sharp": "^0.32.6",
    "tesseract-wasm": "^0.10.0"
  },
  "scripts": {
    "dev": "node demo.mjs"
  }
}


demo.mjs

import { readFileSync } from "node:fs";
import { fileURLToPath } from "node:url";
import path from "node:path";
import { fastify } from "fastify";
import { createOCREngine } from "tesseract-wasm";
import { loadWasmBinary } from "tesseract-wasm/node";
import sharp from "sharp";
import fileUpload from "fastify-file-upload";
import { fastifyStatic } from "@fastify/static";

// __filename and __dirname are not defined in ES modules, so derive them.
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);

/** Decode an image (file path or Buffer) into raw RGBA pixels for the OCR engine. */
async function loadImage(input) {
  const image = sharp(input).ensureAlpha();
  const { width, height } = await image.metadata();
  return {
    data: await image.raw().toBuffer(),
    width,
    height,
  };
}

/** Resolve a URL relative to the current module. */
function resolve(relativePath) {
  return fileURLToPath(new URL(relativePath, import.meta.url).href);
}

const wasmBinary = await loadWasmBinary();
// Create the OCR engine from the wasm binary
const engine = await createOCREngine({ wasmBinary });
// Load the Simplified Chinese model
const model = readFileSync("chi_sim.traineddata");
engine.loadModel(model);

const app = fastify({ logger: true });
// fastify file upload plugin
app.register(fileUpload);
// Static file plugin; serves the simple test page
app.register(fastifyStatic, {
  root: path.join(__dirname, "public"),
  prefix: "/", // optional: default '/'
});

// OCR endpoint
app.post("/ocr", async function (req, reply) {
  console.log("starting", new Date().toLocaleString());
  // The uploaded file arrives in the multipart field named "file"
  const file = req.body.file;
  const image = await loadImage(file.data);
  engine.loadImage(image);
  const text = engine.getText((progress) => {
    console.log(`Recognizing text (${progress}% done)...`);
  });
  console.log("ending", new Date().toLocaleString());
  return reply.send({
    code: 200,
    text: text,
  });
});

app.listen({
  port: 3000,
  host: "0.0.0.0",
}, (err, address) => {
  if (err) {
    app.log.error(err);
    process.exit(1);
  }
  app.log.info(`server listening on ${address}`);
});
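
The /ocr handler in demo.mjs reads req.body.file.data without checking that a file was actually uploaded. A minimal guard might look like the sketch below. getUploadedImage is a hypothetical helper that is not part of the original demo, and it assumes the upload plugin exposes a single file as an object whose data property is a Buffer (which is how demo.mjs already uses it).

```javascript
// Hypothetical helper (not in the original demo): validate the multipart body
// before handing the bytes to the OCR engine.
// Assumption: a single upload appears as body.file with a Buffer in file.data.
function getUploadedImage(body) {
  const file = body && body.file;
  if (!file || Array.isArray(file)) {
    return { error: 'expected exactly one file in the "file" field' };
  }
  if (!Buffer.isBuffer(file.data) || file.data.length === 0) {
    return { error: "uploaded file is empty" };
  }
  return { data: file.data };
}

// Possible use at the top of the /ocr handler:
//   const upload = getUploadedImage(req.body);
//   if (upload.error) {
//     return reply.code(400).send({ code: 400, message: upload.error });
//   }
//   const image = await loadImage(upload.data);
```

Returning an object with either error or data keeps the helper free of any fastify dependency, so it can be checked on its own.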

Static page
index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>OCR Demo</title>
    <style>
        body {
            display: flex;
            flex-direction: column;
            align-items: center;
            justify-content: center;
            height: 100vh;
            margin: 0;
            padding: 0;
            background-color: #f0f0f0;
        }
        #file-upload {
            margin-top: 20px;
        }
        #display-area {
            display: flex;
            justify-content: space-around;
            width: 100%;
        }
        #image-display img {
            width: 100%;
            height: auto;
        }
        #image-display,
        #text-display {
            width: 500px;
            height: 500px;
            overflow: auto;
        }
    </style>
</head>
<body>
    <input type="file" id="file-upload" accept="image/*">
    <div id="display-area">
        <div id="image-display"></div>
        <div id="text-display"></div>
    </div>
    <script type="module" src="my.js"></script>
</body>
</html>

my.js
Calls the API and renders the result on the page.
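
The post does not include the contents of my.js. The following is a hypothetical sketch of what such a script could look like, not the author's actual file. It relies only on the element ids from index.html (file-upload, image-display, text-display), the multipart field name file read by demo.mjs, and the { code, text } response shape; the function name recognize is an assumption.

```javascript
// Hypothetical sketch of public/my.js (the original file is not shown in the post).

// POST the chosen image to /ocr as multipart form data and return the text.
// The field name "file" matches req.body.file in demo.mjs.
async function recognize(file, fetchImpl = fetch) {
  const form = new FormData();
  form.append("file", file);
  const response = await fetchImpl("/ocr", { method: "POST", body: form });
  if (!response.ok) {
    throw new Error(`OCR request failed with HTTP ${response.status}`);
  }
  const result = await response.json(); // { code, text } as sent by demo.mjs
  return result.text;
}

// Wire up the page only when running in a browser.
if (typeof document !== "undefined") {
  const input = document.getElementById("file-upload");
  const imageDisplay = document.getElementById("image-display");
  const textDisplay = document.getElementById("text-display");
  textDisplay.style.whiteSpace = "pre-wrap"; // keep line breaks in the OCR text

  input.addEventListener("change", async () => {
    const file = input.files[0];
    if (!file) return;

    // Preview the selected image next to the recognized text.
    const img = document.createElement("img");
    img.src = URL.createObjectURL(file);
    imageDisplay.replaceChildren(img);

    textDisplay.textContent = "Recognizing...";
    try {
      textDisplay.textContent = await recognize(file);
    } catch (err) {
      textDisplay.textContent = String(err);
    }
  });
}
```

Passing fetchImpl as a parameter is only there so the request logic can be exercised without a running server.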
Docker integration

FROM node:18.18.2-bullseye-slim
WORKDIR /app
COPY package.json /app/package.json
COPY yarn.lock /app/yarn.lock
COPY demo.mjs /app/demo.mjs
COPY public/ /app/public
COPY chi_sim.traineddata /app/chi_sim.traineddata
RUN yarn
EXPOSE 3000
ENTRYPOINT [ "node","demo.mjs" ]

Running and result
Run

yarn dev, or docker-compose up -d

Result

Notes

I have pushed this simple demo to Docker Hub, so you can use the dalongrong/tesseract-wasm:ocr-web image directly. To start it:

docker run -d -p 3000:3000 dalongrong/tesseract-wasm:ocr-web

The above is only a simple example that you can adapt as needed. At present, OCR recognition is not very fast.
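
To put a number on how slow recognition is, one option is to time the synchronous recognition call instead of logging start and end timestamps. The sketch below is only an illustration; timed is a hypothetical helper and is not part of tesseract-wasm or the original demo.

```javascript
// Hypothetical helper: run a synchronous function and log how long it took.
// The log parameter defaults to console.log and exists to make testing easy.
function timed(label, fn, log = console.log) {
  const start = performance.now();
  try {
    return fn();
  } finally {
    const elapsedMs = performance.now() - start;
    log(`${label} took ${elapsedMs.toFixed(1)} ms`);
  }
}

// Possible use inside the /ocr handler from demo.mjs:
//   engine.loadImage(image);
//   const text = timed("getText", () => engine.getText());
```

Because the elapsed time is logged in a finally block, a failed recognition is timed as well.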

References

https://fastify.dev/
https://github.com/huangang/fastify-file-upload
https://github.com/tesseract-ocr/tesseract
https://github.com/robertknight/tesseract-wasm
https://github.com/robertknight/tesseract-wasm/tree/main/examples
https://github.com/libvips/libvips
https://github.com/lovell/sharp
https://github.com/rongfengliang/tesseract-wasm-learning
https://flaviocopes.com/fix-dirname-not-defined-es-module-scope/

Source: https://www.cnblogs.com/rongfengliang/p/17808410.html
