HarmonyOS Hands-On Learning Path: A Guide to Implementing General Text Recognition with Core Vision Kit

刺猬二大爷 · Published 2025-12-25 02:30:00


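Lately a lot of friends have been asking me: "Broccoli, I want to build a HarmonyOS app that needs text recognition. Is there a simple way to do it?" Well, you've asked the right person! As a front-end cook who is busy turning npm install into ohpm install ^_^, I happen to have built exactly this with HarmonyOS's Core Vision Kit~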

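Core Vision Kit (the basic vision service) is the "vision toolbox" that HarmonyOS provides. It covers general text recognition (OCR), face detection, face comparison, and subject segmentation, which is like giving your app a pair of eyes. Developers can also combine it with the UI controls in Vision Kit (such as face liveness detection) to make apps smarter and easier to use~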

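In this post I'll walk you through implementing general text recognition (OCR) step by step. The whole thing takes under 10 minutes (not counting dependency download time)~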

Feature Overview

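General text recognition (OCR) "reads" the text in an image and converts it into text a computer can process. Core Vision Kit's OCR supports multiple languages, recognizes text quickly, and is reasonably accurate, so it suits all kinds of scenarios that need text extraction.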

Use Cases

Core Vision Kit can be used in many scenarios to improve user experience and app efficiency. Some typical ones:

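General text recognition: scans and recognizes text in printed material such as documents, business cards, and receipts, so users can enter and store information quickly.
Face detection: used in album management and photo retouching, and for automatically detecting and locating faces in photos.
Face comparison: commonly used for face authentication, attendance check-in, access control, and other scenarios that verify a user's identity.
Subject segmentation: detects the foreground object or region that stands out from the background (the "salient subject") and separates it from the background, for scenarios that need to identify and extract the main content of an image.
Multi-object recognition: helps developers identify common objects in an image (animals, plants, buildings, people, faces, text, tables, and so on) and returns their positions.
Skeleton keypoint detection: detects key points of the human body and uses them to describe the skeleton, for scenarios such as intelligent video surveillance, patient monitoring systems, and human-computer interaction.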

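Today we'll focus on general text recognition (OCR), which has plenty of uses: digitizing photographed documents, recognizing and translating text in street scenes, and extracting information from receipts, cards, and certificates!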

[Image: example recognition result]

Constraints and Limitations

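Constraint	Details
Supported formats	JPEG, JPG, PNG
Supported languages	Simplified Chinese, English, Japanese, Korean, Traditional Chinese
Text length	No more than 10,000 characters
Recognition type	Printed document text (handwriting recognition is limited)
Image quality	720p or higher recommended; 100px < height < 15210px, 100px < width < 10000px
Shooting angle	The angle from the perpendicular to the text plane should be less than 30 degrees
Device support	Emulators are not supported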
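
The size and format limits in the table are easy to check before calling the recognizer at all. Here is a minimal sketch of such a pre-check in plain TypeScript. The function and constant names are hypothetical (they are not part of Core Vision Kit), and the numbers are simply copied from the table above.

```typescript
// Limits copied from the constraints table; all names here are hypothetical.
const MIN_SIDE_PX = 100;
const MAX_HEIGHT_PX = 15210;
const MAX_WIDTH_PX = 10000;
const SUPPORTED_EXTENSIONS: string[] = ["jpeg", "jpg", "png"];

// True when both sides fall strictly inside the documented bounds.
function isSizeSupported(width: number, height: number): boolean {
  return width > MIN_SIDE_PX && width < MAX_WIDTH_PX &&
    height > MIN_SIDE_PX && height < MAX_HEIGHT_PX;
}

// True when the file name ends in a supported extension (case-insensitive).
function isFormatSupported(fileName: string): boolean {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return false;
  }
  const ext = fileName.slice(dot + 1).toLowerCase();
  return SUPPORTED_EXTENSIONS.indexOf(ext) >= 0;
}
```

In the app, the width and height would probably come from the image's metadata once it is loaded, and the file name from the picked URI. Checking the extension is only a rough guess at the real format, so treat it as a hint rather than a guarantee.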

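🥦 Broccoli's warning:
A friend of mine used a photo too blurry to read on his first try, and nothing was recognized at all! Lesson learned the hard way, folks: image quality really matters~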

Development Steps

1. Import Dependencies

First we need to import the libraries we'll use, just like prepping ingredients before cooking~

import { textRecognition } from "@kit.CoreVisionKit";
import { image } from "@kit.ImageKit";
import { photoAccessHelper } from "@kit.MediaLibraryKit";
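import { fileIo } from "@kit.CoreFileKit";
import { BusinessError } from "@kit.BasicServicesKit"; // error type used in the catch handlers below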

2. Page Layout

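Next, let's lay out the page: an image display area, a recognition result area, and two buttons (select image and start recognition). This layout is like setting up the stove for our text recognition feature~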

build() {
  Column() {
    // Show the selected image
    Image(this.chooseImage)
      .objectFit(ImageFit.Fill)
      .height('60%')

    // Show the recognition result
    Text(this.dataValues)
      .copyOption(CopyOptions.LocalDevice)
      .height('15%')
      .margin(10)
      .width('60%')

    // "Select image" button
    Button('Select image')
      .type(ButtonType.Capsule)
      .fontColor(Color.White)
      .alignSelf(ItemAlign.Center)
      .width('80%')
      .margin(10)
      .onClick(() => this.selectImage())

    // "Start recognition" button
    Button('Start recognition')
      .type(ButtonType.Capsule)
      .fontColor(Color.White)
      .alignSelf(ItemAlign.Center)
      .width('80%')
      .margin(10)
      .onClick(() => this.textRecognitionTest())
  }
  .width('100%')
  .height('100%')
  .justifyContent(FlexAlign.Center)
}

3. Initialization and Resource Release

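The text recognition engine needs to be initialized when the page loads and released when the page is destroyed, like lighting the stove before cooking and turning it off afterwards~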

async aboutToAppear(): Promise<void> {
  // Initialize the text recognition engine
  const initResult = await textRecognition.init();
  console.info(`OCR service initialization result: ${initResult}`);
}

async aboutToDisappear(): Promise<void> {
  // Release the engine's resources
  await textRecognition.release();
  console.info('OCR service released successfully');
}

🥦 Broccoli's tip:
Always remember to release resources, otherwise you'll leak memory, just like leaving the stove on after cooking wastes gas ^_^

4. Image Selection and Loading

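Now we need image selection, so the user can pick a picture from the gallery to recognize. This is like preparing the raw ingredients for our recognition system~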

// Open the gallery and load the picked image
private async selectImage() {
  const uri = await this.openPhoto();
  if (uri) {
    this.loadImage(uri);
  }
}

// Launch the photo picker; resolves with the picked URI, or '' on failure
private openPhoto(): Promise<string> {
  return new Promise((resolve) => {
    const photoPicker = new photoAccessHelper.PhotoViewPicker();
    photoPicker.select({
      MIMEType: photoAccessHelper.PhotoViewMIMETypes.IMAGE_TYPE,
      maxSelectNumber: 1
    }).then((res) => {
      resolve(res.photoUris[0]);
    }).catch((err: BusinessError) => {
      console.error(`Failed to get photo: ${err.code} - ${err.message}`);
      resolve('');
    });
  });
}

// Load the image and decode it into a PixelMap
private loadImage(uri: string) {
  setTimeout(async () => {
    try {
      const fileSource = await fileIo.open(uri, fileIo.OpenMode.READ_ONLY);
      const imageSource = image.createImageSource(fileSource.fd);
      this.chooseImage = await imageSource.createPixelMap();
      await fileIo.close(fileSource);
    } catch (error) {
      console.error(`Failed to load image: ${error}`);
    }
  }, 100);
}
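
One edge case worth guarding: if the user closes the picker without choosing anything, the list of selected URIs may well be empty, and indexing it with [0] would then hand back undefined rather than a string. Below is a small sketch of a guard in plain TypeScript. The helper name is hypothetical, and the empty-list behavior on cancel is an assumption to verify on a real device.

```typescript
// Hypothetical helper: returns the first picked URI, or '' when nothing was picked.
function firstUriOrEmpty(photoUris: string[] | undefined): string {
  if (!photoUris || photoUris.length === 0) {
    return "";
  }
  return photoUris[0];
}
```

With a helper like this, the picker callback could resolve with firstUriOrEmpty(res.photoUris), and the existing "if (uri)" check in selectImage would skip loading when nothing was selected.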

5. Running Text Recognition

Finally, we implement the core of text recognition, so the system can "read" the text in the image. This is like cooking the raw ingredients~

private textRecognitionTest() {
  // Nothing to recognize until an image has been loaded
  if (!this.chooseImage) return;

  // Input for the recognizer
  const visionInfo: textRecognition.VisionInfo = {
    pixelMap: this.chooseImage // the decoded PixelMap
  };

  // Recognition options
  const textConfiguration: textRecognition.TextRecognitionConfiguration = {
    isDirectionDetectionSupported: false // whether to detect text orientation
  };

  // Run recognition
  textRecognition.recognizeText(visionInfo, textConfiguration)
    .then((result: textRecognition.TextRecognitionResult) => {
      // Success: log the full result and show the recognized text
      console.info(`Recognition result: ${JSON.stringify(result)}`);
      this.dataValues = result.value;
    })
    .catch((error: BusinessError) => {
      // Failure: log the error and show its message
      console.error(`Recognition failed: ${error.code} - ${error.message}`);
      this.dataValues = `Error: ${error.message}`;
    });
}
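
The code above only displays result.value, the full recognized text. The result can also be walked piece by piece if you want the text line by line. The sketch below assumes the result carries a list of blocks, each with a list of lines that have their own value. The interfaces are declared locally as stand-ins for that assumed shape, so check them against the kit's actual types before relying on them. The function name is hypothetical.

```typescript
// Assumed shapes, declared locally so the sketch stands alone.
// They are stand-ins, not the kit's real type definitions.
interface LineLike { value: string; }
interface BlockLike { value: string; lines: LineLike[]; }
interface ResultLike { value: string; blocks: BlockLike[]; }

// Collect every non-empty line of text, trimmed, in the order the blocks list them.
function collectLines(result: ResultLike): string[] {
  const out: string[] = [];
  for (const block of result.blocks) {
    for (const line of block.lines) {
      const text = line.value.trim();
      if (text.length > 0) {
        out.push(text);
      }
    }
  }
  return out;
}
```

A list like this is handy when you want to show each recognized line as a separate row, or pick out one field from a receipt, instead of dumping everything into a single Text component.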

Complete Code Example

Enough talk, here's the complete sample code! I've prepared a simple page with image selection and text recognition~

import { textRecognition } from '@kit.CoreVisionKit';
import { image } from '@kit.ImageKit';
import { hilog } from '@kit.PerformanceAnalysisKit';
import { BusinessError } from '@kit.BasicServicesKit';
import { fileIo } from '@kit.CoreFileKit';
import { photoAccessHelper } from '@kit.MediaLibraryKit';

@Entry
@Component
struct TextRecognitionDemo {
  private imageSource: image.ImageSource | undefined = undefined;
  @State chooseImage: PixelMap | undefined = undefined;
  @State dataValues: string = '';

  async aboutToAppear(): Promise<void> {
    const initResult = await textRecognition.init();
    hilog.info(0x0000, 'OCRDemo', `OCR initialization result: ${initResult}`);
  }

  async aboutToDisappear(): Promise<void> {
    await textRecognition.release();
    hilog.info(0x0000, 'OCRDemo', 'OCR resources released');
  }

  build() {
    Column() {
      Image(this.chooseImage)
        .objectFit(ImageFit.Fill)
        .height('60%')

      Text(this.dataValues)
        .textAlign(TextAlign.Start)
        .copyOption(CopyOptions.LocalDevice)
        .height('15%')
        .margin(10)
        .width('90%')
        .backgroundColor('#f5f5f5')
        .padding(10)
        .borderRadius(5)

      Button('Select image')
        .type(ButtonType.Capsule)
        .fontColor(Color.White)
        .alignSelf(ItemAlign.Center)
        .width('80%')
        .margin(10)
        .backgroundColor('#007dff')
        .onClick(() => this.selectImage())

      Button('Start recognition')
        .type(ButtonType.Capsule)
        .fontColor(Color.White)
        .alignSelf(ItemAlign.Center)
        .width('80%')
        .margin(10)
        .backgroundColor('#007dff')
        .onClick(() => this.textRecognitionTest())
    }
    .width('100%')
    .height('100%')
    .justifyContent(FlexAlign.Center)
  }

  private async selectImage() {
    const uri = await this.openPhoto();
    if (uri) this.loadImage(uri);
  }

  private openPhoto(): Promise<string> {
    return new Promise((resolve) => {
      const photoPicker = new photoAccessHelper.PhotoViewPicker();
      photoPicker.select({
        MIMEType: photoAccessHelper.PhotoViewMIMETypes.IMAGE_TYPE,
        maxSelectNumber: 1
      }).then((res) => {
        resolve(res.photoUris[0]);
      }).catch((err: BusinessError) => {
        hilog.error(0x0000, 'OCRDemo', `Failed to select image: ${err.code} - ${err.message}`);
        resolve('');
      });
    });
  }

  private loadImage(uri: string) {
    setTimeout(async () => {
      try {
        const fileSource = await fileIo.open(uri, fileIo.OpenMode.READ_ONLY);
        this.imageSource = image.createImageSource(fileSource.fd);
        this.chooseImage = await this.imageSource.createPixelMap();
        await fileIo.close(fileSource);
      } catch (error) {
        hilog.error(0x0000, 'OCRDemo', `Failed to load image: ${error}`);
      }
    }, 100);
  }

  private textRecognitionTest() {
    if (!this.chooseImage) return;

    const visionInfo: textRecognition.VisionInfo = {
      pixelMap: this.chooseImage
    };

    const textConfiguration: textRecognition.TextRecognitionConfiguration = {
      isDirectionDetectionSupported: false
    };

    textRecognition.recognizeText(visionInfo, textConfiguration)
      .then((result: textRecognition.TextRecognitionResult) => {
        hilog.info(0x0000, 'OCRDemo', `Recognition result: ${JSON.stringify(result)}`);
        this.dataValues = result.value;
      })
      .catch((error: BusinessError) => {
        hilog.error(0x0000, 'OCRDemo', `Recognition failed: ${error.code} - ${error.message}`);
        this.dataValues = `Recognition failed: ${error.message}`;
      });
  }
}