AXERA-TECH/ax_asr_api

ax_asr_api

C++ ASR API on Axera platforms

Supported platforms:

  • AX650
  • AX630C
  • AX620Q
  • AX8850

Supported models:

  • Whisper-Tiny
  • Whisper-Base
  • Whisper-Small
  • Whisper-Turbo
  • Sensevoice

Documentation

Updates

Quick start

Prebuilt libraries can be downloaded from the Release page.

Usage example:

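#include <cstdio>
#include <cstdlib>

#include "ax_asr_api.h"

// model_path, wav_file and language are supplied by the caller.
AX_ASR_HANDLE handle = AX_ASR_Init(WHISPER_TINY, model_path);

// On success, result points to the recognized text.
char* result = nullptr;
if (0 != AX_ASR_RunFile(handle, wav_file, language, &result)) {
    AX_ASR_Uninit(handle);
    return -1;
}

printf("%s\n", result);

// The caller owns result and must free it.
free(result);
AX_ASR_Uninit(handle);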

Download models

Install huggingface_hub:

pip3 install -U huggingface_hub

Run the download script:

bash download_models.sh

Build

Dependencies

System requirements

The build has been verified on Ubuntu 22.04. CMake >= 3.13 is required:

sudo apt install cmake build-essential

Get the cross compilers

  • AX650/AX630C (aarch64): get the aarch64 cross compiler from here,
    then add it to PATH:
export PATH=$PATH:<path to gcc-arm-9.2-2019.12-x86_64-aarch64-none-linux-gnu>/bin
  • AX620Q (arm-uclibc-linux): get the cross compiler from here,
    then add it to PATH:
export PATH=$PATH:<path to arm-AX620E-linux-uclibcgnueabihf>/bin

Get the BSP

bash download_bsp.sh

Cross compilation

  • AX650
bash build_ax650.sh

Build artifacts are placed in install/ax650.

  • AX630C
bash build_ax630c.sh

Build artifacts are placed in install/ax630c.

  • AX620Q
bash build_ax620q.sh

Build artifacts are placed in install/ax620q.

  • AX8850
bash build_ax8850_aarch64.sh

Build artifacts are placed in install/ax8850_aarch64.

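Native build

Not supported yet.

Other build options

  • BUILD_TESTS (default OFF)
    Builds the unit tests under the tests directory; the executables are generated in the platform's install directory, e.g. install/ax650 or install/ax630c.
bash build_ax650.sh -DBUILD_TESTS=ON
  • LOG_LEVEL_DEBUG (default OFF)
    Prints the debug messages in the source code.
bash build_ax650.sh -DLOG_LEVEL_DEBUG=ON
  • BUILD_SERVER (default ON)
    Builds asr_server.
bash build_ax650.sh -DBUILD_SERVER=ON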

Testing

Main program

./install/ax650/main -a demo.wav -t whisper_tiny -p ./models-ax650/whisper -l zh

Usage:

./install/ax8850_aarch64/main --help
usage: ./install/ax8850_aarch64/main --audio=string --model_type=string [options] ...
options:
  -a, --audio         audio file, support wav and mp3 (string)
  -t, --model_type    Choose from whisper_tiny, whisper_base, whisper_small, whisper_turbo, sensevoice (string)
  -p, --model_path    model path which contains axmodel (string [=./models-ax650])
  -l, --language      en, zh (string [=zh])
  -?, --help          print this message
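
To transcribe several files in one go, the command line above can be driven from a script. The following is a minimal Python sketch and not part of this repository: build_main_command and transcribe are hypothetical helper names, only the flags listed in the usage text are used, and it assumes that main writes the recognition result to standard output.

```python
import subprocess

# Model types accepted by -t/--model_type, copied from the usage text above.
MODEL_TYPES = ("whisper_tiny", "whisper_base", "whisper_small",
               "whisper_turbo", "sensevoice")


def build_main_command(binary, audio, model_type, model_path, language="zh"):
    """Return the argument list for one run of the main program."""
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type}")
    return [binary, "-a", audio, "-t", model_type,
            "-p", model_path, "-l", language]


def transcribe(binary, audio, model_type, model_path, language="zh"):
    """Run main on one file and return whatever it wrote to stdout.

    Assumption: the recognized text appears on standard output.
    """
    cmd = build_main_command(binary, audio, model_type, model_path, language)
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout
```

On the device, a call such as transcribe("./install/ax650/main", "demo.wav", "whisper_tiny", "./models-ax650/whisper") would mirror the example command shown earlier. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.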

Server (asr_server)

./install/ax8850_aarch64/asr_server --port 8080

Usage:

./install/ax8850_aarch64/asr_server --help
usage: ./install/ax8850_aarch64/asr_server [options] ...
options:
  -p, --port          On which port to run the server (int [=8080])
  -m, --model_path    model path which contains axmodel (string [=./models-ax650])
  -?, --help          print this message

Client

Python

cd scripts
pip install openai
python test_asr_server.py --ip 10.126.33.146 --port 8080 --audio ../demo.wav -m sensevoice -l zh

Replace 10.126.33.146 with the IP address of the device running asr_server. Run python test_asr_server.py --help for the full list of options.
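
Since the client script depends on the openai package, the server presumably exposes an OpenAI-style transcription endpoint. The sketch below shows what such a request might look like. It is an assumption, not the contents of test_asr_server.py: the /v1 URL prefix, the ignored API key, and the helper names server_base_url and transcribe_remote are all hypothetical, and the script itself remains the authoritative reference.

```python
def server_base_url(ip, port=8080):
    """Base URL handed to the OpenAI client (the /v1 prefix is assumed)."""
    return f"http://{ip}:{port}/v1"


def transcribe_remote(ip, port, audio_path, model="sensevoice", language="zh"):
    """Send one audio file to asr_server and return the recognized text."""
    # Imported here so the URL helper works without the openai package.
    from openai import OpenAI

    # The key is a placeholder; this sketch assumes the server ignores it.
    client = OpenAI(base_url=server_base_url(ip, port), api_key="unused")
    with open(audio_path, "rb") as audio:
        reply = client.audio.transcriptions.create(
            model=model, file=audio, language=language)
    return reply.text
```

If the server follows this convention, transcribe_remote("10.126.33.146", 8080, "../demo.wav") would correspond to the command line shown above.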

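Unit tests

The unit tests under tests are described below:

  • test_whisper_tiny: loads the Whisper tiny model and prints the recognition result for demo.wav
  • test_whisper_base: loads the Whisper base model and prints the recognition result for demo.wav
  • test_whisper_small: loads the Whisper small model and prints the recognition result for demo.wav
  • test_whisper_turbo: loads the Whisper turbo model and prints the recognition result for demo.wav
  • test_sensevoice: loads the Sensevoice model and prints the recognition result for demo.wav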

Performance

RTF (Real Time Factor) is the inference time divided by the audio duration; lower is faster.
WER (Word Error Rate) was measured on a private dataset.

  • RTF
asr_type       AX650   AX630C
Whisper-Tiny   0.0373  -
Whisper-Base   0.0668  0.3849
Whisper-Small  0.2110  -
Whisper-Turbo  0.4372  -
Sensevoice     0.0364  0.1170
  • WER
asr_type       WER
Whisper-Tiny   0.24
Whisper-Base   0.18
Whisper-Small  0.11
Whisper-Turbo  0.06
Sensevoice     0.02
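
For readers who want to reproduce these metrics on their own recordings, the two definitions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the script used to produce the figures above: real_time_factor and word_error_rate are hypothetical names, and WER is computed here as the word-level edit distance divided by the number of reference words, with words separated by whitespace.

```python
def real_time_factor(inference_seconds, audio_seconds):
    """RTF = inference time / audio duration; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return inference_seconds / audio_seconds


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words consumed
    # so far (none yet) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)
```

As an illustration, real_time_factor(0.5, 10.0) means a 10-second clip was decoded in half a second. How the figures above tokenize Chinese text is not stated; for unsegmented Chinese, splitting into characters instead of whitespace-separated words (a character error rate) is a common alternative.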

Integration

The build artifacts include include/ax_asr_api.h and lib/libax_asr_api.so.

Discussion

  • GitHub issues
  • QQ group: 139953715

Contributing

License

This project is licensed under the MIT License - see the LICENSE file for details.