Compiling and Installing Tesseract 4 on CentOS 6

    Article URL: http://tongxinmao.com/Article/Detail/id/367

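    A few days ago I installed tesseract 3.05 on a server and was not very satisfied with the recognition results.

    I then downloaded and installed the latest tesseract 4 on my local Windows machine and found that its recognition rate was much higher than that of the 3.x releases.

    So I decided to upgrade the server to the latest version as well.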

     

    1. Preparation before installing: update the system and install the dependencies

    yum -y update 
    yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
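
    After the yum step it can help to confirm that the build tools are actually on PATH. The sketch below is only an illustration: the helper name missing_tools is hypothetical, and the tool list is an assumption derived from the yum line above.

```shell
# Print each command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools the later build steps rely on (list assumed from the yum line above).
missing_tools gcc g++ make autoconf automake libtool pkg-config
```

    An empty result means every listed tool was found; any name that is printed still needs to be installed.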

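    2. Autoconf 2.64 or later is required. Download the source from the official site and compile it.

    Check the currently installed version and remove it first. See the post below for details.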

    http://blog.csdn.net/knowledgeaaa/article/details/50667870
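
    The version check itself can be sketched as follows. The helper names version_ge and autoconf_version are hypothetical, and the comparison assumes GNU sort with the -V (version sort) option.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort's -V (version sort) option.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Take the last field of the first line of `autoconf --version`,
# e.g. "autoconf (GNU Autoconf) 2.63" gives "2.63".
autoconf_version() {
    autoconf --version 2>/dev/null | head -n 1 | awk '{ print $NF }'
}

installed=$(autoconf_version) || installed=""
if [ -n "$installed" ] && version_ge "$installed" 2.64; then
    echo "autoconf $installed is new enough"
else
    echo "autoconf ${installed:-(not found)} is too old or missing; build 2.64 or later from source"
fi
```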

    If the Autoconf macros are still not recognized after that, use the following method:

    1. Download autoconf-archive source & compile (http://babyname.tips/mirrors/gnu/autoconf-archive/?C=M;O=D)

    cd /tmp
    wget http://babyname.tips/mirrors/gnu/autoconf-archive/autoconf-archive-2017.09.28.tar.xz
    tar xf autoconf-archive-2017.09.28.tar.xz 
    cd autoconf-archive-2017.09.28
    ./configure && make && make install 
    
    # copy the generated macros to /usr/share/aclocal/, such that autogen.sh can find them
    cp ./m4/* /usr/share/aclocal/
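
    A quick way to confirm that the copy worked is to look for the macro file that tesseract's configure script needs. This is a minimal sketch: has_aclocal_macro is a hypothetical helper, and /usr/share/aclocal is the directory used in the cp command above.

```shell
# Succeed if the named macro file is present in the given aclocal directory.
has_aclocal_macro() {
    macro_dir="$1"
    macro_file="$2"
    [ -f "$macro_dir/$macro_file" ]
}

if has_aclocal_macro /usr/share/aclocal ax_check_compile_flag.m4; then
    echo "AX_CHECK_COMPILE_FLAG macro file is installed"
else
    echo "macro file missing; repeat the cp step above"
fi
```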


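    3. Install GCC 4.8 for C++11 support. Download the source from the official site and compile it.

    The stock compiler is too old (output of gcc -v):

    gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)

    See the post below for the build steps.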

    http://www.centoscn.com/image-text/config/2015/0206/4643.html
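
    Whether a given compiler accepts C++11 can be tested directly by compiling a one-line program. The sketch below is an illustration only; supports_cxx11 is a hypothetical helper, and it assumes the compiler understands the -std=c++11 flag when it supports the standard.

```shell
# Succeed if the compiler named in $1 (default: g++) can compile a small
# C++11 snippet with -std=c++11. Fails if the compiler is missing or too old.
supports_cxx11() {
    cxx="${1:-g++}"
    tmpdir=$(mktemp -d) || return 2
    printf 'int main() { auto p = nullptr; (void)p; return 0; }\n' > "$tmpdir/t.cpp"
    "$cxx" -std=c++11 -c "$tmpdir/t.cpp" -o "$tmpdir/t.o" >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return "$status"
}

if supports_cxx11 g++; then
    echo "g++ supports C++11"
else
    echo "g++ is missing or lacks C++11 support; install GCC 4.8 or later"
fi
```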

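    4. Download and install the dependency leptonica, version 1.74 or later. Download the source from the official site and compile it.

    My previous post describes the process in detail.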

    https://my.oschina.net/u/2328100/blog/882777

    5. Download and install the latest tesseract 4

    ./autogen.sh
    # export these so that ./configure actually sees them
    export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
    export LIBLEPT_HEADERSDIR=/usr/local/include
    ./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include"
    make
    make install
    ldconfig
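
    Once ldconfig has run, the linker cache can be inspected to see whether the new libraries were picked up. The following is a rough sketch: cache_lists_lib is a hypothetical helper, and it assumes that ldconfig -p prints one indented "name => path" entry per line. If ldconfig is not on PATH for the current user, both libraries are reported as missing.

```shell
# Read `ldconfig -p` style output on stdin and succeed if a library whose
# file name starts with "$1.so" is listed.
cache_lists_lib() {
    grep "^[[:space:]]*$1\.so" >/dev/null
}

for lib in libtesseract liblept; do
    if ldconfig -p 2>/dev/null | cache_lists_lib "$lib"; then
        echo "$lib is registered with the dynamic linker"
    else
        echo "$lib is not in the linker cache"
    fi
done
```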

    6. Download the language packs into /usr/local/share/tessdata

    Download the language packs you need from the address below.

    https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-400
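
    To see which language packs still have to be downloaded, the tessdata directory can be checked for the expected <code>.traineddata files. This is a minimal sketch; missing_traineddata is a hypothetical helper, and eng and chi_sim are example language codes that do not come from the original post.

```shell
# Print every language code (arguments after the first) whose
# <code>.traineddata file is absent from the directory given as $1.
missing_traineddata() {
    tessdata_dir="$1"
    shift
    for lang in "$@"; do
        [ -f "$tessdata_dir/$lang.traineddata" ] || printf '%s\n' "$lang"
    done
}

# eng and chi_sim are example choices only.
missing_traineddata /usr/local/share/tessdata eng chi_sim
```

    Once the files are in place, running tesseract --list-langs should list the same language codes.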



    Compile

    As of January 2017, tesseract builds with clang, but OpenMP then uses only a single thread, which reduces performance. For best results, use gcc.

    The exact values of CPPFLAGS and LDFLAGS can be read from brew info icu4c.

    git clone  
    cd tesseract
    ./autogen.sh
    ./configure CC=gcc-6 CXX=g++-6 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib
    make -j
    sudo make install  # if desired
    make training # if installed with training dependencies
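
    A bare make -j puts no limit on the number of parallel jobs, which can exhaust memory on a small server. One option is to pass the CPU count explicitly, as in this sketch; cpu_count is a hypothetical helper and it assumes getconf supports _NPROCESSORS_ONLN on the build machine.

```shell
# Number of online CPUs; falls back to 1 if getconf cannot report it.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    printf '%s\n' "$n"
}

echo "suggested build command: make -j$(cpu_count)"
```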



    ./configure: line 4237: syntax error near unexpected token `-mavx,'
    ./configure: line 4237: `AX_CHECK_COMPILE_FLAG(-mavx, avx=1, avx=0)'

    If ./configure fails with this error, ensure that autoconf-archive is installed, and remember to rerun ./autogen.sh after installing it. The error is common on CentOS, where autoconf-archive is missing and no package is available. Some third-party projects help with installing it.
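
    The same problem can be caught before running ./configure by searching the generated script for AX_* macro calls that were left unexpanded. A minimal sketch, with unexpanded_ax_macros as a hypothetical helper name:

```shell
# Print (with line numbers) any AX_* macro calls that were left unexpanded
# in a generated configure script. Exit status is non-zero if none are found.
unexpanded_ax_macros() {
    grep -n '^[[:space:]]*AX_[A-Z0-9_]*(' "$1"
}

if [ -f ./configure ] && unexpanded_ax_macros ./configure; then
    echo "autoconf-archive macros were not expanded; install autoconf-archive and rerun ./autogen.sh"
fi
```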


    6. Solutions to some common errors

    6.1 Error 1: usually traced back to the ./configure stage (it surfaces when make links leptonica)

    /bin/sh ../libtool --tag=CC   --mode=link gcc  -g -O2 -no-undefined  -o xtractprotos xtractprotos.o liblept.la
    libtool: link: gcc -g -O2 -o .libs/xtractprotos xtractprotos.o  ./.libs/liblept.so -Wl,-rpath -Wl,/usr/local/lib
    ./.libs/liblept.so: undefined reference to `sincos'
    ./.libs/liblept.so: undefined reference to `sqrt'
    ./.libs/liblept.so: undefined reference to `ceil'
    ./.libs/liblept.so: undefined reference to `tan'
    ./.libs/liblept.so: undefined reference to `powf'
    ./.libs/liblept.so: undefined reference to `sqrtf'
    ./.libs/liblept.so: undefined reference to `expf'
    ./.libs/liblept.so: undefined reference to `log'
    ./.libs/liblept.so: undefined reference to `sincosf'
    ./.libs/liblept.so: undefined reference to `atan'
    ./.libs/liblept.so: undefined reference to `logf'
    ./.libs/liblept.so: undefined reference to `floorf'
    ./.libs/liblept.so: undefined reference to `sin'
    ./.libs/liblept.so: undefined reference to `tanf'
    ./.libs/liblept.so: undefined reference to `atan2'
    collect2: ld returned 1 exit status
    make[2]: *** [xtractprotos] Error 1
    make[2]: Leaving directory `/root/Downloads/leptonlib-1.67/src'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/root/Downloads/leptonlib-1.67'
    make: *** [all] Error 2


    Solution: reinstall, and at the ./configure stage run ./configure --with-libtiff && make

    6.2 Error 2: appears at the ldconfig stage

    ldconfig: /usr/lib/libtesseract.so.3 is not a symbolic link

    ldconfig: /usr/lib/liblept.so.2 is not a symbolic link


    Solution: 1. Change to /usr/lib. 2. Run the following commands (example):

    rm -rf liblept.so.2

    ln -s liblept.so.2.0.0 liblept.so.2

     

    rm -rf libtesseract.so.3

    ln -s libtesseract.so.3.0.1 libtesseract.so.3


    In other words, following the symlinks in /usr/local/lib as a model, delete the file liblept.so.2 and then create a symlink named liblept.so.2 that points to the file liblept.so.2.0.0.
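
    The delete-and-relink steps can be wrapped in a small function that first checks that the real library file exists. This is a sketch only; relink is a hypothetical helper, and the version numbers in the example calls are the ones shown above and will differ between installations.

```shell
# In directory $1, replace $3 with a symlink pointing at $2.
relink() {
    lib_dir="$1"
    target="$2"
    link_name="$3"
    [ -f "$lib_dir/$target" ] || { echo "missing $lib_dir/$target" >&2; return 1; }
    # -f replaces an existing regular file or link;
    # -n avoids following an existing link that points to a directory
    ln -sfn "$target" "$lib_dir/$link_name"
}

# Example invocations (root is needed for /usr/lib):
# relink /usr/lib liblept.so.2.0.0 liblept.so.2
# relink /usr/lib libtesseract.so.3.0.1 libtesseract.so.3
```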

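    7. Notes on calling Tesseract from a project on Linux

    Installation places the shared libraries only in /usr/local/lib, while projects look for libraries in /usr/lib. Copy the tesseract and leptonica libraries from /usr/local/lib into /usr/lib, and be sure to copy the symlinks as well; otherwise errors occur, and the fix is the one described in 6.2.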
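
    When copying, cp has to be told to keep symlinks as symlinks, otherwise each link is turned into a full copy of the library. The sketch below uses the -P option for that; copy_libs is a hypothetical helper and the example paths assume the default /usr/local prefix.

```shell
# Copy files matching the glob $3 from directory $1 to directory $2,
# keeping symlinks as symlinks (-P) and preserving mode and timestamps (-p).
copy_libs() {
    src_dir="$1"
    dst_dir="$2"
    pattern="$3"
    for f in "$src_dir"/$pattern; do
        [ -e "$f" ] || [ -L "$f" ] || continue
        cp -pP "$f" "$dst_dir"/ || return 1
    done
}

# Example invocations (root is needed for /usr/lib):
# copy_libs /usr/local/lib /usr/lib 'libtesseract.so*'
# copy_libs /usr/local/lib /usr/lib 'liblept.so*'
```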

