Compiling and installing Tesseract 4 on CentOS 6
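Article URL: http://tongxinmao.com/Article/Detail/id/367
A few days ago I installed tesseract 3.05 on a server and was not satisfied with the recognition quality.
I then downloaded and installed the latest tesseract 4 on my local Windows machine and found that its recognition rate is much higher than that of the 3.x versions.
So I decided to upgrade the server to the latest version.
1. Preparation: update the system and install the dependencies first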
yum -y update
yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
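2. Autoconf 2.64 or later is required. Download the source from the official site and compile it.
Check the current version and remove it. See the post below:
http://blog.csdn.net/knowledgeaaa/article/details/50667870
If AUTOCONF is still not recognized, use the following fix: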
Download autoconf-archive source & compile (http://babyname.tips/mirrors/gnu/autoconf-archive/?C=M;O=D)
cd /tmp
wget http://babyname.tips/mirrors/gnu/autoconf-archive/autoconf-archive-2017.09.28.tar.xz
tar xf autoconf-archive-2017.09.28.tar.xz
cd autoconf-archive-2017.09.28
./configure && make && make install
# copy the generated macros to /usr/share/aclocal/, such that autogen.sh can find them
cp ./m4/* /usr/share/aclocal/
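Before rebuilding Autoconf, it is worth checking whether the installed version already meets the 2.64 requirement from step 2. The following is a minimal sketch, assuming GNU sort with the -V (version sort) option is available; version_ge is a helper name made up for this example and is not part of any tool.

```shell
# version_ge A B: succeeds when version A is at least version B.
# Hypothetical helper; relies on GNU sort -V for version ordering.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

if command -v autoconf >/dev/null 2>&1; then
    # The first line looks like "autoconf (GNU Autoconf) 2.63"; keep the last field.
    have=$(autoconf --version | head -n1 | awk '{print $NF}')
    if version_ge "$have" 2.64; then
        echo "autoconf $have is new enough"
    else
        echo "autoconf $have is older than 2.64"
    fi
else
    echo "autoconf is not installed"
fi
```

The same helper can compare any two dotted version strings, for example the GCC version needed in the next step.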
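3. Install GCC 4.8 for C++11 support. Download the source from the official site and compile it.
The stock compiler is too old for C++11:
gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)
See the post below: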
http://www.centoscn.com/image-text/config/2015/0206/4643.html
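A quick way to confirm that the compiler in use really supports C++11 is to compile a tiny test file with -std=c++11. This is only a sketch: supports_cxx11 is a hypothetical helper name, and the test program is an assumption of what counts as "enough" C++11 (auto and nullptr), not the check that tesseract's configure script performs.

```shell
# supports_cxx11 [COMPILER]: succeeds when COMPILER (default g++) can compile
# a small C++11 snippet with -std=c++11. Hypothetical helper for illustration.
supports_cxx11() {
    cxx=${1:-g++}
    tmpdir=$(mktemp -d) || return 1
    printf 'int main() { auto p = nullptr; return p == nullptr ? 0 : 1; }\n' > "$tmpdir/t.cpp"
    "$cxx" -std=c++11 -c "$tmpdir/t.cpp" -o "$tmpdir/t.o" >/dev/null 2>&1
    rc=$?
    rm -rf "$tmpdir"
    return $rc
}

if supports_cxx11 g++; then
    echo "g++ accepts -std=c++11"
else
    echo "g++ is missing or too old for -std=c++11"
fi
```

GCC 4.4.7 rejects the -std=c++11 option, so on an unmodified system the second message is expected.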
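4. Download and install the dependency leptonica, version 1.74 or later. Download the source from the official site and compile it.
My previous post describes the process in detail: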
https://my.oschina.net/u/2328100/blog/882777
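After leptonica is installed, pkg-config can confirm that a new enough version is visible to the tesseract build. A minimal sketch, assuming leptonica was installed under /usr/local and registered itself with pkg-config under the module name lept; lept_at_least is a hypothetical helper name.

```shell
# lept_at_least VERSION: succeeds when pkg-config finds leptonica >= VERSION.
# Hypothetical helper; assumes lept.pc lives in /usr/local/lib/pkgconfig.
lept_at_least() {
    PKG_CONFIG_PATH=/usr/local/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH} \
        pkg-config --atleast-version="$1" lept 2>/dev/null
}

if lept_at_least 1.74; then
    echo "leptonica 1.74 or later is visible to pkg-config"
else
    echo "leptonica 1.74 or later was not found by pkg-config"
fi
```

If the check fails even though leptonica is installed, the .pc file is probably in a directory that is not on PKG_CONFIG_PATH.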
5. Download and install the latest tesseract 4
./autogen.sh
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
export LIBLEPT_HEADERSDIR=/usr/local/include
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include"
make
make install
ldconfig
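Note that ./configure runs as a child process, so a variable such as PKG_CONFIG_PATH only reaches it when it is exported (or given on the same command line as ./configure). The sketch below demonstrates the difference with a made-up variable name, DEMO_PKG_PATH, so that it does not disturb the real build environment.

```shell
# DEMO_PKG_PATH is a hypothetical variable used only for this demonstration.
unset DEMO_PKG_PATH
DEMO_PKG_PATH=/usr/local/lib/pkgconfig          # plain shell variable, not exported
before=$(sh -c 'echo "${DEMO_PKG_PATH:-unset}"')

export DEMO_PKG_PATH                            # now part of the environment
after=$(sh -c 'echo "${DEMO_PKG_PATH:-unset}"')

echo "child process before export: $before"
echo "child process after export:  $after"
```

Before the export the child shell reports the variable as unset; after it, the child sees the path.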
6. Download the language packs into the directory /usr/local/share/tessdata
Download the language packs you need from the address below:
https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-400
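Once the files are downloaded, a short check can confirm that each .traineddata file landed in the right directory. This is a sketch only: has_lang is a hypothetical helper, and eng and chi_sim are example language codes; substitute the ones you downloaded.

```shell
# Directory from step 6; override by setting TESSDATA_DIR beforehand.
TESSDATA_DIR=${TESSDATA_DIR:-/usr/local/share/tessdata}

# has_lang LANG [DIR]: succeeds when DIR (default TESSDATA_DIR) contains
# LANG.traineddata. Hypothetical helper for illustration.
has_lang() {
    [ -f "${2:-$TESSDATA_DIR}/$1.traineddata" ]
}

for lang in eng chi_sim; do
    if has_lang "$lang"; then
        echo "$lang: installed"
    else
        echo "$lang: missing from $TESSDATA_DIR"
    fi
done
```

When the files are in place, tesseract --list-langs should list the same languages.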
Compile
As of January 2017, tesseract builds with clang, but OpenMP will then use only a single thread, which reduces performance. For best results, use gcc.
The exact values of CPPFLAGS and LDFLAGS can be read from brew info icu4c.
git clone
cd tesseract
./autogen.sh
./configure CC=gcc-6 CXX=g++-6 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib
make -j
sudo make install  # if desired
make training  # if installed with training dependencies
If you see an error like this:
./configure: line 4237: syntax error near unexpected token `-mavx,'
./configure: line 4237: `AX_CHECK_COMPILE_FLAG(-mavx, avx=1, avx=0)'
ensure that autoconf-archive is installed. Don't forget to run ./autogen.sh again after installing autoconf-archive. Note that this error is common on CentOS, where autoconf-archive is missing and no package is available. Some projects help with installing it.
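The AX_CHECK_COMPILE_FLAG macro named in the error comes from the file ax_check_compile_flag.m4 in autoconf-archive, so its presence can be checked directly. A minimal sketch, assuming the macros were copied to /usr/share/aclocal as in step 2; has_ax_check_compile_flag is a hypothetical helper name.

```shell
# has_ax_check_compile_flag [DIR]: succeeds when DIR (default /usr/share/aclocal)
# contains the autoconf-archive macro file. Hypothetical helper for illustration.
has_ax_check_compile_flag() {
    [ -f "${1:-/usr/share/aclocal}/ax_check_compile_flag.m4" ]
}

if has_ax_check_compile_flag; then
    echo "macro file found; rerun ./autogen.sh and then ./configure"
else
    echo "macro file missing; install autoconf-archive first"
fi
```

If aclocal searches a different directory on your system, aclocal --print-ac-dir shows which one, and that path can be passed as the argument.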
6. Solutions to some common errors
6.1 Error 1: mostly related to the ./configure stage (seen here while building leptonica)
/bin/sh ../libtool --tag=CC --mode=link gcc -g -O2 -no-undefined -o xtractprotos xtractprotos.o liblept.la
libtool: link: gcc -g -O2 -o .libs/xtractprotos xtractprotos.o ./.libs/liblept.so -Wl,-rpath -Wl,/usr/local/lib
./.libs/liblept.so: undefined reference to `sincos'
./.libs/liblept.so: undefined reference to `sqrt'
./.libs/liblept.so: undefined reference to `ceil'
./.libs/liblept.so: undefined reference to `tan'
./.libs/liblept.so: undefined reference to `powf'
./.libs/liblept.so: undefined reference to `sqrtf'
./.libs/liblept.so: undefined reference to `expf'
./.libs/liblept.so: undefined reference to `log'
./.libs/liblept.so: undefined reference to `sincosf'
./.libs/liblept.so: undefined reference to `atan'
./.libs/liblept.so: undefined reference to `logf'
./.libs/liblept.so: undefined reference to `floorf'
./.libs/liblept.so: undefined reference to `sin'
./.libs/liblept.so: undefined reference to `tanf'
./.libs/liblept.so: undefined reference to `atan2'
collect2: ld returned 1 exit status
make[2]: *** [xtractprotos] Error 1
make[2]: Leaving directory `/root/Downloads/leptonlib-1.67/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/Downloads/leptonlib-1.67'
make: *** [all] Error 2
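Solution: reinstall, and at the ./configure stage run ./configure --with-libtiff && make. (Every undefined symbol in the log is a math-library function, and the link command shown does not include -lm.)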
6.2 Error 1: appears at the ldconfig stage
ldconfig: /usr/lib/libtesseract.so.3 is not a symbolic link
ldconfig: /usr/lib/liblept.so.2 is not a symbolic link
Solution: 1. Change to /usr/lib. 2. Run the following commands (example):
rm -rf liblept.so.2
ln -s liblept.so.2.0.0 liblept.so.2
rm -rf libtesseract.so.3
ln -s libtesseract.so.3.0.1 libtesseract.so.3
That is, mirror the symlinks found in /usr/local/lib: delete the file liblept.so.2, then create a symlink named liblept.so.2 that points to the file liblept.so.2.0.0.
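The delete-and-relink steps can be wrapped in a small function that refuses to act when the real library file is absent, which avoids leaving a dangling link. This is a sketch under that assumption; relink is a hypothetical helper name, and nothing runs against /usr/lib until you call it yourself.

```shell
# relink DIR TARGET LINK: replace DIR/LINK with a symlink pointing to TARGET.
# Hypothetical helper; does nothing if DIR/TARGET does not exist.
relink() {
    dir=$1
    target=$2
    link=$3
    if [ ! -e "$dir/$target" ]; then
        echo "missing $dir/$target, nothing changed" >&2
        return 1
    fi
    rm -f "$dir/$link"
    ln -s "$target" "$dir/$link"
}

# Example calls matching the commands above (run as root, then run ldconfig):
#   relink /usr/lib liblept.so.2.0.0 liblept.so.2
#   relink /usr/lib libtesseract.so.3.0.1 libtesseract.so.3
```

The link target is relative, so the link stays valid if the directory is later copied elsewhere.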
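7. Notes on calling Tesseract from a project on Linux
The installation places the shared libraries only in /usr/local/lib, whereas the project loads its libraries from /usr/lib, so copy the relevant tesseract and leptonica libraries from /usr/local/lib to /usr/lib. Copy the symlinks as well, otherwise errors occur; see 6.2 for the fix.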
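To copy the libraries together with their symlinks, cp needs the -P option, which copies a symlink as a symlink instead of following it. A minimal sketch; copy_libs is a hypothetical helper name, and the glob patterns assume the default library names liblept.so* and libtesseract.so*.

```shell
# copy_libs SRC DST: copy the leptonica and tesseract shared libraries from
# SRC to DST, keeping symlinks as symlinks (-P). Hypothetical helper.
copy_libs() {
    src=$1
    dst=$2
    cp -P "$src"/liblept.so* "$src"/libtesseract.so* "$dst"/
}

# Example call for the layout described above (run as root, then run ldconfig):
#   copy_libs /usr/local/lib /usr/lib
```

Without -P, each symlink would be copied as a full regular file, which is what triggers the "is not a symbolic link" message from 6.2.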