Compiling and installing Tesseract 4 on CentOS 6

Article URL: http://tongxinmao.com/Article/Detail/id/367

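A few days ago I installed tesseract 3.05 on a server and was not very satisfied with the recognition results. I then downloaded and installed the latest tesseract 4 on my local Windows machine and found its recognition rate much higher than that of the 3.x versions. So I promptly upgraded the server to the latest version.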

 

1. Preparation: update the system and install the build dependencies

yum -y update 
yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
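
After the packages are installed, it can help to confirm that the tools the later steps call are actually on the PATH. The following is a minimal sketch; missing_tools is a hypothetical helper name, and the tool list is an assumption based on the commands used in this guide.

```shell
# missing_tools TOOL...: print each tool that is not found on the PATH.
# (hypothetical helper, for illustration only)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
    return 0
}

# Tools that the remaining steps of this guide invoke.
missing=$(missing_tools gcc g++ make autoconf automake libtool pkg-config)
if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All build tools found."
fi
```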

2. Autoconf 2.64 or later is required. Download the source from the official site and compile it.

Check the currently installed version and remove it. See the post below.

http://blog.csdn.net/knowledgeaaa/article/details/50667870
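
To see whether the installed Autoconf already meets the 2.64 requirement, a version comparison along these lines may be enough. This is a sketch that assumes GNU sort (for the -V option); version_ge is a hypothetical helper name.

```shell
# version_ge A B: succeeds when version A is greater than or equal to version B.
# Relies on "sort -V" from GNU coreutils (assumption).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v autoconf >/dev/null 2>&1; then
    # First line looks like "autoconf (GNU Autoconf) 2.69"; the last field is the version.
    installed=$(autoconf --version | head -n 1 | awk '{print $NF}')
    if version_ge "$installed" 2.64; then
        echo "autoconf $installed is new enough"
    else
        echo "autoconf $installed is too old; build 2.64 or later from source"
    fi
else
    echo "autoconf not found"
fi
```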

If AUTOCONF still cannot be detected after that, use the following workaround, which installs autoconf-archive:

  1. Download autoconf-archive source & compile (http://babyname.tips/mirrors/gnu/autoconf-archive/?C=M;O=D)

cd /tmp
wget http://babyname.tips/mirrors/gnu/autoconf-archive/autoconf-archive-2017.09.28.tar.xz
tar xf autoconf-archive-2017.09.28.tar.xz 
cd autoconf-archive-2017.09.28
./configure && make && make install 

# copy the generated macros to /usr/share/aclocal/, such that autogen.sh can find them
cp ./m4/* /usr/share/aclocal/


3. Install GCC 4.8 for C++11 support. Download the source from the official site and compile it.

The system compiler is too old for this; gcc -v reports:

gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)

See the post below for how to build a newer GCC.

http://www.centoscn.com/image-text/config/2015/0206/4643.html
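
The check against the 4.8 threshold named in this step can be sketched as follows. The helper parses a "gcc version X.Y.Z" banner like the one shown above; gcc_banner_supports_cxx11 is a hypothetical name, and the threshold is simply the one this step states.

```shell
# gcc_banner_supports_cxx11 BANNER: succeeds when the "gcc version X.Y.Z"
# banner reports GCC 4.8 or newer; returns 2 if the banner cannot be parsed.
# (hypothetical helper, for illustration only)
gcc_banner_supports_cxx11() {
    ver=$(printf '%s\n' "$1" | sed -n 's/^gcc version \([0-9][0-9.]*\).*/\1/p')
    [ -n "$ver" ] || return 2
    major=${ver%%.*}
    case "$ver" in
        *.*) rest=${ver#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;
    esac
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 8 ]; }
}

if command -v gcc >/dev/null 2>&1; then
    # gcc prints its version banner on stderr.
    banner=$(gcc -v 2>&1 | grep '^gcc version' || true)
    if gcc_banner_supports_cxx11 "$banner"; then
        echo "OK: $banner"
    else
        echo "GCC 4.8 or later is needed; found: ${banner:-unknown}"
    fi
else
    echo "gcc not found"
fi
```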

4. Download and install the Leptonica dependency, version 1.74 or later. Download the source from the official site and compile it.

My previous post covers this process in detail.

https://my.oschina.net/u/2328100/blog/882777
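
Once Leptonica is installed, pkg-config can confirm that a sufficiently new version is visible to the Tesseract build. This sketch assumes Leptonica was installed under the default /usr/local prefix, so its lept.pc file sits in /usr/local/lib/pkgconfig; check_leptonica is a hypothetical helper name.

```shell
# Make the /usr/local pkg-config directory visible (assumed install prefix).
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}
export PKG_CONFIG_PATH

# check_leptonica: report whether pkg-config sees Leptonica 1.74 or later.
check_leptonica() {
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config not found"
    elif pkg-config --atleast-version=1.74 lept; then
        echo "Leptonica $(pkg-config --modversion lept) is new enough"
    elif pkg-config --exists lept; then
        echo "Leptonica $(pkg-config --modversion lept) is older than 1.74"
    else
        echo "Leptonica is not visible to pkg-config"
    fi
}

check_leptonica
```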

5. Download and install the latest Tesseract 4

./autogen.sh
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
export LIBLEPT_HEADERSDIR=/usr/local/include
./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include"
make
make install
ldconfig

6. Download the language packs into /usr/local/share/tessdata

Download the language packs you need from the address below.

https://github.com/tesseract-ocr/tesseract/wiki/Data-Files#data-files-for-version-400
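
To confirm which language packs ended up in the data directory, the file names can be listed directly. This is a minimal sketch; list_traineddata is a hypothetical helper, and it only inspects file names, it does not validate the files.

```shell
# list_traineddata DIR: print the language code of every *.traineddata file in DIR.
# (hypothetical helper, for illustration only)
list_traineddata() {
    for f in "$1"/*.traineddata; do
        [ -e "$f" ] || continue        # the glob matched nothing
        name=${f##*/}                  # strip the directory part
        printf '%s\n' "${name%.traineddata}"
    done
}

list_traineddata /usr/local/share/tessdata
```

With Tesseract installed, tesseract --list-langs gives the same information as seen by the program itself.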



Compile

As of January 2017, the clang build works, but OpenMP will only use a single thread, which reduces performance. For best results, use gcc.

The exact values of CPPFLAGS and LDFLAGS can be read from brew info icu4c.

git clone  
cd tesseract
./autogen.sh
./configure CC=gcc-6 CXX=g++-6 CPPFLAGS=-I/usr/local/opt/icu4c/include LDFLAGS=-L/usr/local/opt/icu4c/lib
make -j
sudo make install  # if desired
make training  # if installed with training dependencies



./configure: line 4237: syntax error near unexpected token `-mavx,'
./configure: line 4237: `AX_CHECK_COMPILE_FLAG(-mavx, avx=1, avx=0)'

If ./configure fails with the error above, ensure that autoconf-archive is installed, and remember to rerun ./autogen.sh after installing it. This error is common on CentOS, where autoconf-archive is missing and no package is available. Some projects help with installing.
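
A quick way to tell whether the AX_CHECK_COMPILE_FLAG macro is installed is to look for its file, ax_check_compile_flag.m4, in the aclocal directory. The sketch below assumes the macros were copied to /usr/share/aclocal as in step 2; has_ax_check_compile_flag is a hypothetical helper name.

```shell
# has_ax_check_compile_flag DIR: succeeds when DIR holds the macro file
# that defines AX_CHECK_COMPILE_FLAG.
# (hypothetical helper, for illustration only)
has_ax_check_compile_flag() {
    [ -f "$1/ax_check_compile_flag.m4" ]
}

if has_ax_check_compile_flag /usr/share/aclocal; then
    echo "Macro file found; rerun ./autogen.sh and then ./configure"
else
    echo "ax_check_compile_flag.m4 is missing from /usr/share/aclocal"
fi
```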


6. Fixes for some common errors

6.1 Error 1: mostly occurs at the ./configure stage

/bin/sh ../libtool --tag=CC --mode=link gcc -g -O2 -no-undefined -o xtractprotos xtractprotos.o liblept.la
libtool: link: gcc -g -O2 -o .libs/xtractprotos xtractprotos.o ./.libs/liblept.so -Wl,-rpath -Wl,/usr/local/lib
./.libs/liblept.so: undefined reference to `sincos'
./.libs/liblept.so: undefined reference to `sqrt'
./.libs/liblept.so: undefined reference to `ceil'
./.libs/liblept.so: undefined reference to `tan'
./.libs/liblept.so: undefined reference to `powf'
./.libs/liblept.so: undefined reference to `sqrtf'
./.libs/liblept.so: undefined reference to `expf'
./.libs/liblept.so: undefined reference to `log'
./.libs/liblept.so: undefined reference to `sincosf'
./.libs/liblept.so: undefined reference to `atan'
./.libs/liblept.so: undefined reference to `logf'
./.libs/liblept.so: undefined reference to `floorf'
./.libs/liblept.so: undefined reference to `sin'
./.libs/liblept.so: undefined reference to `tanf'
./.libs/liblept.so: undefined reference to `atan2'
collect2: ld returned 1 exit status
make[2]: *** [xtractprotos] Error 1
make[2]: Leaving directory `/root/Downloads/leptonlib-1.67/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/root/Downloads/leptonlib-1.67'
make: *** [all] Error 2


Fix: reinstall, and at the configure stage run ./configure --with-libtiff && make

6.2 Error 2: occurs at the ldconfig stage

ldconfig: /usr/lib/libtesseract.so.3 is not a symbolic link
ldconfig: /usr/lib/liblept.so.2 is not a symbolic link


Fix:
1. Change to /usr/lib.
2. Run the following commands (example):

rm -rf liblept.so.2
ln -s liblept.so.2.0.0 liblept.so.2

rm -rf libtesseract.so.3
ln -s libtesseract.so.3.0.1 libtesseract.so.3


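In other words, mirror the symbolic links found in /usr/local/lib: delete the regular file liblept.so.2, then create a symbolic link named liblept.so.2 that points to the file liblept.so.2.0.0.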
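
The delete-then-link procedure can be wrapped in a small function that first checks that the real library file exists, so a typo cannot leave a dangling link. This is a sketch only; fix_soname_link is a hypothetical helper name, and the file names in the usage comment are the ones from the example above.

```shell
# fix_soname_link DIR REAL LINK: replace DIR/LINK with a symbolic link to REAL.
# Refuses to act if DIR/REAL does not exist.
# (hypothetical helper, for illustration only)
fix_soname_link() {
    dir=$1; real=$2; link=$3
    if [ ! -f "$dir/$real" ]; then
        echo "missing $dir/$real" >&2
        return 1
    fi
    rm -f "$dir/$link"
    ln -s "$real" "$dir/$link"
}

# Example (run as root on the server):
#   fix_soname_link /usr/lib liblept.so.2.0.0 liblept.so.2
#   fix_soname_link /usr/lib libtesseract.so.3.0.1 libtesseract.so.3
#   ldconfig
```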

7. Notes on calling Tesseract from a project on Linux

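The installation places the shared libraries only in /usr/local/lib, whereas the project looks for its libraries in /usr/lib. Copy the Tesseract and Leptonica libraries from /usr/local/lib into /usr/lib, and make sure the symbolic links are copied as links too, otherwise errors occur; for the fix, see 6.2.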
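
One way to copy the libraries while keeping the symbolic links intact is cp -a, which copies links as links instead of following them. The sketch below assumes a cp that supports -a (GNU coreutils does); copy_shared_libs is a hypothetical helper name.

```shell
# copy_shared_libs SRC DST STEM: copy SRC/STEM.so* into DST, keeping
# symbolic links as links (cp -a) rather than copying their targets.
# Fails if nothing matched or a copy failed.
# (hypothetical helper, for illustration only)
copy_shared_libs() {
    src=$1; dst=$2; stem=$3
    found=0
    for f in "$src/$stem".so*; do
        [ -e "$f" ] || [ -L "$f" ] || continue   # the glob matched nothing
        cp -a "$f" "$dst/" || return 1
        found=1
    done
    [ "$found" -eq 1 ]
}

# Example (run as root on the server):
#   copy_shared_libs /usr/local/lib /usr/lib liblept
#   copy_shared_libs /usr/local/lib /usr/lib libtesseract
#   ldconfig
```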

