
Image processing to improve tesseract OCR accuracy


I've been using tesseract to convert documents into text. The quality of the documents varies widely, and I'm looking for tips on what sort of image processing might improve the results. I've noticed that highly pixellated text, for example that generated by fax machines, is especially difficult for tesseract to process; presumably all those jagged edges on the characters confound the shape-recognition algorithms.
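To make the jagged-edge problem concrete, here is a minimal, dependency-free sketch of one kind of preprocessing I have in mind: upscale the bitmap, blur it slightly, then re-threshold so stair-stepped corners get rounded off. The function names (`upscale`, `box_blur`, `threshold`, `smooth_jagged`) and the default parameters are my own illustration, not anything from tesseract, and I have not measured whether this actually helps recognition. A real pipeline would presumably use an imaging library rather than nested lists.

```python
# Illustrative only: images are lists of rows of grayscale ints (0 = black, 255 = white).

def upscale(img, factor):
    """Nearest-neighbour upscaling: each pixel becomes a factor x factor block."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def box_blur(img, radius=1):
    """Mean filter over a (2*radius+1) square window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = count = 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total // count
    return out


def threshold(img, cutoff=128):
    """Binarize: pixels at or above cutoff become white, the rest black."""
    return [[255 if p >= cutoff else 0 for p in row] for row in img]


def smooth_jagged(img, factor=3, radius=1, cutoff=128):
    """Upscale, blur, then re-threshold to soften stair-stepped edges."""
    return threshold(box_blur(upscale(img, factor), radius), cutoff)


if __name__ == "__main__":
    # A 2x2 bitmap with one black pixel in the bottom-right corner.
    fax_like = [[255, 255],
                [255, 0]]
    for row in smooth_jagged(fax_like):
        print("".join("#" if p == 0 else "." for p in row))
```

The blur radius and upscale factor interact: a radius that is small relative to the factor only trims the sharp corners of each pixel block, while a larger one starts to erase thin strokes.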



