A Brief Look at OCR with Tesseract
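Optical character recognition (OCR) is the process of scanning text documents and then analyzing the resulting image files to extract the text and layout information. OCR is a highly specialized technology, used mostly by people in the printing industry to convert paper documents into electronic ones quickly. For Chinese OCR, the leading domestic products currently include Tsinghua Wintone (清華文通), Hanwang (漢王), and Shangshu (尚書); each has its strengths, and none is cheap. OCR developed earlier abroad: large companies such as IBM, Microsoft, and HP, even where they have not released standalone OCR products, have long had R&D teams that master the core technology and have built OCR into their own software. For us programmers, such high-end products are usually unnecessary; it is enough to integrate basic OCR functionality during development. Over the past two days I looked through many free OCR programs and libraries and decided to write them up. Today I start with Tesseract; next time I will discuss the OCR API in OneNote 2010. A brief history of OCR technology can be viewed here.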
Please credit the source when reposting: http://www.cnblogs.com/brooks-dotnet/archive/2010/10/05/1844203.html

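1. Tesseract Overview
The Tesseract OCR engine was first developed at HP Labs starting in 1985, and by 1995 it had become one of the three most accurate recognition engines in the OCR industry. However, HP soon decided to drop its OCR business, and Tesseract was shelved.
Years later, HP realized that rather than leaving Tesseract on the shelf, it would be better to contribute it to the open-source community and give it a new life. In 2005, Tesseract was taken over by the Information Science Research Institute in Nevada, USA, which turned to Google to improve it, fix bugs, and optimize it.
Tesseract is now published as an open-source project on Google Code, where its project home page can be found. Its latest version, 3.0, supports Chinese OCR and provides a command-line tool. This time we will test Tesseract 3.0. Since the command line is not very friendly to end users, I wrapped it in a simple WPF application so that Chinese OCR can be done conveniently.
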
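1.1 First, download the command-line tool, the source code, and the Chinese language pack from the Tesseract project home page.
1.2 Extract the command-line tool. (The 1.jpg and 1.txt files used in the later steps are not part of the package.)
1.3 To OCR Chinese text, copy the Simplified Chinese language pack into the tessdata directory.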
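1.4 In a DOS prompt, switch to the Tesseract command-line directory and check the command format of tesseract.exe.
imagename is the image to OCR; outputbase is the output file that receives the OCR result, a text file (.txt) by default; lang is the language pack to use; and configfile is a configuration file.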
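1.5 Now for a test. Prepare a jpg image; here I placed it in the same directory as Tesseract.
Enter: tesseract.exe 1.jpg 1 -l chi_sim, then press Enter. The OCR finishes in a few seconds.
Note the command format: imagename must include the .jpg extension, while the output file and the language pack are given without extensions.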
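
The format rule above is easy to get wrong when the command is assembled in code, so here is a minimal sketch of a helper that builds the argument string. The class and method names (TesseractArgs, Build) are hypothetical, and stripping a trailing ".txt" or a language-file extension such as ".traineddata" is my own assumption about what a caller might pass by mistake; it is not something the Tesseract tool does.

```csharp
using System;
using System.IO;

// Hypothetical helper: builds the arguments for
//   tesseract.exe imagename outputbase -l lang
public static class TesseractArgs
{
    public static string Build(string imagePath, string outputBase, string lang)
    {
        // imagename must keep its extension (e.g. 1.jpg).
        if (string.IsNullOrEmpty(Path.GetExtension(imagePath)))
            throw new ArgumentException("imagename must include its extension, e.g. 1.jpg", "imagePath");

        // outputbase is given without ".txt"; drop it if the caller added it.
        if (outputBase.EndsWith(".txt", StringComparison.OrdinalIgnoreCase))
            outputBase = outputBase.Substring(0, outputBase.Length - 4);

        // lang is the bare language name (chi_sim), not a file name.
        lang = Path.GetFileNameWithoutExtension(lang);

        // Quote the two paths so that spaces in directory names survive.
        return "\"" + imagePath + "\" \"" + outputBase + "\" -l " + lang;
    }
}
```

For the command used in this step, TesseractArgs.Build("1.jpg", "1", "chi_sim") yields "1.jpg" "1" -l chi_sim, with the two paths quoted.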
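
As for the OCR result, it is not ideal: Chinese recognition is passable, but English text and digits mostly come out garbled. Still, for a veteran OCR engine this is already quite good. Let's look forward to Google's future upgrades and give the project some support.
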
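2. Wrapping the Tesseract Command Line with WPF
2.1 Since commands are easy to mistype and unfriendly to end users, I wrote a small WPF program that wraps the Tesseract command line.
The left side is for selecting and previewing the image; the right side is for selecting the output directory and displaying the OCR result. Both local and network images can be previewed.
2.2 To let the image preview support zooming and panning, I originally planned to use Microsoft's Zoom It API, but unfortunately it does not support WPF, so I used a third-party class: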
using System;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media.Animation;
using System.Windows;
using System.Windows.Media;
namespace PanAndZoom
{
    public class PanAndZoomViewer : ContentControl
    {
        public double DefaultZoomFactor { get; set; }
        private FrameworkElement source;
        private Point ScreenStartPoint = new Point(0, 0);
        private TranslateTransform translateTransform;
        private ScaleTransform zoomTransform;
        private TransformGroup transformGroup;
        private Point startOffset;
        public PanAndZoomViewer()
        {
            this.DefaultZoomFactor = 1.4;
        }
        public override void OnApplyTemplate()
        {
            base.OnApplyTemplate();
            Setup(this);
        }
        void Setup(FrameworkElement control)
        {
            this.source = VisualTreeHelper.GetChild(this, 0) as FrameworkElement;
            this.translateTransform = new TranslateTransform();
            this.zoomTransform = new ScaleTransform();
            // Scale is applied first, then translation.
            this.transformGroup = new TransformGroup();
            this.transformGroup.Children.Add(this.zoomTransform);
            this.transformGroup.Children.Add(this.translateTransform);
            this.source.RenderTransform = this.transformGroup;
            this.Focusable = true;
            this.KeyDown += new KeyEventHandler(source_KeyDown);
            this.MouseMove += new MouseEventHandler(control_MouseMove);
            this.MouseDown += new MouseButtonEventHandler(source_MouseDown);
            this.MouseUp += new MouseButtonEventHandler(source_MouseUp);
            this.MouseWheel += new MouseWheelEventHandler(source_MouseWheel);
        }
        void source_KeyDown(object sender, KeyEventArgs e)
        {
            // hit escape to reset everything
            if (e.Key == Key.Escape) Reset();
        }
        void source_MouseWheel(object sender, MouseWheelEventArgs e)
        {
            // zoom into the content.  Calculate the zoom factor based on the direction of the mouse wheel.
            double zoomFactor = this.DefaultZoomFactor;
            if (e.Delta <= 0) zoomFactor = 1.0 / this.DefaultZoomFactor;
            // DoZoom requires both the logical and physical location of the mouse pointer
            var physicalPoint = e.GetPosition(this);
            DoZoom(zoomFactor, this.transformGroup.Inverse.Transform(physicalPoint), physicalPoint);
        }
        void source_MouseUp(object sender, MouseButtonEventArgs e)
        {
            if (this.IsMouseCaptured)
            {
                // we're done.  reset the cursor and release the mouse pointer
                this.Cursor = Cursors.Arrow;
                this.ReleaseMouseCapture();
            }
        }
        void source_MouseDown(object sender, MouseButtonEventArgs e)
        {
            // Save starting point, used later when determining how much to scroll.
            this.ScreenStartPoint = e.GetPosition(this);
            this.startOffset = new Point(this.translateTransform.X, this.translateTransform.Y);
            this.CaptureMouse();
            this.Cursor = Cursors.ScrollAll;
        }
        void control_MouseMove(object sender, MouseEventArgs e)
        {
            if (this.IsMouseCaptured)
            {
                // if the mouse is captured then move the content by changing the translate transform.
                // use the Pan Animation to animate to the new location based on the delta between the
                // starting point of the mouse and the current point.
                var physicalPoint = e.GetPosition(this);
                this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreatePanAnimation(physicalPoint.X - this.ScreenStartPoint.X + this.startOffset.X), HandoffBehavior.Compose);
                this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreatePanAnimation(physicalPoint.Y - this.ScreenStartPoint.Y + this.startOffset.Y), HandoffBehavior.Compose);
            }
        }
        /// <summary>Helper to create the panning animation for x,y coordinates.</summary>
        /// <param name="toValue">New value of the coordinate.</param>
        /// <returns>Double animation</returns>
        private DoubleAnimation CreatePanAnimation(double toValue)
        {
            var da = new DoubleAnimation(toValue, new Duration(TimeSpan.FromMilliseconds(300)));
            da.AccelerationRatio = 0.1;
            da.DecelerationRatio = 0.9;
            da.FillBehavior = FillBehavior.HoldEnd;
            da.Freeze();
            return da;
        }
        /// <summary>Helper to create the zoom double animation for scaling.</summary>
        /// <param name="toValue">Value to animate to.</param>
        /// <returns>Double animation.</returns>
        private DoubleAnimation CreateZoomAnimation(double toValue)
        {
            var da = new DoubleAnimation(toValue, new Duration(TimeSpan.FromMilliseconds(500)));
            da.AccelerationRatio = 0.1;
            da.DecelerationRatio = 0.9;
            da.FillBehavior = FillBehavior.HoldEnd;
            da.Freeze();
            return da;
        }
        /// <summary>Zoom into or out of the content.</summary>
        /// <param name="deltaZoom">Factor to multiply the zoom level by. </param>
        /// <param name="mousePosition">Logical mouse position relative to the original content.</param>
        /// <param name="physicalPosition">Actual mouse position on the screen (relative to the parent window)</param>
        public void DoZoom(double deltaZoom, Point mousePosition, Point physicalPosition)
        {
            double currentZoom = this.zoomTransform.ScaleX;
            currentZoom *= deltaZoom;
            this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreateZoomAnimation(-1 * (mousePosition.X * currentZoom - physicalPosition.X)));
            this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreateZoomAnimation(-1 * (mousePosition.Y * currentZoom - physicalPosition.Y)));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleXProperty, CreateZoomAnimation(currentZoom));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleYProperty, CreateZoomAnimation(currentZoom));
        }
        /// <summary>Reset to default zoom level and centered content.</summary>
        public void Reset()
        {
            this.translateTransform.BeginAnimation(TranslateTransform.XProperty, CreateZoomAnimation(0));
            this.translateTransform.BeginAnimation(TranslateTransform.YProperty, CreateZoomAnimation(0));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleXProperty, CreateZoomAnimation(1));
            this.zoomTransform.BeginAnimation(ScaleTransform.ScaleYProperty, CreateZoomAnimation(1));
        }
    }
}
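
The translation that DoZoom animates to is what keeps the pixel under the mouse pointer fixed while the scale changes. The sketch below restates that arithmetic for a single axis with plain doubles so it can be checked without WPF. ZoomMath and its method names are hypothetical and not part of the third-party class; the only thing taken from the class is the transform order (scale first, then translate).

```csharp
using System;

// Hypothetical one-axis model of the scale-then-translate transform above.
public static class ZoomMath
{
    // Content (logical) coordinate -> screen (physical) coordinate.
    public static double ToScreen(double logical, double zoom, double translate)
    {
        return logical * zoom + translate;
    }

    // Screen coordinate -> content coordinate (what transformGroup.Inverse does).
    public static double ToLogical(double screen, double zoom, double translate)
    {
        return (screen - translate) / zoom;
    }

    // Same expression as in DoZoom: the translation that puts `logical`
    // back under the cursor at `physical` once the scale is `newZoom`.
    public static double TranslateFor(double logical, double physical, double newZoom)
    {
        return -1 * (logical * newZoom - physical);
    }
}
```

Substituting TranslateFor into ToScreen gives logical * newZoom - (logical * newZoom - physical) = physical, so the content point under the cursor maps back to the same screen position after the zoom.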

2.3 Besides the mouse, sliders can also be used to adjust the image preview:
<WrapPanel Grid.Row="2" Grid.Column="0">
    <Label Name="lab長度" Content="長度:" Margin="3" />
    <Slider Name="sl長度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="400" Value="{Binding ElementName=img圖片, Path=Width, Mode=TwoWay}" />
    <Label Name="lab寬度" Content="寬度:" Margin="3" />
    <Slider Name="sl寬度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="400" Value="{Binding ElementName=img圖片, Path=Height, Mode=TwoWay}" />
    <Label Name="lab透明度" Content="透明度:" Margin="3" />
    <Slider Name="sl透明度" MinWidth="50" Margin="3" VerticalAlignment="Center" Maximum="1" Value="{Binding ElementName=img圖片, Path=Opacity, Mode=TwoWay}" />
    <Label Name="lab拉伸方式" Content="拉伸方式:" Margin="3" />
    <ComboBox Name="txt拉伸方式" Margin="3" MinWidth="85">
        <ComboBoxItem Content="Fill" />
        <ComboBoxItem Content="None" IsSelected="True" />
        <ComboBoxItem Content="Uniform" />
        <ComboBoxItem Content="UniformToFill" />
    </ComboBox>
</WrapPanel>
<local:PanAndZoomViewer Grid.Row="3" Grid.Column="0" Height="300" Margin="3">
    <Image Name="img圖片" Stretch="{Binding ElementName=txt拉伸方式, Path=Text, Mode=TwoWay}" />
</local:PanAndZoomViewer>

2.4 The Tesseract command line cannot OCR a network image directly, so the image is downloaded first:
// Requires: using System.Configuration; using System.IO; using System.Net;
private void fnStartDownload(string v_strImgPath, string v_strOutputDir, out string v_strTmpPath)
{
    // The file name is everything after the last '/' in the image URL.
    string fileName = v_strImgPath.Substring(v_strImgPath.LastIndexOf('/') + 1);
    // Tesseract appends ".txt" itself, so the output base carries no extension.
    this.__OutputFileName = Path.Combine(v_strOutputDir, Path.GetFileNameWithoutExtension(fileName));
    // Make sure the temporary download directory named in App.config exists.
    string Dir = ConfigurationManager.AppSettings["tmpPath"];
    if (!Directory.Exists(Dir))
    {
        Directory.CreateDirectory(Dir);
    }
    v_strTmpPath = Path.Combine(Dir, fileName);
    // "client" was an undeclared member in the original listing; a WebClient is assumed here.
    // DownloadFile replaces an earlier hand-written stream-copy loop.
    using (WebClient client = new WebClient())
    {
        client.DownloadFile(v_strImgPath, v_strTmpPath);
    }
}
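
Taking the text after the last '/' as the file name works for simple image URLs, but a URL with a query string would carry the query into the local file name. The sketch below shows one way to guard against that with System.Uri. The class and method names (DownloadPaths, LocalFileName, OutputBase) are hypothetical, and example.com in the usage note is a placeholder host.

```csharp
using System;
using System.IO;

// Hypothetical helpers for naming the downloaded image and the OCR output.
public static class DownloadPaths
{
    // File name of the image, taken from the URL path only.
    public static string LocalFileName(string imageUrl)
    {
        var uri = new Uri(imageUrl);
        // AbsolutePath excludes the query string and the fragment.
        return Path.GetFileName(Uri.UnescapeDataString(uri.AbsolutePath));
    }

    // Output base for Tesseract: output directory + file name without extension.
    public static string OutputBase(string imageUrl, string outputDir)
    {
        return Path.Combine(outputDir, Path.GetFileNameWithoutExtension(LocalFileName(imageUrl)));
    }
}
```

For instance, DownloadPaths.LocalFileName("http://example.com/img/1.jpg?size=large") returns "1.jpg", whereas the Substring approach would return "1.jpg?size=large".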

2.5 Use Process to invoke the Tesseract command line:
private void fnOCR(string v_strTesseractPath, string v_strSourceImgPath, string v_strOutputPath, string v_strLangPath)
{
    using (Process process = new System.Diagnostics.Process())
    {
        process.StartInfo.FileName = v_strTesseractPath;
        // Quote both paths so that directories containing spaces are passed as single arguments.
        process.StartInfo.Arguments = "\"" + v_strSourceImgPath + "\" \"" + v_strOutputPath + "\" -l " + v_strLangPath;
        process.StartInfo.UseShellExecute = false;
        process.StartInfo.CreateNoWindow = true;
        process.StartInfo.RedirectStandardOutput = true;
        process.Start();
        // Drain the redirected stream before waiting; an unread pipe that fills up would block tesseract.exe.
        process.StandardOutput.ReadToEnd();
        process.WaitForExit();
    }
}
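
The method above only launches the tool; the caller still has to find out whether it succeeded and read the text back. Below is a minimal sketch of one way to do both. TesseractRunner and Run are hypothetical names, and treating a non-zero exit code as failure is an assumption about tesseract.exe rather than something verified here. The result file name follows from the format described in 1.4: Tesseract writes its output to outputbase plus ".txt".

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical wrapper: runs tesseract and returns the recognized text.
public static class TesseractRunner
{
    public static string Run(string tesseractPath, string imagePath, string outputBase, string lang)
    {
        var info = new ProcessStartInfo
        {
            FileName = tesseractPath,
            Arguments = "\"" + imagePath + "\" \"" + outputBase + "\" -l " + lang,
            UseShellExecute = false,
            CreateNoWindow = true,
            RedirectStandardError = true
        };

        // Process.Start throws Win32Exception if the executable cannot be found.
        using (Process process = Process.Start(info))
        {
            // Read stderr to the end before waiting, so a full pipe cannot block the child.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();

            // Assumption: a non-zero exit code means the OCR run failed.
            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "tesseract exited with code " + process.ExitCode + ": " + errors);
        }

        // The recognized text is in <outputbase>.txt.
        return File.ReadAllText(outputBase + ".txt");
    }
}
```

A call such as TesseractRunner.Run("tesseract.exe", "1.jpg", "1", "chi_sim") mirrors the command from 1.5 and returns the contents of 1.txt.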

2.6 Test with a local image.
2.7 Test with a network image.

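Summary:
This time we briefly covered how to use Tesseract. For an open-source, free OCR engine, Chinese support is a rare thing. Although its recognition quality is not ideal, it is good enough for small and medium-sized projects with modest requirements. There is also a list of free OCR tools here for anyone interested. Next time I will test the OCR feature in OneNote 2010 and show how to call its API from your own project.
Reposted from: https://www.cnblogs.com/Crackers/p/4142290.html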