C# OCR (Tesseract) traineddata (Tesseract documentation)

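Tesseract is an optical character recognition (OCR) engine that was originally created at Hewlett-Packard and later developed with Google's sponsorship. You can train Tesseract to improve its recognition accuracy, and this post covers material on traineddata (tessdata).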

Here is where you can download Tesseract OCR traineddata files.

Two more sets of official traineddata, trained at Google, are made available in the following GitHub repos.
These sets do not include the legacy models. They contain only LSTM models, which are usable with --oem 1.
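
To illustrate what "usable with --oem 1" means in practice, here is a minimal sketch that builds a tesseract command line selecting the LSTM engine only. It is written in Python for brevity. The helper names (build_tesseract_command, run_ocr), the tessdata folder, and the file names are assumptions for illustration and do not come from the Tesseract documentation; only the CLI options --tessdata-dir, -l, and --oem are real tesseract flags.

```python
import subprocess
from pathlib import Path


def build_tesseract_command(image, output_base, lang="eng", tessdata_dir="tessdata"):
    """Build a tesseract CLI invocation that uses the LSTM engine only.

    tessdata_dir is assumed to hold the downloaded <lang>.traineddata file.
    """
    return [
        "tesseract",
        str(image),          # input image
        str(output_base),    # output base name; tesseract appends ".txt"
        "--tessdata-dir", str(Path(tessdata_dir)),
        "-l", lang,          # language, must match the traineddata file name
        "--oem", "1",        # OCR engine mode 1: LSTM only
    ]


def run_ocr(image, output_base, **kwargs):
    """Run tesseract (must be installed and on PATH) and return the output path."""
    cmd = build_tesseract_command(image, output_base, **kwargs)
    subprocess.run(cmd, check=True)
    return Path(f"{output_base}.txt")
```

For example, run_ocr("scan.png", "result", lang="kor") would read tessdata/kor.traineddata and write result.txt, assuming both the image and the traineddata file exist. Because these traineddata sets contain no legacy model, requesting the legacy engine (--oem 0) with them is expected to fail.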