<!DOCTYPE article PUBLIC '-//NLM//DTD Journal Publishing DTD v2.1 20050630//EN' 'http://uploads.ingentaconnect.com/docs/dtd/ingenta-journalpublishing.dtd'>
<article article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="aggregator">72010604</journal-id>
      <journal-title>Electronic Imaging</journal-title>
      <issn pub-type="ppub">2470-1173</issn><issn pub-type="epub"></issn>
      <publisher>
        <publisher-name>Society for Imaging Science and Technology</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.2352/ISSN.2470-1173.2018.2.VIPC-176</article-id>
      <article-id pub-id-type="sici">2470-1173(20180128)2018:2L.1761;1-</article-id>
      <article-id pub-id-type="publisher-id">s9.phd</article-id>
      <article-id pub-id-type="other">/ist/ei/2018/00002018/00000002/art00009</article-id>
      <article-categories>
        <subj-group>
          <subject>Articles</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Approach for Machine-Printed Arabic Character Recognition: the-state-of-the-art deep-learning method</article-title>
      </title-group>
      <contrib-group>
        <contrib>
          <name>
            <surname>Ko</surname>
            <given-names>Daegun</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>Lee</surname>
            <given-names>Changhyung</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>Han</surname>
            <given-names>Donghyeop</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>Ohk</surname>
            <given-names>Hyeongsu</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>Kang</surname>
            <given-names>Kimin</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>Han</surname>
            <given-names>Seongwook</given-names>
          </name>
        </contrib>
      </contrib-group>
      <pub-date>
        <day>28</day>
        <month>01</month>
        <year>2018</year>
      </pub-date>
      <volume>2018</volume>
      <issue>2</issue>
      <fpage>176-1</fpage>
      <lpage>176-8</lpage>
      <permissions>
        <copyright-year>2018</copyright-year>
      </permissions>
      <abstract>
        <p>Optical character recognition (OCR) automatically recognizes text in an image and converts it into machine-readable codes such as ASCII or Unicode. Although OCR has been studied extensively for other languages, recognizing Arabic remains a challenging problem because of character connection and segmentation issues. In this work, we propose a deep-learning framework for recognizing Arabic characters based on multi-dimensional bidirectional long short-term memory (MD-BLSTM) with connectionist temporal classification (CTC). To train this framework, we generate a dataset of over one million Arabic text-line images that contains Arabic digits, basic Arabic forms in isolated shape, and connected forms. For comparison, we also measure the performance of other OCR software such as Tesseract, developed by Hewlett-Packard and Google Inc.; Tesseract version 3 and version 4 are used. Results show that the deep-learning method outperforms the conventional methods in terms of recognition error rate, although the Tesseract_3.0 system was faster.</p>
      </abstract>
      <kwd-group>
        <kwd>DEEP-LEARNING</kwd>
        <kwd>LONG SHORT-TERM MEMORY</kwd>
        <kwd>CONNECTIONIST TEMPORAL CLASSIFICATION</kwd>
        <kwd>TESSERACT</kwd>
        <kwd>ARABIC CHARACTER RECOGNITION</kwd>
        <kwd>OCR PERFORMANCE</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
