<!DOCTYPE article PUBLIC '-//NLM//DTD Journal Publishing DTD v2.1 20050630//EN' 'http://uploads.ingentaconnect.com/docs/dtd/ingenta-journalpublishing.dtd'>
<article article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="aggregator">72010604</journal-id>
      <journal-title>Electronic Imaging</journal-title>
      <issn pub-type="ppub">2470-1173</issn><issn pub-type="epub"></issn>
      <publisher>
        <publisher-name>Society for Imaging Science and Technology</publisher-name>
        <publisher-loc>7003 Kilworth Lane, Springfield, VA 22151 USA</publisher-loc>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.2352/ISSN.2470-1173.2020.10.IPAS-313</article-id>
      <article-id pub-id-type="sici">2470-1173(20200126)2020:10L.3131;1-</article-id>
      <article-id pub-id-type="publisher-id">ei_24701173_v2020n10_input/s25.xml</article-id>
      <article-id pub-id-type="other">/ist/ei/2020/00002020/00000010/art00024</article-id>
      <article-categories>
        <subj-group>
          <subject>Articles</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Multiscale Convolutional Descriptor Aggregation for Visual Place Recognition</article-title>
      </title-group>
      <contrib-group>
        <contrib>
          <name>
            <surname>Imbriaco</surname>
            <given-names>Raffaele</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>Bondarev</surname>
            <given-names>Egor</given-names>
          </name>
        </contrib>
        <contrib>
          <name>
            <surname>With</surname>
            <given-names>Peter H.N. de</given-names>
          </name>
        </contrib>
      </contrib-group>
      <pub-date>
        <day>26</day>
        <month>01</month>
        <year>2020</year>
      </pub-date>
      <volume>2020</volume>
      <issue>10</issue>
      <fpage>313-1</fpage>
      <lpage>313-7</lpage>
      <permissions>
        <copyright-year>2020</copyright-year>
      </permissions>
      <abstract>
        <p>
          <italic>Visual place recognition using query and database images from different sources remains a challenging task in computer vision. Our method exploits global descriptors for efficient image matching and local descriptors for geometric verification. We present a novel multi-scale aggregation method for local convolutional descriptors, using memory vector construction for efficient aggregation. The method makes it possible to find a preliminary set of candidate image matches and to remove visually similar but erroneous candidates. We deploy the multi-scale aggregation for visual place recognition on three large-scale datasets. We obtain a Recall@10 larger than 94% for the Pittsburgh dataset, outperforming other popular convolutional descriptors used in image retrieval and place recognition. Additionally, we provide a comparison of these descriptors on a more challenging dataset containing query and database images obtained from different sources, achieving over 77% Recall@10.</italic>
        </p>
      </abstract>
      <kwd-group>
        <kwd>Deep learning</kwd>
        <kwd>Visual place recognition</kwd>
        <kwd>Local descriptors</kwd>
      </kwd-group>
    </article-meta>
  </front>
</article>
