Change detection in image pairs has traditionally been a binary process, reporting either “Change” or “No Change.” In this paper, we present LambdaNet, a novel deep architecture for pixel-level directional change detection based on a four-class classification scheme. LambdaNet incorporates the notion of “directional change” and identifies differences between two images as “Additive Change” when a new object appears, “Subtractive Change” when an object is removed, “Exchange” when different objects are present in the same location, and “No Change” otherwise. To obtain pixel-annotated change maps for training, we generated directional change class labels for the Change Detection 2014 dataset. Our experiments show that LambdaNet is suitable for situations where the type of change is unstructured, such as change detection in satellite imagery.
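To make the four-class scheme concrete, the following is a minimal sketch of how directional change labels could be derived when instance-labeled masks are available for both images. The function name, the class encoding, and the use of instance IDs to distinguish “Exchange” are illustrative assumptions, not the paper's labeling pipeline; LambdaNet itself predicts these classes directly from the image pair.

```python
import numpy as np

# Illustrative class encoding (assumed, not from the paper).
NO_CHANGE, ADDITIVE, SUBTRACTIVE, EXCHANGE = 0, 1, 2, 3

def directional_change_map(ref_ids: np.ndarray, cur_ids: np.ndarray) -> np.ndarray:
    """Derive a four-class directional change map from two integer
    instance maps (0 = background, >0 = object instance ID)."""
    change = np.full(ref_ids.shape, NO_CHANGE, dtype=np.uint8)
    change[(ref_ids == 0) & (cur_ids != 0)] = ADDITIVE      # object appears
    change[(ref_ids != 0) & (cur_ids == 0)] = SUBTRACTIVE   # object removed
    change[(ref_ids != 0) & (cur_ids != 0)
           & (ref_ids != cur_ids)] = EXCHANGE               # different object at same pixel
    return change
```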
By combining terrestrial panorama images and aerial imagery, or by using LiDAR, large 3D point clouds can be generated for 3D city modeling. We describe an algorithm for change detection in point clouds with three new contributions: change detection between LOD2 models and 3D point clouds, the application of detected changes to create extended and textured LOD2 models, and change detection between point clouds from different years. Overall, LOD2 model-to-point-cloud changes are reliably found in practice, and the algorithm achieves a precision of 0.955 and a recall of 0.983 on a synthetic dataset. Although the model is not watertight, the texturing results are visually promising and improve over directly textured LOD2 models.
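The abstract does not detail the comparison step, but a common baseline for model-to-point-cloud change detection is nearest-neighbor distance thresholding against a dense sampling of the model surface. The sketch below illustrates that baseline only; the function name and the 0.5 m threshold are assumed values, not taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def flag_changed_points(model_points: np.ndarray,
                        scan_points: np.ndarray,
                        threshold: float = 0.5) -> np.ndarray:
    """Mark scan points farther than `threshold` (metres) from the
    LOD2 model as candidate changes. `model_points` is assumed to be
    a dense point sampling of the model faces."""
    tree = cKDTree(model_points)               # spatial index over the model sampling
    dists, _ = tree.query(scan_points, k=1)    # distance to nearest model point
    return dists > threshold                   # boolean mask: True = candidate change
```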
Change detection from ground vehicles has various applications, such as the detection of roadside Improvised Explosive Devices (IEDs). Although IEDs are hidden, they are often accompanied by visible markers, which can be any kind of object. Because of this, any suspicious change in the environment compared to an earlier moment in time should be detected. Little work has been published on solving this ill-posed problem using deep learning. This paper shows the feasibility of applying convolutional neural networks (CNNs) to HD video to accurately predict the presence and location of such markers in real time. The network is trained to detect pixel-level changes in HD video with respect to an earlier reference recording. We investigate Siamese CNNs in combination with an encoder-decoder architecture and introduce a modified double-margin contrastive loss function to achieve pixel-level change detection results. Our dataset consists of seven pairs of challenging real-world recordings with geo-tagged test objects. The proposed network architecture compares two images of 1920×1440 pixels in 150 ms on a GTX 1080 Ti GPU, and significantly outperforms state-of-the-art networks and algorithms on our dataset, improving the F1 score by 0.28 on average.
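For reference, a generic double-margin contrastive loss applied per pixel looks as follows; the margin values and the exact per-pixel distance are assumptions, since the paper's specific modification is not spelled out in the abstract.

```python
import torch

def double_margin_contrastive_loss(d: torch.Tensor,
                                   y: torch.Tensor,
                                   m_pos: float = 0.3,
                                   m_neg: float = 2.0) -> torch.Tensor:
    """Generic double-margin contrastive loss (the paper's variant may differ).
    d: per-pixel L2 distance between the two embedding maps;
    y: per-pixel label, 1 = changed, 0 = unchanged.
    Unchanged pixels are only penalized beyond m_pos; changed pixels are
    pushed apart up to m_neg, avoiding over-contraction of already-similar pixels."""
    pos = (1 - y) * torch.clamp(d - m_pos, min=0).pow(2)   # unchanged pairs
    neg = y * torch.clamp(m_neg - d, min=0).pow(2)         # changed pairs
    return (pos + neg).mean()
```

In a Siamese encoder-decoder setting, `d` would typically be computed as the channel-wise L2 norm between the two decoder feature maps, e.g. `d = torch.norm(f_ref - f_cur, dim=1)`.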
This paper presents a novel method for 3D scene modeling using stereo vision, with an application to image registration. The method consists of two steps. First, disparity estimates are refined by filling gaps of invalid disparity and removing halos of incorrectly assigned disparity. A coarse segmentation is then obtained by identifying depth slices, after which objects are clustered based on color and texture information using Gabor filters. Second, the resulting objects are reconstructed in 3D for scene alignment by fitting a planar region to each object. A 2D triangle mesh is generated, and a 3D mesh model is obtained by projecting each triangle onto the fitted plane. Both extensions improve alignment quality with respect to the state of the art and operate in near real time using multi-threading. In addition, the refined disparity map can also be used in combination with the existing method.
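The plane-projection step admits a compact formulation: each 2D mesh vertex defines a viewing ray, which is intersected with the fitted plane. The sketch below illustrates this under the assumption of a calibrated camera with intrinsics K; the function name and parametrization are illustrative, and the paper's actual reconstruction may differ.

```python
import numpy as np

def lift_to_plane(vertices_2d: np.ndarray,
                  plane_normal: np.ndarray,
                  plane_d: float,
                  K_inv: np.ndarray) -> np.ndarray:
    """Back-project 2D mesh vertices (pixel coordinates, shape (N, 2)) onto a
    fitted 3D plane n.X + d = 0 by intersecting each viewing ray with the plane.
    K_inv is the inverse 3x3 camera intrinsic matrix (assumed known)."""
    ones = np.ones((vertices_2d.shape[0], 1))
    rays = (K_inv @ np.hstack([vertices_2d, ones]).T).T   # ray direction per vertex
    t = -plane_d / (rays @ plane_normal)                  # ray-plane intersection depth
    return rays * t[:, None]                              # 3D vertex positions
```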