Recently, semantic inference from images has been widely used in various applications, such as augmented reality, autonomous robots, and indoor navigation. As a pioneering work in semantic segmentation, the fully convolutional network (FCN) was introduced and outperformed traditional methods. However, since FCN takes into account only local contextual dependencies, it does not capture global contextual dependencies. In this paper, we explore variants of FCN that incorporate both local and global contextual dependencies for the semantic segmentation problem. In addition, we attempt to improve the performance of semantic segmentation with extra depth information from a commercial RGB-D camera. Our experimental results indicate that exploiting global contextual dependencies and additional depth information improves the quality of semantic segmentation.