Although the concept of regions of interest (ROIs) is well known in video analysis, the ROI extraction problem has hardly been addressed in maritime surveillance, particularly for vessel detection and tracking. A video captured by a maritime surveillance camera may contain irrelevant regions, such as shorelines, bridges and piers. As a result, non-relevant moving objects (e.g., cars moving along the shoreline) can be mistakenly detected by a vessel surveillance system. This paper proposes a robust water region extraction method based on spatiotemporally oriented energy features in combination with a mean shift clustering algorithm. The method targets not only conventional RGB surveillance data, but also data from thermal cameras. Experimental results show that the pixel-wise water segmentation recall is 95.23% on average for the RGB images and 94.29% on average for the thermal images, even in the presence of islands or other complex shoreline shapes. The measured average precisions are 93.88% and 95.41% for the RGB and thermal datasets, respectively.
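The pixel-wise recall and precision reported above can be read as follows: recall is the fraction of true water pixels that the segmentation labels as water, and precision is the fraction of pixels labelled as water that really are water. The sketch below illustrates this computation under the standard definitions. The function name `water_precision_recall` and the small masks are hypothetical and are not taken from the paper's evaluation code or datasets.

```python
def water_precision_recall(predicted, ground_truth):
    """Pixel-wise precision and recall for the water class.

    Both arguments are 2-D lists of booleans with the same shape,
    where True marks a pixel labelled as water.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for pred, true in zip(pred_row, true_row):
            if pred and true:
                tp += 1  # water correctly labelled as water
            elif pred and not true:
                fp += 1  # non-water (e.g. shoreline) labelled as water
            elif true and not pred:
                fn += 1  # water pixel that was missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical 3x4 masks (illustration only): the prediction misses one
# water pixel and wrongly labels one shoreline pixel as water.
ground_truth = [
    [False, False, False, False],
    [True,  True,  True,  True],
    [True,  True,  True,  True],
]
predicted = [
    [False, False, True,  False],
    [True,  True,  True,  True],
    [True,  True,  True,  False],
]
precision, recall = water_precision_recall(predicted, ground_truth)
```

In this toy case there are 7 true positives, 1 false positive and 1 false negative, so both precision and recall equal 7/8. Averaging such per-image values over a dataset is one plausible way to obtain figures of the kind quoted above, although the exact averaging scheme is not stated here.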