Land surface temperature data derived from satellite remote sensing are of great practical significance for urban planning, ecological protection, and sustainable development. However, because of limitations in satellite observation modes and resolution, the accuracy and completeness of retrieved land surface temperature data need to be further improved through data reconstruction. Traditional reconstruction methods such as interpolation struggle to account for temporal dependence and spatial heterogeneity. Taking the Miyun Reservoir in Beijing as an example, and based on an analysis of the temporal and spatial distribution characteristics of surface temperature, this paper proposes a reconstruction method that combines multi-data fusion with prediction based on a spatiotemporal bidirectional attention mechanism. First, time series features are extracted from temperature series near meteorological stations, and a temporal attention function outputs enhanced time series features for the current region. Then, a neural network extracts more accurate microclimate boundary features from high-definition satellite images, and a spatial attention function outputs enhanced image boundary features for the current region. Finally, the enhanced features of the two channels are fused by a Long Short-Term Memory network. The results show the following: (1) microclimate characteristics have a clear influence on the prediction results: predictions are more stable within areas that share the same microclimate characteristics, whereas predictions in microclimate boundary areas fluctuate greatly; (2) the spatial attention mechanism significantly suppresses outliers across different coverage features, which reflects the optimization effect of fusing high-precision satellite images.
These results indicate that the developed method makes fuller use of remote sensing image inversion to obtain high-precision, continuous surface temperature data, which is valuable for studying the microclimate characteristics of urban lakes and for sustainable development.
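The temporal attention step described above can be illustrated with a minimal sketch, assuming dot-product scoring against a query vector followed by a softmax. The abstract does not specify the scoring function, so the name `temporal_attention`, the choice of query, and the toy feature values are all hypothetical and are not the authors' implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(features, query):
    """Weight per-time-step feature vectors by their similarity to a query.

    features: list of T feature vectors (each a list of D floats).
    query:    list of D floats (assumption: here, the latest time step).
    Returns (weights, context), where context is the weighted sum of features.
    """
    # Dot-product score between each time step and the query (assumed scoring).
    scores = [sum(f_d * q_d for f_d, q_d in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(query)
    # Enhanced feature: attention-weighted sum over time steps.
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return weights, context


# Toy example: three time steps of two-dimensional features (made-up values).
features = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.8]]
query = features[-1]
weights, context = temporal_attention(features, query)
```

In this sketch the time steps most similar to the query receive the largest weights, so the resulting context vector emphasizes them. A spatial attention function could follow the same pattern over image locations instead of time steps.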
Object detection and video single-frame detection have seen substantial advancements in recent years, particularly with deep-learning-based approaches demonstrating strong performance. However, these detectors often struggle in practical scenarios such as the analysis of video frames captured by unmanned aerial vehicles, especially for objects with small area, large scale variation, dense distribution, or motion blur. To address these challenges, we propose a new feature extraction network, the Attention-based Weighted Fusion Network. The network incorporates a Self-Attention Residual Block to enhance its feature extraction capability. To locate and identify objects of interest more accurately, we introduce a Mixed Attention Module, which improves detection accuracy. Additionally, we assign an adaptive learnable weight to each feature map so that feature maps of different resolutions contribute appropriately during feature fusion. We evaluate our method on two datasets: PASCAL VOC and VisDrone2019. Experimental results show that the proposed method outperforms the baseline and other detectors. It achieves 87.1% mean average precision on the PASCAL VOC 2007 test set and surpasses the baseline by 3.1% AP50. It also exhibits lower false detection and missed detection rates than other detectors.
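The adaptive weighted fusion described above can be sketched as follows, assuming one scalar weight per feature map, clipped to be non-negative and normalized by the sum of the weights (a normalization similar to the fast normalized fusion used in BiFPN). The abstract does not state the normalization, so this scheme, the function name `weighted_fusion`, and the toy values are assumptions. The feature maps are assumed to have already been resized to a common resolution, and the gradient updates that would make the weights learnable are omitted.

```python
def weighted_fusion(feature_maps, raw_weights, eps=1e-4):
    """Fuse same-sized 2D feature maps with normalized non-negative weights.

    feature_maps: list of N maps, each a list of rows of floats (same shape).
    raw_weights:  list of N scalars (learnable in a real network; fixed here).
    eps:          small constant that avoids division by zero.
    """
    clipped = [max(w, 0.0) for w in raw_weights]  # ReLU keeps weights >= 0
    norm = sum(clipped) + eps
    weights = [w / norm for w in clipped]
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    # Element-wise weighted sum across the N feature maps.
    fused = [
        [
            sum(w * fmap[r][c] for w, fmap in zip(weights, feature_maps))
            for c in range(cols)
        ]
        for r in range(rows)
    ]
    return weights, fused


# Toy example: two 2x2 maps, assumed already resized to the same resolution.
low_res = [[1.0, 1.0], [1.0, 1.0]]
high_res = [[3.0, 3.0], [3.0, 3.0]]
weights, fused = weighted_fusion([low_res, high_res], raw_weights=[1.0, 3.0])
```

With these made-up weights the second map contributes about three times as much as the first, which is the kind of resolution-dependent emphasis that learnable fusion weights are meant to provide.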