Deep nested U-structure network with frequency attention for building semantic segmentation

