Infrared and visible image fusion based on transformer and spatial attention model

Fund Project: Supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region (No. 2021D01C077).

Abstract:

Many researchers have applied convolutional neural networks (CNNs) to infrared and visible image fusion and achieved good fusion results. Many of these methods are built on autoencoder architectures: the network is trained in a self-supervised manner, and at test time a hand-designed fusion strategy is used to fuse the extracted features. However, existing autoencoder-based methods rarely make full use of both shallow and deep features, and CNNs are limited by their receptive field, which makes long-range dependencies difficult to establish and causes global information to be lost. The Transformer, by contrast, relies on the self-attention mechanism to establish long-range dependencies and to capture global contextual information effectively. As for fusion strategies, most existing designs are coarse and do not specifically account for the characteristics of the different image modalities. This paper therefore combines a CNN and a Transformer in the encoder so that the encoder can extract more comprehensive features, and applies an attention model in the fusion strategy to refine the features at a finer granularity. Experimental results show that the proposed fusion algorithm achieves excellent results in both subjective and objective evaluations compared with other image fusion algorithms.
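
The abstract does not state the exact form of the attention-based fusion strategy, so the following is only a minimal sketch of one common spatial-attention rule used with autoencoder-based fusion, not the authors' implementation. It assumes each encoder output is a feature map of shape [C][H][W], measures per-pixel activity as the L1 norm across channels, and turns the two activity values into softmax weights. The function name `spatial_attention_fuse` and the nested-list representation are hypothetical choices made for illustration.

```python
import math


def spatial_attention_fuse(feat_ir, feat_vis):
    """Fuse two feature maps given as nested lists of shape [C][H][W].

    Assumption (not from the paper): activity is the channel-wise L1 norm
    at each pixel, and the two activities are normalised with a softmax.
    """
    channels = len(feat_ir)
    height = len(feat_ir[0])
    width = len(feat_ir[0][0])
    fused = [[[0.0] * width for _ in range(height)] for _ in range(channels)]

    for y in range(height):
        for x in range(width):
            # Per-pixel activity level: L1 norm over the channel axis.
            act_ir = sum(abs(feat_ir[c][y][x]) for c in range(channels))
            act_vis = sum(abs(feat_vis[c][y][x]) for c in range(channels))

            # Two-way softmax; subtracting the max keeps exp() stable.
            peak = max(act_ir, act_vis)
            exp_ir = math.exp(act_ir - peak)
            exp_vis = math.exp(act_vis - peak)
            w_ir = exp_ir / (exp_ir + exp_vis)
            w_vis = 1.0 - w_ir

            # Weighted sum of the two modalities at this pixel.
            for c in range(channels):
                fused[c][y][x] = (
                    w_ir * feat_ir[c][y][x] + w_vis * feat_vis[c][y][x]
                )
    return fused


# Toy input: 2 channels, 1 row, 2 columns.
ir = [[[1.0, 0.0]], [[1.0, 0.0]]]
vis = [[[0.0, 0.0]], [[0.0, 0.0]]]
result = spatial_attention_fuse(ir, vis)
```

In this toy input the infrared features are active only at the first pixel, so that pixel is weighted toward the infrared branch, while the second pixel (equal activity in both branches) receives equal weights. A real system would apply the same rule to tensors produced by the CNN and Transformer encoder and pass the fused map to the decoder.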

Cite this article

GENG Jun, WU Zi-hao, LI Wen-hai, LI Xiao-yu. Infrared and visible image fusion based on transformer and spatial attention model[J]. LASER & INFRARED, 2024, 54(3): 457-465

History
  • Last revised: 2023-04-15
  • Published online: 2024-03-22