Software Defect Prediction via Mixed Attention Mechanism
Affiliation: Shanghai Electro-Mechanical Engineering Institute (上海机电工程研究所)

    Abstract:

    Software defect prediction locates code modules that are likely to contain defects, helping developers focus their testing and repair effort. Traditional defect features are hand-crafted static code metrics based on software size, complexity, and language characteristics. These metrics cannot directly capture defect information in the program context, which limits prediction performance. To make full use of the syntactic and semantic information in program context, this paper proposes DP-MHA (Defect Prediction via Mixed Attention Mechanism). DP-MHA first extracts a syntactic and semantic token sequence from the abstract syntax tree (AST) of each program module and applies word embedding and positional encoding. It then learns contextual syntactic and semantic information with a multi-head attention mechanism. Finally, it uses a global attention mechanism to extract the key syntactic and semantic features, which are used to build a defect prediction model and identify potentially defective code modules. To evaluate DP-MHA, we select six open-source Apache Java projects and compare it with five baselines: a classical static-metric method based on random forest (RF), two unsupervised feature-learning methods (RBM+RF and DBN+RF), and two deep learning methods based on CNN and RNN. The experimental results show that DP-MHA improves the F1-measure over these five baselines by 16.6%, 34.3%, 26.4%, 7.1%, and 4.9%, respectively.
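    The pipeline described in the abstract (embedding plus positional encoding, multi-head self-attention, then global attention pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the encoding, the attention form, or the classifier, so the sinusoidal positional encoding, the additive (tanh) pooling score, the sigmoid output, the random untrained weights, and the token ids standing in for AST node types are all assumptions made for illustration.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding (assumed; the abstract does not name one)."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])  # even dimensions
    pe[:, 1::2] = np.cos(angle[:, 1::2])  # odd dimensions
    return pe

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention over one token sequence."""
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    def split(m):  # (seq_len, d_model) -> (num_heads, seq_len, d_head)
        return m.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    context = softmax(scores) @ v                        # (heads, seq, d_head)
    merged = context.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_o

def global_attention_pool(h, w, u):
    """Score every token, normalise over the sequence, return the weighted sum."""
    scores = np.tanh(h @ w) @ u      # one score per token
    alpha = softmax(scores, axis=0)  # attention weights over tokens
    return alpha @ h, alpha

rng = np.random.default_rng(0)
vocab_size, d_model, num_heads = 50, 16, 4
token_ids = np.array([3, 17, 17, 8, 42, 5])  # hypothetical AST node-type ids

# Random, untrained parameters: placeholders for weights a real model would learn.
embedding = rng.normal(scale=0.1, size=(vocab_size, d_model))
w_q, w_k, w_v, w_o = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(4))
w_att = rng.normal(scale=0.1, size=(d_model, d_model))
u_att = rng.normal(scale=0.1, size=d_model)
w_out = rng.normal(scale=0.1, size=d_model)

x = embedding[token_ids] + positional_encoding(len(token_ids), d_model)
h = multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads)
feature, alpha = global_attention_pool(h, w_att, u_att)
defect_prob = 1.0 / (1.0 + np.exp(-(feature @ w_out)))  # assumed sigmoid classifier head
print(alpha.round(3), float(defect_prob))
```

    In this sketch the weights `alpha` show how much each token contributes to the module-level feature vector, which is the role the abstract assigns to the global attention stage. With untrained weights the printed probability carries no meaning.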

Cite this article

刁旭炀, 吴凯, 陈都, 周俊峰, 高璞. Software defect prediction via mixed attention mechanism [J]. Computer Measurement & Control (计算机测量与控制), 2023, 31(3): 56-62.

History
  • Received: 2022-07-31
  • Last revised: 2022-09-04
  • Accepted: 2022-09-05
  • Published online: 2023-03-15