English Text Classification Model Combining Bert and Bi-LSTM

Fund Project: National Natural Science Foundation (61502290)

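Abstract (translated from the Chinese):

Text classification is one of the fundamental tasks in natural language processing (NLP) and provides important support for upstream tasks. As deep learning has been widely applied to both upstream and downstream NLP tasks in recent years, it has also performed well on the downstream task of text classification. However, current models based on deep learning networks still fall short in capturing and modeling the long-distance contextual semantic information of text sequences, and they do not introduce linguistic information to assist the classifier. To address these problems, a novel English text classification model combining Bert and Bi-LSTM is proposed. The model not only improves classification accuracy by introducing linguistic information through the Bert pre-trained language model, but also explicitly models the text by using a Bi-LSTM network to capture bidirectional contextual semantic dependencies. Specifically, the model consists of an input layer, a Bert pre-trained language model layer, a Bi-LSTM layer, and a classifier layer. Experimental results show that, compared with existing classification models, the proposed Bert-Bi-LSTM model achieves the highest classification accuracy on the MR, SST-2, and CoLA datasets, at 86.2%, 91.5%, and 83.2% respectively, greatly improving the performance of English text classification.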

Abstract:

As a downstream natural language processing task, text classification has important auxiliary value for upstream tasks. As deep learning has been widely used in both upstream and downstream NLP tasks in recent years, deep neural networks have also been applied to text classification. However, current models based on convolutional neural networks cannot model the contextual semantic information of a text sequence well, and they do not introduce linguistic information to assist the classifier. To solve these problems, a novel English text classification model combining Bert and Bi-LSTM is proposed. The proposed model not only boosts classification performance by introducing linguistic information through the Bert pre-trained language model, but also explicitly models the text by capturing bidirectional contextual semantic dependencies with a Bi-LSTM network. Specifically, the model is composed of an input layer, a Bert pre-trained language model layer, a Bi-LSTM layer, and a classifier layer. Extensive experimental results demonstrate that, compared with the baseline models, the proposed Bert-Bi-LSTM model achieves the highest classification accuracy on the MR, SST-2, and CoLA datasets, at 86.2%, 91.5%, and 83.2% respectively, which greatly improves the performance of English text classification.
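The four-layer structure named in the abstract (input, Bert, Bi-LSTM, classifier) can be sketched in PyTorch roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the class name `BertBiLSTMClassifier`, the hidden size, and the choice to concatenate the final forward and backward LSTM states are all hypothetical, and a small `nn.Embedding` stands in for the Bert layer so the example runs without downloading pre-trained weights. Padding masks are ignored for brevity.

```python
import torch
import torch.nn as nn


class BertBiLSTMClassifier(nn.Module):
    """Encoder -> Bi-LSTM -> linear classifier (illustrative sketch only)."""

    def __init__(self, encoder, encoder_dim, lstm_hidden=128, num_classes=2):
        super().__init__()
        # Assumption: `encoder` maps token ids (B, T) to vectors (B, T, encoder_dim).
        self.encoder = encoder
        self.bilstm = nn.LSTM(encoder_dim, lstm_hidden,
                              batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids):
        token_vectors = self.encoder(input_ids)      # (B, T, encoder_dim)
        _, (h_n, _) = self.bilstm(token_vectors)     # h_n: (2, B, lstm_hidden)
        # Assumed pooling: join the last forward state and the last backward state.
        sentence_vector = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.classifier(sentence_vector)      # (B, num_classes) logits


# Stand-in for the Bert layer: a plain embedding table (NOT a pre-trained model).
toy_encoder = nn.Embedding(num_embeddings=1000, embedding_dim=32)
model = BertBiLSTMClassifier(toy_encoder, encoder_dim=32)

input_ids = torch.randint(0, 1000, (4, 16))  # batch of 4 sequences, 16 token ids each
logits = model(input_ids)
```

To use an actual pre-trained encoder, one option would be to wrap a Hugging Face `BertModel` so that it returns `last_hidden_state` and set `encoder_dim` to its hidden size; whether the paper fine-tunes or freezes Bert is not stated in the abstract.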

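Cite this article

Zhang Weina (张卫娜). English Text Classification Model Combining Bert and Bi-LSTM[J]. Computer Measurement & Control (计算机测量与控制), 2023, 31(4): 213-218.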

History

Received: 2022-08-27
Last revised: 2022-09-27
Accepted: 2022-09-27
Published online: 2023-04-24
