A hybrid data classification method based on mining the information of data pattern structure

Fund Project: National Natural Science Foundation of China (81701793), Changzhou Science and Technology Program (CJ20160010), Doctoral Fund of Changzhou Vocational Institute of Light Industry (BSJJ13101010).

    Abstract:

    Samples in a dataset are often correlated with one another, so that the data as a whole exhibit a specific pattern structure. Traditional classification methods such as the support vector machine (SVM), however, ignore this correlation information. They build the classification model only from the physical features of the training samples (e.g., distance or similarity) and, in the prediction phase, assign a label to a test sample according to its similarity to the built model. To address this reliance on a single kind of data information, a hybrid data classification method based on mining the information of data pattern structure (HDCM) is proposed. The method combines two different types of classification techniques: traditional methods that use only the physical features of the data serve as the common classification method, and a method that exploits the information of data pattern structure serves as the advanced classification method. In particular, the proposed method not only identifies data pattern structure information effectively, which improves classification performance, but also improves the generalization ability of traditional classification methods. Extensive experiments on synthetic datasets and real-world UCI datasets demonstrate the effectiveness of the proposed method, whose classification performance is better than that of traditional classification methods.
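The abstract does not state how the common and advanced classifiers are combined, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' HDCM. It assumes a nearest-centroid rule as the distance-based (common) term, the mean nearest-neighbour distance of a class as a stand-in structure measure for the advanced term, and a convex combination weighted by a hypothetical parameter `lam`. All function names and the toy data are illustrative.

```python
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def normalize(scores):
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

def common_scores(train, x):
    """Physical-feature term (assumed): inverse distance to each class centroid."""
    scores = {}
    for label, pts in train.items():
        centroid = [sum(col) / len(pts) for col in zip(*pts)]
        scores[label] = 1.0 / (dist(x, centroid) + 1e-6)
    return normalize(scores)

def mean_nn_distance(pts):
    """Structure measure (assumed): mean distance to the nearest neighbour."""
    return sum(
        min(dist(p, q) for j, q in enumerate(pts) if j != i)
        for i, p in enumerate(pts)
    ) / len(pts)

def advanced_scores(train, x):
    """Structure term (assumed): the class whose structure measure changes
    least when x is temporarily inserted receives the highest score."""
    scores = {}
    for label, pts in train.items():
        change = abs(mean_nn_distance(pts + [x]) - mean_nn_distance(pts))
        scores[label] = 1.0 / (change + 1e-6)
    return normalize(scores)

def hybrid_predict(train, x, lam=0.5):
    """Convex combination of both terms; lam is a hypothetical mixing weight."""
    low = common_scores(train, x)
    high = advanced_scores(train, x)
    mixed = {c: (1.0 - lam) * low[c] + lam * high[c] for c in train}
    return max(mixed, key=mixed.get), mixed

# Toy data: an evenly spaced line of points and a tight cluster near its end.
train = {
    "line": [(float(i), 0.0) for i in range(11)],
    "cluster": [(9.0, 1.5), (9.2, 1.5), (9.0, 1.7), (9.2, 1.7)],
}
x = (11.0, 0.0)  # continues the line, but lies closer to the cluster centroid

print(hybrid_predict(train, x, lam=0.0)[0])  # distance-based term only
print(hybrid_predict(train, x, lam=0.5)[0])  # structure term included
```

In this toy case the test point extends the line pattern but is nearer to the cluster centroid, so the distance-based term alone picks the cluster, while the combined score picks the line. The choice of structure measure and of `lam` is arbitrary here and would need to be replaced by whatever the full paper specifies.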

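Cite this article

WANG Huiyu, GU Suhang. A hybrid data classification method based on mining the information of data pattern structure[J]. Computer Measurement & Control, 2019, 27(4): 190-197.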

History
  • Received: 2018-10-16
  • Last revised: 2018-10-25
  • Accepted: 2018-10-25
  • Published online: 2019-04-26