Design of Fuzzy Clustering Algorithm for Massive Cloud Data Based on Cloud Computing Technology
Affiliation: Civil Aviation Flight University of China

    Abstract:

    Cloud data is growing explosively: it is massive in scale, drawn from diverse and heterogeneous sources, structurally complex, and highly dynamic. To process high-dimensional, complex cloud data efficiently and to adapt better to its uncertainty and fuzziness, a fuzzy clustering algorithm for massive cloud data based on cloud computing technology is designed. A cloud-computing-based analysis framework for massive cloud data is built. The master node server uses the random forest algorithm to fuse massive cloud data from multiple heterogeneous sources, splits the fused data, and assigns the resulting slices to the slave node servers. Under the MapReduce model, the computing nodes call the fuzzy K-means algorithm to perform local clustering tasks; the initial cluster centers are optimized with the quantum particle swarm optimization algorithm, after which the clustering results are output. Experimental results show that the method achieves fuzzy clustering of cloud data, with data compactly distributed within clusters and well separated between clusters. After the cluster centers are optimized, the clustering error drops to about 0.10, the separation coefficient is 0.891, and the separation entropy is 10.441. The speedup is highest when the number of computing nodes is 10.
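The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard fuzzy c-means update rules with fuzzifier `m = 2`, simulates the MapReduce split with plain Python functions (`map_slice` and `reduce_partials` are hypothetical names), and takes the initial centers as a fixed guess standing in for the QPSO-optimized centers. The value it returns alongside the centers is the standard partition coefficient, which may differ from the paper's "separation coefficient".

```python
import math
import random


def memberships(point, centers, m=2.0):
    """Fuzzy membership of one point in each cluster (standard fuzzy c-means rule)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 if i == zeros[0] else 0.0 for i in range(len(centers))]
    power = 2.0 / (m - 1.0)
    inv = [d ** -power for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


def map_slice(points, centers, m=2.0):
    """'Map' step: partial sums computed locally on one data slice."""
    k, dim = len(centers), len(centers[0])
    num = [[0.0] * dim for _ in range(k)]  # sum of u^m * x per cluster
    den = [0.0] * k                        # sum of u^m per cluster
    pc = 0.0                               # sum of u^2 (partition coefficient)
    for x in points:
        for j, uj in enumerate(memberships(x, centers, m)):
            w = uj ** m
            den[j] += w
            for d in range(dim):
                num[j][d] += w * x[d]
            pc += uj * uj
    return num, den, pc, len(points)


def reduce_partials(partials):
    """'Reduce' step: merge the partial sums of all slices into new centers."""
    k, dim = len(partials[0][1]), len(partials[0][0][0])
    num = [[0.0] * dim for _ in range(k)]
    den = [0.0] * k
    pc, n = 0.0, 0
    for p_num, p_den, p_pc, p_n in partials:
        for j in range(k):
            den[j] += p_den[j]
            for d in range(dim):
                num[j][d] += p_num[j][d]
        pc += p_pc
        n += p_n
    centers = [[num[j][d] / den[j] for d in range(dim)] for j in range(k)]
    return centers, pc / n


def fuzzy_cmeans(slices, centers, m=2.0, max_iter=100, tol=1e-6):
    """Iterate map and reduce until the centers move by less than tol."""
    coeff = 0.0
    for _ in range(max_iter):
        # On a real cluster each slice would be processed on its own worker node.
        partials = [map_slice(s, centers, m) for s in slices]
        new_centers, coeff = reduce_partials(partials)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    # coeff is measured against the centers used in the last map step.
    return centers, coeff


# Toy data: two synthetic 2-D blobs, split into four slices.
rng = random.Random(0)
data = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)]
data += [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(50)]
slices = [data[i::4] for i in range(4)]
# Assumed initial centers; in the paper these would come from QPSO.
final_centers, coefficient = fuzzy_cmeans(slices, [[1.0, 1.0], [4.0, 4.0]])
print(final_centers, coefficient)
```

Because each slice contributes only per-cluster sums, the reduce step produces the same centers as a single-machine run up to floating-point rounding, which is what makes the update suitable for a map/reduce split.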

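Cite this article

Luo Ping, Zhang Lei. Design of Fuzzy Clustering Algorithm for Massive Cloud Data Based on Cloud Computing Technology[J]. Computer Measurement & Control, 2026, 34(3): 194-200.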

History
  • Received: 2025-08-01
  • Last revised: 2025-09-25
  • Accepted: 2025-09-26
  • Published online: 2026-03-24