Load Shedding Methods for Big Data Stream with Sparsity

doi:10.11908/j.issn.0253-374x.19054

Journal of Tongji University (Natural Science), 2020, Vol. 48, No. 02, pp. 276-286
                    
Authors:

WANG Shun (王顺)
College of Electronics and Information Engineering, Tongji University, Shanghai 200092, China; Tongji Branch, National Engineering & Technology Center of High Performance Computer, Tongji University, Shanghai 201804, China

LI Zhenxing (李振星)
Beijing AgileCentury Information Technology Co., Ltd., Beijing 100085, China

LIAN Zengshen (连增申)
Beijing AgileCentury Information Technology Co., Ltd., Beijing 100085, China

ZENG Guosun (曾国荪)
College of Electronics and Information Engineering, Tongji University, Shanghai 200092, China; Tongji Branch, National Engineering & Technology Center of High Performance Computer, Tongji University, Shanghai 201804, China

DING Chunling (丁春玲)
College of Chemical Science and Engineering, Tongji University, Shanghai 200092, China

CLC number: TP338
Funding: National Social Science Fund (17BQT086); subproject of the National Seafloor Scientific Observation System (2970000001/001/016); CCF Information Systems Open Project (CCFIS2018-01-03).

Abstract:

Improving the accuracy of load shedding for big data streams while preserving real-time performance is an important problem. Sparsity is a widespread feature of big data streams, so this paper studies load shedding for sparse big data streams in two typical scenarios. For ordinary, uniform business workloads, the stream is modeled in a high-dimensional space, an elastic distance is used to scale the distances between data items, and a load shedding method based on eccentricity is proposed. For anomaly-detection workloads, the features of the stream are analyzed, a preprocessing automaton is used to describe how the data is processed dynamically, and a load shedding method based on equivalence-class partitioning is proposed. The partition relies on a combined similarity that accounts for both the similarity of the data and the similarity of the processing behavior. Repeated experiments show that the two proposed methods significantly improve shedding accuracy compared with conventional load shedding methods.

Keywords: big data stream; load shedding; sparsity; elastic distance; behavior similarity
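
The equivalence-class idea in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not define the combined similarity, the preprocessing automaton, or the partitioning rule. The sketch assumes that each item carries a sparse vector and a trace of processing steps, that data similarity is cosine similarity, that behavior similarity is the fraction of matching trace positions, and that items are grouped greedily against a threshold. The names `combined_similarity`, `shed_by_classes`, `alpha`, and `threshold` are hypothetical.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {index: value} dicts."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def behavior_similarity(trace_a, trace_b):
    """Assumed measure: share of aligned positions where two traces agree."""
    longest = max(len(trace_a), len(trace_b))
    if longest == 0:
        return 1.0
    matches = sum(1 for x, y in zip(trace_a, trace_b) if x == y)
    return matches / longest


def combined_similarity(item_a, item_b, alpha=0.5):
    """Assumed weighting: alpha on data similarity, the rest on behavior."""
    data_sim = cosine_similarity(item_a["vector"], item_b["vector"])
    behav_sim = behavior_similarity(item_a["trace"], item_b["trace"])
    return alpha * data_sim + (1 - alpha) * behav_sim


def shed_by_classes(window, threshold=0.9, alpha=0.5):
    """Greedy grouping: keep the first item of each group, shed the rest.

    A thresholded similarity is not transitive, so this only approximates
    a true equivalence-class partition.
    """
    representatives, shed = [], []
    for item in window:
        if any(combined_similarity(item, rep, alpha) >= threshold
               for rep in representatives):
            shed.append(item)  # close to a kept item: treated as redundant
        else:
            representatives.append(item)  # starts a new class
    return representatives, shed
```

Calling `shed_by_classes(window)` on a list of such items returns the retained representatives and the shed items. Raising `threshold` sheds less data, and `alpha` shifts the weight between data similarity and behavior similarity.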

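Cite this article:

WANG Shun, LI Zhenxing, LIAN Zengshen, ZENG Guosun, DING Chunling. Load Shedding Methods for Big Data Stream with Sparsity[J]. Journal of Tongji University (Natural Science), 2020, 48(02): 276-286.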

History

Received: 2019-02-21
Last revised: 2020-01-13
Accepted: 2019-12-06
Published online: 2020-02-26
