
Concept-Driven Load Shedding: Reducing Size and Error of Voluminous and Variable Data Streams

DocUID: 2018-008

Authors: Nikos R. Katsipoulakis, Alexandros Labrinidis, Panos K. Chrysanthis

Abstract: Load shedding is a technique that aims to ameliorate the consequences of the Velocity and the Volume of Big Data stream processing. When temporal input spikes appear, tuples are shed until a Stream Processing Engine's (SPE) processing capacity is not overwhelmed and results are produced in a timely fashion. Existing load shedding techniques have become obsolete and are not applicable to modern use-cases which require the extraction of patterns from continuously evolving (i.e., Variable) voluminous streams. In this work, we identify the shortcomings of existing load shedding techniques when applied to streams with concept drift. We propose Concept-Driven load shedding (CoD), which aims at limiting the data volume imposed on the SPE while producing high accuracy results. On top of that, we designed CoD for modern SPEs and made its overhead negligible. Our experiments indicate that CoD can deliver more than 10x more accurate results compared to the state of the art in load shedding. Also, CoD can offer up to 2.25× better performance compared to normal processing and reduce the processed data volume significantly.

Keywords: Data Streams, Data Aggregation, Approximate Continuous Queries

Published In: IEEE BigData 2018

Pages: 418-427

Year Published: 2018

DOI: 10.1109/BigData.2018.8622265

Project: PittSmartLiving

Subject Area: Data Aggregation, Data Streams

Publication Type: Conference Paper

Sponsor: NSF CNS-1739413

Citation: Nikos R. Katsipoulakis, Alexandros Labrinidis, and Panos K. Chrysanthis. Concept-Driven Load Shedding: Reducing Size and Error of Voluminous and Variable Data Streams. IEEE BigData 2018. 418-427. 2018. DOI: 10.1109/BigData.2018.8622265.