Event-Driven Big Data with Accumulo — Leveraging Big Data in Motion




Talk Abstract

Events define our world. A system that rapidly adapts to many diverse events and incorporates them into relevant, dynamic models produces rich, timely situational analysis. Because each event happens at a defined time, analysis can also move backward and forward in time, and even into hypothetical time using “what if” events.

Accumulo allows the assembly of extremely large event sets to form a high-resolution, dynamic event fabric. Key Accumulo design constructs enable high-velocity, disparate events to form complex models. These constructs include a high-performance, flexible event data model and vocabulary with efficient indexing and complex event contexts; dynamic event versioning; management of event race conditions; event security; event confidence; and correction of event errors. Processing relies on high-speed messaging and flexible rule tables. Complementing this is an elastic architecture that handles unpredictable event surges and timely analysis demands across multiple nodes.
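To make a few of the constructs above concrete (event data model, versioning, security, and moving backward in time), here is a minimal in-memory sketch in Python. It does not use the Accumulo API and is not the data model from the talk. It only mirrors the general shape of an Accumulo key (row, column family, column qualifier, visibility, timestamp). The class names `EventKey` and `EventTable`, the `truck-17` row layout, and the single-label visibility check are all hypothetical simplifications; real Accumulo visibilities are boolean expressions over labels.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class EventKey:
    """Hypothetical event key shaped like an Accumulo key."""
    row: str         # assumed layout: the entity the event is about
    family: str      # assumed layout: the event type
    qualifier: str   # assumed layout: the attribute being reported
    visibility: str  # simplified to one label; not a full visibility expression
    timestamp: int   # event time in milliseconds


class EventTable:
    """Toy stand-in for a table; a plain dict, not a sorted, distributed store."""

    def __init__(self) -> None:
        self._cells: Dict[EventKey, str] = {}

    def put(self, key: EventKey, value: str) -> None:
        # Each distinct timestamp is kept as a separate version of the cell.
        self._cells[key] = value

    def scan(
        self, row: str, auths: Set[str], as_of: Optional[int] = None
    ) -> Dict[Tuple[str, str], str]:
        """Return the newest visible version of each column in a row.

        Passing as_of ignores versions newer than that time, which is one
        simple way to look at the row as it stood in the past.
        """
        best: Dict[Tuple[str, str], Tuple[int, str]] = {}
        for key, value in self._cells.items():
            if key.row != row:
                continue
            if key.visibility not in auths:
                continue  # reader lacks the label, so the cell is hidden
            if as_of is not None and key.timestamp > as_of:
                continue
            column = (key.family, key.qualifier)
            if column not in best or key.timestamp > best[column][0]:
                best[column] = (key.timestamp, value)
        return {column: value for column, (_, value) in best.items()}


table = EventTable()
table.put(EventKey("truck-17", "position", "lat", "public", 1000), "38.99")
table.put(EventKey("truck-17", "position", "lat", "public", 2000), "39.02")
table.put(EventKey("truck-17", "cargo", "manifest", "restricted", 1500), "sealed")

# Latest view for a reader who holds only the "public" label.
print(table.scan("truck-17", {"public"}))

# Historical view at t=1500 for a reader who holds both labels.
print(table.scan("truck-17", {"public", "restricted"}, as_of=1500))
```

The first scan returns only the newest public position, and the second returns the older position together with the restricted cargo cell. In Accumulo itself, version retention and visibility filtering are handled on the server side rather than in client code like this.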

Accumulo makes high event resolution possible; an event-driven design makes it immediately actionable. Our findings detail the ingestion and processing of billions of actual events, which are transformed into dynamic decision models that are instantly available at big-data scale. Additionally, the changes in the events and decision models over time provide a rich strategic base for analytics. A working demonstration using Amazon EC2 VMs, Elastic MapReduce, and Accumulo stores summarizes the event-driven approach. The VMs are available to all conference participants for future investigations.


John Hebeler

Principal Engineer, Lockheed Martin

John Hebeler, Principal Engineer for Lockheed Martin, is the Technical Lead Developer on a major big-data analytic system based on Accumulo. He focuses on Big Data streaming architectures, diverse data integration, and the Semantic Web. He has co-written Semantic Web Programming and a P2P networking book, holds two patents on distributed technologies, and presents at technical conferences. He is currently pursuing his PhD in Information Systems, with a focus on Big Data integration, at the University of Maryland.