Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Online recommendations. Vasiliki Kalavri | Boston University 2020. Sensor measurements analysis • Monitoring applications • Complex filtering and alarm activation • Aggregation of multiple ... within 2% of today's high. Financial transaction analysis • Fraud detection, online risk calculation. Example: Someone steals your phone and sings in your ... frequency • top-K cell towers used. Web activity analysis • Visualization and aggregation • impressions, clicks, transactions, likes, comments • Analytics ...
0 points | 34 pages | 2.53 MB | 1 year ago
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... Kafka. Vasiliki Kalavri | Boston University 2020. Apache Flink • An open-source, distributed data analysis framework • True streaming at its core • Streaming & Batch API. Historic data. Kafka, RabbitMQ ...
0 points | 26 pages | 3.33 MB | 1 year ago
Graph streaming algorithms - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... that contain a vertex and all of its neighbors. Although this model can enable a theoretical analysis of streaming algorithms, it cannot adequately model real-world unbounded streams, as the neighbors ...
0 points | 72 pages | 7.77 MB | 1 year ago
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... sequential data access, high-rate append-only updates. Data Warehouse • complex, offline analysis • large and relatively static and historical data • batched updates during downtimes, e.g. ...
0 points | 45 pages | 1.22 MB | 1 year ago
Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... http://infolab.stanford.edu/~ullman/mmds/book.pdf • Ken Christensen, Allen Roginsky, Miguel Jimeno. A new analysis of the false positive rate of a Bloom filter. Information Processing Letters 110 (2010). Further ...
0 points | 74 pages | 1.06 MB | 1 year ago
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... cardinalities. European Symposium on Algorithms, 2003. • Flajolet, Philippe, et al. Hyperloglog: the analysis of a near-optimal cardinality estimation algorithm. 2007. https://hal.archives-ouvertes.fr/fi
0 points | 69 pages | 630.01 KB | 1 year ago
PyFlink 1.15 Documentation
... and streaming workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you're already familiar with Python and libraries ...
0 points | 36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation
... and streaming workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you're already familiar with Python and libraries ...
0 points | 36 pages | 266.80 KB | 1 year ago
8 results in total