High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics
Vasiliki Kalavri | Boston University 2020. Today’s topics • High-availability and fault-tolerance in distributed stream processing • Recovery semantics ... against failures and guarantee correct results after recovery? • How can we ensure minimal downtime and fast recovery? • How can we hide recovery side-effects from downstream applications? ... failures. ... Recovery types • Precise recovery (exactly-once) • It hides the effects of a failure ...
0 points | 49 pages | 2.08 MB | 1 year ago
Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... URLs that contain malware? • Filter out all compromised passwords? • Remove duplicate tuples on recovery when using upstream backup? The membership problem. Vasiliki Kalavri | Boston University 2020. ... A hash table requires O(log n) bits per element ...
0 points | 74 pages | 1.06 MB | 1 year ago
Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Vasiliki Kalavri | Boston University 2020. Recovery process 1. Stop and restart the application. All operators have empty state. ... End-to-end exactly once • Flink’s checkpointing and recovery mechanism only resets the internal state of a streaming application • Some result records might ...
0 points | 81 pages | 13.18 MB | 1 year ago
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... missing, out-of-order, delayed data 4. Guarantee deterministic (on replay) and correct results (on recovery) 5. Combine batch (historical) and stream processing 6. Ensure availability despite failures
0 points | 45 pages | 1.22 MB | 1 year ago
Monitoring Apache Flink Applications (Introduction)
... -release-1.7/monitoring/metrics.html#latency-tracking 2. During periods of high load or during recovery, events might spend some time in the message queue until they are processed by Flink (see previous ...
0 points | 23 pages | 148.62 KB | 1 year ago
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... intervals • Keep intermediate state in memory • Use Spark's RDDs instead of replication • Parallel recovery mechanism in case of failures ... [figure: input stream, time-based micro-batches] D-Streams • During an ...
0 points | 54 pages | 2.83 MB | 1 year ago
6 results in total













