Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... a coordinator or generated periodically • We want to snapshot stream process graphs after the complete computation of an epoch. ... Epoch Snapshotting ... Vasiliki Kalavri | Boston University 2020 ... An epoch-complete consistent cut that includes events that ... Epoch cuts p4 ...
0 码力 | 81 pages | 13.18 MB | 1 year ago

Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... block channels and upstream operators. All affected operators block until the reconfiguration is complete • State is scoped to a single task • Each stateful task is responsible for processing and ... Live state migration ... control operators can check the frontier (watermark) at the output of the stateful operator to ensure only complete state is migrated. Helpers buffer data that cannot yet be safely routed and configuration commands ...
0 码力 | 93 pages | 2.42 MB | 1 year ago

Scalable Stream Processing - Spark Streaming and Flink
... appended to the result table since the last trigger will be written to the external storage. 2. Complete: the entire updated result table will be written to external storage. 3. Update: only the rows ...
0 码力 | 113 pages | 1.22 MB | 1 year ago

Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Reconfiguring Flink applications ... • A consistent and complete snapshot of an application's state • Checkpoints are automatically created and removed by Flink ... the new set of parallel tasks • For exactly-once results, we need to prevent a checkpoint from completing after the savepoint! • Use the integrated savepoint-and-cancel command ... Scaling from a Savepoint ...
0 码力 | 41 pages | 4.09 MB | 1 year ago

Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... arrival rate or order ... Continuously arriving, possibly unbounded data ... Complete data accessible in persistent storage ... Consider ...
0 码力 | 34 pages | 2.53 MB | 1 year ago

Streaming in Apache Flink
... add/sort this event into the queue */ /* set an event-time timer for when the stream is complete up to the event-time of this event */ } @Override public void onTimer(long timestamp, OnTimerContext ...
0 码力 | 45 pages | 3.00 MB | 1 year ago

High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... recovery node may need to re-process many tuples • all tuples that contributed to lost state • a complete queue-trimming interval worth of tuples, if level-0 and level-1 acks are periodically transmitted ...
0 码力 | 49 pages | 2.08 MB | 1 year ago

Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... is appended to the stream table? Synopses: Maintain summaries of streaming data instead of the complete history. ... Stream synopses requirements • Single-pass: ...
0 码力 | 45 pages | 1.22 MB | 1 year ago

Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... data items • Batching hurts latency as events can only be processed once the entire batch is complete ... Batching ... Profitability ... Spark Streaming • Treat streaming computation as a series of deterministic ...
0 码力 | 54 pages | 2.83 MB | 1 year ago

Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020
... ∪τ S = Lτ ∪ Sτ ... Languages supporting union operators and non-blocking UDAs on data streams are complete, in the sense that they can express every monotonic function on their input. ...
0 码力 | 53 pages | 532.37 KB | 1 year ago

10 results in total