High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics
vkalavri@bu.edu Spring 2020 3/17: High availability, recovery semantics, and guarantees. Vasiliki Kalavri | Boston University 2020. Today’s topics • High-availability and fault-tolerance in distributed … state consists of • input queues • operator state • output queues • Short recovery time • High runtime overhead • The checkpoint interval determines the trade-off
49 pages | 2.08 MB | 1 year ago
Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Stream Processing and Analytics Vasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 3/31: High-availability & reconfiguration • To recover from failures … state … Checkpointing guards the state from failures, but what about process failure? High-availability … Flink processes … • A high-availability mode migrates the responsibility and metadata for a job to another JobManager in case the original JobManager disappears. • Flink relies on Apache ZooKeeper for high-availability
41 pages | 4.09 MB | 1 year ago
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… over time, rather than being available in full before its processing begins. • Data streams are high-volume, real-time data that might be unbounded • we cannot store the entire stream in an accessible … or groups of rows. Data Stream Management System • continuous queries • sequential data access, high-rate append-only updates. Data Warehouse • complex, offline analysis • large and relatively … append-only. Update rates: relatively low vs. high, bursty • Processing model: query-driven / pull-based vs. data-driven / push-based • Queries: ad-hoc vs. continuous • Latency: relatively high vs. low
45 pages | 1.22 MB | 1 year ago
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… the query is running might be impractical. • state accumulation and re-partitioning • high-availability and low latency requirements • scheduling overhead. Challenges in streaming optimization … of Optimizations … Safety • Attribute availability: the set of attributes B reads from must be disjoint from the set of attributes A writes to.
54 pages | 2.83 MB | 1 year ago
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… pipeline … Progress is controlled through buffer availability. A enters the system and is processed by Task 1. The result is serialized into an output … to establish a virtual circuit (VC) or connection. • CFC uses a credit system to signal the availability of buffer space from receivers to senders. … processors and is implemented in Apache Flink. • Each task informs its senders of its buffer availability via credit messages. • This way, senders always know whether receivers have the required …
43 pages | 2.42 MB | 1 year ago
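The credit mechanism described in the snippet above can be illustrated with a toy model. This is a minimal sketch, not Apache Flink's implementation: the `Receiver` and `Sender` classes and all of their methods are hypothetical names introduced only for illustration, and a credit is assumed to stand for exactly one free buffer.

```python
from collections import deque


class Receiver:
    """Hypothetical receiver with a fixed number of buffers."""

    def __init__(self, num_buffers):
        self.free = num_buffers   # buffers not yet announced as credits
        self.queue = deque()      # records waiting to be processed

    def grant_credits(self):
        """Announce every free buffer as one credit and reserve it."""
        credits, self.free = self.free, 0
        return credits

    def receive(self, record):
        self.queue.append(record)

    def process_one(self):
        """Process one queued record, which frees one buffer."""
        if self.queue:
            self.queue.popleft()
            self.free += 1


class Sender:
    """Hypothetical sender that transmits only while it holds credits."""

    def __init__(self, records):
        self.backlog = deque(records)
        self.credits = 0

    def add_credits(self, n):
        self.credits += n

    def send_to(self, receiver):
        """Send as many records as credits allow; return how many were sent."""
        sent = 0
        while self.credits > 0 and self.backlog:
            receiver.receive(self.backlog.popleft())
            self.credits -= 1
            sent += 1
        return sent
```

In this model the sender stalls as soon as its credits reach zero, so the receiver's queue can never hold more records than it has buffers; sending resumes only after the receiver processes a record and grants a new credit.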
Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… future values. Examples • Find all stocks priced between $20 and $200, where the spread between the high tick and the low tick over the past 30 minutes is greater than 3% of the last price, and where in … greater than $5 Billion that have gained in price today by at least 2%, and are within 2% of today’s high. … Financial transaction analysis • Fraud detection … Retractions & results amendment, Reconfiguration & updates, Debugging, Fault-tolerance & high-availability … actions, alerts, continuous analytics … Building
34 pages | 2.53 MB | 1 year ago
Monitoring Apache Flink Applications (Introduction)
org/projects/flink/flink-docs-release-1.7/monitoring/metrics.html#latency-tracking 2. During periods of high load or during recovery, events might spend some time in the message queue until they are processed … be further controlled by setting metrics.latency.granularity as desired. Due to the potentially high number of histograms (in particular for metrics.latency.granularity: subtask), enabling latency tracking … should also monitor the CPU load of the TaskManagers. If your TaskManagers are constantly under very high load, you might be able to improve the overall performance by decreasing the number of task slots
23 pages | 148.62 KB | 1 year ago
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Queuing Jackson networks • Action • predictive, at-once for all operators. Too fine-grained, impractical for high-rate streams. Sampling degrades accuracy. Simplified models make strong assumptions. Unsuitable … be scaled and upstream channels • All-at-once • move state to be migrated in one operation • high latency during migration if the state is large • Progressive • move state to be migrated in smaller
93 pages | 2.42 MB | 1 year ago
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… : $k/2^r \to 0$ and $e^{-k 2^{-r}} \to 1$ • If $k \gg 2^r$: $k/2^r \to \infty$ and $e^{-k 2^{-r}} \to 0$. The estimate $2^R$ cannot be too high or too low. … Is it good enough? … need to use multiple hash functions and combine their estimates: • Using many hash functions for a high-rate stream is expensive • Finding many random and independent hash functions is difficult … collision probability • Counter overestimation is almost certain for very large data streams with high-frequency elements. Counting Bloom Filter … • A space-efficient
69 pages | 630.01 KB | 1 year ago
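The $2^R$ estimate mentioned in the snippet above can be sketched as follows, assuming $R$ is the largest number of trailing zero bits observed among the hashed stream elements (the usual Flajolet-Martin reading; the snippet itself does not define $R$). The helper `hash32` is a hypothetical stand-in built from MD5; the slides do not specify a hash function.

```python
import hashlib


def trailing_zeros(x, width=32):
    """Number of trailing zero bits in x; by convention `width` when x == 0."""
    if x == 0:
        return width
    return (x & -x).bit_length() - 1


def hash32(item):
    """Hypothetical helper: a 32-bit hash from the first 4 bytes of MD5."""
    digest = hashlib.md5(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def fm_estimate(stream):
    """Return 2**R, where R is the largest trailing-zero count seen so far."""
    R = 0
    for item in stream:
        R = max(R, trailing_zeros(hash32(item)))
    return 2 ** R
```

Because only the maximum is tracked, repeated elements never change the result, and the output is always a power of two. With a single hash function the estimate is therefore coarse, which is consistent with the snippet's remark that several hash functions need to be combined.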
Skew mitigation - CS 591 K1: Data Stream Processing and Analytics Spring 2020
… place the ball at the least full bin: • when d=2, the maximum load is ln ln n / ln 2 + O(1), with high probability • when d>2, the maximum load keeps decreasing, but only by a constant factor … • selected uniformly at random • At the end of the process, the maximum load is Θ(ln n/ln ln n), with high probability … Dynamic resource allocation • Choose
31 pages | 1.47 MB | 1 year ago
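The load bounds quoted in the snippet above can be explored empirically with a small simulation. This is a sketch under stated assumptions: n balls are thrown into n bins, each ball samples d candidate bins uniformly at random (with replacement) and goes to the least full one. The function name `max_load` and the `seed` parameter are illustrative choices, not taken from the slides.

```python
import random


def max_load(n, d, seed=0):
    """Throw n balls into n bins with d random choices per ball.

    Each ball picks d candidate bins uniformly at random and is placed
    in the candidate that currently holds the fewest balls.
    Returns the load of the fullest bin at the end.
    """
    rng = random.Random(seed)
    bins = [0] * n
    for _ in range(n):
        candidates = [rng.randrange(n) for _ in range(d)]
        target = min(candidates, key=lambda i: bins[i])
        bins[target] += 1
    return max(bins)
```

Comparing `max_load(n, 1)` with `max_load(n, 2)` for a large n typically shows the sharp drop from one choice to two, while going from d=2 to d=3 changes the result much less.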