 Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020Stream Processing and Analytics Vasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 4/09: Flow control and load shedding ??? Vasiliki Kalavri | Boston University 2020 Keeping up with the producers what if the queue grows larger than available memory? • block the producer (back-pressure, flow control) 2 ??? Vasiliki Kalavri | Boston University 2020 Load management approaches 3 ! Load shedder runtime and selectively drops tuples according to a QoS specification. • Similar to congestion control or video streaming in a lower quality. 4 ??? Vasiliki Kalavri | Boston University 2020 https://commons0 码力 | 43 页 | 2.42 MB | 1 年前3 Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020Stream Processing and Analytics Vasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 4/09: Flow control and load shedding ??? Vasiliki Kalavri | Boston University 2020 Keeping up with the producers what if the queue grows larger than available memory? • block the producer (back-pressure, flow control) 2 ??? Vasiliki Kalavri | Boston University 2020 Load management approaches 3 ! Load shedder runtime and selectively drops tuples according to a QoS specification. • Similar to congestion control or video streaming in a lower quality. 4 ??? Vasiliki Kalavri | Boston University 2020 https://commons0 码力 | 43 页 | 2.42 MB | 1 年前3
 PyFlink 1.15 Documentationa DataStream API for building robust, stateful streaming applications. It provides fine-grained control over state and timer, which allows for the implementation of advanced event-driven systems. You _bootstrap_inner self.run() File "D:\Anaconda3\envs\py37\lib\site-packages\apache_beam\runners\worker\data_plane.py", ˓→ line 218, in run while not self._finished.wait(next_call - time.time()): File "D:\Anac0 码力 | 36 页 | 266.77 KB | 1 年前3 PyFlink 1.15 Documentationa DataStream API for building robust, stateful streaming applications. It provides fine-grained control over state and timer, which allows for the implementation of advanced event-driven systems. You _bootstrap_inner self.run() File "D:\Anaconda3\envs\py37\lib\site-packages\apache_beam\runners\worker\data_plane.py", ˓→ line 218, in run while not self._finished.wait(next_call - time.time()): File "D:\Anac0 码力 | 36 页 | 266.77 KB | 1 年前3
 PyFlink 1.16 Documentationa DataStream API for building robust, stateful streaming applications. It provides fine-grained control over state and timer, which allows for the implementation of advanced event-driven systems. You _bootstrap_inner self.run() File "D:\Anaconda3\envs\py37\lib\site-packages\apache_beam\runners\worker\data_plane.py", ˓→ line 218, in run while not self._finished.wait(next_call - time.time()): File "D:\Anac0 码力 | 36 页 | 266.80 KB | 1 年前3 PyFlink 1.16 Documentationa DataStream API for building robust, stateful streaming applications. It provides fine-grained control over state and timer, which allows for the implementation of advanced event-driven systems. You _bootstrap_inner self.run() File "D:\Anaconda3\envs\py37\lib\site-packages\apache_beam\runners\worker\data_plane.py", ˓→ line 218, in run while not self._finished.wait(next_call - time.time()): File "D:\Anac0 码力 | 36 页 | 266.80 KB | 1 年前3
 Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020This is called processing time Vasiliki Kalavri | Boston University 2020 • What if you were in a plane and not on a train? • What if you never came back online? • How long do we have to wait before0 码力 | 22 页 | 2.22 MB | 1 年前3 Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020This is called processing time Vasiliki Kalavri | Boston University 2020 • What if you were in a plane and not on a train? • What if you never came back online? • How long do we have to wait before0 码力 | 22 页 | 2.22 MB | 1 年前3
 Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020time rate increase : input rate : throughput ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 3 • Detect environment to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Automatic Scaling Control 4 ??? Vasiliki Kalavri | Boston University 2020 The automatic scaling problem 5 Given a logical congestion, back pressure, throughput Policy • Queuing theory models: for latency objectives • Control theory models: e.g., PID controller • Rule-based models, e.g. if CPU utilization > 70% => scale0 码力 | 93 页 | 2.42 MB | 1 年前3 Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020time rate increase : input rate : throughput ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 3 • Detect environment to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Automatic Scaling Control 4 ??? Vasiliki Kalavri | Boston University 2020 The automatic scaling problem 5 Given a logical congestion, back pressure, throughput Policy • Queuing theory models: for latency objectives • Control theory models: e.g., PID controller • Rule-based models, e.g. if CPU utilization > 70% => scale0 码力 | 93 页 | 2.42 MB | 1 年前3
 Streaming in Apache FlinkDataStream Streaming in Apache FlinkDataStream- control = env.fromElements("DROP", "IGNORE").keyBy(x -> x); DataStream - streamOfWords = env.fromElements("data", "DROP", "artisans", "IGNORE") .keyBy(x -> x); control ValueStateDescriptor<>("blocked", Boolean.class)); } @Override public void flatMap1(String control_value, Collector - out) throws Exception { blocked.update(Boolean.TRUE); } @Override 0 码力 | 45 页 | 3.00 MB | 1 年前3
 Scalable Stream Processing - Spark Streaming and Flinkautomatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the0 码力 | 113 页 | 1.22 MB | 1 年前3 Scalable Stream Processing - Spark Streaming and Flinkautomatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the0 码力 | 113 页 | 1.22 MB | 1 年前3
 Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? 12 • Detect environment changes: external workload and system performance unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 12 • Detect environment0 码力 | 41 页 | 4.09 MB | 1 年前3 Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? 12 • Detect environment changes: external workload and system performance unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 12 • Detect environment0 码力 | 41 页 | 4.09 MB | 1 年前3
 Apache Flink的过去、现在和未来Services O_0 O_1 I_0 I_1 I_2 P_0 P_1 P_2 S_0 S_1 Order Inventory Payment Shipping Flow-Control Async Call Auto Scale State Management Event Driven Flink 的未来 offline Real-time Batch Processing0 码力 | 33 页 | 3.36 MB | 1 年前3 Apache Flink的过去、现在和未来Services O_0 O_1 I_0 I_1 I_2 P_0 P_1 P_2 S_0 S_1 Order Inventory Payment Shipping Flow-Control Async Call Auto Scale State Management Event Driven Flink 的未来 offline Real-time Batch Processing0 码力 | 33 页 | 3.36 MB | 1 年前3
 Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020Boston University 2020 [1, 4, 5, 23, 8, 0, 7] 5 median ‣ We cannot store the entire stream ‣ No control over arrival rate or order f’ ∞ ? Continuously arriving, possibly unbounded data f read write0 码力 | 34 页 | 2.53 MB | 1 年前3 Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020Boston University 2020 [1, 4, 5, 23, 8, 0, 7] 5 median ‣ We cannot store the entire stream ‣ No control over arrival rate or order f’ ∞ ? Continuously arriving, possibly unbounded data f read write0 码力 | 34 页 | 2.53 MB | 1 年前3
共 12 条
- 1
- 2













