Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020Stream Processing and Analytics Vasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 4/09: Flow control and load shedding ??? Vasiliki Kalavri | Boston University 2020 Keeping up with the producers what if the queue grows larger than available memory? • block the producer (back-pressure, flow control) 2 ??? Vasiliki Kalavri | Boston University 2020 Load management approaches 3 ! Load shedder runtime and selectively drops tuples according to a QoS specification. • Similar to congestion control or video streaming in a lower quality. 4 ??? Vasiliki Kalavri | Boston University 2020 https://commons0 码力 | 43 页 | 2.42 MB | 1 年前3
PyFlink 1.15 DocumentationPython Version Supported PyFlink Version Python Version Supported PyFlink 1.16 Python 3.6 to 3.9 PyFlink 1.15 Python 3.6 to 3.8 PyFlink 1.14 Python 3.6 to 3.8 You could check your Python version as following: following: 3 pyflink-docs, Release release-1.15 python3 --version Create a Python virtual environment Virtual environment gives you the ability to isolate the Python dependencies of different projects g. venv virtualenv venv # You can also create Python virtual environment with a specific Python version virtualenv --python /path/to/python/executable venv The virtual environment needs to be activated0 码力 | 36 页 | 266.77 KB | 1 年前3
PyFlink 1.16 DocumentationPython Version Supported PyFlink Version Python Version Supported PyFlink 1.16 Python 3.6 to 3.9 PyFlink 1.15 Python 3.6 to 3.8 PyFlink 1.14 Python 3.6 to 3.8 You could check your Python version as following: following: 3 pyflink-docs, Release release-1.16 python3 --version Create a Python virtual environment Virtual environment gives you the ability to isolate the Python dependencies of different projects g. venv virtualenv venv # You can also create Python virtual environment with a specific Python version virtualenv --python /path/to/python/executable venv The virtual environment needs to be activated0 码力 | 36 页 | 266.80 KB | 1 年前3
Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020operator placement • skew and straggler mitigation • Migrate to a different cluster or software version 9 Reconfiguration cases ??? Vasiliki Kalavri | Boston University 2020 Streaming applications unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? 12 • Detect environment changes: external workload and system performance unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 12 • Detect environment0 码力 | 41 页 | 4.09 MB | 1 年前3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020available cores / threads • Fused operators can share the address space but use separate threads of control • avoid communication cost without losing pipeline parallelism • use a shared buffer for communication • fixed number of random state accesses, 32K L1 cache • the throughput of the non-shared version degrades first State sharing B A Β Α Profitability ??? Vasiliki Kalavri | Boston University 20200 码力 | 54 页 | 2.83 MB | 1 年前3
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020time rate increase : input rate : throughput ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 3 • Detect environment to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Automatic Scaling Control 4 ??? Vasiliki Kalavri | Boston University 2020 The automatic scaling problem 5 Given a logical congestion, back pressure, throughput Policy • Queuing theory models: for latency objectives • Control theory models: e.g., PID controller • Rule-based models, e.g. if CPU utilization > 70% => scale0 码力 | 93 页 | 2.42 MB | 1 年前3
Streaming in Apache FlinkDataStreamcontrol = env.fromElements("DROP", "IGNORE").keyBy(x -> x); DataStream streamOfWords = env.fromElements("data", "DROP", "artisans", "IGNORE") .keyBy(x -> x); control ValueStateDescriptor<>("blocked", Boolean.class)); } @Override public void flatMap1(String control_value, Collector out) throws Exception { blocked.update(Boolean.TRUE); } @Override 0 码力 | 45 页 | 3.00 MB | 1 年前3
Scalable Stream Processing - Spark Streaming and Flinkautomatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the0 码力 | 113 页 | 1.22 MB | 1 年前3
Apache Flink的过去、现在和未来Services O_0 O_1 I_0 I_1 I_2 P_0 P_1 P_2 S_0 S_1 Order Inventory Payment Shipping Flow-Control Async Call Auto Scale State Management Event Driven Flink 的未来 offline Real-time Batch Processing0 码力 | 33 页 | 3.36 MB | 1 年前3
Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020Boston University 2020 [1, 4, 5, 23, 8, 0, 7] 5 median ‣ We cannot store the entire stream ‣ No control over arrival rate or order f’ ∞ ? Continuously arriving, possibly unbounded data f read write0 码力 | 34 页 | 2.53 MB | 1 年前3
共 12 条
- 1
- 2













