Resource Control - IT文库_程序员IT互联网编程电子书和文档免费下载，助您码力十足！

首页文库资料文章资讯上传文档发布文章登录账户

Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Stream Processing and Analytics Vasiliki (Vasia) Kalavri  vkalavri@bu.edu Spring 2020 4/09: Flow control and load shedding ??? Vasiliki Kalavri | Boston University 2020 Keeping up with the producers what if the queue grows larger than available memory? • block the producer (back-pressure, flow control) 2 ??? Vasiliki Kalavri | Boston University 2020 Load management approaches 3 ! Load shedder stabilize. • Requires a persistent input source. • Suitable for transient load increase. Scale resource allocation: • Addresses the case of increased load and additionally ensures no resources are

0 码力 | 43 页 | 2.42 MB | 1 年前
3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020

hashing • indexing, pre-fetching • minimize disk access • scheduling Objectives • optimize resource utilization or minimize resources • decrease latency, increase throughput • minimize monetary Kalavri | Boston University 2020 28 Safety • Ensure resource kinds: all resources required by a fused operator should remain available. • Ensure resource amounts: the total amount of resources required by available cores / threads • Fused operators can share the address space but use separate threads of control • avoid communication cost without losing pipeline parallelism • use a shared buffer for communication

0 码力 | 54 页 | 2.83 MB | 1 年前
3
Scalable Stream Processing - Spark Streaming and Flink

Output Operations (3/4) ▶ What’s wrong with this code? ▶ Creating a connection object has time and resource overheads. ▶ Creating and destroying a connection object for each record can incur unnecessarily automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the automatically converts this batch-like query to a streaming execution plan. ▶ 2. Specify triggers to control when to update the results. • Each time a trigger fires, Spark checks for new data (new row in the

0 码力 | 113 页 | 1.22 MB | 1 年前
3
Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020

• minimize performance disruption, e.g. latency spikes • avoid introducing load imbalance • Resource management • utilization, isolation • Automation • continuous monitoring • bottleneck detection unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? 12 • Detect environment changes: external workload and system performance unblock computations to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 12 • Detect environment

0 码力 | 41 页 | 4.09 MB | 1 年前
3
Apache Flink的过去、现在和未来

Time Window 2015 年阿里巴巴开始使用 Flink 并持续贡献社区重构分布式架构 Client Dispatcher Job Manager Task Manager Resource Manager Cluster Manager Task Manager 1. Submit job 2. Start job 3. Request slots 4. Allocate Services O_0 O_1 I_0 I_1 I_2 P_0 P_1 P_2 S_0 S_1 Order Inventory Payment Shipping Flow-Control Async Call Auto Scale State Management Event Driven Flink 的未来 offline Real-time Batch Processing

0 码力 | 33 页 | 3.36 MB | 1 年前
3
PyFlink 1.15 Documentation

py See Submitting PyFlink jobs for more details. 1.1.1.4 YARN Apache Hadoop YARN is a cluster resource management framework for managing the resources and scheduling jobs in a Hadoop cluster. It’s supported a DataStream API for building robust, stateful streaming applications. It provides fine-grained control over state and timer, which allows for the implementation of advanced event-driven systems. You

0 码力 | 36 页 | 266.77 KB | 1 年前
3
PyFlink 1.16 Documentation

py See Submitting PyFlink jobs for more details. 1.1.1.4 YARN Apache Hadoop YARN is a cluster resource management framework for managing the resources and scheduling jobs in a Hadoop cluster. It’s supported a DataStream API for building robust, stateful streaming applications. It provides fine-grained control over state and timer, which allows for the implementation of advanced event-driven systems. You

0 码力 | 36 页 | 266.80 KB | 1 年前
3
监控Apache Flink应用程序(入门)

https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/stream/operators/#task-chaining-and-resource-groups 4 进度和吞吐量监控知道您的应用程序正在运行并且检查点正常工作是件好事，但是它并不能告诉您应用程序是否正在实际取得进展并与上游系统保持同步。 4.1 吞吐量 Fli overall memory consumption of the Job- and TaskManager containers to ensure they don’t exceed their resource limits. This is particularly important, when using the RocksDB statebackend, since RocksDB allocates Flink processes alone. System resource monitoring is disabled by default and requires additional dependencies on the classpath. Please check out the Flink system resource metrics documentation9 for additional

0 码力 | 23 页 | 148.62 KB | 1 年前
3
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020

time rate increase : input rate : throughput ??? Vasiliki Kalavri | Boston University 2020 Control: When and how much to adapt? Mechanism: How to apply the re-configuration? 3 • Detect environment to ensure result correctness ??? Vasiliki Kalavri | Boston University 2020 Automatic Scaling Control 4 ??? Vasiliki Kalavri | Boston University 2020 The automatic scaling problem 5 Given a logical congestion, back pressure, throughput Policy • Queuing theory models: for latency objectives • Control theory models: e.g., PID controller • Rule-based models, e.g. if CPU utilization > 70% => scale

0 码力 | 93 页 | 2.42 MB | 1 年前
3
Skew mitigation - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Θ(ln n/ln ln n), with high probability ??? Vasiliki Kalavri | Boston University 2020 Dynamic resource allocation • Choose one among n workers • check the load of each worker and send the item to

0 码力 | 31 页 | 1.47 MB | 1 年前
3

共 13 条前往

页

分类

语言

格式

Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Scalable Stream Processing - Spark Streaming and Flink

Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Apache Flink的过去、现在和未来

PyFlink 1.15 Documentation

PyFlink 1.16 Documentation

监控Apache Flink应用程序(入门)

Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Skew mitigation - CS 591 K1: Data Stream Processing and Analytics Spring 2020