Streaming in Apache Flinkup an environment to develop Flink programs • Implement streaming data processing pipelines • Flink managed state • Event time Streaming in Apache Flink • Streams are natural • Events of any type0 码力 | 45 页 | 3.00 MB | 1 年前3
Rancher Kubernetes Engine 2, VMWare vSANSAP SAP Data Intelligence 3 on Rancher Kubernetes Engine 2 using VMware vSAN and vSphere SUSE Linux Enterprise Server 15 SP4 Rancher Kubernetes Engine 2 SAP Data Intelligence 3 Dr. Ulrich Schairer, (SUSE) 1 SAP Data Intelligence 3 on Rancher Kubernetes Engine 2 using VMware vSAN and vSphere SAP Data Intelligence 3 on Rancher Kubernetes Engine 2 using VMware vSAN and vSphere Date: 2023-07-24 SAP possi- ble errors or the consequences thereof. 2 SAP Data Intelligence 3 on Rancher Kubernetes Engine 2 using VMware vSAN and vSphere Contents 1 Introduction 4 2 Requirements 5 3 Preparations 70 码力 | 29 页 | 213.09 KB | 1 年前3
Scalable Stream Processing - Spark Streaming and FlinkScalable Stream Processing - Spark Streaming and Flink Amir H. Payberah payberah@kth.se 05/10/2018 The Course Web Page https://id2221kth.github.io 1 / 79 Where Are We? 2 / 79 Stream Processing Systems Spark streaming ▶ Flink 4 / 79 Spark Streaming 5 / 79 Contribution ▶ Design issues • Continuous vs. micro-batch processing • Record-at-a-Time vs. declarative APIs 6 / 79 Spark Streaming ▶ Run Run a streaming computation as a series of very small, deterministic batch jobs. • Chops up the live stream into batches of X seconds. • Treats each batch as RDDs and processes them using RDD operations0 码力 | 113 页 | 1.22 MB | 1 年前3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020??? Vasiliki Kalavri | Boston University 2020 2 • Costs of streaming operator execution • state, parallelism, selectivity • Dataflow optimizations • plan translation alternatives • Runtime optimizations the basics 3 source sink input port output port dataflow graph ??? Vasiliki Kalavri | Boston University 2020 Revisiting the basics 4 Dataflow graph • operators are nodes, data channels are edges ??? Vasiliki Kalavri | Boston University 2020 12 • What does efficient mean in the context of streaming? • queries run continuously • streams are unbounded • In traditional ad-hoc database queries0 码力 | 54 页 | 2.83 MB | 1 年前3
Graph streaming algorithms - CS 591 K1: Data Stream Processing and Analytics Spring 2020Processing and Analytics Vasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 4/28: Graph Streaming ??? Vasiliki Kalavri | Boston University 2020 Modeling the world as a graph 2 Social networks a vertex and all of its neighbors. Although this model can enable a theoretical analysis of streaming algorithms, it cannot adequately model real-world unbounded streams, as the neighbors cannot be continuously generated as a stream of edges? • How can we perform iterative computation in a streaming dataflow engine? How can we propagate watermarks? • Do we need to run the computation from scratch for0 码力 | 72 页 | 7.77 MB | 1 年前3
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020Kalavri vkalavri@bu.edu CS 591 K1: Data Stream Processing and Analytics Spring 2020 2/04: Streaming languages and operator semantics Vasiliki Kalavri | Boston University 2020 Vasiliki Kalavri | Boston interval of 5–15 s) by an item of type C with Z < 5. 8 Vasiliki Kalavri | Boston University 2020 Streaming Operators 9 Vasiliki Kalavri | Boston University 2020 Operator types (I) • Single-Item Operators println!("seen: {:?}", x)) .connect_loop(handle); }); t (t, l1) (t, (l1, l2)) Streaming Iteration Example Terminate after 100 iterations Create the feedback loop 13 Vasiliki Kalavri0 码力 | 53 页 | 532.37 KB | 1 年前3
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020relatively static and historical data • batched updates during downtimes, e.g. every night Streaming Data Warehouse • low-latency materialized view updates • pre-aggregated, pre-processed streams streams and historical data Data Management Approaches 4 storage analytics static data streaming data Vasiliki Kalavri | Boston University 2020 DBMS vs. DSMS DBMS DSMS Data persistent relations stream can be viewed as a massive, dynamic, one-dimensional vector A[1…N]. The size N of the streaming vector is defined as the product of the attribute domain size(s). Note that N might be unknown0 码力 | 45 页 | 1.22 MB | 1 年前3
這些年,我們一起追的Hadoop以及找不到老師教的技術,想辦法變 成自己的專長。 目前負責 Java 與 .NET 雲端運算相 關技術的推廣,主要包括 Hadoop Platform 與 NoSQL 等 Big Data 相關 應用,Google App Engine、Microsoft Azure 與 CloudBees 等雲端平台的運 用,以及 Android、Windows Phone 等 Smart Phone 的應用程式開發。 PS. 除了我的照片之外,投影片裡頭 (Slave)! 10 / 74 Hadoop 1.x 架構與限制 比較基本的模組: Hadoop HDFS (Storage) Hadoop MapReduce (Computing Engine + Resource Management + Job Scheduling / Monitoring + ...) 比較明顯的限制: 每個 Cluster 大概就是 4,000 - 4,500 x 架構 比較基本的模組: Hadoop Common (Core Libraries) Hadoop HDFS (Storage) Hadoop MapReduce (Computing Engine) Hadoop YARN (Resource Management + Job Scheduling / Monitoring) 比較沒人知道的事: Hadoop 2.x 也默默地做了四五年了0 码力 | 74 页 | 45.76 MB | 1 年前3
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020to congestion control or video streaming in a lower quality. 4 ??? Vasiliki Kalavri | Boston University 2020 https://commons.wikimedia.org/wiki/File:Adaptive_streaming_overview_daseddon_2011_07_28.png answers … S1 S2 Sr Input Manager Scheduler QoS Monitor Load Shedder Query Execution Engine Qm Q2 Q1 Ad-hoc or continuous queries Input streams … ??? Vasiliki Kalavri | Boston University Kalavri | Boston University 2020 Rate control • In a network of consumers and producers such as a streaming execution graph with multiple operators, back-pressure has the effect that all operators slow0 码力 | 43 页 | 2.42 MB | 1 年前3
Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 20204/02: Elasticity policies and state migration ??? Vasiliki Kalavri | Boston University 2020 Streaming applications are long-running • Workload will change • Conditions might change • State is resources, spawn new processes or release unused resources, safely terminate processes • Adjust dataflow channels and network connections • Re-partition and migrate state in a consistent manner • Block problem 5 Given a logical dataflow with sources S1, S2, … Sn and rates λ1, λ2, … λn identify the minimum parallelism πi per operator i, such that the physical dataflow can sustain all source rates0 码力 | 93 页 | 2.42 MB | 1 年前3
共 331 条
- 1
- 2
- 3
- 4
- 5
- 6
- 34













