Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020
streaming operator execution • state, parallelism, selectivity • Dataflow optimizations • plan translation alternatives • Runtime optimizations • load management, scheduling, state management • Optimization
54 pages | 2.83 MB | 1 year ago
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020
streaming computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models State in dataflow computations 3 Vasiliki Kalavri | Boston University 2020 Logic
49 pages | 2.08 MB | 1 year ago
Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Download and play around with “part-00000-of-00500.csv” of: • job events • task events • machine events 13 Vasiliki Kalavri | Boston University 2020 Software requirements • All assignments assume Windows user, you are advised to use Windows subsystem for Linux (WSL), Cygwin, or a Linux virtual machine to run Flink in a UNIX environment. • A Java 8.x installation. To develop Flink applications and
34 pages | 2.53 MB | 1 year ago
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020
Streaming & Batch API Historic data Kafka, RabbitMQ, ... HDFS, JDBC, ... Event logs ETL, Graphs, Machine Learning Relational, … Low latency, windowing, aggregations, ... 2 Vasiliki Kalavri | Boston This value is typically proportional to the number of physical CPU cores that the TaskManager's machine has (e.g., equal to the number of cores, or half the number of cores). 18 Vasiliki Kalavri | Boston
26 pages | 3.33 MB | 1 year ago
PyFlink 1.15 Documentation
streaming workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you’re already familiar with Python and libraries such 1.1.1.2 Local This page shows you how to set up a PyFlink development environment on your local machine. This is usually used for local execution or development in an IDE. Set up Python environment It
36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation
streaming workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you’re already familiar with Python and libraries such 1.1.1.2 Local This page shows you how to set up a PyFlink development environment on your local machine. This is usually used for local execution or development in an IDE. Set up Python environment It
36 pages | 266.80 KB | 1 year ago
State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020
streaming computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models State in dataflow computations 2 Vasiliki Kalavri | Boston University 2020 •
24 pages | 914.13 KB | 1 year ago
Streaming in Apache Flink
11M) ... 1> (50797,12M) Stateful Transformations • local: Flink state is kept local to the machine that processes it • durable: Flink state is automatically checkpointed and restored • vertically
45 pages | 3.00 MB | 1 year ago
Monitoring Apache Flink Applications (Introduction)
gather insights about system resources, i.e. memory, CPU & network-related metrics for the whole machine as opposed to the Flink processes alone. System resource monitoring is disabled by default and requires
23 pages | 148.62 KB | 1 year ago