 Streaming in Apache FlinkEIT Summer School 2019 Apache Flink Based on https://training.ververica.com Maximilian Michels Streaming in Apache FlinkEIT Summer School 2019 Apache Flink Based on https://training.ververica.com Maximilian Michels- apache.org> Software Engineer / Consultant Committer @ Apache Beam / Apache Flink @stadtlegende @stadtlegende Dr Paris Carbone - Senior Researcher @ RISE Committer @ Apache Flink @SenorCarbone Contents • DataSet API • DataStream API • Concepts • Set up an environment to develop Implement streaming data processing pipelines • Flink managed state • Event time Streaming in Apache Flink • Streams are natural • Events of any type like sensors, click streams, logs • Batch processing 0 码力 | 45 页 | 3.00 MB | 1 年前3
 Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics SpringKalavri vkalavri@bu.edu Spring 2020 1/30: Introduction to Apache Flink and Apache Kafka Vasiliki Kalavri | Boston University 2020 Apache Flink • An open-source, distributed data analysis framework file:///home/user/wordcount_out Run with a class entry point and arguments: ./bin/flink run -c org.apache.flink.examples.java.wordcount.WordCount \ ./examples/batch/WordCount.jar Kalavri | Boston University 2020 Resources • Documentation • https://flink.apache.org/ • Community • https://flink.apache.org/community.html#mailing-lists • Conference • http://flink-forward.org/0 码力 | 26 页 | 3.33 MB | 1 年前3 Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics SpringKalavri vkalavri@bu.edu Spring 2020 1/30: Introduction to Apache Flink and Apache Kafka Vasiliki Kalavri | Boston University 2020 Apache Flink • An open-source, distributed data analysis framework file:///home/user/wordcount_out Run with a class entry point and arguments: ./bin/flink run -c org.apache.flink.examples.java.wordcount.WordCount \ ./examples/batch/WordCount.jar Kalavri | Boston University 2020 Resources • Documentation • https://flink.apache.org/ • Community • https://flink.apache.org/community.html#mailing-lists • Conference • http://flink-forward.org/0 码力 | 26 页 | 3.33 MB | 1 年前3
 Apache Flink的过去、现在和未来Apache Flink的过去、现在和未来 杨克特(鲁尼) 阿里巴巴高级技术专家 过去 一切从2014年开始 2009 - 2014 2014 • 柏林工业大学博士生项目 • 基于流式 runtime 的批处理引擎 • 2014 年 8 月份 发布 Flink 0.6.0 Flink 0.7 Runtime Distributed Streaming Dataflow DataStream Manager Cluster Manager Task Manager 1. Submit job 2. Start job 3. Request slots 4. Allocate Container 5. Start Task Manager 6. Schedule Task YARN RM K8S RM 增量 Checkpoint 时间 全量状态 增量状态 增量 snapshot Processing & Streaming Analytics Event-driven Applications ✔ ✔ ✔ 扫码加入社群 与志同道合的码友一起 Code Up 阿里云开发者社区 Apache Flink China 2群 粘贴二维码 谢谢!0 码力 | 33 页 | 3.36 MB | 1 年前3 Apache Flink的过去、现在和未来Apache Flink的过去、现在和未来 杨克特(鲁尼) 阿里巴巴高级技术专家 过去 一切从2014年开始 2009 - 2014 2014 • 柏林工业大学博士生项目 • 基于流式 runtime 的批处理引擎 • 2014 年 8 月份 发布 Flink 0.6.0 Flink 0.7 Runtime Distributed Streaming Dataflow DataStream Manager Cluster Manager Task Manager 1. Submit job 2. Start job 3. Request slots 4. Allocate Container 5. Start Task Manager 6. Schedule Task YARN RM K8S RM 增量 Checkpoint 时间 全量状态 增量状态 增量 snapshot Processing & Streaming Analytics Event-driven Applications ✔ ✔ ✔ 扫码加入社群 与志同道合的码友一起 Code Up 阿里云开发者社区 Apache Flink China 2群 粘贴二维码 谢谢!0 码力 | 33 页 | 3.36 MB | 1 年前3
 监控Apache Flink应用程序(入门)监控Apache Flink应用程序(入门) caolei Exported on 01/10/2020 caolei – 监控Apache Flink应用程序(入门) – 2 Table of Contents 1 Flink指标体系 ...................................................................... ............................................................................... 21 caolei – 监控Apache Flink应用程序(入门) – 3 4.13.2.1 Key Metrics ..................................................... caolei – 监控Apache Flink应用程序(入门) – 4 原文地址:https://www.ververica.com/blog/monitoring-apache-flink-applications-101 这篇博文介绍了Apache Flink内置的监控和度量系统,通过该系统,开发人员可以有效地监控他们的Flink作 业。通常,对于一个刚刚开始使用Apache Flink进行0 码力 | 23 页 | 148.62 KB | 1 年前3 监控Apache Flink应用程序(入门)监控Apache Flink应用程序(入门) caolei Exported on 01/10/2020 caolei – 监控Apache Flink应用程序(入门) – 2 Table of Contents 1 Flink指标体系 ...................................................................... ............................................................................... 21 caolei – 监控Apache Flink应用程序(入门) – 3 4.13.2.1 Key Metrics ..................................................... caolei – 监控Apache Flink应用程序(入门) – 4 原文地址:https://www.ververica.com/blog/monitoring-apache-flink-applications-101 这篇博文介绍了Apache Flink内置的监控和度量系统,通过该系统,开发人员可以有效地监控他们的Flink作 业。通常,对于一个刚刚开始使用Apache Flink进行0 码力 | 23 页 | 148.62 KB | 1 年前3
 Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics SpringVasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 3/24: Exactly-once fault-tolerance in Apache Flink ??? Vasiliki Kalavri | Boston University 2020 Some slides in this lecture have been generously Protocol Output Logs 38 ??? Vasiliki Kalavri | Boston University 2020 Asynchronous checkpoints in Apache Flink 39 ??? Vasiliki Kalavri | Boston University 2020 40 • A source of increasing numbers partitioned the position up to which they were consumed when the checkpoint was taken. • Event logs like Apache Kafka can provide records from a previous offset of the stream. 43 ??? Vasiliki Kalavri | Boston0 码力 | 81 页 | 13.18 MB | 1 年前3 Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics SpringVasiliki (Vasia) Kalavri vkalavri@bu.edu Spring 2020 3/24: Exactly-once fault-tolerance in Apache Flink ??? Vasiliki Kalavri | Boston University 2020 Some slides in this lecture have been generously Protocol Output Logs 38 ??? Vasiliki Kalavri | Boston University 2020 Asynchronous checkpoints in Apache Flink 39 ??? Vasiliki Kalavri | Boston University 2020 40 • A source of increasing numbers partitioned the position up to which they were consumed when the checkpoint was taken. • Event logs like Apache Kafka can provide records from a previous offset of the stream. 43 ??? Vasiliki Kalavri | Boston0 码力 | 81 页 | 13.18 MB | 1 年前3
 PyFlink 1.15 Documentation. . . . . . 26 1.3.4.1 O1: Could not find any factory for identifier ‘xxx’ that implements ‘org.apache.flink.table.factories.DynamicTableFactory’ in the classpath . . . . . . . 26 1.3.4.2 O2: ClassNotFoundException: . . . . . . . . . 29 1.3.4.3 O3: NoSuchMethodError: org.apache.flink.table.factories.DynamicTableFactory$Context.getCatalogTable()Lorg/apache/flink/table/catalog/CatalogTable 30 1.3.5 Runtime issues large . . . . . . . . . . . . . . . . . . . . 30 1.3.5.2 Q2: An error occurred while calling z:org.apache.flink.client.python.PythonEnvUtils.resetCallbackClient 30 1.3.6 Data type issues . . . . . . .0 码力 | 36 页 | 266.77 KB | 1 年前3 PyFlink 1.15 Documentation. . . . . . 26 1.3.4.1 O1: Could not find any factory for identifier ‘xxx’ that implements ‘org.apache.flink.table.factories.DynamicTableFactory’ in the classpath . . . . . . . 26 1.3.4.2 O2: ClassNotFoundException: . . . . . . . . . 29 1.3.4.3 O3: NoSuchMethodError: org.apache.flink.table.factories.DynamicTableFactory$Context.getCatalogTable()Lorg/apache/flink/table/catalog/CatalogTable 30 1.3.5 Runtime issues large . . . . . . . . . . . . . . . . . . . . 30 1.3.5.2 Q2: An error occurred while calling z:org.apache.flink.client.python.PythonEnvUtils.resetCallbackClient 30 1.3.6 Data type issues . . . . . . .0 码力 | 36 页 | 266.77 KB | 1 年前3
 PyFlink 1.16 Documentation. . . . . . 26 1.3.4.1 O1: Could not find any factory for identifier ‘xxx’ that implements ‘org.apache.flink.table.factories.DynamicTableFactory’ in the classpath . . . . . . . 26 1.3.4.2 O2: ClassNotFoundException: . . . . . . . . . 29 1.3.4.3 O3: NoSuchMethodError: org.apache.flink.table.factories.DynamicTableFactory$Context.getCatalogTable()Lorg/apache/flink/table/catalog/CatalogTable 30 1.3.5 Runtime issues large . . . . . . . . . . . . . . . . . . . . 30 1.3.5.2 Q2: An error occurred while calling z:org.apache.flink.client.python.PythonEnvUtils.resetCallbackClient 30 1.3.6 Data type issues . . . . . . .0 码力 | 36 页 | 266.80 KB | 1 年前3 PyFlink 1.16 Documentation. . . . . . 26 1.3.4.1 O1: Could not find any factory for identifier ‘xxx’ that implements ‘org.apache.flink.table.factories.DynamicTableFactory’ in the classpath . . . . . . . 26 1.3.4.2 O2: ClassNotFoundException: . . . . . . . . . 29 1.3.4.3 O3: NoSuchMethodError: org.apache.flink.table.factories.DynamicTableFactory$Context.getCatalogTable()Lorg/apache/flink/table/catalog/CatalogTable 30 1.3.5 Runtime issues large . . . . . . . . . . . . . . . . . . . . 30 1.3.5.2 Q2: An error occurred while calling z:org.apache.flink.client.python.PythonEnvUtils.resetCallbackClient 30 1.3.6 Data type issues . . . . . . .0 码力 | 36 页 | 266.80 KB | 1 年前3
 Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020algorithms Vasiliki Kalavri | Boston University 2020 Tools Apache Flink: flink.apache.org Apache Kafka: kafka.apache.org Apache Beam: beam.apache.org Google Cloud Platform: cloud.google.com 5 Vasiliki comprehensively compare features and processing guarantees of streaming systems • be proficient in using Apache Flink and Kafka to build end-to-end, scalable, and reliable streaming applications • have a solid Official Semester Dates 11 Vasiliki Kalavri | Boston University 2020 Final Project You will use Apache Flink and Kafka to build a real-time monitoring and anomaly detection framework for datacenters0 码力 | 34 页 | 2.53 MB | 1 年前3 Course introduction - CS 591 K1: Data Stream Processing and Analytics Spring 2020algorithms Vasiliki Kalavri | Boston University 2020 Tools Apache Flink: flink.apache.org Apache Kafka: kafka.apache.org Apache Beam: beam.apache.org Google Cloud Platform: cloud.google.com 5 Vasiliki comprehensively compare features and processing guarantees of streaming systems • be proficient in using Apache Flink and Kafka to build end-to-end, scalable, and reliable streaming applications • have a solid Official Semester Dates 11 Vasiliki Kalavri | Boston University 2020 Final Project You will use Apache Flink and Kafka to build a real-time monitoring and anomaly detection framework for datacenters0 码力 | 34 页 | 2.53 MB | 1 年前3
 Scalable Stream Processing - Spark Streaming and FlinkWord Count in Spark Streaming (1/6) ▶ First we create a StreamingContex import org.apache.spark._ import org.apache.spark.streaming._ // Create a local StreamingContext with two working threads and batch hour. ▶ Store the result in MySQL. [https://databricks.com/blog/2016/07/28/structured-streaming-in-apache-spark.html] 60 / 79 Structured Streaming Example (2/3) ▶ We could express it as the following Fault-Tolerant Model for Stream Processing on Large Clusters”, HotCloud’12. ▶ P. Carbone et al., “Apache flink: Stream and batch processing in a single engine”, 2015. ▶ Some slides were derived from Heather0 码力 | 113 页 | 1.22 MB | 1 年前3 Scalable Stream Processing - Spark Streaming and FlinkWord Count in Spark Streaming (1/6) ▶ First we create a StreamingContex import org.apache.spark._ import org.apache.spark.streaming._ // Create a local StreamingContext with two working threads and batch hour. ▶ Store the result in MySQL. [https://databricks.com/blog/2016/07/28/structured-streaming-in-apache-spark.html] 60 / 79 Structured Streaming Example (2/3) ▶ We could express it as the following Fault-Tolerant Model for Stream Processing on Large Clusters”, HotCloud’12. ▶ P. Carbone et al., “Apache flink: Stream and batch processing in a single engine”, 2015. ▶ Some slides were derived from Heather0 码力 | 113 页 | 1.22 MB | 1 年前3
 State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020current record so that all records with the same key access the same state State management in Apache Flink 5 Vasiliki Kalavri | Boston University 2020 Operator state Keyed state State types 6 byte streams. https://rocksdb.org/ https://www.ververica.com/blog/manage-rocksdb-memory-size-apache-flink Vasiliki Kalavri | Boston University 2020 • RocksDB is a persistent key value store: data University 2020 • Working with State: https://ci.apache.org/projects/flink/flink-docs- release-1.10/dev/stream/state/state.html • Managing State in Apache Flink - Tzu-Li (Gordon) Tai: https:// www.youtube0 码力 | 24 页 | 914.13 KB | 1 年前3 State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020current record so that all records with the same key access the same state State management in Apache Flink 5 Vasiliki Kalavri | Boston University 2020 Operator state Keyed state State types 6 byte streams. https://rocksdb.org/ https://www.ververica.com/blog/manage-rocksdb-memory-size-apache-flink Vasiliki Kalavri | Boston University 2020 • RocksDB is a persistent key value store: data University 2020 • Working with State: https://ci.apache.org/projects/flink/flink-docs- release-1.10/dev/stream/state/state.html • Managing State in Apache Flink - Tzu-Li (Gordon) Tai: https:// www.youtube0 码力 | 24 页 | 914.13 KB | 1 年前3
共 20 条
- 1
- 2













