PyFlink 1.15 Documentation
Supported Python versions: Python 3.6 to 3.9; PyFlink 1.15: Python 3.6 to 3.8; PyFlink 1.14: Python 3.6 to 3.8. You can check your Python version as follows: python3 --version. It is recommended to use a Python virtual environment in production when there are massive Python dependencies; Python virtual environments are supported in your PyFlink jobs, see PyFlink Dependency Management for more details. Create a virtual environment using venv: to activate the environment, run source venv/bin/activate, that is, execute the activate script under the bin directory of your virtual environment. Create a virtual environment using conda: to create a virtual environment using …
36 pages | 266.77 KB | 1 year ago
PyFlink 1.16 Documentation
Supported Python versions: Python 3.6 to 3.9; PyFlink 1.15: Python 3.6 to 3.8; PyFlink 1.14: Python 3.6 to 3.8. You can check your Python version as follows: python3 --version. It is recommended to use a Python virtual environment in production when there are massive Python dependencies; Python virtual environments are supported in your PyFlink jobs, see PyFlink Dependency Management for more details. Create a virtual environment using venv: to activate the environment, run source venv/bin/activate, that is, execute the activate script under the bin directory of your virtual environment. Create a virtual environment using conda: to create a virtual environment using …
36 pages | 266.80 KB | 1 year ago
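The virtual-environment steps in the PyFlink excerpts above can be sketched programmatically. This is my own illustration, not code from the docs; the throwaway directory and the with_pip choice are assumptions:

```python
# My own sketch (not from the PyFlink docs): create a virtual environment with
# the stdlib venv module, the programmatic equivalent of `python3 -m venv venv`.
# Activating it remains a shell step: `source venv/bin/activate`.
import os
import tempfile
import venv

env_dir = os.path.join(tempfile.mkdtemp(), "venv")  # throwaway location
venv.EnvBuilder(with_pip=False).create(env_dir)     # with_pip=False keeps it fast

# The activate script lives under bin/ (Scripts\ on Windows).
bin_dir = "Scripts" if os.name == "nt" else "bin"
print(os.path.exists(os.path.join(env_dir, bin_dir, "activate")))
```

On the command line the same result comes from `python3 -m venv venv` followed by `source venv/bin/activate`.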
Course introduction - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
… and reliable streaming applications • have a solid understanding of how stream processing systems work and what factors affect their performance • be aware of the challenges and trade-offs one needs … Flink and Kafka to build a real-time monitoring and anomaly detection framework for datacenters. Your framework will: • Detect "suspicious" event patterns • Raise alerts for abnormal system metrics … transaction analysis • Fraud detection, online risk calculation. Example: someone steals your phone and signs in to your banking app. The app allows transfers of up to €1000, and so the thief makes transfers …
34 pages | 2.53 MB | 1 year ago
State management - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
In flink-conf.yaml: # state.backend: filesystem / # Directory for checkpoints / # state.checkpoints.dir: path/to/checkpoint/folder/. In your Flink program: val env = StreamExecutionEnvironment.getExecutionEnvironment; val checkpointPath: … snapshotState(long checkpointId, long timestamp); void restoreState(List<Long> state). Operator state • A function can work with operator list state by implementing the ListCheckpointed interface • snapshotState() is invoked …
24 pages | 914.13 KB | 1 year ago
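To make the ListCheckpointed idea in the excerpt concrete, here is a toy Python sketch; the CountOperator class and its method names are invented for illustration (Flink's real interface is a Java/Scala API):

```python
# Toy sketch, not the real Flink API: an operator exposes snapshot/restore hooks
# in the spirit of Flink's ListCheckpointed interface.
class CountOperator:
    def __init__(self):
        self.count = 0

    def process(self, record):
        self.count += 1

    def snapshot_state(self):
        # Called when a checkpoint is taken; returning a list lets the state be
        # split and redistributed across parallel instances on restore.
        return [self.count]

    def restore_state(self, state):
        self.count = sum(state)

op = CountOperator()
for record in range(5):
    op.process(record)
saved = op.snapshot_state()

# Simulate recovery: a fresh instance restores from the checkpointed list.
restored = CountOperator()
restored.restore_state(saved)
print(restored.count)  # → 5
```

Returning state as a list is what allows Flink to re-partition it when the operator's parallelism changes on restore.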
Monitoring Apache Flink Applications (Getting Started)
… in terms of the number of records for any partition in this window. An increasing value over time is your best indication that the consumer group is not keeping up with the producers. millisBehindLatest … buffer events for some time (e.g. in a time window) for functional reasons. 4. Each computation in your Flink topology (framework or user code), as well as each network shuffle, takes time and adds to … checkpointing interval for each record. In practice, it has proven invaluable to add timestamps to your events at multiple stages (at least at creation, persistence, ingestion by Flink, publication by …
23 pages | 148.62 KB | 1 year ago
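The excerpt's advice to timestamp events at multiple stages can be illustrated with a small sketch; the stamp() helper and the event shape are invented, while the stage names follow the excerpt:

```python
# Sketch: record a timestamp at each pipeline stage so end-to-end latency can
# be decomposed later. The stamp() helper is invented for illustration.
import time

def stamp(event, stage):
    event.setdefault("timestamps", {})[stage] = time.time()
    return event

event = {"payload": "order-42"}
for stage in ("creation", "persistence", "ingestion", "publication"):
    stamp(event, stage)

# Differences between adjacent stage timestamps show where time is spent.
latency = event["timestamps"]["publication"] - event["timestamps"]["creation"]
print(latency >= 0.0)
```

In a real pipeline each stamp would be applied by a different component (producer, broker, Flink source, sink), not in one loop.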
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
Kalavri | Boston University 2020. Combining estimates • Average won't work: the expected value of 2^R is too large. • Median won't work: it is always a power of 2; thus, if the correct estimate is between …
69 pages | 630.01 KB | 1 year ago
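A rough simulation of the combining strategy this excerpt leads up to: since a single 2^R estimate is noisy, average within small groups and then take the median across groups. The group sizes, seeds, and hashing scheme below are my assumptions, not the slides':

```python
# Flajolet-Martin-style sketch: each estimator tracks R, the max number of
# trailing zeros in hashed items, and reports 2**R; estimates are combined by
# averaging within groups and taking the median of the group means.
import random
import statistics

def trailing_zeros(x, width=32):
    if x == 0:
        return width
    n = 0
    while x & 1 == 0:
        x >>= 1
        n += 1
    return n

def fm_estimate(items, seed):
    mask = random.Random(seed).getrandbits(32)
    r = 0
    for item in items:
        h = hash((mask, item)) & 0xFFFFFFFF  # pseudo-hash for illustration
        r = max(r, trailing_zeros(h))
    return 2 ** r

items = range(10_000)  # 10,000 distinct elements
groups = [[fm_estimate(items, 10 * g + s) for s in range(8)] for g in range(5)]
estimate = statistics.median(statistics.mean(group) for group in groups)
print(estimate >= 1)
```

Averaging smooths the power-of-2 granularity the median alone cannot escape, and the outer median discards groups skewed by one unlucky estimator.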
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
… operators can be placed at any location in the query plan • Dropping near the source avoids wasting work, but it might affect the results of multiple queries if the source is connected to multiple queries.
43 pages | 2.42 MB | 1 year ago
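A minimal random load-shedding sketch to accompany the excerpt; the drop probability, names, and generator shape are assumptions, not from the slides:

```python
# Random load shedding: each tuple survives independently with
# probability 1 - drop_prob. Such a drop operator can sit anywhere
# in the plan; placing it near the source avoids wasted downstream work.
import random

def shed(stream, drop_prob, rng):
    for item in stream:
        if rng.random() >= drop_prob:
            yield item

rng = random.Random(42)  # fixed seed for reproducibility
kept = list(shed(range(10_000), drop_prob=0.3, rng=rng))
keep_rate = len(kept) / 10_000
print(round(keep_rate, 2))
```

The observed keep rate hovers around 0.7 for this drop probability; a real shedder would also have to account for result-quality guarantees across the queries sharing the source.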
Graph streaming algorithms - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
Bipartite graph checking: • Edge endpoints must have different signs • When merging components, if flipping all signs doesn't work => the graph is not bipartite
72 pages | 7.77 MB | 1 year ago
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
Algebraic re-orderings: … the size of intermediate results • execute selective joins first => follow-up joins will have less work to do. Safety • Ensure …
54 pages | 2.83 MB | 1 year ago
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics, Spring 2020
… for data streams • patterns, transformations, declarative • traditional blocking operators don't work on streams • non-blocking versions or windows • how to define non-blocking aggregates • NB-SQL …
53 pages | 532.37 KB | 1 year ago
12 results in total.













