 PyFlink 1.15 Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . 22 1.3.1.2 O2: Java gateway process exited before sending its port number . . . . . . . . . . . 22 1.3.2 Usage issues . . . . . . . . . . . . . . virtualenv --python /path/to/python/executable venv The virtual environment needs to be activated before to use it. To activate the virtual environment, run: source venv/bin/activate That is, execute venv conda create --name venv python=3.8 -y The conda virtual environment needs to be activated before to use it. To activate the conda virtual environment, run: 4 Chapter 1. How to build docs locally0 码力 | 36 页 | 266.77 KB | 1 年前3 PyFlink 1.15 Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . 22 1.3.1.2 O2: Java gateway process exited before sending its port number . . . . . . . . . . . 22 1.3.2 Usage issues . . . . . . . . . . . . . . virtualenv --python /path/to/python/executable venv The virtual environment needs to be activated before to use it. To activate the virtual environment, run: source venv/bin/activate That is, execute venv conda create --name venv python=3.8 -y The conda virtual environment needs to be activated before to use it. To activate the conda virtual environment, run: 4 Chapter 1. How to build docs locally0 码力 | 36 页 | 266.77 KB | 1 年前3
 PyFlink 1.16 Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . 22 1.3.1.2 O2: Java gateway process exited before sending its port number . . . . . . . . . . . 22 1.3.2 Usage issues . . . . . . . . . . . . . . virtualenv --python /path/to/python/executable venv The virtual environment needs to be activated before to use it. To activate the virtual environment, run: source venv/bin/activate That is, execute venv conda create --name venv python=3.8 -y The conda virtual environment needs to be activated before to use it. To activate the conda virtual environment, run: 4 Chapter 1. How to build docs locally0 码力 | 36 页 | 266.80 KB | 1 年前3 PyFlink 1.16 Documentation. . . . . . . . . . . . . . . . . . . . . . . . . . 22 1.3.1.2 O2: Java gateway process exited before sending its port number . . . . . . . . . . . 22 1.3.2 Usage issues . . . . . . . . . . . . . . virtualenv --python /path/to/python/executable venv The virtual environment needs to be activated before to use it. To activate the virtual environment, run: source venv/bin/activate That is, execute venv conda create --name venv python=3.8 -y The conda virtual environment needs to be activated before to use it. To activate the conda virtual environment, run: 4 Chapter 1. How to build docs locally0 码力 | 36 页 | 266.80 KB | 1 年前3
 Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020causality: • An event is pre-snapshot if it occurs before the local snapshot on a process, otherwise it is post- snapshot • If event A happens causally before B and B is pre-snapshot, then A is also pre-snapshot University 2020 Example p1 p2 p3 ⊙ init before marker after marker 23 m1 m2 ??? Vasiliki Kalavri | Boston University 2020 Example p1 p2 p3 ⊙ init before marker after marker s1 Snapshot 23 m1 m1 m2 ??? Vasiliki Kalavri | Boston University 2020 p1 p2 p3 ⊙ marker s1 Snapshot Example before marker after marker 24 m1 m2 recording… p1 records its state, forwards the marker, and starts0 码力 | 81 页 | 13.18 MB | 1 年前3 Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020causality: • An event is pre-snapshot if it occurs before the local snapshot on a process, otherwise it is post- snapshot • If event A happens causally before B and B is pre-snapshot, then A is also pre-snapshot University 2020 Example p1 p2 p3 ⊙ init before marker after marker 23 m1 m2 ??? Vasiliki Kalavri | Boston University 2020 Example p1 p2 p3 ⊙ init before marker after marker s1 Snapshot 23 m1 m1 m2 ??? Vasiliki Kalavri | Boston University 2020 p1 p2 p3 ⊙ marker s1 Snapshot Example before marker after marker 24 m1 m2 recording… p1 records its state, forwards the marker, and starts0 码力 | 81 页 | 13.18 MB | 1 年前3
 Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020Producers can generate events in a higher rate than the rate consumers can process events. • What happens if consumers cannot keep up with the event rate? 2 ??? Vasiliki Kalavri | Boston University 2020 Producers can generate events in a higher rate than the rate consumers can process events. • What happens if consumers cannot keep up with the event rate? • drop messages 2 ??? Vasiliki Kalavri | Boston Producers can generate events in a higher rate than the rate consumers can process events. • What happens if consumers cannot keep up with the event rate? • drop messages • buffer messages in a queue:0 码力 | 43 页 | 2.42 MB | 1 年前3 Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020Producers can generate events in a higher rate than the rate consumers can process events. • What happens if consumers cannot keep up with the event rate? 2 ??? Vasiliki Kalavri | Boston University 2020 Producers can generate events in a higher rate than the rate consumers can process events. • What happens if consumers cannot keep up with the event rate? • drop messages 2 ??? Vasiliki Kalavri | Boston Producers can generate events in a higher rate than the rate consumers can process events. • What happens if consumers cannot keep up with the event rate? • drop messages • buffer messages in a queue:0 码力 | 43 页 | 2.42 MB | 1 年前3
 Scalable Stream Processing - Spark Streaming and Flinkoperators to emit all records that depend only on records before the barrier. ▶ Once all sinks have received the barriers, Flink knows that all records before the barriers will never be needed again. ▶ Asynchronous operators to emit all records that depend only on records before the barrier. ▶ Once all sinks have received the barriers, Flink knows that all records before the barriers will never be needed again. ▶ Asynchronous0 码力 | 113 页 | 1.22 MB | 1 年前3 Scalable Stream Processing - Spark Streaming and Flinkoperators to emit all records that depend only on records before the barrier. ▶ Once all sinks have received the barriers, Flink knows that all records before the barriers will never be needed again. ▶ Asynchronous operators to emit all records that depend only on records before the barrier. ▶ Once all sinks have received the barriers, Flink knows that all records before the barriers will never be needed again. ▶ Asynchronous0 码力 | 113 页 | 1.22 MB | 1 年前3
 High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020different output 9 Vasiliki Kalavri | Boston University 2020 Outputs after recovery 10 Recovery type Before failure After failure Precise t1 t2 t3 t4 t5 t6 … Gap t1 t2 t3 t5 t6 … Rollback-repeating address non-determinism • Each output is checkpointed together with its unique ID to stable storage before being delivered to the next stage • Retries simply replay the output that has been checkpointed0 码力 | 49 页 | 2.08 MB | 1 年前3 High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020different output 9 Vasiliki Kalavri | Boston University 2020 Outputs after recovery 10 Recovery type Before failure After failure Precise t1 t2 t3 t4 t5 t6 … Gap t1 t2 t3 t5 t6 … Rollback-repeating address non-determinism • Each output is checkpointed together with its unique ID to stable storage before being delivered to the next stage • Retries simply replay the output that has been checkpointed0 码力 | 49 页 | 2.08 MB | 1 年前3
 监控Apache Flink应用程序(入门)event become visible. Once the event is created it is usually stored in a persistent message queue, before it is processed by Apache Flink, which then writes the results to a database or calls a downstream common reason is skew in the partition key of the data, which can be mitigated by pre-aggregating before the shuffle or keying on a more evenly distributed key. 4.13.2.1 Key Metrics Metrics Scope0 码力 | 23 页 | 148.62 KB | 1 年前3 监控Apache Flink应用程序(入门)event become visible. Once the event is created it is usually stored in a persistent message queue, before it is processed by Apache Flink, which then writes the results to a database or calls a downstream common reason is skew in the partition key of the data, which can be mitigated by pre-aggregating before the shuffle or keying on a more evenly distributed key. 4.13.2.1 Key Metrics Metrics Scope0 码力 | 23 页 | 148.62 KB | 1 年前3
 Streaming optimizations	- CS 591 K1: Data Stream Processing and Analytics Spring 2020intermediate data • operator properties • How can we estimate the cost of different strategies? • before execution or during runtime Query optimization (I) ??? Vasiliki Kalavri | Boston University 2020 tasks to receiving tasks. • The network component of a TaskManager collects records in buffers before they are shipped, i.e., records are not shipped one by one but batched. ??? Vasiliki Kalavri |0 码力 | 54 页 | 2.83 MB | 1 年前3 Streaming optimizations	- CS 591 K1: Data Stream Processing and Analytics Spring 2020intermediate data • operator properties • How can we estimate the cost of different strategies? • before execution or during runtime Query optimization (I) ??? Vasiliki Kalavri | Boston University 2020 tasks to receiving tasks. • The network component of a TaskManager collects records in buffers before they are shipped, i.e., records are not shipped one by one but batched. ??? Vasiliki Kalavri |0 码力 | 54 页 | 2.83 MB | 1 年前3
 Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020When a query arrives: • if the user is sampled: add the query to S • if we haven’t seen the user before: generate a random integer ru between 0 and 9 and add the user to the sample if ru = 0. ??? Vasiliki When a query arrives: • if the user is sampled: add the query to S • if we haven’t seen the user before: generate a random integer ru between 0 and 9 and add the user to the sample if ru = 0. Do we0 码力 | 74 页 | 1.06 MB | 1 年前3 Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020When a query arrives: • if the user is sampled: add the query to S • if we haven’t seen the user before: generate a random integer ru between 0 and 9 and add the user to the sample if ru = 0. ??? Vasiliki When a query arrives: • if the user is sampled: add the query to S • if we haven’t seen the user before: generate a random integer ru between 0 and 9 and add the user to the sample if ru = 0. Do we0 码力 | 74 页 | 1.06 MB | 1 年前3
 Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020plane and not on a train? • What if you never came back online? • How long do we have to wait before we decide that we have seen all events? How do we know what event time it is? 7 Vasiliki Kalavri0 码力 | 22 页 | 2.22 MB | 1 年前3 Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020plane and not on a train? • What if you never came back online? • How long do we have to wait before we decide that we have seen all events? How do we know what event time it is? 7 Vasiliki Kalavri0 码力 | 22 页 | 2.22 MB | 1 年前3
共 14 条
- 1
- 2













