Exactly-once fault-tolerance in Apache Flink - CS 591 K1: Data Stream Processing and Analytics Spring 2020sha1_base64="S9FZAoXVvp7baivRTBZw7 3oyxCU=">AB63icbVBNS8NAEJ3Ur1q/qh69LBbBU0lEUG9FLx4rGFtoY9lsJ+3SzSbsboQS+hu8eFDx6h/ y5r9x2+agrQ8GHu/NMDMvTAXxnW/ndLK6tr6RnmzsrW9s7tX3T940EmGPosEYlqh1Sj4BJ9w43AdqQxqH AVji6mfqtJ1SaJ/LejF sha1_base64="S9FZAoXVvp7baivRTBZw7 3oyxCU=">AB63icbVBNS8NAEJ3Ur1q/qh69LBbBU0lEUG9FLx4rGFtoY9lsJ+3SzSbsboQS+hu8eFDx6h/ y5r9x2+agrQ8GHu/NMDMvTAXxnW/ndLK6tr6RnmzsrW9s7tX3T940EmGPosEYlqh1Sj4BJ9w43AdqQxqH AVji6mfqtJ1SaJ/LejF sha1_base64="S9FZAoXVvp7baivRTBZw7 3oyxCU=">AB63icbVBNS8NAEJ3Ur1q/qh69LBbBU0lEUG9FLx4rGFtoY9lsJ+3SzSbsboQS+hu8eFDx6h/ y5r9x2+agrQ8GHu/NMDMvTAXxnW/ndLK6tr6RnmzsrW9s7tX3T940EmGPosEYlqh1Sj4BJ9w43AdqQxqH AVji6mfqtJ1SaJ/LejF0 码力 | 81 页 | 13.18 MB | 1 年前3
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020University 2020 Composite subscription pattern language A(X>0) & (B(Y=10);[timespan:5] C(Z<5))[within:15] A, B, C are topics X, Y, Z are inner fields The rule fires when an item of type A having an an attribute X > 0 enters the system and also an item of type B with Y = 10 is detected, followed (in a time interval of 5–15 s) by an item of type C with Z < 5. 8 Vasiliki Kalavri | Boston University BY CustomerId AS PATTERN (X Y Z) WHERE X.Event = ‘order’ AND Y.Event = ‘rebate’ AND Y.ItemID = X.ItemID AND Z.Event = ‘cancel’ AND Z.ItemID = Y.ItemID Partitions the stream into0 码力 | 53 页 | 532.37 KB | 1 年前3
Flink如何实时分析Iceberg数据湖的CDC数据a + 1 W0ERE a (100 U2,)TE G=FG SET (1,2 W0ERE a=0 )1, b=0 QH=Ey特点 1. b携带S意过滤条R; 2. 不依赖k=y; 一般uWkn行的r有列y值e新值; 数t量 a条QH=Ey更新i量数t集 a条QH=EyQ更新一行数t 计算模g 长耗时的sUN 流o增量l入 更新频率 T频更新 高频更新 pache Iceberg D585t5F685-3 D3t3F685-6 D3t3F685-7 D585t5F685-8 D585t5F685-4 2 6 3 6 2 4 5 4 App8y D585t6on App8y D585t6on App8y D585t6on App8y D585t6on 23s7- 23s7-3 23s7-2 23s7-4 -N1ER2 F-LE1 DELE2E F-LE1 u件级别n发读a 5 4 ACCly DeleFiBA ACCly DeleFiBA ACCly DeleFiBA ACCly DeleFiBA 68Ek- 68Ek-3 68Ek-2 68Ek-4 K满足y确o要求J 2Kk现高吞e写入J 3K满足n发高t读aJ 4Kb以k现EA8CEhBF级别的增量ra J 方案p结 R点 K同一N68EkV的重hDeleFe -ileb以 缓存I加速 J3I20 码力 | 36 页 | 781.69 KB | 1 年前3
PyFlink 1.15 DocumentationCreate conda virtual environment under a directory, e.g. venv conda create --name venv python=3.8 -y The conda virtual environment needs to be activated before to use it. To activate the conda virtual currently only supports Python 3.6, 3.7 and 3.8 in PyFlink officially. RUN apt-get update -y && \ apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev libffi-dev && \ wget https://www.python0 码力 | 36 页 | 266.77 KB | 1 年前3
PyFlink 1.16 DocumentationCreate conda virtual environment under a directory, e.g. venv conda create --name venv python=3.8 -y The conda virtual environment needs to be activated before to use it. To activate the conda virtual currently only supports Python 3.6, 3.7 and 3.8 in PyFlink officially. RUN apt-get update -y && \ apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev libffi-dev && \ wget https://www.python0 码力 | 36 页 | 266.80 KB | 1 年前3
Skew mitigation - CS 591 K1: Data Stream Processing and Analytics Spring 2020δ*N, where N is the number of stream elements • The solution will not contain any item y with frequency: • freq(y) < (δ - ε)*N, for a user-chosen value ε 4 (δ - ε)*Ν δ*Ν not included may be included0 码力 | 31 页 | 1.47 MB | 1 年前3
Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020then: X*.add({x, f}) // remove unpopular elements from the heap for (y, fy) in X* do: if fy <= f* then X*.remove({y, fy}) return X* Computing top-k ??? Vasiliki Kalavri | Boston University0 码力 | 69 页 | 630.01 KB | 1 年前3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020StreamExecutionEnvironment .disableOperatorChaining() val input: DataStream[X] = ... val result: DataStream[Y] = input .filter(new Filter1()) .map(new Map1()) // disable chaining for Map2 .map(new Map2()).disableChaining()0 码力 | 54 页 | 2.83 MB | 1 年前3
共 8 条
- 1













