Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020a mathematical structure, e.g. a sequence of (finite) relation states over a common schema R: [r1(R), r2(R), ..., ], where the individual relations are unordered sets. src dest bytes 1 2 20K 2 relations. • as a list of tuple-index pairs, whereindicates that t ∈ rj • as a serialization of r1 followed by a series of delta tuples that indicate updates to make to obtain r2, r3, ..., etc. 20K 2 5 32K 2 3 28K src dest bytes 1 2 20K 2 5 32K src dest bytes 2 5 32K 2 3 28K 1 2 28K R1 R2 R3 • concatenation (1, 2, 20K), (2, 5, 32K) EOR (1, 2, 20K), (2, 5, 32K), (2, 3, 28K) EOR (2, 0 码力 | 45 页 | 1.22 MB | 1 年前3
Flow control and load shedding - CS 591 K1: Data Stream Processing and Analytics Spring 2020L2=18.75 O2 I1 c=10 s=0.5 c=10 s=0.8 c=5 s=1.0 O1 c=10 s=0.9 L1=26.5 5 14 5 5 19 5 r1=10 r/s r2=20 r/s ??? Vasiliki Kalavri | Boston University 2020 13 I2 c=10 s=0.7 c=10 s=0.5 L2=18.75 O2 I1 c=10 s=0.5 c=10 s=0.8 c=5 s=1.0 O1 c=10 s=0.9 L1=26.5 5 14 5 5 19 5 r1=10 r/s r2=20 r/s LT=? ??? Vasiliki Kalavri | Boston University 2020 13 I2 c=10 s=0.7 c=10 s=0 L2=18.75 O2 I1 c=10 s=0.5 c=10 s=0.8 c=5 s=1.0 O1 c=10 s=0.9 L1=26.5 5 14 5 5 19 5 r1=10 r/s r2=20 r/s LT=640 cycles/s ??? Vasiliki Kalavri | Boston University 2020 Reacting to overload0 码力 | 43 页 | 2.42 MB | 1 年前3
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020denoted S ⊆τ R. In general, if S1, ..., Sn and R1, ..., Rn be timestamped sequences, then (S1, ..., Sn) ⊆τ (R1, ..., Rn) when (S1, ..., Sn) = (R1, ..., Rn) for some τ . 47 Vasiliki Kalavri | Boston0 码力 | 53 页 | 532.37 KB | 1 年前3
Streaming in Apache FlinkReduceFunction{ public SensorReading reduce(SensorReading r1, SensorReading r2) { return r1.value() > r2.value() ? r1 : r2; } } private static class MyWindowFunction extends ProcessWindowFunction< 0 码力 | 45 页 | 3.00 MB | 1 年前3
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020false, it definitely isn’t Separate bloom filters for every 10-minute range to avoid saturation r1 is delivered a second time and a catalog lookup is issued to verify it is a duplicate Vasiliki false, it definitely isn’t Separate bloom filters for every 10-minute range to avoid saturation r1 is delivered a second time and a catalog lookup is issued to verify it is a duplicate r8 is a0 码力 | 49 页 | 2.08 MB | 1 年前3
Windows and triggers - CS 591 K1: Data Stream Processing and Analytics Spring 2020.map(r => (r.id, r.temperature)) .keyBy(_._1) .timeWindow(Time.seconds(15)) .reduce((r1, r2) => (r1._1, r1._2.min(r2._2))) 15 ReduceFunction example The function is evaluated for every0 码力 | 35 页 | 444.84 KB | 1 年前3
共 6 条
- 1













