Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020
rows ... Data Stream Management System • continuous queries • sequential data access, high-rate append-only updates ... Data Warehouse • complex, offline analysis • large and relatively static and ... Data: persistent relations | streams; Data Access: random | sequential, single-pass; Updates: arbitrary | append-only; Update rates: relatively low | high, bursty; Processing Model: query-driven / pull-based | data-driven ... collection of IP addresses accessing a web server ... With some practical value for use-cases with append-only data ... It preserves all history without the option to discard old events ... Vasiliki Kalavri |
0 credits | 45 pages | 1.22 MB | 1 year ago
Streaming languages and operator semantics - CS 591 K1: Data Stream Processing and Analytics Spring 2020
New streams (derived) are defined as virtual views in SQL • Semantics are equivalent to having an append-only table to which new tuples are continuously added. ... Vasiliki Kalavri | Boston University ... itemID, start_price, start_time FROM OpenAuction WHERE start_price > 1000 ... Derived stream as an append-only table. ... User-Defined Aggregates (UDAs) ... Constructs operations for each new arriving tuple: 1. Append the encoded new tuple to IN, 2. Copy IN to TAPE, and compute F(IN) − OUT, 3. Return the result obtained in 2 and append it to OUT. ... Non-blocking
0 credits | 53 pages | 532.37 KB | 1 year ago
Scalable Stream Processing - Spark Streaming and Flink
updates the result. ... Programming Model (2/2) ... Output Modes ▶ Three output modes: 1. Append: only the new rows appended to the result table since the last trigger will be written to the external ... can be updated in place, such as a MySQL table.
0 credits | 113 pages | 1.22 MB | 1 year ago
Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020
approach and durably store all events in a sequential (possibly partitioned) log • A log is an append-only sequence of records on disk • a producer generates messages by simply appending them to the
0 credits | 33 pages | 700.14 KB | 1 year ago
4 results in total













