efficient deep learning - IT文库_程序员IT互联网编程电子书和文档免费下载，助您码力十足！

首页文库资料文章资讯上传文档发布文章登录账户

State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020

computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models State in dataflow computations 2 Vasiliki Kalavri | Boston University 2020 • No explicit in Apache Flink - Tzu-Li (Gordon) Tai: https:// www.youtube.com/watch?v=euFMWFDThiE • Webinar: Deep Dive on Apache Flink State - Seth Wiesman: https:// www.youtube.com/watch?v=9GF8Hwqzwnk Further

0 码力 | 24 页 | 914.13 KB | 1 年前
3
High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020

computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models State in dataflow computations 3 Vasiliki Kalavri | Boston University 2020 Logic State computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models State in dataflow computations 3 Vasiliki Kalavri | Boston University 2020 Logic State computation maintains state: • rolling aggregations • window contents • input offsets • machine learning models State in dataflow computations 3 Vasiliki Kalavri | Boston University 2020 4 Distributed

0 码力 | 49 页 | 2.08 MB | 1 年前
3
Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Batch API Historic data Kafka, RabbitMQ, ... HDFS, JDBC, ... Event logs ETL, Graphs,  Machine Learning  Relational, … Low latency,  windowing, aggregations, ... 2 Vasiliki Kalavri | Boston University

0 码力 | 26 页 | 3.33 MB | 1 年前
3
Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020

user queries approximate results ??? Vasiliki Kalavri | Boston University 2020 A simple and efficient synopsis Suppose that our data consists of a large numeric time series. What summary would let this series? 3 var = ∑ (xi − μ)2 N ??? Vasiliki Kalavri | Boston University 2020 A simple and efficient synopsis Suppose that our data consists of a large numeric time series. What summary would let observations var = ∑ (xi − μ)2 N ??? Vasiliki Kalavri | Boston University 2020 A simple and efficient synopsis Suppose that our data consists of a large numeric time series. What summary would let

0 码力 | 74 页 | 1.06 MB | 1 年前
3
PyFlink 1.15 Documentation

workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you’re already familiar with Python and libraries such as Pandas

0 码力 | 36 页 | 266.77 KB | 1 年前
3
PyFlink 1.16 Documentation

workloads, such as real-time data processing pipelines, large-scale exploratory data analysis, Machine Learning (ML) pipelines and ETL processes. If you’re already familiar with Python and libraries such as Pandas

0 码力 | 36 页 | 266.80 KB | 1 年前
3
Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020

at any point in the stream. 13 It is the most general model Hard to develop space-efficient and time-efficient algorithms Vasiliki Kalavri | Boston University 2020 Relational Streaming Model Vasiliki

0 码力 | 45 页 | 1.22 MB | 1 年前
3
Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020

setMaxParallelism(1024) 33 Setting the max parallelism ??? Vasiliki Kalavri | Boston University 2020 • A Deep Dive into Rescalable State in Apache Flink: https:// flink.apache.org/features/2017/07/04/flink-rescalable-state

0 码力 | 41 页 | 4.09 MB | 1 年前
3
Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Distributed execution in Flink ??? Vasiliki Kalavri | Boston University 2020 9 Identify the most efficient way to execute a query • There may exist several ways to execute a computation • query plans plan B output Lowest-cost plan ??? Vasiliki Kalavri | Boston University 2020 12 • What does efficient mean in the context of streaming? • queries run continuously • streams are unbounded • In

0 码力 | 54 页 | 2.83 MB | 1 年前
3
Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020

working set. If consumers are slow, throughput might degrade. • DBs support secondary indexes for efficient search while MBs only offer topic-based subscription. • DB query results depend on a snapshot

0 码力 | 33 页 | 700.14 KB | 1 年前
3

共 13 条前往

页

分类

语言

格式

State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020

High-availability, recovery semantics, and guarantees - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Introduction to Apache Flink and Apache Kafka - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020

PyFlink 1.15 Documentation

PyFlink 1.16 Documentation

Stream processing fundamentals - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Fault-tolerance demo & reconfiguration - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Streaming optimizations - CS 591 K1: Data Stream Processing and Analytics Spring 2020

Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020