 PyFlink 1.15 Documentationhtml in the Browser 1.1 Getting Started This page summarizes the basic steps required to setup and get started with PyFlink. There are live notebooks where you can try PyFlink out without any other step: following checks. Check the logging messages in the log file to see if there are any problems: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os If there any any problems, you could check the logging messages in the log file as following: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os0 码力 | 36 页 | 266.77 KB | 1 年前3 PyFlink 1.15 Documentationhtml in the Browser 1.1 Getting Started This page summarizes the basic steps required to setup and get started with PyFlink. There are live notebooks where you can try PyFlink out without any other step: following checks. Check the logging messages in the log file to see if there are any problems: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os If there any any problems, you could check the logging messages in the log file as following: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os0 码力 | 36 页 | 266.77 KB | 1 年前3
 PyFlink 1.16 Documentationhtml in the Browser 1.1 Getting Started This page summarizes the basic steps required to setup and get started with PyFlink. There are live notebooks where you can try PyFlink out without any other step: following checks. Check the logging messages in the log file to see if there are any problems: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os If there any any problems, you could check the logging messages in the log file as following: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os0 码力 | 36 页 | 266.80 KB | 1 年前3 PyFlink 1.16 Documentationhtml in the Browser 1.1 Getting Started This page summarizes the basic steps required to setup and get started with PyFlink. There are live notebooks where you can try PyFlink out without any other step: following checks. Check the logging messages in the log file to see if there are any problems: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os If there any any problems, you could check the logging messages in the log file as following: # Get the installation directory of PyFlink python3 -c "import pyflink;import os;print(os.path.dirname(os0 码力 | 36 页 | 266.80 KB | 1 年前3
 Streaming optimizations	- CS 591 K1: Data Stream Processing and Analytics Spring 2020Boston University 2020 MapReduce combiners example: URL access frequency 26 map() reduce() GET /dumprequest HTTP/1.1 Host: rve.org.uk Connection: keep-alive Accept: text/html,application/ xhtml+xml https://www.google.be/ Accept-Language: en-US,en;q=0.8 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3 GET /dumprequest HTTP/1.1 Host: rve.org.uk Connection: keep-alive Accept: text/html,application/ xhtml+xml https://www.google.be/ Accept-Language: en-US,en;q=0.8 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3 GET /dumprequest HTTP/1.1 Host: rve.org.uk Connection: keep-alive Accept: text/html,application/ xhtml+xml0 码力 | 54 页 | 2.83 MB | 1 年前3 Streaming optimizations	- CS 591 K1: Data Stream Processing and Analytics Spring 2020Boston University 2020 MapReduce combiners example: URL access frequency 26 map() reduce() GET /dumprequest HTTP/1.1 Host: rve.org.uk Connection: keep-alive Accept: text/html,application/ xhtml+xml https://www.google.be/ Accept-Language: en-US,en;q=0.8 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3 GET /dumprequest HTTP/1.1 Host: rve.org.uk Connection: keep-alive Accept: text/html,application/ xhtml+xml https://www.google.be/ Accept-Language: en-US,en;q=0.8 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3 GET /dumprequest HTTP/1.1 Host: rve.org.uk Connection: keep-alive Accept: text/html,application/ xhtml+xml0 码力 | 54 页 | 2.83 MB | 1 年前3
 State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020• The keys are ordered according to a user-specified comparator function. Basic operations • Get(key): fetch a single key-value from the DB • Put(key, val): insert a single key-value into the DB State.get(): Iterable[T] • ListState.update(values: java.util.List[T]) Flink’s state primitives 13 Vasiliki Kalavri | Boston University 2020 • MapState[K, V]: a map of keys and values • get(key: ReducingState[T]: aggregates values using a ReduceFunction • ReducingState.add(value: T) • ReducingState.get() • AggregatingState[I, O]: aggregates values using an AggregateFunction Flink’s state primitives0 码力 | 24 页 | 914.13 KB | 1 年前3 State management - CS 591 K1: Data Stream Processing and Analytics Spring 2020• The keys are ordered according to a user-specified comparator function. Basic operations • Get(key): fetch a single key-value from the DB • Put(key, val): insert a single key-value into the DB State.get(): Iterable[T] • ListState.update(values: java.util.List[T]) Flink’s state primitives 13 Vasiliki Kalavri | Boston University 2020 • MapState[K, V]: a map of keys and values • get(key: ReducingState[T]: aggregates values using a ReduceFunction • ReducingState.add(value: T) • ReducingState.get() • AggregatingState[I, O]: aggregates values using an AggregateFunction Flink’s state primitives0 码力 | 24 页 | 914.13 KB | 1 年前3
 Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020How can we select a representative sample of an unbounded stream? • we want to ask queries and get statistically meaningful answers about the entire stream • we don’t necessarily know the queries How can we select a representative sample of an unbounded stream? • we want to ask queries and get statistically meaningful answers about the entire stream • we don’t necessarily know the queries users by hashing usernames to b buckets and selecting the query if h(user) < a. For example, to get a 30% sample: • use 10 buckets, b0, b1, …, b9. • select the query if the user hash value is in0 码力 | 74 页 | 1.06 MB | 1 年前3 Filtering and sampling streams - CS 591 K1: Data Stream Processing and Analytics Spring 2020How can we select a representative sample of an unbounded stream? • we want to ask queries and get statistically meaningful answers about the entire stream • we don’t necessarily know the queries How can we select a representative sample of an unbounded stream? • we want to ask queries and get statistically meaningful answers about the entire stream • we don’t necessarily know the queries users by hashing usernames to b buckets and selecting the query if h(user) < a. For example, to get a 30% sample: • use 10 buckets, b0, b1, …, b9. • select the query if the user hash value is in0 码力 | 74 页 | 1.06 MB | 1 年前3
 Windows and triggers - CS 591 K1: Data Stream Processing and Analytics Spring 2020watermark public abstract long currentWatermark(); } } 19 ProcessWindowFunction interface Get start and end timestamps Iterate over the window contents Vasiliki Kalavri | Boston University 2020 override def processElement(r: SensorReading, ctx:Context, out: Collector[String]): Unit = { // get previous temperature val prevTemp = lastTemp // update last temperature lastTemp0 码力 | 35 页 | 444.84 KB | 1 年前3 Windows and triggers - CS 591 K1: Data Stream Processing and Analytics Spring 2020watermark public abstract long currentWatermark(); } } 19 ProcessWindowFunction interface Get start and end timestamps Iterate over the window contents Vasiliki Kalavri | Boston University 2020 override def processElement(r: SensorReading, ctx:Context, out: Collector[String]): Unit = { // get previous temperature val prevTemp = lastTemp // update last temperature lastTemp0 码力 | 35 页 | 444.84 KB | 1 年前3
 Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020the application developer Live state migration ??? Vasiliki Kalavri | Boston University 2020 35 get state control command Helper operators, hidden from the application developer Live state migration 2020 36 control command Live state migration ??? Vasiliki Kalavri | Boston University 2020 36 get state control command Live state migration ??? Vasiliki Kalavri | Boston University 2020 36 transfer0 码力 | 93 页 | 2.42 MB | 1 年前3 Elasticity and state migration: Part I - CS 591 K1: Data Stream Processing and Analytics Spring 2020the application developer Live state migration ??? Vasiliki Kalavri | Boston University 2020 35 get state control command Helper operators, hidden from the application developer Live state migration 2020 36 control command Live state migration ??? Vasiliki Kalavri | Boston University 2020 36 get state control command Live state migration ??? Vasiliki Kalavri | Boston University 2020 36 transfer0 码力 | 93 页 | 2.42 MB | 1 年前3
 Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020R = 6, 2R = 64 distinct elements No estimate in between powers of 2! 9 Is it good enough? To get a better estimate, we need to use multiple hash functions and combine their estimates: • Using many work: it is always a power of 2, thus, if the correct estimate is between two powers of 2, we won’t get a good estimate. Solution: harmonic mean (HyperLogLog) ̂n = am ⋅ m2 ⋅ ( m−1 ∑ j=0 2−COUNT[j])0 码力 | 69 页 | 630.01 KB | 1 年前3 Cardinality and frequency estimation - CS 591 K1: Data Stream Processing and Analytics Spring 2020R = 6, 2R = 64 distinct elements No estimate in between powers of 2! 9 Is it good enough? To get a better estimate, we need to use multiple hash functions and combine their estimates: • Using many work: it is always a power of 2, thus, if the correct estimate is between two powers of 2, we won’t get a good estimate. Solution: harmonic mean (HyperLogLog) ̂n = am ⋅ m2 ⋅ ( m−1 ∑ j=0 2−COUNT[j])0 码力 | 69 页 | 630.01 KB | 1 年前3
 Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020output: rewards based on how fast the user meets goals • e.g. pop 500 bubbles within 1 minute and get extra life Vasiliki Kalavri | Boston University 2020 What’s the meaning of one minute? 3 Vasiliki0 码力 | 22 页 | 2.22 MB | 1 年前3 Notions of time and progress - CS 591 K1: Data Stream Processing and Analytics Spring 2020output: rewards based on how fast the user meets goals • e.g. pop 500 bubbles within 1 minute and get extra life Vasiliki Kalavri | Boston University 2020 What’s the meaning of one minute? 3 Vasiliki0 码力 | 22 页 | 2.22 MB | 1 年前3
 Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020subscribers are disconnected. • Synchronization: interacting parties are not blocked • Subscribers get notified asynchronously while possibly performing some other concurrent action. 18 Paradigm Space0 码力 | 33 页 | 700.14 KB | 1 年前3 Stream ingestion and pub/sub systems - CS 591 K1: Data Stream Processing and Analytics Spring 2020subscribers are disconnected. • Synchronization: interacting parties are not blocked • Subscribers get notified asynchronously while possibly performing some other concurrent action. 18 Paradigm Space0 码力 | 33 页 | 700.14 KB | 1 年前3
共 12 条
- 1
- 2













