ClickHouse on Kubernetesdeployment on AWS configures external ingress. clickhouse-client --host $AWS_ELB_HOST_NAME Replication requires Zookeeper Install minimal Zookeeper in separate namespace. kubectl create ns zoons0 码力 | 34 页 | 5.06 MB | 1 年前3
ClickHouse on Kubernetesdeployment on AWS configures external ingress. clickhouse-client --host $AWS_ELB_HOST_NAME Replication requires Zookeeper to be enabled Install minimal Zookeeper in separate namespace. kubectl0 码力 | 29 页 | 3.87 MB | 1 年前3
ClickHouse in ProductioncountIf(CounterType='Show') as SumShows, countIf(CounterType='Click') as SumClicks, BannerID FROM EventLogHDFS GROUP BY BannerID ORDER BY SumClicks desc LIMIT 3; 52 / 97 In ClickHouse: Most Clicked Banner SELECT countIf(CounterType='Show') as SumShows, countIf(CounterType='Click') as SumClicks, BannerID FROM EventLogHDFS GROUP BY BannerID ORDER BY SumClicks desc LIMIT 3; ┌─SumShows─┬─SumClicks─┬───BannerID─┐ │ 6485 │ 1015 countIf(CounterType='Show') as SumShows, countIf(CounterType='Click') as SumClicks, BannerID FROM EventLogLocal GROUP BY BannerID ORDER BY SumClicks desc LIMIT 3; 55 / 97 In ClickHouse: Query Local Copy SELECT c0 码力 | 100 页 | 6.86 MB | 1 年前3
Что нужно знать об архитектуре ClickHouse, чтобы его эффективно использоватьнеделю. SELECT Referer, count(*) AS count FROM hits WHERE CounterID = 1234 AND Date >= today() - 7 GROUP BY Referer ORDER BY count DESC LIMIT 10 Типичный запрос в системе веб-аналитики Быстро читаем Distributed таблицы CSV 227 Gb, ~1.3 млрд строк SELECT passenger_count, avg(total_amount) FROM trips GROUP BY passenger_count NYC taxi benchmark Шардов 1 3 140 Время, с. 1,224 0,438 0,043 Ускорени е x2 https://t.me/clickhouse_ru › GitHub: https://github.com/yandex/ClickHouse/ › Google group: https://groups.google.com/group/clickhouse Спасибо0 码力 | 28 页 | 506.94 KB | 1 年前3
5. ClickHouse at Ximalaya for Shanghai Meetup 2019 PDFtimestamps, arrayEnumerate(pages) as index FROM (SELECT * FROM client_log_all ORDER BY timestamp) GROUP BY user ����������������� ���������� SELECT user, groupArray(page) as pages, groupArray(timestamp) pages[i+2]='Order'), index, pages) as level_3 FROM (SELECT * FROM client_log_all ORDER BY timestamp) GROUP BY user • ����������������������������������������������������������������������� ����������������� 'Order' ), sortedPages, nextSortedPages, nextNextSortedPages) as level_2 … FROM client_log_all GROUP BY user • �������������� ������������������������������� �������� ������ ������ ���� ���� � �������0 码力 | 28 页 | 6.87 MB | 1 年前3
1. Machine Learning with ClickHousequery SELECT cab_type, simpleLinearRegression(trip_distance, total_amount) FROM trips WHERE <...> GROUP BY cab_type ┌─cab_type─┬─simpleLinearRegression(trip_distance, total_amount)─┐ │ yellow │ (2.4343401638740527 toYear(pickup_datetime) AS y, simpleLinearRegression(trip_distance, total_amount) FROM trips WHERE <...> GROUP BY y ┌────y─┬─simpleLinearRegression(trip_distance, total_amount)─┐ │ 2009 │ (2.553562453857034,3 year, stochasticLinearRegressionState(total_amount, trip_distance) AS model FROM trips WHERE <...> GROUP BY year Ok. 39 / 62 Apply several trained models SELECT evalMLMethod(model, trip_distance), total_amount0 码力 | 64 页 | 1.38 MB | 1 年前3
0. Machine Learning with ClickHouse query SELECT cab_type, simpleLinearRegression(trip_distance, total_amount) FROM trips WHERE <...> GROUP BY cab_type ┌─cab_type─┬─simpleLinearRegression(trip_distance, total_amount)─┐ │ yellow │ (2.4343401638740527 toYear(pickup_datetime) AS y, simpleLinearRegression(trip_distance, total_amount) FROM trips WHERE <...> GROUP BY y ┌────y─┬─simpleLinearRegression(trip_distance, total_amount)─┐ │ 2009 │ (2.553562453857034,3 year, stochasticLinearRegressionState(total_amount, trip_distance) AS model FROM trips WHERE <...> GROUP BY year Ok. 39 / 62 Apply several trained models SELECT evalMLMethod(model, trip_distance), total_amount0 码力 | 64 页 | 1.38 MB | 1 年前3
3. Sync Clickhouse with MySQL_MongoDBevery day…… Possible Solutions 4. CollapsingMergeTree ● FINAL is slow ● GROUP BY id HAVING sum(sign)>0 ○ Need to use GROUP BY in every query ○ Not suitable for multi-column primary key Our Solution:0 码力 | 38 页 | 7.13 MB | 1 年前3
2. Clickhouse玩转每天千亿数据-趣头条select column1, column2 from table where column=value 凡是涉及group by, order by, distinct, join这样的SQL内存占用不再是O(1) 解决: 1:max_bytes_before_external_group_by 2:max_bytes_before_external_sort 3:uniq / uniqCombined0 码力 | 14 页 | 1.10 MB | 1 年前3
2. 腾讯 clickhouse实践 _2019丁晓坤&熊峰ARRAY JOIN Goals GROUP BY key ORDER BY value DESC LIMIT 10 SELECT play_times_key AS key, sum(play_times_value) AS value FROM wegame ARRAY JOIN play_times_key, play_times_value GROUP BY key ORDER BY0 码力 | 26 页 | 3.58 MB | 1 年前3
共 15 条
- 1
- 2













