4. ClickHouse in Practice for User Profiling at Suning
... programming experience in C++, Java, and Go; familiar with big data architectures and solutions; ClickHouse Contributor; GitHub: https://github.com/andyyzh
Contents: How Suning uses ClickHouse; Integrating Bitmap into ClickHouse; User profiling in practice
Why ClickHouse: 1. It is fast. 2. Features ship quickly.
... Real-time aggregation and analysis of monitoring data, mainly using materialized views. User profiling: storage of tag data and the user-profile query engine.
Bitmap storage and bit computation: each bit represents one numeric id, so 4 billion user ids need only 4 billion bits, about 477 MB = (4 * 10^9 ...
[Figure: an id split into a high-16-bit Key (0xEE6B) and a low-16-bit Value (0x2800) held in a Bitmap Container]
ClickHouse integrates RoaringBitmap: the Bitmap column type extends the AggregateFunction type and is defined as AggregateFunction(groupBitmap, UInt(8|16|32|64)).
32 pages | 1.47 MB | 1 year ago

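The bitmap figures quoted in the Suning snippet can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names `plain_bitmap_mib` and `split_id` are hypothetical, and it assumes that the "477" figure is in units of 1024 * 1024 bytes and that the Key/Value pair in the slide comes from splitting the 32-bit id 4,000,000,000 into its upper and lower 16 bits.

```python
# Hypothetical helpers for illustration; they are not part of ClickHouse
# or of any RoaringBitmap library.

def plain_bitmap_mib(id_count: int) -> float:
    """Size in MiB of an uncompressed bitmap holding one bit per id."""
    return id_count / 8 / 1024 / 1024


def split_id(user_id: int) -> tuple[int, int]:
    """Split a 32-bit id into (high 16 bits, low 16 bits)."""
    if not 0 <= user_id < 2**32:
        raise ValueError("user_id must fit in 32 bits")
    return user_id >> 16, user_id & 0xFFFF


size_mib = plain_bitmap_mib(4_000_000_000)
key, value = split_id(4_000_000_000)

print(f"{size_mib:.1f} MiB")                    # 476.8 MiB
print(f"key=0x{key:04X} value=0x{value:04X}")   # key=0xEE6B value=0x2800
```

Under these assumptions the numbers agree with the snippet: roughly 477 MB for a flat bitmap, and 0xEE6B / 0x2800 as the two halves of the id.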
2. ClickHouse MergeTree Internals Explained - Zhu Kai
... Smart Organization; Smart City; Smart Industry ... EDT enterprise big data platform; BAS blockchain enterprise application service platform; ECP enterprise cloud platform; services (consulting, implementation, operations, custom development, system integration ...). Serving group enterprises, the energy industry, and social governance.
Main customers: Haier Group, Dongfeng Motor, CITIC Heavy Industries, Shouchuang Jingzhong (首创经中), Henan Provincial People's Hospital, Hongfa, State Grid, State Power Investment Corporation, China Huaneng Group
35 pages | 13.25 MB | 1 year ago

The ClickHouse Testing We Deserve
... as a project: open source, written in C++; more than 300 thousand lines of code; open repository on GitHub; changes go through pull requests; 40 pull requests merged per week; 20% of changes come from external ...
... behavior; Memory: use of uninitialized memory. Links: main repository: https://github.com/google/sanitizers; inside LLVM: http://compiler-rt.llvm.org/
... speedup. What did not work: caching with distcc; unity builds + precompiled headers (https://github.com/sakra/cotire), possibly not tuned enough.
84 pages | 9.60 MB | 1 year ago

ClickHouse in Production
Highload Architecture: Webserver (Apache, Nginx); Cache (Memcached); Message Broker (Kafka, Amazon SQS); Coordination system (Zookeeper, etcd); MapReduce (Hadoop, Spark); Network File System (S3, HDFS) ... Source: https://github.com/donnemartin/system-design-primer
100 pages | 6.86 MB | 1 year ago

1. Machine Learning with ClickHouse
... simpleLinearRegression in ClickHouse supports only a single factor. You are welcome to contribute to https://github.com/clickhouse/ClickHouse. There are stochastic regression methods in ClickHouse: stochasticLinearRegression ...
... configuration file; add a model description that matches models_config. More details: tutorial at https://github.com/ClickHouse/clickhouse-presentations/blob/master/tutorials/catboost_with_clickhouse_en.md
... aggregate functions for ML. More detailed description: https://github.com/ClickHouse/ClickHouse/issues/7345. Thank you! QA
64 pages | 1.38 MB | 1 year ago

0. Machine Learning with ClickHouse
... simpleLinearRegression in ClickHouse supports only a single factor. You are welcome to contribute to https://github.com/clickhouse/ClickHouse. There are stochastic regression methods in ClickHouse: stochasticLinearRegression ...
... configuration file; add a model description that matches models_config. More details: tutorial at https://github.com/ClickHouse/clickhouse-presentations/blob/master/tutorials/catboost_with_clickhouse_en.md
... aggregate functions for ML. More detailed description: https://github.com/ClickHouse/ClickHouse/issues/7345. Thank you! QA
64 pages | 1.38 MB | 1 year ago

C++ Zero-Cost Abstractions: The Example of Hash Tables in ClickHouse
... computation: SipHash ~980 MB/s, CityHash ~9 GB/s. 4. Do not use outdated hash functions: FNV1a. https://github.com/rurban/smhasher
Choosing a hash function: the default hash functions in ClickHouse are bad. 1. CRC32-C ...
... hash tables for your own data aggregation scenario. https://github.com/ClickHouse/ClickHouse/blob/master/src/Common/HashTable/HashTable.h https://github.com/ClickHouse/ClickHouse/blob/master/src/Common/exam
49 pages | 2.73 MB | 1 year ago

What You Need to Know About ClickHouse Architecture to Use It Effectively
... You can reach us here: clickhouse-feedback@yandex-team.ru; Telegram: https://t.me/clickhouse_ru; GitHub: https://github.com/yandex/ClickHouse/; Google group: https://groups.google.com/group/clickhouse. Thank you
28 pages | 506.94 KB | 1 year ago

ClickHouse on Kubernetes
Quick Start: Installing the ClickHouse Operator. [Optional] Get sample files from the GitHub repo: git clone https://github.com/Altinity/clickhouse-operator. Install the operator: kubectl apply -f c...
34 pages | 5.06 MB | 1 year ago

ClickHouse on Kubernetes
Installing and removing the ClickHouse operator. [Optional] Get sample files from the GitHub repo: git clone https://github.com/Altinity/clickhouse-operator. Install the operator: kubectl apply -f c...
29 pages | 3.87 MB | 1 year ago

13 results in total