Inside Building a Large-Scale Open-Source Distributed Database with Go
Introducing two interesting projects. Spark on TiDB: TiSpark. TiDB + SparkSQL = TiSpark. [Architecture diagram: Spark Master; Spark Exec with TiKV Connector; PD; TiKV (Data Storage & Coprocessor); TiDB] Features Beyond Raw Spark ● Index support ● Complex Calculation Pushdown ● CBO ○ Pick the right Access Path ○ Join Reorder Use Case ● Analytics with Spark ○ Possibility of getting rid of Hadoop ● Embrace the Spark ecosystem ○ Support for complex transformation and analytics with Scala / Python and R ○ Machine Learning Libraries ○ Spark Streaming
0 码力 | 44 pages | 649.68 KB | 1 year ago
5 How to integrate Graph mode into RDBMS smoothly
[Architecture diagram: MySQL Clients; Syncer; TiDB; TiKV Cluster (Storage); Metadata; TSO/Data location; DistSQL API; TiSpark; Spark Driver; Job; Worker; Spark Cluster]
0 码力 | 26 pages | 1.14 MB | 1 year ago
2.5 Go at Cheetah Mobile
restart depends on health checks; ● API quality monitoring, tracked through logs, via local logs + Flume + HDFS + Hive; ● for real-time monitoring, consider a Flume sink to Kafka, then rely on Spark for computation; RPC ● choosing the protocol & remote-call mechanism; ● net/rpc, Thrift, gRPC, etc.; ● distributed tracing, following the Google Dapper paper; the core idea is to instrument key libraries, because of the lack of
0 码力 | 24 pages | 4.26 MB | 1 year ago
How to start a VC-backed startup
Framework from MIT ● You can learn it ● Explore and validate ○ Not 100%, but accurate Have you found a spark? ● In B2B 100+ conversations with the target audience: ○ Notes showing a trend. ○ Commitments?
0 码力 | 32 pages | 7.43 MB | 6 months ago
Go vs. GoPlus (Go+)
Data science is not infrastructure but mathematical application software • Full capability: statistics / prediction / insight / planning / decision-making / … The infrastructure era of data science: the rise of big data • Map/Reduce (2004) • Hadoop (2006) • Spark (2009) • The rise of big data marked the beginning of data science becoming infrastructure • Large-scale processing capability came first • Relatively limited in functionality The infrastructure era of data science: the rise of deep learning • TensorFlow/Python (2015)
0 码力 | 54 pages | 1.82 MB | 1 year ago
5 results in total