Hive - IT Library: free downloads of programming and IT e-books and documents


TiDB: An Open-Source Distributed Relational Database

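... provides rich monitoring metrics to meet operations and management needs, and uses DataX to sync TiDB data to Hive on a T+1 basis as a backup. ZTO Express builds its real-time data warehouse wide tables on TiDB: OLTP business data is written to TiDB in real time, and downstream OLAP workloads run minute-level analysis through TiSpark. In business tests, TiSpark takes about 10 minutes to sync 300 million rows to Hive, which underpins the integration of ZTO Express's real-time data warehouse with offline T+1 processing ... The InnoDB Cluster approach has limited scalability and degraded performance, and it requires changes to application code, so its complexity is high; MongoDB cannot sync data from Binlog in real time and is not suited to SQL semantics; Hive is inconvenient for incremental updates; for Phoenix on HBase, index changes and maintenance are difficult and aggregate queries are inefficient; CRDB is compatible with the PostgreSQL protocol, so migrating online data requires protocol conversion ... and some offline log-related data. Log stream data is collected from the online app and flows to Kafka, where the valuable information is selected and written to TiDB through Flink for analysis. Spark connects to Hive and Aerospike, and TiSpark accesses TiDB directly, which enables computation and analysis across multiple data sources. This cross-source heterogeneous computing architecture breaks down the barriers between different databases and maximizes the value of data for business scenarios such as user profiling and precision marketing ...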

58 pages | 9.51 MB | 1 year ago
TiDB v5.1 Documentation

... have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. 11.7.10.1 Configure Drainer: Modify the configuration file of Drainer ... including SQuirreLSQL and hive-beeline. For example, to use it with beeline: ./beeline Beeline version 1.2.2 by Apache Hive beeline> !connect jdbc:hive2://localhost:10000 1: jdbc:hive2://localhost:10000> use ... together with Hive: You can use TiSpark together with Hive. Before starting Spark, you need to set the HADOOP_CONF_DIR environment variable to your Hadoop configuration folder and copy hive-site.xml to ...

2745 pages | 47.65 MB | 1 year ago
Real-Time Data Analytics with TiDB - Ma Xiaoyu

TiSpark ● TiSpark ... TiDB ... Apache Spark ... ● ... Apache Spark ... ○ Apache Zeppelin, Hive, R ... ● ... TiDB ... ○ ... Join ... ● ... TiDB ... ● (WIP)

36 pages | 9.32 MB | 1 year ago
TiDB v5.2 Documentation

... have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. 11.7.10.1 Configure Drainer: Modify the configuration file of Drainer ... including SQuirreLSQL and hive-beeline. For example, to use it with beeline: ./beeline Beeline version 1.2.2 by Apache Hive beeline> !connect jdbc:hive2://localhost:10000 1: jdbc:hive2://localhost:10000> use ... together with Hive: You can use TiSpark together with Hive. Before starting Spark, you need to set the HADOOP_CONF_DIR environment variable to your Hadoop configuration folder and copy hive-site.xml to ...

2848 pages | 47.90 MB | 1 year ago
TiDB v5.3 Documentation

... have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. 11.7.10.1 Configure Drainer: Modify the configuration file of Drainer ... jdbc:hive2://localhost:10000 If the following message is displayed, you have enabled beeline successfully. Beeline version 1.2.2 by Apache Hive Then, you can run the query command: 1: jdbc:hive2://localhost:10000> ... together with Hive: You can use TiSpark together with Hive. Before starting Spark, you need to set the HADOOP_CONF_DIR environment variable to your Hadoop configuration folder and copy hive-site.xml ...

2996 pages | 49.30 MB | 1 year ago
TiDB v5.2 Chinese Manual

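Consumer Client User Guide ... Drainer currently provides several output methods, including MySQL, TiDB, and file. However, users often have customized requirements, such as outputting to Elasticsearch or Hive, that Drainer does not yet implement. Drainer therefore adds the ability to output to Kafka: it parses the binlog data and writes it to Kafka in a defined format, and users write code to read from Kafka ... row(s) ... SQuirreLSQL and hive-beeline can connect to the Thrift server over JDBC. For example, to connect with beeline: ./beeline Beeline version 1.2.2 by Apache Hive beeline> !connect jdbc:hive2://localhost:10000 1: jdbc:hive2://localhost:10000> ... +-----------+--+ 1 row selected (1.97 seconds) 11.14.2.6 Use TiSpark together with Hive: TiSpark can be used together with Hive. Before starting Spark, set the HADOOP_CONF_DIR environment variable to point to the Hadoop configuration directory and copy hive-site.xml to the $SPARK_HOME/conf directory. val tisparkDF ...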

2259 pages | 48.16 MB | 1 year ago
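The TiSpark-with-Hive preparation quoted in the excerpt above (set HADOOP_CONF_DIR, then copy hive-site.xml into $SPARK_HOME/conf before starting Spark) could be scripted roughly as follows. This is only a sketch: the function name prepare_spark_for_hive, the source location of hive-site.xml, and every directory in the demo are hypothetical, and the demo runs in a temporary directory rather than against a real installation.

```shell
#!/bin/sh
# Sketch only: all directory names passed in below are hypothetical placeholders.
set -eu

# Point Spark at the Hadoop configuration and make hive-site.xml visible to it.
#   $1: Hadoop configuration directory
#   $2: directory that currently holds hive-site.xml (assumed location)
#   $3: Spark installation directory
prepare_spark_for_hive() {
    hadoop_conf_dir=$1
    hive_conf_dir=$2
    spark_home=$3

    if [ ! -f "$hive_conf_dir/hive-site.xml" ]; then
        echo "hive-site.xml not found in $hive_conf_dir" >&2
        return 1
    fi

    # Copy the Hive client configuration into Spark's conf directory.
    mkdir -p "$spark_home/conf"
    cp "$hive_conf_dir/hive-site.xml" "$spark_home/conf/hive-site.xml"

    # Both variables must be set before Spark is started.
    export HADOOP_CONF_DIR="$hadoop_conf_dir"
    export SPARK_HOME="$spark_home"
}

# Demo in a throwaway directory so that nothing real is touched.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/hadoop/etc/hadoop" "$demo_root/hive/conf" "$demo_root/spark"
printf '<configuration></configuration>\n' > "$demo_root/hive/conf/hive-site.xml"

prepare_spark_for_hive "$demo_root/hadoop/etc/hadoop" "$demo_root/hive/conf" "$demo_root/spark"
echo "HADOOP_CONF_DIR=$HADOOP_CONF_DIR"
ls "$SPARK_HOME/conf"
```

In a real deployment the three arguments would be the actual Hadoop, Hive, and Spark directories, and the function would run in the same shell session that later launches Spark so the exported variables are inherited.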
TiDB v5.1 Chinese Manual

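Consumer Client User Guide ... Drainer currently provides several output methods, including MySQL, TiDB, and file. However, users often have customized requirements, such as outputting to Elasticsearch or Hive, that Drainer does not yet implement. Drainer therefore adds the ability to output to Kafka: it parses the binlog data and writes it to Kafka in a defined format, and users write code to read from Kafka ... row(s) ... SQuirreLSQL and hive-beeline can connect to the Thrift server over JDBC. For example, to connect with beeline: ./beeline Beeline version 1.2.2 by Apache Hive beeline> !connect jdbc:hive2://localhost:10000 1: jdbc:hive2://localhost:10000> ... +-----------+--+ 1 row selected (1.97 seconds) 11.14.2.6 Use TiSpark together with Hive: TiSpark can be used together with Hive. Before starting Spark, set the HADOOP_CONF_DIR environment variable to point to the Hadoop configuration directory and copy hive-site.xml to the $SPARK_HOME/conf directory. val tisparkDF ...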

2189 pages | 47.96 MB | 1 year ago
TiDB v5.4 Documentation

... have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. 11.7.10.1 Configure Drainer: Modify the configuration file of Drainer ... jdbc:hive2://localhost:10000 If the following message is displayed, you have enabled beeline successfully. Beeline version 1.2.2 by Apache Hive Then, you can run the query command: 1: jdbc:hive2://localhost:10000> ... together with Hive: You can use TiSpark together with Hive. Before starting Spark, you need to set the HADOOP_CONF_DIR environment variable to your Hadoop configuration folder and copy hive-site.xml ...

3650 pages | 52.72 MB | 1 year ago
TiDB v5.3 Chinese Manual

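Consumer Client User Guide ... Drainer currently provides several output methods, including MySQL, TiDB, and file. However, users often have customized requirements, such as outputting to Elasticsearch or Hive, that Drainer does not yet implement. Drainer therefore adds the ability to output to Kafka: it parses the binlog data and writes it to Kafka in a defined format, and users write code to read from Kafka ... First, enable beeline with the following command: ./bin/beeline jdbc:hive2://localhost:10000 If the following message is displayed, beeline has been enabled successfully: Beeline version 1.2.2 by Apache Hive Then, you can run the following query command: 1: jdbc:hive2://localhost:10000> use testdb; +---------+--+ ... +-----------+--+ 1 row selected (1.97 seconds) 11.14.1.6 Use TiSpark together with Hive: TiSpark can be used together with Hive. Before starting Spark, set the HADOOP_CONF_DIR environment variable to point to the Hadoop configuration directory and copy hive-site.xml to the spark/conf directory. val tisparkDF = spark ...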

2374 pages | 49.52 MB | 1 year ago
TiDB v6.1 Documentation

... imported source file) | Files exported from Dumpling; Parquet files exported by Amazon Aurora or Apache Hive; CSV files; Data from local disks or Amazon S3 | | Downstream | TiDB | | Advantages | Support quickly ... files of Dumpling • Other compatible CSV files • Parquet files exported from Amazon Aurora or Apache Hive • Supported TiDB versions: v2.1 and later versions • Kubernetes support: Yes. See Quickly restore ... have customized requirements for outputting data to other formats, for example, Elasticsearch and Hive, so this feature is introduced. 13.11.10.1 Configure Drainer: Modify the configuration file of Drainer ...

4487 pages | 84.44 MB | 1 year ago



