Spark SQL: Accessing a Hive Warehouse with Beeline
1. Add hive-site.xml
Add a hive-site.xml configuration file under $SPARK_HOME/conf so that Spark can access the Hive metastore.
vim hive-site.xml

<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://192.168.1.201:3306/hiveDB?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
</configuration>

Note: Hive itself does not need to be deployed on this node; all that matters is that the node can reach the Hive metastore.
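If Hive is already deployed elsewhere in the cluster, you don't have to type this file by hand. A minimal sketch (the remote host hadoop002 and its Hive path are assumptions, not part of this setup): copy the existing config over, then sanity-check that the metastore database is reachable from this node.

# copy an existing hive-site.xml instead of writing one manually (hypothetical source host/path)
scp hadoop@hadoop002:/home/hadoop/app/hive/conf/hive-site.xml $SPARK_HOME/conf/
# confirm the MySQL metastore DB from hive-site.xml answers from this node
mysql -h192.168.1.201 -P3306 -uroot -p123456 -e 'USE hiveDB; SHOW TABLES;'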
2. Start the Thrift server
Pass the MySQL JDBC driver jar via --jars so the server can talk to the metastore database:

[hadoop@hadoop003 spark]$ ./sbin/start-thriftserver.sh --jars ~/softwares/mysql-connector-java-5.1.47.jar
starting org.apache.spark.sql.hive.thriftserver.HiveThriftServer2, logging to /home/hadoop/app/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop003.out
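start-thriftserver.sh accepts the same options as spark-submit, plus --hiveconf overrides for HiveServer2 settings. A hedged sketch (the YARN master, custom port 10001, and bind host below are illustrative choices, not values used in this walkthrough):

# run the thrift server on YARN and move it off the default port (illustrative values)
./sbin/start-thriftserver.sh \
  --master yarn \
  --jars ~/softwares/mysql-connector-java-5.1.47.jar \
  --hiveconf hive.server2.thrift.port=10001 \
  --hiveconf hive.server2.thrift.bind.host=hadoop003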
Check the log to confirm the Thrift server started correctly:

[hadoop@hadoop003 spark]$ tail -50f /home/hadoop/app/spark/logs/spark-hadoop-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-hadoop003.out
19/05/21 09:39:14 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:15 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
19/05/21 09:39:15 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:39:15 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
19/05/21 09:39:15 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
19/05/21 09:39:15 INFO metastore.HiveMetaStore: Added admin role in metastore
19/05/21 09:39:15 INFO metastore.HiveMetaStore: Added public role in metastore
19/05/21 09:39:15 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
19/05/21 09:39:15 INFO metastore.HiveMetaStore: 0: get_all_databases
19/05/21 09:39:15 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_all_databases
19/05/21 09:39:15 INFO metastore.HiveMetaStore: 0: get_functions: db=default pat=*
19/05/21 09:39:15 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_functions: db=default pat=*
19/05/21 09:39:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
19/05/21 09:39:16 INFO session.SessionState: Created local directory: /tmp/73df82dd-1fd3-4dd5-97f1-680d53bd44bc_resources
19/05/21 09:39:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc
19/05/21 09:39:16 INFO session.SessionState: Created local directory: /tmp/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc
19/05/21 09:39:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/hadoop/73df82dd-1fd3-4dd5-97f1-680d53bd44bc/_tmp_space.db
19/05/21 09:39:16 INFO client.HiveClientImpl: Warehouse location for Hive client (version 1.2.2) is file:/home/hadoop/app/spark-2.4.2-bin-hadoop-2.6.0-cdh6.7.0/spark-warehouse
19/05/21 09:39:16 INFO session.SessionManager: Operation log root directory is created: /tmp/hadoop/operation_logs
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread pool size: 100
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread wait queue size: 100
19/05/21 09:39:16 INFO session.SessionManager: HiveServer2: Background operation thread keepalive time: 10 seconds
19/05/21 09:39:16 INFO service.AbstractService: Service:OperationManager is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:SessionManager is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service: CLIService is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:ThriftBinaryCLIService is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service: HiveServer2 is inited.
19/05/21 09:39:16 INFO service.AbstractService: Service:OperationManager is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:SessionManager is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:CLIService is started.
19/05/21 09:39:16 INFO metastore.ObjectStore: ObjectStore, initialize called
19/05/21 09:39:16 INFO DataNucleus.Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
19/05/21 09:39:16 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY
19/05/21 09:39:16 INFO metastore.ObjectStore: Initialized ObjectStore
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: get_databases: default
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=get_databases: default
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: Shutting down the object store...
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Shutting down the object store...
19/05/21 09:39:16 INFO metastore.HiveMetaStore: 0: Metastore shutdown complete.
19/05/21 09:39:16 INFO HiveMetaStore.audit: ugi=hadoop ip=unknown-ip-addr cmd=Metastore shutdown complete.
19/05/21 09:39:16 INFO service.AbstractService: Service:ThriftBinaryCLIService is started.
19/05/21 09:39:16 INFO service.AbstractService: Service:HiveServer2 is started.
19/05/21 09:39:16 INFO thriftserver.HiveThriftServer2: HiveThriftServer2 started
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4a7c72af{/sqlserver,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3e24bab6{/sqlserver/json,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@391e2a{/sqlserver/session,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@641cea11{/sqlserver/session/json,null,AVAILABLE,@Spark}
19/05/21 09:39:16 INFO thrift.ThriftCLIService: Starting ThriftBinaryCLIService on port 10000 with 5...500 worker threads   # this line indicates a successful start
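Besides scanning the log, you can check the listener directly. A small sketch (assumes the default port 10000 and that the netstat utility is installed):

# confirm something is listening on the thrift server's port
netstat -tlnp 2>/dev/null | grep 10000

The thrift server also registers a JDBC/ODBC Server tab in the Spark web UI (port 4040 by default), where connected sessions and submitted statements show up.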
3. Start Beeline

[hadoop@hadoop003 spark]$ ./bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop
Connecting to jdbc:hive2://localhost:10000
19/05/21 09:46:19 INFO jdbc.Utils: Supplied authorities: localhost:10000
19/05/21 09:46:19 INFO jdbc.Utils: Resolved authority: localhost:10000
19/05/21 09:46:19 INFO jdbc.HiveConnection: Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10000
Connected to: Spark SQL (version 2.4.2)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 1.2.1.spark2 by Apache Hive
0: jdbc:hive2://localhost:10000> select * from student.student limit 5;
+---------+-----------+-----------------+--------------------------------------------+--+
| stu_id  | stu_name  | stu_phone_num   | stu_email                                  |
+---------+-----------+-----------------+--------------------------------------------+--+
| 1       | Burke     | 1-300-746-8446  | ullamcorper.velit.in@ametnullaDonec.co.uk  |
| 2       | Kamal     | 1-668-571-5046  | pede.Suspendisse@interdumenim.edu          |
| 3       | Olga      | 1-956-311-1686  | Aenean.eget.metus@dictumcursusNunc.edu     |
| 4       | Belle     | 1-246-894-6340  | vitae.aliquet.nec@neque.co.uk              |
| 5       | Trevor    | 1-300-527-4967  | dapibus.id@acturpisegestas.net             |
+---------+-----------+-----------------+--------------------------------------------+--+
5 rows selected (3.275 seconds)
0: jdbc:hive2://localhost:10000>

The connection succeeded.
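Beeline can also run statements non-interactively with -e, which is convenient for scripting against the thrift server. A small sketch using the same connection string:

# run a one-off query and exit instead of opening an interactive session
./bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop -e 'show databases;'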
4. Notes
1. It is best to launch beeline from the spark/bin directory.
If the machine where you start Spark's beeline also has Hive deployed, and the Hive entry in your PATH happens to come before the Spark entry, you will most likely end up launching Hive's beeline instead.
For example:
[hadoop@hadoop003 spark]$ beeline
ls: cannot access /home/hadoop/app/spark/lib/spark-assembly-*.jar: No such file or directory
which: no hbase in (/home/hadoop/app/hive/bin:/home/hadoop/app/spark/bin:/home/hadoop/app/hadoop-2.6.0-cdh6.7.0//bin:/home/hadoop/app/hadoop-2.6.0-cdh6.7.0//sbin:/home/hadoop/app/zookeeper/bin:/usr/java/jdk1.8.0_131/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/hadoop/bin)
Beeline version 1.1.0-cdh6.7.0 by Apache Hive    # this is Hive's beeline
beeline>

Now check the environment variables:
[hadoop@hadoop003 spark]$ cat ~/.bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH

#####JAVA_HOME#####
export JAVA_HOME=/usr/java/jdk1.8.0_131
####ZOOKEEPER_HOME####
export ZOOKEEPER_HOME=/home/hadoop/app/zookeeper
#####HADOOP_HOME######
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh6.7.0/
export SPARK_HOME=/home/hadoop/app/spark
#####HIVE_HOME#####
export HIVE_HOME=/home/hadoop/app/hive
export PATH=$HIVE_HOME/bin:$SPARK_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$PATH

Sure enough: $HIVE_HOME/bin is prepended ahead of $SPARK_HOME/bin, so running beeline without an explicit path picks up Hive's beeline first.
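Two quick ways around the collision (a sketch; adjust the paths to your own layout):

# see which binary PATH actually resolves to
which beeline
# invoke Spark's copy explicitly, bypassing PATH ordering entirely
$SPARK_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n hadoop
# or reorder the exports in ~/.bash_profile so Spark wins:
# export PATH=$SPARK_HOME/bin:$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$JAVA_HOME/bin:$PATH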