Spark Runtime Environment Installation Steps
This article walks through the installation steps for a Spark runtime environment. Many people run into questions when setting Spark up for the first time, so the steps below collect a simple, workable procedure. Hopefully it clears up those questions; follow along below!
1. Prerequisites
scala-2.9.3: the Scala programming language. Download: http://www.scala-lang.org/download/ (note that Spark 1.4.0 itself is built against Scala 2.10, and the spark-shell session later in this article indeed reports Scala 2.10.4; the 2.9.3 paths below simply follow this walkthrough).
spark-1.4.0: this must be a pre-built Spark distribution. If you download the source release instead, you have to compile it yourself with SBT or Maven for your environment before it can be used.
Pre-built Spark downloads: http://spark.apache.org/downloads.html. A working JDK (and, for the HDFS examples later, a Hadoop installation) is also assumed; see the environment variables in step 4.
2. Installing scala-2.9.3
# Unpack scala-2.9.3.tgz
tar -zxvf scala-2.9.3.tgz
# Configure SCALA_HOME
vi /etc/profile
# Add the following environment variables
export SCALA_HOME=/home/apps/scala-2.9.3
export PATH=.:$SCALA_HOME/bin:$PATH
# To test whether Scala installed successfully,
# simply type: scala
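To check the result without opening the interactive shell, one option (assuming /etc/profile was edited as above) is to reload the profile and print the Scala version:

source /etc/profile   # pick up the new SCALA_HOME and PATH
scala -version        # prints the installed Scala version banner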
3. Installing spark-1.4.0
# Unpack spark-1.4.0.tgz
tar -zxvf spark-1.4.0.tgz
# Configure SPARK_HOME
vi /etc/profile
# Add the following environment variables
export SPARK_HOME=/home/apps/spark-1.4.0
export PATH=.:$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
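A quick sanity check, again assuming the profile has been reloaded, is to confirm the variable resolves and that the Spark scripts are on the PATH:

source /etc/profile
echo $SPARK_HOME    # should print /home/apps/spark-1.4.0
which spark-shell   # should resolve to a path under $SPARK_HOME/bin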
4. Editing the Spark configuration files
# Make a copy of each of slaves.template and spark-env.sh.template
cp spark-env.sh.template spark-env.sh
cp slaves.template slaves

The slaves file lists the hosts of the worker (child) nodes; add the worker hostnames there, one per line, as in the sample below.
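For instance, a two-worker slaves file could look like this (the hostnames are placeholders, not values from the original article):

hadoop.slave1
hadoop.slave2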
Append the following lines to the end of spark-env.sh:
# JDK installation path
export JAVA_HOME=/root/app/jdk
# Scala installation path
export SCALA_HOME=/root/app/scala-2.9.3
# IP address of the master node
export SPARK_MASTER_IP=192.168.1.200
# Memory allocated to each worker
export SPARK_WORKER_MEMORY=200m
# Hadoop configuration directory
export HADOOP_CONF_DIR=/root/app/hadoop/etc/hadoop
# Number of CPU cores each worker may use
export SPARK_WORKER_CORES=1
# Number of worker instances per machine; one is usually enough
export SPARK_WORKER_INSTANCES=1
# JVM options; since Spark 1.0 there is a spark-defaults.conf file,
# and these default settings normally live there instead
export SPARK_JAVA_OPTS
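In a multi-node cluster every worker needs the same Spark installation and configuration. One common way to push the edited conf directory to a worker is a plain scp; the hostname below is a placeholder, and this assumes Spark lives at the same path on every machine:

scp -r $SPARK_HOME/conf root@hadoop.slave1:$SPARK_HOME/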
spark-defaults.conf also holds configuration parameters such as the following:
spark.master             # the master URL, e.g. spark://hostname:7077
spark.local.dir          # Spark's working directory (where shuffle data goes)
spark.executor.memory    # Spark 1.0 dropped the SPARK_MEM variable in favor of this parameter
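A minimal spark-defaults.conf for the cluster described above might therefore look like the sketch below; the scratch directory is a placeholder, the master address reuses the IP from spark-env.sh, and 7077 is the standalone master's default port:

spark.master            spark://192.168.1.200:7077
spark.local.dir         /tmp/spark-scratch
spark.executor.memory   200m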
5. Testing whether the Spark installation succeeded
Start everything on the master node in the following order:

1. Start HDFS first (./sbin/start-dfs.sh)
2. Start the Spark master (./sbin/start-master.sh)
3. Start the Spark workers (./sbin/start-slaves.sh)
4. Use jps to check the processes. Master node: namenode, secondarynamenode, master. Worker nodes: datanode, worker.
5. Start spark-shell. A successful startup looks like this:

15/06/21 21:23:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/06/21 21:23:47 INFO spark.SecurityManager: Changing view acls to: root
15/06/21 21:23:47 INFO spark.SecurityManager: Changing modify acls to: root
15/06/21 21:23:47 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/06/21 21:23:47 INFO spark.HttpServer: Starting HTTP Server
15/06/21 21:23:47 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/21 21:23:47 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:38651
15/06/21 21:23:47 INFO util.Utils: Successfully started service 'HTTP class server' on port 38651.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.4.0
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) Client VM, Java 1.7.0_65)
Type in expressions to have them evaluated.
Type :help for more information.
15/06/21 21:23:54 INFO spark.SparkContext: Running Spark version 1.4.0
15/06/21 21:23:54 INFO spark.SecurityManager: Changing view acls to: root
15/06/21 21:23:54 INFO spark.SecurityManager: Changing modify acls to: root
15/06/21 21:23:54 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
15/06/21 21:23:56 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/06/21 21:23:56 INFO Remoting: Starting remoting
15/06/21 21:23:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.1.200:57658]
15/06/21 21:23:57 INFO util.Utils: Successfully started service 'sparkDriver' on port 57658.
15/06/21 21:23:58 INFO spark.SparkEnv: Registering MapOutputTracker
15/06/21 21:23:58 INFO spark.SparkEnv: Registering BlockManagerMaster
15/06/21 21:23:58 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-4f1badf6-1e92-47ca-98a2-6d82f4882f15/blockmgr-530e4335-9e59-45d4-b9fb-6014089f5a00
15/06/21 21:23:58 INFO storage.MemoryStore: MemoryStore started with capacity 267.3 MB
15/06/21 21:23:59 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-4f1badf6-1e92-47ca-98a2-6d82f4882f15/httpd-4b2cca3c-e8d4-4ab3-9c3d-38ec579ec873
15/06/21 21:23:59 INFO spark.HttpServer: Starting HTTP Server
15/06/21 21:23:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/21 21:23:59 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:51899
15/06/21 21:23:59 INFO util.Utils: Successfully started service 'HTTP file server' on port 51899.
15/06/21 21:23:59 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/06/21 21:23:59 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/06/21 21:23:59 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/06/21 21:23:59 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/06/21 21:23:59 INFO ui.SparkUI: Started SparkUI at http://192.168.1.200:4040
15/06/21 21:24:00 INFO executor.Executor: Starting executor ID driver on host localhost
15/06/21 21:24:00 INFO executor.Executor: Using REPL class URI: http://192.168.1.200:38651
15/06/21 21:24:01 INFO util.Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 59385.
15/06/21 21:24:01 INFO netty.NettyBlockTransferService: Server created on 59385
15/06/21 21:24:01 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/06/21 21:24:01 INFO storage.BlockManagerMasterEndpoint: Registering block manager localhost:59385 with 267.3 MB RAM, BlockManagerId(driver, localhost, 59385)
15/06/21 21:24:01 INFO storage.BlockManagerMaster: Registered BlockManager
15/06/21 21:24:02 INFO repl.SparkILoop: Created spark context..
Spark context available as sc.
15/06/21 21:24:03 INFO hive.HiveContext: Initializing execution hive, version 0.13.1
15/06/21 21:24:04 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
15/06/21 21:24:04 INFO metastore.ObjectStore: ObjectStore, initialize called
15/06/21 21:24:04 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
15/06/21 21:24:04 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
15/06/21 21:24:05 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/06/21 21:24:07 WARN DataNucleus.Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/06/21 21:24:14 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
15/06/21 21:24:14 INFO metastore.MetaStoreDirectSql: MySQL check failed, assuming we are not on mysql: Lexical error at line 1, column 5. Encountered: "@" (64), after : "".
15/06/21 21:24:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:15 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:18 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:18 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
15/06/21 21:24:19 INFO metastore.ObjectStore: Initialized ObjectStore
15/06/21 21:24:20 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 0.13.1aa
15/06/21 21:24:24 INFO metastore.HiveMetaStore: Added admin role in metastore
15/06/21 21:24:24 INFO metastore.HiveMetaStore: Added public role in metastore
15/06/21 21:24:24 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
15/06/21 21:24:25 INFO session.SessionState: No Tez session required at this point. hive.execution.engine=mr.
15/06/21 21:24:25 INFO repl.SparkILoop: Created sql context (with Hive support)..
SQL context available as sqlContext.

6. Testing with a word-count example

Before starting spark-shell, first upload a data file to HDFS.

7. The code:

val file = sc.textFile("hdfs://hadoop.master:9000/data/input/wordcount.data")
val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
count.collect()
count.saveAsTextFile("hdfs://hadoop.master:9000/data/output")

Following this code takes some basic Scala; a commented, locally runnable sketch of the same pipeline appears at the end of this article.

Print the result directly:

hadoop dfs -cat /data/output/p*
(im,1)
(are,1)
(yes,1)
(hi,2)
(do,1)
(no,3)
(to,1)
(lll,1)
(,3)
(hello,3)
(xiaoming,1)
(ga,1)
(world,1)

This concludes the walkthrough of the Spark runtime environment installation steps; hopefully it clears up the usual questions. Pairing theory with hands-on practice is the best way to learn, so go give it a try!
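As mentioned in step 7, here is a minimal, self-contained sketch of the same word-count pipeline that runs locally, with no cluster or HDFS needed. The object name, the local[2] master, and the sample input lines are illustrative assumptions, not part of the original setup:

import org.apache.spark.{SparkConf, SparkContext}

object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Local mode with two threads, an assumption for illustration only
    val conf = new SparkConf().setAppName("WordCountSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    // A tiny in-memory dataset standing in for the HDFS file
    val lines = sc.parallelize(Seq("hello world", "hello spark"))

    val counts = lines
      .flatMap(_.split(" "))    // split each line into words
      .map(word => (word, 1))   // pair every word with an initial count of 1
      .reduceByKey(_ + _)       // sum the counts for each distinct word

    counts.collect().foreach(println)  // e.g. (hello,2), (world,1), (spark,1)
    sc.stop()
  }
}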