
How to Set Up a Spark Development Environment in Eclipse


This article introduces what you need to know about "How to set up a Spark development environment in Eclipse". Many people run into difficulties like these when working through real-world cases, so let the editor walk you through how to handle them. Read carefully and you should come away having learned something!

Spark Eclipse Development Environment Setup

1 Install the Spark environment

  • First download a pre-built Spark release that matches your cluster's Hadoop version, extract it to the desired location, and pay attention to user permissions on the files.

  • Change into the extracted SPARK_HOME directory.

  • Set SPARK_HOME in /etc/profile or ~/.bashrc.

  • cd $SPARK_HOME/conf
    cp spark-env.sh.template spark-env.sh

  • vim spark-env.sh

export SCALA_HOME=/home/hadoop/cluster/scala-2.10.5
export JAVA_HOME=/home/hadoop/cluster/jdk1.7.0_79
export HADOOP_HOME=/home/hadoop/cluster/hadoop-2.6.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
# Note: this must be set to an IP address, otherwise Eclipse will fail to connect later with:
# "All masters are unresponsive! Giving up."
SPARK_MASTER_IP=10.16.112.121
SPARK_LOCAL_DIRS=/home/hadoop/cluster/spark-1.4.0-bin-hadoop2.6
SPARK_DRIVER_MEMORY=1G

2 Start Spark in standalone mode

sbin/start-master.sh
sbin/start-slave.sh

You can now open http://yourip:8080 in a browser to check the state of the Spark cluster.
The default Spark master at this point is: spark://10.16.112.121:7077

3 Build the Spark development environment with the Scala IDE for Eclipse and Maven

  • First download the Scala IDE for Eclipse; it is available from the official Scala website.

  • Open the IDE, create a new Maven project, and fill in pom.xml as follows:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>spark.test</groupId>
  <artifactId>FirstTrySpark</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <properties>
    <hadoop.version>2.6.0</hadoop.version>
    <spark.version>1.4.0</spark.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
      <scope>provided</scope>
      <exclusions>
        <exclusion>
          <groupId>javax.servlet</groupId>
          <artifactId>*</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
      <version>2.6.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>${spark.version}</version>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/java</sourceDirectory>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.0</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <scalaCompatVersion>2.10</scalaCompatVersion>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.5.5</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.7</source>
          <target>1.7</target>
        </configuration>
      </plugin>
    </plugins>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
      </resource>
    </resources>
  </build>
</project>
  • Create a few Source Folders:

src/main/java      # Java source code
src/main/scala     # Scala source code
src/main/resources # resource files
src/test/java      # Java test code
src/test/scala     # Scala test code
src/test/resources # test resource files

At this point the environment is fully set up!

4 Write test code to check whether the connection succeeds

  • The test code is as follows:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

/**
 * @author clebeg
 */
object FirstTry {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf
    conf.setMaster("spark://yourip:7077")
    conf.set("spark.app.name", "first-tryspark")
    val sc = new SparkContext(conf)
    val rawblocks = sc.textFile("hdfs://yourip:9000/user/hadoop/linkage")
    println(rawblocks.first)
  }
}
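Two small additions can make this smoke test more pleasant to run straight from Eclipse. The sketch below is not part of the original article: it stops the context when the job is done and, only if the workers complain that they cannot find your classes, ships the fat jar built by the maven-assembly-plugin configured above (the jar path assumes Maven's default assembly naming and may differ on your machine).

import org.apache.spark.{SparkConf, SparkContext}

object FirstTryWithCleanup {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://yourip:7077")
      .setAppName("first-tryspark")
      // Only needed if executors throw ClassNotFoundException for your own classes:
      // run `mvn package` first; the path assumes Maven's default assembly naming.
      .setJars(Seq("target/FirstTrySpark-0.0.1-SNAPSHOT-jar-with-dependencies.jar"))
    val sc = new SparkContext(conf)
    try {
      val rawblocks = sc.textFile("hdfs://yourip:9000/user/hadoop/linkage")
      println(rawblocks.first)
    } finally {
      sc.stop()  // releases the executors and marks the application as finished on the master
    }
  }
}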

5 A summary of some errors

Most of the problems have already been covered above and will not be repeated here; a few major ones are worth calling out:
  • Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
    Analysis: opening the run log for the corresponding application ID shows the following error:

15/10/10 08:49:01 INFO executor.CoarseGrainedExecutorBackend: Registered signal handlers for [TERM, HUP, INT]
15/10/10 08:49:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/10/10 08:49:02 INFO spark.SecurityManager: Changing view acls to: hadoop,Administrator
15/10/10 08:49:02 INFO spark.SecurityManager: Changing modify acls to: hadoop,Administrator
15/10/10 08:49:02 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop, Administrator); users with modify permissions: Set(hadoop, Administrator)
15/10/10 08:49:02 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/10/10 08:49:02 INFO Remoting: Starting remoting
15/10/10 08:49:02 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://driverPropsFetcher@10.16.112.121:58708]
15/10/10 08:49:02 INFO util.Utils: Successfully started service 'driverPropsFetcher' on port 58708.
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1643)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:65)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:146)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:245)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:97)
        at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:159)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
        at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:65)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        ... 4 more
15/10/10 08:51:02 INFO util.Utils: Shutdown hook called

A closer look shows this is a permissions problem: stop Hadoop right away and add the following to etc/hadoop/core-site.xml:

<property>
  <name>hadoop.security.authorization</name>
  <value>false</value>
</property>

This switches off service-level authorization so that anyone can access the cluster services, and the problem goes away immediately.

  • java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

  1. Download a rebuilt hadoop 2.6 package that includes winutils.exe from http://www.barik.net/archive/2015/01/19/172716/. Make sure you download a build that matches your own Hadoop version.

  2. Extract it to the desired location and set the HADOOP_HOME environment variable. Be sure to restart Eclipse afterwards. Done! (A code-level alternative is sketched after this list.)

  • Where can you get the data used in this article? http://bit.ly/1Aoywaq The commands are as follows:

mkdir linkage
cd linkage/
curl -o donation.zip http://bit.ly/1Aoywaq
unzip donation.zip
unzip "block_*.zip"
hdfs dfs -mkdir /user/hadoop/linkage
hdfs dfs -put block_*.csv /user/hadoop/linkage
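After the upload, it is worth a quick sanity check that all the block files made it into HDFS. The following is only a sketch: it reuses the placeholder master URL and HDFS path from the test code above and assumes, as is the case for this record-linkage dataset, that each block_*.csv starts with a header line containing the token id_1.

import org.apache.spark.{SparkConf, SparkContext}

object LinkageSanityCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("spark://yourip:7077")              // placeholders, as in FirstTry above
      .setAppName("linkage-sanity-check")
    val sc = new SparkContext(conf)
    try {
      val rawblocks = sc.textFile("hdfs://yourip:9000/user/hadoop/linkage")
      // Assumption: every block_*.csv begins with a header line containing "id_1".
      val headerLines = rawblocks.filter(_.contains("id_1")).count()
      println("total lines : " + rawblocks.count())
      println("header lines: " + headerLines)        // should equal the number of block_*.csv files
    } finally {
      sc.stop()
    }
  }
}

If the header count does not match the number of uploaded files, re-run the hdfs dfs -put step.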
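Returning to the winutils.exe error above: if you would rather not set a machine-wide HADOOP_HOME, the same location can be supplied from code before the SparkContext is created. This is only a sketch and the path is a placeholder; the Hadoop client reads the hadoop.home.dir system property first and only falls back to the HADOOP_HOME environment variable if it is not set.

object WinutilsWorkaround {
  def main(args: Array[String]): Unit = {
    // Placeholder path: point it at the directory whose bin\ folder contains winutils.exe.
    // Hadoop's shell utilities check the "hadoop.home.dir" system property before
    // falling back to the HADOOP_HOME environment variable.
    System.setProperty("hadoop.home.dir", "C:\\hadoop-2.6.0")

    // ...then build the SparkConf and SparkContext exactly as in FirstTry above...
  }
}

Set the property before the SparkContext is constructed, since Hadoop reads it when its classes are first loaded.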

"Spark Eclipse开发环境的搭建方法"的内容就介绍到这里了,感谢大家的阅读。如果想了解更多行业相关的知识可以关注网站,小编将为大家输出更多高质量的实用文章!
