
012-01 Spark on YARN Environment Setup



1. Scala Installation


## Download and unpack (wget assumed here; any HTTP client works)
wget http://www.scala-lang.org/files/archive/scala-2.10.4.tgz

tar -zxvf scala-2.10.4.tgz -C app/
cd app
ln -s scala-2.10.4 scala
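
A quick way to confirm the unpack and symlink worked is to run the bundled launcher by its full path (PATH is only configured in step 3 below):

# Print the Scala version from the symlinked install; should report 2.10.4
/home/hadoop/app/scala/bin/scala -version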




2. Spark Installation


tar -zxvf spark-1.4.0-bin-hadoop2.6.tgz -C app/

cd app
ln -s spark-1.4.0-bin-hadoop2.6 spark




## The configuration files below live under spark/conf; spark-env.sh and slaves can be
## created from the spark-env.sh.template and slaves.template files shipped with Spark.
cd /home/hadoop/app/spark/conf

# vim spark-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_76
export SCALA_HOME=/home/hadoop/app/scala
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0
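
Beyond the three variables above, spark-env.sh can also carry standalone-mode settings. The values below are only a sketch assumed for this two-node cluster (master 192.168.2.20), not something the original setup requires:

# Optional standalone-mode settings (assumed values for this cluster)
export SPARK_MASTER_IP=192.168.2.20       # bind the standalone Master to this address
export SPARK_WORKER_CORES=1               # CPU cores offered by each worker
export SPARK_WORKER_MEMORY=1g             # memory offered by each worker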




## Hostnames (or IP addresses) of the worker nodes, one per line
# vim slaves

192.168.2.20
192.168.2.33



# mv log4j.properties.template log4j.properties
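
If the worker hosts do not already have an identical layout, the configured Spark directory has to exist on every node listed in slaves as well. A sketch using scp, assuming the workers mirror the master's paths and user:

# Copy the configured Spark install to a worker and recreate the symlink (repeat per worker)
scp -r /home/hadoop/app/spark-1.4.0-bin-hadoop2.6 hadoop@192.168.2.33:/home/hadoop/app/
ssh hadoop@192.168.2.33 "ln -s /home/hadoop/app/spark-1.4.0-bin-hadoop2.6 /home/hadoop/app/spark"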






## Run on the master node (start-all.sh lives under sbin/, not bin/)

cd $SPARK_HOME/sbin

./start-all.sh
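
If the daemons came up, jps should list them; the Spark 1.x standalone scripts register them under the names shown below:

# On the master node expect a "Master" process; on each host listed in slaves, a "Worker" process
jps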




3. Configure System Environment Variables


vim /etc/profile
export SCALA_HOME=/home/hadoop/app/scala
export SPARK_HOME=/home/hadoop/app/spark
export PATH=$PATH:$HIVE_HOME/bin:$HBASE_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin






source /etc/profile
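
A quick sanity check that the new variables took effect in the current shell:

echo $SCALA_HOME          # /home/hadoop/app/scala
echo $SPARK_HOME          # /home/hadoop/app/spark
which spark-shell         # should resolve under $SPARK_HOME/bin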




4. Tests
## Monitoring page URL (standalone Master web UI)
http://192.168.2.20:8080/




## First change to the $SPARK_HOME directory (cd $SPARK_HOME)




(1) Local mode
# Launch the spark-shell



./bin/spark-shell



# Test (word count on a local file)



sc.textFile("/home/hadoop/wc.txt").flatMap( line=>line.split("\t") ).map( word=>(word,1) ).reduceByKey(_ + _).collect
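
The word count above reads /home/hadoop/wc.txt and splits on tab characters; if that file does not exist yet, a small tab-separated sample (contents are illustrative) can be created from a shell beforehand:

# Create a tiny tab-separated input file for the word-count test (sample data)
printf 'hello\tspark\nhello\tyarn\nspark\ton\tyarn\n' > /home/hadoop/wc.txt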



# Verify via the running application's web UI (port 4040)



http://192.168.2.20:4040/





(2) YARN mode



cd $SPARK_HOME
bin/spark-submit  --class  org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--num-executors 3 \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
lib/spark-examples*.jar  10
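
In yarn-cluster mode the driver runs inside YARN, so the "Pi is roughly ..." line ends up in the ApplicationMaster's log rather than the submitting terminal. To see the result locally, the same example can be submitted in yarn-client mode; this is an optional variant with the same paths and resource settings as above, not part of the original steps:

cd $SPARK_HOME
bin/spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn-client \
--num-executors 3 \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
lib/spark-examples*.jar 10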









Log output from the run (the first attempt fails because the option was typed as '-executor-cores' instead of '--executor-cores'; the corrected command then succeeds):



[hadoop@mycluster spark]$ bin/spark-submit  --class  org.apache.spark.examples.SparkPi \
> --master yarn-cluster \
> --num-executors 3 \
> --driver-memory 1g \
> --executor-memory 1g \
> -executor-cores 1 \
> lib/spark-examples*.jar  10
Error: Unrecognized option '-executor-cores'.
Run with --help for usage help or --verbose for debug output
[hadoop@mycluster spark]$ bin/spark-submit  --class  org.apache.spark.examples.SparkPi \
> --master yarn-cluster \
> --num-executors 3 \
> --driver-memory 1g \
> --executor-memory 1g \
> --executor-cores 1 \
> lib/spark-examples*.jar  10
15/08/30 22:53:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/08/30 22:53:29 INFO RMProxy:  Connecting to ResourceManager at mycluster/192.168.2.20:8032
15/08/30 22:53:29 INFO Client:  Requesting a new application from cluster with 1 NodeManagers
15/08/30 22:53:29 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/08/30 22:53:29 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
15/08/30 22:53:29 INFO Client: Setting up container launch context for our AM
15/08/30 22:53:29 INFO Client: Preparing resources for our AM container
15/08/30 22:53:30 INFO Client: Uploading resource file:/home/hadoop/app/spark-1.4.0-bin-hadoop2.6/lib/spark-assembly-1.4.0-hadoop2.6.0.jar -> hdfs://mycluster:9000/user/hadoop/.sparkStaging/application_1440995865051_0005/spark-assembly-1.4.0-hadoop2.6.0.jar
15/08/30 22:53:33 INFO Client: Uploading resource file:/home/hadoop/app/spark-1.4.0-bin-hadoop2.6/lib/spark-examples-1.4.0-hadoop2.6.0.jar -> hdfs://mycluster:9000/user/hadoop/.sparkStaging/application_1440995865051_0005/spark-examples-1.4.0-hadoop2.6.0.jar
15/08/30 22:53:39 INFO Client: Uploading resource file:/tmp/spark-ecb5f2dc-f66b-42e6-a8ae-befce75074c0/__hadoop_conf__846873578807129658.zip -> hdfs://mycluster:9000/user/hadoop/.sparkStaging/application_1440995865051_0005/__hadoop_conf__846873578807129658.zip
15/08/30 22:53:40 INFO Client: Setting up the launch environment for our AM container
15/08/30 22:53:40 INFO SecurityManager: Changing view acls to: hadoop
15/08/30 22:53:40 INFO SecurityManager: Changing modify acls to: hadoop
15/08/30 22:53:40 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); users with modify permissions: Set(hadoop)
15/08/30 22:53:40 INFO Client: Submitting application 5 to ResourceManager
15/08/30 22:53:40 INFO YarnClientImpl: Submitted application application_1440995865051_0005
15/08/30 22:53:41 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:41 INFO Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: default
         start time: 1441000420286
         final status: UNDEFINED
         tracking URL: http://mycluster:8088/proxy/application_1440995865051_0005/
         user: hadoop
15/08/30 22:53:43 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:45 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:46 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:48 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:50 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:52 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:54 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:56 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:57 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:58 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:53:59 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:54:00 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:54:01 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:54:02 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:54:03 INFO Client: Application report for application_1440995865051_0005 (state: ACCEPTED)
15/08/30 22:54:04 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:04 INFO Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: 192.168.2.20
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1441000420286
         final status: UNDEFINED
         tracking URL: http://mycluster:8088/proxy/application_1440995865051_0005/
         user: hadoop
15/08/30 22:54:05 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:06 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:07 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:08 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:09 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:10 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:11 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:12 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:13 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:15 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:17 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:18 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:19 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:20 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:21 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:23 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:24 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:25 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:26 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:27 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:29 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:30 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:31 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:33 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:34 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:36 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:37 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:38 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:40 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:41 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:42 INFO Client: Application report for application_1440995865051_0005 (state: RUNNING)
15/08/30 22:54:43 INFO Client: Application report for application_1440995865051_0005 (state: FINISHED)
15/08/30 22:54:43 INFO Client:
         client token: N/A
         diagnostics: N/A
         ApplicationMaster host: 192.168.2.20
         ApplicationMaster RPC port: 0
         queue: default
         start time: 1441000420286
         final status: SUCCEEDED
         tracking URL: http://mycluster:8088/proxy/application_1440995865051_0005/
         user: hadoop
15/08/30 22:54:43 INFO Utils: Shutdown hook called
15/08/30 22:54:43 INFO Utils: Deleting directory /tmp/spark-ecb5f2dc-f66b-42e6-a8ae-befce75074c0
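
Because this run used yarn-cluster mode, the driver output (including the "Pi is roughly ..." line) is in the container logs. Assuming YARN log aggregation is enabled, it can be pulled with the YARN CLI:

# Fetch the aggregated logs for the finished application shown above
yarn logs -applicationId application_1440995865051_0005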








Common issues:



Running the spark-submit above in YARN mode produced the following error:



[hadoop@mycluster spark]$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi   --master yarn-cluster   --master yarn-cluster 10
Exception in thread "main" java.lang.Exception: When running with master 'yarn-cluster' either HADOOP_CONF_DIR or YARN_CONF_DIR must be set in the environment.
        at org.apache.spark.deploy.SparkSubmitArguments.validateSubmitArguments(SparkSubmitArguments.scala:239)
        at org.apache.spark.deploy.SparkSubmitArguments.validateArguments(SparkSubmitArguments.scala:216)
        at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:103)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:106)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
15/08/30 22:25:45 INFO Utils: Shutdown hook called






Solution: configure HADOOP_CONF_DIR in $SPARK_HOME/conf/spark-env.sh



cd $SPARK_HOME/conf



vi spark-env.sh 
# Options read in YARN client mode
# - HADOOP_CONF_DIR, to point Spark towards Hadoop configuration files
HADOOP_CONF_DIR=/home/hadoop/app/hadoop-2.6.0/etc/hadoop
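
As the error message says, YARN_CONF_DIR also works, and either variable can just as well be exported in the submitting shell (or in /etc/profile) instead of spark-env.sh; a minimal sketch:

# Alternative: set it in the environment of the shell that runs spark-submit
export HADOOP_CONF_DIR=/home/hadoop/app/hadoop-2.6.0/etc/hadoop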





From: https://blog.51cto.com/u_14361901/6167559
