Building Spark from Source and Setting Up a Source-Reading Environment

Date: 2021-07-08 05:25:45

  (1) Build the Spark source

    1. Switch Maven to a faster download mirror by putting the following <mirrors> block in Maven's settings.xml:

<mirrors>
  <!-- Aliyun repository -->
  <mirror>
    <id>alimaven</id>
    <mirrorOf>central</mirrorOf>
    <name>aliyun maven</name>
    <url>http://maven.aliyun.com/nexus/content/repositories/central/</url>
  </mirror>
  <!-- Central repository 1 -->
  <mirror>
    <id>repo1</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo1.maven.org/maven2/</url>
  </mirror>
  <!-- Central repository 2 -->
  <mirror>
    <id>repo2</id>
    <mirrorOf>central</mirrorOf>
    <name>Human Readable Name for this Mirror.</name>
    <url>http://repo2.maven.org/maven2/</url>
  </mirror>
</mirrors>
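
Before starting a long build, it can help to confirm that the mirror block actually landed in the settings file Maven reads. The following is a minimal sketch, not part of the original steps: the function name has_mirror is made up for illustration, and the path ~/.m2/settings.xml in the usage line is an assumption about where the settings file lives. It only checks that an <id> element with the given value appears in the file; it does not verify that the element sits inside <mirrors>.

```shell
# Sketch: check that a Maven settings file mentions a mirror with the given id.
# has_mirror is a hypothetical helper name, not a Maven command.
has_mirror() {
  local settings_file="$1" mirror_id="$2"
  # Fail if the file is missing; otherwise look for <id>mirror_id</id>.
  [ -f "$settings_file" ] && grep -q "<id>${mirror_id}</id>" "$settings_file"
}
```

Possible usage: has_mirror "$HOME/.m2/settings.xml" alimaven && echo "alimaven mirror found". For a fuller check, mvn help:effective-settings prints the settings Maven resolves after merging the global and user files.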

    2. Run the build command

     $ export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

     $ mvn -Pyarn -Phadoop-2.7 -Pspark-ganglia-lgpl -Pkinesis-asl -Phive -DskipTests clean package   (takes a bit over an hour, depending on network speed)

    3. Build the deployable distribution package

     $ export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

     $ ./dev/make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.7 -Phive -Phive-thriftserver -Pyarn   (produces spark-2.1.0-bin-custom-spark.tgz)

  (2) Set up the reading environment

    1. Import the Spark tree built in the previous step into IDEA

    2. Fixing the build error in IDEA

      Fix: in the built Spark tree, unpack external/flume-sink/target/spark-streaming-flume-sink_2.11-2.1.0-sources.jar,

           then copy the files under target/spark-streaming-flume-sink_2.11-2.1.0-sources/org/apache/spark/streaming/flume/sink in the unpacked directory into

           external/flume-sink/src/main/scala/org/apache/spark/streaming/flume/sink, and run Rebuild in IDEA.
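
The unpack-and-copy fix above can be scripted roughly as follows. This is a sketch under stated assumptions rather than a tested recipe: the function names are invented for illustration, the Scala and Spark versions default to 2.11 and 2.1.0 to match the paths above, and the unzip tool is assumed to be installed. Files already present in the destination with the same name would be overwritten.

```shell
#!/usr/bin/env bash
# Sketch: restore the flume-sink sources that IDEA reports as missing.
# Function names and default versions are illustrative assumptions.

# Print the path of the flume-sink sources jar inside a Spark source tree.
flume_sink_sources_jar() {
  local spark_dir="$1" scala_ver="${2:-2.11}" spark_ver="${3:-2.1.0}"
  printf '%s/external/flume-sink/target/spark-streaming-flume-sink_%s-%s-sources.jar\n' \
    "$spark_dir" "$scala_ver" "$spark_ver"
}

# Unpack the jar next to itself, then copy the sink package into src/main/scala.
restore_flume_sink_sources() {
  local spark_dir="$1" scala_ver="${2:-2.11}" spark_ver="${3:-2.1.0}"
  local jar pkg extract_dir dest
  jar="$(flume_sink_sources_jar "$spark_dir" "$scala_ver" "$spark_ver")"
  pkg="org/apache/spark/streaming/flume/sink"
  extract_dir="${jar%.jar}"   # e.g. .../target/spark-streaming-flume-sink_2.11-2.1.0-sources
  dest="$spark_dir/external/flume-sink/src/main/scala/$pkg"

  [ -f "$jar" ] || { echo "sources jar not found: $jar" >&2; return 1; }
  mkdir -p "$extract_dir" "$dest"
  unzip -o -q "$jar" -d "$extract_dir" || return 1
  # Copy the contents of the unpacked sink package into the source directory.
  cp -R "$extract_dir/$pkg/." "$dest/"
}
```

Possible usage, from the root of the built tree: restore_flume_sink_sources "$PWD", followed by Rebuild in IDEA.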

    3. The reading environment is now ready. To verify it, run the LocalPi example that ships with Spark.
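
The LocalPi check can also be run from a terminal instead of from IDEA. The sketch below assumes the build from part (1) has completed so that the examples are packaged, and it relies on the bin/run-example script in the Spark tree; the wrapper name run_local_pi is hypothetical.

```shell
# Sketch: run the LocalPi example from a built Spark tree.
# run_local_pi is an illustrative wrapper, not a Spark command.
run_local_pi() {
  local spark_dir="$1"
  [ -x "$spark_dir/bin/run-example" ] || {
    echo "bin/run-example not found under $spark_dir" >&2
    return 1
  }
  # run-example resolves the short name to the class in org.apache.spark.examples.
  "$spark_dir/bin/run-example" LocalPi
}
```

Possible usage: run_local_pi "$PWD" from the root of the Spark tree. If the environment is healthy, the example should print a line starting with "Pi is roughly".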