Recap
Install IDEA, Spark, and Flink, along with the Scala runtime that the Spark environment requires.

System environment
Three CentOS 7 hosts
One master node and two slave nodes, named slave1 and slave2
A fully distributed Hadoop cluster
Required packages
ideaIC-2021.3.2.tar.gz
https://www.jetbrains.com/idea/download/other.html

scala-2.12.10.tgz (Spark 3.1.3 is built against Scala 2.12)
https://www.scala-lang.org/download/2.12.10.html

spark-3.1.3-bin-hadoop3.2.tgz
https://archive.apache.org/dist/spark/spark-3.1.3/spark-3.1.3-bin-hadoop3.2.tgz

flink-1.12.2-bin-scala_2.12.tgz
https://archive.apache.org/dist/flink/flink-1.12.2/flink-1.12.2-bin-scala_2.12.tgz
1. Install IDEA
1) Extract the IDEA archive

```shell
[root@master ~]# cd /tdsgpo
[root@master tdsgpo]# sudo tar -zxvf ideaIC-2021.3.2.tar.gz
```

2) Start IDEA

```shell
[root@master tdsgpo]# cd /tdsgpo/idea-IC-213.6777.52/bin
[root@master bin]# sh idea.sh
```
2. Install Scala
1) Extract the Scala archive and configure environment variables

```shell
[root@master bin]# cd /tdsgpo/
[root@master tdsgpo]# sudo tar -zxvf scala-2.12.10.tgz
[root@master tdsgpo]# sudo vim /etc/profile
```

Add the following lines to /etc/profile:

```shell
export SCALA_HOME=/tdsgpo/scala-2.12.10
export PATH=$PATH:$SCALA_HOME/bin
```

Then reload the profile:

```shell
[root@master tdsgpo]# source /etc/profile
```
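Hand-editing /etc/profile is easy to get wrong on repeated runs, since re-running the same step duplicates the export lines. A small guard makes the append idempotent. A minimal sketch, writing to a local `profile.test` file here instead of /etc/profile so it can be tried safely:

```shell
# Idempotently append the Scala env vars: only add them if SCALA_HOME
# is not already present in the target file.
PROFILE=profile.test   # stand-in for /etc/profile in this sketch
touch "$PROFILE"
if ! grep -q 'SCALA_HOME' "$PROFILE"; then
  cat >> "$PROFILE" <<'EOF'
export SCALA_HOME=/tdsgpo/scala-2.12.10
export PATH=$PATH:$SCALA_HOME/bin
EOF
fi
# Running this block a second time adds nothing, so the file stays clean.
```

The same pattern works for the Spark variables added in the next section.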
3. Install Spark
1) Extract the Spark archive and configure environment variables

```shell
[root@master tdsgpo]# sudo tar -zxvf spark-3.1.3-bin-hadoop3.2.tgz
[root@master tdsgpo]# mv spark-3.1.3-bin-hadoop3.2 /tdsgpo/spark-3.1.3
[root@master tdsgpo]# sudo vim /etc/profile
```

Add the following lines to /etc/profile:

```shell
export SPARK_HOME=/tdsgpo/spark-3.1.3
export PATH=$PATH:$SPARK_HOME/bin
```

Then reload the profile:

```shell
[root@master tdsgpo]# source /etc/profile
```
2) Configure slaves

```shell
[root@master tdsgpo]# sudo vim /tdsgpo/spark-3.1.3/conf/slaves
```

List one worker host per line:

```
master
slave1
slave2
```
3) Configure spark-env.sh

```shell
[root@master tdsgpo]# sudo vim /tdsgpo/spark-3.1.3/conf/spark-env.sh
```

Add the following lines:

```shell
export SPARK_MASTER_HOST=master
export SPARK_MASTER_WEBUI_PORT=8081
export SPARK_WORKER_MEMORY=512m
export SPARK_EXECUTOR_MEMORY=512m
```

Copy the Spark directory to both slave nodes (note the -r flag, since we are copying a directory), then open up its permissions (chmod, not chown, is the command that takes a numeric mode):

```shell
[root@master tdsgpo]# scp -r /tdsgpo/spark-3.1.3 slave1:/tdsgpo
[root@master tdsgpo]# scp -r /tdsgpo/spark-3.1.3 slave2:/tdsgpo
[root@master tdsgpo]# sudo chmod -R 777 /tdsgpo/spark-3.1.3
```
4) Start the Spark cluster

On the master node:

```shell
[root@master ~]# cd /tdsgpo/spark-3.1.3/sbin
[root@master sbin]# ./start-master.sh
[root@master sbin]# ./start-slave.sh spark://master:7077
```

On slave1 and slave2:

```shell
[root@slave1 ~]# cd /tdsgpo/spark-3.1.3/sbin
[root@slave1 sbin]# ./start-slave.sh spark://master:7077
```

Check the running daemons with jps; each slave node should show a Worker process (the master node additionally shows a Master process):

```shell
[root@slave1 sbin]# jps
2314 Worker
2341 Jps
```
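Eyeballing jps output works, but the check can also be scripted. A sketch, with the jps output stubbed as a sample string (on a real node you would use `jps_output=$(jps)`), that verifies the expected daemons on the master:

```shell
# On the master node, jps should list Master, Worker, and Jps.
# Stubbed sample of master-node output for illustration:
jps_output="2290 Master
2314 Worker
2341 Jps"

missing=0
for daemon in Master Worker; do
  if echo "$jps_output" | grep -qw "$daemon"; then
    echo "$daemon: running"
  else
    echo "$daemon: MISSING"
    missing=1
  fi
done
# A non-zero $missing here would signal a failed startup.
```

On a slave node the loop would check only for Worker.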
4. Install Flink
1) Extract the Flink archive

```shell
[root@master ~]# cd /tdsgpo/
[root@master tdsgpo]# sudo tar -zxvf flink-1.12.2-bin-scala_2.12.tgz
```
2) Configure flink-conf.yaml

```shell
[root@master tdsgpo]# sudo vim /tdsgpo/flink-1.12.2/conf/flink-conf.yaml
```

Set the following values (jobmanager.rpc.address must match the master's hostname, which is master in this cluster):

```yaml
jobmanager.rpc.address: master
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1024m
taskmanager.numberOfTaskSlots: 2
parallelism.default: 1
```
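Individual values can be sanity-checked from the shell without reopening the editor. A sketch, recreating the relevant lines in a local sample file so the commands are self-contained (on the cluster you would point awk at /tdsgpo/flink-1.12.2/conf/flink-conf.yaml):

```shell
# Recreate the configured key: value lines locally for illustration.
cat > flink-conf.sample <<'EOF'
jobmanager.rpc.address: master
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1024m
taskmanager.numberOfTaskSlots: 2
parallelism.default: 1
EOF

# Pull a single value out of the flat "key: value" format.
slots=$(awk -F': ' '$1 == "taskmanager.numberOfTaskSlots" {print $2}' flink-conf.sample)
echo "task slots per TaskManager: $slots"
```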
3) Configure masters

```shell
[root@master tdsgpo]# sudo vim /tdsgpo/flink-1.12.2/conf/masters
```

```
master:8081
```

4) Configure slaves

```shell
[root@master tdsgpo]# sudo vim /tdsgpo/flink-1.12.2/conf/slaves
```

```
master
slave1
slave2
```

5) Configure workers (Flink 1.12 renamed the slaves file to workers, so workers is the file the startup scripts actually read; listing the hosts in both does no harm)

```shell
[root@master tdsgpo]# sudo vim /tdsgpo/flink-1.12.2/conf/workers
```

```
master
slave1
slave2
```
6) Copy the Flink directory to both slave nodes

```shell
[root@master tdsgpo]# scp -r /tdsgpo/flink-1.12.2 slave1:/tdsgpo/
[root@master tdsgpo]# scp -r /tdsgpo/flink-1.12.2 slave2:/tdsgpo/
[root@master tdsgpo]# sudo chmod -R 777 /tdsgpo/flink-1.12.2/
```

7) Start the Flink cluster

```shell
[root@master tdsgpo]# cd /tdsgpo/flink-1.12.2/
[root@master flink-1.12.2]# ./bin/start-cluster.sh
```
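Once the cluster is up, the Flink web UI on master:8081 should report a total slot count that follows directly from the configuration above: the number of hosts in the workers file times taskmanager.numberOfTaskSlots. A quick arithmetic check of what to expect:

```shell
# 3 workers (master, slave1, slave2), each TaskManager offering 2 slots
workers=3
slots_per_taskmanager=2
total_slots=$((workers * slots_per_taskmanager))
echo "expected total task slots: $total_slots"   # 6
```

If the UI shows fewer slots, one of the TaskManagers did not start; check its node with jps.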
In the next article we will set up the IDEA code editor on the fully distributed Hadoop cluster and integrate Sqoop, Flume, and Kafka.