April 2025

1. On top of the existing Hadoop on YARN environment, add the Spark configuration.
spark-env.sh

HADOOP_CONF_DIR=/usr/local/hadoop34/etc/hadoop
YARN_CONF_DIR=/usr/local/hadoop34/etc/hadoop

workers file:

slave1
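
On a plain YARN deployment only the submitting node strictly needs Spark, but if the same install should also exist on slave1 (standalone mode, which is what the workers file drives, requires it), one way to sync it is with scp. The install path /usr/local/spark below is an assumption; the notes do not state it:

# Copy the whole Spark install, including conf/, to the worker
scp -r /usr/local/spark slave1:/usr/local/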

2. Run a test

# Submit the SparkPi example to the YARN cluster
./bin/spark-submit --master yarn --class org.apache.spark.examples.SparkPi ./examples/jars/spark-examples_2.12-3.5.5.jar 10

# Run pyspark
./bin/pyspark --master yarn
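
To confirm the job really went through YARN, query the ResourceManager's application list; a one-line computation inside the pyspark shell is a quick smoke test (the expected sum is simply what that input produces):

# List YARN applications; SparkPi should show up as FINISHED
yarn application -list -appStates ALL

# Inside the pyspark shell:
# >>> sc.parallelize(range(100)).sum()   # expected result: 4950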


1. Downloads
https://hadoop.apache.org/releases.html
jdk: https://bell-sw.com/pages/downloads/#jdk-8-lts
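
A typical unpack step to produce the directory names used in step 4; the archive names below are examples and depend on the exact versions downloaded:

# Unpack Hadoop and the JDK into /usr/local, then rename to the
# short paths referenced by the environment variables in step 4
tar -xzf hadoop-3.4.1.tar.gz -C /usr/local
mv /usr/local/hadoop-3.4.1 /usr/local/hadoop34
tar -xzf bellsoft-jdk8u*-linux-amd64.tar.gz -C /usr/local
mv /usr/local/jdk8u* /usr/local/jdk8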

2. Network configuration (/etc/hosts)

192.168.0.110 master
192.168.0.140 slave1
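
The same entries need to exist on both hosts. A quick check that name resolution works:

# Both names should resolve and answer from the master
ping -c 1 master
ping -c 1 slave1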

3. Passwordless SSH setup

# Run on the master host
ssh-keygen -t rsa
ssh-copy-id master
ssh-copy-id slave1
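
Verify key-based login before moving on; neither command should prompt for a password:

ssh master hostname
ssh slave1 hostname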

4. Environment variables (/etc/profile)

export JAVA_HOME=/usr/local/jdk8
export HADOOP_HOME=/usr/local/hadoop34
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native:$LD_LIBRARY_PATH
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
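
Apply the variables to the current shell and confirm both toolchains are on PATH:

source /etc/profile
java -version
hadoop version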

5. Disable the host firewall

systemctl disable --now firewalld.service
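
disable --now stops the service and removes it from boot in one step; this should run on every node. To double-check:

# Both should report inactive / disabled
systemctl is-active firewalld.service
systemctl is-enabled firewalld.service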
