Setting Up a High-Availability Cluster
Before setting up the high-availability cluster: if you previously set up a fully distributed Hadoop cluster, first run stop-all.sh to stop all of its services, keep only the JDK and ZooKeeper services, and then proceed with the setup.
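To confirm, run jps on each machine; with only ZooKeeper left, the sole remaining non-Jps process should be QuorumPeerMain:
jps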
Objectives:
- Introduction to high-availability clusters
- Deploy a Hadoop high-availability cluster
I. Introduction to high-availability clusters
- HDFS high-availability cluster
- YARN high-availability cluster
II. Deploying the high-availability cluster
First, create the directory hadoop313-HA on each of the three machines:
mkdir -p /export/servers/hadoop313-HA
- Plan the Hadoop high-availability cluster
- Install Hadoop
Install Hadoop under the /export/servers directory on hadoop01, then rename the installation directory to hadoop313-HA with the mv command.
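For example, assuming the Hadoop 3.1.3 release archive (the archive name is an assumption; adjust it to the version you downloaded):
tar -zxvf hadoop-3.1.3.tar.gz -C /export/servers/
cd /export/servers
mv hadoop-3.1.3 hadoop313-HA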
- Modify the system environment variables
vi /etc/profile
# then append the following at the end of the file
export HADOOP_HOME=/export/servers/hadoop313-HA
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile  # make the changes take effect
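To verify that the variables are in effect, the hadoop command should now resolve and print its version banner:
hadoop version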
- Modify the configuration files (located in /export/servers/hadoop313-HA/etc/hadoop)
1) Configure the Hadoop runtime environment: vi hadoop-env.sh
export JAVA_HOME=/export/servers/jdk1.8.0_241
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
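2) Configure Hadoop: vi core-site.xml
A minimal core-site.xml sketch consistent with the nameservice ns1 and the ZooKeeper quorum used elsewhere in this setup (the hadoop.tmp.dir path is an assumption):
<configuration>
<!-- default file system: the HA nameservice ns1 defined in hdfs-site.xml -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<!-- assumed base directory for Hadoop temporary data; adjust to your layout -->
<property>
<name>hadoop.tmp.dir</name>
<value>/export/data/hadoop313-HA</value>
</property>
<!-- ZooKeeper quorum that ZKFC uses for automatic failover -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
</property>
</configuration>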
3) Configure HDFS: vi hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/export/data/hadoop313-HA/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/export/data/hadoop313-HA/datanode</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>hadoop01:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>hadoop01:9870</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>hadoop02:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>hadoop02:9870</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/export/data/journaldata</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
4) Configure MapReduce: vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=${HADOOP_HOME}</value>
</property>
</configuration>
5) Configure YARN: vi yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>jyarn</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop01</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop02</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
</configuration>
6) Configure the virtual machines that run the Hadoop worker nodes: vi workers
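Assuming every machine runs a DataNode and a NodeManager (consistent with the three JournalNodes configured above), the workers file lists all three hosts:
hadoop01
hadoop02
hadoop03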
- Distribute the Hadoop installation directory
scp -r /export/servers/hadoop313-HA root@hadoop02:/export/servers/
scp -r /export/servers/hadoop313-HA root@hadoop03:/export/servers/
- Distribute the system environment variable file, then make it take effect with source
scp /etc/profile root@hadoop02:/etc/
scp /etc/profile root@hadoop03:/etc/
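Then log in to hadoop02 and hadoop03 and run the following so the new variables take effect in the current shell:
source /etc/profile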
- Start the Hadoop high-availability cluster
1) Start a JournalNode on each of hadoop01, hadoop02, and hadoop03:
hdfs --daemon start journalnode
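On each of the three machines, jps should now additionally list a JournalNode process:
jps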
2) Format the HDFS file system on hadoop01:
hdfs namenode -format
3) Synchronize the NameNode metadata:
scp -r /export/data/hadoop313-HA/namenode/ hadoop02:/export/data/hadoop313-HA/
scp -r /export/data/hadoop313-HA/namenode/ hadoop03:/export/data/hadoop313-HA/
Note: synchronizing the NameNode directory ensures that both NameNodes hold identical FSImage files when HDFS starts for the first time. Perform this step only once, before the first startup of the Hadoop high-availability cluster. (Only hadoop02 hosts the standby NameNode nn2, so the copy to hadoop03 is not strictly required.)
4) Format ZKFC
To ensure that the ZooKeeper cluster can provide high availability for HDFS through ZKFC, format ZKFC before the **first startup** of the Hadoop high-availability cluster:
hdfs zkfc -formatZK
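Formatting creates the /hadoop-ha znode (with a child znode for the nameservice ns1) in ZooKeeper; it can be verified from the ZooKeeper client:
zkCli.sh
ls /hadoop-ha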
5) Start HDFS:
start-dfs.sh
6) Start YARN:
start-yarn.sh
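Assuming all three machines run DataNode and NodeManager, jps should now report roughly the following processes (QuorumPeerMain belongs to ZooKeeper):
hadoop01: NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, JournalNode, QuorumPeerMain
hadoop02: NameNode, DFSZKFailoverController, ResourceManager, DataNode, NodeManager, JournalNode, QuorumPeerMain
hadoop03: DataNode, NodeManager, JournalNode, QuorumPeerMain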
- View the cluster status information
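The NameNode and ResourceManager states can be checked through the NameNode web UIs at hadoop01:9870 and hadoop02:9870, or from the command line using the IDs defined in the configuration above:
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
One NameNode and one ResourceManager should report active, the others standby.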
- Test active/standby failover
[Test 1]: Stop the NameNode on hadoop01, which is currently in the active state:
hdfs --daemon stop namenode
Now check the NameNode states again: *hadoop01 is no longer reachable, and hadoop02 has been promoted from standby to active*.
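To confirm from the command line, nn2 (the NameNode on hadoop02) should now report active:
hdfs haadmin -getServiceState nn2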
[Test 2]: Stop the ResourceManager on hadoop01, which is currently in the active state:
yarn --daemon stop resourcemanager
Now check the ResourceManager states again: *hadoop01 is no longer reachable, and hadoop02 has been promoted from standby to active*.
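Likewise, rm2 (the ResourceManager on hadoop02) should now report active:
yarn rmadmin -getServiceState rm2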