Open-Source Big Data Cluster Deployment (17): Hadoop Cluster Configuration (2)

櫰木 · 2 years ago · Technical article · 1237 reads

Hadoop Cluster Configuration

Configuration file: workers

[root@hd1.dtstack.com software]# cd /opt/hadoop/etc/hadoop
[root@hd1.dtstack.com hadoop]# pwd
/opt/hadoop/etc/hadoop
[root@hd1.dtstack.com hadoop]# cat >> workers <<EOF
hd1.dtstack.com
hd2.dtstack.com
hd3.dtstack.com
EOF
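Because a duplicated or missing hostname in workers silently changes which nodes the cluster start scripts reach, a quick sanity check is worthwhile. A minimal sketch, assuming the file sits in the current directory:

```shell
# Sanity-check the workers file: report duplicates and count unique hosts.
WORKERS_FILE=${WORKERS_FILE:-workers}

# uniq -d prints only lines that occur more than once in the sorted input.
dups=$(sort "$WORKERS_FILE" | uniq -d)
[ -n "$dups" ] && echo "duplicate entries: $dups" >&2

# A three-node cluster should report 3 unique workers here.
echo "unique workers: $(sort -u "$WORKERS_FILE" | wc -l)"
```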


Configuration file: hdfs-site.xml

[root@hd1.dtstack.com hadoop]# cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
--><!-- Put site-specific property overrides in this file. --><configuration>
    <property>
      <!-- Replication factor: 3 -->
      <name>dfs.replication</name>
        <value>3</value>
    </property>
    <!-- Name of the fully distributed cluster (nameservice) -->
    <property>
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <!-- NameNodes that make up the nameservice -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hd1.dtstack.com:8020</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hd2.dtstack.com:8020</value>
    </property>
 
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hd1.dtstack.com:9870</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hd2.dtstack.com:9870</value>
    </property>
    <!-- Location on the JournalNodes where NameNode edit logs are stored -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hd1.dtstack.com:8485;hd2.dtstack.com:8485;hd3.dtstack.com:8485/mycluster</value>
    </property>
    <!-- Local directory where the NameNode stores namespace metadata -->
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///data/hadoop/dfs/name</value>
    </property>
    <!-- Local directories where DataNodes store block data -->
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///data/hadoop/dfs/data</value>
    </property>
    <!-- Fencing: ensure only one NameNode serves clients at any moment -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>
                sshfence
                shell(/bin/true)
        </value>
    </property>
    <!-- sshfence requires passwordless SSH between the NameNodes -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- JournalNode local storage directory -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/data/hadoop/data/jn</value>
    </property>
    <!-- Client failover proxy provider for mycluster -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Enable automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
 
    <!-- NameNode Kerberos settings -->
    <property>
        <name>dfs.namenode.keytab.file</name>
        <value>/etc/security/keytab/hdfs.keytab</value>
    </property>
    <property>
        <name>dfs.namenode.kerberos.principal</name>
        <value>hdfs/_HOST@DTSTACK.COM</value>
    </property>
    <property>
        <name>dfs.namenode.kerberos.https.principal</name>
        <value>HTTP/_HOST@DTSTACK.COM</value>
    </property>
 
 
    <!-- DataNode Kerberos settings -->
 
    <property>
        <name>dfs.data.transfer.protection</name>
        <value>integrity</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir.perm</name>
        <value>750</value>
    </property>
    <property>
        <name>dfs.block.access.token.enable</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.keytab.file</name>
        <value>/etc/security/keytab/hdfs.keytab</value>
    </property>
    <property>
        <name>dfs.datanode.kerberos.principal</name>
        <value>hdfs/_HOST@DTSTACK.COM</value>
    </property>
    <property>
        <name>dfs.datanode.kerberos.https.principal</name>
        <value>HTTP/_HOST@DTSTACK.COM</value>
    </property>
    <property>
        <name>dfs.permissions.superusergroup</name>
        <value>hadoop</value>
    </property>
 
    <property>
        <name>dfs.datanode.https.address</name>
        <value>0.0.0.0:60075</value>
    </property>
 
    <!-- JournalNode Kerberos settings -->
 
    <property>
        <name>dfs.journalnode.keytab.file</name>
        <value>/etc/security/keytab/hdfs.keytab</value>
    </property>
    <property>
        <name>dfs.journalnode.kerberos.principal</name>
        <value>hdfs/_HOST@DTSTACK.COM</value>
    </property>
    <property>
        <name>dfs.journalnode.kerberos.internal.spnego.principal</name>
        <value>HTTP/_HOST@DTSTACK.COM</value>
    </property>
 
    <property>
        <name>dfs.journalnode.https-address</name>
        <value>0.0.0.0:18480</value>
    </property>
 
    <!-- HTTP policy -->
 
    <property>
        <name>dfs.http.policy</name>
        <value>HTTPS_ONLY</value>
    </property>
 
    <!-- WebHDFS settings -->
 
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.web.authentication.kerberos.principal</name>
        <value>HTTP/_HOST@DTSTACK.COM</value>
    </property>
    <property>
        <name>dfs.web.authentication.kerberos.keytab</name>
        <value>/etc/security/keytab/hdfs.keytab</value>
    </property>
    
    <property>
        <name>dfs.permissions.enabled</name>
        <value>true</value>
    </property>
    <!-- dfs.permissions is the deprecated pre-2.x alias of dfs.permissions.enabled -->
    <property>
        <name>dfs.permissions</name>
        <value>true</value>
    </property>
    <!-- Delegate HDFS authorization decisions to Apache Ranger -->
    <property>
        <name>dfs.namenode.inode.attributes.provider.class</name>
        <value>org.apache.ranger.authorization.hadoop.RangerHdfsAuthorizer</value>
    </property>
    <property>
        <name>dfs.permissions.ContentSummary.subAccess</name>
        <value>true</value>
    </property>
</configuration>
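With a file this long it is easy to misread a value. A single property can be pulled out with plain POSIX tools; `get_prop` below is a hypothetical helper, and it assumes the one-name-and-one-value-per-line layout used above (where the hdfs command is available, `hdfs getconf -confKey` is the authoritative check):

```shell
# get_prop NAME FILE: print the <value> on the line after <name>NAME</name>.
get_prop() {
    grep -A1 "<name>$1</name>" "$2" \
        | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p'
}

# Example: confirm the replication factor set in hdfs-site.xml.
get_prop dfs.replication /opt/hadoop/etc/hadoop/hdfs-site.xml
```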


Configuration file: core-site.xml

[root@hd1.dtstack.com hadoop]# cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
 
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <!-- Combine the two NameNode addresses into one logical cluster, mycluster -->
        <name>fs.defaultFS</name>
        <value>hdfs://mycluster</value>
    </property>
 
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/data/tmp</value>
    </property>
 
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hd1.dtstack.com:2181,hd2.dtstack.com:2181,hd3.dtstack.com:2181</value>
    </property>

    <!-- Trash retention time, in minutes -->
    <property>
        <name>fs.trash.interval</name>
        <value>10080</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hive.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hive.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.hive.users</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.trino.hosts</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.trino.groups</name>
        <value>*</value>
    </property>
    <property>
        <name>hadoop.proxyuser.trino.users</name>
        <value>*</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
        <description>Size of read/write buffer used in SequenceFiles</description>
    </property>
 
    <!-- Rules mapping Kerberos principals to local OS users -->
    <property>
        <name>hadoop.security.auth_to_local</name>
        <value>
    RULE:[2:$1/$2@$0]([ndj]n\/.*@DTSTACK\.COM)s/.*/hdfs/
    RULE:[2:$1/$2@$0]([rn]m\/.*@DTSTACK\.COM)s/.*/yarn/
    RULE:[2:$1/$2@$0](jhs\/.*@DTSTACK\.COM)s/.*/mapred/
    RULE:[1:$1@$0](^.*@DTSTACK\.COM$)s/^(.*)@DTSTACK\.COM$/$1/g
    RULE:[2:$1@$0](^.*@DTSTACK\.COM$)s/^(.*)@DTSTACK\.COM$/$1/g
    DEFAULT
  </value>
    </property>
 
    <!-- Enable Kerberos authentication for the Hadoop cluster -->
    <property>
        <name>hadoop.security.authentication</name>
        <value>kerberos</value>
    </property>
 
    <!-- Enable Hadoop service-level authorization -->
    <property>
        <name>hadoop.security.authorization</name>
        <value>true</value>
    </property>
 
</configuration>
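One easy mistake here: fs.trash.interval is measured in minutes, not seconds. A quick check of what 10080 amounts to:

```shell
# fs.trash.interval is in minutes: 10080 / 60 / 24 = 7 days.
interval=10080
echo "trash is kept for $(( interval / 60 / 24 )) days"
```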


Configuration file: yarn-site.xml

[root@hd1.dtstack.com hadoop]# cat yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
 
<!-- Site specific YARN configuration properties -->
<configuration>
    <!-- Shuffle service for MapReduce -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <!-- Identify the cluster and the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hd1.dtstack.com</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hd2.dtstack.com</value>
    </property>
 
    <!-- ZooKeeper ensemble address -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hd1.dtstack.com:2181,hd2.dtstack.com:2181,hd3.dtstack.com:2181</value>
    </property>
    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <!-- Store ResourceManager state in the ZooKeeper ensemble -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.application.classpath</name>
        <value>/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/lib/*:/opt/hadoop/share/hadoop/common/*:/opt/hadoop/share/hadoop/hdfs:/opt/hadoop/share/hadoop/hdfs/lib/*:/opt/hadoop/share/hadoop/hdfs/*:/opt/hadoop/share/hadoop/mapreduce/lib/*:/opt/hadoop/share/hadoop/mapreduce/*:/opt/hadoop/share/hadoop/yarn:/opt/hadoop/share/hadoop/yarn/lib/*:/opt/hadoop/share/hadoop/yarn/*</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hd1.dtstack.com:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hd2.dtstack.com:8088</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>
 
    <property>
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>12288</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>8</value>
    </property>
 
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>12288</value>
    </property>
 
    <!-- Use LinuxContainerExecutor so the NodeManager runs containers as the submitting user (required with Kerberos) -->
    <property>
        <name>yarn.nodemanager.container-executor.class</name>
        <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
    </property>
 
    <!-- Group of the user that launches the NodeManager -->
    <property>
        <name>yarn.nodemanager.linux-container-executor.group</name>
        <value>hadoop</value>
    </property>
 
    <!-- Path to the container-executor binary -->
    <property>
        <name>yarn.nodemanager.linux-container-executor.path</name>
        <value>/opt/hadoop/bin/container-executor</value>
    </property>
    <!-- Kerberos principal for the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.principal</name>
        <value>yarn/_HOST@DTSTACK.COM</value>
    </property>
 
    <!-- Kerberos keytab for the ResourceManager -->
    <property>
        <name>yarn.resourcemanager.keytab</name>
        <value>/etc/security/keytab/yarn.keytab</value>
    </property>
 
    <!-- Kerberos principal for the NodeManager -->
    <property>
        <name>yarn.nodemanager.principal</name>
        <value>yarn/_HOST@DTSTACK.COM</value>
    </property>
 
    <!-- Kerberos keytab for the NodeManager -->
    <property>
        <name>yarn.nodemanager.keytab</name>
        <value>/etc/security/keytab/yarn.keytab</value>
    </property>
 
    <property>
        <name>yarn.admin.acl</name>
        <value>*</value>
    </property>
 
    <!-- Log aggregation settings -->

    <property>
        <name>yarn.log-aggregation.retain-check-interval-seconds</name>
        <value>604800</value>
    </property>
 
    <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/logs</value>
        <description>default is /tmp/logs</description>
    </property>
 
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
        <description>How long to keep aggregated logs, in seconds (604800 = 7 days)</description>
    </property>
 
    <property>
        <name>yarn.nodemanager.delete.debug-delay-sec</name>
        <value>600</value>
        <description>Delay, in seconds, before deleting local files and logs after an application finishes</description>
    </property>
   
    <property>
        <name>yarn.log.server.url</name>
        <value>http://hd1.dtstack.com:19888/jobhistory/logs/</value>
    </property>
 
    <property>
        <name>yarn.nodemanager.disk-health-checker.enable</name>
       <value>true</value>
   </property>
 
    <property>
        <name>yarn.nodemanager.disk-health-checker.interval-ms</name>
       <value>120000</value>
   </property>
 
  <property>
     <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
     <value>0.25</value>
  </property>
  <property>
     <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
     <value>90</value>
  </property>
 
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/data/yarn/local</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/data/yarn/logs</value>
    </property>
    <property>
        <name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb</name>
        <value>10240</value>
    </property>
</configuration>
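The scheduler and NodeManager limits above also determine container density: with 12288 MB offered per node and a 512 MB minimum allocation, memory permits at most 24 minimum-size containers per NodeManager, and the 8 configured vcores usually become the tighter bound. The arithmetic:

```shell
# Container ceiling per NodeManager, from the yarn-site.xml values above.
node_mem_mb=12288
min_alloc_mb=512
vcores=8

echo "memory allows up to $(( node_mem_mb / min_alloc_mb )) minimum-size containers"
echo "vcores allow up to $vcores single-core containers"
```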


Configuration file: mapred-site.xml

[root@hd1.dtstack.com hadoop]# cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at
 
    http://www.apache.org/licenses/LICENSE-2.0
 
  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
 
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <description>The runtime framework for executing MapReduce jobs. Can be one of local, classic or yarn.</description>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>hd1.dtstack.com:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hd1.dtstack.com:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hd1.dtstack.com:19888</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>http://hd1.dtstack.com:9001</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.done-dir</name>
        <value>/history/done</value>
    </property>
 
    <property>
        <name>mapreduce.jobhistory.intermediate-done-dir</name>
        <value>/history/done_intermediate</value>
    </property>
 
    <property>
        <name>yarn.app.mapreduce.am.staging-dir</name>
        <value>/tmp/hadoop-yarn/staging</value>
    </property>
    <!-- Kerberos keytab for the JobHistory Server -->
    <property>
        <name>mapreduce.jobhistory.keytab</name>
        <value>/etc/security/keytab/yarn.keytab</value>
    </property>
 
    <!-- Kerberos principal for the JobHistory Server -->
    <property>
        <name>mapreduce.jobhistory.principal</name>
        <value>yarn/_HOST@DTSTACK.COM</value>
    </property>
</configuration>


Configuration file: hadoop-env.sh

Append the following at the end:

[root@hd1.dtstack.com hadoop]# cat hadoop-env.sh
export JAVA_HOME=/opt/java
export HADOOP_OS_TYPE=${HADOOP_OS_TYPE:-$(uname -s)}
export HDFS_NAMENODE_OPTS="-Dcom.sun.management.jmxremote.authenticate=false   -Dcom.sun.management.jmxremote.ssl=false   -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=9609   -javaagent:/opt/prometheus/jmx_prometheus_javaagent-0.3.1.jar=9509:/opt/prometheus/namenode.yml $HDFS_NAMENODE_OPTS"
export YARN_RESOURCEMANAGER_OPTS="-Dcom.sun.management.jmxremote.authenticate=false   -Dcom.sun.management.jmxremote.ssl=false   -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=9603   -javaagent:/opt/prometheus/jmx_prometheus_javaagent-0.3.1.jar=9503:/opt/prometheus/resourcemanager.yml $YARN_RESOURCEMANAGER_OPTS"
export HDFS_DATANODE_OPTS="-Dcom.sun.management.jmxremote.authenticate=false   -Dcom.sun.management.jmxremote.ssl=false   -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=9601   -javaagent:/opt/prometheus/jmx_prometheus_javaagent-0.3.1.jar=9501:/opt/prometheus/datanode.yml $HDFS_DATANODE_OPTS"
export YARN_NODEMANAGER_OPTS="-Dcom.sun.management.jmxremote.authenticate=false   -Dcom.sun.management.jmxremote.ssl=false   -Dcom.sun.management.jmxremote.local.only=false   -Dcom.sun.management.jmxremote.port=9604   -javaagent:/opt/prometheus/jmx_prometheus_javaagent-0.3.1.jar=9504:/opt/prometheus/nodemanager.yml $YARN_NODEMANAGER_OPTS"
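Each export above binds two ports per daemon: a JMX remote port and a jmx_exporter scrape port. Keeping the mapping in one place helps when opening firewall rules and writing the Prometheus scrape config; the table below simply restates the ports from the exports:

```shell
# Daemon -> JMX port / jmx_exporter port, as wired up in hadoop-env.sh above.
while IFS=: read -r daemon jmx exporter; do
    printf '%-16s jmx=%s exporter=%s\n' "$daemon" "$jmx" "$exporter"
done <<'EOF'
namenode:9609:9509
resourcemanager:9603:9503
datanode:9601:9501
nodemanager:9604:9504
EOF
```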


Configuration file: ssl-server.xml

[root@hd1.dtstack.com hadoop]# cat ssl-server.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
 
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
 
    <property>
        <name>ssl.server.truststore.location</name>
        <value>/data/kerberos/hdfs_ca/truststore</value>
        <description>Truststore to be used by the server. Must be specified.</description>
    </property>
 
    <property>
        <name>ssl.server.truststore.password</name>
        <value>abc123</value>
        <description>Truststore password</description>
    </property>
 
    <property>
        <name>ssl.server.truststore.type</name>
        <value>jks</value>
        <description>Optional. The truststore file format, default value is "jks".</description>
    </property>
 
    <property>
        <name>ssl.server.truststore.reload.interval</name>
        <value>10000</value>
        <description>Truststore reload check interval, in milliseconds.
  Default value is 10000 (10 seconds).
  </description>
    </property>
 
    <property>
        <name>ssl.server.keystore.location</name>
        <value>/data/kerberos/hdfs_ca/keystore</value>
        <description>Keystore to be used by NN and DN. Must be specified.
  </description>
    </property>
 
    <property>
        <name>ssl.server.keystore.password</name>
        <value>abc123</value>
        <description>Must be specified.
  </description>
    </property>
 
    <property>
        <name>ssl.server.keystore.keypassword</name>
        <value>abc123</value>
        <description>Must be specified.
  </description>
    </property>
 
    <property>
        <name>ssl.server.keystore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".
  </description>
    </property>
 
    <property>
        <name>ssl.server.exclude.cipher.list</name>
        <value>TLS_ECDHE_RSA_WITH_RC4_128_SHA,SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA,
  SSL_RSA_WITH_DES_CBC_SHA,SSL_DHE_RSA_WITH_DES_CBC_SHA,
  SSL_RSA_EXPORT_WITH_RC4_40_MD5,SSL_RSA_EXPORT_WITH_DES40_CBC_SHA,
  SSL_RSA_WITH_RC4_128_MD5</value>
        <description>Optional. The weak security cipher suites that you want excluded
  from SSL communication.</description>
    </property>
 
</configuration>


Configuration file: ssl-client.xml

[root@hd1.dtstack.com hadoop]# cat ssl-client.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
   Licensed to the Apache Software Foundation (ASF) under one or more
   contributor license agreements.  See the NOTICE file distributed with
   this work for additional information regarding copyright ownership.
   The ASF licenses this file to You under the Apache License, Version 2.0
   (the "License"); you may not use this file except in compliance with
   the License.  You may obtain a copy of the License at
 
       http://www.apache.org/licenses/LICENSE-2.0
 
   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
-->
<configuration>
 
    <property>
        <name>ssl.client.truststore.location</name>
        <value>/data/kerberos/hdfs_ca/truststore</value>
        <description>Truststore to be used by clients like distcp. Must be
  specified.
  </description>
    </property>
 
    <property>
        <name>ssl.client.truststore.password</name>
        <value>abc123</value>
        <description>Optional. Default value is "".
  </description>
    </property>
 
    <property>
        <name>ssl.client.truststore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".
  </description>
    </property>
 
    <property>
        <name>ssl.client.truststore.reload.interval</name>
        <value>10000</value>
        <description>Truststore reload check interval, in milliseconds.
  Default value is 10000 (10 seconds).
  </description>
    </property>
 
    <property>
        <name>ssl.client.keystore.location</name>
        <value>/data/kerberos/hdfs_ca/keystore</value>
        <description>Keystore to be used by clients like distcp. Must be
  specified.
  </description>
    </property>
 
    <property>
        <name>ssl.client.keystore.password</name>
        <value>abc123</value>
        <description>Optional. Default value is "".
  </description>
    </property>
 
    <property>
        <name>ssl.client.keystore.keypassword</name>
        <value>abc123</value>
        <description>Optional. Default value is "".
  </description>
    </property>
 
    <property>
        <name>ssl.client.keystore.type</name>
        <value>jks</value>
        <description>Optional. The keystore file format, default value is "jks".
  </description>
    </property>
 
</configuration>

2 Permission changes

[root@hd1.dtstack.com hadoop]# chown hdfs:hadoop /opt/hadoop/etc/hadoop/

Ø Configuration file: container-executor.cfg

[root@hd1.dtstack.com hadoop]# cat >container-executor.cfg<<EOF
yarn.nodemanager.linux-container-executor.group=hadoop
banned.users=hdfs,yarn,mapred
min.user.id=1000
allowed.system.users=hdfs,yarn,mapred
feature.tc.enabled=false
EOF
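container-executor.cfg is a flat key=value file, and LinuxContainerExecutor rejects a malformed file outright, so a quick read-back is useful. `cfg_get` below is a hypothetical helper, shown against the file just written:

```shell
# cfg_get KEY FILE: print the value of KEY from a flat key=value file.
cfg_get() {
    sed -n "s/^$1=//p" "$2"
}

CFG=${CFG:-container-executor.cfg}
echo "executor group: $(cfg_get yarn.nodemanager.linux-container-executor.group "$CFG")"
echo "min user id   : $(cfg_get min.user.id "$CFG")"
```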


3 Distributing the configuration

Copy the fully configured files from hd1.dtstack.com to hd2.dtstack.com and hd3.dtstack.com:

[root@hd1.dtstack.com software]# cd /opt
[root@hd1.dtstack.com software]# scp -r hadoop-3.2.4 root@hd2.dtstack.com:/opt/
[root@hd1.dtstack.com software]# scp -r hadoop-3.2.4 root@hd3.dtstack.com:/opt/


4 Hadoop cluster environment variables

Add the Hadoop environment variables to /etc/profile.

They were already configured in an earlier part of this series, so nothing needs to be added here.

Re-run source /etc/profile.

5 Hadoop cluster HA mode

Create the corresponding data directories:

[hdfs@hd1.dtstack.com ~]$ mkdir -p /data/hadoop/data
[hdfs@hd1.dtstack.com ~]$ chown hdfs:hdfs -R /data/hadoop
[hdfs@hd1.dtstack.com ~]$ chmod 755 /data/hadoop/data
[hdfs@hd2.dtstack.com ~]$ mkdir -p /data/hadoop/data
[hdfs@hd2.dtstack.com ~]$ chown hdfs:hdfs -R /data/hadoop
[hdfs@hd2.dtstack.com ~]$ chmod 755 /data/hadoop/data
[hdfs@hd3.dtstack.com ~]$ mkdir -p /data/hadoop/data
[hdfs@hd3.dtstack.com ~]$ chown hdfs:hdfs -R /data/hadoop
[hdfs@hd3.dtstack.com ~]$ chmod 755 /data/hadoop/data
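The nine commands above repeat the same steps per host. Locally the layout can be expressed as a loop; this sketch runs against a scratch root (ROOT) rather than /data, and the chown is left commented out because it needs root. In the real deployment the loop body would be executed on each host:

```shell
# Create the HA data directories with the mode used above (755).
# ROOT is a scratch location for illustration; the guide uses /data.
ROOT=${ROOT:-/tmp/hadoop-ha-demo}

for d in "$ROOT/hadoop/data" "$ROOT/hadoop/data/jn" \
         "$ROOT/hadoop/dfs/name" "$ROOT/hadoop/dfs/data"; do
    mkdir -p "$d"
    chmod 755 "$d"
done
# chown -R hdfs:hdfs "$ROOT/hadoop"   # needs root, as in the guide
```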


 

Run the following as the hdfs user.

Ø Start the JournalNodes

On hd1.dtstack.com through hd3.dtstack.com, run:

[hdfs@hd1.dtstack.com ~]$ hadoop-daemon.sh start journalnode
[hdfs@hd2.dtstack.com ~]$ hadoop-daemon.sh start journalnode
[hdfs@hd3.dtstack.com ~]$ hadoop-daemon.sh start journalnode


Run jps and check that the JournalNode process is present; if it is, startup succeeded. Otherwise, inspect the logs, fix the cause, and restart.

 

Ø Format [nn1] and start it

On hd1.dtstack.com, run:

[hdfs@hd1.dtstack.com ~]$ hdfs namenode -format
[hdfs@hd1.dtstack.com ~]$ hadoop-daemon.sh start namenode


Run jps and check that the NameNode process is present; if it is, startup succeeded. Otherwise, inspect the logs, fix the cause, and restart.

图片1.png 

Ø On [nn2], synchronize nn1's metadata

On hd2.dtstack.com, run:

[hdfs@hd2.dtstack.com ~]$ hdfs namenode -bootstrapStandby


Ø Start [nn2]

On hd2.dtstack.com, run:

[hdfs@hd2.dtstack.com ~]$ hadoop-daemon.sh start namenode


Run jps and check that the NameNode process is present; if it is, startup succeeded. Otherwise, inspect the logs, fix the cause, and restart.

图片2.png 

 

Ø Create the automatic-failover znode in ZooKeeper (the ZooKeeper ensemble must already be running)

[hdfs@hd2.dtstack.com ~]$ hdfs zkfc -formatZK


Ø Start all DataNodes from nn2

[hdfs@hd2.dtstack.com ~]$ hadoop-daemons.sh start datanode


Ø Start the ZKFC daemon on nn2

[hdfs@hd2.dtstack.com ~]$ hadoop-daemon.sh start zkfc


Note:

ü Whichever machine is started first has its NameNode become Active.

Note:

ü The HistoryServer can only run on nn1 (hd1.dtstack.com), because that is where it is configured.

Ø Manually force nn1 into the Active state

[hdfs@hd2.dtstack.com ~]$ hdfs haadmin -transitionToActive --forcemanual nn1


Note: a plain hdfs haadmin -transitionToActive nn1 fails while automatic failover is enabled, so the forced form above is required.

Start YARN

Ø Modify the YARN start/stop scripts

Set up passwordless SSH for the yarn user.

Add the following at the top of start-yarn.sh and stop-yarn.sh:

YARN_RESOURCEMANAGER_USER=yarn
YARN_NODEMANAGER_USER=yarn
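Instead of editing both scripts by hand, the two lines can be injected after the shebang with sed. A sketch assuming GNU sed and the /opt/hadoop layout used throughout this guide:

```shell
# Prepend the run-as users to the YARN start/stop scripts (after line 1,
# so an existing shebang stays first).
for f in /opt/hadoop/sbin/start-yarn.sh /opt/hadoop/sbin/stop-yarn.sh; do
    sed -i '1a YARN_RESOURCEMANAGER_USER=yarn\nYARN_NODEMANAGER_USER=yarn' "$f"
done
```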


Ø Start

On each NodeManager machine, create the task execution directories (as the yarn user):

mkdir -p /data/yarn/{logs,local}

On hd1.dtstack.com, run:

[root@hd1.dtstack.com ~]# start-yarn.sh


Start the HistoryServer:

[root@hd1.dtstack.com ~]# sudo -u yarn ./sbin/mr-jobhistory-daemon.sh start historyserver


Cluster verification

Ø NameNode verification

Get the HA service state:

[hdfs@hd1.dtstack.com hadoop]$ hdfs haadmin -getServiceState nn1
[hdfs@hd1.dtstack.com hadoop]$ hdfs haadmin -getServiceState nn2


图片4.png 

Ø YARN verification

Get the HA service state:

[hdfs@hd1.dtstack.com hadoop]$ yarn rmadmin -getServiceState rm1
[hdfs@hd1.dtstack.com hadoop]$ yarn rmadmin -getServiceState rm2


图片5.png 

Ø Web UI verification

 

YARN web UI: http://hd1.dtstack.com:8088/

图片6.png 

HDFS NameNode web UI: https://hd1.dtstack.com:9871/dfshealth.html#tab-overview

If the browser blocks the self-signed certificate, type thisisunsafe directly on Chrome's warning page to proceed.

图片7.png 

