This article explains how to deploy a Hadoop (CDH) environment. The steps are laid out simply and clearly, so they are easy to follow and reproduce step by step.
1. Preparation
Perform the following steps on all nodes.
1.1 Set the hostname
vi /etc/sysconfig/network
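On CentOS 6 this file holds the hostname; a minimal sketch of what it might contain on the first node (use hadoop2 and hadoop3 on the other nodes, matching the hosts table in 1.5):
NETWORKING=yes
HOSTNAME=hadoop1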
1.2 Disable SELinux
Check the SELinux status: getenforce
If SELinux is not disabled, disable it as follows:
vi /etc/selinux/config
Set SELINUX=disabled. The change takes effect after a reboot; you can finish the remaining setup first and reboot the host at the end.
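For a non-interactive edit, a small sketch with sed (assumes the file already contains a SELINUX= line):
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config
setenforce 0 # optionally put the running system into permissive mode now, instead of waiting for the reboot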
1.3 Disable the firewall
service iptables stop
chkconfig iptables off
chkconfig iptables --list
1.4 Network configuration
vim /etc/sysconfig/network-scripts/ifcfg-eth0
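A sketch of a static-IP configuration for the first node; the address matches the hosts table in 1.5, while the gateway and DNS values are assumptions to adapt to your own network:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.88.11
NETMASK=255.255.255.0
GATEWAY=192.168.88.1
DNS1=8.8.8.8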
1.5 Edit /etc/hosts
127.0.0.1 localhost # required
# CDH Cluster
192.168.88.11 hadoop1
192.168.88.12 hadoop2
192.168.88.13 hadoop3
1.6 Configure passwordless SSH login from hadoop1 to hadoop2
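A minimal sketch, run on hadoop1 as root (assumes OpenSSH; repeat the ssh-copy-id step for hadoop3 if the master should reach every node without a password):
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa # generate a key pair with no passphrase
ssh-copy-id root@hadoop2 # append the public key to hadoop2's authorized_keys
ssh root@hadoop2 hostname # verify: should print hadoop2 without prompting for a password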
1.7 Configure the NTP service on all nodes
All hosts in the cluster must keep their clocks synchronized; a large time skew causes all kinds of problems. The approach is as follows:
The master node acts as the NTP server: it synchronizes with an external time source and then serves time to all datanode nodes. All datanode nodes synchronize against the master.
Install the package on all nodes: yum install ntp
Then enable it at boot: chkconfig ntpd on
Check that it took effect: chkconfig --list ntpd (runlevels 2-5 shown as on means it succeeded).
Master node configuration
Before editing the configuration, run ntpdate once by hand so the local clock is not too far from the upstream source; otherwise ntpd may fail to synchronize. Here 202.112.10.36 is used as the time source: ntpdate -u 202.112.10.36
vi /etc/ntp.conf
# For more information about this file, see the man pages
# ntp.conf(5), ntp_acc(5), ntp_auth(5), ntp_clock(5), ntp_misc(5), ntp_mon(5).
driftfile /var/lib/ntp/drift
# Permit time synchronization with our time source, but do not
# permit the source to query or modify the service on this system.
restrict default kod nomodify notrap nopeer noquery
restrict -6 default kod nomodify notrap nopeer noquery
# Permit all access over the loopback interface. This could
# be tightened as well, but to do so would effect some of
# the administrative functions.
restrict 127.0.0.1
restrict -6 ::1
# Hosts on local network are less restricted.
# Allow other machines on the cluster's internal network to synchronize time
restrict 192.168.88.0 mask 255.255.255.0 nomodify notrap
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
# The most active public time servers for China: http://www.pool.ntp.org/zone/cn
server 210.72.145.44 prefer # China National Time Service Center
server 202.112.10.36 # 1.cn.pool.ntp.org
server 59.124.196.83 # 0.asia.pool.ntp.org
#broadcast 192.168.1.255 autokey # broadcast server
#broadcastclient # broadcast client
#broadcast 224.0.1.1 autokey # multicast server
#multicastclient 224.0.1.1 # multicast client
#manycastserver 239.255.254.254 # manycast server
#manycastclient 239.255.254.254 autokey # manycast client
# allow update time by the upper server
# Allow the upstream time servers to adjust the local clock
restrict 210.72.145.44 nomodify notrap noquery
restrict 202.112.10.36 nomodify notrap noquery
restrict 59.124.196.83 nomodify notrap noquery
# Undisciplined Local Clock. This is a fake driver intended for backup
# and when no outside source of synchronized time is available.
# Fall back to the local clock when no external time server is reachable
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
# Enable public key cryptography.
#crypto
includefile /etc/ntp/crypto/pw
# Key file containing the keys and key identifiers used when operating
# with symmetric key cryptography.
keys /etc/ntp/keys
# Specify the key identifiers which are trusted.
#trustedkey 4 8 42
# Specify the key identifier to use with the ntpdc utility.
#requestkey 8
# Specify the key identifier to use with the ntpq utility.
#controlkey 8
# Enable writing of statistics records.
#statistics clockstats cryptostats loopstats peerstats
service ntpd start
ntpstat
It usually takes 5-10 minutes before ntpd connects and synchronizes successfully.
[root@hadoop1 ~]# netstat -tlunp | grep ntp
udp 0 0 192.168.88.11:123 0.0.0.0:* 17339/ntpd
udp 0 0 127.0.0.1:123 0.0.0.0:* 17339/ntpd
udp 0 0 0.0.0.0:123 0.0.0.0:* 17339/ntpd
udp 0 0 fe80::20c:29ff:fe7c:123 :::* 17339/ntpd
udp 0 0 ::1:123 :::* 17339/ntpd
udp 0 0 :::123 :::* 17339/ntpd
[root@hadoop1 ~]# ntpq -p
remote refid st t when poll reach delay offset jitter
==============================================================================
202.118.1.130 .INIT. 16 u - 64 0 0.000 0.000 0.000
# ntpstat
unsynchronised
time server re-starting
polling server every 64 s
After it has connected and synchronized:
synchronised to NTP server (202.112.10.36) at stratum 3
time correct to within 275 ms
polling server every 256 s
Slave (datanode) node configuration
# yum install ntp
# chkconfig ntpd on
# vim /etc/ntp.conf
driftfile /var/lib/ntp/drift
restrict 127.0.0.1
restrict -6 ::1
# Point at the cluster's own time server, i.e. the master node (hadoop1)
server 192.168.88.11
restrict 192.168.88.11 nomodify notrap noquery
server 127.127.1.0 # local clock
fudge 127.127.1.0 stratum 10
includefile /etc/ntp/crypto/pw
keys /etc/ntp/keys
[root@hadoop2 soft]# ntpdate -u hadoop1
2. Cloudera installation (all nodes)
2.1 Download cloudera-manager.repo
wget http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo
Copy the cloudera-manager.repo file into /etc/yum.repos.d/ on every node:
mv cloudera-manager.repo /etc/yum.repos.d/
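A sketch for pushing the repo file from hadoop1 to the other nodes (assumes SSH access to hadoop2 and hadoop3; enter the password if passwordless login is not set up for that node):
for h in hadoop2 hadoop3; do scp /etc/yum.repos.d/cloudera-manager.repo root@$h:/etc/yum.repos.d/; done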
Increase the yum timeout so large package downloads do not time out:
vi /etc/yum.conf
timeout=50000
yum list|grep cloudera
If the listed packages are not the version you want to install, clean the yum cache and retry:
yum clean all
yum list | grep cloudera
2.2 Download CDH. Copy the three previously downloaded Parcel files into /opt/cloudera/parcel-repo (create the directory if it does not exist):
wget http://archive-primary.cloudera.com/cdh6/parcels/5.2.1/CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel
wget http://archive-primary.cloudera.com/cdh6/parcels/5.2.1/CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha1
# Rename the .sha1 file to .sha and keep only the hash value in its content (see step 8 below).
wget http://archive-primary.cloudera.com/cdh6/parcels/5.2.1/manifest.json
2.4 On the master node [hadoop1], install daemons, server, and agent (install daemons first)
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-server-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-agent-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el6.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-server-5.2.1-1.cm521.p0.109.el6.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-agent-5.2.1-1.cm521.p0.109.el6.x86_64.rpm (note: installing the agent requires Internet access)
2.5 On slave-1 [hadoop2] and slave-2 [hadoop3], install daemons and agent (install daemons first)
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
wget http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/cloudera-manager-agent-5.2.1-1.cm521.p0.109.el5.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-daemons-5.2.1-1.cm521.p0.109.el6.x86_64.rpm
yum --nogpgcheck localinstall cloudera-manager-agent-5.2.1-1.cm521.p0.109.el6.x86_64.rpm (note: installing the agent requires Internet access)
2.6 Install the Oracle JDK on master, slave-1, and slave-2
rpm -ivh jdk-6u31-linux-amd64.rpm
3. Install the MySQL database on the master node and set up the database options CDH needs
yum install mysql-server mysql mysql-devel
chkconfig mysqld on
service mysqld start
mysql -u root
use mysql;
update user set password=password('1234') where user='root';
update user set password=password('1234') where host='localhost';
update user set password=password('1234') where host='hadoop1';
service mysqld restart
mysql -u root -p1234
create database cloudera;
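If the CM server or CDH services will connect to MySQL from hosts other than the master, a hedged sketch for opening up access; this remote-access grant is an assumption, not part of the original steps, so adjust the user and password to your setup:
grant all privileges on *.* to 'root'@'%' identified by '1234' with grant option;
flush privileges;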
4. On the master node, configure the Cloudera Manager database and start the CM server and agent
1. Copy mysql-connector-java-5.1.7-bin.jar to /usr/share/java and rename it to mysql-connector-java.jar
2. Run /usr/share/cmf/schema/scm_prepare_database.sh -h hadoop1 mysql cloudera root 1234
3. Start the CM server: service cloudera-scm-server start
4. Enable the CM server at boot: chkconfig cloudera-scm-server on
5. Enable the CM agent at boot: chkconfig cloudera-scm-agent on
6. Start the CM agent: service cloudera-scm-agent start
5. Edit the agent configuration file on all nodes
In /etc/cloudera-scm-agent/config.ini, change server_host to the CM server's host (hadoop1 in this cluster).
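A non-interactive sketch for that edit (server_host is the key name in config.ini; the default value being replaced is assumed to be localhost; run it on every node):
sed -i 's/^server_host=.*/server_host=hadoop1/' /etc/cloudera-scm-agent/config.ini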
6. Configure the Cloudera Manager agent on the slave nodes
1. Enable the CM agent at boot: chkconfig cloudera-scm-agent on
2. Start the CM agent: service cloudera-scm-agent start
7. Verify that the agent and server can communicate
service cloudera-scm-server status
service cloudera-scm-agent status
netstat -anp | grep 7182
# The server listens on port 7182, which is used for communication with the agents
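A quick reachability check from an agent node (telnet is an assumption; any TCP client will do):
telnet hadoop1 7182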
If startup fails, check the logs:
server log: /var/log/cloudera-scm-server
agent log: /var/log/cloudera-scm-agent
8. Set up the parcel [master]
mv CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel /opt/cloudera/parcel-repo
[root@hadoop1 parcel-repo]# tail -5 manifest.json
"replaces": "IMPALA, SOLR, SPARK",
"hash": "7dcb31e557a7da951bfb6337e02b0b884aa3d2a2\n"
}
]
[root@hadoop1 parcel-repo]# tail -1 CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha1
7dcb31e557a7da951bfb6337e02b0b884aa3d2a2\n
[root@hadoop1 parcel-repo]# mv CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha1 CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha
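The .sha file must end up containing only the 40-character hash; the trailing \n seen in the output above would make the checksum comparison fail. A hedged sketch for trimming and double-checking it:
printf '%s' 7dcb31e557a7da951bfb6337e02b0b884aa3d2a2 > CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel.sha
sha1sum CDH-5.2.1-1.cdh6.2.1.p0.12-el5.parcel # the output should match the hash stored in the .sha file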
9. [root@hadoop1 soft]# rpm -ivh oracle-j2sdk1.7-1.7.0+update67-1.x86_64.rpm (all nodes)
CDH cluster installation
Once CM is installed successfully, open http://ip:7180 in a browser, where ip is the IP address or hostname of the host running CM. On the page that appears, enter admin for both the username and the password to reach the web management console.
Choose the free edition -> Continue -> search for and select the machines on which CDH should be installed (enter 192.168.88.[11-13]), then click Continue.
Part II: Uninstallation
This section records the uninstallation procedure and the problems encountered. The existing environment is Cloudera Manager plus a (1 + 2) CDH cluster (one master, two slaves).
1. First remove all services in the Cloudera Manager console.
2. Remove the Manager Server
Run on the Manager node:
/usr/share/cmf/uninstall-cloudera-manager.sh
If that script does not exist, remove things manually. First stop the services:
service cloudera-scm-server stop
service cloudera-scm-server-db stop
Then remove the packages:
sudo yum remove cloudera-manager-server
sudo yum remove cloudera-manager-server-db
3. Remove the CDH services on all CDH nodes. First stop the service:
service cloudera-scm-agent hard_stop
Uninstall the installed packages:
yum remove cloudera-manager-* hadoop hue-common bigtop-*
4. Remove leftover data:
rm -Rf /usr/share/cmf /var/lib/cloudera* /var/cache/yum/cloudera*
5. Kill all Manager and Hadoop processes (optional; not needed if you stopped Cloudera Manager and all services properly):
$ for u in hdfs mapred cloudera-scm hbase hue zookeeper oozie hive impala flume; do sudo kill $(ps -u $u -o pid=); done
6. Remove the Manager lock file
Run on the Manager node:
rm /tmp/.scm_prepare_node.lock
At this point the removal is complete.
Troubleshooting notes
If the installer fails, check /var/log/cloudera-manager-installer/3.install-cloudera-manager-server.log
The CM packages can also be downloaded directly from http://archive-primary.cloudera.com/cm5/redhat/5/x86_64/cm/5.2.1/RPMS/x86_64/
If the installer hangs while acquiring the installation lock, uninstall and reinstall.
Couldn't resolve host archive.cloudera.com: check DNS resolution (for example, use 8.8.8.8 as the nameserver).
Make sure each hostname matches its entry in /etc/hosts; if it does not, delete the host and search again.
If the wizard stays stuck at searching, uninstall and then reinstall.
[root@h02 soft]# service cloudera-scm-agent status
cloudera-scm-agent dead but pid file exists
[root@client ~]# cd /var/run
[root@client ~]# rm -f cloudera-scm-agent.pid
The following error was found in the agent log:
ERROR ENGINE Error in HTTP server: shutting down Traceback (most recent call last)
IOError: [Errno 2] No such file or directory: /var/lib/cloudera-scm-agent/uuid
[root@h02 cloudera-scm-agent]# mkdir /var/lib/cloudera-scm-agent/
[root@h02 cloudera-scm-agent]# chmod 777 /var/lib/cloudera-scm-agent/
Thanks for reading. That covers how to deploy a Hadoop environment; after working through this article you should have a much better feel for the process, though the details still need to be verified in your own environment.