
Hadoop in Practice 1 -- Deployment

Original post | Category: Hadoop | Author: shaozi74108 | Posted: 2019-02-17 19:57:31
Hadoop deployment
What "Hadoop" means:
Broad sense: the ecosystem built around the Apache Hadoop software (Hive, ZooKeeper, Spark, HBase, ...)
Narrow sense: the Apache Hadoop software itself
Related official sites:
hadoop.apache.org
hive.apache.org
spark.apache.org
cdh-hadoop:
Hadoop releases and versions:
1.x: no longer used by companies
2.x: the mainstream choice
3.x: no company dares run it in production yet
a. still full of pitfalls
b. Many companies deploy their big-data environment with CDH 5.x, which bundles the components of
the ecosystem into one system. The base environment ships hadoop 2.6.0-cdh5.7.0. Note that this is
not identical to Apache Hadoop 2.6.0, because CDH 5.7.0 carries additional bug fixes on top of it.
Hadoop components:
hdfs: storage -- the distributed file system
mapreduce: computation. Jobs can be written in Java, but companies rarely do (Java development is difficult and the code gets complex)
yarn: resource and job scheduling (allocating CPU and memory), i.e. deciding which node each job runs on
-- If you need to install ssh:
Ubuntu Linux:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
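The deployment below runs on CentOS rather than Ubuntu; assuming the stock CentOS repositories, the equivalent packages would be installed with yum:

```shell
# CentOS equivalent of the Ubuntu step above (run as root):
# openssh-server provides sshd, openssh-clients provides ssh/ssh-keygen/scp
yum install -y openssh-server openssh-clients rsync
```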
----------------------------------------------------------------------------------------------------
Installation:
Environment: CentOS, pseudo-distributed install, i.e. a single node
Hadoop version: hadoop-2.6.0-cdh5.7.0.tar.gz
JDK version: jdk-8u45-linux-x64.gz
Installation principle: each piece of software gets its own dedicated user
linux       root user
mysql     mysqladmin user
hadoop  hadoop user
1. Create the hadoop user and upload the Hadoop tarball
******************************
useradd hadoop
su - hadoop
mkdir app
cd app/
# upload the hadoop tarball here
Result:
[hadoop@hadoop app]$ pwd
/home/hadoop/app
[hadoop@hadoop app]$ ls -l
total 304288
drwxr-xr-x 15 hadoop hadoop      4096 Feb 14 23:37 hadoop-2.6.0-cdh5.7.0
-rw-r--r--  1 root   root   311585484 Feb 14 17:32 hadoop-2.6.0-cdh5.7.0.tar.gz
***********************************
2. Deploy the JDK (use the JDK build recommended for CDH)
***********************************
Create the JDK directories and upload the JDK tarball
su - root
mkdir /usr/java             # upload the JDK tarball to this directory
mkdir /usr/share/java   # when deploying CDH, the JDBC jar must go here, otherwise it errors out
cd   /usr/java
tar   -xzvf     jdk-8u45-linux-x64.gz  # unpack the JDK
drwxr-xr-x 8 uucp  143      4096 Apr 11  2015 jdk1.8.0_45   # note: the unpacked files have the wrong owner/group (uucp:143); change them to root:root
chown -R root:root jdk1.8.0_45
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.8.0_45
Result:
[root@hadoop java]# pwd
/usr/java
[root@hadoop java]# ll
total 169216
drwxr-xr-x 8 root root      4096 Apr 11  2015 jdk1.8.0_45
-rw-r--r-- 1 root root 173271626 Jan 26 18:35 jdk-8u45-linux-x64.gz
*****************************************
3. Set the Java environment variables
su - root
vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_45
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
source /etc/profile
[root@hadoop java]# which java
/usr/java/jdk1.8.0_45/bin/java
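To check edits like these without touching the real /etc/profile, you can write the exports to a scratch file, source it, and confirm PATH was extended (a sketch; the JDK path is the one installed above):

```shell
# Write the step-3 exports to a scratch file and source it,
# then check that the JDK bin directory landed on PATH.
cat > /tmp/java_profile.sh <<'EOF'
export JAVA_HOME=/usr/java/jdk1.8.0_45
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
EOF
. /tmp/java_profile.sh
case ":$PATH:" in
  *":/usr/java/jdk1.8.0_45/bin:"*) echo "PATH OK" ;;
  *)                               echo "PATH missing JDK bin" ;;
esac
```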
**********************
4. Unpack Hadoop
su - hadoop
cd  /home/hadoop/app
[hadoop@hadoop002 app]$ tar -xzvf hadoop-2.6.0-cdh5.7.0.tar.gz
[hadoop@hadoop002 app]$ cd hadoop-2.6.0-cdh5.7.0
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ll
total 76
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 bin  executable scripts
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 bin-mapreduce1
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 cloudera
drwxr-xr-x  6 hadoop hadoop  4096 Mar 24  2016 etc  configuration directory (conf)
drwxr-xr-x  5 hadoop hadoop  4096 Mar 24  2016 examples
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 examples-mapreduce1
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 include
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 lib  jar files
drwxr-xr-x  2 hadoop hadoop  4096 Mar 24  2016 libexec
drwxr-xr-x  3 hadoop hadoop  4096 Mar 24  2016 sbin  start/stop scripts for the hadoop daemons
drwxr-xr-x  4 hadoop hadoop  4096 Mar 24  2016 share
drwxr-xr-x 17 hadoop hadoop  4096 Mar 24  2016 src
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
*********************************************************
Next, configure Hadoop (the tarball was already unpacked in step 4):
su - hadoop
cd  /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop
vi  core-site.xml
<configuration>
  <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>
vi hdfs-site.xml
<configuration>
  <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
</configuration>
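For context: fs.defaultFS is the URI clients use to reach the NameNode, and dfs.replication=1 lowers the default 3-way block replication, which a single DataNode could never satisfy. If you want to double-check what you wrote, a quick grep works; the sketch below uses a scratch copy of core-site.xml rather than the real etc/hadoop file:

```shell
# Re-create core-site.xml in a scratch directory and pull out the
# configured value with grep -- a cheap sanity check before formatting.
mkdir -p /tmp/hadoop-conf
cat > /tmp/hadoop-conf/core-site.xml <<'EOF'
<configuration>
  <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>
EOF
grep -o '<value>[^<]*</value>' /tmp/hadoop-conf/core-site.xml
```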
Set Hadoop's environment variables in hadoop-env.sh, otherwise startup will fail:
vi /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop/hadoop-env.sh
export HADOOP_CONF_DIR=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/etc/hadoop
export JAVA_HOME=/usr/java/jdk1.8.0_45
*****************************
*****************************
5. Set up passwordless ssh trust for localhost
su - hadoop
ssh-keygen  # press Enter through every prompt
cd .ssh   # you should see two files: id_rsa and id_rsa.pub


cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys  # create the authorized_keys trust file
ssh localhost date
The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Wed Feb 13 22:41:17 CST 2019
chmod 600 authorized_keys   # critical: without this, `ssh localhost date` prompts for a password -- but the hadoop user has no password set, so the login can never succeed. The file permission is the culprit.
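For reference, the steps above can be collapsed into one non-interactive script (a sketch, assuming a fresh hadoop account with sshd running; `-N ''` gives an empty passphrase, and StrictHostKeyChecking=no replaces the interactive `yes`):

```shell
# The whole trust setup as one script. Run as the hadoop user;
# assumes no key pair exists yet at ~/.ssh/id_rsa.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh -o StrictHostKeyChecking=no localhost date   # should print the date with no password prompt
```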
**********************************
6. Format the NameNode
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs namenode -format
***************************************
cd /home/hadoop/app/hadoop-2.6.0-cdh5.7.0
bin/hdfs namenode -format  # note: `cd bin` followed by `hdfs namenode -format` reports "command not found" because bin/ is not on PATH; from inside bin/ you must run ./hdfs
***************************************
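The "command not found" note above is plain PATH resolution: the shell searches only the directories on $PATH, and the current directory is deliberately not among them, so inside bin/ you must type ./hdfs. A quick demonstration with a stand-in script (all paths below are hypothetical scratch paths, not the real Hadoop install):

```shell
# A scratch script standing in for bin/hdfs.
mkdir -p /tmp/demo-bin
printf '#!/bin/sh\necho hdfs-ran\n' > /tmp/demo-bin/hdfs
chmod +x /tmp/demo-bin/hdfs
cd /tmp/demo-bin
hdfs 2>/dev/null || echo "not found via PATH"   # bare name: fails unless some hdfs is already on PATH
./hdfs                                          # explicit path: runs
```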
7. Start the Hadoop services
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
19/02/13 22:47:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is b1:94:33:ec:95:89:bf:06:3b:ef:30:2f:d7:8e:d2:4c.
Are you sure you want to continue connecting (yes/no)? yes  # type yes: the ssh trust was configured for localhost, not for 0.0.0.0
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
19/02/13 22:49:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/stop-dfs.sh
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ sbin/start-dfs.sh
19/02/13 22:57:23 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop002.out
localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop002.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop002.out
19/02/13 22:57:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ jps  # verify startup: the three HDFS daemons below should all be running (jps also lists itself)
15059 Jps
14948 SecondaryNameNode  secondary name node (checkpointing helper)
14783 DataNode  data node (worker; stores the blocks)
14655 NameNode  name node (master; serves read/write metadata)
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$
Once everything is up you can open Hadoop's web management UI in a browser (figure omitted; in Hadoop 2.x the NameNode UI listens on port 50070 by default).


8. Add the hadoop commands to the PATH
***************************************************************
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ cat ~/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
export HADOOP_PREFIX=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$HADOOP_PREFIX/bin:$PATH
source ~/.bash_profile
echo $HADOOP_PREFIX   # verify
/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
***************************************************************
9. Working with Hadoop: the hdfs dfs commands closely mirror their Linux counterparts
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -ls /
19/02/13 23:08:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ls /
bin   dev  home  lib64       media  opt   root  sbin     srv  tmp  var
boot  etc  lib   lost+found  mnt    proc  run   selinux  sys  usr
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -mkdir /ruozedata
19/02/13 23:11:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ bin/hdfs dfs -ls /
19/02/13 23:11:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2019-02-13 23:11 /ruozedata
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$ ls /
bin   dev  home  lib64       media  opt   root  sbin     srv  tmp  var
boot  etc  lib   lost+found  mnt    proc  run   selinux  sys  usr
Note: /ruozedata exists only in the HDFS namespace -- `hdfs dfs -ls /` and the local `ls /` look at two different filesystems.
10. Getting help
[hadoop@hadoop002 hadoop-2.6.0-cdh5.7.0]$   bin/hdfs --help
Homework:
1. Read a blog post about ssh and take notes
2. Deploy pseudo-distributed HDFS
3. Write up the pseudo-distributed HDFS deployment in a blog post
Tip:
If `su - zookeeper` fails to switch to the zookeeper user:
Fix:
in /etc/passwd, change the zookeeper user's login shell from /sbin/nologin to /bin/bash
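The fix works because the login shell is the 7th colon-separated field of the user's /etc/passwd line. Demonstrated on a scratch copy (the uid/gid/home values below are made up):

```shell
# Scratch copy of a passwd entry; never edit the real /etc/passwd without a backup.
echo 'zookeeper:x:996:994:ZooKeeper:/var/lib/zookeeper:/sbin/nologin' > /tmp/passwd_demo
sed -i 's#/sbin/nologin#/bin/bash#' /tmp/passwd_demo
cut -d: -f7 /tmp/passwd_demo   # the login-shell field, now /bin/bash
```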


Source: ITPUB blog, http://blog.itpub.net/28339956/viewspace-2636207/ (please credit the source when republishing).
