The hostnames and IPs of the three machines are as follows:
andy1 192.168.224.144 master namenode
andy2 192.168.224.145 slave  datanode
andy3 192.168.224.143 slave  datanode
Edit the hosts file: [this lets hostnames stand in for IP addresses; if every command uses IP addresses directly, this configuration is unnecessary]
/etc/hosts on the namenode (andy1):
192.168.224.144 andy1   # Added by NetworkManager
127.0.0.1       localhost.localdomain localhost
::1             andy1 localhost6.localdomain6 localhost6
127.0.1.1       ubuntu
192.168.224.145 andy2   //
192.168.224.143 andy3   // these two lines are newly added; the point of this file is to map IP addresses to hostnames
[
192.168.224.144 andy1
192.168.224.145 andy2
192.168.224.143 andy3
]
On andy2 and andy3, add the corresponding IP-to-hostname entries for the other two hosts as well.
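A quick way to check the mapping from each host (my addition, not in the original notes):
$ ping -c 1 andy2    # should resolve to 192.168.224.145
$ ping -c 1 andy3    # should resolve to 192.168.224.143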
I. Install SSH
1 $ sudo apt-get install openssh-server
Then check whether the SSH server is running:
$ ps -e | grep ssh
If only ssh-agent appears, the server has not started yet; in that case run
$ /etc/init.d/ssh start
Seeing sshd in the output means the server is running.
2 First log in to every machine (including the namenode) as the andy user, create a .ssh directory under /home/andy/, and set its permissions to drwxr-xr-x with: chmod 755 .ssh
The commands are:
sudo mkdir /home/andy/.ssh
sudo chmod 755 /home/andy/.ssh
(if sudo leaves the directory owned by root, change it back with sudo chown andy:andy /home/andy/.ssh, otherwise ssh will ignore the keys placed in it)
On the namenode, run ssh-keygen -t rsa
The output is:
Generating public/private rsa key pair.
Enter file in which to save the key (/home/andy/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/andy/.ssh/id_rsa.
Your public key has been saved in /home/andy/.ssh/id_rsa.pub.
The key fingerprint is:
7a:a3:d2:43:d9:07:9c:93:d0:14:56:24:f9:e4:87:b7 andy@andy1
The key's randomart image is:
+--[ RSA 2048]----+
| o==o |
| ..o.. |
| o * . |
| * + o |
| oSo o . |
| o.. . E |
| o. o. |
| . oo . |
| ... |
+-----------------+
Just press Enter at the three prompts (key location and the two passphrase prompts).
[Could we consider generating a key pair on every machine?]
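One possible answer (a sketch of my own, not from the original notes): yes. Run ssh-keygen -t rsa on andy2 and andy3 as well, and after step 3 creates authorized_keys on andy1, append the other machines' public keys to it before copying it out:
$ ssh andy2 cat /home/andy/.ssh/id_rsa.pub >> /home/andy/.ssh/authorized_keys   # run on andy1; still prompts for andy2's password at this point
$ ssh andy3 cat /home/andy/.ssh/id_rsa.pub >> /home/andy/.ssh/authorized_keys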
3
Then copy the contents of id_rsa.pub to every machine (including the namenode itself).
On the namenode, run:
$ cp /home/andy/.ssh/id_rsa.pub /home/andy/.ssh/authorized_keys
$ scp /home/andy/.ssh/authorized_keys andy2:/home/andy/.ssh/
$ scp /home/andy/.ssh/authorized_keys andy3:/home/andy/.ssh/
[scp /home/andy/.ssh/authorized_keys 192.168.224.145:/home/andy/.ssh/]
Log in to every machine as andy and set the permissions of /home/andy/.ssh/authorized_keys.
The commands are:
$ cd /home/andy/.ssh
$ chmod 644 authorized_keys
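With the permissions set, passwordless login from the namenode can be verified (a quick check of my own):
$ ssh andy2 hostname    # should print andy2 without asking for a password
$ ssh andy3 hostname
$ ssh andy1 hostname    # the namenode also logs in to itself when starting the cluster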
II. Install the JDK
After downloading, copy the installer into /usr/java (the java directory was created earlier with mkdir /usr/java).
Do this as the root user.
[The commands below only worked after first cd-ing into /usr/java; run from elsewhere, the installed JDK files were nowhere to be found. The self-extracting .bin unpacks into the current working directory, which explains it.]
sudo chmod u+x /usr/java/jdk-6u25-linux-i586.bin
sudo /usr/java/jdk-6u25-linux-i586.bin
Append the following to the end of /etc/profile (this sets the JDK environment variables):
export JAVA_HOME=/usr/java/jdk1.6.0_25
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
source /etc/profile
This step matters; without it, trying to run java only produces:
The program 'java' can be found in the following packages:
 * gcj-4.4-jre-headless
 * gcj-4.5-jre-headless
 * openjdk-6-jre-headless
Try: apt-get install <selected package>
Run java -version to check that the JDK installed successfully.
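If everything is wired up, java -version should now report the Sun JDK rather than suggesting packages to install; roughly (exact build strings may differ):
$ java -version
java version "1.6.0_25"
Java(TM) SE Runtime Environment (build ...)
Java HotSpot(TM) Client VM (build ..., mixed mode)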
Install the JDK on andy2 and andy3 the same way; simply copying the installation over does not work, probably because of permission restrictions under /usr.
III. Install Hadoop; install it under the andy home directory.
Copy hadoop-0.20.2.tar.gz into the andy home directory.
Run: sudo tar zxvf hadoop-0.20.2.tar.gz // this unpacks hadoop
Then run: sudo mv hadoop-0.20.2 hadoop // this renames the extracted directory to hadoop
On the master, andy1, edit /home/andy/hadoop/conf/hadoop-env.sh and add:
export JAVA_HOME=/usr/java/jdk1.6.0_25
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export HADOOP_HEAPSIZE=200
export HADOOP_HOME=/home/andy/hadoop
On each machine, run bin/hadoop from the hadoop directory to verify it works.
The output is:
Usage: hadoop [--config confdir] COMMAND
where COMMAND is one of:
namenode -format format the DFS filesystem
secondarynamenode run the DFS secondary namenode
namenode run the DFS namenode
datanode run a DFS datanode
dfsadmin run a DFS admin client
mradmin run a Map-Reduce admin client
fsck run a DFS filesystem checking utility
fs run a generic filesystem user client
balancer run a cluster balancing utility
jobtracker run the MapReduce job Tracker node
pipes run a Pipes job
tasktracker run a MapReduce task Tracker node
job manipulate MapReduce jobs
queue get information regarding JobQueues
version print the version
jar <jar> run a jar file
distcp <srcurl> <desturl> copy file or directories recursively
archive -archiveName NAME <src>* <dest> create a hadoop archive
daemonlog get/set the log level for each daemon
or
CLASSNAME run the class named CLASSNAME
Most commands print help when invoked w/o parameters.
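As an extra sanity check (my addition, not part of the original output), the version command should confirm the release:
$ bin/hadoop version    # should report Hadoop 0.20.2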
Switch to the hadoop root directory: cd /home/andy/hadoop
mkdir tmp
mkdir hdfs
mkdir hdfs/name (do not actually create this one, or the later hadoop format step will not succeed!)
mkdir hdfs/data
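For reference (a summary of my own, matching the config files below), these directories map to the following Hadoop properties:
/home/andy/hadoop/tmp        -> hadoop.tmp.dir (core-site.xml)
/home/andy/hadoop/hdfs/name  -> dfs.name.dir   (hdfs-site.xml; created by the format step)
/home/andy/hadoop/hdfs/data  -> dfs.data.dir   (hdfs-site.xml)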
Switch to the conf directory:
Edit the following three files as shown:
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://andy1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/andy/hadoop/tmp</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/andy/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/andy/hadoop/hdfs/data</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>andy1:9001</value>
</property>
</configuration>
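Note that the 0.20.2 conf directory also contains masters and slaves files, which start-all.sh reads to decide where to launch the daemons; for this cluster they would presumably read:
conf/masters  (host for the secondary namenode):
andy1
conf/slaves   (hosts for datanode / tasktracker):
andy2
andy3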
For the other nodes, just copy the whole directory over:
scp -r /home/andy/hadoop andy2:/home/andy/hadoop
scp -r /home/andy/hadoop andy3:/home/andy/hadoop
Since all of Hadoop's settings live inside the hadoop directory, everything is in place once the copy finishes.
A quick test:
On andy1 (the master), run bin/start-all.sh
The result is:
starting namenode, logging to /home/andy/hadoop/logs/hadoop-andy-namenode-andy1.out
andy2: starting datanode, logging to /home/andy/hadoop/logs/hadoop-andy-datanode-andy2.out
andy3: starting datanode, logging to /home/andy/hadoop/logs/hadoop-andy-datanode-andy3.out
andy1: starting secondarynamenode, logging to /home/andy/hadoop/logs/hadoop-andy-secondarynamenode-andy1.out
starting jobtracker, logging to /home/andy/hadoop/logs/hadoop-andy-jobtracker-andy1.out
andy2: starting tasktracker, logging to /home/andy/hadoop/logs/hadoop-andy-tasktracker-andy2.out
andy3: starting tasktracker, logging to /home/andy/hadoop/logs/hadoop-andy-tasktracker-andy3.out
Testing
Format the distributed filesystem:
1 bin/hadoop namenode -format
2 Start hadoop
bin/start-all.sh
3 Check the running services with jps (see the example below)
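On andy1, jps should list NameNode, SecondaryNameNode, JobTracker (and Jps itself); on andy2 and andy3 it should list DataNode and TaskTracker. A rough illustration with made-up PIDs:
$ jps
2581 NameNode
2743 SecondaryNameNode
2810 JobTracker
2953 Jps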
Create an input folder in HDFS and copy all the files from conf into it.
The commands are:
bin/hadoop fs -mkdir input
bin/hadoop fs -copyFromLocal conf/* input
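To confirm the files landed in HDFS (a small check of my own):
$ bin/hadoop fs -ls input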