Hadoop Test Example

Category: Hadoop  Author: 陶卡  Posted: 2013-02-27 17:50:58

1. As the smart user, create a directory

mkdir -p /home/smart/hadoop_test/input

 

2. Create two text files

cd /home/smart/hadoop_test/input

echo "hello world" >test1.txt

echo "hello hadoop" >test2.txt

 

3. Upload the files into Hadoop (HDFS)

cd /usr/local/hadoop-0.20.2/

bin/hadoop dfs -put /home/smart/hadoop_test/input in

 

4. List the files now stored in Hadoop

bin/hadoop dfs -ls ./in/*

 

Output:

-rw-r--r--   2 smart supergroup         12 2013-02-23 20:06 /user/smart/in/test1.txt

-rw-r--r--   2 smart supergroup         13 2013-02-23 20:06 /user/smart/in/test2.txt
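The sizes 12 and 13 in the listing are the byte counts of the two files: echo appends a newline to each string. The sum, 25, is the same figure reported as HDFS_BYTES_READ in the job counters in step 5. A minimal local check, assuming a temporary scratch directory (a stand-in chosen for illustration, not the /home/smart/hadoop_test/input directory used above):

```shell
#!/bin/sh
# Recreate the two input files in a scratch directory (hypothetical location,
# used so the real input directory is left untouched).
workdir=$(mktemp -d)
echo "hello world" > "$workdir/test1.txt"
echo "hello hadoop" > "$workdir/test2.txt"

# wc -c counts bytes; reading from stdin keeps the file name out of the output.
size1=$(wc -c < "$workdir/test1.txt" | tr -d ' ')
size2=$(wc -c < "$workdir/test2.txt" | tr -d ' ')
total=$((size1 + size2))

echo "test1.txt: $size1 bytes"
echo "test2.txt: $size2 bytes"
echo "total: $total bytes"

rm -rf "$workdir"
```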

 

5. Run one of Hadoop's bundled example programs to test the MapReduce process (wordcount counts how many times each word occurs in the files)

bin/hadoop jar hadoop-0.20.2-examples.jar wordcount in out

 

Job output:

13/02/23 20:17:22 INFO input.FileInputFormat: Total input paths to process : 2

13/02/23 20:17:22 INFO mapred.JobClient: Running job: job_201302231949_0001

13/02/23 20:17:23 INFO mapred.JobClient:  map 0% reduce 0%

13/02/23 20:17:34 INFO mapred.JobClient:  map 50% reduce 0%

13/02/23 20:17:37 INFO mapred.JobClient:  map 100% reduce 0%

13/02/23 20:17:46 INFO mapred.JobClient:  map 100% reduce 100%

13/02/23 20:17:48 INFO mapred.JobClient: Job complete: job_201302231949_0001

13/02/23 20:17:48 INFO mapred.JobClient: Counters: 17

13/02/23 20:17:48 INFO mapred.JobClient:   Job Counters

13/02/23 20:17:48 INFO mapred.JobClient:     Launched reduce tasks=1

13/02/23 20:17:48 INFO mapred.JobClient:     Launched map tasks=2

13/02/23 20:17:48 INFO mapred.JobClient:     Data-local map tasks=2

13/02/23 20:17:48 INFO mapred.JobClient:   FileSystemCounters

13/02/23 20:17:48 INFO mapred.JobClient:     FILE_BYTES_READ=55

13/02/23 20:17:48 INFO mapred.JobClient:     HDFS_BYTES_READ=25

13/02/23 20:17:48 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=180

13/02/23 20:17:48 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25

13/02/23 20:17:48 INFO mapred.JobClient:   Map-Reduce Framework

13/02/23 20:17:48 INFO mapred.JobClient:     Reduce input groups=3

13/02/23 20:17:48 INFO mapred.JobClient:     Combine output records=4

13/02/23 20:17:48 INFO mapred.JobClient:     Map input records=2

13/02/23 20:17:48 INFO mapred.JobClient:     Reduce shuffle bytes=61

13/02/23 20:17:48 INFO mapred.JobClient:     Reduce output records=3

13/02/23 20:17:48 INFO mapred.JobClient:     Spilled Records=8

13/02/23 20:17:48 INFO mapred.JobClient:     Map output bytes=41

13/02/23 20:17:48 INFO mapred.JobClient:     Combine input records=4

13/02/23 20:17:48 INFO mapred.JobClient:     Map output records=4

13/02/23 20:17:48 INFO mapred.JobClient:     Reduce input records=4
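Some of these counters can be cross-checked by hand against the two input lines. The sketch below assumes the example splits words on whitespace and recreates the input inline rather than reading it from HDFS; the variable names are illustrative only.

```shell
#!/bin/sh
# The two input lines, recreated inline instead of being read from HDFS.
input=$(printf 'hello world\nhello hadoop\n')

# One record per input line -> compare with "Map input records".
lines=$(printf '%s\n' "$input" | wc -l | tr -d ' ')

# One (word, 1) pair per whitespace-separated token -> "Map output records".
words=$(printf '%s\n' "$input" | wc -w | tr -d ' ')

# One reduce group per distinct word -> "Reduce input groups".
distinct=$(printf '%s\n' "$input" | tr -s ' ' '\n' | sort -u | wc -l | tr -d ' ')

echo "lines=$lines words=$words distinct=$distinct"
```

These values line up with Map input records=2, Map output records=4 and Reduce input groups=3. Combine output records stays at 4, most likely because each of the two map tasks handles one file in which no word repeats, so the combiner has nothing to merge.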

 

6. Inspect the results of the test job

1) List the contents of the user's directory in Hadoop

bin/hadoop dfs -ls

 

Output:

drwxr-xr-x   - smart supergroup          0 2013-02-23 20:06 /user/smart/in

drwxr-xr-x   - smart supergroup          0 2013-02-23 20:17 /user/smart/out

 

2) List the contents of the out directory

bin/hadoop dfs -ls ./out

 

Output:

drwxr-xr-x   - smart supergroup          0 2013-02-23 20:17 /user/smart/out/_logs

-rw-r--r--   2 smart supergroup         25 2013-02-23 20:17 /user/smart/out/part-r-00000

 

3) View the output of the test program (the wordcount result)

bin/hadoop dfs -cat ./out/part-r-00000

 

Output:

hadoop  1

hello   2

world   1
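The same counts can be reproduced locally with standard Unix tools, which is a handy sanity check on the job output. This is only a rough stand-in for the MapReduce job, not how Hadoop runs it: tr plays the role of the map step (one word per line), sort stands in for the shuffle, and uniq -c for the reduce. It assumes each word and its count are separated by a single tab, which is consistent with the 25-byte size of part-r-00000.

```shell
#!/bin/sh
# Local stand-in for the wordcount job: tokenize (map), sort (shuffle),
# count duplicates (reduce). Output is word<TAB>count, one pair per line.
result=$(printf 'hello world\nhello hadoop\n' \
  | tr -s ' ' '\n' \
  | LC_ALL=C sort \
  | uniq -c \
  | awk '{ printf "%s\t%s\n", $2, $1 }')

printf '%s\n' "$result"

# Byte size of that output, for comparison with the size of part-r-00000.
bytes=$(printf '%s\n' "$result" | wc -c | tr -d ' ')
echo "bytes=$bytes"
```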


Source: ITPUB博客, link: http://blog.itpub.net/23150399/viewspace-1120017/. If you repost this article, please credit the source; otherwise legal action may be pursued.
