Integration testing of Hadoop 2.7 with Ceph 10.2 S3 object storage

Date: 2024-02-22 09:08:13
The company requested an integration test of Hadoop 2.7 against Ceph 10.2 S3 object storage. Hadoop's S3 support is described in the [official documentation](http://hadoop.apache.org/docs/current/hadoop-aws/tools/hadoop-aws/index.html). After consulting the relevant material, the integration test was completed; the full environment deployment and test procedure is written up below.

**Hadoop environment**

Two hosts, named master and slave: master serves as the Hadoop NameNode and slave as a DataNode. For the Hadoop cluster deployment procedure see [hadoop cluster deployment](http://www.178pt.com/156.html); for the Ceph 10.2 radosgw configuration procedure see [radosgw configuration](http://www.178pt.com/250.html).

**Integrating Hadoop with S3**

On the master (NameNode) node, edit core-site.xml and add the properties below (set the endpoint and keys to match your environment):

```xml
<!-- Put site-specific property overrides in this file. -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <!-- begin: properties added for S3 integration -->
    <property>
        <name>fs.s3a.access.key</name>
        <value>YZ8H5J5B4BS4HGJ6U8YC</value>
        <description>AWS access key ID. Omit for Role-based authentication.</description>
    </property>
    <property>
        <name>fs.s3a.secret.key</name>
        <value>KzPrV6ytwoZoQCMHzbnXXMQKrjH5MLnD3Wsb0AjJ</value>
        <description>AWS secret key.</description>
    </property>
    <property>
        <name>fs.s3a.endpoint</name>
        <value>192.168.1.31:7480</value>
        <description>AWS S3 endpoint to connect to. An up-to-date list is provided
        in the AWS Documentation: regions and endpoints. Without this property,
        the standard region (s3.amazonaws.com) is assumed.</description>
    </property>
    <property>
        <name>fs.s3a.connection.ssl.enabled</name>
        <value>false</value>
        <description>Enables or disables SSL connections to S3.</description>
    </property>
    <!-- end: properties added for S3 integration -->
</configuration>
```

On both Hadoop nodes (master and slave), copy the S3-related jar files into the common classpath; otherwise s3a access fails because the S3A filesystem classes cannot be found:

```bash
[root@master etc]# pwd
/usr/hadoop/hadoop-2.7.3/etc
[root@master etc]# cp hadoop/share/hadoop/tools/lib/hadoop-aws-2.7.3.jar hadoop/share/hadoop/common/lib/
[root@master etc]# cp hadoop/share/hadoop/tools/lib/aws-java-sdk-1.7.4.jar hadoop/share/hadoop/common/lib/
[root@master etc]# cp hadoop/share/hadoop/tools/lib/joda-time-2.9.4.jar hadoop/share/hadoop/common/lib/
[root@master etc]# cp hadoop/share/hadoop/tools/lib/jackson-*.jar hadoop/share/hadoop/common/lib/
```

Restart Hadoop:

```bash
[root@master etc]# stop-all.sh
[root@master etc]# start-all.sh
```

**Testing the S3 integration**

On a Ceph node, create a bucket named hadoop and upload a file:

```bash
[root@radosgw1 ~]# s3cmd mb s3://hadoop
Bucket 's3://hadoop/' created
[root@radosgw1 ~]# s3cmd put abc s3://hadoop
upload: 'abc' -> 's3://hadoop/abc'  [1 of 1]
 1109 of 1109   100% in    1s  1096.74 B/s  done
[root@radosgw1 ~]# s3cmd ls s3://hadoop
2018-04-25 08:47      1109   s3://hadoop/abc
```

View it from the Hadoop master node:

```bash
[root@master ~]# hadoop fs -ls s3a://hadoop/
Found 1 items
-rw-rw-rw-   1       1109 2018-04-25 16:47 s3a://hadoop/abc
```

**1. Upload a local file from the Hadoop client to object storage**

```bash
[root@master ~]# ls ceshi.txt
ceshi.txt
[root@master ~]# hadoop fs -put ceshi.txt s3a://hadoop/
[root@master ~]# hadoop fs -ls s3a://hadoop/
Found 2 items
-rw-rw-rw-   1       1109 2018-04-25 16:47 s3a://hadoop/abc
-rw-rw-rw-   1       1083 2018-04-25 16:52 s3a://hadoop/ceshi.txt
[root@master ~]#
```

Check on the cluster side:

```bash
[root@radosgw1 ~]# s3cmd ls s3://hadoop
2018-04-25 08:47      1109   s3://hadoop/abc
2018-04-25 08:52      1083   s3://hadoop/ceshi.txt
```

**2. Download a file from object storage to the local filesystem**

```bash
[root@master ~]# rm -f ceshi.txt
[root@master ~]# ls ceshi.txt
ls: cannot access ceshi.txt: No such file or directory
[root@master ~]# hadoop fs -get s3a://hadoop/ceshi.txt
[root@master ~]# ls ceshi.txt
ceshi.txt
[root@master ~]#
```
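A quick way to confirm the upload/download round trip preserved the file is to compare checksums. This step was not part of the original test run; a minimal sketch, assuming `md5sum` is available on the client and the bucket contents are as listed above:

```bash
# Hypothetical integrity check: download a second copy of the object
# and compare its digest with the local file's.
md5sum ceshi.txt
hadoop fs -get s3a://hadoop/ceshi.txt /tmp/ceshi.from-s3.txt
md5sum /tmp/ceshi.from-s3.txt   # both commands should print the same digest
```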
**3. Copy a file from object storage to HDFS**

```bash
[root@master ~]# hdfs dfs -ls /
Found 4 items
drwxr-xr-x   - root supergroup          0 2018-04-25 15:21 /haha
drwxr-xr-x   - root supergroup          0 2018-04-25 12:10 /input
drwxr-xr-x   - root supergroup          0 2018-04-25 12:11 /output
drwx------   - root supergroup          0 2018-04-25 12:11 /tmp
[root@master ~]# hdfs dfs -ls /ceshi.txt
ls: `/ceshi.txt': No such file or directory
[root@master ~]# hadoop distcp s3a://hadoop/ceshi.txt /ceshi.txt
18/04/25 17:00:10 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[s3a://hadoop/ceshi.txt], targetPath=/ceshi.txt, targetPathExists=false, preserveRawXattrs=false}
18/04/25 17:00:10 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:00:30 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
18/04/25 17:00:30 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
18/04/25 17:00:31 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:00:31 INFO mapreduce.JobSubmitter: number of splits:1
18/04/25 17:00:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_0009
18/04/25 17:00:32 INFO impl.YarnClientImpl: Submitted application application_1524633996089_0009
18/04/25 17:00:32 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0009/
18/04/25 17:00:32 INFO tools.DistCp: DistCp job-id: job_1524633996089_0009
18/04/25 17:00:32 INFO mapreduce.Job: Running job: job_1524633996089_0009
18/04/25 17:00:40 INFO mapreduce.Job: Job job_1524633996089_0009 running in uber mode : false
18/04/25 17:00:40 INFO mapreduce.Job:  map 0% reduce 0%
18/04/25 17:00:52 INFO mapreduce.Job:  map 100% reduce 0%
18/04/25 17:01:05 INFO mapreduce.Job: Job job_1524633996089_0009 completed successfully
18/04/25 17:01:05 INFO mapreduce.Job: Counters: 38
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=121596
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=330
        HDFS: Number of bytes written=1083
        HDFS: Number of read operations=14
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4
        S3A: Number of bytes read=1083
        S3A: Number of bytes written=0
        S3A: Number of read operations=3
        S3A: Number of large read operations=0
        S3A: Number of write operations=0
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=20780
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=20780
        Total vcore-milliseconds taken by all map tasks=20780
        Total megabyte-milliseconds taken by all map tasks=21278720
    Map-Reduce Framework
        Map input records=1
        Map output records=0
        Input split bytes=135
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=154
        CPU time spent (ms)=1350
        Physical memory (bytes) snapshot=113676288
        Virtual memory (bytes) snapshot=862224384
        Total committed heap usage (bytes)=29032448
    File Input Format Counters
        Bytes Read=195
    File Output Format Counters
        Bytes Written=0
    org.apache.hadoop.tools.mapred.CopyMapper$Counter
        BYTESCOPIED=1083
        BYTESEXPECTED=1083
        COPY=1
[root@master ~]# hdfs dfs -ls /ceshi.txt
-rw-r--r--   1 root supergroup       1083 2018-04-25 17:00 /ceshi.txt
[root@master ~]#
```
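Unlike `hadoop fs -put`/`-get`, `hadoop distcp` runs as a MapReduce job, so it parallelizes across the cluster and suits large directory trees. For repeated synchronization runs, DistCp's `-update` and `-overwrite` flags control what gets re-copied; a sketch (these invocations were not part of the original test, and the `/s3-mirror` target path is hypothetical):

```bash
# Copy only files missing or changed on the target (hypothetical /s3-mirror path).
hadoop distcp -update s3a://hadoop/ /s3-mirror/

# Unconditionally re-copy, replacing whatever exists on the target.
hadoop distcp -overwrite s3a://hadoop/ceshi.txt /ceshi.txt
```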
**4. Copy a file from HDFS to S3 object storage**

List all files currently in the bucket:

```bash
[root@radosgw1 ~]# s3cmd ls s3://hadoop
2018-04-25 08:47      1109   s3://hadoop/abc
2018-04-25 08:52      1083   s3://hadoop/ceshi.txt
[root@radosgw1 ~]#
```

Now copy the anaconda-ks.cfg file from the HDFS directory /haha into the bucket:

```bash
[root@master ~]# hdfs dfs -ls /haha
Found 1 items
-rw-r--r--   1 root supergroup       1083 2018-04-25 15:21 /haha/anaconda-ks.cfg
[root@master ~]# hadoop distcp /haha/anaconda-ks.cfg s3a://hadoop/
18/04/25 17:06:18 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[/haha/anaconda-ks.cfg], targetPath=s3a://hadoop/, targetPathExists=true, preserveRawXattrs=false}
18/04/25 17:06:18 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:06:24 INFO Configuration.deprecation: io.sort.mb is deprecated. Instead, use mapreduce.task.io.sort.mb
18/04/25 17:06:24 INFO Configuration.deprecation: io.sort.factor is deprecated. Instead, use mapreduce.task.io.sort.factor
18/04/25 17:06:25 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:06:26 INFO mapreduce.JobSubmitter: number of splits:1
18/04/25 17:06:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_0010
18/04/25 17:06:26 INFO impl.YarnClientImpl: Submitted application application_1524633996089_0010
18/04/25 17:06:26 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0010/
18/04/25 17:06:26 INFO tools.DistCp: DistCp job-id: job_1524633996089_0010
18/04/25 17:06:26 INFO mapreduce.Job: Running job: job_1524633996089_0010
18/04/25 17:06:35 INFO mapreduce.Job: Job job_1524633996089_0010 running in uber mode : false
18/04/25 17:06:35 INFO mapreduce.Job:  map 0% reduce 0%
18/04/25 17:06:57 INFO mapreduce.Job:  map 100% reduce 0%
18/04/25 17:08:14 INFO mapreduce.Job: Job job_1524633996089_0010 completed successfully
18/04/25 17:08:14 INFO mapreduce.Job: Counters: 38
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=121562
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1459
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=10
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
        S3A: Number of bytes read=0
        S3A: Number of bytes written=1083
        S3A: Number of read operations=11
        S3A: Number of large read operations=0
        S3A: Number of write operations=3
    Job Counters
        Launched map tasks=1
        Other local map tasks=1
        Total time spent by all maps in occupied slots (ms)=86489
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=86489
        Total vcore-milliseconds taken by all map tasks=86489
        Total megabyte-milliseconds taken by all map tasks=88564736
    Map-Reduce Framework
        Map input records=1
        Map output records=0
        Input split bytes=134
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=151
        CPU time spent (ms)=1760
        Physical memory (bytes) snapshot=116514816
        Virtual memory (bytes) snapshot=863125504
        Total committed heap usage (bytes)=29032448
    File Input Format Counters
        Bytes Read=242
    File Output Format Counters
        Bytes Written=0
    org.apache.hadoop.tools.mapred.CopyMapper$Counter
        BYTESCOPIED=1083
        BYTESEXPECTED=1083
        COPY=1
[root@master ~]#
```

Verify on the S3 cluster side:

```bash
[root@radosgw1 ~]# s3cmd ls s3://hadoop
2018-04-25 08:47      1109   s3://hadoop/abc
2018-04-25 09:08      1083   s3://hadoop/anaconda-ks.cfg
2018-04-25 08:52      1083   s3://hadoop/ceshi.txt
[root@radosgw1 ~]#
```
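For a single small file like this, launching a MapReduce job is heavyweight; `hadoop fs -cp` performs the same copy in a single client process. A minimal equivalent, assuming the same paths as above:

```bash
# Copy one HDFS file into the bucket without submitting a YARN job.
hadoop fs -cp /haha/anaconda-ks.cfg s3a://hadoop/
```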
**5. Use a file in object storage as MapReduce input and write the result to HDFS**

Use the object /hadoop/abc as the wordcount input and write the result to the HDFS directory /result:

```bash
[root@master ~]# hadoop fs -ls s3a://hadoop/
Found 3 items
-rw-rw-rw-   1       1109 2018-04-25 16:47 s3a://hadoop/abc
-rw-rw-rw-   1       1083 2018-04-25 17:08 s3a://hadoop/anaconda-ks.cfg
-rw-rw-rw-   1       1083 2018-04-25 16:52 s3a://hadoop/ceshi.txt
[root@master ~]# hdfs dfs -ls /result
ls: `/result': No such file or directory
[root@master ~]#
```

HDFS does not yet have a /result directory; now run the job:

```bash
[root@master ~]# hadoop jar /usr/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount s3a://hadoop/abc /result
18/04/25 17:19:53 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:19:55 INFO input.FileInputFormat: Total input paths to process : 1
18/04/25 17:19:56 INFO mapreduce.JobSubmitter: number of splits:1
18/04/25 17:19:56 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_0011
18/04/25 17:19:57 INFO impl.YarnClientImpl: Submitted application application_1524633996089_0011
18/04/25 17:19:57 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0011/
18/04/25 17:19:57 INFO mapreduce.Job: Running job: job_1524633996089_0011
18/04/25 17:20:06 INFO mapreduce.Job: Job job_1524633996089_0011 running in uber mode : false
18/04/25 17:20:06 INFO mapreduce.Job:  map 0% reduce 0%
18/04/25 17:20:23 INFO mapreduce.Job:  map 100% reduce 0%
18/04/25 17:20:31 INFO mapreduce.Job:  map 100% reduce 100%
18/04/25 17:20:32 INFO mapreduce.Job: Job job_1524633996089_0011 completed successfully
18/04/25 17:20:32 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=1442
        FILE: Number of bytes written=240937
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=81
        HDFS: Number of bytes written=1121
        HDFS: Number of read operations=5
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
        S3A: Number of bytes read=1109
        S3A: Number of bytes written=0
        S3A: Number of read operations=1
        S3A: Number of large read operations=0
        S3A: Number of write operations=0
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=14541
        Total time spent by all reduces in occupied slots (ms)=5450
        Total time spent by all map tasks (ms)=14541
        Total time spent by all reduce tasks (ms)=5450
        Total vcore-milliseconds taken by all map tasks=14541
        Total vcore-milliseconds taken by all reduce tasks=5450
        Total megabyte-milliseconds taken by all map tasks=14889984
        Total megabyte-milliseconds taken by all reduce tasks=5580800
    Map-Reduce Framework
        Map input records=43
        Map output records=104
        Map output bytes=1517
        Map output materialized bytes=1442
        Input split bytes=81
        Combine input records=104
        Combine output records=79
        Reduce input groups=79
        Reduce shuffle bytes=1442
        Reduce input records=79
        Reduce output records=79
        Spilled Records=158
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=230
        CPU time spent (ms)=2230
        Physical memory (bytes) snapshot=324866048
        Virtual memory (bytes) snapshot=1723260928
        Total committed heap usage (bytes)=162926592
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1109
    File Output Format Counters
        Bytes Written=1121
[root@master ~]#
```

After the job completes, the /result directory exists in HDFS and contains both the result file and the _SUCCESS marker:

```bash
[root@master ~]# hdfs dfs -ls /result
Found 2 items
-rw-r--r--   1 root supergroup          0 2018-04-25 17:20 /result/_SUCCESS
-rw-r--r--   1 root supergroup       1121 2018-04-25 17:20 /result/part-r-00000
[root@master ~]#
```
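To inspect the actual word counts rather than just the listing, the result file can be streamed straight from HDFS; for example:

```bash
# Show the first few "word<TAB>count" lines of the wordcount output.
hadoop fs -cat /result/part-r-00000 | head
```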
**6. Use a file in object storage as MapReduce input and write the result back to an object storage bucket**

First check that no result directory exists yet under the hadoop bucket; the job output will be written there:

```bash
[root@master ~]# hadoop fs -ls s3a://hadoop/result
ls: `s3a://hadoop/result': No such file or directory
[root@master ~]# hadoop jar /usr/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount s3a://hadoop/abc s3a://hadoop/result
18/04/25 17:25:27 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:25:39 INFO input.FileInputFormat: Total input paths to process : 1
18/04/25 17:25:40 INFO mapreduce.JobSubmitter: number of splits:1
18/04/25 17:25:41 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_0012
18/04/25 17:25:41 INFO impl.YarnClientImpl: Submitted application application_1524633996089_0012
18/04/25 17:25:41 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0012/
18/04/25 17:25:41 INFO mapreduce.Job: Running job: job_1524633996089_0012
18/04/25 17:25:53 INFO mapreduce.Job: Job job_1524633996089_0012 running in uber mode : false
18/04/25 17:25:53 INFO mapreduce.Job:  map 0% reduce 0%
18/04/25 17:26:57 INFO mapreduce.Job:  map 100% reduce 0%
18/04/25 17:27:18 INFO mapreduce.Job:  map 100% reduce 67%
18/04/25 17:27:27 INFO mapreduce.Job:  map 100% reduce 100%
18/04/25 17:32:44 INFO mapreduce.Job: Job job_1524633996089_0012 completed successfully
18/04/25 17:32:44 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=1442
        FILE: Number of bytes written=240925
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=81
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=1
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
        S3A: Number of bytes read=1109
        S3A: Number of bytes written=1121
        S3A: Number of read operations=19
        S3A: Number of large read operations=0
        S3A: Number of write operations=5
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Rack-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=22928
        Total time spent by all reduces in occupied slots (ms)=198775
        Total time spent by all map tasks (ms)=22928
        Total time spent by all reduce tasks (ms)=198775
        Total vcore-milliseconds taken by all map tasks=22928
        Total vcore-milliseconds taken by all reduce tasks=198775
        Total megabyte-milliseconds taken by all map tasks=23478272
        Total megabyte-milliseconds taken by all reduce tasks=203545600
    Map-Reduce Framework
        Map input records=43
        Map output records=104
        Map output bytes=1517
        Map output materialized bytes=1442
        Input split bytes=81
        Combine input records=104
        Combine output records=79
        Reduce input groups=79
        Reduce shuffle bytes=1442
        Reduce input records=79
        Reduce output records=79
        Spilled Records=158
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=256
        CPU time spent (ms)=1550
        Physical memory (bytes) snapshot=336670720
        Virtual memory (bytes) snapshot=1724592128
        Total committed heap usage (bytes)=162926592
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1109
    File Output Format Counters
        Bytes Written=1121
[root@master ~]#
```

After the job succeeds, verify the output on both sides. On the Hadoop node:

```bash
[root@master ~]# hadoop fs -ls s3a://hadoop/result
Found 2 items
-rw-rw-rw-   1          0 2018-04-25 17:33 s3a://hadoop/result/_SUCCESS
-rw-rw-rw-   1       1121 2018-04-25 17:32 s3a://hadoop/result/part-r-00000
[root@master ~]#
```

On the Ceph cluster node:

```bash
[root@radosgw1 ~]# s3cmd ls s3://hadoop/result/
2018-04-25 09:33         0   s3://hadoop/result/_SUCCESS
2018-04-25 09:32      1121   s3://hadoop/result/part-r-00000
```
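Note that MapReduce refuses to start a job whose output directory already exists, and this applies to `s3a://` targets as well; re-running the same wordcount requires removing the old result first. A minimal sketch:

```bash
# Delete the previous job output from the bucket before re-running wordcount.
hadoop fs -rm -r s3a://hadoop/result
```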
**7. Use a file in HDFS as MapReduce input and write the result to object storage**

Use ceshi.txt in HDFS as the input and write the result to the hadoop/output directory in object storage. Pre-check:

```bash
[root@master ~]# hdfs dfs -ls /
Found 6 items
-rw-r--r--   1 root supergroup       1083 2018-04-25 17:00 /ceshi.txt
[root@master ~]# hadoop fs -ls s3a://hadoop/output
ls: `s3a://hadoop/output': No such file or directory
```

Run the job:

```bash
[root@master ~]# hadoop jar /usr/hadoop/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /ceshi.txt s3a://hadoop/output
18/04/25 17:39:55 INFO client.RMProxy: Connecting to ResourceManager at master/192.168.1.20:8032
18/04/25 17:40:04 INFO input.FileInputFormat: Total input paths to process : 1
18/04/25 17:40:05 INFO mapreduce.JobSubmitter: number of splits:1
18/04/25 17:40:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1524633996089_0013
18/04/25 17:40:06 INFO impl.YarnClientImpl: Submitted application application_1524633996089_0013
18/04/25 17:40:06 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1524633996089_0013/
18/04/25 17:40:06 INFO mapreduce.Job: Running job: job_1524633996089_0013
18/04/25 17:40:19 INFO mapreduce.Job: Job job_1524633996089_0013 running in uber mode : false
18/04/25 17:40:19 INFO mapreduce.Job:  map 0% reduce 0%
18/04/25 17:41:16 INFO mapreduce.Job:  map 100% reduce 0%
18/04/25 17:41:36 INFO mapreduce.Job:  map 100% reduce 67%
18/04/25 17:41:45 INFO mapreduce.Job:  map 100% reduce 100%
18/04/25 17:46:38 INFO mapreduce.Job: Job job_1524633996089_0013 completed successfully
18/04/25 17:46:38 INFO mapreduce.Job: Counters: 54
    File System Counters
        FILE: Number of bytes read=1404
        FILE: Number of bytes written=240873
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1176
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=2
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
        S3A: Number of bytes read=0
        S3A: Number of bytes written=1091
        S3A: Number of read operations=18
        S3A: Number of large read operations=0
        S3A: Number of write operations=5
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=17302
        Total time spent by all reduces in occupied slots (ms)=173487
        Total time spent by all map tasks (ms)=17302
        Total time spent by all reduce tasks (ms)=173487
        Total vcore-milliseconds taken by all map tasks=17302
        Total vcore-milliseconds taken by all reduce tasks=173487
        Total megabyte-milliseconds taken by all map tasks=17717248
        Total megabyte-milliseconds taken by all reduce tasks=177650688
    Map-Reduce Framework
        Map input records=41
        Map output records=102
        Map output bytes=1483
        Map output materialized bytes=1404
        Input split bytes=93
        Combine input records=102
        Combine output records=77
        Reduce input groups=77
        Reduce shuffle bytes=1404
        Reduce input records=77
        Reduce output records=77
        Spilled Records=154
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=261
        CPU time spent (ms)=1570
        Physical memory (bytes) snapshot=325062656
        Virtual memory (bytes) snapshot=1724448768
        Total committed heap usage (bytes)=162926592
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=1083
    File Output Format Counters
        Bytes Written=1091
```

Verify on both the Hadoop node and the cluster node:

```bash
[root@master ~]# hadoop fs -ls s3a://hadoop/output
Found 2 items
-rw-rw-rw-   1          0 2018-04-25 17:47 s3a://hadoop/output/_SUCCESS
-rw-rw-rw-   1       1091 2018-04-25 17:46 s3a://hadoop/output/part-r-00000
[root@master ~]#
```

```bash
[root@radosgw1 ~]# s3cmd ls s3://hadoop/output/
2018-04-25 09:47         0   s3://hadoop/output/_SUCCESS
2018-04-25 09:46      1091   s3://hadoop/output/part-r-00000
[root@radosgw1 ~]#
```
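As a final end-to-end consistency check: tests 5 and 6 ran wordcount over the same input (abc), so the result file in HDFS and the one in the bucket should be byte-identical. A sketch using bash process substitution (not part of the original run):

```bash
# Compare the HDFS result from test 5 with the S3 result from test 6.
diff <(hadoop fs -cat /result/part-r-00000) \
     <(hadoop fs -cat s3a://hadoop/result/part-r-00000) && echo "outputs identical"
```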
The output files are visible from both the cluster side and the Hadoop node. This concludes the configuration and testing.