How do I run the hadoop distcp -f command correctly?

Time: 2022-08-28 20:47:54

I want to back up some folders and files on my Hadoop cluster. I ran this command:

hadoop distcp -p -update -f hdfs://cluster1:8020/srclist hdfs://cluster2:8020/hdpBackup/
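
For reference, these are standard DistCp options (as documented in the Hadoop DistCp guide); an annotated repeat of the command, with the comments being mine:

   # -p       preserve file attributes of the copied files
   # -update  copy only files that are missing or differ at the target
   # -f       read the source paths from the listed file
   hadoop distcp -p -update -f hdfs://cluster1:8020/srclist hdfs://cluster2:8020/hdpBackup/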

My srclist file:

hdfs://cluster1:8020/user/user1/folder1
hdfs://cluster1:8020/user/user1/folder2
hdfs://cluster1:8020/user/user1/file1

folder1 contains two files: part-00000 and part-00001

folder2 contains two files: file and file_old

That command works, but it flattens the contents of all the folders into the destination.

Result:

--hdpBackup
  - part-00000
  - part-00001
  - file1
  - file
  - file_old

But I want to get this result:

--hdpBackup
  - folder1
  - folder2
  - file1

I cannot use hdfs://cluster1:8020/user/user1/* because user1 contains many other folders and files.

How can I solve this problem?
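
For context, the Hadoop DistCp documentation notes that -update and -overwrite copy the contents of source directories rather than the directories themselves, which matches the flattening described above. A minimal sketch of the same -f copy with -update dropped, which should then keep the top-level names (verify against your Hadoop version):

   # Without -update, each listed source directory is created under the target
   # instead of being expanded into it (sketch only)
   hadoop distcp -p -f hdfs://cluster1:8020/srclist hdfs://cluster2:8020/hdpBackup/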

1 Answer

#1

Use the shell script below:

 #!/bin/sh

 # Read each source path (one per line) from the local srclist file
 for line in $(awk '{print $1}' /home/Desktop/distcp/srclist); do
     # The last path component (folder or file name) becomes the destination name
     line1=$(echo "$line" | awk -F'/' '{print $NF}')

     echo "$line -> $line1 (source -> destination name)"

     # Copy each source into its own entry under the backup directory,
     # so the top-level folder and file names are preserved
     hadoop distcp "$line" "hdfs://10.20.53.157/user/root/backup1/$line1"
 done
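
Because the loop runs one distcp per source, each entry keeps its top-level name under the backup directory. A quick check, as a sketch, using backup1 from the script above and the question's folder1, folder2 and file1 as example sources:

   # List the backup directory after the loop finishes
   hdfs dfs -ls hdfs://10.20.53.157/user/root/backup1/
   # expected entries: .../backup1/folder1, .../backup1/folder2, .../backup1/file1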

The srclist file needs to be on the local file system and contain paths like:

   hdfs://10.20.53.157/user/root/Wholefileexaple_1
   hdfs://10.20.53.157/user/root/Wholefileexaple_2
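
Note that this list lives on the local file system, unlike the -f argument in the question, which pointed at a file stored on HDFS. A hypothetical way to create it with the question's paths (adjust the paths to your own sources):

   # Write the local srclist, one HDFS path per line
   printf '%s\n' \
       hdfs://cluster1:8020/user/user1/folder1 \
       hdfs://cluster1:8020/user/user1/folder2 \
       hdfs://cluster1:8020/user/user1/file1 > /home/Desktop/distcp/srclist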
