Hadoop streaming fails with java.io.FileNotFoundException

Time: 2022-06-11 08:36:22

I have written a map-only Python MapReduce job that reads data from standard input and processes it to produce some output. It works fine when executed locally. However, when I try to run it with Hadoop, I get a file-not-found exception: it is not able to locate the mapper.py file. Here is the command I use to run the script:

hadoop jar hadoop-streaming-1.1.1.jar -D mapred.reduce.tasks=0 -file "$PWD/mapper.py" -mapper "$PWD/mapper.py" -input "relevance/test.txt" -output "relevance/test_output_8.txt"

The file test.txt has been copied to HDFS as well.
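
For context, the original mapper.py is not shown in the question. A minimal sketch of the per-line logic of a map-only streaming mapper, assuming a placeholder transformation (the names `map_line` and `main` and the token-count output are hypothetical), might look like this:

```python
#!/usr/bin/env python
"""Hypothetical map-only mapper; the real mapper.py from the question is not shown."""
import sys


def map_line(line):
    """Turn one input line into one tab-separated output record.

    Placeholder logic (an assumption): emit the line followed by its
    number of whitespace-separated tokens.
    """
    text = line.rstrip("\n")
    return "%s\t%d" % (text, len(text.split()))


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read records from stdin and write one mapped record per line to stdout."""
    for line in stdin:
        stdout.write(map_line(line) + "\n")
```

Saved as mapper.py, the script would end with a call to `main()` under the usual `if __name__ == "__main__":` guard, and it could be exercised locally with something like `cat test.txt | python mapper.py`.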

Error:

java.io.FileNotFoundException: File /data1/mapr-hadoop/mapred/local/taskTracker/***********/job_201405060940_908425/attempt_201405060940_908425_m_000000_0/work/******/mapper.py does not exist.

Can anyone figure out what I am missing here?

1 solution

#1


Getting rid of $PWD in the file paths solved the problem. The working command:

hadoop jar hadoop-streaming-1.1.1.jar -D mapred.reduce.tasks=0 -file "mapper.py" -mapper "mapper.py" -input "relevance/test.txt" -output "relevance/test_output_8.txt"

Also, make sure the paths are enclosed in double quotes (" "). I came across a lot of examples on the web where the quotes were missing.
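
The fix above can also be sketched as a small Python helper that assembles the same command. This is only an illustration: `build_streaming_command` is a hypothetical name, and the sketch assumes that `-file` accepts any local path to the script while the shipped copy is referenced in `-mapper` by its bare file name, which is consistent with the working command above.

```python
import os


def build_streaming_command(mapper_path, input_path, output_path,
                            streaming_jar="hadoop-streaming-1.1.1.jar"):
    """Return the argument list for a map-only Hadoop streaming job.

    Assumption: -file takes the script's path on the submitting machine,
    and -mapper must name the script without any directory prefix.
    """
    mapper_name = os.path.basename(mapper_path)  # e.g. "mapper.py"
    return [
        "hadoop", "jar", streaming_jar,
        "-D", "mapred.reduce.tasks=0",  # map-only job: no reducers
        "-file", mapper_path,
        "-mapper", mapper_name,
        "-input", input_path,
        "-output", output_path,
    ]
```

Passing the resulting list to `subprocess.call` (without `shell=True`) hands each argument to `hadoop` as-is, so no shell quoting of the paths is needed in that case.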
