解决hiveserver2报错:java.io.IOException: Job status not available - Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

时间:2022-02-18 16:36:52

用户使用的sql: select count( distinct patient_id ) from argus.table_aa000612_641cd8ce_ceff_4ea0_9b27_0a3a743f0fe3;

下面做不同的测试:

1.beeline -u jdbc:hive2://0.0.0.0:10000 -e "select count( distinct patient_id ) from argus.table_aa000612_641cd8ce_ceff_4ea0_9b27_0a3a743f0fe3;"

报错:

INFO  : Starting Job = job_1494385775332_0822, Tracking URL = http://hadoop1.tj2.yiducloud.cn:8088/proxy/application_1494385775332_0822/
INFO : Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1494385775332_0822
ERROR : Ended Job = job_1494385775332_0822 with exception 'java.io.IOException(Job status not available )'
java.io.IOException: Job status not available
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:)
at org.apache.hadoop.mapreduce.Job.getJobState(Job.java:)
at org.apache.hadoop.mapred.JobClient$NetworkedJob.getJobState(JobClient.java:)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:)
at org.apache.hive.service.cli.operation.SQLOperation.access$(SQLOperation.java:)
at org.apache.hive.service.cli.operation.SQLOperation$$.run(SQLOperation.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hive.service.cli.operation.SQLOperation$.run(SQLOperation.java:)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:)
at java.util.concurrent.FutureTask.run(FutureTask.java:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
Error: Error while processing statement: FAILED: Execution Error, return code from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=)

yarn上报错:

Diagnostics: 
Staging dir does not exist /mr-history/anonymous/.staging

2. beeline -u jdbc:hive2://0.0.0.0:10000 -n rd -e "select count( distinct patient_id ) from argus.table_aa000612_641cd8ce_ceff_4ea0_9b27_0a3a743f0fe3;"

报错栈:

INFO  : Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1494385775332_0823
INFO : Hadoop job information for Stage-: number of mappers: ; number of reducers:
INFO : -- ::, Stage- map = %, reduce = %
ERROR : Ended Job = job_1494385775332_0823 with exception 'java.io.IOException(java.io.IOException: Unknown Job job_1494385775332_0823
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:)
at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$.callBlockingMethod(MRClientProtocol.java:)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:)
)'
java.io.IOException: java.io.IOException: Unknown Job job_1494385775332_0823
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:)
at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$.callBlockingMethod(MRClientProtocol.java:)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:)
at org.apache.hadoop.mapred.ClientServiceDelegate.getTaskCompletionEvents(ClientServiceDelegate.java:)
at org.apache.hadoop.mapred.YARNRunner.getTaskCompletionEvents(YARNRunner.java:)
at org.apache.hadoop.mapreduce.Job$.run(Job.java:)
at org.apache.hadoop.mapreduce.Job$.run(Job.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.mapreduce.Job.getTaskCompletionEvents(Job.java:)
at org.apache.hadoop.mapred.JobClient$NetworkedJob.getTaskCompletionEvents(JobClient.java:)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.computeReducerTimeStatsPerJob(HadoopJobExecHelper.java:)
at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:)
at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:)
at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:)
at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:)
at org.apache.hive.service.cli.operation.SQLOperation.access$(SQLOperation.java:)
at org.apache.hive.service.cli.operation.SQLOperation$$.run(SQLOperation.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hive.service.cli.operation.SQLOperation$.run(SQLOperation.java:)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:)
at java.util.concurrent.FutureTask.run(FutureTask.java:)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:)
at java.lang.Thread.run(Thread.java:)
Caused by: java.io.IOException: Unknown Job job_1494385775332_0823
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:)
at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$.callBlockingMethod(MRClientProtocol.java:)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:)
at java.lang.reflect.Constructor.newInstance(Constructor.java:)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.unwrapAndThrowException(MRClientProtocolPBClientImpl.java:)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:)
at java.lang.reflect.Method.invoke(Method.java:)
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:)
... more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): Unknown Job job_1494385775332_0823
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getTaskAttemptCompletionEvents(HistoryClientService.java:)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBServiceImpl.java:)
at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$.callBlockingMethod(MRClientProtocol.java:)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at org.apache.hadoop.ipc.Server$Handler$.run(Server.java:)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:)
at org.apache.hadoop.ipc.Client.call(Client.java:)
at org.apache.hadoop.ipc.Client.call(Client.java:)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:)
at com.sun.proxy.$Proxy47.getTaskAttemptCompletionEvents(Unknown Source)
at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getTaskAttemptCompletionEvents(MRClientProtocolPBClientImpl.java:)
... more
Error: Error while processing statement: FAILED: Execution Error, return code from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=)

yarn上报错:

Diagnostics: 
Job init failed : org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.FileNotFoundException: File does not exist: hdfs://tijmu/mr-history/rd/.staging/job_1494385775332_0823/job.splitmetainfo    

3. beeline -u jdbc:hive2://0.0.0.0:10000 -n work -e "xxx": 报错和1一样

4. hive -e "select count( distinct patient_id ) from argus.table_aa000612_641cd8ce_ceff_4ea0_9b27_0a3a743f0fe3;" 正常结束

5. 怀疑是权限调试时改了代谢病的hdfs配置

6. 解决方案:根据第二步Diagnostics的报错信息搜索到一篇文章:https://community.hortonworks.com/questions/17489/job-init-fail-job-splitmetainfo-file-does-not-exis.html

 解决hiveserver2报错:java.io.IOException: Job status not available - Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

对比了hive和hs2中此属性的值,果然不一样,在hs2中采用hive中的值后(set yarn.app.mapreduce.am.staging-dir=/user/history;)hql执行正常。

另一个集群的HS2上此属性值也一样是/mr-history, hdfs同样存在目录/mr-history/rd/.staging,权限也一样,但执行就没问题,因此估计还是和第5点有关。
抽查了以前的集群,值都为/user。问题关键是需要两个地方配置成一样。