I am trying to follow this blog post from Google using their new Cloud ML tools.
https://cloud.google.com/blog/big-data/2016/12/how-to-train-and-classify-images-using-google-cloud-machine-learning-and-cloud-dataflow
Running from within their provided docker instance
docker pull gcr.io/cloud-datalab/datalab:local
docker run -it -p "127.0.0.1:8080:8080" \
--entrypoint=/bin/bash \
gcr.io/cloud-datalab/datalab:local
starting from: root@9e93221352d8:~/google-cloud-ml/samples/flowers#
To run the first preprocessing step:
Assign appropriate values.
PROJECT=$(gcloud config list project --format "value(core.project)")
JOB_ID="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
BUCKET="gs://${PROJECT}-ml"
GCS_PATH="${BUCKET}/${USER}/${JOB_ID}"
DICT_FILE=gs://cloud-ml-data/img/flower_photos/dict.txt
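Side note: the double slash and double underscore in the GCS_PATH echoed further down (`gs://...-ml//flowers__...`) suggest `${USER}` is empty inside the Docker container. A hedged guard for that (the bucket value here is a placeholder, not my real bucket):

```shell
# Guard against an empty $USER inside the container (the likely cause of the
# "//" and "flowers__" in the echoed GCS_PATH): fall back to `whoami`.
BUCKET="gs://example-project-ml"   # placeholder; substitute your real bucket
USER="${USER:-$(whoami)}"
JOB_ID="flowers_${USER}_$(date +%Y%m%d_%H%M%S)"
GCS_PATH="${BUCKET}/${USER}/${JOB_ID}"
echo "${GCS_PATH}"
```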
Preprocess the eval set.
python trainer/preprocess.py \
--input_dict "$DICT_FILE" \
--input_path "gs://cloud-ml-data/img/flower_photos/eval_set.csv" \
--output_path "${GCS_PATH}/preproc/eval" \
--cloud
returns
(27042c30421ec530): Workflow failed. Causes: (70e56dda0121e0fa): One or more access checks for temp location or staged files failed. Please refer to other error messages for details. For more information on security and permissions, please see https://cloud.google.com/dataflow/security-and-permissions.
Heading to the console, the logs read:
(531d956bf99b5f27): Staged package cloudml.latest.tar.gz at location 'gs://api-project-773889352370-ml/flowers__20170106_123249/preproc/staging/flowers-20170106-123312.1483705994.201001/cloudml.latest.tar.gz' is inaccessible.
I tried authenticating again with
gcloud beta auth application-default login
and getting the key from the browser. Nothing seems wrong there.
I have successfully run the MNIST cloud learning tutorial, so there are no authentication issues communicating with Google Compute Engine.
I can confirm the path to my bucket is correct:
root@9e93221352d8:~/google-cloud-ml/samples/flowers# echo ${GCS_PATH}
gs://api-project-773889352370-ml//flowers__20170106_165608
but no folder flowers__20170106_165608 is ever created (due to permissions).
Does Dataflow need separate credentials? I went to the console and made sure my account has access to the Dataflow API. Is there anything I need beyond the following?
root@9e93221352d8:~/google-cloud-ml/samples/flowers# gcloud config list
Your active configuration is: [default]
[component_manager]
disable_update_check = True
[compute]
region = us-central1
zone = us-central1-a
[core]
account = #### <- scrubbed for SO, it's correct.
project = api-project-773889352370
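One thing I want to verify is whether the Dataflow worker service account (not my user account) can read the staged files. A dry-run sketch of the gsutil commands I'd use to inspect and fix the bucket ACL; the service-account email format below (the Compute Engine default) is an assumption on my part, so the real account should be confirmed in the console's service accounts tab:

```shell
# Dry-run sketch: print the gsutil commands to inspect the bucket ACL and
# grant the Dataflow worker service account write access.
# NOTE: the email format below (Compute Engine default service account) is an
# assumption -- check the console's "Service accounts" tab for the real one.
PROJECT_NUMBER=773889352370
BUCKET="gs://api-project-${PROJECT_NUMBER}-ml"
SA="${PROJECT_NUMBER}-compute@developer.gserviceaccount.com"

echo "gsutil acl get ${BUCKET}"            # inspect the current bucket ACL
echo "gsutil acl ch -u ${SA}:W ${BUCKET}"  # grant write access if missing
```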
Edit: Showing the service accounts tab in the console.
Edit: Accepted answer below. I'm accepting this answer because Jeremy Lewi is correct. The problem is not that Dataflow lacks permissions, but that the GCS object was never created, as the preprocess logs show.
Google's tutorial is probably not well configured for the free tier; I'm guessing it distributes work to too many instances and exceeds the CPU quota. If I cannot solve it, I will open a correctly framed question.
1 Answer
#1
Please see the information about service accounts at the link provided by the error message. I suspect the service account is not authorized correctly to view the staged file.