Apache Beam - java.io.FileNotFoundException and unable to create the credential object

Date: 2022-08-15 15:34:07

I have written a pipeline to extract G Suite activity logs by referring to the G Suite Java quickstart, where the code reads the client_secret.json file as below:

    InputStream in = new FileInputStream("D://mypath/client_secret.json");
    GoogleClientSecrets clientSecrets = GoogleClientSecrets.load(JSON_FACTORY, new InputStreamReader(in));

The pipeline runs as expected locally (runner=DirectRunner), but the same code fails with a java.io.FileNotFoundException when executed on the cloud (runner=DataflowRunner).

I understand the local path is invalid when executed on the cloud. Any suggestions here?

Updated:

I have modified the code as below and am now able to read the client_secret.json file:

    InputStream in = Activities.class.getResourceAsStream("client_secret.json");

The actual problem is in creating the credential object:

    // Statics referenced below, declared as in the quickstart.
    private static final JsonFactory JSON_FACTORY = JacksonFactory.getDefaultInstance();
    private static HttpTransport HTTP_TRANSPORT;
    private static FileDataStoreFactory DATA_STORE_FACTORY;
    private static final java.io.File DATA_STORE_DIR = new java.io.File(System.getProperty("user.home"),
            ".credentials/admin-reports_v1-java-quickstart");
    private static final List<String> SCOPES = Arrays.asList(ReportsScopes.ADMIN_REPORTS_AUDIT_READONLY);

    static {
        try {
            HTTP_TRANSPORT = GoogleNetHttpTransport.newTrustedTransport();
            DATA_STORE_FACTORY = new FileDataStoreFactory(DATA_STORE_DIR);
        } catch (Throwable t) {
            t.printStackTrace();
            System.exit(1);
        }
    }

    public static Credential authorize() throws IOException {
        // Load client secrets from the classpath.
        InputStream in = Activities.class.getResourceAsStream("client_secret.json");
        GoogleClientSecrets clientSecrets = GoogleClientSecrets.load(JSON_FACTORY, new InputStreamReader(in));

        // Build flow and trigger user authorization request.
        GoogleAuthorizationCodeFlow flow = new GoogleAuthorizationCodeFlow.Builder(HTTP_TRANSPORT, JSON_FACTORY,
                clientSecrets, SCOPES).setDataStoreFactory(DATA_STORE_FACTORY).setAccessType("offline").build();
        Credential credential = new AuthorizationCodeInstalledApp(flow, new LocalServerReceiver()).authorize("user");
        System.out.println("Credentials saved to " + DATA_STORE_DIR.getAbsolutePath());
        return credential;
    }

Observations:

Local execution:

  1. On the initial execution, the program attempts to open a browser to authorize the request and stores the authenticated object in a file named "StoredCredential".
  2. On subsequent executions, the stored file is used to make API calls.

Running on the cloud (DataflowRunner):

  1. When I check the logs, Dataflow tries to open a browser to authenticate the request and gets stuck there.

What I need:

How can GoogleAuthorizationCodeFlow.Builder be modified so that the credential object can be created while running as a Dataflow pipeline?

2 Answers

#1

I have found a solution: create the GoogleCredential object using a service account. Below is the code for it:

    public static Credential authorize() throws IOException, GeneralSecurityException {

        String emailAddress = "service_account.iam.gserviceaccount.com";
        GoogleCredential credential = new GoogleCredential.Builder()
                .setTransport(HTTP_TRANSPORT)
                .setJsonFactory(JSON_FACTORY)
                .setServiceAccountId(emailAddress)
                .setServiceAccountPrivateKeyFromP12File(Activities.class.getResourceAsStream("MYFILE.p12"))
                .setServiceAccountScopes(Collections.singleton(ReportsScopes.ADMIN_REPORTS_AUDIT_READONLY))
                .setServiceAccountUser("USER_NAME")
                .build();

        return credential;
    }
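
For completeness, a minimal sketch of how this credential could then be wired into the Admin SDK Reports client to pull the activity logs; the application name below is a placeholder, and impersonating a user via setServiceAccountUser generally requires domain-wide delegation to be enabled for the service account in the admin console.

    // Sketch only: build the Reports service with the service-account credential.
    // The Reports model class Activities may need to be fully qualified if it
    // clashes with the pipeline's own Activities class.
    Reports service = new Reports.Builder(HTTP_TRANSPORT, JSON_FACTORY, authorize())
            .setApplicationName("my-beam-pipeline") // placeholder name
            .build();

    com.google.api.services.admin.reports.model.Activities result = service.activities()
            .list("all", "admin") // userKey, applicationName
            .setMaxResults(10)
            .execute();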

#2

Can you try running the program multiple times locally? What I am wondering is: if the "StoredCredential" file is already available, will it just work, or will it try to open the browser again?

If so, can you determine the proper place to store that file and download a copy of it from GCS onto the Dataflow worker? APIs for downloading GCS files are bundled with the Dataflow SDK jar, so you should be able to use those to fetch the credential file.
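
A minimal sketch of that idea, assuming the StoredCredential file has already been uploaded to a GCS location of your choosing (the gs:// path below is a placeholder): copy it into DATA_STORE_DIR on the worker before authorize() runs, so FileDataStoreFactory finds it instead of starting the browser flow. This uses Beam's FileSystems API, which should be usable on the worker once the pipeline's GCP options are registered.

    // Sketch only: copy a previously generated StoredCredential from GCS into the
    // local data store directory on the worker. Uses org.apache.beam.sdk.io.FileSystems,
    // java.nio.channels.Channels and Guava's ByteStreams.
    ResourceId source = FileSystems.matchNewResource("gs://my-bucket/StoredCredential", false);
    java.io.File target = new java.io.File(DATA_STORE_DIR, "StoredCredential");
    target.getParentFile().mkdirs();
    try (ReadableByteChannel in = FileSystems.open(source);
         OutputStream out = new FileOutputStream(target)) {
        ByteStreams.copy(Channels.newInputStream(in), out);
    }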
