Error when writing the parameters for a BigQuery query job in a Python script

Date: 2021-01-10 15:32:10

I'm trying to adapt the asynch_query.py script found at https://github.com/GoogleCloudPlatform/bigquery-samples-python/tree/master/python/samples to execute a query and send the output to a BigQuery table. The JSON section of the script, as I've written it to set the parameters, is as follows:

    job_data = {
        'jobReference': {
            'projectId': project_id,
            'job_id': str(uuid.uuid4())
        },
        'configuration': {
            'query': {
                'query': queryString,
                'priority': 'BATCH' if batch else 'INTERACTIVE',
                'createDisposition': 'CREATE_IF_NEEDED',
                'defaultDataset': {
                    'datasetId': 'myDataset'
                },
                'destinationTable': {
                    'datasetID': 'myDataset',
                    'projectId': project_id,
                    'tableId': 'testTable'
                },
                'tableDefinitions': {
                    '(key)': {
                        'schema': {
                            'fields': [
                                {
                                    'description': 'eventLabel',
                                    'fields': [],
                                    'mode': 'NULLABLE',
                                    'name': 'eventLabel',
                                    'type': 'STRING'
                                }
                            ]
                        }
                    }
                }
            }
        }
    }

When I run my script I get an error message that a "Required parameter is missing". I've been through the documentation at https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.query trying to figure out what is missing, but attempts at various configurations have failed. Can anyone identify what is missing and how I would fix this error?

1 answer

#1


Not sure what's going on. To insert the results of a query into another table I use this code:

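    import time


    def create_table_from_query(connector, project_id, dataset_id, query, dest_table):
        # Run the query and write its results to dest_table,
        # replacing any rows already in that table (WRITE_TRUNCATE).
        body = {
            'configuration': {
                'query': {
                    'destinationTable': {
                        'projectId': project_id,
                        'tableId': dest_table,
                        'datasetId': dataset_id
                    },
                    'writeDisposition': 'WRITE_TRUNCATE',
                    'query': query,
                },
            }
        }

        response = connector.jobs().insert(projectId=project_id,
                                           body=body).execute()
        wait_job_completion(connector, project_id,
                            response['jobReference']['jobId'])


    def wait_job_completion(connector, project_id, job_id, poll_seconds=1):
        # Poll the job until BigQuery reports its state as DONE.
        while True:
            response = connector.jobs().get(projectId=project_id,
                                            jobId=job_id).execute()
            if response['status']['state'] == 'DONE':
                return
            # Pause between polls instead of calling the API in a tight loop.
            time.sleep(poll_seconds)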

where connector is build('bigquery', 'v2', http=authorization)
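
The polling loop above stops as soon as the job state is DONE, but BigQuery also reports failed jobs as DONE and records the failure under status.errorResult. A minimal sketch of a stricter wait, with a timeout and an error check, could look like this. The name wait_for_job and the timeout_seconds and poll_seconds parameters are illustrative, not part of the original answer, and connector is assumed to be the build(...) object described above.

```python
import time


def wait_for_job(connector, project_id, job_id,
                 timeout_seconds=300, poll_seconds=2):
    """Poll jobs.get until the job is DONE; raise on failure or timeout.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = connector.jobs().get(projectId=project_id,
                                   jobId=job_id).execute()
        status = job['status']
        if status['state'] == 'DONE':
            # A DONE job may still have failed; errorResult is set in that case.
            if 'errorResult' in status:
                raise RuntimeError(
                    'BigQuery job failed: %s' % status['errorResult'])
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                'Job %s not done after %s seconds' % (job_id, timeout_seconds))
        time.sleep(poll_seconds)
```

Raising on errorResult means a query that fails (for example, because of a syntax error) surfaces as an exception instead of looking like a successful run.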

Maybe you could start from there and keep adding new fields as you wish (notice that you don't have to define the schema of the table as it's already contained in the results of the query).
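
Applying that advice to the job body from the question, a minimal sketch might look like the following. It rests on an assumption about the cause of the error: the v2 REST reference spells the keys jobId and datasetId, whereas the question uses job_id and datasetID. It also drops tableDefinitions, which describes external data sources rather than the destination table, and whose '(key)' entry looks like a placeholder copied from the reference. build_query_job_body is a hypothetical helper name, not part of any library.

```python
import uuid


def build_query_job_body(project_id, dataset_id, table_id, query, batch=False):
    """Return a jobs.insert body that writes query results to a table.

    Hypothetical helper for illustration; key names follow the v2 REST
    reference for jobs.
    """
    return {
        'jobReference': {
            'projectId': project_id,
            'jobId': str(uuid.uuid4()),  # camelCase 'jobId', not 'job_id'
        },
        'configuration': {
            'query': {
                'query': query,
                'priority': 'BATCH' if batch else 'INTERACTIVE',
                'createDisposition': 'CREATE_IF_NEEDED',
                'defaultDataset': {'datasetId': dataset_id},
                'destinationTable': {
                    'projectId': project_id,
                    'datasetId': dataset_id,  # 'datasetId', not 'datasetID'
                    'tableId': table_id,
                },
                # No 'tableDefinitions' here: the destination table's schema
                # is taken from the query results.
            },
        },
    }


if __name__ == '__main__':
    body = build_query_job_body('my-project', 'myDataset', 'testTable',
                                'SELECT 1')
    print(sorted(body['configuration']['query']['destinationTable']))
```

The returned dictionary can be passed as the body argument to connector.jobs().insert(...), in the same way as the body in the code above.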
