I'm trying to use the public GitHub dataset on BigQuery to count events - PushEvents, in this case - on a per repository basis over time.
SELECT COUNT(*)
FROM [githubarchive:github.timeline]
WHERE type = 'PushEvent'
AND repository_name = "account/repo"
GROUP BY pushed_at
ORDER BY pushed_at DESC
Basically, just retrieve the count for a specified repo and event type, group the count by date, and return the list. BigQuery validates the query above, but then fails it with:
Field 'pushed_at' not found.
As far as I can tell from GitHub's PushEvent documentation, however, pushed_at is an available field. Anybody have examples of related queries that execute properly? Any suggestions as to what's being done incorrectly here?
1 Answer
#1
The field is called repository_pushed_at, and you also probably meant to include it in the SELECT list, i.e.
SELECT repository_pushed_at, COUNT(*)
FROM [githubarchive:github.timeline]
WHERE type = 'PushEvent'
AND repository_name = "account/repo"
GROUP BY repository_pushed_at
ORDER BY repository_pushed_at DESC
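One thing to watch: GROUP BY repository_pushed_at groups on each distinct value of that field, so if it holds a full timestamp you get one row per timestamp rather than one per day. To count per date, the value has to be truncated to its date part first. Below is a minimal sketch of that idea, run locally against an in-memory SQLite table instead of BigQuery. The table, its rows, and the push_date / pushes aliases are made up for illustration, and it assumes the timestamp is stored as a 'YYYY-MM-DD HH:MM:SS' string, which you should check against the actual schema. BigQuery's own string and date functions differ from SQLite's, so treat this as the shape of the query, not something to paste in.

```python
import sqlite3

# Toy stand-in for the timeline table; the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE timeline ("
    "type TEXT, repository_name TEXT, repository_pushed_at TEXT)"
)
rows = [
    ("PushEvent", "account/repo", "2012-03-11 06:36:13"),
    ("PushEvent", "account/repo", "2012-03-11 09:10:00"),
    ("PushEvent", "account/repo", "2012-03-12 01:00:00"),
    ("WatchEvent", "account/repo", "2012-03-12 02:00:00"),  # wrong event type
    ("PushEvent", "other/repo", "2012-03-12 03:00:00"),     # wrong repository
]
conn.executemany("INSERT INTO timeline VALUES (?, ?, ?)", rows)

# Assumption: the timestamp is a 'YYYY-MM-DD HH:MM:SS' string, so its first
# ten characters are the date. The grouping key also appears in the SELECT list.
query = """
SELECT substr(repository_pushed_at, 1, 10) AS push_date,
       COUNT(*) AS pushes
FROM timeline
WHERE type = 'PushEvent'
  AND repository_name = 'account/repo'
GROUP BY push_date
ORDER BY push_date DESC
"""
daily_counts = conn.execute(query).fetchall()
print(daily_counts)  # [('2012-03-12', 1), ('2012-03-11', 2)]
```

The two pushes on 2012-03-11 collapse into a single row because the grouping key is the date prefix, not the full timestamp.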