I have data like this i.e.
我有这样的数据,即
cityName stateId cityText1 52 cityText2 52 cityText3 52 cityExp1 72 cityExp2 72 cityExp3 72 city1 21 city2 21
i am using SUB QUERIES FOR GETTING DATA. now with BIGQUERY I want data like this:
我正在使用SUB QUERIES获取数据。现在有BIGQUERY我想要这样的数据:
cityName rowNumber cityText1 1 cityText2 1 cityText3 1 cityExp1 2 cityExp2 2 cityExp3 2 city1 3 city2 3
I have tried to use row_number() but it gives unique number to every row. so is this possible how i want.
我曾尝试使用row_number()但它为每一行提供唯一的编号。所以这可能是我想要的。
1 个解决方案
#1
You need first to join them with a scalar value to be one partition, then you can apply the ROW_NUMBER on that partition.
首先需要将标量值作为一个分区加入它们,然后可以在该分区上应用ROW_NUMBER。
Update: scroll to the bottom of the answer to see a suggestion without using scalar.
更新:滚动到答案的底部,查看不使用标量的建议。
SELECT stateId,
row_number() over (partition BY scalar) AS INDEX
FROM
(SELECT stateId,
1 AS scalar
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)
GROUP BY stateId) d
this returns:
+-----+---------+-------+---+
| Row | stateId | index | |
+-----+---------+-------+---+
| 1 | 52 | 1 | |
| 2 | 72 | 2 | |
| 3 | 21 | 3 | |
+-----+---------+-------+---+
Then you can join the table again and prepare the final output. For our static tables it's a pretty long query:
然后,您可以再次加入表格并准备最终输出。对于我们的静态表,这是一个非常长的查询:
SELECT t.cityName,
t.stateId,
d.index
FROM
(SELECT *
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)) t
JOIN
(SELECT stateId,
row_number() over (partition BY scalar) AS INDEX
FROM
(SELECT stateId,
1 AS scalar
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)
GROUP BY stateId)) d ON d.stateId=t.stateId
this returns the final output:
这将返回最终输出:
+-----+------------+-----------+---------+---+
| Row | t_cityName | t_stateId | d_index | |
+-----+------------+-----------+---------+---+
| 1 | cityText1 | 52 | 1 | |
| 2 | cityText2 | 52 | 1 | |
| 3 | cityText3 | 52 | 1 | |
| 4 | cityExp1 | 72 | 2 | |
| 5 | cityExp2 | 72 | 2 | |
| 6 | cityExp3 | 72 | 2 | |
| 7 | city1 | 21 | 3 | |
| 8 | city2 | 21 | 3 | |
+-----+------------+-----------+---------+---+
Update: After the update without scalar the query becomes:
更新:在没有标量的更新后,查询变为:
SELECT stateId,
row_number() over () AS INDEX
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)
group by stateId
#1
You need first to join them with a scalar value to be one partition, then you can apply the ROW_NUMBER on that partition.
首先需要将标量值作为一个分区加入它们,然后可以在该分区上应用ROW_NUMBER。
Update: scroll to the bottom of the answer to see a suggestion without using scalar.
更新:滚动到答案的底部,查看不使用标量的建议。
SELECT stateId,
row_number() over (partition BY scalar) AS INDEX
FROM
(SELECT stateId,
1 AS scalar
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)
GROUP BY stateId) d
this returns:
+-----+---------+-------+---+
| Row | stateId | index | |
+-----+---------+-------+---+
| 1 | 52 | 1 | |
| 2 | 72 | 2 | |
| 3 | 21 | 3 | |
+-----+---------+-------+---+
Then you can join the table again and prepare the final output. For our static tables it's a pretty long query:
然后,您可以再次加入表格并准备最终输出。对于我们的静态表,这是一个非常长的查询:
SELECT t.cityName,
t.stateId,
d.index
FROM
(SELECT *
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)) t
JOIN
(SELECT stateId,
row_number() over (partition BY scalar) AS INDEX
FROM
(SELECT stateId,
1 AS scalar
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)
GROUP BY stateId)) d ON d.stateId=t.stateId
this returns the final output:
这将返回最终输出:
+-----+------------+-----------+---------+---+
| Row | t_cityName | t_stateId | d_index | |
+-----+------------+-----------+---------+---+
| 1 | cityText1 | 52 | 1 | |
| 2 | cityText2 | 52 | 1 | |
| 3 | cityText3 | 52 | 1 | |
| 4 | cityExp1 | 72 | 2 | |
| 5 | cityExp2 | 72 | 2 | |
| 6 | cityExp3 | 72 | 2 | |
| 7 | city1 | 21 | 3 | |
| 8 | city2 | 21 | 3 | |
+-----+------------+-----------+---------+---+
Update: After the update without scalar the query becomes:
更新:在没有标量的更新后,查询变为:
SELECT stateId,
row_number() over () AS INDEX
FROM
(SELECT 'cityText1' AS cityName,
52 AS stateId),
(SELECT 'cityText2' AS cityName,
52 AS stateId),
(SELECT 'cityText3' AS cityName,
52 AS stateId),
(SELECT 'cityExp1' AS cityName,
72 AS stateId),
(SELECT 'cityExp2' AS cityName,
72 AS stateId),
(SELECT 'cityExp3' AS cityName,
72 AS stateId),
(SELECT 'city1' AS cityName,
21 AS stateId),
(SELECT 'city2' AS cityName,
21 AS stateId)
group by stateId