How do I join two tables on a common column in BigQuery?

Date: 2022-04-24 19:15:31

To join the tables, I am using the following query.

SELECT *
FROM(select user as uservalue1 FROM [projectname.FullData_Edited]) as FullData_Edited 
JOIN (select user as uservalue2 FROM [projectname.InstallDate]) as InstallDate 
ON FullData_Edited.uservalue1=InstallDate.uservalue2;

The query works, but the joined table has only two columns, uservalue1 and uservalue2. I want to keep all the columns from both tables. Any idea how to achieve that?

2 Answers

#1


2  

#legacySQL
SELECT <list of fields to output>
FROM [projectname:datasetname.FullData_Edited] AS FullData_Edited
JOIN [projectname:datasetname.InstallDate] AS InstallDate
ON FullData_Edited.user = InstallDate.user

or, preferably:

#standardSQL
SELECT <list of fields to output>
FROM `projectname.datasetname.FullData_Edited` AS FullData_Edited
JOIN `projectname.datasetname.InstallDate` AS InstallDate
ON FullData_Edited.user = InstallDate.user

Note that using SELECT * in such cases leads to an "Ambiguous column name" error, so it is better to list explicitly the columns/fields you need in your output.
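
As a minimal sketch of the explicit-list approach, the query below names each output column once, so the shared user column cannot collide. The inline sample tables and the columns field1 and field2 are assumptions for illustration only; substitute your real tables and columns.

```sql
#standardSQL
-- Hypothetical sample data standing in for the two real tables.
WITH FullData_Edited AS (
  SELECT 1 AS user, 'a' AS field1
),
InstallDate AS (
  SELECT 1 AS user, 'b' AS field2
)
SELECT
  f.user,    -- take the join key from one side only
  f.field1,  -- columns from FullData_Edited
  i.field2   -- columns from InstallDate
FROM FullData_Edited AS f
JOIN InstallDate AS i
  ON f.user = i.user
```

With this sample data the result should be a single row with user, field1 and field2. The cost is that the list has to be kept in sync whenever either table gains a column.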

One way around it is the USING() syntax, as in the example below.
Assuming that user is the ONLY ambiguous field, it does the trick.

#standardSQL
SELECT *
FROM `projectname.datasetname.FullData_Edited` AS FullData_Edited
JOIN `projectname.datasetname.InstallDate` AS InstallDate
USING (user)

For example:

#standardSQL
WITH `projectname.datasetname.FullData_Edited` AS (
  SELECT 1 user, 'a' field1
),
`projectname.datasetname.InstallDate` AS (
  SELECT 1 user, 'b' field2
)
SELECT *
FROM `projectname.datasetname.FullData_Edited` AS FullData_Edited
JOIN `projectname.datasetname.InstallDate` AS InstallDate
USING (user)

returns

user    field1  field2   
1       a       b    

whereas using ON FullData_Edited.user = InstallDate.user gives the error below:

Error: Duplicate column names in the result are not supported. Found duplicate(s): user
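
If you would rather keep the ON clause, one possible workaround in standard SQL is SELECT * EXCEPT, which drops the duplicated join column from one side of the join. This is a sketch rather than part of the original answer; the inline sample tables and the columns field1 and field2 are hypothetical.

```sql
#standardSQL
-- Hypothetical sample data standing in for the two real tables.
WITH FullData_Edited AS (
  SELECT 1 AS user, 'a' AS field1
),
InstallDate AS (
  SELECT 1 AS user, 'b' AS field2
)
SELECT
  f.*,                -- every column from FullData_Edited, including user
  i.* EXCEPT (user)   -- every column from InstallDate except the join key
FROM FullData_Edited AS f
JOIN InstallDate AS i
  ON f.user = i.user
```

This only removes the user column from the second table. If the tables share other column names, those need to be added to the EXCEPT list or aliased explicitly.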

#2


1  

Don't use subqueries if you want all columns:

SELECT *
FROM [projectname.FullData_Edited] AS FullData_Edited
JOIN [projectname.InstallDate] AS InstallDate
  ON FullData_Edited.user = InstallDate.user;

You may have to list out the particular columns you want, to avoid duplicate column names.

While you are at it, you should also switch to standard SQL.
