How do I combine tables with schema differences?

I am currently working with a data set that relates to the current year. The data for each year is housed in a separate table, and the tables are queried together using a UNION ALL query.

Unfortunately, the data sets provided in the past do not share the same schema as the one for the current year: some fields have been added, some retired, and others renamed. I have no control over this.

In this case, how do I run UNION ALL queries across these tables when the schemas are different? The differences are not very significant, but they deviate enough to cause problems.

Any suggestions?

Do I merge everything into one large table that includes all fields from all years, and then add new fields as they appear? Or do I continue to keep these tables separate?

1 solution

#1

Well, for one, don't try to UNION (actually UNION ALL would probably be more appropriate) with SELECT *.

You can:

- add columns to the sets that don't have a particular column, with placeholder default or NULL values
- convert columns that are nominally "the same" but use incompatible types
- simply leave out columns that aren't common enough to bother including

For example:

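-- Three sample sets whose schemas differ in columns and types
DECLARE @a TABLE(d DATE, c INT, x FLOAT);
DECLARE @b TABLE(d DATETIME, c VARCHAR(32));
DECLARE @c TABLE(d DATE, x INT, y INT);

-- Convert c and x to the common types
SELECT d, c = CONVERT(VARCHAR(32), c), x = CONVERT(INT, x) FROM @a
UNION ALL
-- @b has no x column, so supply NULL
SELECT CONVERT(DATE, d), c, x = NULL FROM @b
UNION ALL
-- @c has no c column, so supply a placeholder; y is left out
SELECT d, c = 'not supplied', x FROM @c;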
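
The question also mentions renamed fields and asks whether to keep the tables separate. One possible approach, sketched here under assumptions, is to leave the yearly tables as they are and put the normalized UNION ALL behind a view, so the conversions and aliases live in one place. The table names (dbo.Data2021, dbo.Data2022), the view name (dbo.DataAllYears), and the columns amt / amount are hypothetical and only illustrate a rename; GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical yearly tables; names and columns are illustrative only.
CREATE TABLE dbo.Data2021 (d DATETIME, c VARCHAR(32), amt FLOAT);   -- older schema
CREATE TABLE dbo.Data2022 (d DATE, c INT, amount FLOAT);            -- "amt" renamed to "amount"
GO

-- The view exposes one consistent column list across both years.
CREATE VIEW dbo.DataAllYears
AS
SELECT yr = 2021,
       d = CONVERT(DATE, d),          -- narrow DATETIME to DATE
       c,
       amount = amt                   -- alias the old name to the current one
FROM dbo.Data2021
UNION ALL
SELECT yr = 2022,
       d,
       c = CONVERT(VARCHAR(32), c),   -- match the VARCHAR type used in 2021
       amount
FROM dbo.Data2022;
GO

-- Callers query the view and never see the per-year differences.
SELECT yr, d, c, amount FROM dbo.DataAllYears WHERE yr = 2022;
```

When a new year's table arrives, only the view definition needs another UNION ALL branch; existing queries against the view stay unchanged.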
