At first my question may seem simple and has been asked before. Bear with me - I think it is a unique question.
Table A has columns State, County, Month, Year, and Rate. Each State,County composite is listed several times with different dates and rates. Some rows have State and County set and everything else in that row is NULL.
Table B has default rates for each month and year. Columns are Month, Year, and Rate. In this table I have several years' worth of default data.
So for each State,County composite in Table A I want to fill in any missing data with the data from Table B.
I created a Table C that looks just like Table A except all the data is filled in with the default data from Table B. Then I tried to UNION Table A and Table C together. But I am ending up with two problems.
First I am ending up with duplicate rows with everything the same except for the rate. In this case I want to keep only the row that was originally in Table A (not the 'default rate').
Second, I am ending up with rows that have State and County set but everything else is NULL. I need to replace each of these rows with one row for every single default rate.
So in the end I want to have one row for each State,County,Month,Year composite.
Is it possible to combine the tables as I have described?
Let me know if you need anything clarified. Thanks.
Table A has several thousand rows. 1 to 48 rows for each State,County composite:
+-------+--------+-------+------+------+
| State | County | Month | Year | Rate |
+-------+--------+-------+------+------+
| NY    | Albany | 1     | 2011 | ###  |
| NY    | Albany | 2     | 2011 | ###  |
| ...   |        |       |      |      |
| NY    | Albany | 12    | 2011 | ###  |
| NY    | Albany | 1     | 2012 | ###  |
| ...   |        |       |      |      |
| NY    | Albany | 12    | 2012 | ###  |
| NY    | Monroe | 1     | 2011 | ###  |
| ...   |        |       |      |      |
| NY    | Monroe | 12    | 2011 | ###  |
| NY    | Essex  | NULL  | NULL | NULL |
+-------+--------+-------+------+------+
Table B has 36 rows. One row for each month over 3 years:
+-------+------+------+
| Month | Year | Rate |
+-------+------+------+
| 1     | 2011 | ***  |
| 2     | 2011 | ***  |
| ...   |      |      |
| 12    | 2011 | ***  |
| 1     | 2012 | ***  |
| ...   |      |      |
| 12    | 2012 | ***  |
| 1     | 2013 | ***  |
| ...   |      |      |
| 12    | 2013 | ***  |
+-------+------+------+
The resulting table has more rows than Table A. Each State,County composite has at least the 36 rows from the default table:
+-------+--------+-------+------+------+
| State | County | Month | Year | Rate |
+-------+--------+-------+------+------+
| NY    | Albany | 1     | 2011 | ###  |
| ...   |        |       |      |      |
| NY    | Albany | 12    | 2011 | ###  |
| NY    | Albany | 1     | 2012 | ###  |
| ...   |        |       |      |      |
| NY    | Albany | 12    | 2012 | ###  |
| NY    | Albany | 1     | 2013 | ***  |
| ...   |        |       |      |      |
| NY    | Albany | 12    | 2013 | ***  |
| NY    | Monroe | 1     | 2011 | ###  |
| ...   |        |       |      |      |
| NY    | Monroe | 12    | 2011 | ###  |
| NY    | Monroe | 1     | 2012 | ***  |
| ...   |        |       |      |      |
| NY    | Monroe | 12    | 2012 | ***  |
| NY    | Monroe | 1     | 2013 | ***  |
| ...   |        |       |      |      |
| NY    | Monroe | 12    | 2013 | ***  |
| NY    | Essex  | 1     | 2011 | ***  |
| ...   |        |       |      |      |
| NY    | Essex  | 12    | 2011 | ***  |
| NY    | Essex  | 1     | 2012 | ***  |
| ...   |        |       |      |      |
| NY    | Essex  | 12    | 2012 | ***  |
| NY    | Essex  | 1     | 2013 | ***  |
| ...   |        |       |      |      |
| NY    | Essex  | 12    | 2013 | ***  |
+-------+--------+-------+------+------+
Key: *** is a rate from the default table (Table B). ### is a rate from the other table (Table A).
1 solution
I think the best approach is to generate all combinations of the geography and time. You can do this by taking the state and county from tablea and cross joining with the year and month from tableb. Then use left join to see if there is any value in tablea. If so, choose it. Otherwise, take the value from tableb:
select sc.state, sc.county, ym.year, ym.month,
       coalesce(a.rate, ym.rate) as rate              -- rate from tablea if present, else the default
from (select distinct state, county from tablea) sc  -- every state/county pair
     cross join tableb ym                             -- paired with every default month/year
     left outer join tablea a
         on a.state = sc.state and a.county = sc.county and
            a.year = ym.year and a.month = ym.month;
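
To try the query without touching the real tables, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rates, the two-month default table, and the use of SQLite are all assumptions made for illustration; the original question does not say which database is in use, and the added order by is only there to make the output easy to read.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tablea (state text, county text, month int, year int, rate real);
    create table tableb (month int, year int, rate real);
""")

# Hypothetical defaults: two months instead of the 36 rows in the question.
conn.executemany("insert into tableb values (?, ?, ?)",
                 [(1, 2011, 1.0), (2, 2011, 2.0)])

# Hypothetical rates: Albany is complete, Monroe is missing month 2,
# and Essex only has the all-NULL placeholder row.
conn.executemany("insert into tablea values (?, ?, ?, ?, ?)", [
    ("NY", "Albany", 1, 2011, 9.1),
    ("NY", "Albany", 2, 2011, 9.2),
    ("NY", "Monroe", 1, 2011, 8.1),
    ("NY", "Essex", None, None, None),
])

# Same query as in the answer, plus an order by for stable output.
QUERY = """
select sc.state, sc.county, ym.year, ym.month,
       coalesce(a.rate, ym.rate) as rate
from (select distinct state, county from tablea) sc
cross join tableb ym
left outer join tablea a
    on a.state = sc.state and a.county = sc.county
   and a.year = ym.year and a.month = ym.month
order by sc.state, sc.county, ym.year, ym.month
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

With this sample data each of the three counties comes back with one row per default month. The Essex placeholder row never matches the join, because a comparison against NULL is never true, so Essex gets only default rates. One caveat: if tablea ever holds two rows for the same state, county, month and year, both will appear in the result, so the one-row-per-composite guarantee depends on that combination being unique in tablea.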