自连接与内部和外部连接查询

时间:2021-10-22 00:15:32

I have a table that is set up so that one column (attribute) contains information like first name, last name, account number and any other information related to a thing in the database. Another column (attributeType) contains a number indicating what the attribute is e.g. 1 might be first name, 2 last name and 3 account number etc. There is another column (enddate) indicating if the record is current by having a date there. Usually it will be set to the year 9999 when current and some date in the past otherwise. All data describing the same thing has a unique value too in another column (entity) so that each record with the same number in the entity column will be describing the one person. E.g.

我已经设置了一个表,以便一个列(属性)包含一些信息,如名称、姓、帐户号和与数据库中某个事物相关的任何其他信息。另一个列(attributeType)包含一个数字,该数字指示属性是什么,例如:1可能是名、2名姓和3个帐号等等。通常它将被设置为9999年,当当前和过去的某个日期,否则。描述同一事物的所有数据在另一个列(实体)中也有一个惟一的值,因此实体列中编号相同的每个记录都将描述一个人。如。

entity  attribute  attributetype  enddate
------  ---------  -------------  --------
1       ben        1              9999-1-1
1       alt        2              9999-1-1
1       12345      3              9999-1-1
2       sam        1              9999-1-1
2       smith      2              9999-1-1
2       98765      3              1981-1-1

I want to select a person from the above table with a specific 1st and last name where the name will be current but not output the account number if it is not. Assuming the table is called tblAccount I do the following for the name part:

我想从上面的表中选择一个人,其中有一个特定的名字将会是当前的名字,但如果不是,则不会输出账号。假设该表名为tblAccount,我对name部分做如下操作:

select ta1.attribute '1st Name', ta2.attribute 'last name'
from tblAccount ta1
inner join tblAccount ta2 on ta1.entity = ta2.entity
where ta1.attribute = 'sam' and ta2.attribute = 'smith'
      and ta1.attributetype = 1 and ta2. attributetype = 2
      and ta1.enddate > getdate() and ta2.enddate > getdate()

and it outputs the first and last names as expected, but when I want to include the account number column I get nothing output:

它会像预期的那样输出姓和名,但是当我想要包含账号列时,我不会得到任何输出:

select ta1.attribute '1st Name', ta2.attribute 'last name', ta3.attribute 'account#'
from tblAccount ta1
inner join tblAccount ta2 on ta1.entity = ta2.entity
left join tblAccount ta3 on ta1.entity = ta3.entity
where ta1.attribute = 'sam' and ta2.attribute = 'smith'
      and ta1.attributetype = 1 and ta2. attributetype = 2
      and ta1.enddate > getdate() and ta2.enddate > getdate()
      and ta3.attributetype = 3 and ta3.enddate > getdate()

What I would like to see is the first and last names output with nothing in the account# column in the above case where it is not current. What am I doing wrong and how can I correct this query?

我希望看到的是上面的account#列中不包含任何内容的第一个和最后一个名称的输出,该列不是当前的。我做错了什么,我该如何更正这个查询?

2 个解决方案

#1


4  

You have to move the date comparison to the join condition:

您必须将日期比较移动到join条件:

select ta1.attribute '1st Name'
    , ta2.attribute 'last name'
    , ta3.attribute 'account#'
from tblAccount ta1
inner join tblAccount ta2 
    on ta1.entity = ta2.entity
     and ta1.attributetype = 1 and ta2. attributetype = 2
     and ta1.enddate > getdate() and ta2.enddate > getdate()
left join tblAccount ta3 on ta1.entity = ta3.entity
      and ta3.attributetype = 3 and ta3.enddate > getdate()
where ta1.attribute = 'sam' and ta2.attribute = 'smith'

When it's in the where clause it's comparing getdate() to NULL if there is no account, which returns NULL. So no record.

在where子句中,如果没有account,它会将getdate()与NULL进行比较,后者返回NULL。所以没有记录。

EDIT:

编辑:

In response to the valid concern about multiple valid records, and to make the code a little more maintenance freindly:

为了响应对多个有效记录的有效关注,并使代码更易于维护:

DECLARE @FNAME VARCHAR(50) = 'sam'
    , @LNAME VARCHAR(50) = 'smith'
    , @now DATETIME2(7) = GETDATE();

SELECT 
    name.[1st Name]
    , name.[last name]
    , name.entity
    , 
        (
            select 
                top 1 
                ta3.attribute
            FROM tblAccount ta3 
            WHERE 
                ta3.entity = name.entity
                and 
                ta3.attributetype = 3 
                and 
                ta3.enddate > @now
            ORDER BY 
                ta3.enddate 
        )
FROM 
    (        
        select 
            ta1.attribute '1st Name'
            , ta2.attribute 'last name'
            , ta.entity
            , ROW_NUMBER()
                OVER(
                    PARTITION BY 
                        ta1.entity
                    ORDER BY 
                        ta1.enddate
                    ) r
        from 
            tblAccount ta1
        inner join tblAccount ta2 
            on 
            ta1.entity = ta2.entity
            and 
            ta2. attributetype = 2
            and 
            ta2.enddate > @now
            and 
            ta2.attribute = @LNAME
        where 
            ta1.attributetype = 1 
            and 
            ta1.attribute = @fname 
            and 
            ta1.enddate > @now
    ) name
WHERE    
    NAME.r = 1

;

;

This code works around the implied assumptions of one entity per first/last name and exactly one enddate after the execution time. The variables are a little more stored proc friendly, and allow you to change the "as of" date. And if you're stuck with EAV, you will likely want stored procs. I'm taking the first record ending after the date in question, on the assumption that any later record(s) should only be valid after that one expires. Maybe it's overkill, since it's beyond the scope of the OP's question but it's a valid point.

这段代码围绕着一个实体的隐含假设,每个实体的姓/名,以及执行时间之后的一个字节。这些变量更适合存储proc,并且允许您更改“as of”日期。如果您被困在EAV中,您可能需要存储的proc。我取第一个记录,在问题的日期之后结束,假设以后的任何记录都应该在那个记录过期后有效。也许它是过量的,因为它超出了OP的范围,但它是一个有效的点。

I say "stuck with EAV". While EAV isn't always bad; neither is shooting someone in the back. In either case, you'd better have solid justification if you expect to get it past a jury. In a NoSQL storage pattern it's fine, but EAV is usually a poor implementation pattern for the RDBMS paradigm.

我说"坚持EAV"虽然EAV并不总是坏的;也没有人在背后开枪。无论哪种情况,如果你想让陪审团通过,你最好有充分的理由。在NoSQL存储模式中,这是可以的,但是对于RDBMS范式来说,EAV通常是一个糟糕的实现模式。

Though from the OP's later comment, it looks like he's hit on one of the better reasons.

尽管从OP后来的评论来看,他似乎找到了一个更好的理由。

#2


1  

Each attribute is actually a distinct entity in this model, but they are all sharing the same storage in the same physical table (Why?). This yields:

在这个模型中,每个属性实际上都是一个不同的实体,但是它们都在同一个物理表*享相同的存储(为什么?)这个收益率:

with data as (
   select entity = 1, attribute = 'ben',   attributeType=1, enddate = convert(datetime,'99990101') union all
   select entity = 1, attribute = 'alt',   attributeType=2, enddate = convert(datetime,'99990101') union all
   select entity = 1, attribute = '12345', attributeType=3, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = 'sam',   attributeType=1, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = 'smith', attributeType=2, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = '67890', attributeType=3, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = '68790', attributeType=3, enddate = convert(datetime,'20130331') union all
   select entity = 2, attribute = '876', attributeType=3, enddate = convert(datetime,'19810101') 
) 
select top 1
    FirstName, LastName, AccountNum
from (
  select top 1 
    a1.entity, FirstName, LastName
  from (
    select entity, enddate, attribute as FirstName
    from data d 
    where d.enddate >= getdate()
      and attributeType = 1
  ) a1
  join (
    select entity, enddate, attribute as LastName
    from data 
    where enddate >= getdate()
      and attributeType = 2
  ) a2 on a1.entity = a2.entity
     and a1.enddate = a2.enddate
  where FirstName = 'sam' and LastName = 'smith'
    and a1.enddate >= getdate() and a2.enddate >= getdate()
  order by a1.enddate
) E
left join (
  select entity, enddate, attribute as AccountNum
  from data 
  where enddate >= getdate()
    and attributeType = 3
) a3 on a3.entity = E.entity
order by a3.enddate

returning:

返回:

FirstName LastName AccountNum
--------- -------- ----------
sam       smith    68790

Note that it is quite common for accountign departments, at least, to enter future transactions during quiet times of the month, especially if those transactions will take effect during busy times of the month (ie month-end). The same for annual transactions. One should not assume that only one record can exist with an expiry > getdate().

请注意,会计部门至少在每月的安静时间内输入未来的交易是很常见的,特别是如果这些交易将在每月的繁忙时间生效(即月末)。年度交易也是如此。我们不应该假设只有一个记录可以在过期> getdate()中存在。

#1


4  

You have to move the date comparison to the join condition:

您必须将日期比较移动到join条件:

select ta1.attribute '1st Name'
    , ta2.attribute 'last name'
    , ta3.attribute 'account#'
from tblAccount ta1
inner join tblAccount ta2 
    on ta1.entity = ta2.entity
     and ta1.attributetype = 1 and ta2. attributetype = 2
     and ta1.enddate > getdate() and ta2.enddate > getdate()
left join tblAccount ta3 on ta1.entity = ta3.entity
      and ta3.attributetype = 3 and ta3.enddate > getdate()
where ta1.attribute = 'sam' and ta2.attribute = 'smith'

When it's in the where clause it's comparing getdate() to NULL if there is no account, which returns NULL. So no record.

在where子句中,如果没有account,它会将getdate()与NULL进行比较,后者返回NULL。所以没有记录。

EDIT:

编辑:

In response to the valid concern about multiple valid records, and to make the code a little more maintenance freindly:

为了响应对多个有效记录的有效关注,并使代码更易于维护:

DECLARE @FNAME VARCHAR(50) = 'sam'
    , @LNAME VARCHAR(50) = 'smith'
    , @now DATETIME2(7) = GETDATE();

SELECT 
    name.[1st Name]
    , name.[last name]
    , name.entity
    , 
        (
            select 
                top 1 
                ta3.attribute
            FROM tblAccount ta3 
            WHERE 
                ta3.entity = name.entity
                and 
                ta3.attributetype = 3 
                and 
                ta3.enddate > @now
            ORDER BY 
                ta3.enddate 
        )
FROM 
    (        
        select 
            ta1.attribute '1st Name'
            , ta2.attribute 'last name'
            , ta.entity
            , ROW_NUMBER()
                OVER(
                    PARTITION BY 
                        ta1.entity
                    ORDER BY 
                        ta1.enddate
                    ) r
        from 
            tblAccount ta1
        inner join tblAccount ta2 
            on 
            ta1.entity = ta2.entity
            and 
            ta2. attributetype = 2
            and 
            ta2.enddate > @now
            and 
            ta2.attribute = @LNAME
        where 
            ta1.attributetype = 1 
            and 
            ta1.attribute = @fname 
            and 
            ta1.enddate > @now
    ) name
WHERE    
    NAME.r = 1

;

;

This code works around the implied assumptions of one entity per first/last name and exactly one enddate after the execution time. The variables are a little more stored proc friendly, and allow you to change the "as of" date. And if you're stuck with EAV, you will likely want stored procs. I'm taking the first record ending after the date in question, on the assumption that any later record(s) should only be valid after that one expires. Maybe it's overkill, since it's beyond the scope of the OP's question but it's a valid point.

这段代码围绕着一个实体的隐含假设,每个实体的姓/名,以及执行时间之后的一个字节。这些变量更适合存储proc,并且允许您更改“as of”日期。如果您被困在EAV中,您可能需要存储的proc。我取第一个记录,在问题的日期之后结束,假设以后的任何记录都应该在那个记录过期后有效。也许它是过量的,因为它超出了OP的范围,但它是一个有效的点。

I say "stuck with EAV". While EAV isn't always bad; neither is shooting someone in the back. In either case, you'd better have solid justification if you expect to get it past a jury. In a NoSQL storage pattern it's fine, but EAV is usually a poor implementation pattern for the RDBMS paradigm.

我说"坚持EAV"虽然EAV并不总是坏的;也没有人在背后开枪。无论哪种情况,如果你想让陪审团通过,你最好有充分的理由。在NoSQL存储模式中,这是可以的,但是对于RDBMS范式来说,EAV通常是一个糟糕的实现模式。

Though from the OP's later comment, it looks like he's hit on one of the better reasons.

尽管从OP后来的评论来看,他似乎找到了一个更好的理由。

#2


1  

Each attribute is actually a distinct entity in this model, but they are all sharing the same storage in the same physical table (Why?). This yields:

在这个模型中,每个属性实际上都是一个不同的实体,但是它们都在同一个物理表*享相同的存储(为什么?)这个收益率:

with data as (
   select entity = 1, attribute = 'ben',   attributeType=1, enddate = convert(datetime,'99990101') union all
   select entity = 1, attribute = 'alt',   attributeType=2, enddate = convert(datetime,'99990101') union all
   select entity = 1, attribute = '12345', attributeType=3, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = 'sam',   attributeType=1, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = 'smith', attributeType=2, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = '67890', attributeType=3, enddate = convert(datetime,'99990101') union all
   select entity = 2, attribute = '68790', attributeType=3, enddate = convert(datetime,'20130331') union all
   select entity = 2, attribute = '876', attributeType=3, enddate = convert(datetime,'19810101') 
) 
select top 1
    FirstName, LastName, AccountNum
from (
  select top 1 
    a1.entity, FirstName, LastName
  from (
    select entity, enddate, attribute as FirstName
    from data d 
    where d.enddate >= getdate()
      and attributeType = 1
  ) a1
  join (
    select entity, enddate, attribute as LastName
    from data 
    where enddate >= getdate()
      and attributeType = 2
  ) a2 on a1.entity = a2.entity
     and a1.enddate = a2.enddate
  where FirstName = 'sam' and LastName = 'smith'
    and a1.enddate >= getdate() and a2.enddate >= getdate()
  order by a1.enddate
) E
left join (
  select entity, enddate, attribute as AccountNum
  from data 
  where enddate >= getdate()
    and attributeType = 3
) a3 on a3.entity = E.entity
order by a3.enddate

returning:

返回:

FirstName LastName AccountNum
--------- -------- ----------
sam       smith    68790

Note that it is quite common for accountign departments, at least, to enter future transactions during quiet times of the month, especially if those transactions will take effect during busy times of the month (ie month-end). The same for annual transactions. One should not assume that only one record can exist with an expiry > getdate().

请注意,会计部门至少在每月的安静时间内输入未来的交易是很常见的,特别是如果这些交易将在每月的繁忙时间生效(即月末)。年度交易也是如此。我们不应该假设只有一个记录可以在过期> getdate()中存在。