SQL: select the earliest date for multiple rows

Time: 2022-04-16 08:54:53

I have a table that looks like the following:

circuit_uid   |  customer_name   | location      | reading_date | reading_time | amps | volts  |  kw  | kwh | kva  |  pf  |  key
--------------------------------------------------------------------------------------------------------------------------------------
cu1.cb1.r1    | Customer 1       | 12.01.a1      | 2012-01-02   | 00:01:01     | 4.51 | 229.32 | 1.03 |  87 | 1.03 | 0.85 |    15
cu1.cb1.r1    | Customer 1       | 12.01.a1      | 2012-01-02   | 01:01:01     | 4.18 |  230.3 | 0.96 |  90 | 0.96 | 0.84 |    16
cu1.cb1.s2    | Customer 2       | 10.01.a1      | 2012-01-02   | 00:01:01     | 7.34 | 228.14 | 1.67 | 179 | 1.67 | 0.88 | 24009
cu1.cb1.s2    | Customer 2       | 10.01.a1      | 2012-01-02   | 01:01:01     | 9.07 |  228.4 | 2.07 | 182 | 2.07 | 0.85 | 24010
cu1.cb1.r1    | Customer 3       | 01.01.a1      | 2012-01-02   | 00:01:01     | 7.32 | 229.01 | 1.68 | 223 | 1.68 | 0.89 | 48003 
cu1.cb1.r1    | Customer 3       | 01.01.a1      | 2012-01-02   | 01:01:01     | 6.61 | 228.29 | 1.51 | 226 | 1.51 | 0.88 | 48004

What I am trying to do is produce a result that has the kWh reading for each customer from the earliest time (min(reading_time)) on a given date. The date will be selected by the user in a web form.

The result should look similar to:

Customer 1   87
Customer 2   179
Customer 3   223

There are more rows per day than shown here, and there are more customers. The number of customers changes regularly.

I do not have much experience with SQL. I have looked at subqueries etc., but I do not have the chops to figure out how to arrange it by the earliest reading per customer and then output just the kwh column.

This is running in PostgreSQL 8.4 on Redhat/CentOS.

3 Answers

#1


3  

select customer_name,
       kwh,
       reading_date, 
       reading_time
from (
   select customer_name,
          kwh,
          reading_time,
          reading_date,
          row_number() over (partition by customer_name order by reading_time) as rn
   from readings
   where reading_date = date '2012-11-17'
) t
where rn = 1

As an alternative:

select r1.customer_name,
       r1.kwh, 
       r1.reading_date,
       r1.reading_time
from readings r1
where r1.reading_date = date '2012-11-17'
and r1.reading_time = (select min(r2.reading_time)
                       from readings r2
                       where r2.customer_name = r1.customer_name
                       and r2.reading_date = r1.reading_date);

But I'd expect the first one to be faster.

Btw: why do you store date and time in two separate columns? Are you aware that this could be handled better with a timestamp column?
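To illustrate the point about a single timestamp, here is a minimal Python sketch (not PostgreSQL) that combines the separate date and time fields into one sortable value. The third row is a hypothetical reading on the following day, added only to show what goes wrong when ordering by the time field alone across more than one date.

```python
from datetime import date, time, datetime

# (customer_name, reading_date, reading_time, kwh); the first two rows come
# from the question, the third is a hypothetical reading on the next day.
rows = [
    ("Customer 1", date(2012, 1, 2), time(1, 1, 1), 90),
    ("Customer 1", date(2012, 1, 2), time(0, 1, 1), 87),
    ("Customer 1", date(2012, 1, 3), time(0, 0, 30), 110),  # hypothetical
]

def reading_ts(row):
    """Combine the separate date and time fields into one sortable timestamp."""
    _, d, t, _ = row
    return datetime.combine(d, t)

# Ordering by the combined timestamp finds the truly earliest reading.
earliest = min(rows, key=reading_ts)

# Ordering by the time field alone picks the next day's 00:00:30 reading.
earliest_by_time_only = min(rows, key=lambda r: r[2])

print(earliest[3], reading_ts(earliest).isoformat())
print(earliest_by_time_only[3])
```

With a single timestamp value there is only one ordering key, so "earliest reading" stays well defined even when a query spans several days.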

#2


3  

This should be among the fastest possible solutions:

SELECT DISTINCT ON (customer_name)
       customer_name, kwh  -- add more columns as needed.
FROM   readings
WHERE  reading_date = user_date
ORDER  BY customer_name, reading_time
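For readers unfamiliar with DISTINCT ON, the following Python sketch mimics what the query does: sort the rows by customer and reading time, then keep only the first row of each customer. It is an illustration of the semantics using sample values from the question, not how PostgreSQL executes the query; the function name `distinct_on_customer` is made up for this example.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question, deliberately shuffled:
# (customer_name, reading_time, kwh)
readings = [
    ("Customer 2", "01:01:01", 182),
    ("Customer 1", "01:01:01", 90),
    ("Customer 3", "00:01:01", 223),
    ("Customer 1", "00:01:01", 87),
    ("Customer 2", "00:01:01", 179),
    ("Customer 3", "01:01:01", 226),
]

def distinct_on_customer(rows):
    """Keep the first row per customer after sorting by (customer_name, reading_time)."""
    # Equivalent of: ORDER BY customer_name, reading_time
    ordered = sorted(rows, key=itemgetter(0, 1))
    # Equivalent of: DISTINCT ON (customer_name), i.e. first row of each group
    return [next(group) for _, group in groupby(ordered, key=itemgetter(0))]

first_rows = distinct_on_customer(readings)
result = [(name, kwh) for name, _, kwh in first_rows]
print(result)
```

The leading ORDER BY expression has to match the DISTINCT ON expression, which is why the sort key starts with the customer name before the reading time.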

Seems to be another application of:

#3


0  

   SELECT rt.circuit_uid, rt.customer_name, rt.kwh
   FROM READING_TABLE rt JOIN
       (SELECT customer_name, MIN(reading_time) AS reading_time
        FROM READING_TABLE
        WHERE reading_date = '2012-01-02'
        GROUP BY customer_name) min_time
   ON (rt.customer_name = min_time.customer_name
      AND rt.reading_time = min_time.reading_time)
   WHERE rt.reading_date = '2012-01-02';

Parameterize the reading_date value in the above query.
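As a rough sketch of what parameterizing could look like from application code, the example below binds the date as a query parameter instead of pasting it into the SQL text. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, which is an assumption made only to keep the example self-contained; the table name `readings`, the reduced column set, and the function name `earliest_kwh` are likewise illustrative. With a PostgreSQL driver the placeholder syntax may differ from `?`.

```python
import sqlite3

def earliest_kwh(conn, reading_date):
    """Return (customer_name, kwh) for each customer's earliest reading on reading_date."""
    sql = """
        SELECT rt.customer_name, rt.kwh
        FROM readings rt
        JOIN (SELECT customer_name, MIN(reading_time) AS reading_time
              FROM readings
              WHERE reading_date = ?
              GROUP BY customer_name) min_time
          ON rt.customer_name = min_time.customer_name
         AND rt.reading_time = min_time.reading_time
        WHERE rt.reading_date = ?
        ORDER BY rt.customer_name
    """
    # The date is passed as a bound parameter, never concatenated into the SQL.
    return conn.execute(sql, (reading_date, reading_date)).fetchall()

# In-memory demo table holding a subset of the question's columns and rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings "
    "(customer_name TEXT, reading_date TEXT, reading_time TEXT, kwh INTEGER)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?)",
    [
        ("Customer 1", "2012-01-02", "00:01:01", 87),
        ("Customer 1", "2012-01-02", "01:01:01", 90),
        ("Customer 2", "2012-01-02", "00:01:01", 179),
        ("Customer 2", "2012-01-02", "01:01:01", 182),
        ("Customer 3", "2012-01-02", "00:01:01", 223),
        ("Customer 3", "2012-01-02", "01:01:01", 226),
    ],
)

rows = earliest_kwh(conn, "2012-01-02")
print(rows)
```

Binding the value this way matters here because the date comes from a web form, so it should be treated as untrusted input.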
