I have a table named 'Attendance' which records when students attend courses. The table has 4 columns: 'id', 'course_id', 'attendance_time', and 'student_name'. A few example records from this table:
23 100 1/1/2010 10:00:00 Tom
24 100 1/1/2010 10:20:00 Bob
25 187 1/2/2010 08:01:01 Lisa
.....
I want to create a summary of the latest attendance time for each course. I wrote the query below:
SELECT course_id, max(attendance_time) FROM attendance GROUP BY course_id
The result would be something like this:
100 1/1/2010 10:20:00
187 1/2/2010 08:01:01
Now, all I want to do is add the 'id' column to the result above. How can I do that?
I can't just change the command to something like this:
SELECT id, course_id, max(attendance_time) FROM attendance GROUP BY id, course_id
because 'id' is unique per row, so it would return all the records as if the aggregate function were not used. Please help me.
2 solutions
#1
5
This is a typical 'greatest per group', 'greatest-n-per-group' or 'groupwise maximum' query that comes up on Stack Overflow almost every day. You can search Stack Overflow for these terms to find many different examples of how to solve this with different databases. One way to solve it is as follows:
SELECT
    T2.course_id,
    T2.attendance_time,
    T2.id
FROM (
    SELECT
        course_id,
        MAX(attendance_time) AS attendance_time
    FROM attendance
    GROUP BY course_id
) T1
JOIN attendance T2
    ON T1.course_id = T2.course_id
    AND T1.attendance_time = T2.attendance_time
Note that this query can in theory return multiple rows per course_id if there are multiple rows with the same attendance_time. If that cannot happen then you don't need to worry about this issue. If this is a potential problem then you can solve this by adding an extra grouping on course_id, attendance_time and selecting the minimum or maximum id.
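The tie case and the extra grouping described above can be sketched as follows. This is only an illustration, not part of the original answer: it assumes an in-memory SQLite database driven from Python, stores attendance_time as ISO-8601 text so that MAX() orders chronologically, and adds a hypothetical row (id 26, 'Ann') that ties with id 24.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance ("
    "id INTEGER PRIMARY KEY, course_id INTEGER, "
    "attendance_time TEXT, student_name TEXT)"
)
# Assumption: timestamps are ISO-8601 strings, so text comparison is chronological.
rows = [
    (23, 100, "2010-01-01 10:00:00", "Tom"),
    (24, 100, "2010-01-01 10:20:00", "Bob"),
    (25, 187, "2010-01-02 08:01:01", "Lisa"),
    (26, 100, "2010-01-01 10:20:00", "Ann"),  # hypothetical row that ties with id 24
]
conn.executemany("INSERT INTO attendance VALUES (?, ?, ?, ?)", rows)

# Inner query shared by both variants: latest attendance_time per course.
LATEST = """
    SELECT course_id, MAX(attendance_time) AS attendance_time
    FROM attendance
    GROUP BY course_id
"""

# The join from the answer: returns every row that matches the latest time.
JOIN_QUERY = f"""
SELECT T2.course_id, T2.attendance_time, T2.id
FROM ({LATEST}) T1
JOIN attendance T2
    ON T1.course_id = T2.course_id
    AND T1.attendance_time = T2.attendance_time
"""

# Tie-broken variant: group again and keep the largest id per course and time.
TIE_BROKEN_QUERY = f"""
SELECT T2.course_id, T2.attendance_time, MAX(T2.id) AS id
FROM ({LATEST}) T1
JOIN attendance T2
    ON T1.course_id = T2.course_id
    AND T1.attendance_time = T2.attendance_time
GROUP BY T2.course_id, T2.attendance_time
ORDER BY T2.course_id
"""

plain = sorted(conn.execute(JOIN_QUERY).fetchall())
tie_broken = conn.execute(TIE_BROKEN_QUERY).fetchall()

print(plain)       # course 100 appears twice (ids 24 and 26)
print(tie_broken)  # one row per course; course 100 keeps id 26
```

Using MIN(T2.id) instead of MAX(T2.id) would keep the earliest-inserted of the tied rows; which one is right depends on what the 'id' is needed for.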
#2
0
What do you need the additional column for? The result already has a course ID, which identifies the data. A synthetic ID added to the query would be useless because it would not refer to anything. If you want the maximum for a single course, you can add a WHERE condition before the GROUP BY, like this:
SELECT course_id, max(attendance_time) FROM attendance WHERE course_id = your_id_here GROUP BY course_id;
If you mean that the column should be named 'id', you can alias it in the query:
SELECT course_id AS id, max(attendance_time) FROM attendance GROUP BY course_id;
You could make a view out of your query to easily access the aggregate data:
CREATE VIEW max_course_times AS SELECT course_id AS id, max(attendance_time) FROM attendance GROUP BY course_id;
SELECT * FROM max_course_times;