SQL INSERT INTO SELECT并将SELECT数据返回到Create Row View Counts

So I'm creating a system that will be pulling 50-150 records at a time from a table and display them to the user, and I'm trying to keep a view count for each record.

所以我正在创建一个系统,它将从表中一次拉出50-150条记录并将它们显示给用户,并且我正在尝试为每条记录保留一个视图计数。

I figured the most efficient way would be to create a MEMORY table that I use an INSERT INTO to pull the IDs of the rows into and then have a cron function that runs regularly to aggregate the view ID counts and clears out the memory table, updating the original one with the latest view counts. This avoids constantly updating the table that'll likely be getting accessed the most, so I'm not locking 150 rows at a time with each query(or the whole table if I'm using MyISAM).

我认为最有效的方法是创建一个MEMORY表,我使用INSERT INTO将行的ID拉入,然后有一个cron函数定期运行以聚合视图ID计数并清除内存表,更新具有最新视图的原始计数。这样可以避免不断更新最可能被访问的表,因此我不会在每次查询时一次锁定150行(如果我使用的是MyISAM,则不是整个表)。

Basically, the method explained here.

基本上,这里解释的方法。

However, I would of course like to do this at the same time as I pull the records information for viewing, and I'd like to avoid running a second, separate query just to get the same set of data for its counts.

但是,我当然希望在我提取记录信息以供查看的同时执行此操作,并且我希望避免运行第二个单独的查询,以便为其计数获取相同的数据集。

Is there any way to SELECT a dataset, return that dataset, and simultaneously insert a single column from that dataset into another table?

有没有办法选择数据集,返回该数据集,同时将该数据集中的单个列插入另一个表?

It looks like PostgreSQL might have something similar to what I want with the RETURNING keyword, but I'm using MySQL.

看起来PostgreSQL可能有类似于我想要的RETURNING关键字,但我正在使用MySQL。

2 个解决方案

#1

First of all, I would not add a counter column to the Main table. I would create a separate Audit table that would hold ID of the item from the Main table plus at least timestamp when that ID was requested. In essence, Audit table would store a history of requests. In this approach you can easily generate much more interesting reports. You can always calculate grand totals per item and also you can calculate summaries by day, week, month, etc per item or across all items. Depending on the volume of data you can periodically delete Audit entries older than some threshold (a month, a year, etc).

首先,我不会在Main表中添加计数器列。我将创建一个单独的Audit表,该表将保留Main表中项的ID以及请求该ID时的至少时间戳。实质上,Audit表将存储请求的历史记录。在这种方法中,您可以轻松生成更有趣的报告。您始终可以计算每个项目的总计,还可以按每个项目或所有项目按天,周,月等计算摘要。根据数据量,您可以定期删除超过某个阈值(一个月,一年等)的审核条目。

Also, you can easily store more information in Audit table as needed, for example, user ID to calculate stats per user.

此外,您可以根据需要在Audit表中轻松存储更多信息,例如,用户ID来计算每个用户的统计信息。

To populate Audit table "automatically" I would create a stored procedure. The client code would call this stored procedure instead of performing the original SELECT. Stored procedure would return exactly the same result as original SELECT does, but would also add necessary details to the Audit table transparently to the client code.

要“自动”填充Audit表,我将创建一个存储过程。客户端代码将调用此存储过程而不是执行原始SELECT。存储过程将返回与原始SELECT完全相同的结果,但也会向Audit表透明地向客户端代码添加必要的详细信息。

So, let's assume that Audit table looks like this:

所以,我们假设Audit表看起来像这样:

CREATE TABLE AuditTable
(
    ID int 
    IDENTITY -- SQL Server
    SERIAL -- Postgres
    AUTO_INCREMENT -- MySQL
    NOT NULL,
    ItemID int NOT NULL,
    RequestDateTime datetime NOT NULL
)

and your main SELECT looks like this:

你的主要SELECT看起来像这样:

SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>

To perform both INSERT and SELECT in one statement in SQL Server I'd use OUTPUT clause, in Postgres - RETURNING clause, in MySQL - ??? I don't think it has anything like this. So, MySQL procedure would have several separate statements.

要在SQL Server中的一个语句中执行INSERT和SELECT,我将使用OUTPUT子句,在Postgres - RETURNING子句中,在MySQL中 - ???我不认为它有这样的东西。所以,MySQL程序会有几个单独的语句。

MySQL

At first do your SELECT and insert results into a temporary (possibly memory) table. Then copy item IDs from temporary table into Audit table. Then SELECT from temporary table to return result to the client.

首先执行SELECT并将结果插入到临时(可能是内存)表中。然后将项目ID从临时表复制到Audit表中。然后从临时表中选择SELECT以将结果返回给客户端。

CREATE TEMPORARY TABLE TempTable
(
    ItemID int NOT NULL,
    Col1 ..., 
    Col2 ..., 
    ...
)
ENGINE = MEMORY
SELECT ItemID, Col1, Col2, ...
FROM MainTable
WHERE <complex criteria>
;

INSERT INTO AuditTable (ItemID, RequestDateTime)
SELECT ItemID, NOW()
FROM TempTable;

SELECT ItemID, Col1, Col2, ...
FROM TempTable
ORDER BY ...;

SQL Server (just to tease you. this single statement does both INSERT and SELECT)

SQL Server(只是为了取笑你。这个单个语句同时执行INSERT和SELECT)

MERGE INTO AuditTable
USING
(
    SELECT ItemID, Col1, Col2, ...
    FROM MainTable
    WHERE <complex criteria>
) AS Src
ON 1 = 0
WHEN NOT MATCHED BY TARGET THEN
INSERT
    (ItemID, RequestDateTime)
VALUES
    (Src.ItemID, GETDATE())
OUTPUT
    Src.ItemID, Src.Col1, Src.Col2, ...
;

You can leave Audit table as it is, or you can set up cron to summarize it periodically. It really depends on the volume of data. In our system we store individual rows for a week, plus we summarize stats per hour and keep it for 6 weeks, plus we keep daily summary for 18 months. But, important part, all these summaries are separate Audit tables, we don't keep auditing information in the Main table, so we don't need to update it.

您可以按原样保留Audit表,也可以设置cron以定期对其进行汇总。这实际上取决于数据量。在我们的系统中,我们将个别行存储一周,此外我们总结每小时的统计数据并保持6周,此外我们将每日摘要保存18个月。但是,重要的是,所有这些摘要都是单独的审计表,我们不会在主表中保留审计信息,因此我们不需要更新它。

Joe Celko explained it very well in SQL Style Habits: Attack of the Skeuomorphs:

Joe Celko在SQL Style Habits:Skeuomorphs的攻击中解释得非常好:

Now go to any SQL Forum text search the postings. You will find thousands of postings with DDL that include columns named createdby, createddate, modifiedby and modifieddate with that particular meta data on the end of the row declaration. It is the old mag tape header label written in a new language! Deja Vu!

现在去任何SQL论坛文本搜索帖子。您将在DDL中找到数千个帖子,这些帖子包含名为createdby,createddate,modifiedby和modifieddate的列,这些列包含在行声明末尾的特定元数据。这是用新语言编写的旧磁带标题标签! Deja Vu!

The header records appeared only once on a tape. But these meta data values appear over and over on every row in the table. One of the main reasons for using databases (not just SQL) was to remove redundancy from the data; this just adds more redundancy. But now think about what happens to the audit trail when a row is deleted? What happens to the audit trail when a row is updated? The trail is destroyed. The audit data should be separated from the schema. Would you put the log file on the same disk drive as the database? Would an accountant let the same person approve and receive a payment?

标题记录仅在磁带上出现一次。但是这些元数据值在表中的每一行上反复出现。使用数据库(不仅仅是SQL)的主要原因之一是从数据中删除冗余;这只会增加更多冗余。但是现在考虑删除行时审计跟踪会发生什么?更新行时审计跟踪会发生什么?小径被摧毁。审计数据应与架构分开。你会把日志文件放在与数据库相同的磁盘驱动器上吗?会计师会让同一个人批准并收到付款吗?

#2

You're kind of asking if MySQL supports a SELECT trigger. It doesn't. You'll need to do this as two queries, however you can stick those inside a stored procedure - then you can pass in the range you're fetching, have it both return the results AND do the INSERT into the other table.

你有点问MySQL是否支持SELECT触发器。它没有。你需要将它作为两个查询来执行,但是你可以将它们放在存储过程中 - 然后你可以传入你正在获取的范围,让它返回结果并将INSERT插入到另一个表中。

Updated answer with skeleton example for stored procedure:

更新了存储过程的骨架示例的答案:

DELIMITER $$
CREATE PROCEDURE `FetchRows`(IN StartID INT, IN EndID INT)
BEGIN
    UPDATE Blah SET ViewCount = ViewCount+1 WHERE id >= StartID AND id <= EndID;
    # ^ Assumes counts are stored in the same table. If they're in a seperate table, do an INSERT INTO ... ON DUPLICATE KEY UPDATE ViewCount = ViewCount+1 instead.
    SELECT * FROM Blah WHERE id >= StartID AND id <= EndID;
END$$
DELIMITER ;

#1