MySQL unique index from a group of columns

Date: 2021-04-14 04:25:08

I am creating an application that inserts (or updates) values in MySQL daily. A simplified recordset, with headers, is:

ItemName,ItemNumber,ItemQty,Date
test1,1,5,2016/01/01
test1,1,3,2016/01/02
test2,2,7,2016/01/01
test2,2,5,2016/01/02

Using a simple INSERT statement, importing the above recordset with its 16 columns and 216,000 records takes about 4 minutes (PHP/MySQL). That covers a week of values. Of course, if I import the same recordset again I get duplicates. I am trying to find a way to effectively disallow duplicate entries. The aim: if I import a recordset every day that contains the dates for the current week, only the rows for the new dates should be added.

The only thing that might change between consecutive imports is ItemQty. In PHP I wrote logic that queries the database for the ItemName, ItemNumber and Date of the row I am about to insert. If the SELECT returns a result, I break; if it doesn't, I proceed to insert a new row. The problem is that with this check the import no longer takes 4 minutes but a couple of hours. (It works, though.)

Any ideas?

I was thinking that when I insert, I could also store something like a checksum column, for example md5(ItemName,ItemNumber,ItemQty,Date), and then check this checksum rather than running the SELECT * FROM $table WHERE ItemName = value AND ItemNumber = value AND ItemQty = value AND Date = value that I currently have.

My problem is that the records I insert basically have nothing unique. Uniqueness only comes from a group of fields, when compared against the dataset being imported. If I somehow manage to get uniqueness, I'll solve my other problem too, which is deleting or updating a row when the ItemQty changes.

2 answers

#1


What you are looking for is a unique constraint. You can put all of your identifying columns into one constraint, and if a row with the same values in all of those columns already exists, the insert will not proceed.
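
As a rough sketch of what such a constraint could look like for the question's data: the table name `items`, the key name `uq_item_day` and the `$pdo` connection are assumptions, and the column choice assumes that ItemName, ItemNumber and Date together identify a row. ItemQty is left out of the key because it is the value that changes between imports.

```php
<?php
// Assumption: the table is called `items`; column names come from the question.
// `Date` is quoted with backticks because it is also the name of a MySQL type.
$sql = 'ALTER TABLE items '
     . 'ADD UNIQUE KEY uq_item_day (ItemName, ItemNumber, `Date`)';

// Assumption: $pdo is an already opened PDO connection to the database.
// $pdo->exec($sql);

echo $sql, PHP_EOL;
```

Any duplicate rows already in the table have to be cleaned up first, otherwise MySQL rejects the ALTER TABLE with a duplicate-entry error.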

#2


A few options:

1) In PHP, iterate over the records, detect the duplicates, and keep only the newest one of each

$itemsArray = []; // The array where you have stored your data

$uniqueItems = []; // Newest record seen so far, indexed by ItemName

foreach($itemsArray as $item)
{
    if(isset($uniqueItems[$item['ItemName']]))
    {
        $oldRecord = $uniqueItems[$item['ItemName']];

        $newTimeStamp = strtotime($item['Date']); // strtotime() might not parse your date format
        $currentTimeStamp = strtotime($oldRecord['Date']);

        // Replace the stored record only if this one is more recent
        if($newTimeStamp > $currentTimeStamp)
        {
            $uniqueItems[$item['ItemName']] = $item;
        }
    }
    else
    {
        // First record seen for this ItemName
        $uniqueItems[$item['ItemName']] = $item;
    }
}

// $uniqueItems now holds only 1 record per ItemName (the newest one)
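
The loop above keeps one record per ItemName. If one row per item per date is wanted instead (the question treats ItemName, ItemNumber and Date together as what identifies a row), a variation could index the array by a combined key. This is only a sketch: the `|` separator and the sample rows are illustrative assumptions, and the separator must not occur in the values themselves.

```php
<?php
// Sample rows modelled on the question; the third row repeats the key of the
// second one with a changed ItemQty (illustrative data).
$itemsArray = [
    ['ItemName' => 'test1', 'ItemNumber' => 1, 'ItemQty' => 5, 'Date' => '2016-01-01'],
    ['ItemName' => 'test1', 'ItemNumber' => 1, 'ItemQty' => 3, 'Date' => '2016-01-02'],
    ['ItemName' => 'test1', 'ItemNumber' => 1, 'ItemQty' => 4, 'Date' => '2016-01-02'],
];

$uniqueItems = [];

foreach ($itemsArray as $item) {
    // Combined key built from the three identifying columns
    $key = $item['ItemName'] . '|' . $item['ItemNumber'] . '|' . $item['Date'];

    // A later row with the same key overwrites the earlier one
    $uniqueItems[$key] = $item;
}

// Drop the helper keys so a plain list of rows remains
$uniqueItems = array_values($uniqueItems);
```

With this approach no date comparison is needed, because the date is part of the key; which ItemQty survives depends only on the order of the rows in the input.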

2) Sort the data in PHP by date in ascending order (before inserting it into the database). Then, in your INSERT statement, use ON DUPLICATE KEY UPDATE. This makes MySQL update the existing row whenever an insert hits a duplicate unique or primary key. Because the older records are inserted first and the latest records last, the latest data overwrites the old records' data.
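
A minimal sketch of option 2, assuming the table is called `items` and already has a unique key on (ItemName, ItemNumber, Date). The helper `buildUpsert()` and the `$pdo` connection are hypothetical names, not part of the original code. It builds a single multi-row statement, which also avoids issuing one query per record.

```php
<?php
/**
 * Build one multi-row INSERT ... ON DUPLICATE KEY UPDATE statement
 * together with its bound parameters.
 * Hypothetical helper; `items` is an assumed table name.
 */
function buildUpsert(array $rows): array
{
    $placeholders = [];
    $params = [];

    foreach ($rows as $row) {
        $placeholders[] = '(?, ?, ?, ?)';
        array_push($params, $row['ItemName'], $row['ItemNumber'], $row['ItemQty'], $row['Date']);
    }

    // On a key collision only ItemQty is overwritten with the incoming value
    $sql = 'INSERT INTO items (ItemName, ItemNumber, ItemQty, `Date`) VALUES '
         . implode(', ', $placeholders)
         . ' ON DUPLICATE KEY UPDATE ItemQty = VALUES(ItemQty)';

    return [$sql, $params];
}

$rows = [
    ['ItemName' => 'test1', 'ItemNumber' => 1, 'ItemQty' => 5, 'Date' => '2016-01-01'],
    ['ItemName' => 'test1', 'ItemNumber' => 1, 'ItemQty' => 3, 'Date' => '2016-01-02'],
];

[$sql, $params] = buildUpsert($rows);

// Assumption: $pdo is an already opened PDO connection to the database.
// $pdo->prepare($sql)->execute($params);
```

For a large import the rows would probably need to be sent in batches, for example with array_chunk($rows, 1000), so that a single statement does not grow too big. Newer MySQL 8.0 releases deprecate VALUES() in this position in favour of a row alias, so the exact form may need adjusting to the server version.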
