I've almost finished a project involving customers and products and only identified at the end that we have duplicate records due to keying errors, where sales staff have added the same customer to the database more than once.
What I need to do is identify the duplicate records by comparing Customer name and Postcode, then merge their Products so that the updated Products field contains every product that applies to that customer, leaving only one record per customer.
In order to explain this, I have put together a small example.
DROP TABLE IF EXISTS `tblProducts`;
CREATE TABLE `tblProducts` (
`ID` int(10) DEFAULT NULL,
`Customer` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`Postcode` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`Products` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `tblProducts` VALUES ('1', 'Bradford', 'BR1 2HJ', '111&222&444');
INSERT INTO `tblProducts` VALUES ('2', 'Bradford', 'BR1 2HJ', '222');
INSERT INTO `tblProducts` VALUES ('3', 'Tanner', 'TE4 9PO', '777&333');
INSERT INTO `tblProducts` VALUES ('4', 'Smythe', 'SM3 8KO', '111&222');
INSERT INTO `tblProducts` VALUES ('5', 'Francis', 'FL2 6HG', '444&333');
INSERT INTO `tblProducts` VALUES ('6', 'Tanner', 'TE4 9PO', '555');
INSERT INTO `tblProducts` VALUES ('7', 'Peters', 'PE4 4PE', '444');
INSERT INTO `tblProducts` VALUES ('8', 'Jeffrey', 'JE9 4JK', '444&555&888');
INSERT INTO `tblProducts` VALUES ('9', 'Barnes', 'BA5 5AB', '999');
INSERT INTO `tblProducts` VALUES ('10', 'Smythe', 'SM1 4GE', '888&777&222');
If we run the following query, you will see that we have two duplicates, for Bradford and Tanner.
SELECT Customer, Postcode, COUNT(*) FROM tblProducts group by Customer, Postcode having count(*) > 1
Customer  Postcode  COUNT(*)
Bradford  BR1 2HJ   2
Tanner    TE4 9PO   2
The separate duplicate records are:
Customer  Postcode  Products
Bradford  BR1 2HJ   111&222&444
Bradford  BR1 2HJ   222
Tanner    TE4 9PO   777&333
Tanner    TE4 9PO   555
I need to run a MySQL query to 'merge products where customer and postcode count > 1' as above, so the end result will be:
Customer  Postcode  Products
Bradford  BR1 2HJ   111&222&444
Tanner    TE4 9PO   777&333&555
Note that 222 appears only once in the first record, because it was already present. The duplicate record should then be removed from the MySQL table so that only one record per customer remains.
I must admit, I had assumed this would be easy for MySQL to achieve. I have spent ages researching merging rows, merging fields and removing duplicates, and have not found anything that seems to specifically help.
Link to SQL Fiddle if it helps: http://sqlfiddle.com/#!9/966550/4/0
Can anyone help, please? I am stuck.
Many thanks,
Rob
1 solution
#1
SELECT TP.Customer, TP.Postcode, TP.Products
FROM tblProducts TP
INNER JOIN
(
    SELECT MIN(ID) AS ID FROM tblProducts GROUP BY Customer, Postcode
) INNERTABLE ON INNERTABLE.ID = TP.ID
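You can try the query above. It keeps the row with the lowest ID for each Customer and Postcode pair. Note that it returns that row's Products as stored rather than merging them, so with the sample data Tanner would come back as 777&333 without 555.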
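For the merge step itself, here is a rough sketch done in Python outside the database rather than in SQL. That choice is an assumption on my part, not something the question asks for. The names `rows` and `merge_duplicates` are hypothetical, and the sample rows are copied from the question. It groups on Customer and Postcode, keeps the lowest ID, and joins the product codes with each code appearing once, in first-seen order.

```python
# Sample rows copied from the question: (ID, Customer, Postcode, Products)
rows = [
    (1, "Bradford", "BR1 2HJ", "111&222&444"),
    (2, "Bradford", "BR1 2HJ", "222"),
    (3, "Tanner", "TE4 9PO", "777&333"),
    (4, "Smythe", "SM3 8KO", "111&222"),
    (5, "Francis", "FL2 6HG", "444&333"),
    (6, "Tanner", "TE4 9PO", "555"),
    (7, "Peters", "PE4 4PE", "444"),
    (8, "Jeffrey", "JE9 4JK", "444&555&888"),
    (9, "Barnes", "BA5 5AB", "999"),
    (10, "Smythe", "SM1 4GE", "888&777&222"),
]


def merge_duplicates(rows, sep="&"):
    """Return one (id, customer, postcode, products) tuple per Customer/Postcode pair.

    Hypothetical helper: the kept id is the lowest one in each group, and
    product codes are de-duplicated while preserving first-seen order.
    """
    merged = {}  # plain dicts preserve insertion order in Python 3.7+
    # Sorting by the tuple puts the lowest ID first, so the first row seen
    # for each key supplies the surviving id.
    for row_id, customer, postcode, products in sorted(rows):
        entry = merged.setdefault((customer, postcode), {"id": row_id, "codes": []})
        for code in products.split(sep):
            if code not in entry["codes"]:
                entry["codes"].append(code)
    return [
        (entry["id"], customer, postcode, sep.join(entry["codes"]))
        for (customer, postcode), entry in merged.items()
    ]


if __name__ == "__main__":
    for record in merge_duplicates(rows):
        print(record)
```

The two Smythe rows stay separate because their postcodes differ. Writing the merged values back (an UPDATE on the surviving ID and a DELETE of the others) is not shown here. Also bear in mind that Products is declared as varchar(30), so a longer merged string might not fit without widening the column.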