I have a 1,000,000-row .csv file that I uploaded into a table using MySQL Workbench, but I forgot to make the dates YYYY-MM-DD before I started, so they all uploaded as 0000-00-00.
It took almost 8 hours to upload the million records, so I'd REALLY like to not have to do it all over again, but I can't figure out if there's a way for me to replace JUST that one column of data from the same file I originally uploaded, now that I've changed the dates to be in the correct format.
Does anyone know if this is possible?
Edit
It's WAY too long to post everything, but here's the SHOW CREATE TABLE output with some of the meat taken out:
CREATE TABLE `myTable` (
  `lineID` int(11) NOT NULL AUTO_INCREMENT,
  `1` varchar(50) DEFAULT NULL,
  `2` varchar(1) DEFAULT NULL,
  `3` int(4) DEFAULT NULL,
  `4` varchar(20) DEFAULT NULL,
  `DATE` date DEFAULT NULL,
  PRIMARY KEY (`lineID`)
) ENGINE=InnoDB AUTO_INCREMENT=634205 DEFAULT CHARSET=utf8
Version is 5.6.20
1 Answer
Ok. I would recommend using LOAD DATA INFILE explicitly. For those who have not used it, think of it as just a select statement for now, until you see it in action.
Here is a nice article on performance and strategies titled Testing the Fastest Way to Import a Table into MySQL. Don't let the MySQL version mentioned in the title or in the article scare you away. Jumping to the bottom and picking up some conclusions:
The fastest way you can import a table into MySQL without using raw files is the LOAD DATA syntax. Use parallelization for InnoDB for better results, and remember to tune basic parameters like your transaction log size and buffer pool. Careful programming and importing can make a >2-hour problem became a 2-minute process. You can disable temporarily some security features for extra performance
There are also fine points in there, mainly in the back-and-forth peer comments about secondary indexes (which you do not have). The important point for others is to add secondary indexes after the data is loaded, not before.
I hope these links are useful. And your data comes in ... in 10 minutes (in another test table with LOAD DATA INFILE).
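To make the "another test table" idea concrete, here is a rough sketch of how the date column alone might be refreshed. Everything specific in it is an assumption, not something taken from the question: the staging table name myTable_dates, the file path, the comma delimiter and header row, the number of fields per line (the question's table definition was trimmed), and above all the idea that lineID in myTable runs 1, 2, 3, ... in the same order as the lines of the file. If the file carries a real key, join on that instead.

```sql
-- Hypothetical staging table: one row per CSV line, numbered in file order.
CREATE TABLE myTable_dates (
  lineID INT NOT NULL AUTO_INCREMENT,
  fixed_date DATE DEFAULT NULL,
  PRIMARY KEY (lineID)
) ENGINE=InnoDB;

-- Read the corrected file, keep only the date field, and throw the other
-- fields away by parking them in user variables.
-- ASSUMPTIONS: path, delimiters, one header line, and five fields per line
-- with the date last. Adjust the variable list to the real file layout.
LOAD DATA LOCAL INFILE '/path/to/corrected.csv'
INTO TABLE myTable_dates
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@c1, @c2, @c3, @c4, @d)
SET fixed_date = STR_TO_DATE(@d, '%Y-%m-%d');

-- Sanity check before touching the real table: the counts should match.
SELECT COUNT(*) FROM myTable_dates;
SELECT COUNT(*) FROM myTable;

-- Copy the dates across by joining on the row number.
-- ASSUMPTION: myTable.lineID matches the file's line order with no gaps.
UPDATE myTable AS t
JOIN myTable_dates AS s ON s.lineID = t.lineID
SET t.`DATE` = s.fixed_date;
```

It is probably worth spot-checking a few rows (first, last, and a handful in the middle) against the file before running the UPDATE, since a single skipped or failed row in the original upload would shift every date after it.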
General Comments
About the slowest way to do it is in a programming language via a while loop, row by row. Batching is certainly faster: one insert statement passes along, say, 200 to 1k rows at a time. LOAD DATA INFILE is a substantial step up in performance from there. Fastest is raw files (what I do, but beyond the scope of this discussion).
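For anyone unsure what the difference between row by row and batch looks like on the wire, a small illustration follows. The values are made up, and the column names simply follow the trimmed table definition in the question.

```sql
-- Row by row: one statement (and one round trip) per record.
INSERT INTO myTable (`1`, `2`, `3`, `4`, `DATE`)
VALUES ('alpha', 'A', 10, 'first', '2015-01-01');
INSERT INTO myTable (`1`, `2`, `3`, `4`, `DATE`)
VALUES ('beta', 'B', 20, 'second', '2015-01-02');

-- Batched: one statement carries many records. In practice the list
-- would hold a few hundred to a thousand rows rather than three.
INSERT INTO myTable (`1`, `2`, `3`, `4`, `DATE`)
VALUES
  ('gamma', 'C', 30, 'third',  '2015-01-03'),
  ('delta', 'D', 40, 'fourth', '2015-01-04'),
  ('omega', 'E', 50, 'fifth',  '2015-01-05');
```

The batch size is bounded in practice by max_allowed_packet, so very wide rows may call for smaller batches.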