MySQL: retrieving a large SELECT in chunks

Time: 2023-02-08 23:47:13

I have a SELECT with more than 70 million rows.

I'd like to save the selected data into one large CSV file on Windows Server 2012 R2.

Q: How can I retrieve the data from MySQL in chunks for better performance?

When I try to save the result of one large SELECT, I get out-of-memory errors.

1 solution

#1

You could try using the LIMIT feature. If you do this:

SELECT * FROM MyTable ORDER BY whatever LIMIT 0,1000

You'll get the first 1,000 rows. The first LIMIT value (0) defines the starting row in the result set. It's zero-indexed, so 0 means "the first row". The second LIMIT value is the maximum number of rows to retrieve. To get the next few sets of 1,000, do this:

SELECT * FROM MyTable ORDER BY whatever LIMIT 1000,1000 -- rows 1,001 - 2,000
SELECT * FROM MyTable ORDER BY whatever LIMIT 2000,1000 -- rows 2,001 - 3,000

And so on. When the SELECT returns no rows, you're done.

This isn't enough on its own, though: any changes made to the table while you're processing your 1K rows at a time will throw off the order. To freeze the results in time, start by querying them into a temporary table:

CREATE TEMPORARY TABLE MyChunkedResult AS (
  SELECT *
  FROM MyTable
  ORDER BY whatever
);

Side note: it's a good idea to make sure the temporary table doesn't exist beforehand:

DROP TEMPORARY TABLE IF EXISTS MyChunkedResult;

At any rate, once the temporary table is in place, pull the row chunks from there:

SELECT * FROM MyChunkedResult LIMIT 0, 1000;
SELECT * FROM MyChunkedResult LIMIT 1000,1000;
SELECT * FROM MyChunkedResult LIMIT 2000,1000;
.. and so on.

I'll leave it to you to create the logic that will calculate the limit value after each chunk and check for the end of results. I'd also recommend much larger chunks than 1,000 records; it's just a number I picked out of the air.
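That chunking loop might look something like the sketch below. This is only an illustration, not the poster's code: it uses Python's standard-library sqlite3 module as a stand-in for a real MySQL connection (with a MySQL driver such as mysql-connector-python or PyMySQL, the loop body is the same), and the table name, columns, and sample data are made up.

```python
import csv
import sqlite3

CHUNK_SIZE = 1000  # pick something much larger for 70M rows

# Stand-in database; against MySQL you would open a connection
# with your driver of choice instead of sqlite3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyChunkedResult (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO MyChunkedResult VALUES (?, ?)",
    [(i, f"row{i}") for i in range(2500)],
)

with open("export.csv", "w", newline="") as f:
    writer = csv.writer(f)
    offset = 0
    while True:
        # Fetch the next chunk; LIMIT/OFFSET advance through the
        # frozen result set one block at a time.
        rows = conn.execute(
            "SELECT * FROM MyChunkedResult LIMIT ? OFFSET ?",
            (CHUNK_SIZE, offset),
        ).fetchall()
        if not rows:  # empty chunk: end of results
            break
        writer.writerows(rows)
        offset += CHUNK_SIZE

conn.close()
```

Only one chunk of rows is ever held in memory at a time, which is the point of the exercise.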

Finally, it's good form to drop the temporary table when you're done:

DROP TEMPORARY TABLE MyChunkedResult;
