I run into trouble when I use array_combine in a foreach loop. It ends with this error:
PHP Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 85 bytes) in
Here is my code:
    $data = array();
    $csvData = $this->getData($file);
    if ($columnNames) {
        $columns = array_shift($csvData);
        foreach ($csvData as $keyIndex => $rowData) {
            $data[$keyIndex] = array_combine($columns, array_values($rowData));
        }
    }
    return $data;
The source CSV file has approximately 1,000,000 rows. This line
    $csvData = $this->getData($file)
uses a while loop to read the CSV into an array, and it works without any problem. The trouble comes from array_combine and the foreach loop.
Do you have any idea how to resolve this, or simply a better solution?
UPDATED
Here is the code that reads the CSV file (using a while loop):
    $data = array();
    if (!file_exists($file)) {
        throw new Exception('File "' . $file . '" do not exists');
    }
    $fh = fopen($file, 'r');
    while ($rowData = fgetcsv($fh, $this->_lineLength, $this->_delimiter, $this->_enclosure)) {
        $data[] = $rowData;
    }
    fclose($fh);
    return $data;
UPDATED 2
The code above works without any problem for a CSV file of up to roughly 20,000 to 30,000 rows. From 50,000 rows upward, memory is exhausted.
1 Answer
#1
You're in fact keeping (or trying to keep) two distinct copies of the whole dataset in memory. First you load the whole CSV data into memory using getData(), and then you copy it into the $data array by looping over the rows already in memory and building a new array.
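To make the two copies visible, here is a minimal sketch that measures memory before and after the array_combine loop. The buildRows() function is a hypothetical stand-in for getData(), and the row count and column names are invented for illustration. The exact byte counts depend on the PHP version, so none are shown.

```php
<?php
// Hypothetical stand-in for the rows that getData() returns: a header row
// followed by $rowCount data rows, all held in one array.
function buildRows($rowCount)
{
    $rows = array(array('id', 'name', 'email'));
    for ($i = 1; $i <= $rowCount; $i++) {
        $rows[] = array((string) $i, 'user' . $i, 'user' . $i . '@example.com');
    }
    return $rows;
}

$before = memory_get_usage();
$csvData = buildRows(20000);            // first copy: the raw rows
$afterLoad = memory_get_usage();

$columns = array_shift($csvData);       // remove the header row, re-index the rest
$data = array();
foreach ($csvData as $keyIndex => $rowData) {
    // second copy: every row is rebuilt as a new associative array
    $data[$keyIndex] = array_combine($columns, array_values($rowData));
}
$afterCombine = memory_get_usage();

$rawBytes = $afterLoad - $before;
$combinedBytes = $afterCombine - $afterLoad;
printf("raw rows:      %d bytes\n", $rawBytes);
printf("combined rows: %d bytes\n", $combinedBytes);
```

Both numbers are positive while $csvData and $data are alive at the same time, which is the situation that exhausts the memory limit on a large file.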
You should use stream-based reading when loading the CSV data, so that only one data set is kept in memory. If you're on PHP 5.5+ (which you definitely should be, by the way), this is as simple as changing your getData method to look like this:
    protected function getData($file) {
        if (!file_exists($file)) {
            throw new Exception('File "' . $file . '" does not exist');
        }
        $fh = fopen($file, 'r');
        // Yield one row at a time instead of collecting all rows in an array.
        while (($rowData = fgetcsv($fh, $this->_lineLength, $this->_delimiter, $this->_enclosure)) !== false) {
            yield $rowData;
        }
        fclose($fh);
    }
This makes use of a so-called generator, which is available in PHP >= 5.5. The rest of your code should continue to work, because the inner workings of getData should be transparent to the calling code. That is only half of the truth, though: a generator is not an array, so array_shift() can no longer be used to pull off the header row.
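A small self-contained sketch of that behavior follows. The readRows() function is a hypothetical free-function version of getData() with a fixed delimiter and enclosure, and the temporary CSV file and its contents are invented for the example. The try/finally wrapper is an addition of this sketch, not part of the answer's code; it closes the handle even when the caller stops iterating early.

```php
<?php
// Hypothetical free-function version of getData(); the delimiter, enclosure
// and escape character are fixed here instead of coming from $this.
function readRows($file)
{
    $fh = fopen($file, 'r');
    try {
        while (($rowData = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
            yield $rowData;   // hand one row to the caller, then pause
        }
    } finally {
        fclose($fh);          // runs even if the caller stops early
    }
}

// Throwaway CSV file, just for this example.
$file = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($file, "id,name\n1,alice\n2,bob\n");

$rows = readRows($file);                 // nothing has been read yet
var_dump($rows instanceof Generator);    // bool(true)
var_dump(is_array($rows));               // bool(false), so array_shift() does not apply

$seen = array();
foreach ($rows as $keyIndex => $rowData) {
    $seen[$keyIndex] = $rowData;         // keys count up from 0, like a list
}
unlink($file);
```

Because the keys start at 0, the header row can be recognized inside the loop by its key instead of being removed with array_shift().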
UPDATE to explain how extracting the column headers works now:
    $data = array();
    $csvData = $this->getData($file);
    if ($columnNames) { // don't know what this one does exactly
        $columns = null;
        foreach ($csvData as $keyIndex => $rowData) {
            if ($keyIndex === 0) {
                $columns = $rowData;
            } else {
                $data[$keyIndex/* -1 if you need 0-index */] = array_combine(
                    $columns,
                    array_values($rowData)
                );
            }
        }
    }
    return $data;
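The loop above still collects every combined row in $data, so one full copy of the data set stays in memory. If the caller can handle rows one at a time (an assumption; the question does not say how $data is used), the combining step can be a generator as well. A minimal sketch follows, in which combineWithHeader() is a hypothetical helper and the sample rows stand in for $this->getData($file). Skipping rows whose column count differs from the header is also an assumption of this sketch.

```php
<?php
// Hypothetical helper: wraps any iterable of rows (for example the generator
// returned by getData()) and yields one associative row at a time.
function combineWithHeader($rows)
{
    $columns = null;
    foreach ($rows as $rowData) {
        if ($columns === null) {
            $columns = $rowData;          // first row holds the column names
            continue;
        }
        if (count($rowData) !== count($columns)) {
            continue;                     // skip malformed rows (an assumption)
        }
        yield array_combine($columns, $rowData);
    }
}

// Stand-in for the CSV rows; in the real code this would be $this->getData($file).
$rows = array(
    array('id', 'name'),
    array('1', 'alice'),
    array('2', 'bob'),
    array('broken'),
);

$names = array();
foreach (combineWithHeader($rows) as $row) {
    $names[] = $row['name'];   // process the row, then let it go out of scope
}
```

With this shape, only the current row and the header are held at any moment, at the cost of no longer being able to index into the result like an array.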