Exploding a CSV file on the delimiter (;) versus the delimiter (,)?

Time: 2022-09-23 07:46:34

When I explode a CSV file on the delimiter (;), the explode succeeds in some Excel programs and fails in others.

Likewise, when I explode a CSV file on the delimiter (,), it succeeds in some Excel programs and fails in others.

How can I make the explode work in all versions of Excel? How can I know the right delimiter to explode on?

Yes, there is code:

if (!function_exists('create_csv')) {
    // Builds CSV text from a query result and translates the header row.
    function create_csv($query, &$filename = false, $old_csv = false) {
        if(!$filename) $filename = "data_export_".date("Y-m-d").".csv";
        $ci = &get_instance();
        $ci->load->helper('download');
        $ci->load->dbutil();
        $delimiter = ";";
        $newline = "\r\n";
        // First line of the output is a date stamp, so the header is line 1.
        $csv = "Data:".date("Y-m-d").$newline;
        if($old_csv)
            $csv .= $old_csv;
        else
            $csv .= $ci->dbutil->csv_from_result($query, $delimiter, $newline);
        // Despite its name, $columns holds the rows of the CSV text.
        $columns = explode($newline, $csv);
        $titles = explode($delimiter, $columns[1]);
        $new_titles = array();
        // Strip the surrounding quotes and translate each column title.
        foreach ($titles as $item) {
            array_push($new_titles, lang(trim($item,'"')));
        }
        $columns[1] = implode($delimiter, $new_titles);
        $csv = implode($newline, $columns);
        return $csv;
    }
}

Sometimes I set $delimiter = ";"; and sometimes $delimiter = ",";

Thanks.

5 answers

#1


1  

You can use a helper function to detect the best delimiter, for example:

function find_delimiter($csv)
{
    // Candidate delimiters; the one that occurs most often wins.
    $delimiters = array(',', '.', ';');
    $bestDelimiter = false;
    $count = 0;
    foreach ($delimiters as $delimiter) {
        $occurrences = substr_count($csv, $delimiter);
        if ($occurrences > $count) {
            $count = $occurrences;
            $bestDelimiter = $delimiter;
        }
    }
    // false when none of the candidates occurs in $csv
    return $bestDelimiter;
}

#2


1  

If you have an idea of the expected data (the number of columns), then this might work as a good guess, and it could be a good alternative to comparing which delimiter occurs the most, depending on what kind of data you're expecting. I'd imagine it would work even better if you have a header record, since you could add a check for specific header values.

Sorry for not fitting it into your code; I am not really sure what the calls you are making do, but you should be able to adapt it.

$expected_num_of_columns = 10;
$delimiter = "";

foreach (array(",", ";") as $test_delimiter) {
   $fid = fopen($filename, "r");
   if ($fid === false) {
       die("Could not open " . $filename);
   }
   // Read only the first row with the candidate delimiter.
   $csv_row = fgetcsv($fid, 0, $test_delimiter);
   fclose($fid); // close before a possible break so the handle is not leaked
   if (is_array($csv_row) && count($csv_row) == $expected_num_of_columns) {
       $delimiter = $test_delimiter;
       break;
   }
}

if (empty($delimiter)) {
   die("Input file did not contain the correct number of fields (" . $expected_num_of_columns . ")");
}

Don't use this if, for example, all or most of the fields contain non-integer numbers (e.g. a list of monetary amounts) and the file has no header record: files separated by ; are most likely to use , as the decimal separator, so there could be the same number of commas and semicolons.
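The header check mentioned above could be sketched roughly as follows. This is only an illustration, not part of the original answer: the function name detect_delimiter_by_header and the column names in the usage line are hypothetical, and it assumes the header row is available as a string.

```php
<?php
// Hypothetical helper: return the first candidate delimiter whose parsed
// header row contains every expected column name, or null if none does.
function detect_delimiter_by_header(string $headerLine, array $expectedNames, array $candidates = [',', ';']): ?string
{
    foreach ($candidates as $candidate) {
        // str_getcsv() respects quoted fields, unlike explode().
        $fields = array_map('trim', str_getcsv($headerLine, $candidate, '"', '\\'));
        // array_diff() returns the expected names that are missing from $fields.
        if (count(array_diff($expectedNames, $fields)) === 0) {
            return $candidate;
        }
    }
    return null; // no candidate produced the expected header
}

// Assumed example header; replace with the real column names.
var_dump(detect_delimiter_by_header('id;name;price', ['id', 'name', 'price']));
```

Checking for known names is stricter than counting columns, so it is less likely to be fooled by decimal commas in the data rows, but it only works when the header names are known in advance.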

#3


0  

The short answer is that you probably can't, unless you can apply some heuristic to determine the file format. If you don't know and can't detect the format of the file you're parsing, then parsing it is going to be difficult.

However, once you have determined the delimiter (or required a particular one), you will probably find that PHP's built-in fgetcsv is easier and more accurate than a manual explode-based strategy.
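To illustrate why a CSV-aware parser tends to be more accurate than explode, here is a small sketch using str_getcsv, the string counterpart of fgetcsv. The sample row is made up for the example; the only assumption is that fields containing the delimiter are wrapped in double quotes.

```php
<?php
// Made-up row: the first field is quoted and contains the delimiter itself.
$line = '"Doe; Jane";42;Paris';

// explode() splits on every semicolon, including the one inside the quotes.
$naive = explode(';', $line);

// str_getcsv() treats the quoted text as a single field.
$parsed = str_getcsv($line, ';', '"', '\\');

echo count($naive), "\n";   // 4
echo count($parsed), "\n";  // 3
echo $parsed[0], "\n";      // Doe; Jane
```

fgetcsv applies the same parsing rules when reading rows from a file handle.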

#4


0  

There is no way to be 100% sure you are targeting the real delimiter. All you can do is guess.

You should start by finding the right delimiter, then explode the CSV on this delimiter.


To find the delimiter, you basically want a function that counts the number of , and the number of ; and returns whichever occurs more often.

Something like:

$array = explode(find_delimiter($csv), $csv);

Hope it helps ;)


Edit: Your find_delimiter function could be something like:

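function find_delimiter($csv)
{
   $arrDelimiters = array(',', '.', ';');
   $arrResults = array();
   foreach ($arrDelimiters as $delimiter)
   {
       // Number of pieces produced by splitting on this candidate
       $arrResults[$delimiter] = count(explode($delimiter, $csv));
   }
   // arsort() sorts descending and keeps the delimiter keys;
   // rsort() would discard the keys and returns a bool, not the array.
   arsort($arrResults);
   $arrKeys = array_keys($arrResults);
   return $arrKeys[0];
}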

#5


0  

Well, it looks like you know that your delimiter will be either "," or ";". This is a good place to start. You might try replacing all commas (,) with semicolons (;) and then exploding on the semicolon only. However, this approach will definitely cause problems in some cases, because some lines of your CSV files could look like this:

"name,value",other name,other value,last name;last value


Here the delimiter is the comma, which gives four columns. However, changing the commas to semicolons would give you five columns, which is incorrect. So converting one delimiter into another is not a good approach.

Still, if your CSV file is correctly formatted, you can find the correct delimiter in any of its lines. You might try to create a function like find_delimiter($csvLine) as proposed by @johnkork, but the problem is that the function itself can't know which delimiter to search for. However, you know all the possible delimiters, so you could create another, quite similar function, delimiter_exists($csvLine, $delimiter), which returns true or false.

But even delimiter_exists($csvLine, $delimiter) is not enough. Why? Because for the CSV line shown above, you would find that both "," and ";" exist as delimiters. With the comma it would be a CSV file with four columns, and with the semicolon it would be two columns.
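The column counts discussed for the sample line can be checked with a short sketch like the one below. The helper count_columns is a hypothetical name introduced only for this illustration; it relies on PHP's built-in str_getcsv so that the quoted first field is handled as one value.

```php
<?php
// Hypothetical helper: number of fields in one CSV line for a given delimiter.
function count_columns(string $line, string $delimiter): int
{
    // str_getcsv() honours double-quoted fields; enclosure and escape are
    // passed explicitly so the call does not rely on default values.
    return count(str_getcsv($line, $delimiter, '"', '\\'));
}

// The sample line from the answer.
$line = '"name,value",other name,other value,last name;last value';

$withComma     = count_columns($line, ',');                        // 4
$withSemicolon = count_columns($line, ';');                        // 2
$afterReplace  = count_columns(str_replace(',', ';', $line), ';'); // 5

// Both characters occur in the line, so a plain existence check cannot decide.
$bothPresent = strpos($line, ',') !== false && strpos($line, ';') !== false;

printf("%d %d %d\n", $withComma, $withSemicolon, $afterReplace);
```

Each candidate yields a plausible but different column count, which is why a single data line is not enough to pick the delimiter with certainty.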

Thus, there is no universal way to get exactly what you want. However, there is something else you can check: the first line of the CSV file, which is the header, assuming your CSV files have one. Usually (though not necessarily), a CSV header contains no symbols other than the alphanumeric column names and the delimiter that separates them. So you could create a function like delimiter_exists($csvHeader, $delimiter), whose implementation could look like this:

function delimiter_exists($csvHeader, $delimiter) {
    // strpos() avoids regex metacharacter problems with delimiters such as "." or "|"
    return strpos($csvHeader, $delimiter) !== false;
}

For your specific case, you could use it like this:

$csvHeader = "abc;def";
$delimiter = delimiter_exists($csvHeader, ',') ? ',' : ';';

Hope this helps!

