Split a string on commas, but ignore commas inside double quotes, using JavaScript

Time: 2021-08-21 21:42:52

I'm looking to turn [a, b, c, "d, e, f", g, h] into an array of 6 elements: a, b, c, "d, e, f", g, h. I'm a bit of a noob with RegEx, so any help is great. I'm trying to do this in JavaScript. This is what I have so far:

str = str.split(/,+|"[^"]+"/g); 

But right now it's splitting out everything that's in the double-quotes, which is incorrect. Thanks for any help.

Edit: Okay, sorry, I worded this question really poorly. I'm being given a string, not an array.

var str = 'a, b, c, "d, e, f", g, h';

And I want to turn that into an array using something like the "split" function.
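
For context on why the attempt above loses the quoted part: `split` treats everything the pattern matches as a separator, so the `"[^"]+"` alternative removes the quoted text instead of keeping it. One way to stay with `split` is to split only on commas that are followed by an even number of double quotes, i.e. commas outside any quoted section. The sketch below assumes quotes are balanced and never escaped; `splitOutsideQuotes` is a hypothetical helper name, not something from the question.

```javascript
// Hypothetical helper: split on commas that sit outside double quotes.
// Assumes quotes are balanced and there are no escaped quotes.
function splitOutsideQuotes(str) {
  // A comma qualifies only if the rest of the string contains an even
  // number of double quotes (zero or more pairs, then no more quotes).
  var outsideComma = /,(?=(?:(?:[^"]*"){2})*[^"]*$)/;
  return str.split(outsideComma).map(function (field) {
    return field.trim(); // drop the spaces that follow each comma
  });
}

console.log(splitOutsideQuotes('a, b, c, "d, e, f", g, h'));
// → [ 'a', 'b', 'c', '"d, e, f"', 'g', 'h' ]
```

The quotes stay in the fourth element, which matches the six-element result asked for in the question.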

9 answers

#1


46  

Here's what I would do.

var str = 'a, b, c, "d, e, f", g, h';
var arr = str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g);
/* will match:

    (
        ".*?"       double quotes + as few characters as possible + double quotes
        |           OR
        [^",\s]+    1 or more characters excl. double quotes, comma or spaces of any kind
    )
    (?=             FOLLOWED BY
        \s*,        0 or more empty spaces and a comma
        |           OR
        \s*$        0 or more empty spaces and nothing else (end of string)
    )

*/
arr = arr || [];
// this will prevent JS from throwing an error in
// the below loop when there are no matches
for (var i = 0; i < arr.length; i++) console.log('arr['+i+'] =',arr[i]);
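
If the surrounding quotes are not wanted in the result, the matches can be post-processed. The following is a minimal sketch that reuses the regex above unchanged; `matchFields` is a hypothetical wrapper name. As the pattern's own comments say, unquoted values are expected to contain no whitespace.

```javascript
// Hypothetical wrapper around the regex from the answer above.
function matchFields(str) {
  // match() returns null when nothing matches, so fall back to [].
  var fields = str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g) || [];
  return fields.map(function (field) {
    // Remove one leading and one trailing double quote, if present.
    return field.replace(/^"|"$/g, '');
  });
}

console.log(matchFields('a, b, c, "d, e, f", g, h'));
// → [ 'a', 'b', 'c', 'd, e, f', 'g', 'h' ]
```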

#2


4  

This works well for me. (I used semicolons so the alert message would show the difference between commas added when turning the array into a string and the actual captured values.)

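var str = 'a; b; c; "d; e; f"; g; h; "i"';
// A quoted field is tried first; an unquoted field must start with a
// non-space character, so a match cannot begin on the space before a quote
// and swallow the opening quote as part of an unquoted field.
var array = str.match(/"[^"]*"|[^;\s][^;]*/g);
alert(array);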

#3


2  

Here is a JavaScript function to do it:

function splitCSVButIgnoreCommasInDoublequotes(str) {  
    //split the str first  
    //then merge the elements between two double quotes  
    var delimiter = ',';  
    var quotes = '"';  
    var elements = str.split(delimiter);  
    var newElements = [];  
    for (var i = 0; i < elements.length; ++i) {  
        if (elements[i].split(quotes).length % 2 === 0) {//an unpaired (left) double quote is found
            var indexOfRightQuotes = -1;
            var tmp = elements[i];
            //find the nearest element holding the right double quotes
            for (var j = i + 1; j < elements.length; ++j) {
                if (elements[j].indexOf(quotes) >= 0) {
                    indexOfRightQuotes = j;
                    break; //stop at the first one so later quoted fields are not merged in
                }
            }
            //found the right double quotes  
            //merge all the elements between double quotes  
            if (-1 != indexOfRightQuotes) {   
                for (var j = i + 1; j <= indexOfRightQuotes; ++j) {  
                    tmp = tmp + delimiter + elements[j];  
                }  
                newElements.push(tmp);  
                i = indexOfRightQuotes;  
            }  
            else { //right double quotes is not found  
                newElements.push(elements[i]);  
            }  
        }  
        else {//no left double quotes is found  
            newElements.push(elements[i]);  
        }  
    }  

    return newElements;  
}  

#4


1  

I know it's a bit long, but here's my take:

var sample="[a, b, c, \"d, e, f\", g, h]";

var inQuotes = false, items = [], currentItem = '';

for(var i = 0; i < sample.length; i++) {
  if (sample[i] == '"') { 
    inQuotes = !inQuotes; 

    if (!inQuotes) {
      if (currentItem.length) items.push(currentItem);
      currentItem = '';
    }

    continue; 
  }

  if ((/^[\"\[\]\,\s]$/gi).test(sample[i]) && !inQuotes) {
    if (currentItem.length) items.push(currentItem);
    currentItem = '';
    continue;
  }

  currentItem += sample[i];
}

if (currentItem.length) items.push(currentItem);

console.log(items);

As a side note, it will work both with and without the square brackets at the start and end.

#5


1  

Here's a non-regex one that assumes double quotes will come in pairs:

function splitCsv(str) {
  return str.split(',').reduce((accum,curr)=>{
    if(accum.isConcatting) {
      accum.soFar[accum.soFar.length-1] += ','+curr
    } else {
      accum.soFar.push(curr)
    }
    if(curr.split('"').length % 2 == 0) {
      accum.isConcatting= !accum.isConcatting
    }
    return accum;
  },{soFar:[],isConcatting:false}).soFar
}

console.log(splitCsv('asdf,"a,d",fdsa'),' should be ',['asdf','"a,d"','fdsa'])
console.log(splitCsv(',asdf,,fds,'),' should be ',['','asdf','','fds',''])
console.log(splitCsv('asdf,"a,,,d",fdsa'),' should be ',['asdf','"a,,,d"','fdsa'])
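
Because this approach relies on quotes coming in pairs, a caller may want to check that before splitting. A small sketch using the same `split('"')` counting trick; `hasBalancedQuotes` is a hypothetical helper name:

```javascript
// Hypothetical pre-check: true when the string holds an even number of
// double quotes. split('"') yields one more piece than there are quotes.
function hasBalancedQuotes(str) {
  return (str.split('"').length - 1) % 2 === 0;
}

console.log(hasBalancedQuotes('asdf,"a,d",fdsa')); // → true
console.log(hasBalancedQuotes('asdf,"a,d,fdsa'));  // → false
```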

#6


0  

Something like a stack should do the trick. Here I loosely use a boolean marker in place of a stack (it serves the purpose).

var str = "a,b,c,blah\"d,=,f\"blah,\"g,h,";
var getAttributes = function(str){
  var result = [];
  var strBuf = '';
  var start = 0 ;
  var marker = false;
  for (var i = 0; i< str.length; i++){

    if (str[i] === '"'){
      marker = !marker;
    }
    if (str[i] === ',' && !marker){
      result.push(str.substr(start, i - start));
      start = i+1;
    }
  }
  if (start <= str.length){
    result.push(str.substr(start, i - start));
  }
  return result;
};

console.log(getAttributes(str));

#7


0  

The code works if your input string is in the format of stringTocompare. Run the code on https://jsfiddle.net/ to see the output, and refer to the screenshot [1] for the jsfiddle settings. Alternatively, you can use the split call in the commented-out line below and tweak the code as needed. If you don't want a comma between the re-joined pieces, remove the +"," part from attach=attach+","+actualString[t+1].

var stringTocompare='"Manufacturer","12345","6001","00",,"Calfe,eto,lin","Calfe,edin","4","20","10","07/01/2018","01/01/2006",,,,,,,,"03/31/2004"';

console.log(stringTocompare);

var actualString=stringTocompare.split(',');
console.log("Before");
for(var i=0;i<actualString.length;i++){
  console.log(actualString[i]);
}
//var actualString=stringTocompare.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/);
for(var i=0;i<actualString.length;i++){
  var flag=0;
  var x=actualString[i];
  if(x!==null)
  {
    if(x[0]=='"' && x[x.length-1]!=='"'){
      var p=0;
      var t=i;
      var b=i;
      for(var k=i;k<actualString.length;k++){
        var y=actualString[k];
        if(y[y.length-1]!=='"'){
          p++;
        }
        if(y[y.length-1]=='"'){
          flag=1;
        }
        if(flag==1)
          break;
      }
      var attach=actualString[t];
      for(var s=p;s>0;s--){
        attach=attach+","+actualString[t+1];
        t++;
      }
      actualString[i]=attach;
      actualString.splice(b+1,p);
    }
  }
}
console.log("After");
for(var i=0;i<actualString.length;i++){
  console.log(actualString[i]);
}




  [1]: https://i.stack.imgur.com/3FcxM.png

#8


-1  

Assuming your string really looks like '[a, b, c, "d, e, f", g, h]', I believe this would be an acceptable use case for eval():

myString = 'var myArr = ' + myString;
eval(myString);

console.log(myArr); // will now be an array of elements: a, b, c, "d, e, f", g, h

Edit: As Rocket pointed out, strict mode removes eval's ability to inject variables into the local scope, meaning you'd want to do this:

var myArr = eval(myString);
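
Keep in mind that eval only produces an array here if a, b, c, g and h exist as variables in scope (otherwise it throws a ReferenceError), and that it runs whatever code the string contains. If every element of the input happens to be a double-quoted string, which is an assumption that does not hold for the question's sample, `JSON.parse` can read the bracketed form without executing anything. A minimal sketch with a hypothetical input:

```javascript
// Assumption: every element is a double-quoted string, so the text is valid JSON.
var quotedInput = '["a", "b", "c", "d, e, f", "g", "h"]';
var parsed = JSON.parse(quotedInput);

console.log(parsed.length); // → 6
console.log(parsed[3]);     // → d, e, f
```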

#9


-1  

I've had similar issues with this, and I found no good .NET solution, so I went DIY. NOTE: This was also posted as a reply to "Splitting comma separated string, ignore commas in quotes, but allow strings with one double quotation", but it seems more applicable here (though still useful over there).

In my application I'm parsing a CSV, so my split delimiter is ",". I suppose this method only works where the split argument is a single character.

So I've written a function that ignores commas within double quotes. It does so by converting the input string into a character array and parsing it char by char:

public static string[] Splitter_IgnoreQuotes(string stringToSplit)
    {   
        char[] CharsOfData = stringToSplit.ToCharArray();
        //enter your expected array size here or alloc.
        string[] dataArray = new string[37];
        int arrayIndex = 0;
        bool DoubleQuotesJustSeen = false;          
        foreach (char theChar in CharsOfData)
        {
            //are we inside double quotes, or is this not a comma? then don't split. you could make ',' a variable for your split parameter; I'm working with a csv.
            if ((theChar != ',' || DoubleQuotesJustSeen) && theChar != '"')
            {
                dataArray[arrayIndex] = dataArray[arrayIndex] + theChar;
            }
            else if (theChar == '"')
            {
                if (DoubleQuotesJustSeen)
                {
                    DoubleQuotesJustSeen = false;
                }
                else
                {
                    DoubleQuotesJustSeen = true;
                }
            }
            else if (theChar == ',' && !DoubleQuotesJustSeen)
            {
                arrayIndex++;
            }
        }
        return dataArray;
    }

To suit my application, this function also drops the double quotes themselves from the output, since they are present in my input but not needed.
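
Since the question is about JavaScript, here is a rough JavaScript counterpart of the same character-by-character idea. It is an illustrative port rather than the answerer's code: `splitIgnoringQuotes` is a hypothetical name, and a growable array replaces the fixed-size one.

```javascript
// Illustrative port of the C# approach above. Quotes are dropped from the
// output, and commas inside a quoted section do not start a new field.
function splitIgnoringQuotes(stringToSplit) {
  var fields = [''];        // grows as needed, no fixed size
  var insideQuotes = false;
  for (var i = 0; i < stringToSplit.length; i++) {
    var ch = stringToSplit[i];
    if (ch === '"') {
      insideQuotes = !insideQuotes;     // toggle, and omit the quote itself
    } else if (ch === ',' && !insideQuotes) {
      fields.push('');                  // start the next field
    } else {
      fields[fields.length - 1] += ch;  // ordinary character or quoted comma
    }
  }
  return fields;
}

console.log(splitIgnoringQuotes('a,b,"c,d",e'));
// → [ 'a', 'b', 'c,d', 'e' ]
```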
