Decomposing a string with parentheses using regular expressions

Time: 2022-02-02 23:34:45

I have a problem with decomposing a string.

The problem:

I want to decompose a string like this:

(01)12345678901234(11)060606(17)121212(21)1321asdfght(10)aaabbb

and return an object like this:

Object {
       identifier01: "12345678901234", 
       identifier11: "060606", 
       identifier17: "121212", 
       identifier21: "1321asdfght", 
       identifier10: "aaabbb" 
}

Rules:
identifier01 always has 14 numeric characters
identifier11 always has 6 numeric characters
identifier17 always has 6 numeric characters
identifier21 always has 1 to 20 alphanumeric characters
identifier10 always has 1 to 20 alphanumeric characters

The problem is that identifier21 and identifier10 do not have a fixed length (they vary from 1 to 20 characters). What is more, only identifier01 is always at the beginning; the remaining identifiers can appear in any order, for example:

(01)12345678901234(21)111122233344(10)abcdeed(11)050505(17)060606

or a particular identifier might not be present at all:

(01)12345678901234(21)111122233344(17)060606

My approach:

parseStringToAnObject: function (value) {
           var regs = [
                ["(01) ", /\([^)]*01\)([0-9]{14})/],
                ["(10) ", /\([^)]*10\)([0-9a-zA-Z]{1,20})/],
                ["(11) ", /\([^)]*11\)([0-9]{6})/],
                ["(17) ", /\([^)]*17\)([0-9]{6})/],
                ["(21) ", /\([^)]*21\)([0-9a-zA-Z]{1,20})/]
            ];

            var tempObj = {};

            while (value.length > 0) {
                var ok = false;
                for (var i = 0; i < regs.length; i++) {
                    var match = value.match(regs[i][1]);
                    console.log(match);
                    if (match) {
                        ok = true;
                        switch (match[0].slice(0, 4)) {
                            case "(01)":
                                tempObj.identifier01 = match[1];
                                break;
                            case "(21)":
                                tempObj.identifier21 = match[1];
                                break;
                            case "(11)":
                                tempObj.identifier11 = match[1];
                                break;
                            case "(17)":
                                tempObj.identifier17 = match[1];
                                break;
                            case "(10)":
                                tempObj.identifier10 = match[1];
                                break;
                         }

                        value = value.slice(match[0].length);
                        break;
                    } else {
                        console.log("Regex error");
                    }
                }
                if (!ok) {
                    return false;
                }
            }
            console.log(tempObj);
            return tempObj;
 }

Results:

My function returns correct data, but only when the input contains no variable-length identifiers. When I enter, for example,

(01)12345678901234(21)abder123(17)121212

or

(01)12345678901234(10)123aaaaabbbddd(21)qwerty

or

(01)12345678901234(17)060606(10)aabbcc121212(11)030303

it always returns false.

Could you suggest a better and more refined approach, please?
Thanks for all answers and solutions in advance!

2 solutions

#1


1  

Here's how I'd do it:

var s = "(01)12345678901234(11)060606(17)121212(21)1321asdfght(10)aaabbb",
    r = {};

s.replace( /\((\d\d)\)([^()]+)/g, function(m,p,d){ r['identifier'+p]=d;return '';} );
 
console.log(r);
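
Note that the one-liner above accepts any two-digit identifier and any data, so it does not enforce the length rules from the question. If validation matters, one possible sketch is below. `RULES` and `parseIdentifiers` are hypothetical names introduced here, and rejecting unknown or duplicate identifiers is an assumption, not something the question specifies.

```javascript
// Hypothetical rule table: one anchored pattern per identifier,
// taken from the rules listed in the question.
var RULES = {
    "01": /^[0-9]{14}$/,
    "10": /^[0-9a-zA-Z]{1,20}$/,
    "11": /^[0-9]{6}$/,
    "17": /^[0-9]{6}$/,
    "21": /^[0-9a-zA-Z]{1,20}$/
};

// Hypothetical helper: returns the parsed object, or false on invalid input.
function parseIdentifiers(value) {
    // The whole string must be a run of "(dd)data" groups starting with (01).
    if (!/^\(01\)[^()]+(\(\d\d\)[^()]+)*$/.test(value)) {
        return false;
    }
    var result = {};
    var valid = true;
    value.replace(/\((\d\d)\)([^()]+)/g, function (m, id, data) {
        var key = "identifier" + id;
        var rule = RULES[id];
        if (!rule || !rule.test(data) || key in result) {
            valid = false; // assumption: unknown, malformed or duplicate groups are errors
        } else {
            result[key] = data;
        }
        return "";
    });
    return valid ? result : false;
}

console.log(parseIdentifiers("(01)12345678901234(21)abder123(17)121212"));
console.log(parseIdentifiers("(01)123(17)060606")); // false, (01) needs 14 digits
```

Because the data of each group is everything up to the next `(`, the variable-length identifiers need no special handling; the per-identifier patterns only check the result afterwards.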

#2


1  

I wouldn't even use regex:

var final = {}; 
var a = "(01)12345678901234(11)060606(17)121212(21)1321asdfght(10)aaabbb";
a.split("(").slice(1).sort().forEach(function(i) {
    var pieces = i.split(')'); 
    final["identifier" + pieces[0]] = pieces[1]; 
});
console.log(final);

//Object {identifier01: "12345678901234", identifier10: "aaabbb", identifier11: "060606", identifier17: "121212", identifier21: "1321asdfght"}
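
As for why the loop in the question returns false: its patterns are not anchored, so `value.match(...)` can find an identifier further along the string, while the code then slices `match[0].length` characters off the front. With fixed-length groups of 6 digits the slice happens to remove the right number of characters, which is probably why those inputs appear to work. A small reproduction, using the remainder of one of the failing inputs after `(01)` has been consumed:

```javascript
// What is left of "(01)12345678901234(21)abder123(17)121212" after (01) is consumed.
var value = "(21)abder123(17)121212";

// Unanchored pattern from the question: it matches at index 12, not at index 0.
var loose = value.match(/\([^)]*17\)([0-9]{6})/);
console.log(loose.index);                  // 12
console.log(value.slice(loose[0].length)); // "23(17)121212", so "(21)abder1" was cut off

// Anchored with ^, the (17) pattern can no longer match in the middle of the string...
var strict = value.match(/^\(17\)([0-9]{6})/);
console.log(strict);                       // null

// ...and the (21) pattern matches at the start, stopping at the next "(".
var fixed = value.match(/^\(21\)([0-9a-zA-Z]{1,20})/);
console.log(fixed[0]);                     // "(21)abder123"
```

Prefixing each pattern in `regs` with `^` (and dropping the `[^)]*` part) should therefore be enough to make the original loop consume the string group by group.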
