How do I extract a string using a JavaScript regular expression?

Time: 2022-09-13 11:15:39

It might look obvious, but I wasted too much time trying to get it to work.

I'm trying to extract a substring from a file with a JavaScript regex. Here is a slice from the file:

DATE:20091201T220000
SUMMARY:Dad's birthday

The field I want to extract is SUMMARY, so I'm trying to write a method that returns only the summary text. Here is the method:

extractSummary : function(iCalContent) {
  /*
  input : iCal file content
  return : Event summary
  */
  var arr = iCalContent.match(/^SUMMARY\:(.)*$/g);
  return(arr);
}

Clearly, I'm a regex noob :)) Could you fix it, please? Thanks.

5 solutions

#1


59  

You need to use the m flag:

multiline; treat beginning and end characters (^ and $) as working over multiple lines (i.e., match the beginning or end of each line (delimited by \n or \r), not only the very beginning or end of the whole input string)

Also put the * in the right place, inside the parentheses:

"DATE:20091201T220000\r\nSUMMARY:Dad's birthday".match(/^SUMMARY\:(.*)$/gm);
//------------------------------------------------------------------^    ^
//-----------------------------------------------------------------------|
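Note that with the g flag, match returns the full matching lines ("SUMMARY:Dad's birthday") rather than the captured text. A minimal sketch of one way to get only the summary, assuming a single SUMMARY line is wanted: keep the m flag, drop g, and read the capture group at index 1. The function name and the null return for "not found" are choices made for this example.

```javascript
// Sketch only: returns the text after "SUMMARY:" on the first matching line,
// or null when the input has no such line.
function extractSummary(iCalContent) {
  // m: ^ and $ match at line boundaries. No g flag, so match() returns
  // the full match at index 0 and the capture group at index 1.
  var m = iCalContent.match(/^SUMMARY:(.*)$/m);
  return m ? m[1] : null;
}

var sample = "DATE:20091201T220000\r\nSUMMARY:Dad's birthday";
console.log(extractSummary(sample)); // Dad's birthday
```

Because . does not match line terminators, the captured text stops before a trailing \r or \n, so files with \r\n line endings work as well.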

#2


69  

function extractSummary(iCalContent) {
  var rx = /\nSUMMARY:(.*)\n/g;
  var arr = rx.exec(iCalContent);
  return arr[1]; 
}

You need these changes:

  • Put the * inside the parentheses, as suggested above. Otherwise your capture group will contain only one character.

  • Get rid of the ^ and $. Without the m flag they match only at the start and end of the full string, not at the start and end of each line, and the g flag does not change that. Match on explicit newlines instead.

  • I suppose you want the capture group (what's inside the parentheses) rather than the full array? arr[0] is the full match ("\nSUMMARY:...") and the following indexes contain the group matches.

  • With the g flag, String.match(regexp) returns an array of the full matches only, without the capture groups (this is what I saw in Safari on Mac). regexp.exec(string) returns the groups.
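The difference between match and exec in the last point can be checked directly. The sketch below uses the same pattern as the function above on a made-up input that has a newline on both sides of the SUMMARY line; the variable names are only for illustration. The behaviour with the g flag is defined by the language, so it should not depend on the browser.

```javascript
var text = "DATE:20091201T220000\nSUMMARY:Dad's birthday\n";

// With g, match() lists every full match and drops the capture groups.
var viaMatch = text.match(/\nSUMMARY:(.*)\n/g);

// exec() returns one match at a time as [fullMatch, group1, ...].
var viaExec = /\nSUMMARY:(.*)\n/g.exec(text);

console.log(viaMatch.length); // 1
console.log(viaMatch[0] === "\nSUMMARY:Dad's birthday\n"); // true
console.log(viaExec[1]); // Dad's birthday
```

Both calls return null when nothing matches, so real code should check the result before indexing into it.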

#3


11  

Your regular expression most likely should be:

/\nSUMMARY:(.*)$/g

A helpful little trick I like to use is to fall back to a default array when match finds nothing.

var arr = iCalContent.match(/\nSUMMARY:(.*)$/g) || [""]; //could also use null for empty value
return arr[0];

This way you don't get annoying type errors when you go to use arr, because match returns null when there is no match.
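The same fallback idea can be combined with a capture group. This is a sketch under a few assumptions: the function name is made up, the pattern uses the m flag without g so that index 1 holds the captured text, and an empty string stands for "no summary".

```javascript
function summaryOrEmpty(iCalContent) {
  // match() returns null when nothing matches; "|| []" turns that into an
  // empty array, so indexing gives undefined instead of throwing a TypeError.
  var arr = iCalContent.match(/^SUMMARY:(.*)$/m) || [];
  return arr[1] || "";
}

console.log(summaryOrEmpty("DATE:20091201T220000\nSUMMARY:Dad's birthday")); // Dad's birthday
console.log(summaryOrEmpty("DATE:20091201T220000") === ""); // true
```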

#4


6  

(.*) instead of (.)* would be a start. The latter will only capture the last character on the line.

Also, no need to escape the :.
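Both points are easy to verify on a single line. A small sketch, with variable names chosen only for this example:

```javascript
var line = "SUMMARY:Dad's birthday";

// (.)* repeats a one-character group; the group keeps only its last iteration.
var last = line.match(/^SUMMARY:(.)*$/)[1];

// (.*) puts the repetition inside the group, so it captures the whole value.
var whole = line.match(/^SUMMARY:(.*)$/)[1];

// An escaped colon matches exactly the same text as a plain one.
var escaped = line.match(/^SUMMARY\:(.*)$/)[1];

console.log(last);    // y
console.log(whole);   // Dad's birthday
console.log(escaped); // Dad's birthday
```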

#5


-1  

This is how you can parse iCal files with JavaScript:

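    function calParse(text) {
        // Normalize line endings, then unfold continuation lines: a line break
        // followed by one space or tab continues the previous line (RFC 5545).
        var lines = text.replace(/\r\n?/g, "\n").replace(/\n[ \t]/g, "").split("\n");

        function parse() {
            var obj = {};
            while (lines.length) {
                var parts = lines.shift().split(":");
                var key = parts.shift();
                // Re-join with ":" so values that contain colons stay intact.
                var value = parts.join(":");
                switch (key) {
                    case "":
                        break; // skip blank lines
                    case "BEGIN":
                        // Nested component; a repeated name overwrites the earlier one.
                        obj[value] = parse();
                        break;
                    case "END":
                        return obj;
                    default:
                        obj[key] = value;
                }
            }
            return obj;
        }

        return parse().VCALENDAR;
    }

    var example =
        'BEGIN:VCALENDAR\n' +
        'VERSION:2.0\n' +
        'PRODID:-//hacksw/handcal//NONSGML v1.0//EN\n' +
        'BEGIN:VEVENT\n' +
        'DTSTART:19970714T170000Z\n' +
        'DTEND:19970715T035959Z\n' +
        'SUMMARY:Bastille Day Party\n' +
        'END:VEVENT\n' +
        'END:VCALENDAR\n';

    var cal = calParse(example);
    alert(cal.VEVENT.SUMMARY); // "Bastille Day Party"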
