I have a very long log file in which each entry begins on a new line, but some entries contain line breaks themselves. I split the log file with the code below and then run various regex rules on the result, and everything works fine:
var str = data.split('\n');
Once an entry contains more complex text with line breaks inside it, my code breaks. Below is a sample of the log file. The first entry is normal; the second spans several lines and ends at "ends here".
3708 07:11:59 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {1846518641516}
908 07:11:40 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {148815184185}, ** [Content]: new: Please note the following when using this app:
▪ Some text
▪ Some text
▪ Some text
▪ Some more and more text., old: Please note the following when using this app:
▪ Some text
▪ Some text
▪ Some text
▪ Some text
▪ Some text
▪ Some text
ends here
Hopefully my question is clear. How should I refactor my var str = data.split('\n'); so that it works for both kinds of entries?
Thank you for your help.
1 Answer
#1
2
You need to split at a \n that is followed by a string of digits, a space, and a time-like string:
s.split(/\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/)
See the regex demo
Details:
- \n - a newline that is followed by...
- (?=\d+ \d{2}:\d{2}:\d{2}\b) - a positive lookahead: it only requires that the text immediately to the right matches the pattern, without consuming it; otherwise the match fails. Inside the lookahead:
  - \d+ - 1 or more digits
  - a single space
  - \d{2}:\d{2}:\d{2} - 2 digits followed by a colon, twice, and then 2 more digits
  - \b - a trailing word boundary
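To see why the lookahead matters, here is a minimal sketch comparing the pattern above with a variant that drops the lookahead. The shortened `sample` string and the variable names are made up for illustration; only the first regex comes from the answer.

```javascript
// Two entries; the second one contains an embedded line break.
var sample = "3708 07:11:59 INFO first\n908 07:11:40 INFO second\ncontinued";

// With the lookahead, only the newline is removed by split();
// the "908 07:11:40" prefix stays at the start of the second entry.
var withLookahead = sample.split(/\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/);

// Without the lookahead, the prefix is part of the separator,
// so split() discards it along with the newline.
var withoutLookahead = sample.split(/\n\d+ \d{2}:\d{2}:\d{2}\b/);

console.log(withLookahead);
console.log(withoutLookahead);
```

In both cases the newline before "continued" is not a split point, because it is not followed by digits and a time. The difference is that the second entry keeps its number and timestamp only in the lookahead version.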
var s = "3708 07:11:59 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {1846518641516} \r\n908 07:11:40 INFO (username): SAVE: master:/url_path, language: en, version: 1, id: {148815184185}, ** [Content]: new: Please note the following when using this app:\r\n\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some more and more text., old: Please note the following when using this app:\r\n\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\n▪ Some text\r\nends here";
var res = s.split(/\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/);
console.log(res);
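One detail worth checking: the sample string above uses \r\n line endings, so splitting on \n alone leaves a \r at the end of every entry except the last. The sketch below is one possible way to handle that and to run a further regex on each entry. The `parseLog` helper, the field names (`prefix`, `time`, `level`, `message`) and the shortened `log` string are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: split a log into entries and extract the leading fields.
function parseLog(data) {
  // \r?\n consumes a Windows-style \r\n as well as a bare \n,
  // so entries do not end with a stray \r.
  var entries = data.split(/\r?\n(?=\d+ \d{2}:\d{2}:\d{2}\b)/);

  return entries.map(function (entry) {
    // [\s\S]* matches across the line breaks embedded in the message.
    var m = /^(\d+) (\d{2}:\d{2}:\d{2}) (\w+) ([\s\S]*)$/.exec(entry);
    return m
      ? { prefix: m[1], time: m[2], level: m[3], message: m[4] }
      : { raw: entry }; // no recognizable header; keep the text untouched
  });
}

var log = "3708 07:11:59 INFO (username): SAVE: one\r\n" +
          "908 07:11:40 INFO (username): SAVE: two\r\n" +
          "▪ Some text\r\nends here";

var parsed = parseLog(log);
console.log(parsed.length);
console.log(parsed[1].message);
```

The line breaks inside a multi-line message are left as they are; only the break between entries is removed.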