Regex between two character positions with known start and end indices

In regex, generally speaking, is there a way to select data between two positions on a line? After a few days of reading up on regex I'm still not sure of the correct terminology (character/line position, index, column?), but what I mean is...

Select the data between two indices, i.e. whatever lies between ^.{4} and ^.{7}, so that it can be removed. For example, given:

TESTINGREGEX
ISNTTHEBEST!

TESTINGREGEXCANBEFUN
ISNTTHEBEST!ANDFARFROMFUN

the results I'm looking for would be:

TESTREGEX
ISNTBEST!

and

TESTREGEXCANBEFUN
ISNTBEST!ANDFARFROMFUN

I'd like to learn whether this is possible and, if so, how to achieve it. I'm very familiar with ways to do this using other tools, but I'm curious how to do it using regex.

I've tried working with non-capturing groups, and I wonder whether I'm being limited by the fact that I'm applying this regex in the Atom editor's find-and-replace regex feature (falling victim to: Avoiding Common Pitfalls). I'm hoping to get a few suggestions to try out and broaden my knowledge. I'm guessing JavaScript and/or sed-style regex answers would be acceptable... really, anything would help!

EDIT: .{3}(?=.{5}$) from Mark's answer works for me with the example text I gave in the OP, and it's a good thing to know for cases where I can count from the $ end of line. But I'm realizing I actually need the opposite: I need to count from the ^ start of line. Is this not possible, given the comments about there being no support for lookbehind?

2 solutions

#1

With regex alone it's possible, just not in a JavaScript engine that lacks lookbehind. The regex (?<=^.{4}).+(?=.{5}$) matches everything between the 4th character and the 5th-to-last character. JavaScript did not support positive lookbehind when this was written (it was added in ES2018), so in an older engine you'll have to use some amount of JavaScript beyond a simple .replace(regex, "") to remove those characters.
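
For engines that do support lookbehind (for example, JavaScript engines implementing ES2018 or later), here is a minimal sketch of counting from the start of the line, as the question's EDIT asks. The function name removeChars5to7 is hypothetical, and the pattern is a variation on the one above that drops the end-of-line lookahead so that lines of any length work.

```javascript
// Hypothetical helper: delete characters 5 through 7 of every line.
// Assumes the engine supports lookbehind (ES2018 or later).
function removeChars5to7(text) {
  // (?<=^.{4}) requires exactly 4 characters between the line start and the match.
  // The m flag lets ^ match at the start of each line; g applies it to every line.
  return text.replace(/(?<=^.{4}).{3}/gm, "");
}

console.log(removeChars5to7("TESTINGREGEXCANBEFUN\nISNTTHEBEST!ANDFARFROMFUN"));
// TESTREGEXCANBEFUN
// ISNTBEST!ANDFARFROMFUN
```

Because . does not match a newline, the lookbehind cannot reach back into a previous line, so each line is handled independently.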

The closest regex available in JavaScript without lookbehind is .{3}(?=.{5}$), which matches the 3 characters immediately before the last 5 characters of the string.
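
A short sketch of how this pattern behaves when passed to .replace, and why it only suits lines whose tail length is known. The function name removeThreeBeforeLastFive is hypothetical.

```javascript
// Hypothetical helper: delete the 3 characters that sit just before the last 5.
function removeThreeBeforeLastFive(line) {
  // The lookahead (?=.{5}$) pins the match so that exactly 5 characters follow it.
  return line.replace(/.{3}(?=.{5}$)/, "");
}

console.log(removeThreeBeforeLastFive("TESTINGREGEX"));         // TESTREGEX
console.log(removeThreeBeforeLastFive("ISNTTHEBEST!"));         // ISNTBEST!
// The position is counted from the end, so a longer line loses different characters:
console.log(removeThreeBeforeLastFive("TESTINGREGEXCANBEFUN")); // TESTINGREGEXBEFUN
```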

A pure regex match in JavaScript that begins a few characters after the start of the string, with nothing before it included in the match, is not possible without lookbehind.

#2

The regex ^(.{4}).{3}(.{5})$ (expressed in JavaScript's dialect, but the features it uses are quite common) gives you two capture groups that you can combine to get the output you describe:

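function test(str) {
  // Group 1 is the first 4 characters, group 2 the last 5; the 3 in between are not captured.
  var match = str.match(/^(.{4}).{3}(.{5})$/);
  if (!match) {
    // With the $ anchor, only lines of exactly 12 characters match.
    console.log(str, '=>', 'no match');
    return;
  }
  console.log(str, '=>', match[1] + match[2]);
}
test("TESTINGREGEX");
test("ISNTTHEBEST!");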

If the lines vary in length and you want to ignore everything after the part you want, just drop the $ assertion at the end.
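
If you instead want to keep the rest of each line, as in the question's second example, one possible variation (not part of the answer above) is to capture only the first 4 characters and put them back with a replacement reference. This is a sketch; the function name dropThreeAfterFour is hypothetical. It uses only capture groups, so it does not depend on lookbehind support, and the same find pattern with $1 as the replacement may also work in an editor's find-and-replace if that editor uses JavaScript-style replacement syntax.

```javascript
// Hypothetical helper: on every line, keep the first 4 characters,
// drop the next 3, and leave the remainder of the line untouched.
function dropThreeAfterFour(text) {
  // $1 re-inserts the captured first 4 characters; the uncaptured .{3} is discarded.
  // The m flag anchors ^ at each line start; g repeats the replacement for every line.
  return text.replace(/^(.{4}).{3}/gm, "$1");
}

console.log(dropThreeAfterFour("TESTINGREGEXCANBEFUN\nISNTTHEBEST!ANDFARFROMFUN"));
// TESTREGEXCANBEFUN
// ISNTBEST!ANDFARFROMFUN
```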
