Regular expression with lookahead in Ruby

Time: 2022-04-03 15:45:46

My current regex battle is replacing every comma that appears before a number in a string. The regex must then ignore all commas after that number. I've been playing around on Rubular for about an hour and can't quite get anything working.

Test String...

'this is, a , sentence33 Here, is another.'

Desired Output...

'this is comma a comma sentence33 Here, is another.'

So something along the lines of...

testString.gsub(/\,*\d\d/,"comma")

To give you some background, I'm doing a little scraping side project. The elements I'm gathering are largely comma-separated, beginning with a two-digit age. However, sometimes there's a headline preceding the age that may contain commas. To preserve the structure I set up later on, I need to replace the commas in the headline.

AFTER TRYING STACK OVERFLOW'S ANSWER...

I'm still having some issues. Don't laugh, but here's the actual line from the screen scraping that's causing problems...

statsString =     "              23,  5'9\",  140lb,  29w,                        Slim,                 Brown       Hair,             Shaved Body,              White,    Looking for       Friendship,    1-on-1 Sex,    Relationship.   Out      Yes,SmokeNo,DrinkNo,DrugsNo,ZodiacCancer.      Versatile,                  7.5\"                    Cut, Safe Sex Only,     HIV      Negative, Prefer meeting at:Public Place.                   PerformerContact  xxxxxx87                                                   This user has TURNED OFF his IM                                     Send Smile      Write xxxxxx87 a message:" 

First, I prepend 'xx, ' to all of these fragments so that my comma filtering will work in all cases, both with and without text ahead of the age. Then I apply the actual fix. The output is below...

statsString = 'xx, ' + statsString

statsString = statsString.gsub(/\,(?=.*\d)/, 'comma');

 => "xxcomma               23comma  5'9\"comma  140lbcomma  29wcomma                        Slimcomma                 Brown       Haircomma             Shaved Bodycomma              Whitecomma    Looking for       Friendshipcomma    1-on-1 Sexcomma    Relationship.   Out      YescommaSmokeNocommaDrinkNocommaDrugsNocommaZodiacCancer.      Versatilecomma                  7.5\"                    Cutcomma Safe Sex Onlycomma     HIV      Negativecomma Prefer meeting at:Public Place.                   PerformerContact  xxxxx87                                                   This user has TURNED OFF his IM                                     Send Smile      Write xxxxxxx87 a message:" 
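
The lookahead `(?=.*\d)` only asks whether a digit occurs anywhere later in the string. Since this line contains digits near its end (the "87" in the username), every comma in it qualifies. One possible way around that, as a rough sketch: split the string at the first two-digit run and replace commas only in the part before it. The helper name `replace_leading_commas` is made up for illustration, and the sketch assumes the age is the first two-digit number in the fragment.

```ruby
# Hypothetical helper: replaces commas only in the text that precedes
# the first run of two digits (assumed here to be the age).
def replace_leading_commas(text, replacement = 'comma')
  # String#partition returns [before, match, after];
  # when nothing matches it returns [text, "", ""].
  head, digits, tail = text.partition(/\d\d/)
  return text if digits.empty?

  head.gsub(',', replacement) + digits + tail
end

puts replace_leading_commas('this is, a , sentence33 Here, is another.')
# prints: this iscomma a comma sentence33 Here, is another.
```

With this approach a fragment that starts with the age is returned unchanged, so the 'xx, ' prefix should not be needed.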

2 solutions

#1


2  

Code:

testString = 'this is, a , sentence33 Here, is another.';
result = testString.gsub(/\,(?=.*\d)/, 'comma');
print result;

Output:

this iscomma a comma sentence33 Here, is another.

Test:

http://ideone.com/9nt1b

#2


1  

Not as short, but it seems to solve your task:

str = 'this is, a , sentence33 Here, is another.'

str = str.match(/(.*)(\d+.*)/) do

    before = $1
    tail = $2

    before.gsub( /,/, 'comma' ) + tail
end

print str
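
One caveat worth checking: `(.*)` is greedy, so the split above happens at the last digit in the string, not the first. That makes no difference for the test sentence, but it does for input with several numbers. A small sketch comparing the greedy pattern with a lazy `(.*?)` variant follows; both method names are invented for this illustration, and the `/m` flag on the lazy version is an assumption so that `.` also crosses newlines.

```ruby
# Greedy pattern from the answer: (.*) backtracks only as far as the LAST digit.
def replace_before_last_digit(str)
  m = str.match(/(.*)(\d+.*)/)
  m ? m[1].gsub(',', 'comma') + m[2] : str
end

# Lazy variant: (.*?) stops at the FIRST digit it can.
def replace_before_first_digit(str)
  m = str.match(/\A(.*?)(\d.*)\z/m)
  m ? m[1].gsub(',', 'comma') + m[2] : str
end

sample = 'a, b 12, c 34, d'
puts replace_before_last_digit(sample)   # prints: acomma b 12comma c 34, d
puts replace_before_first_digit(sample)  # prints: acomma b 12, c 34, d
```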
