I am trying to parse a string and extract another string from the middle of it.
For example:
Hello world this is a string
I need to find the string between "world" and "is" (here, "this"). I have looked around but haven't been able to figure it out yet, mainly because I am new to Objective-C. Does anyone have an idea of how to do this, with or without regular expressions?
4 solutions
#1
30
The regular expression solution that Jacques gives works, and the caveat that it requires iOS 4.0 or later is true. Regular expressions are also comparatively slow, and overkill if the search expressions are known string constants.
You can solve the problem using methods on NSString, or a class named NSScanner. Both have been available since iPhone OS 2.0, and long before that, since before Mac OS X 10.0 actually :).
So what you want is a new method on NSString like this?
@interface NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end;
@end
No problem, and we assume we should return nil if no such string could be found.
The implementation using only NSString is quite straightforward:
@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location != NSNotFound) {
        // Search for the end string only in the text after the start string.
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location != NSNotFound) {
            // Shrink the range so that it stops where the end string begins.
            targetRange.length = endRange.location - targetRange.location;
            return [self substringWithRange:targetRange];
        }
    }
    return nil;
}
@end
Or you could do the implementation using the NSScanner class:
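@implementation NSString (CWAddition)
- (NSString*) stringBetweenString:(NSString*)start andString:(NSString*)end {
    NSScanner* scanner = [NSScanner scannerWithString:self];
    // Do not skip whitespace, so the result keeps its exact characters.
    [scanner setCharactersToBeSkipped:nil];
    [scanner scanUpToString:start intoString:NULL];
    if ([scanner scanString:start intoString:NULL]) {
        NSString* result = nil;
        // scanUpToString: also succeeds when end is never found (it then scans
        // to the end of the string), so check that the scanner stopped early.
        if ([scanner scanUpToString:end intoString:&result] && ![scanner isAtEnd]) {
            return result;
        }
    }
    return nil;
}
@end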
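One thing to watch for with either implementation: the delimiters are matched as plain substrings, not as words. A minimal sketch of what that means for the string in the question follows. The category is repeated (in a slightly condensed form) only so the snippet compiles on its own, and it assumes a reasonably recent clang that supports @autoreleasepool.

```objc
#import <Foundation/Foundation.h>

@interface NSString (CWAddition)
- (NSString *)stringBetweenString:(NSString *)start andString:(NSString *)end;
@end

@implementation NSString (CWAddition)
- (NSString *)stringBetweenString:(NSString *)start andString:(NSString *)end {
    NSRange startRange = [self rangeOfString:start];
    if (startRange.location == NSNotFound) {
        return nil;
    }
    // Everything after the start string is the search area for the end string.
    NSUInteger from = NSMaxRange(startRange);
    NSRange rest = NSMakeRange(from, [self length] - from);
    NSRange endRange = [self rangeOfString:end options:0 range:rest];
    if (endRange.location == NSNotFound) {
        return nil;
    }
    return [self substringWithRange:NSMakeRange(from, endRange.location - from)];
}
@end

int main(void) {
    @autoreleasepool {
        NSString *input = @"Hello world this is a string";

        // "is" is first found inside "this", so the result is " th".
        NSLog(@"[%@]", [input stringBetweenString:@"world" andString:@"is"]);

        // Including the surrounding spaces in the delimiters yields "this".
        NSLog(@"[%@]", [input stringBetweenString:@"world " andString:@" is"]);

        // A missing start string yields nil.
        NSLog(@"[%@]", [input stringBetweenString:@"planet" andString:@"is"]);
    }
    return 0;
}
```

If the delimiters have to match whole words regardless of the surrounding punctuation, the regular expression approach with \b is probably the better fit.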
#2
12
Just a simple modification to PeyloW's answer that returns all strings found between the start and end strings:
- (NSMutableArray*)stringsBetweenString:(NSString*)start andString:(NSString*)end
{
    NSMutableArray* strings = [NSMutableArray arrayWithCapacity:0];
    NSRange startRange = [self rangeOfString:start];
    while (startRange.location != NSNotFound)
    {
        // Search for the end string in everything after the start string.
        NSRange targetRange;
        targetRange.location = startRange.location + startRange.length;
        targetRange.length = [self length] - targetRange.location;
        NSRange endRange = [self rangeOfString:end options:0 range:targetRange];
        if (endRange.location == NSNotFound)
        {
            break;
        }
        targetRange.length = endRange.location - targetRange.location;
        [strings addObject:[self substringWithRange:targetRange]];
        // Continue with the next start string after this end string.
        NSRange restOfString;
        restOfString.location = endRange.location + endRange.length;
        restOfString.length = [self length] - restOfString.location;
        startRange = [self rangeOfString:start options:0 range:restOfString];
    }
    return strings;
}
#3
3
See the ICU user guide on regular expressions.
If you know there'll just be one result:
NSRegularExpression *regex = [NSRegularExpression
    regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b" options:0 error:NULL];
NSTextCheckingResult *result = [regex firstMatchInString:string
    options:0 range:NSMakeRange(0, [string length])];
// Gets the string inside the first set of parentheses in the regex
// (nil if the pattern did not match at all)
NSString *inside = result ? [string substringWithRange:[result rangeAtIndex:1]] : nil;
The \b makes sure there's a word boundary before world and after is (so "hello world this isn't a string" wouldn't match). The \s+ gobbles up any whitespace after world and before is. The .+? finds what you're looking for, with the ? making it non-greedy so that "hello world this is a string hello world this is a string" doesn't give you "this is a string hello world this". In Objective-C source, each backslash has to be doubled inside the string literal (for example @"\\b").
I'll leave it up to you to figure out how to handle multiple matches. The NSRegularExpression documentation should help you out.
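For what it's worth, one way to handle multiple matches is matchesInString:options:range:, which returns every match at once. The following is only a sketch: the helper name AllStringsBetweenWorldAndIs is made up for illustration, and it assumes a clang recent enough to support @autoreleasepool.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collects capture group 1 of every match.
static NSArray *AllStringsBetweenWorldAndIs(NSString *string) {
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression
        regularExpressionWithPattern:@"\\bworld\\s+(.+?)\\s+is\\b"
                             options:0
                               error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }

    NSMutableArray *found = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:string
                                      options:0
                                        range:NSMakeRange(0, [string length])];
    for (NSTextCheckingResult *match in matches) {
        // Index 0 is the whole match, index 1 the first set of parentheses.
        [found addObject:[string substringWithRange:[match rangeAtIndex:1]]];
    }
    return found;
}

int main(void) {
    @autoreleasepool {
        NSString *text = @"hello world this is a string hello world that is a string";
        // The array contains "this" and "that".
        NSLog(@"%@", AllStringsBetweenWorldAndIs(text));
    }
    return 0;
}
```

For very long inputs, enumerateMatchesInString:options:range:usingBlock: avoids building the whole array of matches up front.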
If you want to make sure the match doesn't cross sentence boundaries, you could do ([^.]+?) instead of (.+?), or you could use enumerateSubstringsInRange:options:usingBlock: on your string and pass NSStringEnumerationBySentences in the options.
All of this requires iOS 4.0+. If you want to support 3.0+, look into RegexKitLite.
#4
1
If the strings happen to be separated only by whitespace, you can use the following code. Either
[string componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]
OR
NSMutableArray *parts = [NSMutableArray arrayWithCapacity:1];
NSScanner *scanner = [NSScanner scannerWithString:string];
NSString *token;
// Each pass reads one whitespace-delimited token; the scanner skips the
// whitespace itself by default before the next pass.
while ([scanner scanUpToCharactersFromSet:[NSCharacterSet whitespaceCharacterSet] intoString:&token]) {
    [parts addObject:token];
}
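Once the string is split into words, picking out what sits between "world" and "is" is a matter of array indexes. A rough sketch, assuming both delimiters appear as whole words; WordsBetween is a hypothetical helper name, not part of Foundation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: the words strictly between the first occurrence of
// `start` and the next occurrence of `end`, or nil if either is missing.
static NSArray *WordsBetween(NSString *string, NSString *start, NSString *end) {
    NSArray *words = [string componentsSeparatedByCharactersInSet:
                         [NSCharacterSet whitespaceCharacterSet]];
    NSUInteger startIndex = [words indexOfObject:start];
    if (startIndex == NSNotFound) {
        return nil;
    }
    // Only look for the end word after the start word.
    NSRange after = NSMakeRange(startIndex + 1, [words count] - startIndex - 1);
    NSUInteger endIndex = [words indexOfObject:end inRange:after];
    if (endIndex == NSNotFound) {
        return nil;
    }
    return [words subarrayWithRange:
               NSMakeRange(startIndex + 1, endIndex - startIndex - 1)];
}

int main(void) {
    @autoreleasepool {
        NSArray *between = WordsBetween(@"Hello world this is a string", @"world", @"is");
        // Joins the words back together; here that gives "this".
        NSLog(@"%@", [between componentsJoinedByString:@" "]);
    }
    return 0;
}
```

Note that componentsSeparatedByCharactersInSet: produces empty strings for runs of consecutive whitespace, so the scanner loop may be the better choice for untidy input.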