Truncate a string containing emoji or Unicode characters at a word or character boundary

Time: 2023-02-05 21:36:16

How can I truncate a string at a given length without cutting through a Unicode character that happens to straddle the cut point? How can I determine the index at which a Unicode character begins in a string, so that I can avoid creating ugly strings? The square with half of an A visible marks the location of another emoji character that has been truncated.

-(NSMutableAttributedString*)constructStatusAttributedStringWithRange:(CFRange)range
{

NSString *original = [_postDictionay objectForKey:@"message"];

NSMutableString *truncated = [NSMutableString string];

NSArray *components = [original componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];

for(int x=0; x<[components count]; x++)
{
    //If the truncated string is still shorter than the range desired. (leave space for ...)
    if([truncated length]+[[components objectAtIndex:x] length]<range.length-3)
    {
        //Just checking if its the first word
        if([truncated length]==0 && x==0)
        {
            //start off the string
            [truncated appendString:[components objectAtIndex:0]];
        }
        else
        {
            //append a new word to the string
            [truncated appendFormat:@" %@",[components objectAtIndex:x]];
        }

    }
    else
    {
        x=[components count];
    }
}

if([truncated length]==0 || [truncated length]< range.length-20)
{
    truncated = [NSMutableString stringWithString:[original substringWithRange:NSMakeRange(range.location, range.length-3)]];
}

[truncated appendString:@"..."];

NSMutableAttributedString *statusString = [[NSMutableAttributedString alloc]initWithString:truncated];
[statusString addAttribute:(id)kCTFontAttributeName value:[StyleSingleton streamStatusFont] range:NSMakeRange(0, [statusString length])];
[statusString addAttribute:(id)kCTForegroundColorAttributeName value:(id)[StyleSingleton streamStatusColor].CGColor range:NSMakeRange(0, [statusString length])];

return statusString;

}

UPDATE: Thanks to the answer, I was able to use one simple method for my needs!

-(NSMutableAttributedString*)constructStatusAttributedStringWithRange:(CFRange)range
{
NSString *original = [_postDictionay objectForKey:@"message"];

NSMutableString *truncated = [NSMutableString stringWithString:[original substringWithRange:[original rangeOfComposedCharacterSequencesForRange:NSMakeRange(range.location, range.length-3)]]];
[truncated appendString:@"..."];

NSMutableAttributedString *statusString = [[NSMutableAttributedString alloc]initWithString:truncated];
[statusString addAttribute:(id)kCTFontAttributeName value:[StyleSingleton streamStatusFont] range:NSMakeRange(0, [statusString length])];
[statusString addAttribute:(id)kCTForegroundColorAttributeName value:(id)[StyleSingleton streamStatusColor].CGColor range:NSMakeRange(0, [statusString length])];

return statusString;

}
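
One thing to watch in the updated method: `substringWithRange:` raises `NSRangeException` when the range runs past the end of the string, and `range.length-3` wraps around to a huge unsigned value if the length is below 3. Below is a minimal sketch of a bounds-checked variant. `TruncatedStatus` and its parameters are hypothetical names, not part of the original post, and it assumes the cut always starts at index 0.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): truncates `text` to roughly
// `maxLength` UTF-16 units, including the trailing "...".
static NSString *TruncatedStatus(NSString *text, NSUInteger maxLength) {
    NSString *ellipsis = @"...";
    if (text.length <= maxLength) {
        return text; // already fits, nothing to cut
    }
    // Guard against unsigned underflow when maxLength is not larger than the ellipsis.
    if (maxLength <= ellipsis.length) {
        return ellipsis;
    }
    NSUInteger keep = maxLength - ellipsis.length;
    // keep < text.length here, so the requested range lies inside the string.
    // The returned range is widened to cover whole composed character sequences.
    NSRange safe = [text rangeOfComposedCharacterSequencesForRange:NSMakeRange(0, keep)];
    return [[text substringWithRange:safe] stringByAppendingString:ellipsis];
}
```

Because the range is widened rather than shrunk, the result can come out slightly longer than `maxLength` when the cut point lands inside an emoji.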

2 Answers

#1


14  

NSString has a method, rangeOfComposedCharacterSequencesForRange:, that you can use to find the enclosing range in the string that contains only complete composed characters. For example:

NSString *s = @"😀";
NSRange r = [s rangeOfComposedCharacterSequencesForRange:NSMakeRange(0, 1)];

gives the range { 0, 2 } because the emoji character is stored as two UTF-16 code units (a surrogate pair) in the string.
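
Since that method widens the range, a truncated string can end up longer than the requested limit. If the limit is a hard one, a variant that rounds the cut point down may fit better. The sketch below uses `rangeOfComposedCharacterSequenceAtIndex:` for that; `SafeCutIndex` is a hypothetical name introduced here for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the largest index <= limit at which the string
// can be cut without splitting a composed character sequence.
static NSUInteger SafeCutIndex(NSString *text, NSUInteger limit) {
    if (limit >= text.length) {
        return text.length; // the whole string fits
    }
    // Range of the sequence that contains the UTF-16 unit at `limit`.
    // If it starts before `limit`, cutting at `limit` would split it,
    // so the cut moves back to the start of that sequence.
    NSRange sequence = [text rangeOfComposedCharacterSequenceAtIndex:limit];
    return sequence.location;
}
```

With this, `[text substringToIndex:SafeCutIndex(text, limit)]` never exceeds `limit` UTF-16 units and never ends on half of a surrogate pair.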

Remark: You could also check whether your first loop can be simplified by using

enumerateSubstringsInRange:options:usingBlock:

with the NSStringEnumerationByWords option.
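
A rough sketch of what that word-based loop could look like is shown below. `TruncateByWords` is a hypothetical name, the limit is assumed to be counted in UTF-16 units, and appending the "..." is left to the caller.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: keeps whole words for as long as the result stays
// within `maxLength` UTF-16 units.
static NSString *TruncateByWords(NSString *text, NSUInteger maxLength) {
    if (text.length <= maxLength) {
        return text;
    }
    __block NSUInteger end = 0; // end of the last word that still fits
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:NSStringEnumerationByWords
                          usingBlock:^(NSString *word, NSRange wordRange,
                                       NSRange enclosingRange, BOOL *stop) {
        if (NSMaxRange(wordRange) > maxLength) {
            *stop = YES;                  // this word would cross the limit
        } else {
            end = NSMaxRange(wordRange);  // word fits, extend the cut point
        }
    }];
    return [text substringToIndex:end];
}
```

If not even the first word fits, this returns an empty string, so a caller may want to fall back to a character-based cut in that case, much like the original loop does.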

#2


2  

"truncate a string at a given length" <-- Do you mean length as in byte length or length as in number of characters? If the latter, then a simple substringToIndex: will suffice (check the bounds first though). If the former, then I'm afraid you'll have to do something like:

NSString *TruncateString(NSString *original, NSUInteger maxBytesToRead, NSStringEncoding targetEncoding) {
    NSMutableString *truncatedString = [NSMutableString string];

    NSUInteger bytesRead = 0;
    NSUInteger charIdx = 0;

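    while (charIdx < [original length]) {
        // Take one whole composed character sequence (e.g. a surrogate pair),
        // never a single UTF-16 unit that might be half of a character.
        NSRange charRange = [original rangeOfComposedCharacterSequenceAtIndex:charIdx];
        NSString *character = [original substringWithRange:charRange];

        bytesRead += [character lengthOfBytesUsingEncoding:targetEncoding];

        // Stop as soon as the next character would exceed the byte budget.
        if (bytesRead > maxBytesToRead)
            break;

        [truncatedString appendString:character];
        charIdx = NSMaxRange(charRange);
    }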

    return truncatedString;
}
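
To make the byte-length versus character-count distinction concrete, here is a small sketch that compares the three ways of measuring the same string. `ComposedCharacterCount` is a hypothetical helper name, not an NSString API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: counts user-perceived characters, i.e. composed
// character sequences, rather than UTF-16 units or bytes.
static NSUInteger ComposedCharacterCount(NSString *text) {
    __block NSUInteger count = 0;
    NSStringEnumerationOptions options =
        NSStringEnumerationByComposedCharacterSequences |
        NSStringEnumerationSubstringNotRequired; // ranges only, no substrings
    [text enumerateSubstringsInRange:NSMakeRange(0, text.length)
                             options:options
                          usingBlock:^(NSString *substring, NSRange range,
                                       NSRange enclosingRange, BOOL *stop) {
        count++;
    }];
    return count;
}
```

For a string made of the letter "a" followed by one emoji outside the Basic Multilingual Plane, `length` reports 3 UTF-16 units, `lengthOfBytesUsingEncoding:NSUTF8StringEncoding` reports 5 bytes, and the helper reports 2. Note that `substringToIndex:` works in UTF-16 units, so it can still split such an emoji.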

EDIT: Your code can be rewritten as follows:

NSString *original = [_postDictionay objectForKey:@"message"];

NSArray *characters = [[original componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"SELF != ''"]];

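// subarrayWithRange: expects an NSRange, so convert the CFRange first.
// Note that the range here counts words, not characters.
NSArray *truncatedCharacters = [characters subarrayWithRange:NSMakeRange(range.location, range.length)];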

NSString *truncated = [NSString stringWithFormat:@"%@...", [truncatedCharacters componentsJoinedByString:@" "]];
