I have a string composed of words, some of which contain punctuation. I would like to remove the punctuation, but I have been unable to figure out how to do this.
For example, if I have something like
var words = "Hello, this : is .. a string?"
I would like to be able to create an array containing
["Hello", "this", "is", "a", "string"]
My original thought was to use something like words.stringByTrimmingCharactersInSet() to remove any characters I didn't want, but that only strips characters from the ends of the string.
I thought maybe I could iterate through the string with something in the vein of
for letter in words {
    if NSCharacterSet.punctuationCharacterSet.characterIsMember(letter){
        //remove that character from the string
    }
}
but I'm unsure how to remove the character from the string. I'm sure there are some problems with the way that if statement is set up, as well, but it shows my thought process.
6 Solutions
#1
11
Xcode 8.3.2 • Swift 3.1
import Foundation

extension String {
    // Strips punctuation, then splits on whitespace and drops empty pieces.
    var words: [String] {
        return components(separatedBy: .punctuationCharacters)
            .joined()
            .components(separatedBy: .whitespaces)
            .filter{!$0.isEmpty}
    }
}
let sentence = "Hello, this : is .. a string?"
let myWordList = sentence.words // ["Hello", "this", "is", "a", "string"]
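One caveat with the chain above: because the punctuation-separated pieces are joined back together before the whitespace split, words separated only by punctuation (for example "one,two") end up merged into "onetwo". A minimal sketch of an alternative, assuming Swift 4 or later with Foundation, splits on the union of both character sets in a single step. The function name splitIntoWords is hypothetical and not part of the answer:

```swift
import Foundation

// Hypothetical helper (not from the answer above): treats punctuation and
// whitespace alike as separators, then drops the empty pieces.
func splitIntoWords(_ text: String) -> [String] {
    let separators = CharacterSet.punctuationCharacters.union(.whitespacesAndNewlines)
    return text.components(separatedBy: separators).filter { !$0.isEmpty }
}

print(splitIntoWords("Hello, this : is .. a string?")) // ["Hello", "this", "is", "a", "string"]
print(splitIntoWords("one,two"))                       // ["one", "two"]
```

Note that apostrophes and hyphens are punctuation too, so with either approach a contraction such as "don't" is affected: this sketch splits it in two, while the extension above collapses it to "dont".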
#2
3
This works with Xcode 8.1, Swift 3:
First, define a general-purpose extension for filtering by CharacterSet:
import Foundation

extension String {
    // Returns a copy of the string with every character in forbiddenCharacters removed.
    func removingCharacters(inCharacterSet forbiddenCharacters: CharacterSet) -> String {
        var filteredString = self
        // Keep removing the first forbidden character until none remain.
        while let forbiddenCharRange = filteredString.rangeOfCharacter(from: forbiddenCharacters) {
            filteredString.removeSubrange(forbiddenCharRange)
        }
        return filteredString
    }
}
Then filter using punctuation:
let s:String = "Hello, world!"
s.removingCharacters(inCharacterSet: CharacterSet.punctuationCharacters) // => "Hello world"
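The loop above searches the string again from the beginning after every removal, which is fine for short strings but does repeated work on long ones. As a rough sketch of a single-pass variant, assuming Swift 4 or later, the string's Unicode scalars can be filtered directly. The method name removingScalars(in:) is hypothetical and chosen here only to avoid clashing with the extension above:

```swift
import Foundation

extension String {
    // Hypothetical single-pass variant: copies every Unicode scalar that is
    // NOT in the forbidden set into a new scalar view.
    func removingScalars(in forbidden: CharacterSet) -> String {
        var kept = String.UnicodeScalarView()
        for scalar in unicodeScalars where !forbidden.contains(scalar) {
            kept.append(scalar)
        }
        return String(kept)
    }
}

let greeting = "Hello, world!"
print(greeting.removingScalars(in: .punctuationCharacters)) // Hello world
```

Working at the scalar level matches how CharacterSet membership is defined, since CharacterSet.contains takes a Unicode.Scalar rather than a Character.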
#3
3
String has an enumerateSubstringsInRange() method. With the .ByWords option, it detects word boundaries and punctuation automatically:
Swift 3/4:
import Foundation

let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstrings(in: string.startIndex..<string.endIndex,
                           options: .byWords) {
    (substring, _, _, _) -> () in
    words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]
Swift 2:
let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstringsInRange(string.characters.indices,
                                  options: .ByWords) {
    (substring, _, _, _) -> () in
    words.append(substring!)
}
print(words) // [Hello, this, is, a, string]
#4
0
An alternative way to filter out a set of characters and obtain an array of words is to use the array's filter and reduce methods. It's not as compact as the other answers, but it shows how the same result can be obtained in a different way.
First, define the set of characters to remove:
let charactersToRemove = Set(Array(".:?,"))
Next, convert the input string into an array of characters:
let arrayOfChars = Array(words)
Now we can use reduce to build a string by appending the elements of arrayOfChars, skipping any that are included in charactersToRemove:
let filteredString = arrayOfChars.reduce("") {
    let str = String($1)
    return $0 + (charactersToRemove.contains($1) ? "" : str)
}
This produces a string without the punctuation characters (as defined in charactersToRemove).
The last two steps:
Split the string into an array of words, using the space character as the separator:
let arrayOfWords = filteredString.componentsSeparatedByString(" ")
Finally, remove all empty elements:
let finalArrayOfWords = arrayOfWords.filter { $0.isEmpty == false }
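The snippets in this answer use older Swift syntax (Array(words) on a String, componentsSeparatedByString). As a rough sketch of the same filter-then-split idea, assuming Swift 4 or later, where String is itself a collection of Character and filter returns a String, the steps might look like this. The variable names input, filtered and finalWords are illustrative only:

```swift
let input = "Hello, this : is .. a string?"

// The characters to drop, as in the answer above.
let charactersToRemove: Set<Character> = [".", ":", "?", ","]

// Keep only the characters that are not in the removal set.
let filtered = input.filter { !charactersToRemove.contains($0) }

// split(separator:) omits empty subsequences by default, so the
// extra "remove empty elements" step is not needed here.
let finalWords = filtered.split(separator: " ").map(String.init)

print(finalWords) // ["Hello", "this", "is", "a", "string"]
```

This version does not need Foundation, at the cost of listing the punctuation characters by hand.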
#5
0
The NSScanner way:
let words = "Hello, this : is .. a string?"

let scanner = NSScanner(string: words)
var wordArray:[String] = []
var word:NSString? = ""
// Only ASCII letters count as part of a word.
let letters = NSCharacterSet(charactersInString: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
while(!scanner.atEnd) {
    // Try to scan a run of letters at the current location.
    let sr = scanner.scanCharactersFromSet(letters, intoString: &word)
    if !sr {
        // Not a letter: skip one character and try again.
        scanner.scanLocation++
        continue
    }
    wordArray.append(String(word!))
}
println(wordArray)
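The code above is written against Swift 1 era APIs (NSScanner, println, the ++ operator). A hedged sketch of the same scanning loop in current Swift, assuming the newer Scanner methods scanCharacters(from:) and scanUpToCharacters(from:) are available (macOS 10.15 / iOS 13 or later), could look like the following. It uses CharacterSet.letters instead of a hand-written alphabet, which is a substitution made here and also accepts non-ASCII letters:

```swift
import Foundation

let text = "Hello, this : is .. a string?"
let scanner = Scanner(string: text)
// Disable the default whitespace skipping so every character is handled explicitly.
scanner.charactersToBeSkipped = nil

let letters = CharacterSet.letters
var scannedWords: [String] = []

while !scanner.isAtEnd {
    if let word = scanner.scanCharacters(from: letters) {
        // A run of letters was found at the current position.
        scannedWords.append(word)
    } else {
        // Not at a letter: skip ahead to the next letter (or to the end).
        _ = scanner.scanUpToCharacters(from: letters)
    }
}

print(scannedWords) // ["Hello", "this", "is", "a", "string"]
```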
#6
-1
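// Split on punctuation and join the pieces back together.
// (Using .invertedSet here would split on everything except punctuation and keep only the punctuation.)
let charactersToRemove = NSCharacterSet.punctuationCharacterSet()
let aWord = "".join(words.componentsSeparatedByCharactersInSet(charactersToRemove))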