How do I extract strings between double quotes using NSRegularExpression?

Time: 2022-09-15 13:37:18

Hi, I need to extract the data between double quotes. Suppose my string is:

"""rach"",""jkdj""","""abc"",13","123,4.45,""19""","3.33,""123"",""2.221"""

My strings come from CSV files. I need to extract the data between the double quotes, and I am trying to do this with NSRegularExpression.

My regex pattern is:

NSString *exp = @"\".+\"";

With this I just get the entire string back as a single match. Where am I going wrong? How can I get ""rach"", ""jkdj"", ""abc"", 13 and so on?

Thank you @Derek. Your reply helped me a great deal. My data is weird because I'm just trying various combinations of data in a CSV file. An entry in the CSV file can be any of numerous combinations: there may or may not be commas or double quotes in the data itself. What I want is just the data between double quotes (it is not a problem if the double quotes themselves are included). I hope I've explained what I want. With your help, I have written a regex for this string.

NSString *exp=@"[^,]\"*[^,]*,(([^,]\"*?,*?)*|(\"*[^,]*\"*)*)";

Here exp is my regex.

"""pav"",""ani""","""abc"",13","123,4.45,""19""","3.33,""123"",""2.221"""

And this is my string. The first quoted field contains ""pav"",""ani"". The second contains ""abc"",13. The third contains 123,4.45,""19"". The fourth contains 3.33,""123"",""2.221"". I need each of these as one match; it is not a problem if the surrounding double quotes are included.

I ought to get the following matches:

"""pav"",""ani"""
"""abc"",13"
"123,4.45,""19"""
"3.33,""123"",""2.221"""

But this is what I get with the regex I mentioned:

2013-09-20 11:09:04.398 regexPractice[13968] match: """pav"",""ani"""
2013-09-20 11:09:04.425 regexPractice[13968] match: """abc"",13"
2013-09-20 11:09:04.434 regexPractice[13968] match: "123,4.45
2013-09-20 11:09:04.442 regexPractice[13968] match: ""19""","3.33
2013-09-20 11:09:04.454 regexPractice[13968] match: ""123"",""2.221"""

I can see that the regex needs a slight change, but I can't find where.

Any clues? TIA

2 solutions

#1


1  

I found that the following seems to work:

\"\"[^"]+\"\"

The logic is: quote, quote, one or more characters that are not a quote, quote, quote.

If you want to capture only the inner text, you can put parentheses around the "one or more characters that are not a quote" part:

\"\"([^"]+)\"\"

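As a rough illustration of how the capturing version could be used from Objective-C, here is a minimal sketch. InnerValues is a hypothetical helper name of my own, not a Foundation API; it assumes the pattern above, with every quote escaped as \" inside the string literal, and reads capture group 1 through rangeAtIndex:.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns the text captured by
// group 1 for every match of ""([^"]+)"" in `input`.
static NSArray *InnerValues(NSString *input) {
    // Regex: ""([^"]+)""  (each " is written as \" in the Objective-C literal)
    NSString *pattern = @"\"\"([^\"]+)\"\"";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }
    NSMutableArray *values = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, [input length])];
    for (NSTextCheckingResult *match in matches) {
        // Range 0 is the whole match; range 1 is the parenthesised group.
        [values addObject:[input substringWithRange:[match rangeAtIndex:1]]];
    }
    return values;
}
```

For the rach/jkdj line in the question, this should give rach, jkdj, abc, 19, 123 and 2.221. Values that are not wrapped in doubled quotes (13, 4.45 and so on) are skipped by this pattern.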
#2


0  

OK, maybe this is what you want:

\"\"\".+?\"\"\"

.+? is lazy: it matches as few characters as possible before the closing quotes. I was always wondering how to use lazy operators.
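To compare the greedy pattern from the question with the lazy one here, a small sketch along these lines could help. MatchedSubstrings is a hypothetical helper name of my own, not a Foundation API; it only wraps NSRegularExpression's matchesInString:options:range: and returns the full text of each match.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns every full match of
// `pattern` in `input` as an array of NSString.
static NSArray *MatchedSubstrings(NSString *pattern, NSString *input) {
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }
    NSMutableArray *found = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:input
                                      options:0
                                        range:NSMakeRange(0, [input length])];
    for (NSTextCheckingResult *match in matches) {
        // [match range] is the range of the whole match in `input`.
        [found addObject:[input substringWithRange:[match range]]];
    }
    return found;
}
```

Called with @"\".+\"" on a single line, this should return one match running from the first quote to the last, because the greedy .+ consumes as much as it can and backtracks only as far as the final quote. Called with @"\"\"\".+?\"\"\"", each match stops at the first triple quote that follows it.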

But there seems to be something odd about how your data is defined: I was looking for triple quotes.

Here is the full line:

"""rach"",""jkdj""","""abc"",13","123,4.45,""19""","3.33,""123"",""2.221"""

Manually splitting on sets of triple quotes:

"""rach"",""jkdj"""

"""abc"",13","123,4.45,""19"""

"3.33,""123"",""2.221""" -- this one doesn't have a triple quote at the start
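Since the triple-quote split does not line up with the fields, another option is to match one whole quoted CSV field at a time: an opening quote, then any run of non-quote characters or doubled quotes, then a closing quote. This is not the pattern from either answer, just a sketch under the assumption that every field is wrapped in double quotes and that quotes inside a field are always doubled. QuotedFields is a hypothetical helper name, not a Foundation API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a Foundation API): returns each quoted CSV field
// in `line`, outer quotes included. Assumes every field is quoted and that
// embedded quotes are doubled.
static NSArray *QuotedFields(NSString *line) {
    // Regex: "(?:[^"]|"")*"
    //   "        opening quote
    //   [^"]     any character that is not a quote, or
    //   ""       a doubled (escaped) quote, repeated
    //   "        closing quote
    NSString *pattern = @"\"(?:[^\"]|\"\")*\"";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return [NSArray array];
    }
    NSMutableArray *fields = [NSMutableArray array];
    NSArray *matches = [regex matchesInString:line
                                      options:0
                                        range:NSMakeRange(0, [line length])];
    for (NSTextCheckingResult *match in matches) {
        [fields addObject:[line substringWithRange:[match range]]];
    }
    return fields;
}
```

On the pav/ani line from the question, this should produce the four matches listed there, with the surrounding quotes included. It would not pick up unquoted fields, so mixed input would need a different pattern or a real CSV parser.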
