Problem:
Let's say I have the following string:
<p><span style=\"font-weight:bold;\">Description:</span>Thomas is currently
developing a enterprise resource management course for Pluralsight </p>
I am trying to use Regex.Replace to remove <span style=\"font-weight:bold;\">Description:</span>
Often the start tag and end tag will not be present, so both of them must be optional. They also won't always be spans. The only thing I can guarantee is that the word "Description:" will be present.
What I've tried:
This was as close as I could get:
(?:<.*>)?Description:(?:<\/.*>)?
Unfortunately, the leading optional group is also grabbing the opening p tag. I need it to match at most one start tag and at most one end tag.
Also, when I use it in a call like this:
Regex.Replace(text, @"(?:<.*>)?Description:(?:<\\/.*>)?", "")
I get back:
</span>Thomas is currently developing a enterprise resource management course for Pluralsight </p>
The closing span tag is still there, even though it should have been removed, and the opening p tag is missing.
EDIT: Although this is similar to the thread that @kblok posted, I only want to remove the first surrounding tag, if it's present. That thread is about removing all surrounding tags, hence my problem with the p tag being removed.
2 solutions
#1
1
Assuming you don't need to worry about quoted angle brackets, you could use
(?:<[^<]*>)?Description:(?:<\/[^<]*>)?
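To see what the negated character class changes, here is a minimal C# sketch that runs the question's call and the pattern above against the sample string. The class and method names are made up for illustration. Two things appear to explain the question's output: `.*` is greedy, so the first group can span both `<p>` and the span's opening tag, and inside a verbatim string `\\/` reaches the regex engine as an escaped backslash followed by a slash, so the closing-tag group never matches `</span>`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsNegatedClass
{
    // Sample input from the question as a verbatim string ("" is one quote).
    public const string Sample =
        @"<p><span style=""font-weight:bold;"">Description:</span>Thomas is currently developing a enterprise resource management course for Pluralsight </p>";

    // The question's call as written: greedy .* and \\/ in a verbatim string.
    public static string Original(string text) =>
        Regex.Replace(text, @"(?:<.*>)?Description:(?:<\\/.*>)?", "");

    // The pattern suggested above: [^<]* cannot run past the next '<',
    // so each optional group covers at most one tag.
    public static string NegatedClass(string text) =>
        Regex.Replace(text, @"(?:<[^<]*>)?Description:(?:<\/[^<]*>)?", "");

    public static void Main()
    {
        // Leaves "</span>" behind and drops "<p>", as reported in the question.
        Console.WriteLine(Original(Sample));
        // Removes only the span wrapper and "Description:".
        Console.WriteLine(NegatedClass(Sample));
    }
}
```

With the second pattern the output starts with `<p>Thomas` and the closing `</span>` is gone.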
Here is an improved pattern that requires the start and end tag names to match, only considers the tags directly around Description:, and still removes Description: when no tags are present.
(?:(?<open><)(?<start>[^ >]+)[^<>]*>)?Description:\k<open>\/?\k<start>>|Description:
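A small sketch of how this pattern behaves, assuming default .NET regex options (where a backreference to a group that did not participate fails, so the second alternative takes over). `MatchedTagRemoval` and `Strip` are hypothetical names, and the inputs are shortened variants of the question's string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MatchedTagRemoval
{
    // Pattern from the answer above, unchanged.
    private static readonly Regex Pattern = new Regex(
        @"(?:(?<open><)(?<start>[^ >]+)[^<>]*>)?Description:\k<open>\/?\k<start>>|Description:");

    // Hypothetical helper: remove every match of the pattern.
    public static string Strip(string text) => Pattern.Replace(text, "");

    public static void Main()
    {
        // Matching wrapper: the whole <span ...>Description:</span> is removed.
        Console.WriteLine(Strip(@"<p><span style=""font-weight:bold;"">Description:</span>Thomas</p>"));
        // No wrapper: only "Description:" is removed, <p> is kept.
        Console.WriteLine(Strip("<p>Description: Thomas</p>"));
        // Mismatched names: the tags are left alone, only the word goes.
        Console.WriteLine(Strip("<b>Description:</i>text"));
    }
}
```

In the second case the space after the colon stays, because the pattern does not consume surrounding whitespace.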
#2
0
This pattern explicitly excludes <p> tags.
(?:<(?!p>|/)[^<>]*>)?Description:(?:</[^<>]*>)?
This one does the same, but is stricter about matching the opening and closing tags. It also allows whitespace between the tags and Description:.
(?:<(?!p>|/)(?<tag>[^ >]+)(?=[ >])[^<>]*>)?\s*Description:\s*(?:<\/\k<tag>[^<>]*>)?
Considering VDWWD's warning, even this ugly pattern may be naive given all the possible variations in HTML formatting, but it should at least match the well-formed, simple cases you've described.
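As a rough check of the stricter pattern, the sketch below runs it over a few simple inputs. `StrictTagRemoval` and `Strip` are hypothetical names, and the inputs are shortened variants of the question's string, not data from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictTagRemoval
{
    // Stricter pattern from the answer above, unchanged.
    private static readonly Regex Pattern = new Regex(
        @"(?:<(?!p>|/)(?<tag>[^ >]+)(?=[ >])[^<>]*>)?\s*Description:\s*(?:<\/\k<tag>[^<>]*>)?");

    // Hypothetical helper: remove every match of the pattern.
    public static string Strip(string text) => Pattern.Replace(text, "");

    public static void Main()
    {
        // Span wrapper with attributes: removed, <p> is kept.
        Console.WriteLine(Strip(@"<p><span style=""font-weight:bold;"">Description:</span>Thomas</p>"));
        // Whitespace between the tags and the word is consumed as well.
        Console.WriteLine(Strip("<p><b> Description: </b>Thomas</p>"));
        // No wrapper: "Description:" and the following space are removed.
        Console.WriteLine(Strip("<p>Description: Thomas</p>"));
    }
}
```

All three calls should print `<p>Thomas</p>`. One thing to keep in mind: the lookahead `(?!p>|/)` only rejects a bare `<p>`, so a paragraph tag with attributes, such as `<p class="x">`, would still be treated as the opening tag and removed.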