How can I use JavaScript to mark up some text that spans multiple HTML tags?

I would like to use a regexp to mark up some text that may span any number of tags in HTML.

For example, given the regex "brown\ fox.*lazy\ dog", the HTML

<div>The quick brown <a href="fox.html">fox</a></div>
<div>jumps over</div>
<div>the lazy <a href="dog.html">dog</a></div>

would be transformed to

<div>The quick <strong>brown </strong><a href="fox.html"><strong>fox</strong></a></div>
<div><strong>jumps over</strong></div>
<div><strong>the lazy </strong><a href="dog.html"><strong>dog</strong></a></div>

An empty <strong> element between the closing tags would be fine too. Any JavaScript library may be used, and the solution can be browser-specific.

2 solutions

#1

I would do it in two passes: first locate the whole sentence, then wrap each word in <strong>.

Since I don't find it practical to build the regexes by hand, I generate them:

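var sentence = 'the quick brown fox jumps over the lazy dog';
var words = sentence.split(' ');
// Pass 1: the words in order, separated by optional whitespace and tags
var r1 = new RegExp(words.join('\\s*(?:<[^>]*>\\s*)*'), 'i');
// Pass 2: a tag (kept as is, so attribute values are not touched) or a word
var r2 = new RegExp('<[^>]*>|(' + words.join('|') + ')', 'gi');
// str holds the HTML to transform
str = str.replace(r1, function(match) {
  return match.replace(r2, function(m, word) {
    return word ? '<strong>' + word + '</strong>' : m;
  });
});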

I don't guarantee it works in every case, but I can't think of a failing one right now. The code ensures that the sentence is complete, that words outside the matched sentence are left alone, and that the words appear in the correct order.
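
The two-pass idea can be wrapped in a reusable function. What follows is a minimal sketch under a few assumptions, not the exact code above: the names `escapeRegExp` and `markSentence` are hypothetical, each word is escaped so that punctuation in the sentence cannot change the generated patterns, and the second pass matches whole tags first so that a word inside an attribute value (such as `fox` in `href="fox.html"`) is not wrapped.

```javascript
// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap every word of `sentence` in <strong> when the
// whole sentence occurs in `html`, possibly interrupted by tags.
function markSentence(html, sentence) {
  const words = sentence.split(' ').map(escapeRegExp);
  // Pass 1: the words in order, separated by optional whitespace and tags.
  const whole = new RegExp(words.join('\\s*(?:<[^>]*>\\s*)*'), 'i');
  // Pass 2: match either a tag (returned unchanged) or one of the words.
  const piece = new RegExp('<[^>]*>|(' + words.join('|') + ')', 'gi');
  return html.replace(whole, match =>
    match.replace(piece, (m, word) =>
      word ? '<strong>' + word + '</strong>' : m));
}

const html =
  '<div>The quick brown <a href="fox.html">fox</a></div>\n' +
  '<div>jumps over</div>\n' +
  '<div>the lazy <a href="dog.html">dog</a></div>';

console.log(markSentence(html, 'brown fox jumps over the lazy dog'));
```

Note that this wraps each word separately, so the spaces between words stay outside the <strong> elements, and the words are matched without word boundaries, so a short word could also match inside a longer one. If the sentence is not found, the input is returned unchanged.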

#2

I was hoping someone could come up with a simpler solution. Here's what I came up with. http://jsbin.com/usapej/4

// Initial values
var html = $('#text').html();
var re = /brown fox(.|[\r\n])*lazy dog/;
var openTag = "<strong>";
var closeTag = "</strong>";

// build a list of tags in the HTML
var tagRe = /<[^>]*>/g;
var matches = [];
var tagResult;
var offset = 0;
while((tagResult = tagRe.exec(html)) !== null) {
  // Make the index relative to the start of the string w/o the tags
  tagResult.index -= offset;
  offset += tagResult[0].length;
  matches.push(tagResult);
}

// put our markup in the HTML
var text = $('#text').text();
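var result = re.exec(text);
// Fail with a clear message instead of a TypeError when nothing matches
if (result === null) throw new Error('Pattern not found in #text');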
text = text.substring(0, result.index) + openTag + result[0] + closeTag + text.substring(result.index + result[0].length);

// Put the original tags back in surrounded by our close and open tags if it's inside our match
offset = 0;
var p;
for(var i = 0; i < matches.length; i++) {
  var m = matches[i];
  if(m.index <= result.index) {
    text = text.substring(0, m.index + offset) + m[0] + text.substring(m.index + offset);
    offset += m[0].length;
  } else if(m.index > result.index + result[0].length) {
    p = m.index + offset + openTag.length + closeTag.length;
    text = text.substring(0, p) + m[0] + text.substring(p);
    offset += m[0].length;
  } else {
    p = m.index + offset + openTag.length;
    var t = closeTag + m[0] + openTag;
    text = text.substring(0, p) + t + text.substring(p);
    offset += t.length;
  }
}

// put the HTML back into the document
$('#text').html(text);
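
The same idea (match against the tag-free text, then map the match back onto the HTML) can also be sketched without jQuery or offset bookkeeping. This is a variant, not the code above: the function name `markAcrossTags` is hypothetical, tags are recognized with the same simple `<[^>]*>` pattern, and whitespace-only runs between tags are deliberately left unwrapped so that no <strong> ends up between block elements. Entities such as `&amp;` are matched as written, which is an assumption that differs from using `.text()`.

```javascript
// Hypothetical helper: wrap the text matched by `re` in <strong>, splitting
// the wrapper around any tags that fall inside the match.
// `re` should not use the g flag, since exec() would then depend on lastIndex.
function markAcrossTags(html, re) {
  // With a capture group, split() keeps the tags: even indexes are text
  // runs, odd indexes are tags.
  const tokens = html.split(/(<[^>]*>)/);
  const text = tokens.filter((t, i) => i % 2 === 0).join('');
  const m = re.exec(text);
  if (m === null) return html; // nothing to mark
  const start = m.index;
  const end = start + m[0].length;

  let pos = 0; // offset of the current text run within `text`
  return tokens.map((t, i) => {
    if (i % 2 === 1) return t; // a tag: copy through unchanged
    // Part of this run that overlaps the match, relative to the run.
    const from = Math.max(start, pos) - pos;
    const to = Math.min(end, pos + t.length) - pos;
    pos += t.length;
    // Skip runs outside the match and whitespace-only overlaps.
    if (from >= to || !/\S/.test(t.slice(from, to))) return t;
    return t.slice(0, from) + '<strong>' + t.slice(from, to) +
      '</strong>' + t.slice(to);
  }).join('');
}

const sample =
  '<div>The quick brown <a href="fox.html">fox</a></div>\n' +
  '<div>jumps over</div>\n' +
  '<div>the lazy <a href="dog.html">dog</a></div>';

console.log(markAcrossTags(sample, /brown fox[\s\S]*lazy dog/));
```

In a browser, the input could come from an element's `innerHTML` and the result be assigned back to it. Because every text run is wrapped on its own, the output stays well-formed regardless of how the tags nest.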
