Say I have data like this:
<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>
Using PHP, how would I go through the HTML tags and return the text inside each option element? For instance, given the code above, I'd like to return 'Test - 123', 'Test - 456', 'Test - 789'.
Thanks for the help!
UPDATE: To be clearer, I'm using file_get_contents() to get the HTML from a site. For my purposes, I'd like to go through the HTML, find the option elements, and output their text. In this case, that means returning 'Test - 123', 'Test - 456', etc.
6 Answers
#1
0
If we're doing regex stuff, I like this perl-like syntax:
$test = "<option value=\"abc\" >Test - 123</option>\n" .
        "<option value=\"abc\" >Test - 456</option>\n" .
        "<option value=\"abc\" >Test - 789</option>\n";
// Each pass resumes the search at the offset of the previous capture.
for ($offset = 0;
     preg_match("/<option[^>]*>([^<]+)/", $test, $matches, PREG_OFFSET_CAPTURE, $offset);
     $offset = $matches[1][1]) {
    print($matches[1][0] . "\n");
}
#2
3
There are many ways; which one is best depends on more details than you've provided in your question.
One possibility: DOMDocument and DOMXPath
<?php
$doc = new DOMDocument;
$doc->loadHTML('<html><head><title>???</title></head><body>
<form method="post" action="?" id="form1">
<div>
<select name="foo">
<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>
</select>
</div>
</form>
</body></html>');
$xpath = new DOMXPath($doc);
foreach( $xpath->query('//form[@id="form1"]//option') as $o) {
echo 'option text: ', $o->nodeValue, " \n";
}
This prints:
option text: Test - 123
option text: Test - 456
option text: Test - 789
#3
1
This code would load the values into an array, assuming you have line breaks in between the option tags like you showed:
// Load your HTML into a string.
$html = <<<EOF
<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>
EOF;
// Break the values into an array.
$vals = explode("\n", strip_tags($html));
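The approach above depends on each option sitting on its own line with nothing else around it. As a rough sketch of how it might be hardened, the variant below splits on any kind of line break, trims each entry, and drops the empty ones. The surrounding select tag, the mixed line endings, and the variable name $lines are assumptions added for illustration, not part of the original answer.

```php
<?php
// Hypothetical input: the question's options wrapped in a <select>,
// with mixed line endings and a blank line thrown in.
$html = "<select>\n"
      . "<option value=\"abc\" >Test - 123</option>\r\n"
      . "<option value=\"def\" >Test - 456</option>\n\n"
      . "<option value=\"ghi\" >Test - 789</option>\n"
      . "</select>";

// Remove the tags, then split on any line-break sequence (\R).
$lines = preg_split('/\R/', strip_tags($html));

// Trim whitespace, drop empty strings, and reindex from 0.
$vals = array_values(array_filter(array_map('trim', $lines), 'strlen'));

print_r($vals);
```

This still treats every line of text outside a tag as a value, so it only suits input that contains nothing but option elements.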
#4
1
If you have more than just a fragment like the one shown, use a real parser such as DOMDocument, which you can query with DOMXPath.
Otherwise, try this regular expression together with preg_match_all:
<option(?:[^>"']+|"[^"]*"|'[^']*')*>([^<]+)</option>
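A minimal sketch of how that expression might be plugged into preg_match_all, using the sample markup from the question. The "~" delimiter, the "i" flag, the trim step, and the variable names are assumptions added here rather than part of the original answer.

```php
<?php
// Sample markup taken from the question.
$html = '<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>';

// "~" is the delimiter, so the "/" in </option> needs no escaping.
// The single quotes inside the pattern are escaped for the PHP string.
$pattern = '~<option(?:[^>"\']+|"[^"]*"|\'[^\']*\')*>([^<]+)</option>~i';

preg_match_all($pattern, $html, $matches);

// $matches[1] holds the first capture group: the text of each option.
$texts = array_map('trim', $matches[1]);

print_r($texts);
```

With the default flags, preg_match_all groups results by capture group, which is why the option texts all end up in $matches[1].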
#5
0
http://networking.ringofsaturn.com/Web/removetags.php
// Strip every tag from $data; the sed-style s/.../.../g form is not valid in PHP.
$out = preg_replace('/<[a-zA-Z\/][^>]*>/', '', $data);
#6
0
Use strip_tags, unless I'm misunderstanding the question.
$string = '<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>';
$string = strip_tags($string);
Update: I missed that you loosely specify an array in your question. In that case (and I'm sure there's a cleaner method), I'd do something like:
$teststring = '<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>';
// split() was removed in PHP 7; explode() does the same job for a literal delimiter.
$stringarray = explode("\n", strip_tags($teststring));
print_r($stringarray);
Update 2: And just to top and tail it, to present it as you originally asked (not as an array, as we may have been misled to believe), try the following:
$teststring = '<option value="abc" >Test - 123</option>
<option value="def" >Test - 456</option>
<option value="ghi" >Test - 789</option>';
// split() was removed in PHP 7, and PHP 8 requires the separator as the first argument.
$stringarray = explode("\n", strip_tags($teststring));
$newstring = implode("','", $stringarray);
echo "'" . $newstring . "'\n";