How to parse media:content from XML in PHP?

Date: 2023-01-15 00:07:39

I've found a great tutorial on how to accomplish most of the work at:

https://www.developphp.com/video/PHP/simpleXML-Tutorial-Learn-to-Parse-XML-Files-and-RSS-Feeds

but I can't understand how to extract media:content images from the feeds. I've read as much as I can find, but I'm still stuck.

For example, the question "How to get media:content with SimpleXML" suggests using:

foreach ($xml->channel->item as $news){
    $ns_media = $news->children('http://search.yahoo.com/mrss/');
    echo $ns_media->content; // displays "<media:content>"
}

but I can't get it to work.

Here's my script and the feed I'm trying to parse:

<?php
$html = "";
$url = "http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC";
$xml = simplexml_load_file($url);
for($i = 0; $i < 10; $i++){
    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $description = $xml->channel->item[$i]->description;
    $pubDate = $xml->channel->item[$i]->pubDate;

    $html .= "<a href='$link'><h3>$title</h3></a>";
    $html .= "$description";
    $html .= "<br />$pubDate<hr />";
}
echo $html;
?>

I don't know where to add this code in the script to make it work. Honestly, I've browsed for hours but couldn't find a working script that parses media:content.

Can someone help with this?

========================

UPDATE:

Thanks to fusion3k, I got the final code working:

<?php
$html = "";
$url = "http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC";
$xml = simplexml_load_file($url);
for($i = 0; $i < 5; $i++){

    $image = $xml->channel->item[$i]->children('media', True)->content->attributes();
    $title = $xml->channel->item[$i]->title;
    $link = $xml->channel->item[$i]->link;
    $description = $xml->channel->item[$i]->description;
    $pubDate = $xml->channel->item[$i]->pubDate;

    $html .= "<img src='$image' alt='$title'>";
    $html .= "<a href='$link'><h3>$title</h3></a>";
    $html .= "$description";
    $html .= "<br />$pubDate<hr />";
}
echo $html;
?>

Basically, all I needed was this simple line:

$image = $xml->channel->item[$i]->children('media', True)->content->attributes();

I can't believe how hard it was for a non-techie to find this information online, even after reading dozens of posts and articles. I hope this helps other folks like me :)

2 Answers

#1


6  

To get the 'url' attribute, use the ->attributes() syntax:

$ns_media = $news->children('http://search.yahoo.com/mrss/');

/* Echoes 'url' attribute: */
echo $ns_media->content->attributes()['url'];
// in PHP < 5.4 (no function array dereferencing): $attr = $ns_media->content->attributes(); echo $attr['url'];

/* Catches 'url' attribute as a string: */
$url = $ns_media->content->attributes()['url']->__toString();
// in PHP < 5.4: $attr = $ns_media->content->attributes(); $url = $attr['url']->__toString();

Namespaces explanation:

The ->children() argument is not the URL of your XML; it is a Namespace URI.

XML namespaces are used for providing uniquely named elements and attributes in an XML document:

<xxx>       Standard XML tag
<yyy:zzz>   Namespaced tag
 └┬┘ └┬┘
  │   └──── Element Name
  └──────── Element Prefix (Namespace Identifier)

So, in your case, <media:content> is the “content” element of Namespace “media”. Namespaced elements must have an associated Namespace URI, declared as an attribute of a parent node or (most commonly) of the root element. This attribute has the form xmlns:yyy="NamespaceURI" (in your case, xmlns:media="http://search.yahoo.com/mrss/" as an attribute of the root node <rss>).
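To see which prefixes and Namespace URIs a document declares, SimpleXML can list them. The following is only a minimal sketch: the inline $sample feed is a made-up stand-in for the real RSS document, not the actual WebMD feed.

```php
<?php
// Hypothetical inline feed (an assumption, used so the example runs offline).
$sample = <<<XML
<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example item</title>
      <media:content url="http://example.com/image.jpg" medium="image" />
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string($sample);

// getDocNamespaces() returns the namespaces declared on the root node
// as an array of prefix => Namespace URI pairs.
$namespaces = $xml->getDocNamespaces();

foreach ($namespaces as $prefix => $uri) {
    echo "$prefix => $uri\n";
}
```

The URI printed next to the media prefix is the value that ->children() expects when it is called with a single argument.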

Ultimately, the above $news->children( 'http://search.yahoo.com/mrss/' ) means “retrieve all child elements with http://search.yahoo.com/mrss/ as Namespace URI”. An alternative, more readable syntax is: $news->children( 'media', True ) (True means “treat the first argument as a prefix”).

Returning to the example code, the generic syntax to retrieve all of the first item's children with the media prefix is:

$xml = simplexml_load_file( 'http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC' );
$xml->channel->item[0]->children( 'http://search.yahoo.com/mrss/' );

or (identical result):

$xml = simplexml_load_file( 'http://rssfeeds.webmd.com/rss/rss.aspx?RSSSource=RSS_PUBLIC' );
$xml->channel->item[0]->children( 'media', True );
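As a quick check that the two forms select the same children, here is a small sketch that runs offline. The inline $sample feed and the example.com URL are assumptions made for illustration; they are not taken from the real feed.

```php
<?php
// Hypothetical inline feed standing in for the real RSS document.
$sample = <<<XML
<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>Example item</title>
      <media:content url="http://example.com/image.jpg" medium="image" />
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($sample);
$item = $xml->channel->item[0];

// Select the namespaced children by Namespace URI...
$byUri = $item->children('http://search.yahoo.com/mrss/');
// ...and by prefix (the second argument marks the first one as a prefix).
$byPrefix = $item->children('media', true);

// Cast to string so we compare plain values rather than SimpleXMLElement objects.
$urlByUri    = (string) $byUri->content->attributes()['url'];
$urlByPrefix = (string) $byPrefix->content->attributes()['url'];

var_dump($urlByUri === $urlByPrefix);
```

The prefix form depends on the prefix the feed happens to use, while the URI form keeps working if a feed binds the same namespace to a different prefix.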

Your new code:

If you want to show the <media:content url> thumbnail for each element in your page, modify the original code in this way:

(...)
$pubDate = $xml->channel->item[$i]->pubDate;
$image   = $xml->channel->item[$i]->children( 'media', True )->content->attributes()['url'];
// in PHP < 5.4 (no function array dereferencing):
// $attr  = $xml->channel->item[$i]->children( 'media', True )->content->attributes();
// $image = $attr['url'];

$html   .= "<a href='$link'><h3>$title</h3></a>";
$html   .= "<img src='$image' alt='$title'>";
(...)
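Not every item in a feed necessarily carries a <media:content> element, and feed text can contain characters that break HTML attributes. The sketch below is one possible way to handle both, not part of the original answer: it skips the image when the element is missing and escapes values with htmlspecialchars(). The inline $sample feed and its URLs are invented for the example.

```php
<?php
// Hypothetical feed: the second item deliberately has no <media:content>.
$sample = <<<XML
<?xml version="1.0"?>
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>First item</title>
      <link>http://example.com/first</link>
      <media:content url="http://example.com/a.jpg" medium="image" />
    </item>
    <item>
      <title>Second item</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($sample);
$html = '';

foreach ($xml->channel->item as $item) {
    // Escape feed values before placing them in HTML.
    $title = htmlspecialchars((string) $item->title, ENT_QUOTES);
    $link  = htmlspecialchars((string) $item->link, ENT_QUOTES);

    // Read the url attribute only if the item has a <media:content> child.
    $media = $item->children('media', true);
    $image = '';
    if (isset($media->content)) {
        $image = (string) $media->content->attributes()['url'];
    }

    $html .= "<a href='$link'><h3>$title</h3></a>";
    if ($image !== '') {
        $html .= "<img src='" . htmlspecialchars($image, ENT_QUOTES) . "' alt='$title'>";
    }
}

echo $html;
```

Using foreach instead of a fixed counter also avoids indexing past the end of the feed when it has fewer items than expected.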

#2


4  

Simple example for newbs like me:

$url = "https://www.youtube.com/feeds/videos.xml?channel_id=UCwNPPl_oX8oUtKVMLxL13jg";
$rss = simplexml_load_file($url);

foreach($rss->entry as $item) {

  $time = $item->published;
  $time = date('Y-m-d \ H:i', strtotime($time));

  $media_group = $item->children( 'media', true );
  $title = $media_group->group->title;
  $description = $media_group->group->description;
  $views = $media_group->group->community->statistics->attributes()['views'];
  // Print inside the loop so every entry is shown, not only the last one
  echo $time . ' :: ' . $title . '<br>' . $description . '<br>' . $views . '<br>';
}
