Need help creating a regex pattern for getting an image

I made an RSS reader and I'm trying to get it to display a preview image too. Here's what I'm using to get the image; the only thing that's not working is the pattern.

if item?.content != nil {

        print("works until here")
        let htmlContent = item!.content as NSString
        var imageSource = ""

        let rangeOfString = NSMakeRange(0, htmlContent.length)
        let regex =  try! NSRegularExpression(pattern: "(http[^\\s]+(jpg|jpeg|png|tiff)\\b)", options: .caseInsensitive)

        if htmlContent.length > 0 {
            let match = regex.firstMatch(in: htmlContent as String, options: [], range: rangeOfString)

            if match != nil {
                // Group 1 is the whole URL; group 2 is only the file extension.
                let imageURL = htmlContent.substring(with: match!.rangeAt(1)) as NSString
                print(imageURL)

                if NSString(string: imageURL.lowercased).range(of: "feedburner").location == NSNotFound {
                    imageSource = imageURL as String
                }
            }
        }

        if imageSource != "" {
            cell.itemImageView.setImageWith(NSURL(string: imageSource) as URL!, placeholderImage: UIImage(named: "thumbnail"))
        }else {
             cell.itemImageView.image = UIImage(named: "thumbnail")
        }
    }

I need help creating a good pattern for getting the image from the "st-gallery" class on the travelator.ro website.

Many thanks in advance. :)

1 Solution

#1

Regular expressions can't parse HTML in general. In the formal sense, regular expressions recognize the regular languages. HTML, with its arbitrarily nested elements, is a context-free language, which is higher on the Chomsky hierarchy, and regular expressions can't recognize context-free languages.
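
To make this concrete for the pattern in the question, here is a minimal sketch that runs it over one line of markup and prints both capture groups. The function name `firstMatchGroups` and the sample markup are made up for illustration, and the code uses current Swift spelling (`range(at:)`) rather than the Swift 3 `rangeAt(_:)` from the question.

```swift
import Foundation

/// Runs the question's pattern on `html` and returns capture groups 1 and 2
/// of the first match, or nil if nothing matches.
func firstMatchGroups(in html: String) -> (url: String, ext: String)? {
    let pattern = "(http[^\\s]+(jpg|jpeg|png|tiff)\\b)"
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: .caseInsensitive) else {
        return nil
    }
    let whole = NSRange(html.startIndex..., in: html)
    guard let match = regex.firstMatch(in: html, options: [], range: whole),
          let urlRange = Range(match.range(at: 1), in: html),
          let extRange = Range(match.range(at: 2), in: html) else {
        return nil
    }
    return (String(html[urlRange]), String(html[extRange]))
}

// Hypothetical markup: two URLs inside one attribute, no whitespace between them.
let sample = "<img srcset=\"http://a.example/1.jpg,http://a.example/2.jpg\">"

if let groups = firstMatchGroups(in: sample) {
    // The greedy [^\s]+ runs to the last extension it can find,
    // so group 1 swallows both URLs.
    print(groups.url)  // http://a.example/1.jpg,http://a.example/2.jpg
    // Group 2 is only the extension, never a usable URL.
    print(groups.ext)  // jpg
}
```

The pattern has no notion of elements, attributes, or classes, so it cannot be told to look only inside "st-gallery"; it simply grabs the first URL-shaped run of non-whitespace characters it meets.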

You would need to use a more capable parser. HTML parsing libraries already do this; I suggest you use one of those.
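
As a rough sketch of the parser-based approach, the code below walks the markup with Foundation's `XMLParser` and returns the `src` of the first `<img>` nested inside an element whose class list contains "st-gallery". This assumes the fragment is well-formed XHTML, which real pages often are not; a dedicated third-party HTML parsing library (SwiftSoup and Kanna are two options) would be the more robust choice. `GalleryImageFinder`, `firstGalleryImage`, and the sample markup are hypothetical names and data, not part of the original code or of the travelator.ro site.

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives in this module on Linux
#endif

/// Records the `src` of the first <img> found inside an element whose
/// class attribute contains `targetClass`.
final class GalleryImageFinder: NSObject, XMLParserDelegate {
    private let targetClass: String
    private var depthInsideTarget = 0  // 0 means "not inside the gallery"
    private(set) var imageSource: String?

    init(targetClass: String) {
        self.targetClass = targetClass
    }

    func parser(_ parser: XMLParser, didStartElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?,
                attributes attributeDict: [String: String] = [:]) {
        if depthInsideTarget > 0 {
            // Already inside the gallery: track nesting and look for images.
            depthInsideTarget += 1
            if elementName.lowercased() == "img", imageSource == nil {
                imageSource = attributeDict["src"]
            }
        } else {
            // Outside the gallery: check whether this element opens it.
            let classes = (attributeDict["class"] ?? "")
                .split(separator: " ").map(String.init)
            if classes.contains(targetClass) {
                depthInsideTarget = 1
            }
        }
    }

    func parser(_ parser: XMLParser, didEndElement elementName: String,
                namespaceURI: String?, qualifiedName qName: String?) {
        if depthInsideTarget > 0 { depthInsideTarget -= 1 }
    }
}

/// Convenience wrapper; returns nil when no gallery image is found.
func firstGalleryImage(in xhtml: String,
                       targetClass: String = "st-gallery") -> String? {
    let finder = GalleryImageFinder(targetClass: targetClass)
    let parser = XMLParser(data: Data(xhtml.utf8))
    parser.delegate = finder
    parser.parse()
    return finder.imageSource
}

// Hypothetical markup: one image outside the gallery, one inside it.
let sample = """
<div>
  <img src="http://example.com/banner.png"/>
  <div class="st-gallery wide">
    <a href="http://example.com/page"><img src="http://example.com/photo.jpg"/></a>
  </div>
</div>
"""

print(firstGalleryImage(in: sample) ?? "no gallery image")
// http://example.com/photo.jpg
```

Because the parser tracks element nesting, the banner image outside the gallery is skipped, which is exactly the distinction a URL-matching pattern cannot make.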
