I'm having a little bit of an issue: I would like to take this data,
for item in g_data:
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[0]["href"]
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[1]["href"]
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[2]["href"]
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[3]["href"]
and use the results in another process.
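For example, one way I picture collecting those hrefs into a list so another step can use them (a sketch only, assuming g_data already holds the parsed search-result items) would be something like:

# sketch only: gather every matching href so it can be handed to another process
urls = []
for item in g_data:
    for link in item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"}):
        urls.append(link["href"])
# urls can now be passed to whatever needs them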
The code currently prints out the URLs from the first page of an Amazon search term. I would like to take those URLs and then scrape the data on each page. How would I go about making it do something like this:
If for item in g_data returns a url, take url[1:15] and do 'x' with it.
If for item in g_data does not return a url, say "No urls to work with".
Any help or leads you could give would be great, thanks once again.
1 solution
#1
If you want to take each item in g_data, find all the urls in that item, and do x with them if there are any (and just print a message if there are none), then this should work:
def do_x(url):
    """Does x with the given url."""
    short = url[1:15]
    # do x with short
    # ...

# process all items in g_data
for item in g_data:
    # find all links in the item
    links = item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})
    if not links:
        # no links in this item -> report and skip
        print("No urls to work with.")
        continue
    # process each link's href
    for link in links:
        url = link["href"]
        do_x(url)
Is this what you wanted?
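Since do_x is left as a stub above, here is a hedged sketch of what it could look like if the goal is to fetch each product page and parse it. The requests library, urljoin, the base URL, and the title lookup are all assumptions for illustration, not part of the original answer:

# hypothetical sketch of do_x: fetch the page behind the url and parse it
import requests
from urllib.parse import urljoin   # Python 3; on Python 2: from urlparse import urljoin
from bs4 import BeautifulSoup

BASE_URL = "https://www.amazon.com"   # assumed base, since the hrefs may be relative

def do_x(url):
    """Fetch the page behind the given url and pull something out of it."""
    full_url = urljoin(BASE_URL, url)
    response = requests.get(full_url)
    soup = BeautifulSoup(response.text, "html.parser")
    # placeholder: grab the page title; replace with the data you actually need
    title = soup.title.string if soup.title else None
    print(title)

The loop above can call this do_x unchanged.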