I'm having a little bit of an issue: I would like to take this data,
for item in g_data:
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[0]["href"]
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[1]["href"]
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[2]["href"]
    print item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})[3]["href"]
and use the results in another process.
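For example, one way I picture collecting those hrefs into a list so another step can use them (a sketch only, assuming g_data already holds the parsed search-result items) would be something like:

# sketch only: gather every matching href so it can be handed to another process
urls = []
for item in g_data:
    for link in item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"}):
        urls.append(link["href"])
# urls can now be passed to whatever needs them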
The code currently prints out the URLs from the first page of an Amazon search term. I would like to take those URLs and then scrape the data on each page. How would I go about making it do something like this:
If for item in g_data returns a url, take url[1:15] and do 'x' with it.
If for item in g_data does not return a url, say "No urls to work with".
Any help or leads you could give would be great, thanks once again.
1 solution
#1
If you want to take each item in g_data, find all the urls in that item, and do x with them if there are any (and just print a message if there are none), then this should work:
def do_x(url):
    """Does x with the given url."""
    short = url[1:15]
    # do x with short
    # ...

# process all items in g_data
for item in g_data:
    # find all links in the item
    links = item.contents[1].find_all("a", {"class":"a-link-normal s-access-detail-page a-text-normal"})
    if not links:
        # no links in this item -> report and skip
        print("No urls to work with.")
        continue
    # process each link's href
    for link in links:
        url = link["href"]
        do_x(url)
Is this what you wanted?
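Since do_x is left as a stub above, here is a hedged sketch of what it could look like if the goal is to fetch each product page and parse it. The requests library, urljoin, the base URL, and the title lookup are all assumptions for illustration, not part of the original answer:

# hypothetical sketch of do_x: fetch the page behind the url and parse it
import requests
from urllib.parse import urljoin   # Python 3; on Python 2: from urlparse import urljoin
from bs4 import BeautifulSoup

BASE_URL = "https://www.amazon.com"   # assumed base, since the hrefs may be relative

def do_x(url):
    """Fetch the page behind the given url and pull something out of it."""
    full_url = urljoin(BASE_URL, url)
    response = requests.get(full_url)
    soup = BeautifulSoup(response.text, "html.parser")
    # placeholder: grab the page title; replace with the data you actually need
    title = soup.title.string if soup.title else None
    print(title)

The loop above can call this do_x unchanged.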