AttributeError: 'NoneType' object has no attribute 'text' (python3 + proxy)

Time: 2020-12-25 18:18:44

There are some questions like this online, but I looked at them and none of them helped me. I am currently working on a script that pulls item names from http://www.supremenewyork.com/shop/all/accessories

I want it to pull this information from Supreme UK, but I'm having trouble with the proxy part. Right now I'm struggling with this script: every time I run it I get the error listed in the title.
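This is roughly what I tried for the proxy part (the proxy address here is just a placeholder I found online, so it may not work):

import requests

URL = 'http://www.supremenewyork.com/shop/all/accessories'
UK_Proxy1 = '178.62.13.163:8080'  # placeholder proxy, probably needs replacing
proxies = {
    'http': 'http://' + UK_Proxy1,
    'https': 'https://' + UK_Proxy1,
}

# route the request through the proxy (the goal is to get the UK version of the shop)
proxy_script = requests.get(URL, proxies=proxies).text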

Here is my script:

import requests
from bs4 import BeautifulSoup

URL = ('http://www.supremenewyork.com/shop/all/accessories')

proxy_script = requests.get(URL).text
soup = BeautifulSoup(proxy_script, 'lxml')

for item in soup.find_all('div', class_='inner-article'):
    name = soup.find('h1', itemprop='name').text
    print(name)

I always get this error, and when I run the script without the .text at the end of the itemprop='name' line I just get a bunch of Nones, like this:

None
None
None etc......

There are exactly as many Nones as there are items available to print.
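For what it's worth, the same error can be reproduced in isolation: find() returns None when nothing matches, and calling .text on None raises exactly this AttributeError:

from bs4 import BeautifulSoup

soup = BeautifulSoup('<div class="inner-article"></div>', 'lxml')
tag = soup.find('h1', itemprop='name')
print(tag)       # None - no matching <h1> in this snippet
print(tag.text)  # AttributeError: 'NoneType' object has no attribute 'text'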

1 solution

#1



Here we go, I've commented the code I used below. The reason we use class_='something' is that the word class is reserved in Python.

import requests
from bs4 import BeautifulSoup

URL = 'http://www.supremenewyork.com/shop/all/accessories'

# UK_Proxy1 = '178.62.13.163:8080'
# proxies = {
#     'http': 'http://' + UK_Proxy1,
#     'https': 'https://' + UK_Proxy1
# }
# proxy_script = requests.get(URL, proxies=proxies).text

proxy_script = requests.get(URL).text
soup = BeautifulSoup(proxy_script, 'lxml')

# the container that holds all the product tiles
thetable = soup.find('div', class_='turbolink_scroller')

# every product tile is an inner-article div
items = thetable.find_all('div', class_='inner-article')

for item in items:
    only_text = item.h1.a.text
    # by doing .<tag> we extract information just from that tag
    # example: bsobject = <html><body><b>ey</b></body></html>
    # if we print bsobject.body.b it will return `<b>ey</b>`
    color = item.p.a.text

    print(only_text, color)
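If you want to route this through the UK proxy, uncomment the proxies block above. It can also help to guard against find() returning None (for example when the proxy hands back an error page); here is a rough sketch along those lines, using select_one() as an alternative way to reach the same h1 > a and p > a elements:

import requests
from bs4 import BeautifulSoup

URL = 'http://www.supremenewyork.com/shop/all/accessories'
UK_Proxy1 = '178.62.13.163:8080'  # this proxy may no longer work; substitute your own
proxies = {
    'http': 'http://' + UK_Proxy1,
    'https': 'https://' + UK_Proxy1,
}

html = requests.get(URL, proxies=proxies, timeout=10).text
soup = BeautifulSoup(html, 'lxml')

container = soup.find('div', class_='turbolink_scroller')
if container is None:
    # the proxy may have returned an error page, or the markup changed
    raise SystemExit('product container not found')

for item in container.find_all('div', class_='inner-article'):
    name_tag = item.select_one('h1 a')   # same element as item.h1.a
    color_tag = item.select_one('p a')   # same element as item.p.a
    if name_tag is None:
        continue  # skip tiles without the expected markup
    print(name_tag.text, color_tag.text if color_tag else '')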
