SyntaxError: Non-UTF-8 code starting with '\xae'

Time: 2023-01-05 23:45:11

I am using Python with Selenium to write scripts. The code below raises a syntax error. I found that the cause is the registered trademark symbol '®' in the title. Please help me resolve this.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

driver = webdriver.Firefox()

driver.get('https://advance.lexis.com')
assert 'Lexis Advance® Sign In | LexisNexis' in driver.title

1 Solution

#1



The content of your question is fine: I inspected it and saw that the page provides the ® symbol encoded as UTF-8.

Based on the error message in the title, Python is reading the file as UTF-8 but I suspect that your editor is using a different encoding to save the file.

Perhaps it is using ISO 8859-1 (aka 'latin1'), or something else. ISO 8859-1 defines the byte 0xAE as the registered trademark symbol. Unicode also defines code point U+00AE as the registered trademark symbol.
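
To make the byte-level difference concrete, here is a small sketch (the helper name `decodes_as_utf8` is made up for illustration). It shows that the same character is one byte in ISO 8859-1 but two bytes in UTF-8, and that the lone byte 0xAE is not valid UTF-8, which is consistent with the error in the title:

```python
# U+00AE REGISTERED SIGN, written as an escape so this snippet itself is pure ASCII.
symbol = "\u00ae"

latin1_bytes = symbol.encode("latin-1")  # single byte 0xAE
utf8_bytes = symbol.encode("utf-8")      # two bytes 0xC2 0xAE


def decodes_as_utf8(data: bytes) -> bool:
    """Return True if `data` is valid UTF-8 (hypothetical helper for this demo)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(latin1_bytes, decodes_as_utf8(latin1_bytes))
print(utf8_bytes, decodes_as_utf8(utf8_bytes))
```

If your editor wrote the file as ISO 8859-1, the title string contains the lone 0xAE byte, and Python cannot decode it as UTF-8.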

You have two solutions:

  1. Determine what encoding your editor is using and tell Python by putting # encoding: foo at the top of your file (replace foo with the actual encoding name, for example latin-1).
  2. Configure your editor to use UTF-8.
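
Both options can be sketched in Python. This is only an illustration, assuming the file was in fact saved as ISO 8859-1 (check your editor to confirm); the function name `resave_as_utf8` and the sample source are made up for the demo. The first part shows that a source-encoding declaration on the first line makes the 0xAE byte acceptable; the second part re-saves a file as UTF-8, which you could use if changing the editor setting is not convenient:

```python
from pathlib import Path

# Option 1: declare the real encoding on the first line of the file.
# This sample source is encoded as latin-1, so the symbol is the single byte 0xAE.
declared_source = (
    "# -*- coding: latin-1 -*-\n"
    "title = 'Lexis Advance\u00ae Sign In | LexisNexis'\n"
).encode("latin-1")

namespace = {}
# compile() accepts bytes and honours the coding declaration.
exec(compile(declared_source, "<demo>", "exec"), namespace)
print(namespace["title"])


# Option 2: convert the file to UTF-8 once, then keep saving it as UTF-8.
def resave_as_utf8(path, source_encoding="latin-1"):
    """Rewrite `path` as UTF-8, assuming it is currently in `source_encoding`."""
    file_path = Path(path)
    text = file_path.read_bytes().decode(source_encoding)
    file_path.write_bytes(text.encode("utf-8"))
```

Run the conversion only once per file: applying it to a file that is already UTF-8 would decode it with the wrong encoding and corrupt the non-ASCII characters.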
