I'm trying to create a regex to find option symbols in broker data. Per Wikipedia the format is:
- Root symbol of the underlying stock or ETF, padded with spaces to 6 characters
- Expiration date, 6 digits in the format yymmdd
- Option type, either P or C, for put or call
- Strike price, as the price x 1000, front padded with 0s to 8 digits
So I created this regex:
option_regex = re.compile(r'''(
    (\w{1,6})   # beginning ticker, 1 to 6 word characters
    (\s)?       # optional separator
    (\d{6})     # 6 digits for yymmdd
    ([cp])      # C or P for call or put
    (\d{8})     # 8 digits for strike price
)''', re.VERBOSE | re.IGNORECASE)
But when I test it out I get an error:
import re
option_regex = re.compile(r'''(
    (\w{1,6})   # beginning ticker, 1 to 6 word characters
    (\s)?       # optional separator
    (\d{6})     # 6 digits for yymmdd
    ([cp])      # C or P for call or put
    (\d{8})     # 8 digits for strike price
)''', re.VERBOSE | re.IGNORECASE)
result = option_regex.search('AAPL  170818C00155000')
result.group()
Traceback (most recent call last):
  File "<ipython-input-4-0273c989d990>", line 1, in <module>
    result.group()
AttributeError: 'NoneType' object has no attribute 'group'
1 Answer
From the Python documentation on re.search():
Scan through string looking for the first location where the regular expression pattern produces a match, and return a corresponding MatchObject instance. Return None if no position in the string matches the pattern; note that this is different from finding a zero-length match at some point in the string.
Your code throws this exception because re.search() didn't find a match and returned None. In other words, you are calling .group() on None. It would be a good idea to guard against that:
if not result:
    ...  # Pattern didn't match the string
    return
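As a rough sketch of how that guard might look in practice, the check can live inside a small helper so that callers get either the matched text or None. The function name find_option_symbol is made up for illustration; the pattern is the one from the question, unchanged.

```python
import re

# Pattern copied from the question (unchanged).
option_regex = re.compile(r'''(
    (\w{1,6})   # beginning ticker, 1 to 6 word characters
    (\s)?       # optional separator
    (\d{6})     # 6 digits for yymmdd
    ([cp])      # C or P for call or put
    (\d{8})     # 8 digits for strike price
)''', re.VERBOSE | re.IGNORECASE)


def find_option_symbol(text):
    """Hypothetical helper: return the matched symbol, or None if absent."""
    result = option_regex.search(text)
    if result is None:
        # Pattern didn't match the string, so there is no group to read.
        return None
    return result.group()
```

Comparing with `is None` rather than relying on truthiness makes the intent explicit, although a match object is always truthy, so `if not result:` behaves the same here.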
Your pattern doesn't match the string you typed in because the separator is longer than you assumed: the ticker is padded to 6 characters, so there are 2 spaces instead of one. You can fix that by adding a + ("one or more") to the rule:
(\s+)? # optional separator
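Putting the pieces together, here is a minimal sketch of the question's pattern with that one change applied, run against a symbol whose ticker is padded to six characters (two spaces after AAPL). The variable names used when unpacking the groups are only illustrative, and dividing the strike by 1000 follows the "price x 1000" rule from the format description.

```python
import re

option_regex = re.compile(r'''(
    (\w{1,6})   # beginning ticker, 1 to 6 word characters
    (\s+)?      # optional separator, one or more whitespace characters
    (\d{6})     # 6 digits for yymmdd
    ([cp])      # C or P for call or put
    (\d{8})     # 8 digits for strike price
)''', re.VERBOSE | re.IGNORECASE)

result = option_regex.search('AAPL  170818C00155000')
if result is not None:
    # Group 1 is the whole symbol; groups 2-6 are the individual fields.
    symbol, root, _sep, expiry, kind, strike = result.groups()
    strike_price = int(strike) / 1000   # strike is stored as price x 1000
    print(root, expiry, kind, strike_price)   # AAPL 170818 C 155.0
```

Since `(\s+)?` matches zero or more whitespace characters overall, `\s*` would behave the same way if the separator does not need its own capture group.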