I'm working with Google BigQuery and trying to extract some information from a string column into another column using Regexp_extract. In short:
Data in myVariable:
yippie/eggs-spam/?portlet:hungry=1234
yippie/eggs-spam/?portlet:hungry=456&portlet:hungrier=7890
I want a column with:
1234
456
My command:
SELECT Regexp_extract(myVariable, r'SOME_MAGIC') as result
FROM table
I tried for SOME_MAGIC:
hungry=(.*)[&$] - null, 456 (I learned that $ is taken literally inside a character class)
hungry=(.*)(&|$) - Error: Exactly one capturing group must be specified
hungry=(.*)^& - null, null
hungry=(&.*)?$ - null, null
I read this, but there the number has a fixed length. I also looked at this, but "?=" is not recognized as a Perl operator.
Does anybody have an idea? Thank you in advance!
2 Solutions
#1
1
I just found an answer to how I can solve my problem differently:
hungry=([0-9]+) - 1234, 456
It isn't an answer to my abstract question (a regex for selecting everything from character A up to [character B or end of line]), so it's not that satisfying. E.g. it won't work with
yippie/eggs-spam/?portlet:hungry=12AB34
However, my original problem is solved. I'll leave the question open for a while in case somebody has a better answer.
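For the more general case (everything after a marker up to the next `&` or the end of the string), one possible approach is a negated character class such as `hungry=([^&]*)`. It captures every character that is not `&`, so it stops at the first `&` or at the end of the string, whichever comes first, and it uses exactly one capturing group. The sketch below checks the pattern with Python's `re` module as a stand-in. It is an assumption that BigQuery's regex engine treats this simple pattern the same way, and the helper name `extract_hungry` is made up for illustration.

```python
import re

# Hypothetical helper, for illustration only: mimics a single-group extract
# by returning the first capturing group, or None when nothing matches.
def extract_hungry(value):
    match = re.search(r'hungry=([^&]*)', value)
    return match.group(1) if match else None

# Sample rows taken from the question, plus the alphanumeric case.
rows = [
    'yippie/eggs-spam/?portlet:hungry=1234',
    'yippie/eggs-spam/?portlet:hungry=456&portlet:hungrier=7890',
    'yippie/eggs-spam/?portlet:hungry=12AB34',
]

for row in rows:
    print(extract_hungry(row))
```

If this carries over, the pattern string could be dropped into the original command in place of SOME_MAGIC.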
#2
1
I think I had a similar problem where I was trying to select the last 6 characters of a string (link_id) as a new column.
I kept getting this error:
Exactly one capturing group must be specified
My code originally was:
SELECT
...
REGEXP_EXTRACT(link_id, r'......$') AS updated_link_id
FROM sometable;
To get rid of the error and retrieve the correct substring as a column, I had to add parentheses around my regex string.
SELECT
...
REGEXP_EXTRACT(link_id, r'(......$)') AS updated_link_id
FROM sometable;
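To see what the parenthesized pattern captures, here is a small check using Python's `re` module, again only as a stand-in (Python itself does not insist on a capturing group, unlike the error described above). The sample values and the helper name `last_six` are hypothetical, since the real link_id data is not shown.

```python
import re

# Hypothetical helper, for illustration only: the parentheses make the whole
# six-character tail the single capturing group, which is what gets returned.
def last_six(value):
    match = re.search(r'(......$)', value)
    return match.group(1) if match else None

print(last_six('abc_123456'))  # sample value with more than six characters
print(last_six('abc'))         # fewer than six characters: no match
```

Values shorter than six characters do not match at all, so they come back empty rather than truncated.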