I'm using Hive (Hadoop) to write an SQL-like statement.
I need to remove spaces from a field. For example, a postcode could be XX00 0XX, and I'd like to remove the space before 0XX.
So far, I have this regex:
REGEXP_REPLACE(postcode, '[[:space:]]*', '')
But it doesn't seem to work. Can anyone advise?
3 Solutions
#1
2
Would there be anything wrong with just doing a simple (non-regex) replace? Try this:
REPLACE(postcode, ' ', '')
If your version of Hive doesn't support REPLACE(), then you can use:
REGEXP_REPLACE(postcode, '\\s+', '')
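Hive documents REGEXP_REPLACE as using Java regular expression syntax, so the two options above can be sanity-checked in plain Java outside of Hive. This is only a rough sketch: the class and method names are made up for the example, and it assumes the Java String methods behave like their Hive counterparts for this kind of input.

```java
// Hypothetical helper class, not part of Hive; it mirrors the two expressions above.
final class StripSpaces {
    // Counterpart of REPLACE(postcode, ' ', ''): removes literal space characters only.
    static String stripLiteral(String postcode) {
        return postcode.replace(" ", "");
    }

    // Counterpart of REGEXP_REPLACE(postcode, '\\s+', ''): removes runs of whitespace.
    static String stripRegex(String postcode) {
        return postcode.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        System.out.println(stripLiteral("XX00 0XX"));  // XX000XX
        System.out.println(stripRegex("XX00 \t0XX"));  // XX000XX
    }
}
```

Note that the literal version leaves tabs and newlines alone, which may or may not be what you want for postcodes.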
#2
0
Did you try '[[:blank:]]*'? Be careful, as this will match tabs as well.
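One thing to check before going down this route: as far as I know, Java's regex engine (which Hive's REGEXP_REPLACE is documented to use) has no POSIX bracket syntax, so '[[:blank:]]' and '[[:space:]]' are read as ordinary character classes made of the characters between the brackets. That would explain why the pattern in the question leaves the postcode untouched. A small plain-Java sketch to show the difference from \p{Blank} (the class name is invented for the example, and it assumes Hive hands the pattern to java.util.regex unchanged):

```java
// Hypothetical demo class; it only exercises java.util.regex via String.replaceAll.
final class PosixClassCheck {
    // Removes every match of the given regex from the input.
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        // Java reads [[:space:]] as "any of : s p a c e", so the space survives.
        System.out.println(strip("XX00 0XX", "[[:space:]]*").equals("XX00 0XX")); // true

        // The same class happily eats the letters of the word "space" instead.
        System.out.println(strip("space: a", "[[:space:]]").equals(" "));          // true

        // \p{Blank} is the Java spelling of the POSIX blank class (space or tab).
        System.out.println(strip("XX00 \t0XX", "\\p{Blank}").equals("XX000XX"));   // true
    }
}
```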
#3
0
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
Translate
select translate('XX00 0XX',' ','');
regexp_replace
select regexp_replace('XX00 0XX',' ','');
select regexp_replace('XX00 0XX','\\s','');
select regexp_replace('XX00 0XX','\\p{Blank}','');
select regexp_replace('XX00 0XX','\\p{Space}','');
select regexp_replace('XX00 0XX','\\p{javaWhitespace}','');
https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
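The regexp_replace variants above differ in which characters they count as whitespace. Below is a rough plain-Java comparison, on the assumption that Hive passes the pattern to java.util.regex unchanged; the class name and the sample strings are made up for illustration.

```java
// Hypothetical demo class comparing the character classes listed above.
final class WhitespaceClasses {
    // Removes every match of the given regex from the input.
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        // Sample with a space, a tab and a newline in the middle.
        String messy = "XX00 \t\n0XX";

        // A literal space pattern removes the space only.
        System.out.println(strip(messy, " ").equals("XX00\t\n0XX"));        // true
        // \p{Blank} removes the space and the tab, but keeps the newline.
        System.out.println(strip(messy, "\\p{Blank}").equals("XX00\n0XX")); // true
        // \s and \p{Space} remove all three.
        System.out.println(strip(messy, "\\s").equals("XX000XX"));          // true
        System.out.println(strip(messy, "\\p{Space}").equals("XX000XX"));   // true

        // Sample with an EM SPACE (U+2003) instead of an ASCII space.
        String emSpace = "XX00\u20030XX";

        // By default \s is ASCII-only, so the EM SPACE is left in place.
        System.out.println(strip(emSpace, "\\s").equals(emSpace));                     // true
        // \p{javaWhitespace} follows Character.isWhitespace and removes it.
        System.out.println(strip(emSpace, "\\p{javaWhitespace}").equals("XX000XX"));   // true
    }
}
```

For a postcode that only ever contains plain spaces, all of these give the same result, so the choice mostly matters if the data may contain tabs or non-ASCII spacing.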
replace (as of Hive 1.3.0 and 2.1.0)
select replace('XX00 0XX',' ','');