Regular expression to remove spaces between words in a string

Time: 2021-10-26 21:38:31

I'm using Hive (Hadoop) to write an SQL-like statement.

I need to remove spaces in a field. For example, a postcode could be XX00 0XX, and I'd like to remove the space before 0XX.

So far, I have this regex:

REGEXP_REPLACE(postcode, '[[:space:]]*', '')

But it doesn't seem to work. Can anyone advise?

3 answers

#1


2  

Would there be anything wrong with just doing a simple (non-regex) replace? Try this:

REPLACE(postcode, ' ', '')

If your version of Hive doesn't support REPLACE(), then you can use:

REGEXP_REPLACE(postcode, '\\s+', '')
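
As a possible explanation for why the original pattern fails: the Hive manual describes regexp_replace as taking Java regular expression syntax, and Java has no POSIX bracket expressions. Assuming Hive hands the pattern to java.util.regex unchanged, '[[:space:]]' is read as a set of the literal characters ':', 's', 'p', 'a', 'c', 'e', not as "whitespace". The sketch below reproduces both patterns in plain Java; the class and method names are made up for illustration and are not part of Hive.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: mimics regexp_replace(input, regex, '') with java.util.regex.
class PostcodeRegexDemo {
    // Removes every match of the given Java regex from the input.
    static String removeAll(String input, String regex) {
        return Pattern.compile(regex).matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String postcode = "XX00 0XX";

        // \s+ matches runs of whitespace, so the space is removed.
        System.out.println(removeAll(postcode, "\\s+"));

        // [[:space:]] is a nested character class of the literal
        // characters ':', 's', 'p', 'a', 'c', 'e'; the space is not in it.
        System.out.println(removeAll(postcode, "[[:space:]]*"));

        // The same pattern strips those letters and the colon instead.
        System.out.println(removeAll("a space: here", "[[:space:]]*"));
    }
}
```

Both the Hive literal '\\s+' and the Java literal "\\s+" end up as the regex \s+, since each layer consumes one level of backslash escaping.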

#2


0  

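Did you try '[[:blank:]]*'? Be careful, as this will match tabs as well. (Hive's regexp_replace takes Java regex syntax, in which this class is written '\\p{Blank}'.)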

#3


0  

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

translate

select translate('XX00 0XX',' ','');
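
To show why an empty third argument deletes the space, here is a rough Java emulation of translate(input, from, to). It assumes PostgreSQL-style semantics, which is how I understand Hive's translate to behave: each character found in from is replaced by the character at the same position in to, and characters of from with no counterpart in to are dropped. The class is hypothetical, works on UTF-16 chars rather than code points, and is not Hive's implementation.

```java
// Hypothetical emulation of translate(input, from, to); not Hive's own code.
class TranslateSketch {
    static String translate(String input, String from, String to) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            int pos = from.indexOf(c);
            if (pos < 0) {
                out.append(c);              // not listed in 'from': keep as-is
            } else if (pos < to.length()) {
                out.append(to.charAt(pos)); // mapped to its counterpart in 'to'
            }
            // otherwise: listed in 'from' without a counterpart, so it is dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The space has no counterpart in the empty 'to' string, so it is removed.
        System.out.println(translate("XX00 0XX", " ", ""));
    }
}
```

Because translate works character by character, it can only remove the exact characters listed; tabs or newlines would each need to be added to the from string.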

regexp_replace

select regexp_replace('XX00 0XX',' ','');
select regexp_replace('XX00 0XX','\\s','');
select regexp_replace('XX00 0XX','\\p{Blank}','');
select regexp_replace('XX00 0XX','\\p{Space}','');
select regexp_replace('XX00 0XX','\\p{javaWhitespace}','');

https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
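
The five regexp_replace patterns above are not interchangeable once the separator is something other than a plain space. The following sketch runs them through java.util.regex directly, assuming Hive evaluates them the same way; the class name, the sample strings, and the choice of U+2003 (EM SPACE) as a Unicode test case are mine.

```java
import java.util.regex.Pattern;

// Hypothetical demo class comparing the whitespace patterns on different separators.
class WhitespaceClassDemo {
    // Removes every match of the given Java regex from the input.
    static String strip(String input, String regex) {
        return Pattern.compile(regex).matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String[] patterns = {" ", "\\s", "\\p{Blank}", "\\p{Space}", "\\p{javaWhitespace}"};
        // Separators: space, tab, newline, EM SPACE (U+2003).
        String[] samples = {"XX00 0XX", "XX00\t0XX", "XX00\n0XX", "XX00\u20030XX"};

        for (String p : patterns) {
            for (String s : samples) {
                boolean removed = strip(s, p).equals("XX000XX");
                int separator = s.charAt(4); // code unit of the separator character
                System.out.println(p + " removes U+" + Integer.toHexString(separator) + ": " + removed);
            }
        }
    }
}
```

In short: ' ' removes only the plain space; \p{Blank} also removes tabs; \s and \p{Space} additionally remove newlines and other ASCII whitespace; \p{javaWhitespace} follows Character.isWhitespace and so also removes Unicode separators such as EM SPACE.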

replace

select replace('XX00 0XX',' ','');

(replace is available as of Hive 1.3.0 and 2.1.0.)
