How to query for substrings with a value >= 60000

Date: 2022-05-21 02:00:45

I have a dataset whose data is formatted like this:

Date        | exec_time
------------+---------
Today       | 99999 ms
Yesterday   | 1 ms
Tomorrow    | 50000 ms
Another Day | None Recorded
Last Day    |  ms

What I need to do is write a query that returns all of the exec_time values that are >= 60000.

This is how I tried to write it:

select exec_time 
from myTable
where exec_time not like '%N%'
and cast(split_part(exec_time,' ', 1) as int) >= 60000
order by len(exec_time) desc, exec_time desc
limit 10

However, when I run this, I get the following error:

ERROR: Invalid digit, Value '2', Pos 0, Type: Integer 
  Detail: 
  -----------------------------------------------
  error:  Invalid digit, Value '2', Pos 0, Type: Integer 
  code:      1207
  context:   
  query:     2780081
  location:  :0
  process:   query0_61 [pid=0]
  -----------------------------------------------

Any ideas on how I can get around this?

2 Answers

#1


2  

The error occurs because WHERE conditions are not evaluated in any guaranteed order, so the cast can run on rows that the LIKE filter was meant to exclude.
Use a CASE expression to avoid the exception.

SELECT exec_time 
FROM   myTable
WHERE  CASE WHEN exec_time NOT LIKE '%N%' THEN split_part(exec_time,' ', 1)::int >= 60000 ELSE FALSE END
ORDER  BY length(exec_time) desc, exec_time desc
LIMIT  10;
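
The sample data also contains a row with no number in front of 'ms', which a check for 'N' alone would not catch. A minimal sketch of a whitelist guard instead, assuming every valid value starts with one or more digits followed by a space (the bracket expression [0-9] is used so that no backslashes are needed):

```sql
-- Cast only when the value starts with digits followed by a space;
-- every other row evaluates to FALSE without reaching the cast.
SELECT exec_time
FROM   myTable
WHERE  CASE WHEN exec_time ~ '^[0-9]+ '
            THEN split_part(exec_time, ' ', 1)::int >= 60000
            ELSE FALSE
       END
ORDER  BY length(exec_time) DESC, exec_time DESC
LIMIT  10;
```

This has not been run against the asker's data, so treat it as a starting point rather than a verified fix.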

While you are at it, if 'None Recorded' is the only case to rule out, use a faster left-anchored check:

exec_time NOT LIKE 'N%'

If the above still errors out, run this to find any offending rows you may have missed:

SELECT DISTINCT exec_time
FROM   myTable
WHERE  exec_time NOT LIKE '%N%'
AND    exec_time !~ '^\\d+ '  -- not all digits before the first space

In modern Postgres you only need a single backslash: '^\d+ '. It seems you have to double up on backslashes in Redshift, which seems to still use the outdated Posix escape syntax for strings by default, even without the explicit declaration (E'^\\d+ ').
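
One way to sidestep the backslash question altogether is to spell the digit class as a bracket expression. A sketch of the same diagnostic query under that assumption; it should behave the same whichever way string escapes are interpreted:

```sql
-- Same check as above, with [0-9] in place of \d,
-- so the pattern contains no backslashes to escape.
SELECT DISTINCT exec_time
FROM   myTable
WHERE  exec_time NOT LIKE '%N%'
AND    exec_time !~ '^[0-9]+ ';  -- not all digits before the first space
```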

Generally, it's not a good idea to mix data this way. You should have an integer column to store the execution time. Much cheaper, cleaner and faster.
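
A rough sketch of what that migration might look like, assuming you are able to alter the table. The column name exec_time_ms is hypothetical, and the guard assumes valid values start with digits followed by a space:

```sql
-- exec_time_ms is a hypothetical column name, not part of the original table.
ALTER TABLE myTable ADD COLUMN exec_time_ms integer;

-- Populate it once; rows without a leading number stay NULL.
UPDATE myTable
SET    exec_time_ms = CASE WHEN exec_time ~ '^[0-9]+ '
                           THEN split_part(exec_time, ' ', 1)::int
                      END;

-- The original question then becomes a plain numeric comparison.
SELECT exec_time_ms
FROM   myTable
WHERE  exec_time_ms >= 60000
ORDER  BY exec_time_ms DESC
LIMIT  10;
```

With a real integer column, no string parsing happens at query time and the sort is numeric rather than based on string length.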

#2


1  

I think the problem is the "None Recorded" value. I don't know whether the first WHERE condition is guaranteed to run first and exclude those rows. Try this:

SELECT exec_time
FROM (SELECT exec_time FROM myTable WHERE exec_time NOT LIKE 'N%') as foo
WHERE cast(split_part(foo.exec_time, ' ', 1) as int) >= 60000
ORDER by length(foo.exec_time) desc, foo.exec_time desc
limit 10
