Multiple LIKE conditions on the same column in DB2

Time: 2022-12-01 16:35:04

I am trying to run a query like this:

select emp_id from dept.employee where
  firstname like '%sam%' or 
  firstname like '%SAM%' or
  firstname like '%will%' or
  firstname like '%WILL%'

Is it possible to express this as a regex, something like

select emp_id from dept.employee where
  firstname like '%sam|SAM|will|WILL%'

or

select emp_id from dept.employee where
  upper(firstname) like '%SAM%' or
  upper(firstname) like '%WILL%'

I am using DB2 UDB9.

2 Solutions

#1


Unfortunately, this version of DB2 has no built-in regex function. It is possible, however, to register an external user-defined function (calling an external CLI, .NET or Java routine) that implements regular expressions. Here's a ready-to-use example from IBM:

http://www.ibm.com/developerworks/data/library/techarticle/0301stolze/0301stolze.html

#2


That's either a bad example or a bad database design :-).

You should probably not be storing first (or any) names in any way that will require you to use LIKE. You need to keep in mind that tables are almost always read far more often than written, and design accordingly.

That means you want the cost to be on insertion or update, not on selecting. Per-row functions such as upper(name) never scale well to proper enterprise-class databases.

In my opinion, you should have, for DB2, the following:

  • an insert/update trigger that removes leading and trailing spaces from the first (and last) name.

  • a generated column that holds the uppercased name (this uses more storage, but that's usually better than wasting time). I'm not sure if UDB9 has generated columns (DB2/z has), but you can do this in the same insert/update trigger. Basically it's an extra column that's always set to the uppercase version of another field.

  • an index on the generated column, not the original column.
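
The three items above could be sketched roughly as follows, using the trigger-maintained variant rather than a true generated column. This is only a sketch under assumptions: the table name tbl and the column generatedname come from the queries further down, emp_id and firstname from the question, while the column sizes, the trigger names and the index name are made up for illustration.

```sql
-- Sketch only: column sizes, trigger names and the index name are assumptions.
CREATE TABLE tbl (
    emp_id        INTEGER     NOT NULL PRIMARY KEY,
    firstname     VARCHAR(50) NOT NULL,
    generatedname VARCHAR(50)            -- kept equal to UPPER(firstname) by the triggers
);

-- Trim the name and derive the uppercase copy on insert.
CREATE TRIGGER tbl_name_ins
    NO CASCADE BEFORE INSERT ON tbl
    REFERENCING NEW AS n
    FOR EACH ROW MODE DB2SQL
    SET n.firstname     = LTRIM(RTRIM(n.firstname)),
        n.generatedname = UPPER(LTRIM(RTRIM(n.firstname)));

-- Do the same whenever firstname is updated.
CREATE TRIGGER tbl_name_upd
    NO CASCADE BEFORE UPDATE OF firstname ON tbl
    REFERENCING NEW AS n
    FOR EACH ROW MODE DB2SQL
    SET n.firstname     = LTRIM(RTRIM(n.firstname)),
        n.generatedname = UPPER(LTRIM(RTRIM(n.firstname)));

-- Index the uppercase copy, not the original column.
CREATE INDEX tbl_generatedname_ix ON tbl (generatedname);
```

The cost of trimming and uppercasing is paid once per insert or update, so the selects can compare against generatedname directly without calling a function on every row.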

That way, your selects will scream along with queries like (big and ugly, but efficient):

select * from tbl
where generatedname = 'SAM'
or    generatedname = 'SAMUEL'
or    generatedname = 'SAMANTHA'
or    generatedname = 'WILL'
or    generatedname = 'WILLIAM'
or    generatedname = 'WILLOMENA'

or (less big and ugly, just as efficient, and closer to the original query in intent):

select * from tbl
where generatedname like 'SAM%'
or    generatedname like 'WILL%'

This uses the full power of the query optimizer (DB2, and I would think other DBMSs, can still optimize 'XX%' easily if the column is indexed).
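
As a side note, the longer equality form can be written more compactly with IN, which is the same set of equality comparisons in standard SQL. Whether the optimizer picks the same plan for it is an assumption worth checking with EXPLAIN on your own data.

```sql
-- Same result as the chain of "generatedname = ..." tests;
-- table and column names are the ones used in the queries above.
SELECT *
FROM   tbl
WHERE  generatedname IN ('SAM', 'SAMUEL', 'SAMANTHA',
                         'WILL', 'WILLIAM', 'WILLOMENA');
```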

I'm not a big fan of using LIKE on any decent-sized table, although sometimes there's not much choice. I can't think of any viable situation in which you'd want to search for "%SAM%", and the leading wildcard leaves the optimizer unable to work to its fullest extent.
