I am trying to query something like
select emp_id from dept.employee where
firstname like '%sam%' or
firstname like '%SAM%' or
firstname like '%will%' or
firstname like '%WILL%'
Is it possible to put it in regex something like
select emp_id from dept.employee where
firstname like '%sam|SAM|will|WILL%'
or
select emp_id from dept.employee where
upper(firstname) like '%SAM%' or
upper(firstname) like '%WILL%'
I am using DB2 UDB9.
2 solutions
#1
Unfortunately, DB2 has no built-in regex function. It is possible, though, to register an external user-defined function (calling either an external CLI, .NET or Java function) that implements regular expressions. Here's a ready-to-use example by IBM:
http://www.ibm.com/developerworks/data/library/techarticle/0301stolze/0301stolze.html
#2
That's either a bad example or a bad database design :-).
You should probably not be storing first (or any) names in any way that will require you to use LIKE. You need to keep in mind that tables are almost always read far more often than written, and design accordingly.
That means you want the cost to be on insertion or update, not on selecting. Per-row functions such as upper(name) never scale well to proper enterprise-class databases.
In my opinion, you should have, for DB2, the following:
- an insert/update trigger that will remove leading and trailing spaces from the first (and last) name.
- a generated column that will uppercase the names (this uses more storage but that's usually better than wasting time). I'm not sure if UDB9 has generated columns (DB2/z has) but you can do this in the same insert/update trigger. Basically it's an extra column that's always set to the uppercase version of another field.
- an index on the generated column, not the original column.
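The three items above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the table name tbl and the column generatedname are taken from the queries in this answer, while the column sizes, the trigger names, and the index name are assumptions made for illustration. It also assumes your DB2 version accepts GENERATED ALWAYS AS columns; if it does not, the uppercase column can be filled in by the same triggers instead.

```sql
-- Hypothetical layout: column sizes are illustrative assumptions.
CREATE TABLE tbl (
    emp_id        INTEGER     NOT NULL PRIMARY KEY,
    name          VARCHAR(50) NOT NULL,
    -- Extra column that always holds the uppercase version of name.
    generatedname VARCHAR(50) GENERATED ALWAYS AS (UPPER(name))
);

-- Trim leading and trailing spaces when a row is inserted.
-- (Trigger names are hypothetical.)
CREATE TRIGGER tbl_trim_ins
    NO CASCADE BEFORE INSERT ON tbl
    REFERENCING NEW AS n
    FOR EACH ROW MODE DB2SQL
    SET n.name = LTRIM(RTRIM(n.name));

-- Do the same when the name is updated.
CREATE TRIGGER tbl_trim_upd
    NO CASCADE BEFORE UPDATE OF name ON tbl
    REFERENCING NEW AS n
    FOR EACH ROW MODE DB2SQL
    SET n.name = LTRIM(RTRIM(n.name));

-- Index the uppercase column, not the original one.
CREATE INDEX ix_tbl_generatedname ON tbl (generatedname);
```

The cost of trimming and uppercasing is paid once per insert or update, so selects can compare against generatedname directly without calling a function on every row.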
That way, your selects will scream along with queries like (big and ugly, but efficient):
select * from tbl
where generatedname = 'SAM'
or generatedname = 'SAMUEL'
or generatedname = 'SAMANTHA'
or generatedname = 'WILL'
or generatedname = 'WILLIAM'
or generatedname = 'WILLOMENA'
or (less big and ugly, just as efficient, and closer to the original query in intent):
select * from tbl
where generatedname like 'SAM%'
or generatedname like 'WILL%'
using the full power of the query optimizer (DB2, and other DBMSs I would think, can still optimize 'XX%' easily if the field is indexed, because the fixed prefix lets it use the index).
I'm not a big fan of using LIKE for any decent-sized tables although sometimes there's not much choice. I can't think of any viable situation in which you'd want to look for "%SAM%" and doing so results in an inability to use the optimizer to its fullest extent.