As part of a data regression and quality assurance exercise, I need to look for diffs between two tables that should, ideally, be nearly identical. I have looked at a large number of commercial products and can't seem to find one that satisfies all of our requirements:
- Must be able to compare LARGE tables (10 million rows by 200 columns) very efficiently
- Must work across different DB servers and different DB vendors (Oracle vs. DB2)
- Must be able to compare tables having different structures, ignoring the columns that are not shared between the two tables
- Must work with a user-supplied, multi-column primary key; can't rely on the key defined in the DB
- Must run on Linux/Solaris. Will be run as part of a fully automated process that is executed within an enterprise environment.
- Must be able to run headless (without GUI)
- Must produce a formatted report that identifies row diffs (row on only one side) and value diffs
- Customer is willing to pay an enterprise-level price for the right solution. In other words, price is no object.
Has anyone ever seen something like this?
3 solutions
#1
2
Not the best solution, but for flexibility we have implemented this as a set of Perl scripts that extract the data and then do a file comparison.
Most commercial databases have an excellent bulk copy utility (bcp, sqlload, etc.), and Perl is fast at string comparison and at processing large files.
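A minimal sketch of the extract-then-compare idea, written in Python rather than Perl. It assumes both tables have already been dumped to CSV files with a header row and sorted by the user-supplied key columns in the same string order; the function name `diff_sorted_extracts` and the sample column names are made up for illustration. Because it walks both files in lockstep, it never holds more than one row per side in memory.

```python
import csv
import io

def diff_sorted_extracts(left, right, key_cols):
    """Yield differences between two CSV streams that are sorted by key_cols.

    Yields ("left_only", key, None), ("right_only", key, None), or
    ("value_diff", key, {column: (left_value, right_value)}).
    Keys are compared as tuples of strings, so both extracts must be
    sorted in that same order.
    """
    lreader, rreader = csv.DictReader(left), csv.DictReader(right)
    # Compare only the non-key columns that exist on both sides.
    shared = [c for c in lreader.fieldnames
              if c in rreader.fieldnames and c not in key_cols]

    def key_of(row):
        return tuple(row[c] for c in key_cols)

    lrow, rrow = next(lreader, None), next(rreader, None)
    while lrow is not None or rrow is not None:
        if rrow is None or (lrow is not None and key_of(lrow) < key_of(rrow)):
            yield ("left_only", key_of(lrow), None)
            lrow = next(lreader, None)
        elif lrow is None or key_of(rrow) < key_of(lrow):
            yield ("right_only", key_of(rrow), None)
            rrow = next(rreader, None)
        else:
            # Same key on both sides: report the shared columns that differ.
            changed = {c: (lrow[c], rrow[c]) for c in shared if lrow[c] != rrow[c]}
            if changed:
                yield ("value_diff", key_of(lrow), changed)
            lrow, rrow = next(lreader, None), next(rreader, None)

# Hypothetical sample extracts; "note" and "extra" are not shared, so they are ignored.
LEFT = "region,id,amount,note\nEU,1,10,a\nEU,2,20,b\nUS,1,30,c\n"
RIGHT = "region,id,amount,extra\nEU,1,10,x\nEU,2,25,y\nUS,2,40,z\n"

for kind, key, detail in diff_sorted_extracts(io.StringIO(LEFT), io.StringIO(RIGHT),
                                              ["region", "id"]):
    print(kind, key, detail)
```

In practice the two extracts would need identical formatting for numbers, dates, and NULLs before comparison, since Oracle and DB2 export tools do not necessarily render them the same way.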
#2
2
I'd hash the DB rows based on your defined criteria and then use that. If the comparison details are fairly static you may want to persist the hash, either as a new column in the table itself or in a separate dedicated table. An appropriate index would then allow you to perform whatever comparisons you wish.
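A rough sketch of the row-hashing idea in Python, done client-side instead of as a persisted column. The helper names (`row_digest`, `digest_table`, `compare_digests`) and the sample rows are hypothetical, and it assumes rows arrive as dictionaries whose values have already been normalized to the same representation on both databases.

```python
import hashlib

def row_digest(row, columns):
    """Return a SHA-256 hex digest of the chosen column values of one row."""
    h = hashlib.sha256()
    for col in columns:
        value = row[col]
        # None (SQL NULL) gets its own marker, distinct from the empty string.
        data = b"\x00" if value is None else b"\x01" + str(value).encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()

def digest_table(rows, key_cols, compare_cols):
    """Map each user-supplied key tuple to the digest of its compared columns."""
    return {tuple(r[k] for k in key_cols): row_digest(r, compare_cols) for r in rows}

def compare_digests(left, right):
    """Return (keys only in left, keys only in right, keys whose digests differ)."""
    left_only = sorted(left.keys() - right.keys())
    right_only = sorted(right.keys() - left.keys())
    changed = sorted(k for k in left.keys() & right.keys() if left[k] != right[k])
    return left_only, right_only, changed

# Hypothetical sample rows; only "amount" is compared, "note" is ignored.
left_rows = [
    {"region": "EU", "id": 1, "amount": 10, "note": "a"},
    {"region": "EU", "id": 2, "amount": 20, "note": None},
    {"region": "US", "id": 1, "amount": 30, "note": "c"},
]
right_rows = [
    {"region": "EU", "id": 1, "amount": 10, "note": "zzz"},
    {"region": "EU", "id": 2, "amount": 25, "note": ""},
    {"region": "US", "id": 2, "amount": 40, "note": "c"},
]
keys = ["region", "id"]
result = compare_digests(digest_table(left_rows, keys, ["amount"]),
                         digest_table(right_rows, keys, ["amount"]))
print(result)
```

A digest only says that a row differs, not which column; to report value diffs, the rows flagged as changed would still have to be fetched and compared column by column. Holding one digest per key in a dictionary is also a simplification that may not suit a 10-million-row table, which is where persisting the hash in an indexed column, as suggested above, would come in.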
#3
0
Don't know if it's related, but have a look at "Diffing" objects from a relational database