While running a program that relies on the oracle.sql package, I see a large performance hit when persisting more than 200 million Timestamps compared to persisting the same number of longs.
Java to persist:
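// description, connection, and SIZE are defined elsewhere in the program.
Collection<ARRAY> longs = new ArrayList<ARRAY>(SIZE);
Collection<ARRAY> timeStamps = new ArrayList<ARRAY>(SIZE);
for (int i = 0; i < SIZE; i++)
{
    longs.add(new ARRAY(description, connection, i));
    timeStamps.add(new ARRAY(description, connection, new Timestamp((long) i)));
}
// setObject needs a PreparedStatement, not a plain Statement. The SQL text was
// not shown, so timeSql and longSql are placeholder names for it.
PreparedStatement timeStatement = connection.prepareStatement(timeSql);
timeStatement.setObject(1, timeStamps);
timeStatement.execute(); // 5 minutes
PreparedStatement longStatement = connection.prepareStatement(longSql);
longStatement.setObject(1, longs);
longStatement.execute(); // 1 minute 15 seconds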
My question is: what does Oracle do to Timestamps that makes them so awful to insert in bulk?
Configuration:
64 bit RHEL 5
jre 6u16
ojdbc14.jar
64 GB dedicated to the JVM
UPDATE: java.sql.Timestamp is being used.
3 Solutions
#1
1
A Number takes 4 bytes, while a Timestamp takes 11 bytes. In addition, a Timestamp has metadata associated with it. For each Timestamp, Oracle seems to compute the metadata and store it with the field.
#2
1
Oracle timestamps are not stored as an absolute value since the epoch, which is how java.sql.Timestamp holds its value internally. Instead, each one is a big bitmask containing values for the various "human" fields: centuries, months, etc.
So each one of your since-epoch timestamps is getting parsed into a "human" date before storage.
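To make the kind of conversion described here concrete, the following is a minimal sketch of breaking a since-epoch value into "human" calendar fields. The class name `EpochToFields`, the method `toFields`, and the field order (century, year within century, month, day, hour, minute, second, millisecond) are assumptions chosen for illustration; this is not Oracle's documented storage format or the driver's actual code.

```java
import java.util.Arrays;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

class EpochToFields {
    /**
     * Splits an epoch-millisecond value into calendar fields in UTC.
     * The field order is an illustrative assumption, not Oracle's layout.
     */
    static int[] toFields(long epochMillis) {
        Calendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.setTimeInMillis(epochMillis);
        int year = cal.get(Calendar.YEAR);
        return new int[] {
            year / 100,                       // century part of the year
            year % 100,                       // year within the century
            cal.get(Calendar.MONTH) + 1,      // Calendar months are zero-based
            cal.get(Calendar.DAY_OF_MONTH),
            cal.get(Calendar.HOUR_OF_DAY),
            cal.get(Calendar.MINUTE),
            cal.get(Calendar.SECOND),
            cal.get(Calendar.MILLISECOND)
        };
    }

    public static void main(String[] args) {
        // Epoch zero is 1970-01-01 00:00:00.000 UTC.
        System.out.println(Arrays.toString(toFields(0L)));
    }
}
```

A long needs no such decomposition, which is one plausible reason the two inserts differ in cost; the sketch only shows the shape of the work, not how much of the measured time it accounts for.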
#3
1
Adding to Srini's post, for documentation on memory use by data type:
Oracle Doc on Data Types: http://docs.oracle.com/cd/E11882_01/timesten.112/e21642/types.htm#autoId31 (includes memory size for Number and Timestamp)
The docs state that Number takes 5-22 bytes, Timestamp takes 11 bytes, Integer takes 4 bytes.
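As a rough worked example, the per-value sizes quoted above can be multiplied by the 200 million rows mentioned in the question. The class and method names are hypothetical, and the figures cover raw value bytes only; they ignore row, block, and index overhead.

```java
class StorageEstimate {
    /** Raw bytes needed for `count` values of `bytesPerValue` bytes each. */
    static long totalBytes(long count, int bytesPerValue) {
        return count * bytesPerValue;
    }

    public static void main(String[] args) {
        long count = 200000000L; // row count taken from the question
        System.out.println("Timestamp (11 bytes): " + totalBytes(count, 11));
        System.out.println("Integer (4 bytes):    " + totalBytes(count, 4));
        System.out.println("Number (5 bytes min): " + totalBytes(count, 5));
        System.out.println("Number (22 bytes max): " + totalBytes(count, 22));
    }
}
```

That works out to about 2.2 GB for Timestamps against 0.8 GB for Integers, a ratio of 2.75, while the timings in the question (5 minutes against 1 minute 15 seconds) differ by a factor of 4.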
Also - to your point on querying against a date range - could you insert the dates as long values instead of timestamps and then use a stored procedure to convert when you are querying the data? This will obviously impact the speed of the queries, so it could be kicking the problem down the road, but.... :)
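One variation on this idea, sketched under assumptions: if the dates are stored as long values, the range bounds can be converted to longs once on the Java side, so the query compares longs directly and no per-row conversion is needed. The table name `events`, the column `event_millis`, and the class `LongRangeQuery` are hypothetical and do not come from the question.

```java
import java.sql.Timestamp;

class LongRangeQuery {
    // Hypothetical table and column names, for illustration only.
    static final String SQL =
        "SELECT id FROM events WHERE event_millis BETWEEN ? AND ?";

    /** Converts an inclusive timestamp range into epoch-millisecond bounds. */
    static long[] toEpochBounds(Timestamp from, Timestamp to) {
        if (from.after(to)) {
            throw new IllegalArgumentException("from must not be after to");
        }
        return new long[] { from.getTime(), to.getTime() };
    }

    public static void main(String[] args) {
        long[] bounds = toEpochBounds(new Timestamp(0L), new Timestamp(86400000L));
        // With a live connection, these would be bound through
        // PreparedStatement.setLong(1, bounds[0]) and setLong(2, bounds[1]).
        System.out.println(bounds[0] + " .. " + bounds[1]);
    }
}
```

This only helps when the range is known up front; queries that need the human-readable date in the result would still have to convert on the way out.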