Performance suffers when persisting hundreds of millions of SQL Timestamp objects

Date: 2022-08-05 16:52:28

In a program that relies on the oracle.sql package, persisting more than 200 million Timestamps incurs a large performance hit compared to persisting the same number of longs.

Basic Schema

Java to persist:

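// ARRAY is oracle.sql.ARRAY; Timestamp is java.sql.Timestamp.
// description, connection, SIZE, timeSql and longSql are defined elsewhere.
// The SQL text is not shown in the original post; timeSql and longSql stand in for it.
Collection<ARRAY> longs = new ArrayList<ARRAY>(SIZE);
Collection<ARRAY> timeStamps = new ArrayList<ARRAY>(SIZE);
for (int i = 0; i < SIZE; i++)
{
    longs.add(new ARRAY(description, connection, i));
    timeStamps.add(new ARRAY(description, connection, new Timestamp(new Long(i))));
}

// setObject needs a PreparedStatement; a plain Statement has no bind parameters.
PreparedStatement timeStatement = connection.prepareStatement(timeSql);
timeStatement.setObject(1, timeStamps);
timeStatement.execute();   // 5 minutes

PreparedStatement longStatement = connection.prepareStatement(longSql);
longStatement.setObject(1, longs);
longStatement.execute();   // 1 minute 15 seconds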

My question is: what does Oracle do to Timestamps that makes them so slow to insert in bulk?

Configuration:

64 bit RHEL 5  
jre 6u16  
ojdbc14.jar
64 GB dedicated to the JVM

UPDATE: java.sql.Timestamp is being used.

3 answers

#1


Number takes 4 bytes, Timestamp takes 11 bytes. In addition, Timestamp has metadata associated with it. For each Timestamp, Oracle seems to compute the metadata and store it with the field.

#2


Oracle timestamps are not stored as an absolute value since the epoch, which is what a java.sql.Timestamp holds internally. An Oracle timestamp is a packed structure containing values for the various "human" fields: century, month, and so on.

So each one of your epoch-based timestamps has to be converted into a "human" date before storage.
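
To make that conversion concrete, here is a rough sketch of breaking an epoch value into calendar fields and packing them into 11 bytes. The byte layout (century and year stored with an offset of 100, hour, minute and second stored plus one, then four bytes of nanoseconds) is an assumption based on how Oracle's internal format is commonly described, not something taken from the driver source. The class and method names are made up for illustration, and UTC is assumed where the real driver would apply a time zone.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical illustration only; not the Oracle driver's actual code.
final class TimestampLayoutSketch {

    // Breaks epoch milliseconds into calendar fields and packs them into 11 bytes.
    // Assumes UTC and years 0..9999; the layout itself is an assumption.
    static byte[] toElevenBytes(long epochMillis) {
        LocalDateTime t = LocalDateTime.ofInstant(
                Instant.ofEpochMilli(epochMillis), ZoneOffset.UTC);
        int year = t.getYear();
        int nanos = t.getNano();

        byte[] out = new byte[11];
        out[0] = (byte) (year / 100 + 100);   // century, offset by 100
        out[1] = (byte) (year % 100 + 100);   // year within century, offset by 100
        out[2] = (byte) t.getMonthValue();    // 1..12
        out[3] = (byte) t.getDayOfMonth();    // 1..31
        out[4] = (byte) (t.getHour() + 1);    // stored plus one
        out[5] = (byte) (t.getMinute() + 1);  // stored plus one
        out[6] = (byte) (t.getSecond() + 1);  // stored plus one
        out[7] = (byte) (nanos >>> 24);       // fractional seconds, big-endian
        out[8] = (byte) (nanos >>> 16);
        out[9] = (byte) (nanos >>> 8);
        out[10] = (byte) nanos;
        return out;
    }

    public static void main(String[] args) {
        byte[] packed = toElevenBytes(1500L);
        StringBuilder sb = new StringBuilder();
        for (byte b : packed) {
            sb.append(b & 0xFF).append(' ');
        }
        System.out.println(sb.toString().trim());
    }
}
```

A long needs none of this calendar arithmetic, which fits the idea that the per-value conversion is part of the extra cost.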

#3


Adding to Srini's post, for documentation on memory use by data type:

Oracle Doc on Data Types: http://docs.oracle.com/cd/E11882_01/timesten.112/e21642/types.htm#autoId31 (includes memory size for Number and Timestamp)

The docs state that Number takes 5-22 bytes, Timestamp takes 11 bytes, Integer takes 4 bytes.
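
For a sense of scale, here is a quick back-of-the-envelope calculation of raw value sizes at 200 million rows, using the per-value sizes quoted above. The class name is made up, and the figures ignore row overhead, array wrappers and network framing, so treat them as a rough lower bound rather than a measurement.

```java
// Hypothetical illustration: raw payload size for 200 million values.
final class PayloadSizeSketch {

    // Total bytes for `rows` values of `bytesPerValue` bytes each.
    static long totalBytes(long rows, int bytesPerValue) {
        return rows * bytesPerValue;
    }

    public static void main(String[] args) {
        long rows = 200_000_000L;
        System.out.println("Integer, 4 bytes each:    " + totalBytes(rows, 4));
        System.out.println("Timestamp, 11 bytes each: " + totalBytes(rows, 11));
        System.out.println("Number, 5 bytes each:     " + totalBytes(rows, 5));
        System.out.println("Number, 22 bytes each:    " + totalBytes(rows, 22));
    }
}
```

That is roughly 0.8 GB of 4-byte integers against 2.2 GB of timestamps, a ratio of 2.75. The timings in the question differ by a factor of 4 (5 minutes versus 1 minute 15 seconds), so size alone probably does not explain the whole gap.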

Also - to your point on querying against a date range - could you insert the dates as long values instead of timestamps and then use a stored procedure to convert when you are querying the data? This will obviously impact the speed of the queries, so it could be kicking the problem down the road, but.... :)
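
A variation on that idea, sketched below: instead of converting every stored long in a stored procedure, convert the two bounds of the date range to longs once on the client and compare against the numeric column directly. This swaps the stored procedure for client-side conversion. The table name events, the column event_millis, and the class and method names are all hypothetical, and UTC day boundaries are assumed.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;

// Hypothetical illustration: query a date range against a column of epoch millis.
final class EpochRangeSketch {

    // Table and column names are assumptions, not taken from the question.
    static final String RANGE_QUERY =
            "SELECT id FROM events WHERE event_millis >= ? AND event_millis < ?";

    // Epoch milliseconds at the start of the given day, in UTC.
    static long startOfDayMillis(LocalDate day) {
        return day.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    // Returns {inclusive lower bound, exclusive upper bound} in epoch milliseconds.
    static long[] rangeToMillis(LocalDate fromInclusive, LocalDate toExclusive) {
        return new long[] {
            startOfDayMillis(fromInclusive),
            startOfDayMillis(toExclusive)
        };
    }

    public static void main(String[] args) {
        long[] bounds = rangeToMillis(LocalDate.of(2022, 8, 1), LocalDate.of(2022, 9, 1));
        // The two bounds would be bound with PreparedStatement.setLong(1, ...) and setLong(2, ...).
        System.out.println(RANGE_QUERY);
        System.out.println(bounds[0] + " .. " + bounds[1]);
    }
}
```

Only two conversions happen per query this way, and the comparison stays on the raw numeric column. Whether that is acceptable depends on who else reads the table, since the stored values are no longer readable as dates.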
