Direct-Path INSERT(直接路径插入)

时间:2022-05-03 22:38:08

   下面我们来看一下到底什么是直接路径以及使用直接路径的优势和限制。文章截自“Oracle® Database Administrator's Guide11g Release 2 (11.2)”(当然中文注解不是官方文档中的)

Improving INSERT Performance with Direct-Path INSERT

When loading large amounts of data, you can improve load performance by using direct-path INSERT.

##当我们加载大量数据的时候,可以使用direct-path INSERT来提高处理性能

This section contains:

About Direct-Path INSERT

Oracle Database inserts data into a table in one of two ways:

##Oracle数据库向表中插入数据有如下两种方式(传统路径和直接路径):

  • During conventional INSERT operations, the database reuses free space in the table, interleaving newly inserted data with existing data. During such operations, the database also maintains referential integrity constraints.##使用传统路径方式插入数据,数据库会利用表中已有的空闲空间,新老数据是交叉在一起的,同时在插入的过程中会维护引用完整性约束

  • During direct-path INSERT operations, the database appends the inserted data after existing data in the table. Data is written directly into data files, bypassing the buffer cache. Free space in the table is not reused, and referential integrity constraints are ignored. Direct-path INSERT can perform significantly better than conventional insert.##使用直接路径方式插入数据,数据库在表中已有数据之后追加数据(即直接使用高水位线以上的新块,不会像传统路径一样去扫描高水位线以下的空闲块使用)。数据绕过buffer cache直接写进数据文件。高水位线以下的空闲空间不会被使用,表的完整性约束会被忽略。相对于传统路径插入,直接路径插入效率提高很显著。

The database can insert data either in serial mode, where one process executes the statement, or in parallel mode, where multiple processes work together simultaneously to run a single SQL statement. The latter is referred to as parallel execution.

##数据库能够以串行模式插入数据,也能够以并行模式插入数据,并行模式也就是并行执行

The following are benefits of direct-path INSERT:

##下面列出了direct-path INSERT的优势:

  • During direct-path INSERT, you can disable the logging of redo and undo entries to reduce load time. Conventional insert operations, in contrast, must always log such entries, because those operations reuse free space and maintain referential integrity.##使用直接路径加载数据时你可以禁止产生redo和undo的日志,以此来缩短加载时间。相比之下传统路径加载总是会产生这些日志条目。

  • Direct-path INSERT operations ensure atomicity of the transaction, even when run in parallel mode. Atomicity cannot be guaranteed during parallel direct-path loads (using SQL*Loader).##Direct-pathINSERT操作能够确保事物的原子性,即使使用并行模式。但是direct-path loads(using SQL*Loader)不能保证事物的原子性。

When performing parallel direct-path loads, one notable difference between SQL*Loader and INSERT statements is the following: If errors occur during parallel direct-path loads with SQL*Loader, the load completes, but some indexes could be marked UNUSABLE at the end of the load. Parallel direct-path INSERT, in contrast, rolls back the statement if errors occur during index update.##当使用parallel direct-path loads时,需要注意的是,和direct-path INSERT不一样,如果在使用SQL*Loader进行parallel direct-path loads数据导入时出错,那么数据导入完成,但是索引会被标记为失效。相比之下Parallel direct-path INSERT如果在更新索引的时候出错,那么事物会回滚。

Note:

A conventional INSERT operation checks for violations of NOTNULL constraints during the insert. Therefore, if a NOTNULL constraint is violated for a conventional INSERT operation, then the error is returned during the insert. A direct-path INSERT operation checks for violations of NOTNULL constraints before the insert. Therefore, if a NOTNULL constraint is violated for a direct-path INSERT operation, then the error is returned before the insert.
##传统路径插入是在插入过程中检查是否违反非空约束,因此,如果插入的数据违反了非空约束,那么会在插入过程中报错。直接路径插入在插入之前检查数据是否违反非空约束,因此,如果违反非空约束,那么会在插入之前报错。

How Direct-Path INSERT Works

You can use direct-path INSERT on both partitioned and nonpartitioned tables.

##可以对分区表和非分区表使用direct-path INSERT

Serial Direct-Path INSERT into Partitioned or Nonpartitioned Tables

The single process inserts data beyond the current high water mark of the table segment or of each partition segment. (Thehigh-water mark is the level at which blocks have never been formatted to receive data.) When a COMMIT runs, the high-water mark is updated to the new value, making the data visible to users.

##使用高水位线之上的数据块进行插入,当执行commit提交以后,高水位线即被更新为新的值,使新插入的数据对用户变为可见(direct-path
INSERT没有提交之前在同一个事物中,被插入的表是不能被dml,也不能被query的,否则会报ORA-12838错误。其他的会话可以查询该表,但是只能查到插入之前的数据
)

Parallel Direct-Path INSERT into Partitioned Tables

This situation is analogous to serial direct-path INSERT. Each parallel execution server is assigned one or more partitions, with no more than one process working on a single partition. Each parallel execution server inserts data beyond the current high-water mark of its assigned partition segment(s). When a COMMIT runs, the high-water mark of each partition segment is updated to its new value, making the data visible to users.

##并行模式类似于串行模式。每一个并行进程被分配给一个或多个分区,但是不会出现多个程序处理一个分区的情况。

Parallel Direct-Path INSERT into Nonpartitioned Tables

Each parallel execution server allocates a new temporary segment and inserts data into that temporary segment. When a COMMIT runs, the parallel execution coordinator merges the new temporary segments into the primary table segment, where it is visible to users.

##每一个并行进程被分配一个新的临时段并向临时段中插入数据。但执行commit时,并行执行的调度进程把这些临时段并入表所在的段中,这时数据对用户就是可见的了。

Loading Data with Direct-Path INSERT

You can load data with direct-path INSERT by using direct-path INSERT SQL statements, inserting data in parallel mode, or by using the Oracle SQL*Loader utility in direct-path mode. A direct-path INSERT can be done in either serial or parallel mode.

##

Serial Mode Inserts with SQL Statements

You can activate direct-path INSERT in serial mode with SQL in the following ways:

##你可以使用如下方法激活直接路径加载

  • If you are performing an INSERT with a subquery, specify the APPEND hint in each INSERT statement, either immediately after theINSERT keyword, or immediately after the SELECT keyword in the subquery of the INSERT statement.##如果你是使用子查询的方式插入,那么在insert后加append提示

  • If you are performing an INSERT with the VALUES clause, specify the APPEND_VALUES hint in each INSERT statement immediately after the INSERT keyword. Direct-path INSERT with the VALUES clause is best used when there are hundreds of thousands or millions of rows to load. The typical usage scenario is for array inserts using OCI. Another usage scenario might be inserts in a FORALL statement in PL/SQL.##如果你是使用带values从句的方式插入,那么在insert后加APPEND_VALUES提示

If you specify the APPEND hint (as opposed to the APPEND_VALUES hint) in an INSERT statement with a VALUES clause, the APPEND hint is ignored and a conventional insert is performed.

##如果你在带values从句的插入中使用的是append提示,而不是APPEND_VALUES提示,那么append提示会被忽略,执行的会是传统路径插入。

 

The following is an example of using the APPEND hint to perform a direct-path INSERT:

INSERT /*+ APPEND */ INTO sales_hist SELECT * FROM sales WHERE cust_id=8890;

The following PL/SQL code fragment is an example of using the APPEND_VALUES hint:

FORALL i IN 1..numrecords
INSERT /*+ APPEND_VALUES */ INTO orderdata
VALUES(ordernum(i), custid(i), orderdate(i),shipmode(i), paymentid(i));
COMMIT;
Parallel Mode Inserts with SQL Statements

When you are inserting in parallel mode, direct-path INSERT is the default. However, you can insert in parallel mode using conventional INSERT by using the NOAPPENDPARALLEL hint.

##如果你使用并行模式插入,那么默认就是直接路径。然而,你也可以通过使用NOAPPEND PARALLEL提示来强制使用传统路径进行数据插入。

To run in parallel DML mode, the following requirements must be met:

##为了运行并行模式的DML,必须满足下面的条件:

  • You must have Oracle Enterprise Edition installed.##你必须安装的是oracle企业版

  • You must enable parallel DML in your session. To do this, submit the following statement:##你必须启用会话级并行DML

    ALTER SESSION { ENABLE | FORCE } PARALLEL DML;
  • You must meet at least one of the following requirements:##同时你必须满足下面条件中的一个

    • Specify the parallel attribute for the target table, either at create time or subsequently##为目标表指定并行属性(建表时指定或建表后指定)

    • Specify the PARALLEL hint for each insert operation##在insert插入时加aprallel提示

    • Set the database initialization parameter PARALLEL_DEGREE_POLICY toAUTO##把PARALLEL_DEGREE_POLICY参数设为AUTO

To disable direct-path INSERT, specify the NOAPPEND hint in each INSERT statement. Doing so overrides parallel DML mode.

##我们可以通过使用NOAPPEND提示来禁用直接路径插入。

Note:

You cannot query or modify data inserted using direct-path INSERT immediately after the insert is complete. If you attempt to do so, an ORA-12838 error is generated. You must first issue a COMMIT statement before attempting to read or modify the newly-inserted data.
##在direct-path INSERT完成之后,你应该首先执行commit,然后再去查询和修改表中数据。如果你在还没有提交时就去查询或者修改数据,那么会报ORA-12838错误(注意,这里说的情况是在同一个会话下,如果你在另一个会话中还是可以对表进行查询的,当前你查到数据时插入之前的数据。但是就算是另一个会话你也不能完成dml操作,因为直接路径加载会给表加排他锁)

See Also:

Specifying the Logging Mode for Direct-Path INSERT

Direct-path INSERT lets you choose whether to log redo and undo information during the insert operation.

##Direct-path INSERT允许你选择是否记录redo和undo的日志信息(这里的意思应该是指是否写redo和undo)

  • You can specify logging mode for a table, partition, index, or LOB storage at create time (in a CREATE statement) or subsequently (in an ALTER statement).

  • If you do not specify either LOGGING or NOLOGGING at these times:

    • The logging attribute of a partition defaults to the logging attribute of its table.##如果明确给分区指定日志属性,那么分区会继承表的日志属性

    • The logging attribute of a table or index defaults to the logging attribute of the tablespace in which it resides.##如果表或索引没有指定日志属性,那么会继承所在表空间的日志属性

    • The logging attribute of LOB storage defaults to LOGGING if you specify CACHE for LOB storage. If you do not specify CACHE, then the logging attributes defaults to that of the tablespace in which theLOB values resides.##LOB存储要看CACHE模式,如果是CACHE模式,则默认为LOGGING属性,如果为NOCACHE,则要看其所属表空间的LOGGING属性

  • You set the logging attribute of a tablespace in a CREATE TABLESPACE or ALTER TABLESPACE statements.

    Note:

    If the database or tablespace is in FORCE LOGGING mode, then direct path INSERT always logs, regardless of the logging setting.##如果数据库或者表空间处于FORCE LOGGING模式,那么无论其他的日志属性怎么设置,直接路径加载都会记录日志。
Direct-Path INSERT with Logging

In this mode, Oracle Database performs full redo logging for instance and media recovery. If the database is in ARCHIVELOG mode, then you can archive redo logs to tape. If the database is in NOARCHIVELOG mode, then you can recover instance crashes but not disk failures.

##在这种模式下,数据库记录用于实例和介质恢复所需的所有日志。如果数据库处于归档模式下,你可以把日志归档到磁带上。如果数据库处于非归档模式,那么你能够进行实例恢复,但是不能够进行介质恢复。

Direct-Path INSERT without Logging

In this mode, Oracle Database inserts data without redo or undo logging. Instead, the database logs a small number of block range invalidation redo records and periodically updates the control file with information about the most recent direct write.

##在这种模式下,数据库不记录数据插入时的redo和undo日志,但在新区标记invalid状态和修改数据字典的时候会产生少量日志,并且定期的在控制文件中更新当前的direct write信息

Direct-path INSERT without logging improves performance. However, if you subsequently must perform media recovery, the invalidation redo records mark a range of blocks as logically corrupt, because no redo data was logged for them. Therefore, it is important that you back up the data after such an insert operation.

##Direct-path INSERT without logging能够提高性能。然后如果你随后就进行介质恢复,那些被标记为invalid状态的块会报逻辑错误,因为没有为这些块记录日志。因此在直接路径之后进行数据备份是很有必要的。

Beginning with release 11.2.0.2 of Oracle Database, you can significantly improve the performance of unrecoverable direct path inserts by disabling the periodic update of the control files. You do so by setting the initialization parameter DB_UNRECOVERABLE_SCN_TRACKING to FALSE. However, if you perform an unrecoverable direct path insert with these control file updates disabled, you will no longer be able to accurately query the database to determine if any data files are currently unrecoverable.

##oracle11.2.0.2开始,使用Direct-path INSERT时,如果你通过把DB_UNRECOVERABLE_SCN_TRACKING 设为FALSE来禁用定期更新控制文件机制,那么数据插入性能会得到极大的提升。但是如果这这样做了,那么你就无法通过数据库查询那些数据文件时无法恢复的了。

See Also:

Additional Considerations for Direct-Path INSERT

The following are some additional considerations when using direct-path
INSERT
.

##下面列出了使用direct-path INSERT的其他注意事项:

Compressed Tables

If a table is created with the basic compression, then you must use direct-path INSERT to compress table data as it is loaded. If a table is created with OLTP, warehouse, or archive compression, then best compression ratios are achieved with direct-path INSERT.

See "Consider Using Table Compression" for more information.

##如果你的表是basic压缩表,那么想要插入的数据会被压缩,必须使用direct-path INSERT(此时传统路径插入的数据是不会被压缩的)。如果你的表是oltp,warehouse,archive模式的压缩表,那么direct-pathINSERT方式插入的数据压缩效率是最高的(oltp等模式的压缩表,对普通方式插入的数据也会进行压缩,但不是在插入时即进行压缩)

Index Maintenance with Direct-Path INSERT

Oracle Database performs index maintenance at the end of direct-path INSERT operations on tables (partitioned or nonpartitioned) that have indexes. This index maintenance is performed by the parallel execution servers for parallel direct-path INSERT or by the single process for serial direct-path INSERT. You can avoid the performance impact of index maintenance by making the index unusable before the INSERT operation and then rebuilding it afterward.

See Also:

"Making an Index Unusable"
##使用直接路径方式加载数据,索引的维护会被放在操作的最后执行。如果你使用的并行直接路径加载,那么维护索引时也会使用并行模式,如果你使用的是串行直接路径加载,维护索引时也是串行。为了避免数据加载过程中维护索引对性能产生影响,我们可以在加载数据之前使所有失效,加载数据之后再手工重建索引。
Space Considerations with Direct-Path INSERT

Direct-path INSERT requires more space than conventional-path INSERT.

##与传统路径插入相比直接路径插入需要更多的磁盘空间

All serial direct-path INSERT operations, as well as parallel direct-path INSERT into partitioned tables, insert data above the high-water mark of the affected segment. This requires some additional space.

##不论串行或并行直接路径插入都是直接利用高水位线之上的数据块,这样就需要更多的空间(与传统路径插入相比)

Parallel direct-path INSERT into nonpartitioned tables requires even more space, because it creates a temporary segment for each degree of parallelism. If the nonpartitioned table is not in a locally managed tablespace in automatic segment-space management mode, you can modify the values of the NEXT and PCTINCREASE storage parameter and MINIMUMEXTENT tablespace parameter to provide sufficient (but not excess) storage for the temporary segments. Choose values for these parameters so that:

##

  • The size of each extent is not too small (no less than 1 MB). This setting affects the total number of extents in the object.

  • The size of each extent is not so large that the parallel INSERT results in wasted space on segments that are larger than necessary.

After the direct-path INSERT operation is complete, you can reset these parameters to settings more appropriate for serial operations.

Locking Considerations with Direct-Path INSERT

During direct-path INSERT, the database obtains exclusive locks on the table (or on all partitions of a partitioned table). As a result, users cannot perform any concurrent insert, update, or delete operations on the table, and concurrent index creation and build operations are not permitted. Concurrent queries, however, are supported, but the query will return only the information before the insert operation.

##direct-path INSERT操作会对表或者分区表的所有分区加上排他锁。因此在direct-path INSERT过程中不能对目标表进行dml操作,查询是可以的(非direct-pathINSERT当前会话),但查询返回的是direct-pathINSERT之前的数据。

   使用During direct-path
INSERT
的限制条件:

Direct-path INSERT is subject to a number of restrictions. If any of these restrictions is violated, then Oracle Database executes conventionalINSERT serially without returning any message, unless otherwise noted:

##Direct-path INSERT有如下一些限制。如果符合下面任何一条,那么数据库会在不给任何反馈信息的情况下自动的采用串行传统路径插入

  • You can have multiple direct-path INSERT statements in a single transaction, with or without other DML statements. However, after one DML statement alters a particular table, partition, or index, no other DML statement in the transaction can access that table, partition, or index.##在同一个事物中如果有多个direct-path INSERT和DML操作,在一个direct-pathINSERT执行后(提交前)其他的查询或者dml操作都无法执行,执行时会报ORA-12838

  • Queries that access the same table, partition, or index are allowed before the direct-pathINSERT statement, but not after it.##同上面说的是同一个意思

  • If any serial or parallel statement attempts to access a table that has already been modified by a direct-path INSERT in the same transaction, then the database returns an error and rejects the statement.##同上

  • The target table cannot be of a cluster.##如果要插入的目标表是簇表的话,那么无法使用直接路径,只能使用传统路径

  • The target table cannot contain object type columns.##目标表不能包含对象类型的列

  • Direct-path INSERT is not supported for an index-organized table (IOT) if it is not partitioned, if it has a mapping table, or if it is reference by a materialized view.##对于非分区的索引组织表或者是含有mapping table的索引组织表,或者是被视图引用的索引组织表是无法使用直接路径加载的。

  • Direct-path INSERT into a single partition of an index-organized table (IOT), or into a partitioned IOT with only one partition, will be done serially, even if the IOT was created in parallel mode or you specify the APPEND or APPEND_VALUES hint. However, direct-path INSERT operations into a partitioned IOT will honor parallel mode as long as the partition-extended name is not used and the IOT has more than one partition.##当使用Direct-pathINSERT向单分区或者只有一个分区的索引组织表中插入时,即使索引组织表是使用并行模式建的,也只会使用串行方式插入。但是,只要在使用Direct-pathINSERT向索引组织表中插入数据时,我们没有指定具体的分区扩展名(如 insert into t_part partition(P1) values(1,'***','***')),并且索引组织表有多个分区,我们就可以使用并行模式

  • The target table cannot have any triggers or referential integrity constraints defined on it.##目标表上不能含有触发器,以及被引用的完整性约束

  • The target table cannot be replicated.##目标表不能被复制(说实话没理解这一条说的是什么意思!!!)

  • A transaction containing a direct-path INSERT statement cannot be or become distributed.##包含direct-pathINSERT的事物不能被分布式

You cannot query or modify direct-path inserted data immediately after the insert is complete. If you attempt to do so, anORA-12838 error is generated. You must first issue aCOMMIT statement before attempting to read or modify the newly-inserted data.

##在同一个事物中如果direct-path inserted执行完成后,提交之前,你不能对目标表进行query以及dml等操作,否则会报ORA-12838错误

 

使用直接路径的方法:

The following direct-path INSERT methods:

  • Direct path SQL*Loader

  • CREATE TABLE AS SELECT statements

  • Parallel INSERT statements

  • INSERT statements with an APPEND orAPPEND_VALUES hint