log file sync

时间:2022-02-16 12:59:38

Recently, our application system has updated one app. I receive a email of complain the db server changing slowly, after some days.

I see the war reports and find one top event which is the reason why the db sever changing slowly.

Event Waits Time(s) Avg wait (ms) % DB time Wait Class
log file sync 623,700 15,985 26 84.99 Commit
DB CPU   2,124   11.29  
db file scattered read 226,702 549 2 2.92 User I/O
db file sequential read 137,232 251 2 1.34 User I/O
SQL*Net more data from client 606,953 103 0 0.55 Network

log file sync:

The Oracle "log file sync" wait event is triggered when a user session issues a commit (or a rollback). The user session will signal or post the LGWR to write the log buffer to the redo log file. When the LGWR has finished writing, it will post the user session. The wait is entirely dependent on LGWR to write out the necessary redo blocks and send confirmation of its completion back to the user session. The wait time includes the writing of the log buffer and the post, and is sometimes called "commit latency".

Reducing oracle waits

If a SQL statement is encountering a significant amount of total time for this event, the average wait time should be examined. If the average wait time is low, but the number of waits is high, then the application might be committing after every row, rather than batching COMMITs. Oracle applications can reduce this wait by committing after "n" rows so there are fewer distinct COMMIT operations. Each commit has to be confirmed to make sure the relevant REDO is on disk. Although commits can be "piggybacked" by Oracle, reducing the overall number of commits by batching transactions can be very beneficial.

If the SQL statement is a SELECT statement, review the Oracle Auditing settings. If Auditing is enabled for SELECT statements, Oracle could be spending time writing and commit data to the AUDIT$ table.

If the average wait time is high, then examine the other log related waits for the session, to see where the session is spending most of its time. If a session continues to wait on the same buffer# then the SEQ# column of V$SESSION_WAIT should increment every second. If not then the local session has a problem with wait event timeouts. If the SEQ# column is incrementing then the blocking process is the LGWR process. Check to see what LGWR is waiting on as it may be stuck.

If the waits are because of slow I/O, then try the following:

  • Reduce other I/O activity on the disks containing the redo logs, or use dedicated disks.
    • Try to reduce resource contention. Check the number of transactions (commits + rollbacks) each second, from V$SYSSTAT.
  • Alternate redo logs on different disks to minimize the effect of the archiver on the log writer.
  • Move the redo logs to faster disks or a faster I/O subsystem (for example, switch from RAID 5 to RAID 1).
  • Consider using raw devices (or simulated raw devices provided by disk vendors) to speed up the writes.
  • See if any activity can safely be done with NOLOGGING / UNRECOVERABLE options in order to reduce the amount of redo being written.
  • See if any of the processing can use the COMMIT NOWAIT option (be sure to understand the semantics of this before using it).
  • Check the size of the log buffer as it may be so large that LGWR is writing too many blocks at one time.
Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO SQL Id SQL Module SQL Text
                 
68.17 140,990 0.00 0.36 97.31 0.00 7m9wpnw8jg207   update ts_receipt_bill_biz set...
64.30 131,830 0.00 0.34 99.30 0.07 b17sff2vp0rt6   update ts_receipt_bill set dis...

THERE ARE LOTS OF UPDATING OPERATION IN THIS DB VIA AWR REPORT. AND THE PROGRAMER TELL ME THAT THERE IS A BATH PACKAGE FOR THE APPLICATION WHICH WILL BE UPDATING MORE THAN 10000 ROWS AND WITH COMMIT EVERY 30 ROWS. SO MODIFY THE PROGRAM CAN SOLVE THE TOP EVENT.