Keeping GoldenGate replication working after the threads change in a RAC environment

Date: 2022-06-18 21:26:56
Reposted from: http://www.easyora.net/blog/goldengate_rac_threads_remap.html

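When the set of RAC nodes changes, for example when a node is added to or removed from the cluster, the redo log threads belonging to those nodes are added or removed as well. This scrambles the order in which GoldenGate maps log threads, so any change of this kind needs matching work on the GoldenGate side.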

For example, before a node is added, GoldenGate maps log threads in the following order (the database log thread is on the right, here and below):

1 -> 1 (say, sequence 100, RBA 1001)

2 -> 2 (say, sequence 88, RBA 3009)

After the node is added, the mapping order becomes:

1 -> 3 (sequence 88, RBA 3009)

2 -> 1 (sequence 100, RBA 1001)

3 -> 2 (new)

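When OGG resumes work, the mapping order has changed, so the saved extraction positions no longer match the threads they are applied to and extraction goes wrong.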

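If there is enough time, the simple and safe approach is to stop the source application, delete the extract, configure a new extract, and start extracting from the current point. Everything in this window must be done while the application is stopped; otherwise data will be lost or become inconsistent and will have to be repaired by hand or reinitialized.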

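What if the application cannot be stopped? In that case we can set the new extract's positions to where the old extract stopped, which avoids application downtime during the operation.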

The checkpoints that must be recorded carefully are the Current Checkpoint, the Recovery Checkpoint, and the Write Checkpoint.

1. Stop the extract normally (details omitted)

2. Record the extract's checkpoints

GGSCI (node1) 21> info ext_r1 showch

EXTRACT    EXT_R1    Last Started 2011-08-16 22:35   Status STOPPED 
Checkpoint Lag       00:00:00 (updated 00:06:21 ago) 
Log Read Checkpoint  Oracle Redo Logs 
                     2011-08-17 03:32:48  Thread 1, Seqno 62, RBA 29890576 
Log Read Checkpoint  Oracle Redo Logs 
                     2011-08-17 03:32:34  Thread 2, Seqno 42, RBA 18674704

Current Checkpoint Detail:

Read Checkpoint #1

Oracle RAC Redo Log

Startup Checkpoint (starting position in the data source): 
    Thread #: 1 
    Sequence #: 61 
    RBA: 32112656 
    Timestamp: 2011-08-16 22:34:53.000000 
    SCN: 0.3743980 (3743980) 
    Redo File: +DATA1/my/onlinelog/group_1.261.758327805

Recovery Checkpoint (position of oldest unprocessed transaction in the data source): 
    Thread #: 1 
    Sequence #: 62 
    RBA: 29890576 
    Timestamp: 2011-08-17 03:32:48.000000 
    SCN: 0.3811675 (3811675) 
    Redo File: +DATA1/my/onlinelog/group_2.262.758327805

  Current Checkpoint (position of last record read in the data source): 
    Thread #: 1 
    Sequence #: 62 
    RBA: 29890576 
    Timestamp: 2011-08-17 03:32:48.000000 
    SCN: 0.3811675 (3811675) 
    Redo File: +DATA1/my/onlinelog/group_2.262.758327805

BR Previous Recovery Checkpoint: 
    Thread #: 1 
    Sequence #: 0 
    RBA: 0 
    Timestamp: 2011-08-16 22:35:09.416136 
    SCN: Not available 
    Redo File:

BR Begin Recovery Checkpoint: 
    Thread #: 1 
    Sequence #: 62 
    RBA: 22437392 
    Timestamp: 2011-08-17 02:35:11.000000 
    SCN: 0.3798208 (3798208) 
    Redo File:

BR End Recovery Checkpoint: 
    Thread #: 1 
    Sequence #: 62 
    RBA: 24120320 
    Timestamp: 2011-08-17 02:35:16.000000 
    SCN: 0.3801192 (3801192) 
    Redo File:

Read Checkpoint #2

Oracle RAC Redo Log

Startup Checkpoint (starting position in the data source): 
    Thread #: 2 
    Sequence #: 41 
    RBA: 25323024 
    Timestamp: 2011-08-16 22:34:40.000000 
    SCN: 0.3743980 (3743980) 
    Redo File: +DATA1/my/onlinelog/group_3.266.758328125

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source): 
    Thread #: 2 
    Sequence #: 42 
    RBA: 18674704 
    Timestamp: 2011-08-17 03:32:34.000000 
    SCN: 0.3811674 (3811674) 
    Redo File: +DATA1/my/onlinelog/group_4.267.758328125

  Current Checkpoint (position of last record read in the data source): 
    Thread #: 2 
    Sequence #: 42 
    RBA: 18674704 
    Timestamp: 2011-08-17 03:32:34.000000 
    SCN: 0.3811674 (3811674) 
    Redo File: +DATA1/my/onlinelog/group_4.267.758328125

BR Previous Recovery Checkpoint: 
    Thread #: 2 
    Sequence #: 0 
    RBA: 0 
    Timestamp: 2011-08-16 22:35:09.416136 
    SCN: Not available 
    Redo File:

BR Begin Recovery Checkpoint: 
    Thread #: 2 
    Sequence #: 42 
    RBA: 15242240 
    Timestamp: 2011-08-17 02:35:02.000000 
    SCN: 0.3800455 (3800455) 
    Redo File:

BR End Recovery Checkpoint: 
    Thread #: 2 
    Sequence #: 42 
    RBA: 15242240 
    Timestamp: 2011-08-17 02:35:02.000000 
    SCN: 0.3800455 (3800455) 
    Redo File:

Write Checkpoint #1

GGS Log Trail

Current Checkpoint (current write position): 
    Sequence #: 3 
    RBA: 51132 
    Timestamp: 2011-08-17 03:32:48.695373 
    Extract Trail: /opt/ggs/dirdat/r1/ex

Header: 
  Version = 2 
  Record Source = A 
  Type = 6 
  # Input Checkpoints = 2 
  # Output Checkpoints = 1

File Information: 
  Block Size = 2048 
  Max Blocks = 100 
  Record Length = 4096 
  Current Offset = 0

Configuration: 
  Data Source = 3 
  Transaction Integrity = 1 
  Task Type = 0

Status: 
  Start Time = 2011-08-16 22:35:10 
  Last Update Time = 2011-08-17 03:32:48 
  Stop Status = G 
  Last Result = 402
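
The positions needed for the later steps are spread through a long listing, so a small script can help avoid copying errors. The following is a minimal sketch, not part of the original procedure: `parse_showch` is a hypothetical helper, and it assumes the showch text uses the same headings and field labels as the listing above (`Thread #`, `Sequence #`, `RBA`, `Extract Trail`). The sample text is a shortened copy of that listing.

```python
import re

# Field labels as they appear in the showch listing above.
FIELD = re.compile(r"^(Thread #|Sequence #|RBA|Extract Trail):\s*(\S+)")

def parse_showch(text):
    """Return (read, write) positions found in showch text.

    read  maps thread -> {checkpoint heading -> {"seqno": int, "rba": int}}
    write holds {"seqno": int, "rba": int, "trail": str} for the trail.
    """
    read, write = {}, {}
    in_write = False
    name = None    # heading of the checkpoint block being read
    entry = None   # dict that receives Sequence #/RBA for that block
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(("Read Checkpoint #", "Write Checkpoint #")):
            in_write = line.startswith("Write")
            name = entry = None
        elif line.endswith(":"):
            # "Recovery Checkpoint (...):" opens a block; any other heading
            # ("Header:", an empty "Redo File:") closes the current one.
            name = line.split("(")[0].rstrip(": ") if "Checkpoint" in line else None
            entry = write if (name and in_write) else None
        else:
            m = FIELD.match(line)
            if not m or name is None:
                continue
            key, value = m.groups()
            if key == "Thread #":
                entry = read.setdefault(int(value), {}).setdefault(name, {})
            elif entry is not None and key == "Sequence #":
                entry["seqno"] = int(value)
            elif entry is not None and key == "RBA":
                entry["rba"] = int(value)
            elif entry is not None and key == "Extract Trail":
                entry["trail"] = value
    return read, write

# Shortened copy of the listing above, used only to exercise the parser.
SAMPLE = """\
Read Checkpoint #1

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
    Thread #: 1
    Sequence #: 62
    RBA: 29890576
    Timestamp: 2011-08-17 03:32:48.000000

  Current Checkpoint (position of last record read in the data source):
    Thread #: 1
    Sequence #: 62
    RBA: 29890576

Read Checkpoint #2

  Recovery Checkpoint (position of oldest unprocessed transaction in the data source):
    Thread #: 2
    Sequence #: 42
    RBA: 18674704

  Current Checkpoint (position of last record read in the data source):
    Thread #: 2
    Sequence #: 42
    RBA: 18674704

Write Checkpoint #1

  Current Checkpoint (current write position):
    Sequence #: 3
    RBA: 51132
    Extract Trail: /opt/ggs/dirdat/r1/ex

Header:
  Version = 2
"""

read, write = parse_showch(SAMPLE)
for thread in sorted(read):
    print(thread, read[thread])
print(write)
```

Keying the result by thread number rather than by position in the listing is deliberate: the whole problem described above is that positional order cannot be trusted after the thread count changes.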

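3. Create the new extract. Because it reuses the name ext_r1, the old extract has to be deleted first (that step is not shown here); THREADS 3 matches the new number of log threads.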

GGSCI (node1) 34> ADD EXT ext_r1, BEGIN NOW, TRANLOG, THREADS 3

2011-08-17 03:52:26  INFO    OGG-01749  Successfully registered EXTRACT EXT_R1 to start managing log retention at SCN 3826107. 
EXTRACT added.

4. Set the current checkpoint (note: this must be done for every thread)

GGSCI (node1) 35> alter EXTRACT ext_r1, TRANLOG, EXTSEQNO 62, EXTRBA 29890576,thread 1 
EXTRACT altered.

GGSCI (node1) 36> alter EXTRACT ext_r1, TRANLOG, EXTSEQNO 42, EXTRBA 18674704,thread 2

EXTRACT altered.

5. Set the recovery checkpoint (note: this must be done for every thread)

GGSCI (node1) 42> ALTER EXTRACT ext_r1, IOEXTSEQNO 62, IOEXTRBA 29890576,thread 1

GGSCI (node1) 42> ALTER EXTRACT ext_r1, IOEXTSEQNO 42, IOEXTRBA 18674704,thread 2

6. Set the write checkpoint of the exttrail or rmttrail

GGSCI (node1) 47> ADD EXTTRAIL /opt/ggs/dirdat/r1/ex,SEQNO 3, RBA 51132, EXTRACT ext_r1 
EXTTRAIL added.
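
Steps 4 to 6 repeat the same command pattern for each thread, which makes typos easy. As a rough sketch, the commands can be generated from the recorded positions. `build_commands` and the dictionary layout are hypothetical, and the command text simply mirrors the forms used in the transcripts above. The sketch assumes that only threads with a recorded checkpoint are altered, so the newly added thread 3 keeps its BEGIN NOW position.

```python
def build_commands(group, read, write):
    """Build the GGSCI command strings for steps 4, 5 and 6.

    read  maps thread -> {"Current Checkpoint": {...}, "Recovery Checkpoint": {...}}
    write holds the trail's {"seqno", "rba", "trail"}.
    """
    cmds = []
    # Step 4: current checkpoint, one command per recorded thread.
    for thread in sorted(read):
        cur = read[thread]["Current Checkpoint"]
        cmds.append(
            f"ALTER EXTRACT {group}, TRANLOG, EXTSEQNO {cur['seqno']}, "
            f"EXTRBA {cur['rba']}, THREAD {thread}"
        )
    # Step 5: recovery checkpoint, one command per recorded thread.
    for thread in sorted(read):
        rec = read[thread]["Recovery Checkpoint"]
        cmds.append(
            f"ALTER EXTRACT {group}, IOEXTSEQNO {rec['seqno']}, "
            f"IOEXTRBA {rec['rba']}, THREAD {thread}"
        )
    # Step 6: write checkpoint of the trail.
    cmds.append(
        f"ADD EXTTRAIL {write['trail']}, SEQNO {write['seqno']}, "
        f"RBA {write['rba']}, EXTRACT {group}"
    )
    return cmds

# Positions taken from the showch output recorded in step 2.
recorded_read = {
    1: {"Current Checkpoint": {"seqno": 62, "rba": 29890576},
        "Recovery Checkpoint": {"seqno": 62, "rba": 29890576}},
    2: {"Current Checkpoint": {"seqno": 42, "rba": 18674704},
        "Recovery Checkpoint": {"seqno": 42, "rba": 18674704}},
}
recorded_write = {"seqno": 3, "rba": 51132, "trail": "/opt/ggs/dirdat/r1/ex"}

for cmd in build_commands("ext_r1", recorded_read, recorded_write):
    print(cmd)
```

The script only prints the commands; reviewing them by eye before pasting them into GGSCI keeps a person in the loop for a change that is hard to undo.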

7. Verify that the checkpoints were changed (use showch; details omitted)

8. Restart the extract (details omitted)
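
One way to do this check is to compare the positions recorded in step 2 with those reported by a fresh showch, thread by thread. The sketch below is illustrative only: `diff_checkpoints` is a hypothetical helper, and it assumes both sets of positions have already been collected into dictionaries keyed by thread number and checkpoint heading, and that the new showch reports the altered positions under the same headings as before.

```python
def diff_checkpoints(saved, current,
                     names=("Recovery Checkpoint", "Current Checkpoint")):
    """List (thread, heading, saved, current) for every position that differs."""
    problems = []
    for thread in sorted(saved):
        for name in names:
            want = saved[thread].get(name)
            got = current.get(thread, {}).get(name)
            if want != got:
                problems.append((thread, name, want, got))
    return problems

# Positions recorded from the old extract in step 2.
saved = {
    1: {"Recovery Checkpoint": {"seqno": 62, "rba": 29890576},
        "Current Checkpoint": {"seqno": 62, "rba": 29890576}},
    2: {"Recovery Checkpoint": {"seqno": 42, "rba": 18674704},
        "Current Checkpoint": {"seqno": 42, "rba": 18674704}},
}

# Hypothetical positions from the new extract in which thread 2's
# recovery checkpoint was left unchanged by mistake.
after_alter = {
    1: {"Recovery Checkpoint": {"seqno": 62, "rba": 29890576},
        "Current Checkpoint": {"seqno": 62, "rba": 29890576}},
    2: {"Recovery Checkpoint": {"seqno": 0, "rba": 0},
        "Current Checkpoint": {"seqno": 42, "rba": 18674704}},
}

for thread, name, want, got in diff_checkpoints(saved, after_alter):
    print(f"thread {thread} {name}: expected {want}, found {got}")
```

Only threads present in the saved data are compared, so a newly added thread with no earlier position is not reported as a mismatch.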