How Lack of Primary Keys Affects GoldenGate Replication
Oracle GoldenGate works best when tables have a Primary Key (PK) or Unique Index (UI).If a table does not have a unique way to identify each row, GoldenGate:
- Tries its best to guess which row to update or delete
- May update the wrong row
- Can make source and target data go out of sync
How updates really work
If no key or index exists, the replicat defaults to using all columns of the table as a key and will always attempt a single row update. When applying an update or delete, it uses all columns in the where clause plus rownum = 1, ensuring only the first matching record is affected. If multiple rows have the same values, it doesn’t matter which one is updated or deleted, as long as one is modified.
How Replicat Finds Rows
Case 1: Table has a PK or Unique Index ✅ (Best case)
- Replicat uses the PK / unique columns
- It finds exactly one row
- Updates are safe and accurate
✅ No problem
Case 2: Table has NO PK or Unique Index ❌
GoldenGate falls back to this logic:
- Use all columns as a “pseudo key”
- If some columns cannot be used (LOB, CLOB, LONG, XML, UDT):
- Those columns are excluded
- Final SQL looks like:
UPDATE tableSET col4 = 'X'WHERE col1='A' AND col2='B' AND col3='C'AND ROWNUM = 1;
👉 ROWNUM = 1 means: Only the first matching row will be updated
Real Problem Example (Very Important)
Initial data (Source and Target)
Three rows are identical:
| COL1 | COL2 | COL3 | COL4 |
|---|---|---|---|
| A | B | C | D |
| A | B | C | D |
| A | B | C | D |
✅ Everything is in sync
What application does on Source
Application updates rows individually (for example, using ROWID):
| COL1 | COL2 | COL3 | COL4 |
|---|---|---|---|
| A | B | C | X |
| A | B | C | Y |
| A | B | C | Z |
✅ Source data is correct
What happens on Target (❌ Wrong!)
GoldenGate can only identify rows using COL1, COL2, COL3
(COL4 is a CLOB → cannot be part of key)
So Replicat runs:
UPDATE mytable SET col4='X' WHERE col1='A' AND col2='B' AND col3='C' AND ROWNUM=1;UPDATE mytable SET col4='Y' WHERE col1='A' AND col2='B' AND col3='C' AND ROWNUM=1;UPDATE mytable SET col4='Z' WHERE col1='A' AND col2='B' AND col3='C' AND ROWNUM=1;
👉 All three updates likely hit the same first row
Final Target Data (OUT OF SYNC)
| COL1 | COL2 | COL3 | COL4 |
|---|---|---|---|
| A | B | C | Z |
| A | B | C | D |
| A | B | C | D |
❌ Different from Source
When It Works Correctly
If ALL columns can be used as keys, GoldenGate generates safer SQL like:
UPDATE mytableSET col4='X'WHERE col1='A' AND col2='B' AND col3='C' AND col4='D'AND ROWNUM=1;
KEYCOLS – The Practical Solution
GoldenGate allows you to define KEYCOLS:
MAP schema.mytable,KEYCOLS (col1, col2, col3);
What KEYCOLS means
- You are telling GoldenGate: “These columns uniquely identify a row based on business rules”
- Database still has no PK/UI
- Database will allow duplicates
- GoldenGate assumes uniqueness during replication
⚠️ If your assumption is wrong → replication issues will occur
How to Check If Your KEYCOLS Are Safe ✅
Run this SQL on source and target:
SELECT col1, col2, col3, COUNT(*)FROM mytableGROUP BY col1, col2, col3HAVING COUNT(*) > 1;
- No rows returned → Good
- Rows returned → ❌ Not safe as KEYCOLS
✅ Run this periodically
Initial Load Risks (No PK / UI)
- Initial load can insert duplicate rows
- GoldenGate cannot prevent duplicates
- Especially risky if:
- LOB / CLOB / XML columns exist
- Applications update those columns
👉 Data integrity cannot be guaranteed
Performance Considerations
Replication without PK/UI is slower and heavier.
Best Practices (Recommended Order)
- ✅ Add Primary Key or Unique Index (best solution)
- ✅ Use KEYCOLS with truly unique columns
- ✅ Add non‑unique indexes to help Replicat (especially for deletes)
- ⚠️ Avoid large batch deletes without indexes