Oracle Exadata is a powerful platform designed to run Oracle Database faster and more efficiently. What makes it special is the unique algorithms and intelligent software that manage both database servers and storage servers. Let’s break down the key components in simple words.
Intelligent Storage and Network
Exadata uses smart algorithms in:
- Storage servers → to handle data intelligently, offloading SQL queries and backups.
- PCI-based flash → for super-fast caching and logging.
- RDMA Network Fabric → allows servers to talk directly to each other’s memory, reducing delays.
This combination delivers higher performance at lower cost compared to traditional platforms.
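To make the idea of SQL offload concrete, here is a minimal Python sketch that contrasts shipping every row to the database server with filtering on the storage side. It is a toy model, not Exadata code: `StorageCell`, `read_all`, and `offload_scan` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]


@dataclass
class StorageCell:
    """Toy stand-in for a storage server holding some rows (hypothetical)."""
    rows: List[Row]

    def read_all(self) -> List[Row]:
        # Traditional path: every row crosses the network to the database server.
        return list(self.rows)

    def offload_scan(self, predicate: Callable[[Row], bool], columns: List[str]) -> List[Row]:
        # Offloaded path: filter rows and keep only the requested columns
        # on the storage side, so less data is returned.
        return [{c: r[c] for c in columns} for r in self.rows if predicate(r)]


cell = StorageCell(rows=[
    {"id": 1, "region": "EU", "amount": 120},
    {"id": 2, "region": "US", "amount": 80},
    {"id": 3, "region": "EU", "amount": 45},
])

shipped_without_offload = cell.read_all()
shipped_with_offload = cell.offload_scan(lambda r: r["region"] == "EU", ["id", "amount"])
```

In this sketch the offloaded scan returns two narrow rows instead of three full ones; the saving grows with table size and filter selectivity.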
Software on Database Servers
Each Exadata database server runs:
- Oracle Database instances → the actual database engine.
- Oracle Database Resource Manager (DBRM) → ensures fair resource allocation among workloads.
- DBMCLI (command-line tool) → lets administrators configure, monitor, and manage the database server.
- Management Server (MS) → processes commands and coordinates services.
- Restart Server (RS) → monitors services and restarts them if they fail.
- Oracle Grid Infrastructure (GI) → provides clustering for Oracle RAC and manages storage with ASM.
- Exascale support → if configured, adds services for Exascale storage clusters.
Software on Storage Servers
Each Exadata storage server includes:
- CellCLI (command-line tool) → for managing storage software.
- Cell Server (CELLSRV) → the heart of storage services, handling SQL offload and I/O Resource Management (IORM).
- Cell Offload Server (CELLOFLSRV) → helper process that supports SQL offload for different Oracle Database versions.
- Management Server (MS) → works with CellCLI to process commands.
- Restart Server (RS) → keeps services alive by restarting them if needed.
- Exascale storage services → if configured, provide advanced clustered storage features.
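The Restart Server's role (watch services, restart the ones that fail) can be sketched as a single monitoring pass. This is only an illustration of the pattern under simple assumptions: `restart_failed` and the `restart` callback are hypothetical, and the real RS implementation is not described in this article.

```python
from typing import Callable, Dict, List


def restart_failed(services: Dict[str, bool], restart: Callable[[str], bool]) -> List[str]:
    """Check each service once and try to restart any that is down.

    `services` maps a service name to True (running) or False (down).
    `restart` is a caller-supplied function that returns True on success.
    Returns the names of the services that were restarted.
    """
    restarted = []
    for name, running in services.items():
        if not running and restart(name):
            services[name] = True  # mark the service as running again
            restarted.append(name)
    return restarted


# Example state: MS is down, CELLSRV is healthy. The lambda pretends every restart works.
state = {"CELLSRV": True, "MS": False}
restarted = restart_failed(state, restart=lambda name: True)
```

A real monitor would run such a pass repeatedly and handle restarts that fail; the sketch shows only the core check-and-restart step.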
Administration Tools
Admins can manage Exadata using secure network connections and several tools:
- DBMCLI → for database servers.
- CellCLI → for storage servers.
- dcli → run commands across multiple servers at once.
- ExaCLI → manage servers remotely.
- exadcli → run ExaCLI commands on multiple remote servers at once.
- ESCLI → manage Exascale storage clusters.
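Tools such as dcli follow a fan-out pattern: send one command to many servers and gather the output per server. The Python sketch below shows that pattern only. It does not call dcli or any Exadata tool; `run_on_all` and `fake_runner` are hypothetical helpers, and the host names are made up.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_on_all(hosts: List[str], command: str,
               runner: Callable[[str, str], str]) -> Dict[str, str]:
    """Run `command` on every host in parallel and collect the output per host."""
    with ThreadPoolExecutor(max_workers=max(1, len(hosts))) as pool:
        futures = {host: pool.submit(runner, host, command) for host in hosts}
        return {host: future.result() for host, future in futures.items()}


def fake_runner(host: str, command: str) -> str:
    # Stand-in for real remote execution (for example over SSH); no network is used.
    return f"{host}: ran '{command}'"


results = run_on_all(["cell01", "cell02"], "uptime", fake_runner)
```

Passing the runner in as a function keeps the fan-out logic separate from the transport, so the same loop works with a real remote-execution function.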
Why It Matters
- Performance: SQL queries and backups are offloaded to storage, freeing database servers.
- Reliability: Services are monitored and restarted automatically.
- Scalability: Works with Oracle RAC and Exascale storage for large deployments.
- Ease of Management: Multiple command-line tools simplify administration.
Storage Server Disk Hierarchy
Oracle Exadata storage servers organize data in a layered hierarchy:
1. Physical Disks
- Each server has multiple disks: HDDs or flash devices.
- On most models → one disk = one LUN (Logical Unit Number).
- On Exadata X10M Extreme Flash (EF) → each large flash device is split into 2 LUNs, giving 8 LUNs per server.
2. Cell Disks
- A cell disk reserves space on a LUN for Exadata System Software.
- Each LUN can hold only one cell disk.
3. Grid Disks (ASM)
- If using Oracle ASM for storage management:
- Cell disks are divided into grid disks.
- Grid disks are exposed to ASM disk groups.
- Multiple grid disks allow different ASM clusters and databases to share the same physical disk.
4. Pool Disks (Exascale)
- If using Oracle Exascale for storage management:
- Cell disks are divided into pool disks.
- Pool disks are exposed to Exascale storage pools.
- Usually only one pool disk per cell disk is needed, since Exascale is designed to securely share resources among many tenants.
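The layering above can be modeled in a few lines of Python. This is a sketch for intuition only: the class names, the `carve` method, the 12-disk count, and all sizes are assumptions chosen for illustration, not Exadata defaults.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CellDisk:
    name: str
    size_gb: int
    partitions: List[Tuple[str, int]] = field(default_factory=list)  # (name, size_gb)

    def free_gb(self) -> int:
        return self.size_gb - sum(size for _, size in self.partitions)

    def carve(self, name: str, size_gb: int) -> None:
        # A grid disk (ASM) or pool disk (Exascale) is a slice of a cell disk.
        if size_gb > self.free_gb():
            raise ValueError(f"not enough free space on {self.name}")
        self.partitions.append((name, size_gb))


@dataclass
class Lun:
    name: str
    cell_disk: CellDisk  # each LUN holds exactly one cell disk


def build_luns(device_count: int, luns_per_device: int, lun_size_gb: int) -> List[Lun]:
    luns = []
    for device in range(device_count):
        for part in range(luns_per_device):
            suffix = f"{device}_{part}"
            luns.append(Lun(f"LUN_{suffix}", CellDisk(f"CD_{suffix}", lun_size_gb)))
    return luns


# Common case: one LUN per physical disk (12 disks is an illustrative count).
typical = build_luns(device_count=12, luns_per_device=1, lun_size_gb=1000)
# X10M EF case from the text: 2 LUNs per device, 8 LUNs per server, so 4 devices.
x10m_ef = build_luns(device_count=4, luns_per_device=2, lun_size_gb=1000)

# ASM style: several grid disks share one cell disk.
cd = typical[0].cell_disk
cd.carve("DATA_grid", 700)
cd.carve("RECO_grid", 200)
```

With Exascale, the same `carve` call would typically be made once per cell disk, creating a single pool disk.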
Maintaining High Performance During Storage Interruptions in Oracle Exadata
Oracle Exadata manages data across multiple storage tiers: hard disks, flash, persistent memory (PMEM), and Exadata RDMA Memory (XRMEM). When storage interruptions occur, whether planned (like software updates) or unplanned (like disk failures), Exadata is designed to keep performance high and latency low.
Types of Storage Interruptions
- Unplanned events: hardware failures (flash or disk), predictive failures, OS crashes, or cell reboots.
- Planned events: rolling storage server software updates.
Interruptions mainly affect performance through:
- Extra I/O load needed to restore redundancy.
- Cache misses when data copies are unavailable.
Managing I/O Load During Interruptions
Exadata uses smart automation to minimize impact:
- Disk failure → Automatically drops the disk from ASM and triggers a rebalance to restore redundancy.
- Predictive failure → Option to proactively drop and rebalance before replacement.
- Flash cache write-back mode → Maintains metadata so cache can be rebuilt (resilvered) efficiently using RDMA transfers.
- Short-term interruptions → ASM performs a resync using bitmaps to copy only changed data.
- ASM power limit → Kept low to avoid overloading disks and ensure minimal impact on application I/O.
- I/O Resource Management (IORM) → Prioritizes application I/O over system operations like rebalance.
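The bitmap-based resync mentioned above is easier to see in a small example. The following Python sketch tracks which blocks change while one copy is offline and then copies only those blocks. It illustrates the general technique under simplified assumptions; `MirroredRegion` is a hypothetical class, and real ASM tracking is more involved.

```python
from typing import List, Set


class MirroredRegion:
    """Two copies of a list of blocks plus a dirty map for the offline copy."""

    def __init__(self, blocks: List[str]) -> None:
        self.primary = list(blocks)
        self.mirror = list(blocks)
        self.mirror_online = True
        self.dirty: Set[int] = set()  # indexes changed while the mirror was offline

    def write(self, index: int, data: str) -> None:
        self.primary[index] = data
        if self.mirror_online:
            self.mirror[index] = data
        else:
            self.dirty.add(index)  # remember the change instead of copying it

    def resync(self) -> int:
        """Bring the mirror back online, copying only the blocks marked dirty."""
        copied = 0
        for index in sorted(self.dirty):
            self.mirror[index] = self.primary[index]
            copied += 1
        self.dirty.clear()
        self.mirror_online = True
        return copied


region = MirroredRegion(["a", "b", "c", "d"])
region.mirror_online = False   # short interruption begins
region.write(1, "B")
region.write(3, "D")
copied = region.resync()       # interruption ends
```

Only two of the four blocks are copied here, which is why a resync after a short outage costs far less I/O than a full rebalance.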
Minimizing Cache Misses
Exadata ensures important data stays cached:
- Primary copy caching → Data read into buffer cache is also cached in flash/PMEM/XRMEM.
- Secondary copy caching → If primary is unavailable, secondary cache is already preloaded.
- Tertiary copy → Used only in rare triple-mirror scenarios (no proactive caching).
- Cache preservation during rebalance → Data moved between cells is re-cached automatically.
- Fast cache recovery → Flash cache metadata stored on SSDs allows quick rebuild after failures.
- New device warm-up → Ensures cache hit ratios are healthy before enabling new storage.
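The benefit of secondary copy caching can be shown with a simple read path: try the primary cache, fall back to the secondary cache, and go to disk only as a last resort. This Python sketch is a simplified model; `read_block` and the dictionary-based caches are hypothetical and do not reflect Exadata internals.

```python
from typing import Dict, Optional, Tuple


def read_block(block: str,
               primary_cache: Optional[Dict[str, bytes]],
               secondary_cache: Dict[str, bytes],
               disk: Dict[str, bytes]) -> Tuple[bytes, str]:
    """Return (data, source). `primary_cache` is None when that cell is offline."""
    if primary_cache is not None and block in primary_cache:
        return primary_cache[block], "primary cache"
    if block in secondary_cache:
        return secondary_cache[block], "secondary cache"
    return disk[block], "disk"  # cache miss: the slowest path


disk = {"blk1": b"x", "blk2": b"y"}
primary = {"blk1": b"x"}
secondary = {"blk1": b"x"}  # mirror copy cached ahead of time

normal = read_block("blk1", primary, secondary, disk)
during_outage = read_block("blk1", None, secondary, disk)
uncached = read_block("blk2", None, secondary, disk)
```

Because the secondary copy was cached in advance, the read during the outage is still served from cache; only the block that was never cached falls through to disk.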