Balancing Storage & CPU Organization

Each partition must not only have its own directories and file systems; those file systems must also be striped across enough disk spindles, connected through an I/O path of adequate bandwidth.

We ran tests with two partitions sharing the same I/O path and disk. A scan of a single partition at a time took 6.5 seconds; the same scan of both partitions together took 31 seconds. When partition 1 was moved to a separate device and I/O path, the scan was limited only by the slowest device in both cases: partition 1, on the slower device, took 16 seconds; partition 2 took the same 6.5 seconds; and both partitions together took 16 seconds, dictated by the slowest device as expected. When both partitions' data were read from the same spindle, contention doubled the time it took to extract the two partitions' data serially. This did not change when we varied the prefetch size, or when we ran two independent queries, one per partition, instead of a single query against both partitions.

When high-performance disk array controllers and 15K RPM disks are used, contention is handled better. The question then becomes one of bandwidth and capacity, and the bottleneck typically shifts to the I/O channel. Make sure that the scan rates used to assess scan performance are not served from the storage cache but are rates that remain sustainable after the cache is full.
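One informal way to check this is to time a single sequential pass over a file several times larger than both the storage controller cache and the OS page cache. The Python sketch below illustrates the idea; the file path and block size are placeholders, and in practice the platform's own I/O tools would normally be used.

    # Sketch only: time one sequential pass over a large test file and report
    # the observed rate. Use a file several times larger than the array cache
    # (and the OS page cache) so the result reflects sustainable disk throughput.
    import time

    def sustained_scan_rate(path, block_size=1024 * 1024):
        """Return the observed sequential read rate in MB/sec."""
        bytes_read = 0
        start = time.time()
        with open(path, "rb", buffering=0) as f:
            while True:
                chunk = f.read(block_size)
                if not chunk:
                    break
                bytes_read += len(chunk)
        elapsed = time.time() - start
        return bytes_read / elapsed / (1024 * 1024)

    # Hypothetical path to a test file placed on the partition's table space.
    # print(sustained_scan_rate("/db2data/part01/scan_test.dat"))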

Running DB2 on Intel 5000 or 7000 chipsets, a core can handle 100-150 MB/sec of I/O for each GHz of processor clock speed; thus a core running at 3 GHz can handle 300-450 MB/sec of I/O throughput. Formatting SSA 15K RPM disks with a 64 KB segment size, and assuming 90-120 MB/sec of sustainable throughput per disk, we need 4-5 usable disks to support each processor core, with commensurate I/O path bandwidth. That bandwidth corresponds to one 4 Gbps Fibre Channel port or half an SDR 4X InfiniBand port rated at 10 Gbps (InfiniBand's 8b/10b encoding transmits ten bits per data byte, so only 80% of the line rate carries payload).
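The arithmetic behind these figures can be laid out explicitly. The short Python sketch below simply recombines the rules of thumb quoted above (MB/sec per GHz, per-disk throughput, 8b/10b link encoding); the constants are planning numbers from the text, not measurements.

    # Back-of-the-envelope check of the per-core sizing rules above.
    clock_ghz = 3.0
    per_ghz_mb = (100, 150)                         # MB/sec a core can drive per GHz
    core_need_mb = tuple(clock_ghz * r for r in per_ghz_mb)    # (300.0, 450.0) MB/sec

    disk_mb = (90, 120)                             # sustainable MB/sec per 15K RPM disk
    disks_per_core = (4, 5)
    disk_supply_mb = (disks_per_core[0] * disk_mb[0],          # 360 MB/sec
                      disks_per_core[1] * disk_mb[1])          # 600 MB/sec

    # I/O path options per core, as rough usable rates after 8b/10b encoding:
    fc_4gbps_mb = 4_000 * 0.8 / 8                   # one 4 Gbps FC port, ~400 MB/sec
    ib_half_sdr_mb = 10_000 * 0.8 / 8 / 2           # half an SDR 4X port, ~500 MB/sec

    print(core_need_mb, disk_supply_mb, fc_4gbps_mb, ib_half_sdr_mb)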

Under this prescription, the IBM DS5300 storage system, with dual RAID controllers, 32 4X Fibre Channel ports, and 256 HDDs, supports 30-32 processor cores at up to 8 drives per core, leaving 16 to zero spare drives. Using 146 GB drives and RAID5 (7+P), each partition can carry around 1 TB of usable data. Using RAID5 (5+P), with RAID1 for logs and temporary space, is another alternative. A further option is RAID5 (7+P) for a pair of nodes with a four-disk RAID10 configuration for log and temporary space. All of these configurations preserve adequate I/O bandwidth; the choice depends on how much active versus inactive data must be stored and how much data you want to place in each partition. For higher concurrency and mixed workloads, allow 1.25 cores and 200-250 GB of active data per partition. With 8-core servers, this also allows for failover scenarios.
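A quick sketch of the arithmetic behind this layout, using only the figures quoted above:

    # Drive counts and usable capacity for the DS5300 layout described above.
    total_drives = 256
    drives_per_core = 8
    cores_supported = (30, 32)
    spares = tuple(total_drives - c * drives_per_core for c in cores_supported)  # (16, 0)

    drive_gb = 146
    raid5_7p_usable_gb = 7 * drive_gb       # ~1022 GB usable per 7+P array,
                                            # i.e. roughly 1 TB per partition
    print(spares, raid5_7p_usable_gb)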

On the server side, one of the following will be required (each option delivers roughly the same usable bandwidth; see the sketch after this list):

  • 4 X 8 Gbps Fibre Channel ports, or
  • 8 X 4 Gbps Fibre Channel ports (four dual adapters), or
  • 4 X 4x SDR InfiniBand ports (rated at 10 Gbps each), or
  • 2 X 4x DDR InfiniBand ports (rated at 20 Gbps each)
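As a rough sanity check, and assuming 8b/10b encoding on every link with no further protocol overhead counted, all four options land at about the same usable bandwidth, in line with roughly eight cores at 300-450 MB/sec each:

    # Approximate usable bandwidth of each server-side option (8b/10b assumed).
    options_gbps = {
        "4 x 8 Gbps FC":         4 * 8,
        "8 x 4 Gbps FC":         8 * 4,
        "4 x 4X SDR InfiniBand": 4 * 10,
        "2 x 4X DDR InfiniBand": 2 * 20,
    }
    for name, line_rate_gbps in options_gbps.items():
        usable_gb_per_sec = line_rate_gbps * 0.8 / 8   # ~3.2 GB/sec for each option
        print(f"{name}: ~{usable_gb_per_sec:.1f} GB/sec usable")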
VDM Access: