2010-12-15, 16:39   #3
xilman

Quote:
Originally Posted by thread
Can you be more specific about what kind of block device that is?
Yes, though I was specifically unspecific because I didn't know whether the advert would be acceptable here.

We have a system which requires read-only random access to a database of a few terabytes. Each access is for only a small amount of data, and the truly random access pattern means that there is very little, if any, point in trying to optimize for locality of storage except, perhaps, at the level of entire disks. Each datum is a kilobyte or less, smaller than the block sizes of most storage systems these days. Bandwidth is not really an issue; what we want to maximise is the number of IO operations per second (IOPS).
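For anyone who wants to reproduce that kind of workload on their own hardware, here is a rough sketch of a random small-read IOPS measurement in Python. It is not our benchmark; the function name, the 1 KiB read size and the read count are all assumptions made for illustration. It is single-threaded and synchronous, so it only shows the latency-bound rate at queue depth 1, and because it does not bypass the page cache the figure will be inflated on a file that fits in RAM.

```python
import os
import random
import time


def measure_random_read_iops(path, read_size=1024, num_reads=10_000, seed=0):
    """Time num_reads random reads of read_size bytes and return reads/second.

    Hypothetical helper for illustration: one thread, one outstanding
    request, reads go through the page cache.
    """
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        # lseek to the end reports the size of regular files and of
        # block devices alike (fstat reports 0 for a block device).
        size = os.lseek(fd, 0, os.SEEK_END)
        slots = size // read_size
        if slots == 0:
            raise ValueError("target is smaller than one read")

        start = time.perf_counter()
        for _ in range(num_reads):
            # Pick a read_size-aligned offset uniformly at random.
            offset = rng.randrange(slots) * read_size
            data = os.pread(fd, read_size, offset)
            if len(data) != read_size:
                raise IOError("short read at offset %d" % offset)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return num_reads / elapsed
```

Calling measure_random_read_iops on a large file or (with suitable permissions) a raw device node gives a per-thread baseline. A realistic test would run many of these in parallel, or use asynchronous IO, to keep every disk's queue full.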

The system presently runs a Linux kernel on each of two dual-proc multi-core servers, which are fitted with plenty of gigabytes of RAM and enough PCI buses and SATA controllers to let us hang over a hundred SSDs on them if we wish. At the moment we are finding it hard to get more than 300k IOPS no matter how much hardware we throw at the problem. The limitation is the SATA driver, which itself lives under a SCSI layer. The data lives in a standard Linux-supported file system and we've pulled all the well-known optimization tricks, such as aligning the filesystem to the SSD data block boundaries, using the noop scheduler, and so forth. Note that we would be just as happy with raw device IO as with going through the filesystem; it's just that the device driver is the limiting factor at present, not the file system or SCSI layers, so the convenience of access through the file system is essentially cost-free. The SSDs can each sustain several tens of thousands of IOPS (we've measured 75k IOPS from a single disk), so multiply that by a hundred disks working flat out and the theoretical peak performance should be well into the several-million IOPS range (about 7.5 million at the measured rate).

One million IOPS is the minimum target performance. Three million would be much more in line with what we'd like and any more would be an undoubted bonus.
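To put those numbers side by side, here is a back-of-envelope calculation in Python. The only inputs are figures quoted above (75k IOPS measured per disk, roughly 300k IOPS currently achieved, a hundred disks, and the 1M and 3M targets); the constant and function names are made up for the sketch, and it assumes perfect linear scaling across disks, which is exactly what we are not getting at the moment.

```python
PER_DISK_IOPS = 75_000      # measured from a single SSD
OBSERVED_IOPS = 300_000     # roughly what the whole system achieves today
DISK_COUNT = 100            # "over a hundred" disks can be attached


def disks_needed(target_iops, per_disk_iops=PER_DISK_IOPS):
    """Smallest number of disks that reaches target_iops if scaling is linear."""
    # Integer ceiling division, avoiding floating point.
    return -(-target_iops // per_disk_iops)


theoretical_peak = DISK_COUNT * PER_DISK_IOPS          # 7,500,000
fraction_achieved = OBSERVED_IOPS / theoretical_peak   # 0.04

print("theoretical peak:", theoretical_peak)
print("fraction of peak achieved:", fraction_achieved)
print("disks for 1M IOPS:", disks_needed(1_000_000))   # 14
print("disks for 3M IOPS:", disks_needed(3_000_000))   # 40
```

In other words, the present 300k IOPS is what four disks alone could deliver, about 4% of the hundred-disk peak, and even the three-million target would need only forty disks if the driver were not in the way.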

Why, are you capable of and interested in writing a driver, or do you know someone who is?


Paul

Last fiddled with by xilman on 2010-12-15 at 16:40 Reason: Fix clumsy phrasing