Saturday, July 31, 2010
 
Chenbro 5 disk hot swap bay.
 
Roll Your Own SAN
 
We've been experimenting with vSphere 4 at work and a lot of the higher end features require the use of a SAN.
This is so that multiple vSphere hosts can access the same storage for things like vMotion.
 
Buying an enterprise SAN was out of the question, so I looked for alternatives and found Openfiler and FreeNAS.
 
I quickly settled on Openfiler.  It seemed easier to use and has better support for iSCSI, which is what I wanted to use with vSphere.
 
I set up a junk PC with Openfiler and it worked wonderfully.
However, the hardware was junk and not suitable for use in a work environment in terms of reliability and performance.
 
So, to build a better SAN, I needed better hardware.  I put together the following shopping list.  It might not be ideal but seems to me to be OK for the money we had to spend.
 
PARTS LIST
 
1 x Intel E8400 CPU (Overclocked from 3.0 to 3.6GHz)
1 x ASUS P5Q Pro Turbo mainboard
4 x 1GB DDR2 RAM (See stumbling block below for my RAM pains)
2 x Intel Pro 1000MT network adapters (already had these lying around)
1 x 150GB Velociraptor hard drive (for booting Openfiler)
1 x 3Ware 9650SE 12ML RAID controller (Openfiler has built-in support for this device)
1 x Sapphire HD2400Pro (cheapest passively cooled card I could get)
+ Misc cables etc.
 
The Cosmos S case has seven 5.25" drive bays.  I figured I could put the two Chenbro bays in there, plus an additional hotswap drive carrier I already had in the remaining bay (with the boot disk in it).
But wait, that's only 10 of the 12 drives on the list...  Yes, unfortunately the other two 1TB drives will be mounted internally.  A case with nine 5.25" bays pushed the price up too high and would have required a third Chenbro bay.  I bet the first drive to fail will be one of the internally mounted ones, just to annoy me.
 
The case does have a nice cooling fan on the internal 3.5" bay though, so the internally mounted drives should be nice and cool.  (In fact one of the main reasons I bought this case was the number of cooling fans).
  
 
PUTTING IT TOGETHER
 
Once all the computer components were connected the system just did not want to boot with more than one stick of RAM on the mainboard.  I spent hours on this with different combinations/slots/types of RAM without luck.  Bumping RAM voltage in BIOS didn't work either.
From what I found by Googling, the ASUS P5Q Pro Turbo mainboard is picky about RAM.
 
Eventually I stole some RAM from a machine that was a little older and ran at 1.9 volts as opposed to the RAM I bought which was 2.1 volts.  Bingo!  It worked, with all slots populated.
 
I'm rather upset about this, as the RAM I bought (Geil GX22GB6400UDC) is on ASUS's compatibility list for this mainboard, yet the board didn't like it.  I also tried (OCZ) OCZ2G8002GK and (Geil) GX21GB6400DC, neither of which worked.
 
The RAM that did finally work is (Geil) GE22GB800C4DC. (Pictured right).
 
 
 
Now that I had the system booting, I thought I'd try to clock it up a bit faster.  Amazing!  I hit 4.07GHz without even trying too hard, an overclock of over 1GHz.  (It was not stable at this speed though.)  This was with the stock cooler and just a modest CPU voltage bump.
I ended up clocking it to 3.6GHz and it seems very stable.  These CPUs are amazing value.  With more voltage fiddling, FSB tuning and a better cooler, I'm sure you could do what many others are doing and run this chip at 4GHz or more.
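
For reference, the clock speeds above follow from core clock = FSB x multiplier.  The sketch below assumes the E8400's stock 9x multiplier and that only the FSB was raised; the actual BIOS settings used are not recorded here, so treat the FSB figures as illustrative.

```python
# Assumption: the E8400's stock 9x multiplier was left unchanged and only
# the FSB was raised. The real BIOS settings are not recorded in the post.
MULTIPLIER = 9


def fsb_for_clock(core_ghz, multiplier=MULTIPLIER):
    """Return the FSB (in MHz) needed to reach a core clock given in GHz."""
    return core_ghz * 1000 / multiplier


# Stock speed, the stable overclock, and the unstable maximum from the text.
for ghz in (3.0, 3.6, 4.07):
    print(f"{ghz:.2f}GHz needs an FSB of roughly {fsb_for_clock(ghz):.0f}MHz")
```

Under that assumption, 3.6GHz corresponds to a 400MHz FSB, which is a common target for this chip.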
 
 
Warning on the case.
I went to install the Chenbro hotswap units in the Cosmos case and they don't fit!  The Cosmos 5.25" bays have small shelves that stick out, ideal for mounting CD-ROM drives etc.  However, each Chenbro unit takes up three bays and the shelves block the insertion.
This was solved with brute force: I bent all the shelf tabs back out of the way with a hammer.  Be careful, if you slip you could cut yourself badly.  Try to find a case that doesn't have these shelves, although I think most cases have them.
 
The 3Ware 9650SE slotted in nicely and I connected the Chenbro units to the controller with the cables that were supplied with the 3Ware controller.
The Velociraptor is hooked to one of the mainboard's Intel ICH10 SATA ports.  I have disabled the Micron controller in BIOS as I find its performance is very poor.  The ICH10 ports are set to IDE - Enhanced in BIOS.  (AHCI won't work with Openfiler as far as I know.)
 
 
 
 
 
 

 
 
RAID LEVEL

A big question to ask yourself is what RAID level to use?
 
With 12 disks I'm looking at RAID6 or RAID10.
 
RAID6 survives the loss of ANY two drives.  RAID10 can survive multiple disk losses as well, but only if no two failed disks are in the same mirror; lose both halves of a mirror and your data is gone.
 
Write speed is normally RAID6's main failing, but the 3Ware card I'm going to be using calculates the XOR in parallel and boasts high write speeds.  Read speeds on RAID6 are great.  RAID6 is also much more efficient in terms of disk space than RAID10.  RAID10 may still perform better.  Be nice to benchmark, eh?
 
3Ware's white paper on RAID6 recommends using RAID6 over RAID10 when the array has more than four drives.  Looking forward to testing that claim.
 
So, I decided to give RAID6 a go first.  I don't really need a hot spare, since I can lose up to two disks.  However, I thought it would be worthwhile because with 1TB disks, rebuilds can take a long time.  Best to have that hot spare sitting there rebuilding whilst you get a replacement drive in.
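
To put numbers on the space and fault tolerance trade-off, here is a minimal sketch of the usual capacity arithmetic for 12 x 1TB disks.  The function names are made up for illustration, and the RAID10 survival figure assumes a second failure is equally likely to hit any remaining drive.

```python
DISKS = 12
DISK_TB = 1


def raid6_usable_tb(disks, size_tb):
    # RAID6 spends two disks' worth of space on parity.
    return (disks - 2) * size_tb


def raid10_usable_tb(disks, size_tb):
    # RAID10 mirrors every disk, so only half the raw space is usable.
    return disks // 2 * size_tb


def raid10_survives_second_failure(disks):
    # After one failure, only the failed disk's mirror partner is fatal to
    # lose: 1 of the remaining (disks - 1) drives. Assumes the second
    # failure is equally likely on any remaining drive.
    return (disks - 2) / (disks - 1)


print("RAID6, 12 disks:           ", raid6_usable_tb(DISKS, DISK_TB), "TB")
print("RAID6, 11 disks + 1 spare: ", raid6_usable_tb(DISKS - 1, DISK_TB), "TB")
print("RAID10, 12 disks:          ", raid10_usable_tb(DISKS, DISK_TB), "TB")
print(f"RAID10 survives a 2nd random failure: "
      f"{raid10_survives_second_failure(DISKS):.0%}")
```

So RAID6 gives 10TB (9TB if one disk is held back as a hot spare) against 6TB for RAID10, and RAID10 has roughly a 1 in 11 chance of losing everything on a second failure.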
 
Booted the system, hit Alt-3 to get into the RAID controller BIOS and set all 12 disks up as a RAID 6 volume with a 256KB block size.  Write cache was turned on.  (I'm just benchmarking at this stage, so write cache without the battery backup unit for the controller is OK.)
Then I waited while the system built the array, and waited some more.  At five and a half hours in I was only up to 28% done.  Sigh.  Will continue tomorrow when hopefully the array is done!
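
A rough estimate of how long the build will take in total, assuming it progresses at a constant rate (which may not hold in practice):

```python
def estimate_total_hours(elapsed_hours, fraction_done):
    """Linear extrapolation: assumes the build rate stays constant."""
    return elapsed_hours / fraction_done


elapsed = 5.5   # hours so far
done = 0.28     # 28% complete

total = estimate_total_hours(elapsed, done)
remaining = total - elapsed
print(f"Estimated total: {total:.1f}h, still to go: {remaining:.1f}h")
```

That works out to a bit under 20 hours for the whole initialisation, with about 14 hours still to go.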
 
 
OPENFILER INSTALLATION
 
Next I grabbed the 64bit build of Openfiler.  The 64bit version is recommended if your hardware supports it.  (I'm unsure of the exact reasons for this; perhaps someone can contact me and tell me?)  I burned the ISO to a CD and set that aside.
 
I have a USB CDROM drive, which is invaluable when working on servers: you don't need to waste a slot or buy a CDROM drive every time you buy a machine.  I used this to boot the installer.
Using the 64bit version of Openfiler 2.3, I installed the system and did the base configuration.
 
Ran the following commands to update the system:
conary update conary
conary updateall
 
Compiled the driver for the ASUS onboard NIC.  (Atheros).  Installed driver.
Installed iometer and fired up dynamo pointing it to my Windows iometer host.
 
 
BENCHMARKING
 
IOmeter seems to be almost a standard for measuring IOPS on systems.
I installed the dynamo portion on Openfiler and then had to decide what tests to run.
Rightly or wrongly I chose to use storagereview.com standard database benchmark settings for IOmeter.  The settings are displayed here.
 
This test consists of 8KB data chunks with 67% read operations and 33% writes, 100% random.
If anyone can suggest better test parameters I'm all ears as in this area I'm not well informed.
Given I am testing RAID6 vs RAID10 it probably does not matter too much as long as results show a comparison between these two systems.
 

 
As an initial test, I ran this benchmark on the Openfiler boot disk, a 150GB Western Digital Velociraptor hooked onto the motherboard's Intel ICH10 SATA port.
 
Results:
 
150GB WD Velociraptor - Database test
Total IOPS 168.22
Total MBs per Second 1.31
Average I/O Response Time 5.94 ms
Maximum I/O Response Time 27.1 ms
 
 
 
 
 
 
 
These results are very close to commercial reviews of the same drive so I'm a bit more confident of providing some decent benchmarks.
 
 

 
Next test, RAID 6 on the 3Ware 9650SE controller.  The RAID 6 array comprises 12 x 1TB Seagate ST31000528AS drives.
A 256KB block size was used when creating the array, as advised by the 3Ware BIOS.
 
Results:
 
3Ware RAID 6 256KB block size - Database test
Total IOPS 110.79
Total MBs per Second 0.87
Average I/O Response Time 9.02 ms
Maximum I/O Response Time 516.8 ms
 
Err, these results don't look good at all!
I tried tweaking a bit, but seem to be stuck at around this level of performance.
Either something is wrong with the setup, which I doubt, or the 3Ware controller under RAID 6 is very slow.
 
 
 

 
Next test, RAID 10 on the 3Ware 9650SE controller.  The RAID 10 array comprises 12 x 1TB Seagate ST31000528AS drives (6 x 2-drive mirrors).
A 64KB block size was used when creating the array.
 
Results:
 
3Ware RAID 10 64KB block size - Database test
Total IOPS 132.92
Total MBs per Second 1.04
Average I/O Response Time 7.52 ms
Maximum I/O Response Time 993.2 ms
 
Better. But not great either.
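
One possible reading of these numbers, offered as a sanity check rather than a diagnosis: by Little's law, the average number of I/Os in flight equals IOPS multiplied by average response time.  The sketch below applies that to the three database results above.  The queue depth used in the tests is not recorded here, so the conclusion is an inference from the figures only.

```python
# (Total IOPS, average response time in ms) from the database tests above.
results = {
    "Velociraptor": (168.22, 5.94),
    "RAID 6":       (110.79, 9.02),
    "RAID 10":      (132.92, 7.52),
}


def outstanding_ios(iops, avg_response_ms):
    """Little's law: average I/Os in flight = throughput x latency."""
    return iops * avg_response_ms / 1000.0


for name, (iops, avg_ms) in results.items():
    print(f"{name:13s} ~{outstanding_ios(iops, avg_ms):.2f} I/Os in flight")
```

All three come out at almost exactly 1, which would be consistent with the tests running with a single outstanding I/O.  If that is the case, a 12 disk array can only keep one request busy at a time, so it would not be expected to beat a single fast disk on random I/O; raising the number of outstanding I/Os in IOmeter would be worth trying before blaming the controller.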
 
 
 
 
 
 
 
 
This is all very confusing.  If I run the default iometer test (2KB blocks, 100% random, 67% read), I get 241.87 IOPS with two workers.
512B block, 75% read, 1 worker = 4393 IOPS
4KB block, 75% read, 1 worker = 2773 IOPS
16KB block, 75% read, 1 worker = 1545 IOPS
 
Setting a 64KB block, 100% sequential, 100% read = 3561 IOPS, 222MB/sec, which is not bad.
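
As a cross-check on the reported figures, throughput should simply be IOPS multiplied by block size.  A minimal sketch, assuming 1MB is counted as 1024KB (the reported numbers line up with that convention):

```python
def mb_per_sec(iops, block_kb):
    """Throughput in MB/s, taking 1MB = 1024KB (an assumption)."""
    return iops * block_kb / 1024.0


# (label, IOPS, block size in KB, reported MB/s) taken from the results above.
tests = [
    ("Velociraptor, 8KB database", 168.22, 8, 1.31),
    ("RAID 6, 8KB database",       110.79, 8, 0.87),
    ("RAID 10, 8KB database",      132.92, 8, 1.04),
    ("64KB sequential read",       3561,   64, 222),
]

for label, iops, block_kb, reported in tests:
    print(f"{label:28s} calculated {mb_per_sec(iops, block_kb):7.2f} MB/s, "
          f"reported {reported} MB/s")
```

The calculated values match the reported ones, so the low MB/s in the database tests is purely a consequence of low IOPS at a small block size, not a separate problem.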
 
 Any suggestions?
 
 
I have since discovered that the Seagate ST31000528AS drives I used are not supported by 3Ware and that I should be using "enterprise class" drives
(such as the Seagate ST31000340NS).
I have been experiencing drive timeout errors on multiple disks and reallocated sector issues galore.
I have been replacing the disks that are reporting errors and the system seems to be coming good now.  I have two drives left that show reallocated sectors and these will be swapped out soon.
 
Guess this is what happens when you try and go too cheap!
 
 

 

Copyright 2009 by Simon Shaw where applicable.