<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www.eggxpert.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Storage</title><link>http://www.eggxpert.com/forums/18259/ShowForum.aspx</link><description /><dc:language>en</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61120.2)</generator><item><title>Classroom 101: RAID Explained</title><link>http://www.eggxpert.com/forums/thread/162889.aspx</link><pubDate>Mon, 17 Sep 2007 17:28:33 GMT</pubDate><guid isPermaLink="false">e96c5591-d47d-4b8d-80c4-18d6411a9236:162889</guid><dc:creator>root</dc:creator><slash:comments>0</slash:comments><comments>http://www.eggxpert.com/forums/thread/162889.aspx</comments><wfw:commentRss>http://www.eggxpert.com/forums/commentrss.aspx?SectionID=18259&amp;PostID=162889</wfw:commentRss><description>&lt;p&gt;Last Updated: 10/30/2007 : 14:38 CT&amp;nbsp;&lt;/p&gt;&lt;p&gt;The purpose of this post is to explain what RAID is to someone new to building computers.&lt;/p&gt;
&lt;p&gt;RAID means &lt;b&gt;Redundant Array of Independent Drives (or Disks)&lt;/b&gt;, also known as &lt;b&gt;Redundant Array of Inexpensive Drives (or Disks)&lt;/b&gt; (source: &lt;A href="http://en.wikipedia.org/wiki/RAID" target=_blank title="http://en.wikipedia.org/wiki/RAID" target="_blank"&gt;http://en.wikipedia.org/wiki/RAID&lt;/a&gt;). If you are wondering what an Array is, it is two or more hard disk drives grouped together [logically] to appear as a single device to the host computer (source:&lt;font&gt;&lt;font size="-1"&gt;&lt;A href="http://www.google.com/url?sa=X&amp;amp;start=17&amp;amp;oi=define&amp;amp;ei=durYRseQDajAggSJ4YHmCQ&amp;amp;sig2=3dudHkR8Bzn5QYJmv16AZA&amp;amp;q=http://www.usbman.com/glossarycomputerterms.htm&amp;amp;usg=AFQjCNFkgqxy7l9eqXspfVpme7vq-PsWHQ" target=_blank target="_blank"&gt;&lt;font color="#008000"&gt;www.usbman.com/glossarycomputerterms.htm&lt;/font&gt;&lt;/a&gt;). &lt;br&gt; &lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;There are many RAID levels, or configurations if you will. Each level has its own pros and cons. This post covers the following levels:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;JBOD&lt;/li&gt;
&lt;li&gt;RAID0&lt;/li&gt;
&lt;li&gt;RAID1&lt;/li&gt;
&lt;li&gt;RAID3&lt;/li&gt;
&lt;li&gt;RAID5&lt;/li&gt;
&lt;li&gt;RAID6&lt;/li&gt;
&lt;li&gt;RAID1+0 (aka RAID10)&lt;/li&gt;
&lt;/ul&gt;
&lt;br&gt;This post will also cover the hardware and software RAID configurations, along with requirements. &lt;br&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;JBOD&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This is a derogatory term used for spanning, although some people actually mean JBOD. JBOD means "&lt;u&gt;J&lt;/u&gt;ust a &lt;u&gt;B&lt;/u&gt;unch &lt;u&gt;O&lt;/u&gt;f &lt;u&gt;D&lt;/u&gt;isks". This is what you have before you use RAID: just independent disks. Sometimes, however, people are actually referring to spanning. Spanning is the act of combining multiple drives (no matter their size) into one logical drive. There is no performance boost from doing this because it is not striping; it is concatenating the drives. When one drive fills up, writes automatically continue on the next drive. But, like striping, if one drive fails you lose all data in the array. There are two pros to doing this:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data recovery, if a drive fails, is easier than if it were striped. &lt;/li&gt;
&lt;li&gt;Unlike RAID, you aren't limited to the smallest size drive in the array.&amp;nbsp;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What do I mean by limited to the smallest size drive in the array? With RAID (0, 1, 3, 5, 6, etc), every drive only contributes as much space as the smallest drive in the array. Let's say you have RAID0 (which I'll go into in more depth in the next section), and you have two drives. One is a 50GB drive and the other is a 750GB drive. 700GB of the larger drive becomes wasted space if you use it in RAID0. The only ways I know of to utilize this extra 'dead' space are software RAID or a RAID controller built by Intel called &lt;A href="http://www.intel.com/design/chipsets/matrixstorage_sb.htm" target=_blank title="http://www.intel.com/design/chipsets/matrixstorage_sb.htm" target="_blank"&gt;Matrix RAID&lt;/a&gt;. I would highly recommend reading &lt;A href="http://storageadvisors.adaptec.com/2006/11/16/multiple-raids-on-a-common-set-of-drives/" target=_blank title="http://storageadvisors.adaptec.com/2006/11/16/multiple-raids-on-a-common-set-of-drives/" target="_blank"&gt;this article&lt;/a&gt; on the comparison between a normal RAID and a Matrix RAID after you have read up on all the different types of RAID I'll be explaining below. &lt;/p&gt;
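&lt;p&gt;To make the 50GB/750GB example concrete, here is a minimal Python sketch of the arithmetic. The function names are made up for illustration, and it only models capacity, not how any real controller lays out data.&lt;/p&gt;

```python
def spanned_capacity(drive_sizes_gb):
    # Spanning concatenates the drives, so every gigabyte is usable.
    return sum(drive_sizes_gb)


def raid0_capacity(drive_sizes_gb):
    # Striping uses only as much of each drive as the smallest drive offers.
    return min(drive_sizes_gb) * len(drive_sizes_gb)


drives = [50, 750]  # the 50GB and 750GB drives from the example above
wasted = sum(drives) - raid0_capacity(drives)

print("spanned:", spanned_capacity(drives), "GB")  # spanned: 800 GB
print("RAID0:", raid0_capacity(drives), "GB")      # RAID0: 100 GB
print("wasted in RAID0:", wasted, "GB")            # wasted in RAID0: 700 GB
```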
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;RAID0&lt;/b&gt;&lt;br&gt;&lt;br&gt;Let's start off with RAID0 since it is a popular one among the gaming community. RAID0 requires a minimum of 2 drives and uses striping, which refers to how the data is written. When you stripe data, you write across all drives, utilizing multiple drives at once. This can almost multiply the speed at which you can read and write data (i.e. up to twice as fast with 2 drives, three times as fast with 3 drives, etc...).&lt;br&gt; &lt;/p&gt;
&lt;p&gt;Let's say before you could only write 1 megabyte per second. Adding a second drive, you can now write 2 megabytes per second. The more drives you add to the array, the more you can utilize and the faster you can read/write. &lt;/p&gt;
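&lt;p&gt;If you want to picture striping, the following Python sketch deals fixed-size chunks out to the drives in turn. The chunk size and the &lt;code&gt;stripe&lt;/code&gt; helper are assumptions for illustration only; real controllers use much larger chunks and work on disk blocks rather than Python lists.&lt;/p&gt;

```python
def stripe(data, drive_count, chunk_size):
    # One list of chunks per drive.
    drives = [[] for _ in range(drive_count)]
    # Cut the data into fixed-size chunks.
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Deal the chunks out to the drives in turn, like dealing cards.
    for index, chunk in enumerate(chunks):
        drives[index % drive_count].append(chunk)
    return drives


layout = stripe(b"ABCDEFGHIJKL", drive_count=2, chunk_size=2)
for number, chunks in enumerate(layout):
    print("drive", number, chunks)
# drive 0 [b'AB', b'EF', b'IJ']
# drive 1 [b'CD', b'GH', b'KL']
```

&lt;p&gt;Each drive ends up holding half of the chunks, which is why both drives can work on the same file at once, and also why losing either drive leaves the file full of holes.&lt;/p&gt;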
&lt;p&gt;This is great! We should always do this, right? &lt;/p&gt;
&lt;p&gt;There are problems with using RAID0, unfortunately. Since a hard drive has moving parts in it, it will &lt;b&gt;&lt;i&gt;always &lt;/i&gt;&lt;/b&gt;fail over time. If one drive fails then all the drives that are in that same array become useless. This is because half of your data (or at least a portion of it) is on the failed drive. Without that portion, the Operating System (OS) can't read the data. The more drives you add to a RAID0 array, the more likely it is that one of them will fail. Keep in mind that the &lt;u&gt;&lt;i&gt;Average life of a hard drive is 4-5 years&lt;/i&gt;&lt;/u&gt;. Knowing this, and using higher level math, you can see that adding another drive lowers the expected life of the array. Adding multiple drives will decrease it even more. But this is only a risk and there are no guarantees. Sad, I know. &lt;br&gt;&lt;/p&gt;
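&lt;p&gt;The "higher level math" can be sketched in a few lines of Python. This assumes drive failures are independent of each other, and the 0.9 survival chance is a made-up number chosen only to show the trend, not a measured failure rate.&lt;/p&gt;

```python
def array_survival(per_drive_survival, drive_count):
    # RAID0 survives only if every drive survives (failures assumed independent).
    return per_drive_survival ** drive_count


PER_DRIVE_SURVIVAL = 0.9  # hypothetical figure, not a measured failure rate

for drive_count in (1, 2, 4, 8):
    chance = array_survival(PER_DRIVE_SURVIVAL, drive_count)
    print(drive_count, "drive(s):", round(chance, 3))
```

&lt;p&gt;Whatever per-drive figure you plug in, the chance of the whole array surviving shrinks with every drive you add.&lt;/p&gt;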
&lt;p&gt;Where is the "redundant" part in "Redundant Array of Independent Disks" you ask? That's right. It's not there. This is why when you use RAID0, the data should either be backed up regularly or the data is expendable. Many people use RAID0 for their applications and games. Because of the performance boost, and the small rate of change in data (i.e. fewer backups required), this is ideal for someone who has installation CDs lying around and can afford a data loss. &lt;/p&gt;
&lt;p&gt;Keep in mind, though, that many times your bottleneck isn't in the drives but in the processor, GPU, memory, or even in the bus/North Bridge. Remember: &lt;b&gt;you are only as fast as your slowest part&lt;/b&gt;. This is where monitoring utilities come into play like good ol' Task Manager (I know, bad example). If you want to learn more about using RAID0, please take a look at my Hard Drive Optimization thread (link posted at the very bottom).&lt;br&gt;  &lt;/p&gt;
&lt;p&gt;Now what if you don't want &lt;i&gt;any&lt;/i&gt; data loss? This leads us to RAID1.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;RAID1&lt;/b&gt;&lt;br&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;RAID1 uses mirroring. As the term implies, it mirrors your data to another drive. It needs a minimum of 2 drives (most people use exactly 2) and gives you complete redundancy. When you write to one drive, it writes to the other (before it acknowledges the write to the OS). Put simply, if one drive fails, the other will have all the data that was on the failed drive. No data loss. In fact, when a drive fails, you can just replace it and the array will rebuild itself--the drives will get re-synced and you won't even notice it. Beautiful. Unlike RAID0, though, you don't get a write performance boost--but you don't lose performance either (at least any good controller should write to the drives at the same time, not round robin). You do, however, get a gain in reads. This is because when you read data from a RAID1 array, it uses the same philosophy as RAID0: it reads from both drives. You might say: "Who cares about write performance, why not do this always?"&lt;/p&gt;
&lt;p&gt;Unlike RAID0, where it combines drives from a storage standpoint (i.e. if you have two 50GB drives, you will 'see' 100GB of storage), RAID1 does the opposite. In our example of two 50GB drives, in a RAID1 setup, you would only 'see' 50GB of storage. This is because the second drive is being used up for the mirror. A lot of people don't like this since you aren't utilizing the second drive's storage, and let's not forget the money to buy the second drive.&lt;/p&gt;
&lt;p&gt;What do most people use this for? The OS. This is because the OS runs mostly on memory after bootup, so the read performance is welcomed (ever had to wait for a machine to boot up and it seemed to take forever?). And since it's your OS, well, you can't access your data if your OS crashes due to a hard drive failure.&lt;br&gt;&lt;/p&gt;
&lt;p&gt;"So, root, how can I have both read/write performance, AND get redundancy, AND not loose as much storage?" Pfft, I wouldn't leave you hanging like that.&lt;br&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;RAID3&lt;/b&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Welcome to the world of parity. Parity is, for lack of a better example, a math problem's answer. Think 1 + 2 = 3. The number '3' would be the parity. RAID3, which requires a minimum of 3 drives to work, calculates a parity 'bit' when it writes to the drives in a stripe (remember the word stripe from RAID0?). What does this mean? &lt;/p&gt;
&lt;p&gt;So in our example, 1 + 2 = 3, think of each number as a drive. If one drive fails, let's say '2', you can use '3' and '1' to deduce that the second drive is '2' by subtracting the two numbers. In RAID3, the parity 'bit' is, like the example, a drive. For each write you do to the array, it stripes the data across the drives and puts a special parity 'bit' on the parity drive. That being said, one drive out of three is chewed up by the parity calculation. But remember, you aren't limited to three drives so you could have more. So if we have five drives of 50GB, you will 'see' 200GB. In this case, the penalty for redundancy isn't as bad as RAID1 (20% vs 50%). You get both read/write performance boosts with redundancy. And, like RAID1, if a drive fails, you just replace it and it rebuilds itself. Not only this, but when it does fail, you can still read/write to the array (although significantly slower)! So, what are the cons you say? There are three of them, actually.&lt;br&gt; &lt;/p&gt;
&lt;p&gt;The first two have to do with the nature of parity. &lt;/p&gt;
&lt;p&gt;First is the parity calculation itself: it takes some resources to calculate. Whether this is the CPU/memory on the motherboard or on a dedicated RAID controller card... it's still using up resources. But note that the resources for parity are used during WRITES, not READS. This means that you can have the same READ performance as a RAID0 array with the same number of drives. The second problem is that once one drive has failed, you can't lose another drive. If you lose more than one drive, you lose the entire array (like RAID0 but you get one life). The third problem with RAID3 is the dedicated parity drive. The drive that gets pinned with the parity bit actually gets more activity than the other drives. So when you have four drives in a RAID3 array, one of those drives is getting hit harder than all the others--making its life expectancy shorter than the others. More than likely, when a drive fails, it will be the parity drive.&lt;/p&gt;
&lt;p&gt;When would you use a parity type RAID? I use it for all my large media files and archival stuff. It's great to use for files where I don't need high performance (pretty much everything in a single user environment--although there are some applications I wouldn't recommend it for, like games or high Input/Output Operations Per Second (IOPS) databases).&amp;nbsp;&lt;/p&gt;
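&lt;p&gt;The 1 + 2 = 3 analogy from the RAID3 section can be turned into a small Python sketch. Real arrays generally use XOR rather than addition, because XOR-ing the surviving blocks with the parity block gives back the missing one directly. The helper names and the four-byte blocks below are hypothetical, and this is only a model of the idea, not of any particular controller.&lt;/p&gt;

```python
from functools import reduce
from operator import xor


def parity_block(blocks):
    # XOR the bytes that sit at the same position in every block.
    return bytes(reduce(xor, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    # XOR-ing the survivors with the parity yields the lost block.
    return parity_block(surviving_blocks + [parity])


data = [b"RAID", b"demo", b"data"]  # one block per data drive
parity = parity_block(data)         # what the parity drive would store

# Pretend the second drive died and rebuild its block from the others.
rebuilt = rebuild_missing([data[0], data[2]], parity)
print(rebuilt)  # b'demo'
```

&lt;p&gt;Notice that the rebuild needs every surviving block plus the parity. That is why a second failure during a rebuild is fatal for single-parity arrays.&lt;/p&gt;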
&lt;p&gt;"Well root, I see three other RAID types. Obviously some of those problems were addressed, right?"&lt;/p&gt;
&lt;p&gt;Right.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;RAID5&lt;/b&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;RAID5 is probably one of the most popular RAID types. It is pretty much exactly like RAID3 but addresses the parity drive issue by using a distributed parity bit. What this means is that instead of one dedicated drive, the parity bit floats around among all the drives. I can't say much more on this since it is pretty much RAID3 so I'll move on to RAID6.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;But before I move on to RAID6, *grin*, I'd like to highlight a troublesome problem with failing drives in a RAID5 array--and the need for RAID6. &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Drive Failure and Replacement&lt;/b&gt;&lt;br&gt;&lt;/p&gt;
&lt;p&gt;You have to understand what happens when an array rebuilds a failed drive to really comprehend the need for RAID6. Say you have an array of eight 100GB drives in a RAID5 setup. While this sounds excessive, there are many people that do this, especially at the enterprise level. Eight drives is actually &lt;i&gt;the&lt;/i&gt; magical number. There was a study that was done (by whom, and what the size of the drive was, I can't remember) that proved that eight drives was the most economical. But now I'm getting off track. &lt;/p&gt;
&lt;p&gt;Say one of the drives in that eight drive array fails. For you to rebuild that drive (when you pop in a brand new one in its place), the RAID controller has to look at &lt;i&gt;all&lt;/i&gt; of the drives in the array and has to compute the parity bit (or lack thereof) to fill in the holes. As you can imagine, this can take some time. The bigger the array, and the bigger the &lt;i&gt;size&lt;/i&gt; of the drive, the longer the rebuild takes. Since the array still can't survive a second drive failure during this rebuild process (i.e. you lose one more and you lose ALL of your data), you can see the problem in this.&amp;nbsp;&lt;/p&gt;
&lt;p&gt; At first they implemented hotspares. Hotspares are drives that do nothing but sit on the array. There is no data on them and you can't access them (usually) through normal means. The purpose of a hotspare is to be there when a drive fails so the array can immediately start rebuilding onto it. Remember, if a drive fails, &lt;i&gt;you&lt;/i&gt; have to know it failed so you can replace it. It takes time to notify you, to order a replacement, and to physically swap it in. Hotspares streamlined that process. But here's the problem: I've seen some drives take up to a &lt;i&gt;week&lt;/i&gt; to rebuild. And when you have&lt;i&gt; terabyte&lt;/i&gt; drives on the market, you can see a huge problem for these giant arrays. &lt;/p&gt;
&lt;p&gt;And yes, this is why they made RAID6.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;RAID6&lt;/b&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;As you may have guessed, RAID6 uses double distributed parity. Think of it being like RAID5 but you can lose up to 2 drives and still be able to read/write to your array (although it will be extremely slow). It does require four drives, instead of three like RAID3 and RAID5. This is because of the double parity. &lt;/p&gt;
&lt;p&gt;You might have also guessed the disadvantages of having double parity. Remember how a parity 'chews up' a drive? Well with double parity, two drives get taken up as the penalty for redundancy. An example: if you have five 100GB drives, you will only 'see' 300GB of storage. Then there is the problem with calculating the parity bit--now you are doing double the work. This is where dedicated RAID controllers that plug into the PCI-E (or X) slot come into play.&lt;/p&gt;
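&lt;p&gt;The storage penalties quoted so far can be pulled together in one small Python sketch. It assumes every drive in the array is the same size, and the &lt;code&gt;usable_capacity&lt;/code&gt; function is made up for this post, not part of any RAID tool.&lt;/p&gt;

```python
# How many drives' worth of space each striped level gives up to parity.
PARITY_DRIVES = {"RAID0": 0, "RAID3": 1, "RAID5": 1, "RAID6": 2}


def usable_capacity(level, drive_count, drive_size_gb):
    # Assumes all drives in the array are the same size.
    if level == "RAID1":
        # Every extra drive is a copy, so you only 'see' one drive.
        return drive_size_gb
    return (drive_count - PARITY_DRIVES[level]) * drive_size_gb


print(usable_capacity("RAID0", 2, 50))   # 100
print(usable_capacity("RAID1", 2, 50))   # 50
print(usable_capacity("RAID3", 5, 50))   # 200
print(usable_capacity("RAID6", 5, 100))  # 300
```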
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;"OK. I get it now, root. So what's with this RAID10?"&lt;/p&gt;
&lt;p&gt;Alright young padawan. Now you've just crossed into the hybrid RAIDs.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;RAID10&lt;/b&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Ignoring the title of this RAID, and using all your knowledge of RAID so far, what do you think would be the next step? I'll make it easy for you, suppose you don't care about the penalty of redundancy. Suppose you had money to blow and had a case that could hold all the drives that you could possibly buy. What do you think would give you the most redundancy and the best performance? Alright, I'll stop ranting. &lt;/p&gt;
&lt;p&gt;RAID10, as the name implies, uses both RAID1 and RAID0 together in a sort of matrix fashion.&amp;nbsp; It requires a minimum of 4 drives and an even number of drives (if greater than 4)... and carries probably the biggest storage penalty anyone could ask for IMO. By mirroring the drives in pairs and then striping across those mirrored pairs, you get the advantages of both RAID1 and RAID0. You get the complete redundancy of RAID1 and the read/write performance of RAID0. To put a cherry on top of that metaphoric sundae of storage goodness, there is no parity bit to calculate. *drools*. &lt;/p&gt;
&lt;p&gt;"So what is the cost?"&lt;/p&gt;
&lt;p&gt;Heh.&lt;/p&gt;
&lt;p&gt;Say you have four drives of 50GB, you can only 'see' 100GB. Ouch. That means two drives are just sitting there, &lt;i&gt;not&lt;/i&gt; doing anything for you but improving your performance and redundancy. When would you do this? Only two reasons that I can think of: &lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;You are running an enterprise level server that needs all the IOPS you can dish out.&lt;/li&gt;
&lt;li&gt;You really really really want to have bragging rights at your next LAN party. &lt;br&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;br&gt;But RAID10 isn't the only hybrid. There are many others (as well as other non-standard or less common RAID levels) like RAID50 (RAID5 arrays striped together, each with an equal number of drives--countering some of the parity calculation performance issues). You might also see RAID01, which is not to be confused with RAID10. While the two have the exact same performance gains and storage, the number of failures they can survive is different. With four drives, both are only guaranteed to survive 1 failure. RAID10 survives a second failure as long as it doesn't hit the other drive of the same mirror pair, while RAID01 only survives it if the second failure lands in the stripe set that is already broken. Rebuild times are also faster in RAID10.&lt;br&gt;
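&lt;p&gt;To see the RAID10 versus RAID01 difference for yourself, this Python sketch tries every possible pair of failed drives in a four drive array. The drive numbering and the two survival rules are a simplified model I'm assuming here; real controllers may behave differently.&lt;/p&gt;

```python
from itertools import combinations

MIRROR_PAIRS = [{0, 1}, {2, 3}]  # RAID10: two mirrors, striped together
STRIPE_SETS = [{0, 1}, {2, 3}]   # RAID01: two stripes, mirrored together


def raid10_survives(failed):
    # Every mirror pair must still have at least one working drive.
    return all(not pair.issubset(failed) for pair in MIRROR_PAIRS)


def raid01_survives(failed):
    # At least one stripe set must be completely intact.
    return any(stripe_set.isdisjoint(failed) for stripe_set in STRIPE_SETS)


double_failures = [set(combo) for combo in combinations(range(4), 2)]
raid10_ok = sum(1 for failed in double_failures if raid10_survives(failed))
raid01_ok = sum(1 for failed in double_failures if raid01_survives(failed))

print("RAID10 survives", raid10_ok, "of", len(double_failures))  # 4 of 6
print("RAID01 survives", raid01_ok, "of", len(double_failures))  # 2 of 6
```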
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;Software vs Hardware RAID Controllers&lt;br&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As I've mentioned briefly, there are two different types of RAID controllers: hardware and software. Hardware, like the name implies, is an actual physical controller outside of the OS that handles the RAID logic and has its own resources (CPU/cache). This is usually a dedicated RAID controller card that sits in a PCI-Express (or PCI-X or PCI) slot. A software controller, on the other hand, is done from the OS side (usually, but it can be an application). The OS actually does the RAID logic, not some dedicated device. &lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Pros: cheaper&lt;/li&gt;
&lt;li&gt;Cons: chews up resources and you are reliant on the OS (or an application).&amp;nbsp;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;There are also 'hybrids', or at least that's what I like to think of them as. They are outside of the OS but don't have their own resources. These are usually found on the mobo, but you can also find them on RAID controller cards (the cheaper ones--you'll see a significant price jump between RAID cards and this is why).&lt;br&gt;&lt;/p&gt;
&lt;p&gt;Well I hope this addresses any questions on what RAID is and how it can help you. For those of you who are interested in using RAID, you will need to make sure that your motherboard or your OS supports it, or else get a RAID controller card. And also make sure that the RAID level you want is supported (not all RAID levels are supported everywhere). &lt;/p&gt;
&lt;p&gt;Before you move on to the next post/thread, I'd like you to remember three things:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;RAID is no excuse for you to not back up. RAID (excluding RAID0) is meant for &lt;i&gt;physical&lt;/i&gt; failure, not &lt;A href="http://en.wikipedia.org/wiki/PEBKAC" target=_blank title="http://en.wikipedia.org/wiki/PEBKAC" target="_blank"&gt;PEBCAK &lt;/a&gt;or Windows *grin*. That said, if you have to choose between RAID and purchasing a backup drive, I will always recommend that you buy the backup drive.&lt;br&gt;  &lt;/li&gt;
&lt;li&gt;You are only as fast as your slowest part when it comes to computers. So take a step back and ask yourself, "is it really the drives that I have a bottleneck with?" before you go RAID10.&lt;/li&gt;&lt;li&gt;If you have the money/space/power/backup space/etc to do some insane RAID0 or RAID10 setup, why not? You can never have too much performance from your disks. But what if it comes down to the biggest bang for the buck when it comes to upgrades? Storage requirements aside (not performance), 8 times out of 10 you'd be better off upgrading your CPU/Memory/Bus.&lt;br&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;And of course, if you have any questions whatsoever, feel free to PM me. No question is considered stupid. Remember, I'm still (and will always be) learning so feel free to show me research that goes against what I'm preaching and I'll keep an open mind.&lt;br&gt;&lt;/p&gt;
&lt;p&gt;|o.O| &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Other Reading Materials:&lt;/p&gt;
&lt;p&gt;Google's study (pdf): &lt;A href="http://labs.google.com/papers/disk_failures.pdf" target=_blank target="_blank"&gt;Failure Trends in a Large Disk Drive Population&lt;/a&gt; &lt;br&gt;Wikipedia (Hard Drives): &lt;A href="http://en.wikipedia.org/wiki/Hard_disk%20" target=_blank title="http://en.wikipedia.org/wiki/Hard_disk " target="_blank"&gt;http://en.wikipedia.org/wiki/Hard_disk&amp;nbsp;&lt;/a&gt;&lt;br&gt;&lt;A href="http://www.eggxpert.com/forums/thread/162906.aspx" target=_blank title="http://www.eggxpert.com/forums/thread/162906.aspx" target="_blank"&gt;Classroom 101: Hard Drive Optimization&lt;/a&gt; by Root&lt;/p&gt;&lt;p&gt;&lt;br&gt;Q and A with root: &lt;A href="http://www.eggxpert.com/forums/thread/149990.aspx" target=_blank title="http://www.eggxpert.com/forums/thread/149990.aspx" target="_blank"&gt;Classroom 101: RAID Explained - Discussion&lt;/a&gt;&lt;br&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item></channel></rss>