I’m uncomfortable storing 16TB worth of data on one drive
Example: you have 20TB of data. Which is safer?
Two 16TB drives in RAID0
One 32TB drive
If you have 20TB of data to store, a single drive is safer than splitting it across multiple drives. Fewer points of failure in total.
If you are storing your own data, a single drive is asking to lose all your data.
3 2 1 for all your important data.
RAID6, my person. RAID6.
RAID is not a backup.
It is not. But backups are also not RAID.
Yes, obviously.
You need backups. RAID or something similar is only necessary if you need redundancy, and a lack of redundancy usually matters far less than losing all your data.
RAID is necessary because drives fail, and sometimes you can’t afford, or don’t want, to be offline until you can get around to sourcing and installing a new drive and restoring from backup.
That is what I said.
RAID6 only works if the machine is working fine. If something happens that toasts the whole thing then you’re fucked unless you have a backup offsite.
Backups are important, but we were talking about drive failures. Backups help when you screw up the data; RAID6 helps when drives go bad. If you don’t trust the hardware, RAID.
Backups alone mean you’re down until you restore; RAID5/6 means you stay up.
Right, but he was talking about the 3 2 1 rule and you recommended RAID6.
But he was responding to someone who was uncomfortable with putting all their eggs in one basket. That’s not what backups are for.
RAID is not a backup.
Reducing the number of drives you are running reduces the risk of losing data. Do you disagree?
Depends entirely on the config. RAID 0? Higher risk. RAID 1? Lower risk.
I run RAID 0 on a couple of external USB drives with a full backup on Google and locally. No worries.
The risk of a drive failing does not depend on your RAID config at all, ignoring excessive duty cycling. I believe you are misunderstanding the point I was making in my original reply. I’m claiming that one of these 32TB drives gives you a lower risk of losing data than striping two 16TB drives, given the same failure rate.
Example: you have 20TB of data. Which is safer: two 16TB drives in RAID0, or one 32TB drive?
This is completely irrelevant to your backup solution. You should have backups, of course, but I don’t see how that factors into my point? You have to put the data somewhere, and then back it up, where do you put it? I will always put it on as few physical drives as possible, to minimize the risk of drive failure over time so I don’t have to restore/re-stripe as often.
Assuming the probability of failure is the same, you’re right, running two drives doubles the risk of a drive failing.
However, if your single 32 TB drive fails, all data is gone and you have to rely on backup. If one of the 16 TB drives fails, you replace it and the RAID restores the data with much less hassle.
Both 16 TB drives failing at once is negligible (however, the RAID controller might).
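To put rough numbers on the argument above, here is a minimal sketch. It assumes drive failures are independent and that every drive has the same failure probability `p` over some period; the 2% figure is made up for illustration, not a measured rate. It also ignores the rebuild window and shared failure causes such as a dead controller or PSU.

```python
def p_any_failure(p: float, n: int) -> float:
    """Probability that at least one of n independent drives fails."""
    return 1 - (1 - p) ** n


def p_all_fail(p: float, n: int) -> float:
    """Probability that all n independent drives fail."""
    return p ** n


p = 0.02  # assumed per-drive failure probability (illustrative only)

single = p                   # one 32TB drive: data is lost if it fails
raid0 = p_any_failure(p, 2)  # 2-drive stripe: data is lost if either drive fails
raid1 = p_all_fail(p, 2)     # 2-drive mirror: data is lost only if both fail

print(f"single drive: {single:.4f}")
print(f"RAID0 (2 drives): {raid0:.4f}")
print(f"RAID1 (2 drives): {raid1:.4f}")
```

Under these assumptions the stripe is slightly less than twice as risky as the single drive, and the mirror is far safer than either. Note that a two-drive mirror of 16TB disks only holds 16TB, so it would not fit the 20TB example.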
Thank you for understanding.
If that is your whole point, you didn’t approach it right as you can see with all the downvotes.
You seemingly argued against RAID, which was invented for data availability and performance. While it’s true that RAID alone is no backup solution, a single drive is more hassle when it fails, so running multiple drives in a RAID allows for better handling despite the higher probability of having to swap a drive.
Another point you did not consider: larger drives have more sectors that can fail. While I have no data for this, a 32 TB drive is unlikely to have the same rate of failure as a 16 TB one - the larger drive will be more likely to fail (not as likely as one of two drives failing though).
It seems you’ve never had an HDD die on you.
You misunderstand my claim.
You misunderstand the intent then.
Why would anyone back up data in the manner you’re saying? That’s dumb.
Don’t split the data across multiple logical locations, keep it logically contained. A raid designed for availability is better than a single external hard drive but that isn’t what is being talked about.
3 2 1 means keeping multiple copies of the SAME data on multiple media types in multiple locations so you remove a single point of failure.
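For anyone who wants to sanity-check their own setup against that rule, here is a rough sketch. It assumes the common reading of 3 2 1: at least 3 copies, on at least 2 media types, with at least 1 copy offsite. The `Copy` class and the example plan are hypothetical, not anyone's actual setup from this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str      # label for this copy (hypothetical)
    media: str     # e.g. "hdd", "cloud", "tape"
    offsite: bool  # True if stored away from the primary location


def satisfies_3_2_1(copies: list) -> bool:
    """Check 3 copies, 2 media types, 1 offsite (common reading of the rule)."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


plan = [
    Copy("primary NAS", "hdd", offsite=False),
    Copy("local USB backup", "hdd", offsite=False),
    Copy("cloud backup", "cloud", offsite=True),
]
raid_only = [
    Copy("RAID member A", "hdd", offsite=False),
    Copy("RAID member B", "hdd", offsite=False),
]

print(satisfies_3_2_1(plan))
print(satisfies_3_2_1(raid_only))
```

The second plan fails on every count, which is the "RAID is not a backup" point in code form.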
You are not ready to be lecturing on this topic.
This single point of failure amounts to putting all of your eggs in the same basket.
Which is why you have backups. Doesn’t matter if you have 1 32TB drive or 32 1TB drives, backups are how you recover from failure. Running 1 drive is less risk than running 2 drives for the same storage capacity.
If it’s split up, sure, but I’m talking about a raid > 0 setup and/or having backup copies of your data on drive #2.
Raid0? You mean having the data striped across two devices rather than on one device with no striping? Raid0 is a risk you take when you care more about performance than about the downtime needed to restore a backup.
If I have 20TB of data, it cannot fit on a single 16TB drive. So my options are Raid, or this single drive option. I would always pick the single drive if I could afford it.
Double check that symbol there.
Raid 5 is a great balance of redundancy and usable storage with 3 drives. You get 1 drive’s worth of fault tolerance and 2 drives’ worth of capacity. I personally have mismatched drives, so I run raid 1 between the matching sizes and jbod across the raid 1 mirrors (well, the zfs equivalent). And my really important data is backed up onto two more drives in raid 10.
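The capacity arithmetic can be sketched like this. It assumes classic RAID behaviour where every member only contributes as much as the smallest drive; ZFS, btrfs, and vendor hybrid schemes handle mismatched sizes differently. `usable_tb` is a made-up helper name, not part of any tool.

```python
def usable_tb(level: str, sizes_tb: list) -> float:
    """Usable capacity in TB for a classic RAID array (assumed model)."""
    n = len(sizes_tb)
    smallest = min(sizes_tb)  # each member is truncated to the smallest drive
    if level == "raid0":
        return n * smallest              # striping, no redundancy
    if level == "raid1":
        return smallest                  # every drive holds a full copy
    if level == "raid5":
        if n < 3:
            raise ValueError("RAID5 needs at least 3 drives")
        return (n - 1) * smallest        # one drive's worth of parity
    if level == "raid6":
        if n < 4:
            raise ValueError("RAID6 needs at least 4 drives")
        return (n - 2) * smallest        # two drives' worth of parity
    if level == "raid10":
        if n < 4 or n % 2:
            raise ValueError("RAID10 needs an even number of drives, at least 4")
        return (n // 2) * smallest       # striped pairs of mirrors
    raise ValueError(f"unknown level: {level}")


print(usable_tb("raid5", [16, 16, 16]))
print(usable_tb("raid6", [16, 16, 16, 16]))
print(usable_tb("raid1", [16, 16]))
```

With three 16TB drives, RAID5 leaves two drives' worth of space, matching the "1 drive of fault tolerance, 2 drives of capacity" description.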
The person I replied to said “I’m uncomfortable storing 16TB worth of data on one drive” as a criticism of using a single 32TB drive.
I argue that a single 32TB drive is less risk than using 2 16TB drives. Am I wrong?
Christ alive.
No. Actually. The 32TB drive is a single point of failure for all your data.
Splitting it means you have 2 points of failure but for only half your data.
From an integrity and availability standpoint the two disk solution, while wildly ridiculous and dumb as fuck, is actually better.
Both solutions are ridiculous and dumb and are not sufficient backup.
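The disagreement here can be made concrete with a small sketch. It assumes independent failures with the same per-drive probability `p` (the 2% is illustrative), and that "splitting" means two separate volumes each holding half the data, not a RAID0 stripe, where one failure takes out everything.

```python
def loss_single(p: float, data_tb: float) -> dict:
    """One drive holding all the data."""
    return {
        "p_lose_something": p,
        "p_lose_everything": p,
        "expected_loss_tb": p * data_tb,
    }


def loss_split(p: float, data_tb: float) -> dict:
    """Two independent drives, each holding half the data (not striped)."""
    half = data_tb / 2
    return {
        "p_lose_something": 1 - (1 - p) ** 2,  # either drive fails
        "p_lose_everything": p ** 2,           # both drives fail
        "expected_loss_tb": 2 * p * half,      # each drive risks its half
    }


p = 0.02        # assumed per-drive failure probability (illustrative only)
data_tb = 20.0  # the 20TB example from the thread

print(loss_single(p, data_tb))
print(loss_split(p, data_tb))
```

Under these assumptions both sides have a point: the split setup is more likely to lose some data, much less likely to lose all of it, and the expected amount lost is the same. Neither setup is a backup.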
First, if you have more than one disk, you should be either getting redundancy through mirroring, or building arrays of several disks with redundant methods like RAID5 / RAID6 / ZFS RAID-Z2.
Second, no single copy of data is safe; you must always have recent, tested backups.