So we have a Linux machine and a couple of new “drives” (block devices), and we want to put a few (million) files on them. Local drives, nothing fancy or distributed - which means normal, regular Linux filesystems.
We may or may not have SSDs, and we may or may not have a hardware RAID already.
Why is it never Ext4?
Pretty much the only place you’d want to use ext4 nowadays is /boot; it is not the best choice anywhere else. For regular flat partitions (if we don’t care about integrity, or there’s a hardware RAID underneath) please use XFS. For lone small-ish SSDs please use F2FS.
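For reference, the corresponding mkfs invocations are trivial. A minimal sketch, assuming hypothetical device names - substitute your own:

```
# XFS on a regular partition (possibly sitting on hardware RAID):
mkfs.xfs /dev/sdb1

# F2FS on a single small-ish SSD:
mkfs.f2fs /dev/nvme0n1p1
```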
For a long time the best way to handle many spinning disks was a stack of MD+LVM+XFS. Nowadays BTRFS gives you the best manageability: you can add and remove disks one at a time, and run periodic scrubs. It is probably not the worst option for handling multiple SSDs in a mirror either (but this may not be accurate - XFS on top of md with a write-intent bitmap may be a better choice, as it avoids copy-on-write amplification).
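A rough sketch of that XFS + md alternative, assuming two hypothetical NVMe devices (mdadm defaults beyond these flags are left as-is):

```
# Two-device mirror via md with an internal write-intent bitmap,
# then plain XFS on top of it:
mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      --bitmap=internal /dev/nvme0n1 /dev/nvme1n1
mkfs.xfs /dev/md0
```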
BTRFS pros and cons
Pros first:
- BTRFS stores block/extent/metadata checksums. Traditional MD RAID in mirrored mode doesn’t store checksums, so if an error flips some data bits (and it evades the hard disk’s own checksumming mechanism), there is no way to know which of the two copies is correct. BTRFS doesn’t have this problem.
- You can add and remove disks one by one, mixing disk sizes and using as much of their capacity as possible, with data moved between disks based on what is occupied and what isn’t (see the example commands after this list).
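A quick sketch of the grow/shrink/scrub workflow; the device names and mount point are hypothetical:

```
# Create a two-device BTRFS mirror (raid1 for both data and metadata):
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
mount /dev/sdb /mnt/data

# Grow the pool with a third disk and rebalance extents onto it:
btrfs device add /dev/sdd /mnt/data
btrfs balance start /mnt/data

# Shrink it again - data is migrated off the disk before removal:
btrfs device remove /dev/sdd /mnt/data

# Periodic scrub: verifies checksums and repairs bad copies from the good mirror:
btrfs scrub start /mnt/data
```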
Cons next:
- The copy-on-write nature of BTRFS makes it the worst medium for storing databases. PostgreSQL, MySQL, you name it - all of them will produce heavily fragmented files and enormous amounts of write amplification (a partial mitigation is sketched after this list).
- The performance of RAID5/6 configurations is laughably low. Of course, this is true for any RAID5 implementation (and almost any RAID5 implementation has the write hole as well).
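If you must keep a database on BTRFS, a common partial mitigation is to mark its data directory nodatacow before any files are created. A sketch - the path here is just an example:

```
# Disable copy-on-write for a directory that will hold database files.
# The flag only affects files created *after* it is set, and it also
# disables BTRFS checksumming (and compression) for those files.
mkdir -p /var/lib/postgresql
chattr +C /var/lib/postgresql
lsattr -d /var/lib/postgresql   # should show the 'C' attribute
```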