Potential customers with high I/O requirements frequently ask me if they can use Linux instead of AIX or Solaris.
Linux file systems aren’t for everyone. A variety of limitations make them poorly suited for large and HPC environments.
No one ever asks me about high-performance I/O (high IOPS or high streaming I/O) on Windows or NTFS because it isn’t possible. Windows and its NTFS file system, which hasn’t changed much since it was released almost 10 years ago, can’t scale given the file system’s current structure. The NTFS layout, allocation methodology and structure do not allow it to efficiently support multi-terabyte file systems, much less file systems in the petabyte range. That’s no surprise, since large-scale storage is not Microsoft’s target market.
And what was Linux’s initial target market? A Microsoft desktop replacement, of course. Linux has since moved from the desktop to run on many large SMP servers from Sun, IBM and SGI. But can Linux as an operating system and Linux file systems meet the challenge of high-performance I/O?
You may think you don’t need high-performance I/O, but every server needs this type of I/O performance for something as simple as backup and restoration. Current LTO-4 tape drives can operate at 120 MB/sec without compression and can support data rates up to 240 MB/sec with compression. If your file system cannot sustain I/O at these streaming data rates, backup and restore will take much longer than expected. In large environments with multiple tape drives, being unable to run the drives at their full data rate might mean buying additional drives to meet the backup window, and the same shortfall slows restoration too. It therefore seems to me that everyone should be interested in the performance of Linux file systems, if only for backup and restore.
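To make the backup arithmetic concrete, here is a minimal sketch in Python. The 10 TB data set, the 8-hour window and the 60 MB/sec "slow file system" rate are assumptions chosen for illustration, and the function names are hypothetical; only the 120 MB/sec native LTO-4 rate comes from the text above.

```python
import math

MB_PER_TB = 1_000_000  # decimal units: 1 TB = 1,000,000 MB


def backup_hours(data_tb, rate_mb_per_s, drives=1):
    """Hours needed to stream data_tb terabytes at rate_mb_per_s per drive."""
    total_mb = data_tb * MB_PER_TB
    return total_mb / (rate_mb_per_s * drives) / 3600


def drives_needed(data_tb, rate_mb_per_s, window_hours):
    """Smallest number of drives that finishes within window_hours."""
    total_mb = data_tb * MB_PER_TB
    mb_per_drive = rate_mb_per_s * 3600 * window_hours
    return math.ceil(total_mb / mb_per_drive)


# Assumed scenario: 10 TB to back up in an 8-hour window.
for rate in (120, 60):  # full LTO-4 native rate vs. a file system at half that
    print(rate, "MB/sec:",
          round(backup_hours(10, rate), 1), "hours on one drive,",
          drives_needed(10, rate, 8), "drives for the window")
```

Under these assumptions, a file system that delivers only half the drive’s native rate doubles the single-drive backup time and doubles the number of drives needed to hit the same window.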
Can Linux file systems, which I will define as ext-4, XFS and xxx, match the performance of file systems on other Unix-based large SMP servers, such as those from IBM and Sun? Some might also ask about SGI, but SGI has something called ProPack, which adds a number of optimizations to Linux for high-speed I/O, and SGI also has an open proprietary Linux file system called CxFS. Because neither ProPack nor CxFS is part of standard Linux distributions, we will not consider them here. We’ll stick to standard Linux because that is what most people use.
We’ll focus on two areas:
- Linux as an operating system
- Linux file systems
Linux Operating System Issues
We’ll set aside what might happen with Linux in the future and instead focus on what is available today. Linux has a number of features, such as direct I/O, that match what AIX and Solaris offer for I/O performance, but the bottom line is that Linux wasn’t designed around high-performance multi-threaded I/O.
Several areas limit performance in Linux: its page size compared with other operating systems, the restrictions it places on direct I/O and page alignment, and the fact that it does not switch to direct I/O automatically based on request size. I have also seen Linux kernels break large (greater than 512 KB) I/O requests into 128 KB requests. Since Linux I/O and the Linux file systems were designed for a desktop replacement for Windows, none of this comes as much of a surprise.
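The request splitting and alignment restrictions can be illustrated with a small Python model. This is a sketch of the arithmetic only, not kernel code: the function names are hypothetical, the 128 KB limit is the figure quoted above, and the 4,096-byte alignment is an assumption (the real requirement depends on the kernel, file system and device).

```python
def split_request(offset, length, max_bytes=128 * 1024):
    """Return the (offset, length) pieces a request is broken into
    when no single piece may exceed max_bytes."""
    pieces = []
    end = offset + length
    while offset < end:
        size = min(max_bytes, end - offset)
        pieces.append((offset, size))
        offset += size
    return pieces


def is_direct_io_aligned(offset, length, buffer_addr, alignment=4096):
    """True if the file offset, transfer length and buffer address are all
    multiples of alignment (assumed here to be 4,096 bytes)."""
    return all(value % alignment == 0
               for value in (offset, length, buffer_addr))


# A single 1 MB application write becomes eight 128 KB requests.
pieces = split_request(0, 1024 * 1024)
print(len(pieces), "requests of", pieces[0][1], "bytes each")

# A request that starts 100 bytes into the file fails the alignment check.
print(is_direct_io_aligned(4096, 8192, 0x10000))
print(is_direct_io_aligned(100, 8192, 0x10000))
```

The point of the model is that every large streaming request pays the per-request overhead several times over, and that an application has to arrange its own offsets, lengths and buffers before direct I/O is even an option.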
Linux has other issues, as I see it, starting with the lack of anyone to take charge or responsibility. With Linux, if you find a problem, groups of people are going to have to agree to fix it, and the people writing Linux might not necessarily be responsive to the problems you’re facing. If a large vendor of Linux agrees with your problem and provides a fix, that doesn’t mean the Linux community will accept it, or accept it any time soon. And getting a patch for your problem could pose maintenance problems.
Henry Newman is a regular contributor to Enterprise Storage Forum, where this story originally appeared. Newman is an industry consultant with 27 years of experience in high-performance computing and storage.