On several occasions, I have been asked a question that is straightforward to answer but that leaves me uneasy about why it was asked in the first place. For example, about a year ago I was asked, “Is StorageX multi-threaded?”
That’s a seemingly reasonable question, and a correct answer is easy to give: yes.
I suspect the person who asked the question would have been happy with that basic answer. He would have understood it (“yep, they have multiple threads in their program…that’s good, right?”) and he could easily convey the answer to whoever asked it of him. Most likely the question originated with a piece of marketing collateral for a competing product, touting how the product is “multi-threaded so that it scales to the available hardware” or some such seemingly wonderful claim.
So the person evaluating the available products to accomplish his task (in this case, enterprise file system migration during a NAS hardware refresh) picked up brochures at trade shows, read blog articles and discussion threads, talked with his peers, etc., and built a checklist of features that he needed. My guess is that most of the items on his checklist were perfectly reasonable, with features like:
- Copies all files and directories, even those with restrictive security
- Copies all file system attributes, including security descriptors (SMB), ownership, mode bits (NFS), etc.
- Migrates quotas and other volume settings
- Provides scheduling and reporting
- Is fast
That last item is where he got into trouble. Fast in what sense? Here are a few ways in which file system migration software should be fast:
- Performing a baseline copy of the data
- Performing a final incremental copy, thereby minimizing the cutover window
- Installation, deployment, and configuration
There are plenty of other ways in which you want your software to be fast (e.g., you want the UI to be responsive and you want reports to run quickly), but these three are probably the ones that matter most for file system migration software.
How does the administrator evaluating his migration toolkit options assess the fast-ness of each product? He can install the products in his lab and test each with a representative data set. Many of our users do exactly that. But how can he be sure that once he puts his chosen solution into production, it will meet his requirements for performing the migration and cutting over to the new NAS hardware? He needs some assurance from the provider of the software that it is indeed fast.
Simply stating that software is fast isn’t quite enough, is it? Obviously, anyone can make all kinds of claims. But what if I state my claim in a very technical-sounding manner? How about, “our software will migrate your data blazingly fast because it is multi-threaded and scales to the available hardware”? That’s a little wordy, but I can see the marketing types getting excited.
The problem is that creating multiple threads in a software application is no guarantee that the application will perform its tasks as fast as possible. In fact, the opposite is often the case. Unless the software developer really understands issues like resource contention and context switching, adding threads to the mix is a recipe for all kinds of woe.
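To make the contention point concrete, here is a minimal Python sketch. It is purely illustrative and has nothing to do with how StorageX is built: the `run_workers` function and the single lock standing in for a shared resource (one disk, one connection, one queue) are assumptions made for the example. Every thread has to take the same lock to do its work, so the code records how many threads are ever inside the critical section at once.

```python
import threading


def run_workers(num_threads, items_per_thread):
    """Run workers that all contend for one lock.

    Returns (peak, done): the largest number of threads that were ever
    doing work at the same moment, and the total items processed.
    """
    lock = threading.Lock()  # stands in for any single shared resource
    state = {"active": 0, "peak": 0, "done": 0}

    def worker():
        for _ in range(items_per_thread):
            with lock:  # every unit of work needs the shared resource
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
                state["done"] += 1
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"], state["done"]


if __name__ == "__main__":
    peak, done = run_workers(num_threads=8, items_per_thread=1000)
    print(f"threads=8 peak_concurrency={peak} items={done}")
```

Eight threads, yet the peak concurrency is 1: the work is fully serialized, and the program pays for thread creation and context switching on top of it. The program is multi-threaded in the brochure sense without being any faster.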
What the person asking me “is StorageX multi-threaded?” really wanted to know was whether StorageX could move his file data as fast as possible. Except when it shouldn’t. After all, he might need to perform parts of his migration during business hours, and you know how cranky those business unit owners can be when their network is slow. So the administrator performing the migration wants StorageX to fill his network pipe with data, but he also wants the flexibility to throttle StorageX’s network usage during certain time windows.
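As a rough sketch of what time-window throttling can look like, consider the following Python. The schedule format, the 10 MiB/s cap, and the names `bandwidth_cap` and `pause_needed` are all hypothetical, chosen for illustration rather than taken from StorageX. The idea is to look up the cap for the current time of day, then compute how long the copy loop should pause so that its average rate stays at or below that cap.

```python
from datetime import time

# Hypothetical schedule: (window start, window end, cap in bytes per second).
# Outside every window the copy is unthrottled.
BUSINESS_HOURS = [(time(8, 0), time(18, 0), 10 * 1024 * 1024)]


def bandwidth_cap(now, windows=BUSINESS_HOURS):
    """Return the cap in bytes/sec for the given time of day, or None."""
    for start, end, cap in windows:
        if start <= now < end:
            return cap
    return None


def pause_needed(bytes_sent, elapsed_seconds, cap):
    """Seconds to sleep so the average rate does not exceed the cap."""
    if cap is None:
        return 0.0
    earliest_allowed = bytes_sent / cap  # time this much data should take
    return max(0.0, earliest_allowed - elapsed_seconds)


if __name__ == "__main__":
    cap = bandwidth_cap(time(9, 30))
    # 20 MiB sent in 1 second against a 10 MiB/s cap: wait 1 more second.
    print(pause_needed(20 * 1024 * 1024, 1.0, cap))
```

A copy loop would call `pause_needed` after each chunk and sleep for the returned duration. Note that nothing here depends on how many threads are doing the copying; the throttle is a policy applied on top of them.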
More importantly, the administrator wants StorageX to be as fast as possible when he performs the final incremental copy, just prior to cutting his end users over to the new NAS hardware. This is a very different problem from filling a network pipe with as much data as possible. Here the goal is to do as little as possible in as short a time as possible, while ensuring that all changed files have been copied.
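One common way to "do as little as possible" is to compare cheap metadata and copy only what differs. The sketch below assumes that a file needs copying when it is missing from the destination, its size differs, or the source has a newer modification time. That rule and the function name `files_needing_copy` are assumptions made for illustration, not a description of StorageX's change detection, and a real migration tool would also have to compare security descriptors, ownership, and other attributes.

```python
import os


def files_needing_copy(src_root, dst_root):
    """Return sorted relative paths under src_root that look changed.

    Assumed rule: copy if the destination file is missing, the sizes
    differ, or the source was modified more recently.
    """
    pending = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            rel_path = os.path.relpath(src_path, src_root)
            dst_path = os.path.join(dst_root, rel_path)
            src_stat = os.stat(src_path)
            try:
                dst_stat = os.stat(dst_path)
            except FileNotFoundError:
                pending.append(rel_path)  # never copied before
                continue
            if (src_stat.st_size != dst_stat.st_size
                    or src_stat.st_mtime_ns > dst_stat.st_mtime_ns):
                pending.append(rel_path)
    return sorted(pending)
```

Even in this toy version, the cost of the final pass is dominated by walking the tree and reading metadata rather than by moving data, which is why the incremental copy is a different performance problem from the baseline copy.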
I will go into more detail about how StorageX uses multiple threads in another article. For now, rest assured that StorageX is indeed multi-threaded and, more importantly, that it is fast in the ways that matter.