In a previous article, I stated that StorageX is multi-threaded. I also spent quite a bit of time discussing why I consider this fact to be (mostly) irrelevant to the administrator who is using StorageX to perform his file system migrations. What the user of StorageX really wants is for StorageX to do its job as fast as possible: when he is doing a baseline copy, he wants StorageX to fill his network pipe and move the data as quickly as possible, and when he is cutting over to his shiny new NAS hardware, he wants StorageX to do the final incremental copy within his allotted cutover window.
As I mentioned in my previous article, the techniques StorageX uses to fill the network pipe during a baseline copy are very different from those used to find changed files as quickly as possible during an incremental copy. In this article, I will focus on baseline copies.
Filling a network pipe is fairly easy if you have large files (e.g., ISO images of Linux distros). Here the trick is to avoid half-duplex network traffic, wherein you read a block of data from the source file, wait for the read operation to complete, write the block to the destination file, wait for the write operation to complete, and then read the next block of data from the source file, continuing until the entire file is copied.
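To make the serialized pattern concrete, here is a minimal Python sketch of the read-wait-write-wait loop described above. It is not StorageX code; the function name `copy_serialized` and the 1 MiB default block size are assumptions chosen for illustration.

```python
def copy_serialized(src_path, dst_path, block_size=1024 * 1024):
    """Copy one file with strictly alternating reads and writes.

    Illustrative only: the source sits idle during every write and the
    destination sits idle during every read.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)  # blocks until the read completes
            if not block:                 # empty read means end of file
                break
            dst.write(block)              # blocks until the write is accepted
            copied += len(block)
    return copied
```

At any moment only one side of the connection is busy, which is why this loop cannot fill the pipe by itself.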
Serializing the read and write operations is a performance killer, and there are a couple of basic ways to avoid this problem. The most obvious way is to use multiple threads, one per file being copied. That way, while one thread is reading from source file 1, another thread can be writing to destination file 2. At least, that’s the hope. Because this is the approach many application writers take, I suspect it is why the customer I mentioned in my previous article assumed that multiple threads equate to faster copies.
The problem with creating a thread for each file to copy is that thread creation is not cheap. This issue is usually mitigated by using a thread pool, which reuses a fixed set of threads instead of creating and destroying one per file. Another problem with using a thread for each file is that each time a thread blocks (e.g., waiting for a read operation to complete), the operating system's scheduler puts the thread to sleep and selects another thread to run. This is called a thread context switch, and it is a fairly expensive operation. If an application uses significantly more threads than processor cores to perform a task, thread context switching can be another performance killer.
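The thread-pool mitigation can be sketched in a few lines of Python. This is a generic illustration, not how StorageX is implemented; `copy_files_pooled` is a hypothetical name, and capping the pool at the number of processor cores is an assumption that follows from the context-switching concern above.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor

def copy_files_pooled(pairs, max_workers=None):
    """Copy (source, destination) path pairs on a fixed pool of threads.

    The pool creates its threads once and reuses them for every file,
    so no thread is created or destroyed per file.
    """
    if max_workers is None:
        # Assumption: roughly one worker per core limits context switching.
        max_workers = os.cpu_count() or 4
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() waits for every copy and re-raises the first failure, if any.
        return list(pool.map(lambda pair: shutil.copyfile(pair[0], pair[1]), pairs))
```

Each worker still blocks on its own reads and writes, so the pool bounds the thread count without removing the waiting itself.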
Filling a network pipe with lots of small files is considerably more difficult, because the overhead of creating and closing the files far outweighs the time spent reading and writing file data. For both large and small files (and all sizes in between), it turns out the best way to fill the network pipe is by using asynchronous I/O. With asynchronous I/O, the application does not wait for each I/O operation (read or write) to finish before doing more work.
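The idea of not waiting for one operation before starting the next can be approximated in Python as follows. This is only a sketch: Python's standard library has no native asynchronous file I/O, so the example hands blocking calls to `asyncio`'s default executor to get the same overlap, whereas a native application would typically use the operating system's asynchronous I/O facilities directly. The name `copy_overlapped` and the block size are assumptions, and nothing here reflects StorageX internals.

```python
import asyncio

async def copy_overlapped(src_path, dst_path, block_size=1024 * 1024):
    """Copy a file while overlapping the write of block N with the read of block N+1."""
    loop = asyncio.get_running_loop()
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Prime the pipeline with the first block.
        block = await loop.run_in_executor(None, src.read, block_size)
        while block:
            # Start the write and the next read together, then wait for both.
            # Only one write is ever in flight, so blocks land in order.
            _, next_block = await asyncio.gather(
                loop.run_in_executor(None, dst.write, block),
                loop.run_in_executor(None, src.read, block_size),
            )
            copied += len(block)
            block = next_block
    return copied
```

It can be run with `asyncio.run(copy_overlapped("source.iso", "dest.iso"))`, where both paths are placeholders. Compared with a loop that alternates reads and writes, the source and destination are now busy at the same time for most of the copy.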
StorageX uses asynchronous I/O as much as possible to increase its performance when copying files. However, not all operations can be performed asynchronously (for example, creating a file is a synchronous operation), which is why copying one 1 MB file is much faster than copying 1,000 files of 1 KB each, even though the total amount of data is the same.
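A back-of-the-envelope model shows why the per-file overhead dominates for small files. The numbers below are purely hypothetical assumptions (2 ms of synchronous create/close overhead per file and 100 MB/s of usable bandwidth), not measurements of StorageX or of any real network; `estimated_copy_seconds` is an illustrative helper.

```python
def estimated_copy_seconds(file_count, file_size_bytes,
                           per_file_overhead_s, bandwidth_bytes_per_s):
    """Estimate copy time as fixed per-file overhead plus raw transfer time."""
    overhead = file_count * per_file_overhead_s
    transfer = file_count * file_size_bytes / bandwidth_bytes_per_s
    return overhead + transfer

# Hypothetical inputs: 2 ms overhead per file, 100 MB/s of bandwidth.
OVERHEAD_S = 0.002
BANDWIDTH = 100_000_000

one_large = estimated_copy_seconds(1, 1_000_000, OVERHEAD_S, BANDWIDTH)
many_small = estimated_copy_seconds(1_000, 1_000, OVERHEAD_S, BANDWIDTH)
print(f"one 1 MB file:     {one_large:.3f} s")
print(f"1,000 1 KB files:  {many_small:.3f} s")
```

Under these assumptions the single large file takes about 0.012 seconds, while the thousand small files take about 2 seconds, almost all of which is per-file overhead rather than data transfer.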
Once the network pipe is as full as possible for a given set of files, there is not much more that can be done to speed up a baseline copy. We have compared StorageX to various other applications used in migrations, and all perform roughly the same in terms of bulk I/O capabilities. For a baseline copy, the difference among applications comes down to other features, such as scheduling, reporting, security handling, tech support, etc., and StorageX shines in all of these areas.