TeraCopy to support multi-threading when copying/verifying

Avatar
  • updated
  • Planned

I suggest that TeraCopy support multithreading. As you know, when copying many small files, multithreading speeds things up compared with copying them one after another.

Avatar
youz

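Right, I strongly agree with this suggestion. What bothers me most is exactly this kind of small-file workload: the total size is not large, there are just a great many files.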

Avatar
Martin
Quote from youz

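Right, I strongly agree with this suggestion. What bothers me most is exactly this kind of small-file workload: the total size is not large, there are just a great many files.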

English, please.
Avatar
geraud.dumont

When copying files on a SharePoint server, the speed per channel is very slow, but the server can handle several open channels in parallel. Could you add an option for the number of concurrent file transfers (defaulting to 1) to the settings menu?
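
To illustrate what such a setting could control, here is a minimal Python sketch, not TeraCopy's actual implementation. The function name `copy_tree_concurrently` and the parameter `max_concurrent_files` are hypothetical; the default of 1 mirrors the serial behaviour requested above.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_tree_concurrently(src_dir, dst_dir, max_concurrent_files=1):
    """Copy every file under src_dir into dst_dir.

    max_concurrent_files is a hypothetical setting: 1 copies the files
    one after another, higher values copy several files at once.
    """
    src_root = Path(src_dir)
    dst_root = Path(dst_dir)
    # Collect the file list up front so the work can be handed to the pool.
    files = [p for p in src_root.rglob("*") if p.is_file()]

    def copy_one(src_path):
        # Recreate the relative directory layout under the destination.
        dst_path = dst_root / src_path.relative_to(src_root)
        dst_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_path, dst_path)  # copies data and file metadata
        return dst_path

    # Each worker thread handles one file at a time.
    with ThreadPoolExecutor(max_workers=max_concurrent_files) as pool:
        return list(pool.map(copy_one, files))
```

Calling `copy_tree_concurrently("C:/data", "Z:/backup", max_concurrent_files=4)` would keep up to four transfers in flight (the paths are placeholders). Whether this is faster depends on the storage and the network: it tends to help on high-latency targets and can hurt on a single spinning disk.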

Avatar
-1
Yuzhy

Keep in mind that transferring multiple files simultaneously will turn a sequential read/write job into lots of random read/write jobs, and your hard drives will become the bottleneck.

However, I am finding that even when transferring and verifying single large files, TeraCopy seems very slow on systems capable of gigabytes per second of sequential read/write (e.g. by using large numbers of HDDs in RAID arrays, or NVMe SSDs) and on 10GbE+ networks.

For example, a hash verification of a 50GB file runs at only 350MB/sec on a Windows 10 VM, with CPU utilization at only 10%, while CrystalDiskMark on the same system can hit over 2000MB/sec on sequential read and write when using 4 threads. Even Windows 10 transfers are more than twice as fast as TeraCopy's. CPU core counts keep going up, 10GbE networks are becoming accessible, and NVMe SSD performance is starting to hit the PCIe Gen3 x4 bandwidth limit. There should be a lot of room to optimize performance through better use of multi-threading.

Avatar
Martin
Quote from Yuzhy

Keep in mind that transferring multiple files simultaneously will turn a sequential read/write job into lots of random read/write jobs, and your hard drives will become the bottleneck.

However, I am finding that even when transferring and verifying single large files, TeraCopy seems very slow on systems capable of gigabytes per second of sequential read/write (e.g. by using large numbers of HDDs in RAID arrays, or NVMe SSDs) and on 10GbE+ networks.

For example, a hash verification of a 50GB file runs at only 350MB/sec on a Windows 10 VM, with CPU utilization at only 10%, while CrystalDiskMark on the same system can hit over 2000MB/sec on sequential read and write when using 4 threads. Even Windows 10 transfers are more than twice as fast as TeraCopy's. CPU core counts keep going up, 10GbE networks are becoming accessible, and NVMe SSD performance is starting to hit the PCIe Gen3 x4 bandwidth limit. There should be a lot of room to optimize performance through better use of multi-threading.

While there is a performance drop even with NVMe-based flash storage, IMHO it can sometimes be ignored, e.g. when transferring a couple of big files and many more small files to the same target disk.

Maybe a UI switch, defaulting to off, would be enough to satisfy both types of users.


Anyway, I'm supporting your request for faster transfers - TeraCopy should, at the very least, be nearly on par with Windows's internal copy routine.

Avatar
stephane simonetti

When you add files to an existing copy session, the total in KB/MB/GB is not updated correctly. After the first list has been copied, the progress bar goes above 100%, and you can't tell when the copy will finish. I went back to the previous version, which handled this perfectly.

Avatar
mow
Quote from Yuzhy

Keep in mind that transferring multiple files simultaneously will turn a sequential read/write job into lots of random read/write jobs, and your hard drives will become the bottleneck.

However, I am finding that even when transferring and verifying single large files, TeraCopy seems very slow on systems capable of gigabytes per second of sequential read/write (e.g. by using large numbers of HDDs in RAID arrays, or NVMe SSDs) and on 10GbE+ networks.

For example, a hash verification of a 50GB file runs at only 350MB/sec on a Windows 10 VM, with CPU utilization at only 10%, while CrystalDiskMark on the same system can hit over 2000MB/sec on sequential read and write when using 4 threads. Even Windows 10 transfers are more than twice as fast as TeraCopy's. CPU core counts keep going up, 10GbE networks are becoming accessible, and NVMe SSD performance is starting to hit the PCIe Gen3 x4 bandwidth limit. There should be a lot of room to optimize performance through better use of multi-threading.

Keep in mind that with SSDs, access times are negligible, so random reads won't be much of a bottleneck. For really small files (each smaller than a filesystem block), writes would not actually be that random, because the files will be placed in adjacent blocks either way. This might even help the SSD write complete (flash) blocks.


Also, when doing hash verification, TeraCopy first hashes the source file and then the target file. If those are on different devices, both could be read and hashed in parallel.
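
The parallel verification idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how TeraCopy works internally: the names `hash_file` and `verify_copy_in_parallel` are hypothetical, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy_in_parallel(source_path, target_path, algorithm="sha256"):
    """Hash source and target at the same time and compare the digests."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # One thread per file, so both devices are read concurrently.
        source_future = pool.submit(hash_file, source_path, algorithm)
        target_future = pool.submit(hash_file, target_path, algorithm)
        return source_future.result() == target_future.result()
```

The gain comes from overlapping the two reads, so it should be largest when source and target sit on different devices. If both files are on the same hard drive, the two threads would compete for the same disk and the serial order may well be faster.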

Avatar
Code Sector
  • Planned