Not a question, just an observation.
I've created a small benchmark to test the VirtualFileSystem. It used 10 files, each with 10 MB of randomized data (not truly random, or it would be incompressible).
Size
Raw files = 10 x 10 MB = 100 MB
Packed with multify = 32 MB
Packed with default ZIP algorithm from 7Zip = 32 MB
Packed with 7Zip’s own algorithm = 0.6 MB
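For anyone who wants to reproduce the kind of test data described above, here is a minimal sketch. It is not the generator used for the benchmark; the function names, the four-byte alphabet, and the seed are assumptions made for illustration. It draws pseudo-random bytes from a small alphabet so that the data looks random but still compresses, and compares that against os.urandom output, which should barely compress at all.

```python
import os
import random
import zlib


def make_compressible(size, alphabet=b"abcd", seed=0):
    """Return `size` pseudo-random bytes drawn from a small alphabet.

    Hypothetical helper: a small alphabet keeps the entropy low,
    so the result stays compressible.
    """
    rng = random.Random(seed)
    return bytes(rng.choices(alphabet, k=size))


def zlib_ratio(data):
    """Compressed size divided by original size (smaller = more compressible)."""
    return len(zlib.compress(data)) / len(data)


if __name__ == "__main__":
    size = 1_000_000  # 1 MB keeps the demo quick; the post used 10 MB per file
    print("small alphabet:", round(zlib_ratio(make_compressible(size)), 2))
    print("os.urandom:    ", round(zlib_ratio(os.urandom(size)), 2))
```

The exact ratios depend on the alphabet and the compressor, so they will not match the 32 MB and 0.6 MB figures above.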
Read all 10 files into a string (the computer was rebooted after each test group; times are for the 1st / 2nd / 3rd run)
VFS without mount = 4.2 s / 0.9 s / 0.9 s
VFS with mounted multifile = 3.6 s / 1.45 s / 1.46 s
With Python = 4.06 s / 0.15 s / 0.14 s
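The plain Python part of such a measurement could look roughly like the sketch below. This is an assumption about the method, not the original benchmark script: time_read_all is a hypothetical helper, and the demo writes small temporary files instead of the 10 MB files used above. Repeated runs without a reboot are probably served from the operating system's file cache, so only the first run after a reboot reflects a cold start.

```python
import tempfile
import time
from pathlib import Path


def time_read_all(paths):
    """Read every file fully; return (elapsed_seconds, total_bytes)."""
    start = time.perf_counter()
    total = 0
    for path in paths:
        total += len(Path(path).read_bytes())
    return time.perf_counter() - start, total


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Demo data only: 10 small files instead of 10 x 10 MB.
        paths = []
        for i in range(10):
            p = Path(tmp) / f"file{i}.bin"
            p.write_bytes(b"x" * 100_000)
            paths.append(p)
        for run in range(1, 4):
            elapsed, total = time_read_all(paths)
            print(f"run {run}: {total} bytes in {elapsed:.4f} s")
```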
Usually only the first run is interesting (a user does not start the application three times in a row).
I also tested this with incompressible data. My hasty conclusion is: if zipping your files doesn't reduce their size, do not store them in a multifile.
It might also be worth benchmarking the seek time to see whether it's a good idea to store thousands of files in a multifile.