I asked the same question myself after reading a Datrium best practice guide that suggested using the VMware PVSCSI controller instead of the LSI SAS controller that VMware recommends by default when you create a Windows VM.
Out of curiosity I spun up a new Server 2016 VM with 4 cores, 8GB of RAM, and a 100GB drive, hosted on my Datrium storage, to find out how much of a difference there was.
I ran this test during a normal production workload, using Microsoft DiskSpd with a 16K IO size (the current average for my app servers) to see what results we would get. The specific command I used was:
diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G C:\io.dat
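For reference, here is the same command with each flag annotated. The breakdown is my own summary, and it matches the input-parameter block DiskSpd prints alongside the results below; run it from the folder where diskspd.exe lives.

.\diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G C:\io.dat
# -b16K  : 16KB I/O size
# -d1800 : run for 1800 seconds (30 minutes)
# -h     : disable software caching and hardware write caching
# -L     : capture latency statistics
# -o2    : keep 2 outstanding I/Os per thread
# -t4    : use 4 worker threads
# -r     : random I/O
# -w50   : 50% writes, 50% reads
# -c10G  : create a 10GB test file at the target path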
The first run on the VMware LSI SAS controller resulted in this.
Command Line: C:\Users\cjoyce_admin\Downloads\Diskspd-v2.0.17\amd64fre\diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G c:\io.dat
Input parameters:
timespan: 1
-------------
duration: 1800s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'c:\io.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
performing mix test (read/write ratio: 50/50)
block size: 16384
using random I/O (alignment: 16384)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time: 1800.00s
thread count: 4
proc count: 4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 8.35%| 1.84%| 6.50%| 91.65%
1| 8.38%| 1.89%| 6.48%| 91.62%
2| 7.78%| 1.79%| 5.99%| 92.22%
3| 7.39%| 1.60%| 5.79%| 92.61%
-------------------------------------------
avg.| 7.97%| 1.78%| 6.19%| 92.03%
Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 15150776320 | 924730 | 8.03 | 513.74 | 3.888 | 3.175 | c:\io.dat (10240MB)
1 | 15089106944 | 920966 | 7.99 | 511.65 | 3.904 | 3.289 | c:\io.dat (10240MB)
2 | 15108947968 | 922177 | 8.00 | 512.32 | 3.899 | 3.140 | c:\io.dat (10240MB)
3 | 15109013504 | 922181 | 8.01 | 512.32 | 3.898 | 3.086 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 60457844736 | 3690054 | 32.03 | 2050.03 | 3.897 | 3.173
Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7574110208 | 462287 | 4.01 | 256.83 | 3.274 | 2.741 | c:\io.dat (10240MB)
1 | 7539032064 | 460146 | 3.99 | 255.64 | 3.297 | 2.966 | c:\io.dat (10240MB)
2 | 7562526720 | 461580 | 4.01 | 256.43 | 3.297 | 2.861 | c:\io.dat (10240MB)
3 | 7543046144 | 460391 | 4.00 | 255.77 | 3.293 | 2.613 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 30218715136 | 1844404 | 16.01 | 1024.67 | 3.290 | 2.798
Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7576666112 | 462443 | 4.01 | 256.91 | 4.501 | 3.448 | c:\io.dat (10240MB)
1 | 7550074880 | 460820 | 4.00 | 256.01 | 4.510 | 3.479 | c:\io.dat (10240MB)
2 | 7546421248 | 460597 | 4.00 | 255.89 | 4.501 | 3.289 | c:\io.dat (10240MB)
3 | 7565967360 | 461790 | 4.01 | 256.55 | 4.503 | 3.389 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 30239129600 | 1845650 | 16.02 | 1025.36 | 4.504 | 3.402
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.000 | 0.000 | 0.000
25th | 1.360 | 2.258 | 1.709
50th | 2.818 | 3.885 | 3.269
75th | 4.481 | 6.093 | 5.443
90th | 6.259 | 8.370 | 7.195
95th | 7.163 | 9.928 | 8.987
99th | 10.090 | 13.425 | 12.593
3-nines | 23.523 | 30.284 | 27.785
4-nines | 47.191 | 52.535 | 49.878
5-nines | 190.339 | 161.402 | 190.339
6-nines | 534.581 | 534.289 | 534.289
7-nines | 545.593 | 535.040 | 545.593
8-nines | 545.593 | 535.040 | 545.593
9-nines | 545.593 | 535.040 | 545.593
max | 545.593 | 535.040 | 545.593
Overall, not terrible. Now let's look at what we get when we replace the LSI SAS controller with a PVSCSI controller.
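If you'd rather make that controller swap from PowerCLI than the vSphere client, a minimal sketch looks something like this. The VM name and vCenter address are placeholders, the VM should be powered off for the change, and the guest needs the pvscsi driver (installed with VMware Tools) before you move its boot disk to the new controller type.

# Minimal PowerCLI sketch -- "IO-Test" and the vCenter address are placeholders, adjust for your environment
Connect-VIServer -Server vcenter.example.local
$vm = Get-VM -Name "IO-Test"
# Power the VM off and confirm the pvscsi driver is present in the guest first
Get-ScsiController -VM $vm | Set-ScsiController -Type ParaVirtual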
Input parameters:
timespan: 1
-------------
duration: 1800s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'c:\io.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
performing mix test (read/write ratio: 50/50)
block size: 16384
using random I/O (alignment: 16384)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time: 1800.00s
thread count: 4
proc count: 4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 7.37%| 1.53%| 5.84%| 92.63%
1| 7.02%| 1.40%| 5.62%| 92.98%
2| 6.35%| 1.25%| 5.10%| 93.65%
3| 6.04%| 1.22%| 4.82%| 93.96%
-------------------------------------------
avg.| 6.70%| 1.35%| 5.35%| 93.30%
Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 15667019776 | 956239 | 8.30 | 531.24 | 3.760 | 2.938 | c:\io.dat (10240MB)
1 | 15743369216 | 960899 | 8.34 | 533.83 | 3.741 | 3.011 | c:\io.dat (10240MB)
2 | 15789637632 | 963723 | 8.37 | 535.40 | 3.730 | 2.841 | c:\io.dat (10240MB)
3 | 15788425216 | 963649 | 8.36 | 535.36 | 3.731 | 2.914 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 62988451840 | 3844510 | 33.37 | 2135.84 | 3.740 | 2.926
Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7831814144 | 478016 | 4.15 | 265.56 | 2.660 | 2.405 | c:\io.dat (10240MB)
1 | 7862943744 | 479916 | 4.17 | 266.62 | 2.640 | 2.538 | c:\io.dat (10240MB)
2 | 7904346112 | 482443 | 4.19 | 268.02 | 2.632 | 2.247 | c:\io.dat (10240MB)
3 | 7881277440 | 481035 | 4.18 | 267.24 | 2.631 | 2.557 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 31480381440 | 1921410 | 16.68 | 1067.45 | 2.641 | 2.440
Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 7835205632 | 478223 | 4.15 | 265.68 | 4.859 | 3.010 | c:\io.dat (10240MB)
1 | 7880425472 | 480983 | 4.18 | 267.21 | 4.840 | 3.045 | c:\io.dat (10240MB)
2 | 7885291520 | 481280 | 4.18 | 267.38 | 4.831 | 2.946 | c:\io.dat (10240MB)
3 | 7907147776 | 482614 | 4.19 | 268.12 | 4.827 | 2.833 | c:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 31508070400 | 1923100 | 16.69 | 1068.39 | 4.839 | 2.959
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.000 | 0.000 | 0.000
25th | 1.189 | 2.947 | 1.810
50th | 1.868 | 4.126 | 3.120
75th | 3.536 | 6.037 | 4.971
90th | 5.392 | 8.026 | 6.924
95th | 6.269 | 9.628 | 8.417
99th | 9.446 | 13.234 | 12.021
3-nines | 22.655 | 32.422 | 28.825
4-nines | 45.679 | 50.249 | 48.554
5-nines | 158.326 | 159.371 | 159.371
6-nines | 475.470 | 427.329 | 427.329
7-nines | 475.711 | 427.338 | 475.711
8-nines | 475.711 | 427.338 | 475.711
9-nines | 475.711 | 427.338 | 475.711
max | 475.711 | 427.338 | 475.711
So overall we see roughly a 4% performance increase across the board. Not groundbreaking numbers, but if you're trying to squeeze every last drop of performance out of your VMs, this could be a big step in the right direction.
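If you want to check that math yourself, the 4% comes straight from the total IOPS lines in the two runs above:

# Quick check using the "total" I/O per s values from the two runs above
$lsiSasIops = 2050.03
$pvscsiIops = 2135.84
[math]::Round(($pvscsiIops - $lsiSasIops) / $lsiSasIops * 100, 1)   # ~4.2 (percent)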
Speaking of squeezing every last drop, let's see what happens when we test against a ReFS-formatted disk.
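The ReFS target here is the E: drive. If you're following along, a minimal sketch of carving out and formatting a new ReFS volume from an elevated PowerShell prompt inside the guest, assuming the newly added virtual disk shows up as disk number 1 and E: is free:

# Assumes the new virtual disk is disk number 1 and E: is unused -- adjust for your VM
Initialize-Disk -Number 1 -PartitionStyle GPT
New-Partition -DiskNumber 1 -DriveLetter E -UseMaximumSize |
    Format-Volume -FileSystem ReFS -NewFileSystemLabel "ReFS-Test"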
Command Line: C:\Users\cjoyce_admin\Downloads\Diskspd-v2.0.17\amd64fre\diskspd.exe -b16K -d1800 -h -L -o2 -t4 -r -w50 -c10G E:\io.dat
Input parameters:
timespan: 1
-------------
duration: 1800s
warm up time: 5s
cool down time: 0s
measuring latency
random seed: 0
path: 'E:\io.dat'
think time: 0ms
burst size: 0
software cache disabled
hardware write cache disabled, writethrough on
performing mix test (read/write ratio: 50/50)
block size: 16384
using random I/O (alignment: 16384)
number of outstanding I/O operations: 2
thread stride size: 0
threads per file: 4
using I/O Completion Ports
IO priority: normal
Results for timespan 1:
*******************************************************************************
actual test time: 1800.02s
thread count: 4
proc count: 4
CPU | Usage | User | Kernel | Idle
-------------------------------------------
0| 8.65%| 1.62%| 7.03%| 91.35%
1| 8.69%| 1.49%| 7.20%| 91.31%
2| 7.83%| 1.35%| 6.47%| 92.17%
3| 7.43%| 1.36%| 6.07%| 92.57%
-------------------------------------------
avg.| 8.15%| 1.46%| 6.69%| 91.85%
Total IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 18047041536 | 1101504 | 9.56 | 611.94 | 3.263 | 2.708 | E:\io.dat (10240MB)
1 | 18078842880 | 1103445 | 9.58 | 613.02 | 3.258 | 3.004 | E:\io.dat (10240MB)
2 | 18066751488 | 1102707 | 9.57 | 612.61 | 3.260 | 2.712 | E:\io.dat (10240MB)
3 | 18132910080 | 1106745 | 9.61 | 614.85 | 3.248 | 2.727 | E:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 72325545984 | 4414401 | 38.32 | 2452.42 | 3.257 | 2.791
Read IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 9020080128 | 550542 | 4.78 | 305.85 | 2.762 | 2.399 | E:\io.dat (10240MB)
1 | 9030025216 | 551149 | 4.78 | 306.19 | 2.760 | 2.927 | E:\io.dat (10240MB)
2 | 9041592320 | 551855 | 4.79 | 306.58 | 2.759 | 2.342 | E:\io.dat (10240MB)
3 | 9050865664 | 552421 | 4.80 | 306.90 | 2.752 | 2.479 | E:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 36142563328 | 2205967 | 19.15 | 1225.53 | 2.758 | 2.547
Write IO
thread | bytes | I/Os | MB/s | I/O per s | AvgLat | LatStdDev | file
-----------------------------------------------------------------------------------------------------
0 | 9026961408 | 550962 | 4.78 | 306.09 | 3.764 | 2.899 | E:\io.dat (10240MB)
1 | 9048817664 | 552296 | 4.79 | 306.83 | 3.754 | 2.998 | E:\io.dat (10240MB)
2 | 9025159168 | 550852 | 4.78 | 306.03 | 3.762 | 2.954 | E:\io.dat (10240MB)
3 | 9082044416 | 554324 | 4.81 | 307.96 | 3.742 | 2.870 | E:\io.dat (10240MB)
-----------------------------------------------------------------------------------------------------
total: 36182982656 | 2208434 | 19.17 | 1226.90 | 3.756 | 2.931
%-ile | Read (ms) | Write (ms) | Total (ms)
----------------------------------------------
min | 0.267 | 0.297 | 0.267
25th | 1.252 | 1.773 | 1.403
50th | 2.019 | 3.097 | 2.618
75th | 3.724 | 5.038 | 4.275
90th | 5.581 | 6.998 | 6.240
95th | 6.395 | 8.584 | 7.525
99th | 9.641 | 12.213 | 11.021
3-nines | 20.505 | 26.232 | 23.305
4-nines | 42.971 | 45.559 | 44.280
5-nines | 238.498 | 175.573 | 204.921
6-nines | 502.382 | 359.149 | 435.862
7-nines | 547.128 | 547.124 | 547.128
8-nines | 547.128 | 547.124 | 547.128
9-nines | 547.128 | 547.124 | 547.128
max | 547.128 | 547.124 | 547.128
With a ReFS-formatted disk on top of the PVSCSI controller we see a 17% increase!
So if your applications support it, and you truly want to squeeze every last drop out of your storage, the combination of ReFS and PVSCSI is the way to go!