Sequential disk write much slower than seq read (Online SC 2016 SATA)
I got an Online.net SC 2016 SATA server with a 500GB drive, but the sequential write speed is about half of the sequential read speed.
For all the hard drives I've seen so far, sequential read/write speeds are typically about the same, so this discrepancy seems really odd. Does anyone have any clues as to what the cause may be?
I'm testing on an ext4 partition with journaling disabled, from recovery mode (so no other disk activity). Read speeds seem fine:
# hdparm -tT /dev/sda

/dev/sda:
 Timing cached reads:   3022 MB in  2.00 seconds = 1510.72 MB/sec
 Timing buffered disk reads: 378 MB in  3.00 seconds = 125.83 MB/sec
However writes don't:
# dd if=/dev/zero of=test bs=2M count=512 conv=fdatasync
512+0 records in
512+0 records out
1073741824 bytes (1.1 GB) copied, 24.9087 s, 43.1 MB/s
Using FIO instead, I get 119MB/s seq read and 56MB/s seq write.
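For reference, a job file along these lines reproduces this kind of run (reconstructed from the fio output further down: 1MiB blocks, libaio, iodepth 1, 1GiB file; treat it as a sketch, not the exact file used):

```ini
; seq.fio - assumed reconstruction of the job file, not verbatim
[global]
ioengine=libaio
iodepth=1
bs=1M
size=1024m
direct=1
filename=fio-seq-test

[Seq-Read]
rw=read

[Seq-Write]
stonewall      ; start writes only after the read job finishes
rw=write
```

Run with `fio seq.fio`; the stonewall keeps the two jobs from overlapping, matching the two run-status groups in the output.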
No issues reported by smartctl.
Is this a sign of a faulty disk, or a potential configuration error on my part?
For anyone else with the same server, what speeds are you getting?
david_W seems to be getting around 100MB/s and other benchmarks are giving similar seq read/write performance for the disk.
Online.net support are claiming that this is not unusual. Is this really the case?
I checked in rescue mode and there's no problem with the disk; the values look OK for this kind of disk too.
Values can differ between servers, but these are not abnormal values.
Results can range from 50MB/s to 100MB/s, most often between those two values.
Edit: smartctl --all /dev/sda output:
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.13.0-77-generic] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Travelstar Z7K500
Device Model:     HGST HTS725050A7E630
Serial Number:    RCF50ACE1S0G7M
LU WWN Device Id: 5 000cca 85ed88dc3
Firmware Version: GS2OA230
User Capacity:    500,107,862,016 bytes [500 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Sun Mar 20 02:39:13 2016 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (  45) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  90) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   100   100   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       7
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       205
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       4
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       15
194 Temperature_Celsius     0x0002   214   214   000    Old_age   Always       -       28 (Min/Max 17/34)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%       147         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
Seq-Read: (g=0): rw=read, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=1
Seq-Write: (g=1): rw=write, bs=1M-1M/1M-1M/1M-1M, ioengine=libaio, iodepth=1
fio-2.1.3
Starting 2 processes
Seq-Read: Laying out IO file(s) (1 file(s) / 1024MB)

Seq-Read: (groupid=0, jobs=1): err= 0: pid=2474: Sun Mar 20 02:48:35 2016
  read : io=1024.0MB, bw=122554KB/s, iops=119, runt=  8556msec
    slat (usec): min=92, max=746, avg=94.72, stdev=20.57
    clat (msec): min=6, max=56, avg= 8.26, stdev= 1.74
     lat (msec): min=6, max=57, avg= 8.35, stdev= 1.76
    clat percentiles (usec):
     |  1.00th=[ 6752],  5.00th=[ 6816], 10.00th=[ 6816], 20.00th=[ 8384],
     | 30.00th=[ 8384], 40.00th=[ 8512], 50.00th=[ 8512], 60.00th=[ 8512],
     | 70.00th=[ 8512], 80.00th=[ 8512], 90.00th=[ 8512], 95.00th=[ 8512],
     | 99.00th=[ 8512], 99.50th=[ 9664]
    bw (KB /s): min=111304, max=124430, per=100.00%, avg=122630.56, stdev=3282.11
    lat (msec) : 10=99.71%, 20=0.10%, 50=0.10%, 100=0.10%
  cpu          : usr=0.08%, sys=1.32%, ctx=1027, majf=0, minf=283
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=1024/w=0/d=0, short=r=0/w=0/d=0

Seq-Write: (groupid=1, jobs=1): err= 0: pid=2475: Sun Mar 20 02:48:35 2016
  write: io=1024.0MB, bw=55814KB/s, iops=54, runt= 18787msec
    slat (usec): min=118, max=230, avg=168.67, stdev=18.20
    clat (msec): min=3, max=69, avg=18.17, stdev=12.78
     lat (msec): min=4, max=70, avg=18.34, stdev=12.78
    clat percentiles (usec):
     |  1.00th=[ 3952],  5.00th=[ 3952], 10.00th=[ 3952], 20.00th=[ 3952],
     | 30.00th=[ 3952]
    bw (KB /s): min=40715, max=65145, per=100.00%, avg=55882.17, stdev=6620.89
    lat (msec) : 4=33.30%, 10=0.10%, 20=38.67%, 50=26.27%, 100=1.66%
  cpu          : usr=0.48%, sys=0.59%, ctx=1032, majf=0, minf=26
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=1024/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=1024.0MB, aggrb=122554KB/s, minb=122554KB/s, maxb=122554KB/s, mint=8556msec, maxt=8556msec

Run status group 1 (all jobs):
  WRITE: io=1024.0MB, aggrb=55813KB/s, minb=55813KB/s, maxb=55813KB/s, mint=18787msec, maxt=18787msec

Disk stats (read/write):
  sda: ios=2048/2048, merge=0/0, ticks=12680/28036, in_queue=40704, util=98.09%
Looks like a fairly new disk, wonder why there are so many pre-fails.
Where'd you see the pre-fails?
How about a larger block size with dd?
What size do you recommend?
2MB blocks are fairly large, and I've never really seen benchmarks that go beyond that. I can't imagine it making much of a difference though, as returns diminish rapidly as the block size grows.
I use bs=2G count=1 when testing manually... way too high, I guess.
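For what it's worth, block-size sensitivity is easy to measure directly. A sketch that sweeps a few sizes against a scratch file (the path and sizes here are arbitrary; 64MiB per run keeps it quick, which is too small for a serious benchmark, so scale the counts up on real hardware):

```shell
#!/bin/sh
# Sweep dd block sizes against a scratch file, keeping total data constant
# (64 MiB per run) so results are comparable. Path/sizes are arbitrary.
OUT=/tmp/dd_bs_test
for spec in "64k 1024" "512k 128" "2M 32" "16M 4"; do
    set -- $spec
    bs=$1; count=$2
    printf 'bs=%s: ' "$bs"
    dd if=/dev/zero of="$OUT" bs="$bs" count="$count" conv=fdatasync 2>&1 | tail -n 1
done
rm -f "$OUT"
```

With conv=fdatasync the sync happens once at the end, so the numbers should plateau well before 2M; a large gap between 64k and 16M would itself be informative.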
Looks like some sort of power saving on the HDD; my first-time test results are similar to yours.
What's the output of hdparm -C /dev/sda?
Side note: you may have to wait a while without disk activity before running hdparm -C, just to make sure the drive has had a chance to spin down, if it actually does.
Tried your test, but it doesn't seem to make a difference.
...well, I suppose the speed is increasing slightly and gradually... may just be random chance.
APM was set to 254 (max performance) anyway, so I assume power saving shouldn't be coming into play:
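For anyone else comparing, the APM level and current power state can be checked together; a small sketch (the device path is an assumed default - adjust it, and run as root):

```shell
#!/bin/sh
# Report APM level (-B) and power state (-C) for a drive.
# /dev/sda is an assumed default; override with DISK=/dev/sdX.
DEV=${DISK:-/dev/sda}
if ! command -v hdparm >/dev/null 2>&1; then
    echo "hdparm not installed"
elif [ ! -b "$DEV" ]; then
    echo "no block device at $DEV"
else
    hdparm -B "$DEV" || echo "hdparm -B failed (needs root?)"  # 254 = max performance, <128 allows spin-down
    hdparm -C "$DEV" || echo "hdparm -C failed (needs root?)"  # active/idle vs standby (spun down)
fi
```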
Online.net support keeps insisting that my speeds are acceptable, and won't do anything about it. Even looking across 1.8k disks of the same model, the minimum speed (68.3MB/s) is still a fair bit better than the best I can achieve (which is MUCH slower than the average of 101MB/s).
Apparently refunds aren't available once an install has been performed - which seems quite reasonable, but they don't offer the option to boot into recovery before installation. That means it's impossible to check disk speed until after you've installed an OS.
Thanks for the responses!
I am using CentOS 7, BTW... not sure if that's something you can try out.
Does dmesg show any IO errors?
Pre-fail is a label for the type of indicator, not a value/result. The chance of imminent failure is indicated by increasing raw values in the Pre-fail rows; the label itself will always read Pre-fail or Old_age.
So, SMART-wise this drive seems fine, even if performance is iffy.
Debian 8 here, but I have tried with the Ubuntu 14.04 recovery disk. Online don't offer a CentOS recovery disk. I honestly doubt it makes any difference.
I can't see any errors in dmesg at all, but if there's anything I should grep, I can do that.
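If it helps, these are the patterns worth grepping for first (the list is just a suggestion, not exhaustive):

```shell
#!/bin/sh
# Scan the kernel log for common ATA / block-layer error signatures.
# The pattern list is a suggestion only.
PAT='ata[0-9]+.*(error|failed)|I/O error|DRDY|UNC|link is (down|slow)|hard resetting link'
dmesg 2>/dev/null | grep -iE "$PAT" || echo "no matching kernel messages"
```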
Honestly, if your disk has data (i.e. you're not write-testing right on the edge), I think 43MB/s is a normal raw write speed for what I assume is a 2.5" drive. It's pretty much exactly the speed I've seen from WD Black 2.5" drives.
Good point - didn't think of that!
It has data, but not a lot, so it shouldn't be writing on the (center?) edge.
But we can try testing that too I suppose. My first partition is a 200MB /boot partition, which I presume is closest to the outer rim of the disk (and hence the fastest part of the drive).
I've wiped this partition, with the following:
I presume this should be about the fastest I can get out of this disk?
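(The exact wipe/test command isn't quoted above; presumably something like the sketch below, a plain sequential zero-fill with a final fdatasync. It defaults to a harmless scratch file - pointing TARGET at a raw partition such as /dev/sda1 destroys its contents, so only do that on a partition you've already written off.)

```shell
#!/bin/sh
# Sequential write test against TARGET. Defaults to a scratch file;
# setting TARGET to a raw partition (e.g. /dev/sda1) DESTROYS its data.
TARGET=${TARGET:-/tmp/raw_write_test}
dd if=/dev/zero of="$TARGET" bs=2M count=32 conv=fdatasync 2>&1 | tail -n 1
[ -b "$TARGET" ] || rm -f "$TARGET"   # clean up only the scratch-file case
```

Writing to the raw partition bypasses the filesystem entirely, so it's about as close to the drive's ceiling as you can get from userspace.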
@xyz is the partition 4K-aligned? It needs to be (and just in case, it's better to align to 1MB or so). Show the output of
sfdisk -d /dev/sda
All aligned to 1MB:
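For anyone wanting to check their own layout: with 512-byte logical sectors, a start sector divisible by 2048 sits on a 1MiB boundary. A sketch that parses sfdisk -d style output (the sample dump below is illustrative, not the actual table):

```shell
#!/bin/sh
# Check 1MiB alignment of partition start sectors (512B logical sectors:
# 1 MiB = 2048 sectors). The sample dump below is illustrative only.
check_alignment() {
    # expects lines like: /dev/sda1 : start=2048, size=409600, Id=83
    grep -o 'start= *[0-9]*' | tr -dc '0-9\n' | while read -r start; do
        if [ $((start % 2048)) -eq 0 ]; then
            echo "start=$start aligned"
        else
            echo "start=$start NOT aligned"
        fi
    done
}
check_alignment <<'EOF'
/dev/sda1 : start=2048, size=409600, Id=83
/dev/sda2 : start=411648, size=975462400, Id=83
EOF
```

Feed it real data with `sfdisk -d /dev/sda | check_alignment`.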
I thought as much because my dedis are outputting the same results. Thanks for the explanation.
Here's mine Dedibox XC SSD 2015 with Samsung SSD PM871 128GB at SATA3:
/dev/sda:
 Timing cached reads:   3446 MB in  2.00 seconds = 1723.40 MB/sec
 Timing buffered disk reads: 1332 MB in  3.00 seconds = 443.38 MB/sec
dd if=/dev/zero of=test bs=2M count=512 conv=fdatasync
512+0 records in
512+0 records out
1073741824 bytes (1.1 GB) copied, 9.2816 s, 116 MB/s
Another XC SSD 2015 with Intel 320 series SSD 120 GB at SATA2:
And now XC SSD 2016 with Samsung SSD PM871 256GB at SATA2:
How does that help the OP? OP has a standard magnetic drive, not an SSD.
@Clouvider: Don't mind Spacedust. He is just very, very excited with his new online.net servers and tends to be a bit compulsive when posting... ;-)
You won't get more out of laptop HDDs. SATA3 won't help at all, as these drives will never reach 267 MB/s.
Thanks again for all the replies, from the looks of things, this issue probably isn't due to a configuration fault on my side.
I've managed to convince Online.net support that there's a problem - they're looking into a BIOS update which may fix the issue. Fingers crossed, but a heads up to anyone else who happens to be in the same boat (is there anyone else here? I haven't heard from anyone if so!).
They should update the BIOS to take advantage of SATA3.
@Spacedust / @david_W -
How on earth will this help when accessing a single 2.5" drive? I doubt you can even fill its cache at SATA3 speeds. 40-50MB/sec is pretty average write performance for a spinning laptop hard drive.
Well, at least, it will make me feel good........
@david_W it should make you feel stupid.
People want everything from an 8.99 box, lol. I don't even mind if they're willing to put ECC RAM in it.