Subject: Re: [PATCHSET v2][RFC] Make background writeback not suck
From: Jens Axboe <axboe@fb.com>
Date: Thu, 24 Mar 2016 11:42:19 -0600
Message-ID: <56F426FB.2010002@fb.com>
In-Reply-To: <56F2B8B5.10106@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/23/2016 09:39 AM, Jens Axboe wrote:
> Hi,
>
> Apparently I dropped the subject on this one, it's of course v2 of the
> writeback not sucking patchset...

Some test results. I've run a lot of them, on various types of storage,
and performance seems good with the default settings.

The test case reads in a file and writes it to stdout. It targets a
certain latency for the reads - by default it's 10ms. If a read isn't
done by 10ms, it'll queue the next read. This avoids the coordinated
omission problem, where one long latency is in fact many of them - you
just don't know it, since you don't issue more reads while one is
stuck.

The test case reads a compressed file and writes it over a pipe to gzip
to decompress it. The input file is around 9G, and uncompresses to 20G.
At the end of the run, latency percentiles are shown, and every time
the target latency is exceeded during the run, the offending latency is
output.

To keep the system busy, 75% (24G) of the memory is taken up by CPU
hogs. This is intended to make the case worse for the throttled depth,
as Dave pointed out.
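read-to-pipe-async itself isn't included in this mail, so here's a
rough sketch of the pacing idea, purely for illustration - every name
in it (TARGET_USEC, do_read, pace_reads, ...) is made up, and the real
tool is structured differently. The point is only the accounting: stamp
each read with the time it was scheduled to start, keep issuing even
when a read is stuck, and measure latency against the schedule, so one
stuck read shows up as several delayed samples instead of one:

/*
 * Rough sketch only - not the actual read-to-pipe-async code.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

#define TARGET_USEC	10000		/* 10ms latency target */
#define BUF_SIZE	(64 * 1024)	/* arbitrary read size */

static uint64_t now_usec(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ts.tv_sec * 1000000ULL + ts.tv_nsec / 1000;
}

struct read_req {
	int fd;
	off_t off;
	uint64_t scheduled;	/* when this read should have started */
};

static void *do_read(void *arg)
{
	struct read_req *req = arg;
	char buf[BUF_SIZE];
	uint64_t lat;

	pread(req->fd, buf, sizeof(buf), req->off);

	/*
	 * Latency is measured from the scheduled issue time, so a read
	 * that queued up behind a stuck one still shows the full delay.
	 */
	lat = now_usec() - req->scheduled;
	if (lat > TARGET_USEC)
		fprintf(stderr, "read latency=%llu usec\n",
			(unsigned long long) lat);

	free(req);
	return NULL;
}

static void pace_reads(int fd, off_t file_size)
{
	uint64_t next = now_usec();
	off_t off;

	for (off = 0; off < file_size; off += BUF_SIZE) {
		struct read_req *req = malloc(sizeof(*req));
		pthread_t thread;

		req->fd = fd;
		req->off = off;
		req->scheduled = next;

		/* issue regardless of whether earlier reads finished */
		pthread_create(&thread, NULL, do_read, req);
		pthread_detach(thread);

		next += TARGET_USEC;
		if (now_usec() < next)
			usleep(next - now_usec());
	}
}

Per the description above, the actual tool queues the next read when
the current one overruns the target; the fixed schedule here is just a
simplification that gets the same effect for the accounting.

Out-of-the-box results:

# time (./read-to-pipe-async -f randfile.gz | gzip -dc > outfile; sync)
read latency=11790 usec
read latency=82697 usec
[...]
Latency percentiles (usec) (READERS)
	50.0000th: 4
	75.0000th: 5
	90.0000th: 6
	95.0000th: 7
	99.0000th: 54
	99.5000th: 64
	99.9000th: 334
	99.9900th: 17952
	99.9990th: 101504
	99.9999th: 203520
	Over=333, min=0, max=215367
Latency percentiles (usec) (WRITERS)
	50.0000th: 3
	75.0000th: 5
	90.0000th: 454
	95.0000th: 473
	99.0000th: 615
	99.5000th: 625
	99.9000th: 815
	99.9900th: 1142
	99.9990th: 2244
	99.9999th: 10032
	Over=3, min=0, max=10811
Read rate (KB/sec) : 88988
Write rate (KB/sec): 60019

real	2m38.701s
user	2m33.030s
sys	1m31.540s

215ms worst case read latency, and 333 cases of exceeding the 10ms
target.

And with the patchset applied:

# time (./read-to-pipe-async -f randfile.gz | gzip -dc > outfile; sync)
write latency=15394 usec
[...]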
Latency percentiles (usec) (READERS)
	50.0000th: 4
	75.0000th: 5
	90.0000th: 6
	95.0000th: 8
	99.0000th: 55
	99.5000th: 64
	99.9000th: 338
	99.9900th: 2652
	99.9990th: 3964
	99.9999th: 7464
	Over=1, min=0, max=10221
Latency percentiles (usec) (WRITERS)
	50.0000th: 4
	75.0000th: 5
	90.0000th: 450
	95.0000th: 471
	99.0000th: 611
	99.5000th: 623
	99.9000th: 703
	99.9900th: 1106
	99.9990th: 2010
	99.9999th: 10448
	Over=6, min=1, max=15394
Read rate (KB/sec) : 95506
Write rate (KB/sec): 59970

real	2m39.014s
user	2m33.800s
sys	1m35.210s

Worst case read latency drops from 215ms to about 10ms, with only a
single read exceeding the 10ms target, and the read rate improves as
well.

I won't bore you with the vmstat output, it's pretty messy for the
default case.
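As an aside on the percentile tables: here's a sketch of how output in
that shape can be produced from raw latency samples. Again illustrative
only, not the tool's actual code - show_percentiles() and friends are
made up, and the round-looking tail values above (17952, 101504, ...)
suggest the real tool uses a coarse histogram rather than exact
samples:

#include <stdio.h>
#include <stdlib.h>

static int cmp_u64(const void *a, const void *b)
{
	unsigned long long x = *(const unsigned long long *) a;
	unsigned long long y = *(const unsigned long long *) b;

	return (x > y) - (x < y);
}

/*
 * 'lat' holds one latency sample per IO, in usec; 'target' is the
 * latency target (10000 usec above). Assumes n > 0.
 */
static void show_percentiles(unsigned long long *lat, size_t n,
			     unsigned long long target)
{
	static const double pcts[] = {
		50.0, 75.0, 90.0, 95.0, 99.0, 99.5, 99.9,
		99.99, 99.999, 99.9999,
	};
	size_t i, over = 0;

	qsort(lat, n, sizeof(*lat), cmp_u64);

	for (i = 0; i < sizeof(pcts) / sizeof(pcts[0]); i++) {
		size_t idx = (size_t) (pcts[i] / 100.0 * (n - 1));

		printf("\t%.4fth: %llu\n", pcts[i], lat[idx]);
	}

	for (i = 0; i < n; i++)
		over += lat[i] > target;

	printf("\tOver=%zu, min=%llu, max=%llu\n",
	       over, lat[0], lat[n - 1]);
}

--
Jens Axboe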