Date: Fri, 1 Apr 2016 11:46:23 +1100
From: Dave Chinner
To: Jens Axboe
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-block@vger.kernel.org
Subject: Re: [PATCHSET v3][RFC] Make background writeback not suck
Message-ID: <20160401004623.GT11812@dastard>
References: <1459350477-16404-1-git-send-email-axboe@fb.com>
    <20160331082433.GO11812@dastard> <56FD344F.70908@fb.com>
In-Reply-To: <56FD344F.70908@fb.com>

On Thu, Mar 31, 2016 at 08:29:35AM -0600, Jens Axboe wrote:
> On 03/31/2016 02:24 AM, Dave Chinner wrote:
> >On Wed, Mar 30, 2016 at 09:07:48AM -0600, Jens Axboe wrote:
> >>Hi,
> >>
> >>This patchset isn't as much a final solution, as it's demonstration
> >>of what I believe is a huge issue. Since the dawn of time, our
> >>background buffered writeback has sucked. When we do background
> >>buffered writeback, it should have little impact on foreground
> >>activity. That's the definition of background activity... But for as
> >>long as I can remember, heavy buffered writers has not behaved like
> >>that.
> >>For instance, if I do something like this:
> >>
> >>$ dd if=/dev/zero of=foo bs=1M count=10k
> >>
> >>on my laptop, and then try and start chrome, it basically won't start
> >>before the buffered writeback is done. Or, for server oriented
> >>workloads, where installation of a big RPM (or similar) adversely
> >>impacts data base reads or sync writes. When that happens, I get people
> >>yelling at me.
> >>
> >>Last time I posted this, I used flash storage as the example. But
> >>this works equally well on rotating storage. Let's run a test case
> >>that writes a lot. This test writes 50 files, each 100M, on XFS on
> >>a regular hard drive. While this happens, we attempt to read
> >>another file with fio.
> >>
> >>Writers:
> >>
> >>$ time (./write-files ; sync)
> >>real 1m6.304s
> >>user 0m0.020s
> >>sys 0m12.210s
> >
> >Great. So a basic IO tests looks good - let's through something more
> >complex at it. Say, a benchmark I've been using for years to stress
> >the Io subsystem, the filesystem and memory reclaim all at the same
> >time: a concurent fsmark inode creation test.
> >(first google hit https://lkml.org/lkml/2013/9/10/46)
>
> Is that how you are invoking it as well same arguments?

Yes. And the VM is exactly the same, too - 16p/16GB RAM.
Cut down version of the script I use:

#!/bin/bash

QUOTA=
MKFSOPTS=
NFILES=100000
DEV=/dev/vdc
LOGBSIZE=256k
FSMARK=/home/dave/src/fs_mark-3.3/fs_mark
MNT=/mnt/scratch

while [ $# -gt 0 ]; do
        case "$1" in
        -q)     QUOTA="uquota,gquota,pquota" ;;
        -N)     NFILES=$2 ; shift ;;
        -d)     DEV=$2 ; shift ;;
        -l)     LOGBSIZE=$2; shift ;;
        --)     shift ; break ;;
        esac
        shift
done
MKFSOPTS="$MKFSOPTS $*"

echo QUOTA=$QUOTA
echo MKFSOPTS=$MKFSOPTS
echo DEV=$DEV

sudo umount $MNT > /dev/null 2>&1
sudo mkfs.xfs -f $MKFSOPTS $DEV
sudo mount -o nobarrier,logbsize=$LOGBSIZE,$QUOTA $DEV $MNT
sudo chmod 777 $MNT
sudo sh -c "echo 1 > /proc/sys/fs/xfs/stats_clear"
time $FSMARK -D 10000 -S0 -n $NFILES -s 0 -L 32 \
        -d $MNT/0 -d $MNT/1 \
        -d $MNT/2 -d $MNT/3 \
        -d $MNT/4 -d $MNT/5 \
        -d $MNT/6 -d $MNT/7 \
        -d $MNT/8 -d $MNT/9 \
        -d $MNT/10 -d $MNT/11 \
        -d $MNT/12 -d $MNT/13 \
        -d $MNT/14 -d $MNT/15 \
        | tee >(stats --trim-outliers | tail -1 1>&2)
sync
sudo umount /mnt/scratch
$

> >>The above was run without scsi-mq, and with using the deadline scheduler,
> >>results with CFQ are similary depressing for this test. So IO scheduling
> >>is in place for this test, it's not pure blk-mq without scheduling.
> >
> >virtio in guest, XFS direct IO -> no-op -> scsi in host.
>
> That has write back caching enabled on the guest, correct?

No. It uses virtio,cache=none (that's the "XFS Direct IO" bit above).
Sorry for not being clear about that.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com