Subject: Re: [PATCHSET v3][RFC] Make background writeback not suck
To: Dave Chinner
From: Jens Axboe
Date: Thu, 31 Mar 2016 10:21:04 -0600
Message-ID: <56FD4E70.3090203@fb.com>
In-Reply-To: <56FD344F.70908@fb.com>
References: <1459350477-16404-1-git-send-email-axboe@fb.com> <20160331082433.GO11812@dastard> <56FD344F.70908@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/31/2016 08:29 AM, Jens Axboe wrote:
>> What I see in these performance dips is the XFS transaction
>> subsystem stalling *completely* - instead of running at a steady
>> state of around 350,000 transactions/s, there are *zero*
>> transactions running for periods of up to ten seconds. This
>> co-incides with the CPU usage falling to almost zero as well.
>> AFAICT, the only thing that is running when the filesystem stalls
>> like this is memory reclaim.
>
> I'll take a look at this, stalls should definitely not be occurring. How
> much memory does the box have?

I can't seem to reproduce this at all. On an nvme device, I get a
fairly steady 60K/sec file creation rate, and we're nowhere near being
IO bound. So the throttling has no effect at all.

On a raid0 on 4 flash devices, I get something that looks more IO
bound, for some reason. Still no impact of the throttling, however. But
given that your setup is this:

	virtio in guest, XFS direct IO -> no-op -> scsi in host.

we do potentially have two throttling points, which we don't want. Are
both the guest and the host running the new code, or just the guest?

In any case, can I talk you into trying with two patches on top of the
current code? It's the two newest patches here:

http://git.kernel.dk/cgit/linux-block/log/?h=wb-buf-throttle

The first treats REQ_META|REQ_PRIO like they should be treated, like
high priority IO. The second disables throttling for virtual devices, so
we only throttle on the backend. The latter should probably be the other
way around, but we need some way of conveying that information to the
backend.

-- 
Jens Axboe
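
For illustration only, the intent of the two patches described above can
be sketched as a small standalone C fragment. This is not the code from
the wb-buf-throttle branch: the flag values, the struct, and every
sketch_* name are hypothetical stand-ins, and the real decision in the
kernel may well be structured differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; NOT the kernel's real values. */
#define SKETCH_REQ_META (1u << 0)
#define SKETCH_REQ_PRIO (1u << 1)

/* Hypothetical, minimal stand-in for a block device. */
struct sketch_dev {
	bool is_virtual;	/* e.g. a virtio disk inside a guest */
};

/* Return true if background writeback throttling should apply to this IO. */
bool sketch_should_throttle(const struct sketch_dev *dev, unsigned int flags)
{
	/* First patch's idea: metadata/priority IO is high priority. */
	if (flags & (SKETCH_REQ_META | SKETCH_REQ_PRIO))
		return false;

	/* Second patch's idea: virtual devices leave throttling to the backend. */
	if (dev->is_virtual)
		return false;

	return true;
}

/* Spells out the behaviour assumed in this sketch. */
void sketch_self_check(void)
{
	struct sketch_dev host = { .is_virtual = false };
	struct sketch_dev guest = { .is_virtual = true };

	assert(sketch_should_throttle(&host, 0));
	assert(!sketch_should_throttle(&host, SKETCH_REQ_META));
	assert(!sketch_should_throttle(&host, SKETCH_REQ_PRIO));
	assert(!sketch_should_throttle(&guest, 0));
}
```

With both checks in place, a virtio-in-guest over scsi-in-host stack
would throttle in only one layer (the host), which is the "one
throttling point" goal stated above.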