From: NeilBrown
To: Trond Myklebust, "Anna.Schumaker\@Netapp.com", Andrew Morton, Jan Kara
Date: Thu, 02 Apr 2020 10:52:21 +1100
Cc: linux-mm@kvack.org, linux-nfs@vger.kernel.org, LKML
Subject: Writeback fixes for NFS
In-Reply-To: <87tv2b7q72.fsf@notabene.neil.brown.name>
References: <87tv2b7q72.fsf@notabene.neil.brown.name>
Message-ID: <87v9miydai.fsf@notabene.neil.brown.name>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature"

--=-=-=
Content-Type: text/plain

Please ignore my previous patch (which this is in reply to); it was
flawed in various ways. I now understand the code a bit better and have
a somewhat simpler patch which appears to address the same problem.

The problem is that writeback to NFS often produces lots of small
writes (10s of K) rather than fewer large writes (1M). This pattern can
often hurt throughput, but in certain circumstances it can hurt NFS
throughput more than expected.

Each nfs_writepages() call results in an NFS COMMIT being sent to the
server. If writeback triggers lots of smaller nfs_writepages() calls,
this means lots of COMMITs. If the server is slow to handle the COMMIT
(I've seen the Ganesha NFS server take over 200ms per commit), these
COMMITs can overlap, queue up, and choke the NFS server and cause an
order-of-magnitude drop in throughput.

So we really want to only call nfs_writepages() when there is a largish
number of pages to be written - i.e. that are 'dirty'.

For historical reasons that I didn't thoroughly research but I'm
confident are no longer relevant, pages that have been written to the
NFS server but have not yet been the subject of a COMMIT - so-called
"unstable" pages - are effectively accounted the same as "dirty" pages
(sometimes called "reclaimable"). This can result in writeback thinking
there are lots of "dirty" pages to reclaim, while nfs_writepages() can
only find a few that it can write out.

The second patch following changes the accounting for these "unstable"
pages. They are now always accounted exactly the same as writeback
pages. Conceptually they can be thought of as still in writeback, but
the writeback is now happening on the server. A COMMIT will always
automatically follow the writes generated by nfs_writepages(), so from
the perspective of the VM, there really is no difference: it has
scheduled the write and there is nothing else it can do except wait.

Testing this patch showed that loop-back NFS is prone to deadlocks
again. I cannot see exactly how the change to 'unstable' accounting
affected this, but I can see that the old +25% heuristic can no longer
be justified given the complexity of writeback calculations.
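
To put rough numbers on the COMMIT overhead described above, here is a
back-of-envelope sketch in Python. It is an illustration only, not code
from either patch: the function names are made up, 32K stands in for
"10s of K", and it assumes the server handles COMMITs strictly one at a
time, which simplifies the overlapping and queueing a real server shows.

```python
def commits_needed(total_bytes: int, bytes_per_call: int) -> int:
    """COMMITs sent if each nfs_writepages() call writes bytes_per_call
    bytes and is followed by exactly one COMMIT (ceiling division)."""
    return -(-total_bytes // bytes_per_call)


def commit_time_seconds(total_bytes: int, bytes_per_call: int,
                        commit_latency_s: float) -> float:
    """Time spent on COMMITs alone, ASSUMING the server processes them
    one after another with a fixed latency (hypothetical model)."""
    return commits_needed(total_bytes, bytes_per_call) * commit_latency_s


if __name__ == "__main__":
    GIB = 1 << 30
    LATENCY = 0.2  # 200ms per COMMIT, the slow-server figure quoted above
    for label, size in (("32K", 32 * 1024), ("1M", 1 << 20)):
        n = commits_needed(GIB, size)
        t = commit_time_seconds(GIB, size, LATENCY)
        print(f"{label} per call: {n} COMMITs, {t:.1f}s spent in COMMIT")
```

Under these assumptions, writing 1G in 32K pieces needs 32 times as
many COMMITs as writing it in 1M pieces, so a slow COMMIT is paid 32
times as often for the same amount of data.
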
So the first patch following changes how writeback is handled for NFS
servers handling loop-back requests (and other similar services) so
that it is more obviously safe against excessive dirty pages scheduled
for other devices.

Thanks,
NeilBrown

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAl6FKTYACgkQOeye3VZi
gbkMhQ/9Eb/q3J6cLCyuDP/OqMQWyM38kLkbAhYYBgjcrgn7r/VmruQJfREPQGZc
f2i7X9VZiBFCIh0HXRdsR17d4qU6NCtAtf234EppdkceGrT+yA1RNdUV1nOFZJCY
1Qs/xyNzHOgzveedx2wuGJ5BA94Dd6MVeNE+DxFEgzqPexp14/8vqALhbLB0GLJu
eN8R7w+uCIUvTDeQp0TFuG6aUWDcQoIWi3aCxMfTICyYxjG35Nss2N/N7HinmZa/
zg8nMnE/iCFc7It89N/6i8IjjAE62SBcj5kfhzdqY3DguVX6nio3raef/ZMoH3bS
j7DEQacqwUVOsvoLutEzGBRRZ+GQEa2+Cal5AniuUpBOfOr+DyhOOpSVDPNLBY/b
7yTzK1BR1ttFI3NtJpKoFbNdKfpWkIpdebPhe6AcfOT+rhnbgXWkl15oexsiZWOU
q5k49bL9HPZ6NRsMng2pS2W7BYNQVqin70XuO2XnOTHLa+BOBh0cm6k0QTjc+XI+
/bvNonecFYqQMAcDtVDwo7G3bCwPxcSfcUDM5QD+TqbJ5tZLF2yHu1KWcZhTrtWf
q++Lrs0NGYXoe/iyqtNYkdBFOk/4YzWVZzzfNi8feV1oqYRACqKhSmicFhdr4OYc
vEfR84TCAEldq/TWUw0o51i2HvZvcsJ1hwl5PdwnCAGsnQC8rsw=
=nBP7
-----END PGP SIGNATURE-----
--=-=-=--