Return-Path:
Received: from mail-ob0-f182.google.com ([209.85.214.182]:32826 "EHLO mail-ob0-f182.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1161963AbbKEP5E (ORCPT ); Thu, 5 Nov 2015 10:57:04 -0500
Received: by obbww6 with SMTP id ww6so43017656obb.0 for ; Thu, 05 Nov 2015 07:57:04 -0800 (PST)
MIME-Version: 1.0
In-Reply-To:
References: <1446563847-14005-1-git-send-email-agruenba@redhat.com> <1446563847-14005-46-git-send-email-agruenba@redhat.com>
Date: Thu, 5 Nov 2015 10:57:04 -0500
Message-ID:
Subject: Re: [PATCH v13 45/51] sunrpc: Allow to demand-allocate pages to encode into
From: Trond Myklebust
To: Andreas Gruenbacher
Cc: Alexander Viro, "Theodore Ts'o", Andreas Dilger, "J. Bruce Fields", Jeff Layton, Anna Schumaker, Dave Chinner, linux-ext4, XFS Developers, Linux Kernel Mailing List, Linux FS-devel Mailing List, Linux NFS Mailing List, linux-cifs@vger.kernel.org, Linux API Mailing List
Content-Type: text/plain; charset=UTF-8
Sender: linux-nfs-owner@vger.kernel.org
List-ID:

On Thu, Nov 5, 2015 at 6:07 AM, Andreas Gruenbacher wrote:
> Trond,
>
> On Tue, Nov 3, 2015 at 5:25 PM, Trond Myklebust wrote:
>> On Tue, Nov 3, 2015 at 10:17 AM, Andreas Gruenbacher wrote:
>>> When encoding large, variable-length objects such as acls into xdr_bufs,
>>> it is easier to allocate buffer pages on demand rather than precomputing
>>> the required buffer size.
>>
>> NACK. We're not doing allocations from inside the XDR encoders. This
>> can and should be done before calling into the SUNRPC layer.
>
> an XDR-encoded ACL can be up to 64k (16 pages) in size. In practice,
> large ACLs like that will almost never occur and almost all ACLs will
> fit into a single page though.
>
> The XDR-encoded ACL contains strings for the user and group names
> which need to be looked up when the idmapper is used. Those lookups
> are somewhat expensive; in addition, the lookup results can change
> over time. When precomputing the size, allocating space, and then
> encoding the ACL, we could run out of space when encoding.
>
> So we could always allocate the maximum 16 pages, encode the acl, and
> free the unused pages. This would be rather wasteful though.
>
> Given how simple it is to allocate pages as we go, this seems the
> better choice here. This doesn't break any existing code either; NULL
> page pointers would have oopsed in xdr_get_next_encode_buffer before.
>
> From the memory management point of view, there is no difference in
> preallocating GFP_NOFS pages and allocating them on demand; the pages
> are allocated in the same task and locking context in both cases.
>
> So could you please explain why you object to this change?
>

Allocating memory deep in the bowels of the RPC code with the
expectation that it will be freed by the caller of the RPC request is
a layering violation of the ugliest sort. How is anyone who is
unfamiliar with the code going to be able to understand what is going
on without tracing through 1000 lines of code to spot where the
allocation is happening?

Aside from that, we do not want any non-critical blocking while
holding the RPC socket lock. Your allocation request will block all
further traffic to the server until it is satisfied. That includes
blocking page writeback, which might actually free up memory to
satisfy the allocation.

As I said above, there is no reason whatsoever to have to do all this
inside encode_setacl(). The entire ACL encoding into pages can be done
before even calling into the RPC layer, just like we do today.
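
The "allocate the maximum, encode, free the unused pages" ordering discussed in this thread can be illustrated with a minimal user-space sketch. This is plain C, not kernel code: every name below (page_buf, page_buf_prealloc, page_buf_encode, page_buf_trim, page_buf_release) is hypothetical, the 4096-byte page size is an assumption, and malloc merely stands in for the kernel page allocator. The only point it tries to show is that the encode step itself never allocates, so all allocation happens before the encoder runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, for illustration only */
#define SKETCH_MAX_PAGES 16u    /* the 64k / 16-page upper bound from the thread */
#define SKETCH_MAX_BYTES ((size_t)SKETCH_PAGE_SIZE * SKETCH_MAX_PAGES)

/* Hypothetical page array; this is NOT the kernel's struct xdr_buf. */
struct page_buf {
    unsigned char *pages[SKETCH_MAX_PAGES];
    size_t len;                 /* bytes encoded so far */
};

/* Free every page still held and reset the buffer. */
void page_buf_release(struct page_buf *buf)
{
    for (size_t i = 0; i < SKETCH_MAX_PAGES; i++) {
        free(buf->pages[i]);
        buf->pages[i] = NULL;
    }
    buf->len = 0;
}

/* Step 1: allocate all pages up front, before any encoding starts. */
int page_buf_prealloc(struct page_buf *buf)
{
    for (size_t i = 0; i < SKETCH_MAX_PAGES; i++)
        buf->pages[i] = NULL;
    buf->len = 0;

    for (size_t i = 0; i < SKETCH_MAX_PAGES; i++) {
        buf->pages[i] = malloc(SKETCH_PAGE_SIZE);
        if (!buf->pages[i]) {
            page_buf_release(buf);
            return -1;
        }
    }
    return 0;
}

/* Step 2: copy data into the preallocated pages. Never allocates. */
int page_buf_encode(struct page_buf *buf, const void *data, size_t n)
{
    const unsigned char *src = data;

    assert(buf->len <= SKETCH_MAX_BYTES);
    if (n > SKETCH_MAX_BYTES - buf->len)
        return -1;              /* would exceed the preallocated space */

    while (n > 0) {
        size_t page = buf->len / SKETCH_PAGE_SIZE;
        size_t off = buf->len % SKETCH_PAGE_SIZE;
        size_t chunk = SKETCH_PAGE_SIZE - off;

        if (chunk > n)
            chunk = n;
        memcpy(buf->pages[page] + off, src, chunk);
        src += chunk;
        buf->len += chunk;
        n -= chunk;
    }
    return 0;
}

/* Step 3: free the pages the encoding did not touch; return pages kept. */
size_t page_buf_trim(struct page_buf *buf)
{
    size_t used = (buf->len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;

    for (size_t i = used; i < SKETCH_MAX_PAGES; i++) {
        free(buf->pages[i]);
        buf->pages[i] = NULL;
    }
    return used;
}
```

A caller would run page_buf_prealloc(), then page_buf_encode() for each piece of the object, then page_buf_trim(), and only afterwards hand the pages to whatever sends them. The cost is the temporary over-allocation that the quoted message calls wasteful; the benefit is that the party that allocates is also the party that frees.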