Date: Mon, 23 Jan 2023 16:42:56 +0000
From: Matthew Wilcox
To: David Howells
Cc: John Hubbard, Al Viro, Christoph Hellwig, Jens Axboe, Jan Kara,
	Jeff Layton, Logan Gunthorpe, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v7 0/8] iov_iter: Improve page extraction (ref, pin or just list)
References: <20230120175556.3556978-1-dhowells@redhat.com> <318138.1674491927@warthog.procyon.org.uk>
In-Reply-To: <318138.1674491927@warthog.procyon.org.uk>

On Mon, Jan 23, 2023 at 04:38:47PM +0000, David Howells wrote:
> Matthew Wilcox wrote:
>
> > Why do we want to track that information on a per-page basis? Wouldn't it
> > be easier to have a VM_NOCOW flag in vma->vm_flags? Set it the first
> > time somebody does an O_DIRECT read or RDMA pin. That's it. Pages in
> > that VMA will now never be COWed, regardless of their refcount/mapcount.
> > And the whole "did we pin or get this page" problem goes away. Along
> > with folio->pincount.
>
> Wouldn't that potentially make someone's entire malloc() heap entirely NOCOW
> if they did a single DIO to/from it.

Yes. Would that be an actual problem for any real application?

We could do this with a vm_pincount if it's essential to be able to
count how often it's happened and be able to fork() without COW if it's
something that happened in the past and is now not happening.

> Also you only mention DIO read - but what about "start DIO write; fork(); touch
> buffer" in the parent - now the write buffer belongs to the child and they can
> affect the parent's write.

I'm struggling to see the problem here. If the child hasn't exec'd, the
parent and child are still in the same security domain. The parent
could have modified the buffer before calling fork().