From: Jann Horn
Date: Mon, 21 Sep 2020 23:55:06 +0200
Subject: Re: [PATCH 4/5] mm: Do early cow for pinned pages during fork() for ptes
To: Peter Xu
Cc: Linux-MM, kernel list, Linus Torvalds, Michal Hocko, Kirill Shutemov, Oleg Nesterov, Kirill Tkhai, Hugh Dickins, Leon Romanovsky, Jan Kara, John Hubbard, Christoph Hellwig, Andrew Morton, Jason Gunthorpe, Andrea Arcangeli
References: <20200921211744.24758-1-peterx@redhat.com> <20200921212028.25184-1-peterx@redhat.com>
In-Reply-To: <20200921212028.25184-1-peterx@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Sep 21, 2020 at 11:20 PM Peter Xu wrote:
> This patch is greatly inspired by the discussions on the list from Linus, Jason
> Gunthorpe and others [1].
>
> It allows copy_pte_range() to do early cow if the pages were pinned on the
> source mm. Currently we don't have an accurate way to know whether a page is
> pinned or not. The only thing we have is page_maybe_dma_pinned(). However
> that's good enough for now. Especially, with the newly added mm->has_pinned
> flag to make sure we won't affect processes that never pinned any pages.

To clarify: This patch only handles pin_user_pages() callers and doesn't
try to address other GUP users, right? E.g. if task A uses
process_vm_writev() on task B while task B is going through fork(), that
can still race in such a way that the written data only shows up in the
child and not in B, right?

I dislike the whole pin_user_pages() concept because (as far as I
understand) it fundamentally only fixes the problem for the subset of
cases that are more likely to occur in practice (long-term pins
overlapping with things like writeback), and ignores the rarer cases
("short-term" GUP).
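
To make the concern concrete, here is a rough userspace model of the
decision described in the quoted text. It is only a sketch, not the code
from the patch: the model_* names are hypothetical, and it assumes that
page_maybe_dma_pinned() boils down to comparing the page refcount against
a pin bias (taken to be 1024 here, which is my understanding of
GUP_PIN_COUNTING_BIAS at the time).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stands in for the kernel's GUP_PIN_COUNTING_BIAS. */
#define MODEL_PIN_BIAS 1024

/* Hypothetical, heavily simplified stand-ins for struct page / mm_struct. */
struct model_page {
	int refcount;		/* ordinary references plus pin bias units */
};

struct model_mm {
	bool has_pinned;	/* set once the mm has pinned pages */
};

/*
 * Models page_maybe_dma_pinned() under the assumption above: a pin adds
 * MODEL_PIN_BIAS to the refcount, so the answer is a heuristic. Enough
 * ordinary references also push the count over the threshold.
 */
static inline bool model_page_maybe_pinned(const struct model_page *page)
{
	assert(page != NULL);
	return page->refcount >= MODEL_PIN_BIAS;
}

/*
 * Models the fork-time choice from the quoted description: copy the page
 * early only if the source mm has pinned pages and this page looks pinned.
 */
static inline bool model_needs_early_cow(const struct model_mm *src_mm,
					 const struct model_page *page)
{
	assert(src_mm != NULL);
	if (!src_mm->has_pinned)
		return false;	/* processes that never pinned are unaffected */
	return model_page_maybe_pinned(page);
}
```

In this model, a page that only carries one extra reference from a
short-term GUP user (refcount 2, say) is not copied early, even in an mm
with has_pinned set. That is the case I am asking about.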