Date: Mon, 21 Dec 2020 16:12:21 -0700
From: Yu Zhao
To: Peter Xu
Cc: Nadav Amit, Linus Torvalds, Andrea Arcangeli, linux-mm, lkml, Pavel Emelyanov, Mike Kravetz, Mike Rapoport, stable, Minchan Kim, Andy Lutomirski, Will Deacon, Peter Zijlstra
Subject: Re: [PATCH] mm/userfaultfd: fix memory corruption due to writeprotect
References: <20201221172711.GE6640@xz-x1> <76B4F49B-ED61-47EA-9BE4-7F17A26B610D@gmail.com> <9E301C7C-882A-4E0F-8D6D-1170E792065A@gmail.com> <1FCC8F93-FF29-44D3-A73A-DF943D056680@gmail.com> <20201221223041.GL6640@xz-x1>
In-Reply-To: <20201221223041.GL6640@xz-x1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 21, 2020 at 05:30:41PM -0500, Peter Xu wrote:
> On Mon, Dec 21, 2020 at 01:49:55PM -0800, Nadav Amit wrote:
> > BTW: In general, I think that you are right, and that changing of PTEs
> > should not require taking mmap_lock for write. However, I am not sure
> > cow_user_page() is not the only one that poses a problem and whether a more
> > systematic solution is needed. If cow_user_pages() is the only problem, do
> > you think it is possible to do the copying while holding the PTL? It works
> > for normal-pages, but I am not sure whether special-pages pose special
> > problems.
> >
> > Anyhow, this is an enhancement that we can try later.
>
> AFAIU mprotect() is the only one who modifies the pte using the mmap write
> lock. NUMA balancing is also using read mmap lock when changing pte
> protections,

NUMA balance doesn't clear pte_write() -- I would not call setting
pte_none() a change of protection.

> while my understanding is mprotect() used write lock only because
> it manipulates the address space itself (aka. vma layout) rather than modifying
> the ptes, so it needs to.

Yes, and personally, I would only take mmap lock for write when I
change VMAs, not PTE protections.

> At the pte level, it seems always to be the pgtable lock that serializes things.
>
> So it's perfectly legal to me for e.g. a driver to modify ptes with the read
> lock of mmap_sem, unless I'm severely mistaken.. as long as the pgtable lock is
> taken when doing so.
>
> If there's a driver that manipulated the ptes, changed the content of the page,
> recover the ptes to origin, and all these happen right after wp_page_copy()
> unlocked the pgtable lock but before wp_page_copy() retakes the same lock
> again, we may face the same issue finding that the page got copied contains
> corrupted data at last. While I don't know what to blame on the driver either
> because it seems to be exactly following the rules.
>
> I believe changing into write lock would solve the race here because tlb
> flushing would be guaranteed along the way, but I'm just a bit worried it's not
> the best way to go..

I can't say I disagree with you but the man has made the call and I
think we should just move on.
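
For illustration only, the interleaving Peter describes above can be replayed
in a small userspace model. This is a sketch under stated assumptions, not
kernel code: the Pte class, the 8-byte "page" and simulate_cow_race() are all
hypothetical names invented for the example, and the lock is modeled only by
the order of the steps. The fault path snapshots the PTE, copies the page
without the page table lock, then rechecks the PTE; the other actor changes
the PTE, writes the page and restores the PTE in between.

```python
from dataclasses import dataclass

PAGE_SIZE = 8  # tiny "page" so a torn copy is easy to see


@dataclass
class Pte:
    """Hypothetical stand-in for a page table entry."""
    pfn: int        # which page the entry maps
    writable: bool  # models the write-permission bit


def simulate_cow_race():
    """Replay one fixed interleaving of a CoW fault and a concurrent PTE user."""
    pages = {0: bytearray(b"A" * PAGE_SIZE)}  # pfn -> page contents
    pte = Pte(pfn=0, writable=False)          # write-protected mapping

    # Fault path, step 1: snapshot the PTE (as if under the page table lock),
    # then "drop the lock" and start copying the old page.
    orig_pte = Pte(pte.pfn, pte.writable)
    new_page = bytearray(PAGE_SIZE)
    half = PAGE_SIZE // 2
    new_page[:half] = pages[0][:half]

    # Concurrent actor (e.g. a driver holding only the page table lock):
    # make the PTE writable, change the page, restore the PTE.
    pte.writable = True
    pages[0][:] = b"B" * PAGE_SIZE
    pte.writable = False

    # Fault path, step 2: finish the copy, "retake the lock", recheck the PTE.
    new_page[half:] = pages[0][half:]
    pte_unchanged = (pte == orig_pte)         # models a pte-unchanged recheck
    return pte_unchanged, bytes(new_page)
```

In this model simulate_cow_race() returns (True, b"AAAABBBB"): the recheck
passes because the PTE was restored, yet the copy mixes old and new data. It
says nothing about which real code paths can produce this ordering.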