Message-ID: <4ad140b5-1d5b-2486-0893-7886a9cdfd76@redhat.com>
Date: Wed, 20 Jul 2022 21:55:35 +0200
To: Peter Xu
Cc: Nadav Amit, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrew Morton, Mike Rapoport, Axel Rasmussen, Andrea Arcangeli,
    Andrew Cooper, Andy Lutomirski, Dave Hansen, Peter Zijlstra,
    Thomas Gleixner, Will Deacon, Yu Zhao, Nick Piggin
References: <20220718120212.3180-1-namit@vmware.com>
    <20220718120212.3180-2-namit@vmware.com>
    <017facf0-7ef8-3faf-138d-3013a20b37db@redhat.com>
    <2b4393ce-95c9-dd3e-8495-058a139e771e@redhat.com>
    <69022bad-d6f1-d830-224d-eb8e5c90d5c7@redhat.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [RFC PATCH 01/14] userfaultfd: set dirty and young on writeprotect
X-Mailing-List: linux-kernel@vger.kernel.org

On 20.07.22 21:48, Peter Xu wrote:
> On Wed, Jul 20, 2022 at 09:33:35PM +0200, David Hildenbrand wrote:
>> On 20.07.22 21:15, Peter Xu wrote:
>>> On Wed, Jul 20, 2022 at 05:10:37PM +0200, David Hildenbrand wrote:
>>>> For pagecache pages it may as well be *plain wrong* to bypass the write
>>>> fault handler and simply mark pages dirty+map them writable.
>>>
>>> Could you elaborate?
>>
>> Write-fault handling for some filesystems (that even require this
>> "slow path") is a bit special.
>>
>> For example, do_shared_fault() might have to call page_mkwrite().
>>
>> AFAIK file systems use that for lazy allocation of disk blocks.
>> If you simply go ahead and map a !dirty pagecache page writable
>> and mark it dirty, it will not trigger page_mkwrite() and you might
>> end up corrupting data.
>>
>> That's why the old change_pte_range() code never touched
>> anything if the pte wasn't already dirty.
>
> I don't think that pte_dirty() check was for the pagecache code. For any fs
> that has page_mkwrite() defined, it'll already have vma_wants_writenotify()
> return 1, so we'll never try to add the write bit, hence we'll never even try
> to check pte_dirty().
>

I might be too tired, but the whole reason we had this magic in place
before my commit was the pagecache.

With vma_wants_writenotify()=0 you can directly map the pages writable
and don't have to do these advanced checks here. In a writable
MAP_SHARED VMA you'll already have pte_write().

We only get !pte_write() in case we have vma_wants_writenotify()=1 ...

  try_change_writable = vma_wants_writenotify(vma, vma->vm_page_prot);

and that's the code that checked the dirty bit after all to decide --
amongst other things -- if we can simply map it writable without going
via the write fault handler and triggering do_shared_fault().

See the crazy/ugly FOLL_FORCE code in GUP that similarly checks the dirty bit.

But yeah, it's all confusing so I might just be wrong regarding
pagecache pages.

-- 
Thanks,

David / dhildenb
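
To make the argument above concrete, here is a rough user-space sketch of the decision as I read it. It is a toy model, not kernel code: struct model_vma, struct model_pte, enum model_result and model_try_upgrade() are all hypothetical names made up for illustration. It only covers a writable shared file mapping, and it reduces the "amongst other things" checks to the dirty bit alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; these are NOT the kernel's types. */
struct model_vma {
	/* Models vma_wants_writenotify() returning non-zero. */
	bool wants_writenotify;
};

struct model_pte {
	bool write;	/* models pte_write() */
	bool dirty;	/* models pte_dirty() */
};

enum model_result {
	MODEL_ALREADY_WRITABLE,		/* nothing to upgrade */
	MODEL_MAP_WRITABLE_DIRECTLY,	/* may set the write bit right away */
	MODEL_NEEDS_WRITE_FAULT,	/* leave read-only, let the fault path run */
};

/*
 * Decide whether a PTE in a writable shared file mapping may be made
 * writable without taking a write fault. Assumption: the dirty bit is
 * the only extra condition modeled here; the real code checks more.
 */
static enum model_result model_try_upgrade(const struct model_vma *vma,
					   const struct model_pte *pte)
{
	assert(vma != NULL && pte != NULL);

	if (pte->write)
		return MODEL_ALREADY_WRITABLE;

	/* No write notification wanted: mapping writable is always fine. */
	if (!vma->wants_writenotify)
		return MODEL_MAP_WRITABLE_DIRECTLY;

	/*
	 * Write notification wanted: a dirty PTE implies the write fault
	 * path (and with it page_mkwrite()) has already run for this page.
	 */
	if (pte->dirty)
		return MODEL_MAP_WRITABLE_DIRECTLY;

	/* Clean page: the fs must get its page_mkwrite() call first. */
	return MODEL_NEEDS_WRITE_FAULT;
}
```

In this model, a clean read-only PTE in a VMA that wants write notification is the only case that falls back to the write fault handler, which is the case where skipping page_mkwrite() could hurt.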