From: Linus Torvalds
Date: Sun, 30 Oct 2022 11:19:36 -0700
Subject: Re: [PATCH 01/13] mm: Update ptep_get_lockless()s comment
To: Nadav Amit
Cc: Peter Zijlstra, Jann Horn, John Hubbard, X86 ML, Matthew Wilcox, Andrew Morton, kernel list, Linux-MM, Andrea Arcangeli, "Kirill A . Shutemov", jroedel@suse.de, ubizjak@gmail.com, Alistair Popple
In-Reply-To: <140B437E-B994-45B7-8DAC-E9B66885BEEF@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Oct 29, 2022 at 7:17 PM Nadav Amit wrote:
>
> Running the PoC on Linux 6.0.6 with these patches caused the following splat
> on the following line:
>
>         WARN_ON_ONCE(!folio_test_locked(folio) && !folio_test_dirty(folio));

Yeah, this is a sign of that "folio_mkclean() serializes with
folio_mark_dirty using rmap and the page table lock".

And page_remove_rmap() could *almost* be called later, but it does
have code that also depends on the page table lock, although it looks
like realistically that's just because it "knows" that means that
preemption is disabled, so it uses non-atomic statistics updates.

I say "knows" in quotes, because that's what the comment says, but it
turns out that __mod_node_page_state() has to deal with CONFIG_RT
anyway and does that

        preempt_disable_nested();
        ...
        preempt_enable_nested();

thing.

And then it wants to see the vma, although that's actually only to see
if it's 'mlock'ed, so we could just squirrel that away.

So we *could* move page_remove_rmap() later into the TLB flush region,
but then we would have lost the page table lock anyway, so then
folio_mkclean() can come in regardless.

So that doesn't even help.

End result: we do want to do the page_set_dirty() and the
remove_rmap() under the page table lock, because it's what serializes
folio_mkclean().

And we'd _like_ to do the TLB flush before the remove_rmap(), but we
*really* don't want to do that for every page.

So my current gut feel is that we should just say that if you do
MADV_DONTNEED or a munmap() (which includes the "re-mmap() over the
area" case) while some other thread is still writing to that memory
region, you may lose writes.

IOW, just accept the behavior that Nadav's test-program tries to show,
and say "look, you're doing insane things, we've never given you any
other semantics, it's your problem" to any user program that does
that.

If a user program does MADV_DONTNEED on an area that it is actively
using at the same time in another thread, that sounds really really
bogus. Same goes doubly for 'munmap()' or over-mapping.

              Linus
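
For readers who want to see what "you may lose writes" means in user-space terms, here is a minimal single-threaded sketch. It is not Nadav's PoC and does not reproduce the race. It only shows the documented Linux behavior that the race builds on: on a private anonymous mapping, MADV_DONTNEED discards the current contents, and the next access sees a zero-filled page. The helper name byte_after_dontneed() is hypothetical and exists only for this illustration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only (not from the thread).
 *
 * Stores 'value' into the first byte of a fresh private anonymous page,
 * discards the page with MADV_DONTNEED, then reads the byte back.
 * Returns the byte seen after the discard, or -1 on any failure.
 */
int byte_after_dontneed(unsigned char value)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	void *map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* volatile so the compiler really performs the store and the load */
	volatile unsigned char *p = map;
	p[0] = value;

	/* Drop the page: for private anonymous memory the data is gone. */
	if (madvise(map, (size_t)page, MADV_DONTNEED) != 0) {
		munmap(map, (size_t)page);
		return -1;
	}

	/* This access faults in a fresh zero page. */
	int seen = p[0];

	munmap(map, (size_t)page);
	return seen;
}
```

Calling byte_after_dontneed(0xAB) on Linux is expected to return 0, because the store made before the madvise() call is discarded. A second thread storing to the same page while madvise() runs has no way to know which side of the discard its store lands on, which is the usage pattern described above as the program's own problem.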