Date: Tue, 19 Jul 2022 17:37:14 +0200
From: Michal Hocko
To: David Hildenbrand
Cc: Charan Teja Kalla, akpm@linux-foundation.org, pasha.tatashin@soleen.com,
    sjpark@amazon.de, sieberf@amazon.com, shakeelb@google.com,
    dhowells@redhat.com, willy@infradead.org, vbabka@suse.cz,
    minchan@kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    "iamjoonsoo.kim@lge.com"
Subject: Re: [PATCH] mm: fix use-after free of page_ext after race with memory-offline

On Tue 19-07-22 17:19:34, David Hildenbrand wrote:
> On 18.07.22 16:54, Michal Hocko wrote:
> > On Mon 18-07-22 19:28:13, Charan Teja Kalla wrote:
[...]
> >>> 3) Change the design where the page_ext is valid as long as the struct
> >>> page is alive.
> >>
> >> :/ Doesn't spark joy.
> >
> > I would be wondering why. It should only take to move the callback to
> > happen at hotremove. So it shouldn't be very involved of a change. I can
> > imagine somebody would be relying on releasing resources when offlining
> > memory but is that really the case?
>
> Various reasons:
>
> 1) There was a discussion in the past to eventually also use rcu
> protection for handling pfn_to_online_page(). So doing it cleanly here
> is certainly an improvement.

Call me skeptical on that.

> 2) I really dislike having to scatter section online checks all over the
> place in page ext code. Once there is a difference between active vs.
> stale page ext data things get a bit messy and error prone. This is
> already ugly enough in our generic memmap handling code IMHO.

They should represent a free page in any case, so even if they are stale
they shouldn't be really dangerous, right?

> 3) Having on-demand allocations, such as KASAN or page ext from the
> memory online notifier is at least currently cleaner, because we don't
> have to handle each and every subsystem that hooks into that during the
> core memory hotadd/remove phase, which primarily only sets up the
> vmemmap, direct map and memory block devices.

Cannot this hook into __add_pages, which is the real implementation of
the arch-independent way to allocate the vmemmap? Or at the sparsemem
level, because we do not (and very likely won't) support memory hotplug
on any other memory model.

> Personally, I think what we have in this patch is quite nice and clean.
> But I won't object if it can be similarly done in a clean way from
> hot(un)plug code.

Well, if the scheme can be done without synchronize_rcu for each section,
which can backfire, and if the scheme doesn't add too much complexity to
achieve that, then sure, I won't object. I just do not get why page_ext
should have a different allocation lifetime expectancy than a real page.
Quite confusing if you ask me.
-- 
Michal Hocko
SUSE Labs
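
For illustration only, the lifetime question above can be modeled in a few
lines of userspace C. This is a rough sketch, not kernel code: every name in
it (toy_section, toy_hotadd, the two policies and so on) is hypothetical, and
it assumes only what the thread says, namely that page_ext is currently
allocated at memory online and released at offline, while the memmap lives
from hotadd to hotremove.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy model only: none of these names are kernel APIs. */
enum toy_ext_policy {
    TOY_EXT_FREE_AT_OFFLINE,   /* page_ext lives from online to offline */
    TOY_EXT_FREE_AT_HOTREMOVE  /* page_ext lives as long as the memmap */
};

struct toy_section {
    int *memmap;    /* stands in for the struct page array */
    int *page_ext;  /* stands in for the page_ext array */
    bool online;
};

static int *toy_alloc(void)
{
    int *p = calloc(4, sizeof(*p));

    if (!p)
        abort();
    return p;
}

/* Hotadd always creates the memmap; page_ext only under the second policy. */
void toy_hotadd(struct toy_section *s, enum toy_ext_policy policy)
{
    s->memmap = toy_alloc();
    s->page_ext = (policy == TOY_EXT_FREE_AT_HOTREMOVE) ? toy_alloc() : NULL;
    s->online = false;
}

/* Onlining allocates page_ext on demand if hotadd did not. */
void toy_online(struct toy_section *s)
{
    if (!s->page_ext)
        s->page_ext = toy_alloc();
    s->online = true;
}

/* Offlining drops page_ext only under the first policy. */
void toy_offline(struct toy_section *s, enum toy_ext_policy policy)
{
    if (policy == TOY_EXT_FREE_AT_OFFLINE) {
        free(s->page_ext);
        s->page_ext = NULL;
    }
    s->online = false;
}

/* Hotremove releases whatever is left, memmap included. */
void toy_hotremove(struct toy_section *s)
{
    free(s->page_ext);
    free(s->memmap);
    s->page_ext = NULL;
    s->memmap = NULL;
}

/* After offline the memmap still exists; report whether page_ext does too. */
bool toy_ext_survives_offline(enum toy_ext_policy policy)
{
    struct toy_section s;
    bool survives;

    toy_hotadd(&s, policy);
    toy_online(&s);
    toy_offline(&s, policy);
    survives = (s.memmap != NULL) && (s.page_ext != NULL);
    toy_hotremove(&s);
    return survives;
}
```

Under TOY_EXT_FREE_AT_OFFLINE there is a window between offline and hotremove
in which a struct page exists without its page_ext, so a lookup has to check
the section state or be otherwise protected. Under TOY_EXT_FREE_AT_HOTREMOVE
the two lifetimes coincide. The model says nothing about the locking or RCU
cost of either scheme.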