From: David Hildenbrand
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-hyperv@vger.kernel.org, virtualization@lists.linux.dev, xen-devel@lists.xenproject.org, kasan-dev@googlegroups.com, David Hildenbrand, Andrew Morton, Mike Rapoport, Oscar Salvador, "K. Y. Srinivasan", Haiyang Zhang, Wei Liu, Dexuan Cui, "Michael S. Tsirkin", Jason Wang, Xuan Zhuo, Eugenio Pérez, Juergen Gross, Stefano Stabellini, Oleksandr Tyshchenko, Alexander Potapenko, Marco Elver, Dmitry Vyukov
Subject: [PATCH v1 3/3] mm/memory_hotplug: skip adjust_managed_page_count() for PageOffline() pages when offlining
Date: Fri, 7 Jun 2024 11:09:38 +0200
Message-ID: <20240607090939.89524-4-david@redhat.com>
In-Reply-To: <20240607090939.89524-1-david@redhat.com>
References: <20240607090939.89524-1-david@redhat.com>

We currently have a hack for virtio-mem in place to handle memory
offlining with PageOffline pages for which we already adjusted the
managed page count.

Let's enlighten memory offlining code so we can get rid of that hack,
and document the situation.

Signed-off-by: David Hildenbrand
---
 drivers/virtio/virtio_mem.c    | 11 ++---------
 include/linux/memory_hotplug.h |  4 ++--
 include/linux/page-flags.h     |  8 ++++++--
 mm/memory_hotplug.c            |  6 +++---
 mm/page_alloc.c                | 12 ++++++++++--
 5 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index b90df29621c81..b0b8714415783 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1269,12 +1269,6 @@ static void virtio_mem_fake_offline_going_offline(unsigned long pfn,
 	struct page *page;
 	unsigned long i;
 
-	/*
-	 * Drop our reference to the pages so the memory can get offlined
-	 * and add the unplugged pages to the managed page counters (so
-	 * offlining code can correctly subtract them again).
-	 */
-	adjust_managed_page_count(pfn_to_page(pfn), nr_pages);
 	/* Drop our reference to the pages so the memory can get offlined. */
 	for (i = 0; i < nr_pages; i++) {
 		page = pfn_to_page(pfn + i);
@@ -1293,10 +1287,9 @@ static void virtio_mem_fake_offline_cancel_offline(unsigned long pfn,
 	unsigned long i;
 
 	/*
-	 * Get the reference we dropped when going offline and subtract the
-	 * unplugged pages from the managed page counters.
+	 * Get the reference again that we dropped via page_ref_dec_and_test()
+	 * when going offline.
 	 */
-	adjust_managed_page_count(pfn_to_page(pfn), -nr_pages);
 	for (i = 0; i < nr_pages; i++)
 		page_ref_inc(pfn_to_page(pfn + i));
 }
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7a9ff464608d7..ebe876930e782 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -175,8 +175,8 @@ extern int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
 extern void mhp_deinit_memmap_on_memory(unsigned long pfn, unsigned long nr_pages);
 extern int online_pages(unsigned long pfn, unsigned long nr_pages,
 			struct zone *zone, struct memory_group *group);
-extern void __offline_isolated_pages(unsigned long start_pfn,
-	unsigned long end_pfn);
+extern unsigned long __offline_isolated_pages(unsigned long start_pfn,
+		unsigned long end_pfn);
 
 typedef void (*online_page_callback_t)(struct page *page, unsigned int order);
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index e0362ce7fc109..0876aca0833e7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -1024,11 +1024,15 @@ PAGE_TYPE_OPS(Buddy, buddy, buddy)
  * putting them back to the buddy, it can do so via the memory notifier by
  * decrementing the reference count in MEM_GOING_OFFLINE and incrementing the
  * reference count in MEM_CANCEL_OFFLINE. When offlining, the PageOffline()
- * pages (now with a reference count of zero) are treated like free pages,
- * allowing the containing memory block to get offlined. A driver that
+ * pages (now with a reference count of zero) are treated like free (unmanaged)
+ * pages, allowing the containing memory block to get offlined. A driver that
  * relies on this feature is aware that re-onlining the memory block will
  * require not giving them to the buddy via generic_online_page().
  *
+ * Memory offlining code will not adjust the managed page count for any
+ * PageOffline() pages, treating them like they were never exposed to the
+ * buddy using generic_online_page().
+ *
  * There are drivers that mark a page PageOffline() and expect there won't be
  * any further access to page content. PFN walkers that read content of random
  * pages should check PageOffline() and synchronize with such drivers using
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 0254059efcbe1..965707a02556f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1941,7 +1941,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 			struct zone *zone, struct memory_group *group)
 {
 	const unsigned long end_pfn = start_pfn + nr_pages;
-	unsigned long pfn, system_ram_pages = 0;
+	unsigned long pfn, managed_pages, system_ram_pages = 0;
 	const int node = zone_to_nid(zone);
 	unsigned long flags;
 	struct memory_notify arg;
@@ -2062,7 +2062,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	} while (ret);
 
 	/* Mark all sections offline and remove free pages from the buddy. */
-	__offline_isolated_pages(start_pfn, end_pfn);
+	managed_pages = __offline_isolated_pages(start_pfn, end_pfn);
 	pr_debug("Offlined Pages %ld\n", nr_pages);
 
 	/*
@@ -2078,7 +2078,7 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 	zone_pcp_enable(zone);
 
 	/* removal success */
-	adjust_managed_page_count(pfn_to_page(start_pfn), -nr_pages);
+	adjust_managed_page_count(pfn_to_page(start_pfn), -managed_pages);
 	adjust_present_page_count(pfn_to_page(start_pfn), group, -nr_pages);
 
 	/* reinitialise watermarks and update pcp limits */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 039bc52cc9091..809bc4a816e85 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6745,14 +6745,19 @@ void zone_pcp_reset(struct zone *zone)
 /*
  * All pages in the range must be in a single zone, must not contain holes,
  * must span full sections, and must be isolated before calling this function.
+ *
+ * Returns the number of managed (non-PageOffline()) pages in the range: the
+ * number of pages for which memory offlining code must adjust managed page
+ * counters using adjust_managed_page_count().
  */
-void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
+unsigned long __offline_isolated_pages(unsigned long start_pfn,
+		unsigned long end_pfn)
 {
+	unsigned long already_offline = 0, flags;
 	unsigned long pfn = start_pfn;
 	struct page *page;
 	struct zone *zone;
 	unsigned int order;
-	unsigned long flags;
 
 	offline_mem_sections(pfn, end_pfn);
 	zone = page_zone(pfn_to_page(pfn));
@@ -6774,6 +6779,7 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 		if (PageOffline(page)) {
 			BUG_ON(page_count(page));
 			BUG_ON(PageBuddy(page));
+			already_offline++;
 			pfn++;
 			continue;
 		}
@@ -6786,6 +6792,8 @@ void __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 		pfn += (1 << order);
 	}
 	spin_unlock_irqrestore(&zone->lock, flags);
+
+	return end_pfn - start_pfn - already_offline;
 }
 #endif
 
-- 
2.45.1
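
Illustration only, not part of the patch: the accounting change can be
modeled in user space. The sketch below assumes a hypothetical
struct model_page with just two fields, and it ignores buddy orders, zone
locking and section handling. The names model_offline_range() and
model_managed_after_offline() are made up for this example and do not exist
in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct model_page {
	bool offline;	/* models PageOffline() */
	int refcount;	/* models page_count() */
};

/*
 * Models the value __offline_isolated_pages() returns with this patch:
 * the number of pages in the range minus those already PageOffline().
 */
static unsigned long model_offline_range(const struct model_page *pages,
		unsigned long nr_pages)
{
	unsigned long already_offline = 0;
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		if (pages[i].offline) {
			/* Mirrors BUG_ON(page_count(page)) in the real code. */
			assert(pages[i].refcount == 0);
			already_offline++;
		}
	}
	return nr_pages - already_offline;
}

/*
 * Models the managed page counter update in offline_pages(): only the
 * pages that were still managed are subtracted, not the whole range.
 */
static long model_managed_after_offline(long managed_before,
		const struct model_page *pages, unsigned long nr_pages)
{
	return managed_before - (long)model_offline_range(pages, nr_pages);
}
```

For a range of four pages of which two are PageOffline(),
model_offline_range() yields 2, so a managed count of 10 becomes 8. The
driver no longer has to add the unplugged pages back beforehand.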