From: Yang Shi
Date: Fri, 11 Mar 2022 11:08:32 -0800
Subject: Re: [PATCH] mm/khugepaged: sched to numa node when collapse huge page
To: Bibo Mao
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
In-Reply-To: <20220311090119.2412738-1-maobibo@loongson.cn>

On Fri, Mar 11, 2022 at 1:01 AM Bibo Mao wrote:
>
> collapse huge page is slow, specially when khugepaged daemon runs
> on different numa node with that of huge page. It suffers from
> huge page copying across nodes, also cache is not used for target
> node. With this patch, khugepaged daemon switches to the same numa
> node with huge page. It saves copying time and makes use of local
> cache better.
>
> Signed-off-by: Bibo Mao
> ---
>  mm/khugepaged.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 131492fd1148..460c285dc974 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -116,6 +116,7 @@ struct khugepaged_scan {
>  	struct list_head mm_head;
>  	struct mm_slot *mm_slot;
>  	unsigned long address;
> +	int node;
>  };
>
>  static struct khugepaged_scan khugepaged_scan = {
> @@ -1066,6 +1067,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	struct vm_area_struct *vma;
>  	struct mmu_notifier_range range;
>  	gfp_t gfp;
> +	const struct cpumask *cpumask;
>
>  	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> @@ -1079,6 +1081,13 @@ static void collapse_huge_page(struct mm_struct *mm,
>  	 * that. We will recheck the vma after taking it again in write mode.
>  	 */
>  	mmap_read_unlock(mm);
> +
> +	/* sched to specified node before huage page memory copy */
> +	cpumask = cpumask_of_node(node);
> +	if ((khugepaged_scan.node != node) && !cpumask_empty(cpumask)) {
> +		set_cpus_allowed_ptr(current, cpumask);
> +		khugepaged_scan.node = node;

What if khugepaged is scheduled onto another node after this, but
khugepaged_scan.node still equals node? That seems possible to me, IIUC.

TBH I'm not sure that migrating khugepaged is really worth it for
everyone. In the worst case the base pages have no obvious locality,
for example they may be spread across all nodes, so you always get a
cross-node memory copy. And khugepaged may get slower if the CPU is
contended.

In addition, I saw that MIPS has its own copy_user_highpage(). Is that
a contributing factor too?

> +	}
>  	new_page = khugepaged_alloc_page(hpage, gfp, node);
>  	if (!new_page) {
>  		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
> @@ -2380,6 +2389,7 @@ int start_stop_khugepaged(void)
>  		kthread_stop(khugepaged_thread);
>  		khugepaged_thread = NULL;
>  	}
> +	khugepaged_scan.node = NUMA_NO_NODE;
>  	set_recommended_min_free_kbytes();
>  fail:
>  	mutex_unlock(&khugepaged_mutex);
> --
> 2.31.1
>
>
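
The stale-cache question raised in the reply can be illustrated with a toy user-space model. This is only a sketch and not kernel code: `toy_thread`, `toy_scan`, `pin_if_cache_differs()`, `pin_if_thread_differs()` and `run_scenario()` are hypothetical names, and the "external affinity change" step is an assumption standing in for whatever might move the thread after it was pinned. The first helper mirrors the patch's comparison against the cached node; the second is one possible alternative that compares against where the thread actually is.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_NODE (-1)

/* Toy stand-in for the khugepaged thread: the node it is restricted to. */
struct toy_thread {
	int allowed_node;
};

/* Toy stand-in for the khugepaged_scan.node field added by the patch. */
struct toy_scan {
	int cached_node;
};

/* Mirrors the patch's check: re-pin only when the cached node differs. */
static bool pin_if_cache_differs(struct toy_thread *t, struct toy_scan *s,
				 int node)
{
	if (s->cached_node != node) {
		t->allowed_node = node;
		s->cached_node = node;
		return true;
	}
	return false;
}

/* Hypothetical alternative: compare against the thread's actual node. */
static bool pin_if_thread_differs(struct toy_thread *t, struct toy_scan *s,
				  int node)
{
	if (t->allowed_node != node) {
		t->allowed_node = node;
		s->cached_node = node;
		return true;
	}
	return false;
}

/*
 * Two collapses target node 1, with an assumed external affinity change
 * in between that does not update the cache. Returns the node the thread
 * ends up on when the second copy would start.
 */
static int run_scenario(bool (*pin)(struct toy_thread *, struct toy_scan *,
				    int))
{
	struct toy_thread t = { .allowed_node = 0 };
	struct toy_scan s = { .cached_node = NO_NODE };

	assert(pin);
	pin(&t, &s, 1);		/* first collapse: thread moves to node 1 */
	t.allowed_node = 0;	/* assumed: something moves it back to node 0 */
	pin(&t, &s, 1);		/* second collapse: same target node */
	return t.allowed_node;
}
```

In this model, `run_scenario(pin_if_cache_differs)` leaves the thread on node 0 for the second copy because the cache still says node 1, while `run_scenario(pin_if_thread_differs)` moves it back to node 1. Whether such an external change can actually happen to khugepaged in the kernel is exactly the open question in the reply.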