Date: Fri, 13 May 2022 12:38:05 -0700
From: Andrew Morton
To: Mel Gorman
Cc: Nicolas Saenz Julienne, Marcelo Tosatti, Vlastimil Babka,
    Michal Hocko, LKML, Linux-MM
Subject: Re: [PATCH 0/6] Drain remote per-cpu directly v3
Message-Id: <20220513123805.41e560392d028c271b36847d@linux-foundation.org>
In-Reply-To: <20220513142330.GI3441@techsingularity.net>
References: <20220512085043.5234-1-mgorman@techsingularity.net>
    <20220512124325.751781bb88ceef5c37ca653e@linux-foundation.org>
    <20220513142330.GI3441@techsingularity.net>

On Fri, 13 May 2022 15:23:30 +0100 Mel Gorman wrote:

> Correct.
> 
> > > the draining in non-deterministic.
> > 
> > s/n/s/;)
> 
> Think that one is ok.  At least spell check did not complain.
s/in/is/

> > > Currently an IRQ-safe local_lock protects the page allocator per-cpu lists.
> > > The local_lock on its own prevents migration and the IRQ disabling protects
> > > from corruption due to an interrupt arriving while a page allocation is
> > > in progress. The locking is inherently unsafe for remote access unless
> > > the CPU is hot-removed.
> > 
> > I don't understand the final sentence here.  Which CPU and why does
> > hot-removing it make the locking safe?
> 
> The sentence can be dropped because it adds little and is potentially
> confusing. The PCP being safe to access remotely is specific to the
> context of the CPU being hot-removed and there are other special corner
> cases like zone_pcp_disable that modifies a per-cpu structure remotely
> but not in a way that causes corruption.

OK.  I pasted in your para from the other email.  Current 0/n blurb:

Some setups, notably NOHZ_FULL CPUs, may be running realtime or
latency-sensitive applications that cannot tolerate interference due to
per-cpu drain work queued by __drain_all_pages(). Introduce a new
mechanism to remotely drain the per-cpu lists. It is made possible by
remotely locking 'struct per_cpu_pages' new per-cpu spinlocks. This has
two advantages: the time to drain is more predictable, and unrelated
tasks are not interrupted.

This series has the same intent as Nicolas' series "mm/page_alloc:
Remote per-cpu lists drain support" -- avoid interfering with a
high-priority task due to a workqueue item draining per-cpu page lists.
While many workloads can tolerate a brief interruption, it may cause a
real-time task running on a NOHZ_FULL CPU to miss a deadline, and at a
minimum the draining is non-deterministic.

Currently an IRQ-safe local_lock protects the page allocator per-cpu
lists. The local_lock on its own prevents migration and the IRQ
disabling protects from corruption due to an interrupt arriving while a
page allocation is in progress.

This series adjusts the locking. A spinlock is added to struct
per_cpu_pages to protect the list contents while local_lock_irq
continues to prevent migration and IRQ reentry. This allows a remote
CPU to safely drain a remote per-cpu list.

This series is a partial series. Follow-on work should allow the
local_irq_save to be converted to a local_irq to avoid IRQs being
disabled/enabled in most cases. Consequently, there are some TODO
comments highlighting the places that would change if local_irq were
used. However, there are enough corner cases that it deserves a series
of its own, separated by one kernel release, and the priority right now
is to avoid interference with high-priority tasks.

Patch 1 is a cosmetic patch to clarify when page->lru is storing buddy
pages and when it is storing per-cpu pages.

Patch 2 shrinks per_cpu_pages to make room for a spin lock. Strictly
speaking this is not necessary, but it avoids per_cpu_pages consuming
another cache line.

Patch 3 is a preparation patch to avoid code duplication.

Patch 4 is a simple micro-optimisation that improves the code flow
needed for a later patch to avoid code duplication.

Patch 5 uses a spin_lock to protect the per_cpu_pages contents while
still relying on local_lock to prevent migration, stabilise the pcp
lookup and prevent IRQ reentrancy.

Patch 6 drains remote per-cpu pages directly instead of using a
workqueue.
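
As a rough illustration only, the locking split the blurb describes can be
sketched as below. This is not the actual mm/page_alloc.c diff from the
series: pcp_local_lock and the *_sketch helpers are invented stand-in names,
and pcp->lock stands for the spinlock the blurb says is added to struct
per_cpu_pages.

#include <linux/local_lock.h>
#include <linux/spinlock.h>
#include <linux/percpu.h>
#include <linux/mmzone.h>

/* Stand-in for the allocator's existing per-CPU local_lock. */
static DEFINE_PER_CPU(local_lock_t, pcp_local_lock) =
        INIT_LOCAL_LOCK(pcp_local_lock);

/* Invented helpers standing in for the real pcp list manipulation. */
struct page *remove_page_from_pcp_list_sketch(struct per_cpu_pages *pcp);
void free_pcp_pages_to_buddy_sketch(struct zone *zone,
                                    struct per_cpu_pages *pcp);

/*
 * Local fast path (patch 5): local_lock_irqsave still pins the task to
 * this CPU and blocks IRQ re-entry, while the new spinlock protects the
 * list contents themselves.
 */
static struct page *pcp_alloc_sketch(struct zone *zone)
{
        struct per_cpu_pages *pcp;
        struct page *page;
        unsigned long flags;

        local_lock_irqsave(&pcp_local_lock, flags);
        pcp = this_cpu_ptr(zone->per_cpu_pageset);

        spin_lock(&pcp->lock);          /* IRQs already disabled here */
        page = remove_page_from_pcp_list_sketch(pcp);
        spin_unlock(&pcp->lock);

        local_unlock_irqrestore(&pcp_local_lock, flags);
        return page;
}

/*
 * Remote drain (patch 6): because the list contents are protected by the
 * spinlock, another CPU can take that lock and free the pages back to the
 * buddy allocator directly. IRQs are disabled locally so the lock is taken
 * IRQ-safe everywhere.
 */
static void drain_remote_pcp_sketch(struct zone *zone, int cpu)
{
        struct per_cpu_pages *pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
        unsigned long flags;

        spin_lock_irqsave(&pcp->lock, flags);
        free_pcp_pages_to_buddy_sketch(zone, pcp);
        spin_unlock_irqrestore(&pcp->lock, flags);
}

The point, per the blurb, is that the second path never schedules work on the
target CPU, so a NOHZ_FULL CPU running a latency-sensitive task is not
interrupted when its lists are drained.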
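
In the same hedged spirit, the "drain directly instead of queueing work" idea
from patch 6 reduces to walking the CPUs and performing a remote drain like
the one above for each. drain_all_sketch is an invented name; the real
__drain_all_pages() carries more logic (which CPUs and zones to target, CPU
hotplug protection, and so on).

#include <linux/cpumask.h>

/*
 * Sketch only: the old scheme queued drain work on every CPU via a
 * workqueue and waited for it to run; the new scheme empties each remote
 * list directly while the owning CPU keeps running undisturbed.
 */
static void drain_all_sketch(struct zone *zone)
{
        int cpu;

        for_each_online_cpu(cpu)
                drain_remote_pcp_sketch(zone, cpu);
}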