Date: Fri, 9 Apr 2021 08:59:39 +0100
From: Mel Gorman
To: Peter Zijlstra
Cc: Linux-MM, Linux-RT-Users, LKML, Chuck Lever, Jesper Dangaard Brouer,
	Matthew Wilcox, Thomas Gleixner, Ingo Molnar, Michal Hocko,
	Oscar Salvador
Subject: Re: [PATCH 02/11] mm/page_alloc: Convert per-cpu list protection to local_lock
Message-ID: <20210409075939.GJ3697@techsingularity.net>
References: <20210407202423.16022-1-mgorman@techsingularity.net>
	<20210407202423.16022-3-mgorman@techsingularity.net>
	<20210408174244.GG3697@techsingularity.net>
On Fri, Apr 09, 2021 at 08:39:45AM +0200, Peter Zijlstra wrote:
> On Thu, Apr 08, 2021 at 06:42:44PM +0100, Mel Gorman wrote:
> > On Thu, Apr 08, 2021 at 12:52:07PM +0200, Peter Zijlstra wrote:
> > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > index a68bacddcae0..e9e60d1a85d4 100644
> > > > --- a/mm/page_alloc.c
> > > > +++ b/mm/page_alloc.c
> > > > @@ -112,6 +112,13 @@ typedef int __bitwise fpi_t;
> > > >  static DEFINE_MUTEX(pcp_batch_high_lock);
> > > >  #define MIN_PERCPU_PAGELIST_FRACTION	(8)
> > > > 
> > > > +struct pagesets {
> > > > +	local_lock_t lock;
> > > > +};
> > > > +static DEFINE_PER_CPU(struct pagesets, pagesets) = {
> > > > +	.lock = INIT_LOCAL_LOCK(lock),
> > > > +};
> > > 
> > > So why isn't the local_lock_t in struct per_cpu_pages ? That seems to be
> > > the actual object that is protected by it and is already per-cpu.
> > > 
> > > Is that because you want to avoid the duplication across zones? Is that
> > > worth the effort?
> > 
> > When I wrote the patch, the problem was that zone_pcp_reset freed the
> > per_cpu_pages structure and it was "protected" by local_irq_save(). If
> > that was converted to local_lock_irq then the structure containing the
> > lock is freed before it is released which is obviously bad.
> > 
> > Much later when trying to make the allocator RT-safe in general, I realised
> > that locking was broken and fixed it in patch 3 of this series. With that,
> > the local_lock could potentially be embedded within per_cpu_pages safely
> > at the end of this series.
> 
> Fair enough; I was just wondering why the obvious solution wasn't chosen
> and neither changelog nor comment explain, so I had to ask :-)

It's a fair question and it was my first approach before I hit problems.
Thinking again this morning, I remembered that another problem I hit was
patterns like this

	local_lock_irqsave(&pagesets.lock, flags);
	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);

turning into

	cpu = get_cpu();
	pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
	local_lock_irqsave(&pcp->lock, flags);

That has its own problems if zone->lock was acquired within the
local_lock_irqsave section (see the section "spinlock_t and rwlock_t" in
Documentation/locking/locktypes.rst), so it has to turn into

	migrate_disable();
	pcp = this_cpu_ptr(zone->per_cpu_pageset);
	local_lock_irqsave(&pcp->lock, flags);

I did not want to start adding migrate_disable() in multiple places like
this because I'm guessing that new users of migrate_disable() need strong
justification and adding such code in page_alloc.c might cause cargo-cult
copy&paste in the future. Maybe it could be addressed with a helper like
this_cpu_local_lock or this_cpu_local_lock_irq, but that means in some
cases that the PCP structure is looked up twice with patterns like this one

	local_lock_irqsave(&pagesets.lock, flags);
	free_unref_page_commit(page, pfn, migratetype);
	local_unlock_irqrestore(&pagesets.lock, flags);

To get around multiple lookups, the helper becomes something that disables
migration, looks up the PCP structure, locks it and returns it, with pcp
then passed around as appropriate (a rough sketch of the idea is below).
Not sure what I would call that helper :P

In the end I just gave up and kept it simple, as there is no benefit to
!PREEMPT_RT which just disables IRQs. Maybe it'll be worth considering
when PREEMPT_RT is upstream and can be enabled.
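For illustration only, such a helper might look something like the sketch
below. The names pcp_lock_irqsave/pcp_unlock_irqrestore and their exact
shape are invented for this mail and are not part of the series; it simply
wraps the migrate_disable() + this_cpu_ptr() + local_lock_irqsave() pattern
above, assuming the local_lock_t were embedded in struct per_cpu_pages, so
the pcp structure is only looked up once:

	/*
	 * Hypothetical helpers, illustrative only: disable migration so the
	 * CPU cannot change between the lookup and the lock, look up this
	 * CPU's per_cpu_pages and take its lock, then return pcp so callers
	 * do not have to look it up a second time.
	 */
	static inline struct per_cpu_pages *pcp_lock_irqsave(struct zone *zone,
							      unsigned long *flags)
	{
		struct per_cpu_pages *pcp;

		migrate_disable();
		pcp = this_cpu_ptr(zone->per_cpu_pageset);
		local_lock_irqsave(&pcp->lock, *flags);

		return pcp;
	}

	static inline void pcp_unlock_irqrestore(struct per_cpu_pages *pcp,
						 unsigned long flags)
	{
		local_unlock_irqrestore(&pcp->lock, flags);
		migrate_enable();
	}

Callers would then do pcp = pcp_lock_irqsave(zone, &flags); ...;
pcp_unlock_irqrestore(pcp, flags); instead of open-coding the lookup and
lock at every site.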
The series was functionally tested on the PREEMPT_RT tree by reverting
the page_alloc.c patch and applying this series and all of its
prerequisites on top.

-- 
Mel Gorman
SUSE Labs