Subject: Re: [PATCH] mm, page_alloc: capture page in task context only
From: Vlastimil Babka
To: Hugh Dickins
Cc: Mel Gorman, Andrew Morton, Li Wang, Alex Shi, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Date: Tue, 16 Jun 2020 09:45:38 +0200
References: <01287865-153d-42e7-afd8-1178ec6bc5b9@suse.cz>

On 6/15/20 11:03 PM, Hugh Dickins wrote:
> On Fri, 12 Jun 2020, Vlastimil Babka wrote:
>> > This could presumably be fixed by a barrier() before setting
>> > current->capture_control in compact_zone_order(); but would also need
>> > more care on return from compact_zone(), in order not to risk leaking
>> > a page captured by interrupt just before
>> > capture_control is reset.
>>
>> I was hoping a WRITE_ONCE(current->capture_control) would be enough,
>> but apparently it's not (I tried).
>
> Right, I don't think volatiles themselves actually constitute barriers;
> but I'd better keep quiet, I notice the READ_ONCE/WRITE_ONCE/data_race
> industry has been busy recently, and I'm likely out-of-date and mistaken.

Same here, but from what I've read, volatile accesses are ordered only against
other volatile accesses, not against non-volatile ones (and the struct
initialization is non-volatile). So barrier() is indeed necessary, and
WRITE_ONCE is there just to prevent (very hypothetical, hopefully) store
tearing.

>>
>> > Maybe that is the preferable fix, but I felt safer for task_capc() to
>> > exclude the rather surprising possibility of capture at interrupt time.
>>
>> > Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction")
>> > Cc: stable@vger.kernel.org # 5.1+
>> > Signed-off-by: Hugh Dickins
>>
>> Acked-by: Vlastimil Babka
>
> Thanks, and to Mel for his.
>
>>
>> But perhaps I would also make sure that we don't expose the half initialized
>> capture_control and run into this problem again later. It's not like this is a
>> fast path where barriers hurt. Something like this then? (with added comments)
>
> Would it be very rude if I leave that to you and to Mel? to add, or

No problem.

> to replace mine if you wish - go ahead. I can easily see that more
> sophistication at the compact_zone_order() end may be preferable to
> another test and branch inside __free_one_page()

Right, I think so, and I will also generally sleep better if we don't store
pointers to uninitialized structures in current.

> (and would task_capc()
> be better with an "unlikely" in it?).

I'll try and see if it generates better code. We should also be able to remove
the "capc->cc->direct_compaction" check, as the only place where we set capc is
compact_zone_order(), which sets direct_compaction to true unconditionally.
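
To make the two task_capc() ideas above concrete (an "unlikely" hint, and dropping the direct_compaction test), here is a rough userspace sketch. It is not the kernel function: the struct layouts, the in_task_context flag and the name task_capc_sketch() are assumptions made up for illustration, and unlikely() is modelled with the GCC/Clang __builtin_expect builtin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's branch hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical, heavily reduced versions of the structures involved. */
struct zone { int id; };
struct page { int id; };
struct compact_control { struct zone *zone; };
struct capture_control {
	struct compact_control *cc;
	struct page *page;
};

/* Hypothetical stand-in for the relevant parts of "current". */
struct task {
	struct capture_control *capture_control;
	bool in_task_context;	/* assumption: models "not in an interrupt" */
};

/*
 * Return the capture control if a freed page in @zone may be captured
 * for task @t, NULL otherwise.
 */
static struct capture_control *task_capc_sketch(const struct task *t,
						const struct zone *zone)
{
	struct capture_control *capc;

	assert(t && zone);
	capc = t->capture_control;

	/*
	 * No direct_compaction test here: in this model only the direct
	 * compaction path ever sets capture_control, so a non-NULL pointer
	 * already implies it.
	 */
	if (unlikely(capc) && t->in_task_context &&
	    !capc->page && capc->cc->zone == zone)
		return capc;

	return NULL;
}
```

Whether the hint actually produces better code would still have to be checked in the generated assembly, as said above; the sketch only shows the shape of the condition.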

> But it seems unnecessary to have a fix at both ends, and I'm rather too
> wound up in other things at the moment, to want to read up on the current
> state of such barriers, and sign off on the Vlastipatch below myself (but
> I do notice that READ_ONCE seems to have more in it today than I remember,
> which probably accounts for why you did not put the barrier() I expected
> to see on the way out).

Right, minimally it's a volatile cast (I've checked 5.1 too, for stable reasons)
which should be enough.

So I'll send the proper patch. Thanks!
Vlastimil

> Hugh
>
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index fd988b7e5f2b..c89e26817278 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -2316,15 +2316,17 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
>>  		.page = NULL,
>>  	};
>>  
>> -	current->capture_control = &capc;
>> +	barrier();
>> +
>> +	WRITE_ONCE(current->capture_control, &capc);
>>  
>>  	ret = compact_zone(&cc, &capc);
>>  
>>  	VM_BUG_ON(!list_empty(&cc.freepages));
>>  	VM_BUG_ON(!list_empty(&cc.migratepages));
>>  
>> -	*capture = capc.page;
>> -	current->capture_control = NULL;
>> +	WRITE_ONCE(current->capture_control, NULL);
>> +	*capture = READ_ONCE(capc.page);
>>  
>>  	return ret;
>>  }
>
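
For readers who want to play with the publish/unpublish ordering from the quoted diff outside the kernel, here is a minimal userspace sketch. Everything in it is an assumption for illustration: the macros are simplified stand-ins (a compiler-only barrier via GCC/Clang inline asm, and plain volatile casts) rather than the kernel's definitions, current_task replaces "current", and fake_interrupt_free() merely simulates a page being freed in interrupt context on the same CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the kernel's real macros are more elaborate. */
#define barrier()        __asm__ __volatile__("" ::: "memory")
#define WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))
#define READ_ONCE(x)     (*(volatile __typeof__(x) *)&(x))

struct page { int id; };			/* hypothetical placeholder */
struct capture_control { struct page *page; };
struct task { struct capture_control *capture_control; };

static struct task current_task;		/* stands in for "current" */

/* Simulates the free path running in an interrupt on the same CPU. */
static void fake_interrupt_free(struct page *freed)
{
	struct capture_control *capc = READ_ONCE(current_task.capture_control);

	if (capc && !capc->page)
		capc->page = freed;
}

static struct page *compact_zone_order_sketch(struct page *freed)
{
	struct capture_control capc = { .page = NULL };
	struct page *captured;

	/* Nested use is not expected in this model. */
	assert(READ_ONCE(current_task.capture_control) == NULL);

	/*
	 * Compiler barrier: keep the (non-volatile) initialization of capc
	 * from being moved past the store that publishes its address.
	 */
	barrier();
	WRITE_ONCE(current_task.capture_control, &capc);

	fake_interrupt_free(freed);	/* window in which capture may happen */

	/* Unpublish first, then read the result, so nothing is captured late. */
	WRITE_ONCE(current_task.capture_control, NULL);
	captured = READ_ONCE(capc.page);

	fake_interrupt_free(freed);	/* unpublished: must not capture */

	return captured;
}
```

Calling compact_zone_order_sketch(&some_page) returns &some_page and leaves current_task.capture_control set to NULL. Being single-threaded, the sketch cannot demonstrate the reordering itself; it only shows the order of operations the diff is after.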