Received: by 2002:a05:6902:102b:0:0:0:0 with SMTP id x11csp2275392ybt; Tue, 16 Jun 2020 01:32:17 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwQLAJ9Vaw5Mx67Gfk/7FEFeNpKNlX5xnW3OK/Mq1s2PBxvDdk433mD2aslOjyruti3D3w1 X-Received: by 2002:a17:906:b845:: with SMTP id ga5mr1697819ejb.300.1592296337111; Tue, 16 Jun 2020 01:32:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1592296337; cv=none; d=google.com; s=arc-20160816; b=Ua6jU3wtFRzDHVxfNgb6ZYGRzsji6MVVQP6xhgTiKba0fVBClXPbArvg2NbN8Ek3Pc 4YZ2/pMDNCyGsKN1JXtZV9VxX8DB/xdSk0tUUCcHMDHjKpCLgAxtdnnyDUCXCOsgH6kG lnr0jP1pIMsVw+WZMkBSlGchO4PbGFR/UlD7rneXQDtVDOcNtMq7ayv244QbnGh3kOI/ SZexlh4/ONIvpBeZZo/BUXvo8CJ6SGO+V5coPNBz8jFLNrGRCV4N25/ADAZORoJgQSL9 kFSnWOTIGOolY+BhohXSVI3W3TmZeuHhIDKDHsbzJdylLo8S7nvNHSj3mxAiH4gvsdcO pHvw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=SppPeIbhB24xYFoYRyklQqYI3k4K12UeO9APG24u/xc=; b=g22wEzQiJznRSXgoHorHnNw1RMCspki0HQqzDP5sOzVQrcADtMGO2GAugCb2/2OhEl 6G53UlnRmkUrkZ022dulIKWAo8+Pd2KEtQnJ0jcWnXIigkavdAhilOPvlb5FNgFho284 HwAU43d/Sz/FIeeRmnhZUd/WeQxqN7ledqBYSwy5KOh5hrTeO9obBphQeUt5MQl7gco3 FQw9awLXtJTOkuRgo6vrb4wDjQEVciA/3RipYMH8j0k+u9LOytx63vuJ+5NSnxhduJ2v Q7fERnE9oRVrxn5JT9DMo4HlI/By6tJj1hcToGmIScesBUyIynDg4ytdsDtGYDddMVcr RG/w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id du15si14314221ejc.591.2020.06.16.01.31.55; Tue, 16 Jun 2020 01:32:17 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727935AbgFPI1N (ORCPT + 99 others); Tue, 16 Jun 2020 04:27:13 -0400 Received: from mx2.suse.de ([195.135.220.15]:49366 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726467AbgFPI1N (ORCPT ); Tue, 16 Jun 2020 04:27:13 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 1F6DFAD7B; Tue, 16 Jun 2020 08:27:15 +0000 (UTC) From: Vlastimil Babka To: vbabka@suse.cz Cc: akpm@linux-foundation.org, alex.shi@linux.alibaba.com, hughd@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, liwang@redhat.com, mgorman@techsingularity.net, stable@vger.kernel.org Subject: [PATCH 1/2] mm, compaction: make capture control handling safe wrt interrupts Date: Tue, 16 Jun 2020 10:26:48 +0200 Message-Id: <20200616082649.27173-1-vbabka@suse.cz> X-Mailer: git-send-email 2.27.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hugh reports: ===== While stressing compaction, one run oopsed on NULL capc->cc in __free_one_page()'s task_capc(zone): compact_zone_order() had been interrupted, and a page was being freed in the return from interrupt. Though you would not expect it from the source, both gccs I was using (a 4.8.1 and a 7.5.0) had chosen to compile compact_zone_order() with the ".cc = &cc" implemented by mov %rbx,-0xb0(%rbp) immediately before callq compact_zone - long after the "current->capture_control = &capc". An interrupt in between those finds capc->cc NULL (zeroed by an earlier rep stos). This could presumably be fixed by a barrier() before setting current->capture_control in compact_zone_order(); but would also need more care on return from compact_zone(), in order not to risk leaking a page captured by interrupt just before capture_control is reset. Maybe that is the preferable fix, but I felt safer for task_capc() to exclude the rather surprising possibility of capture at interrupt time. ===== I have checked that gcc10 also behaves the same. The advantage of fix in compact_zone_order() is that we don't add another test in the page freeing hot path, and that it might prevent future problems if we stop exposing pointers to unitialized structures in current task. So this patch implements the suggestion for compact_zone_order() with barrier() (and WRITE_ONCE() to prevent store tearing) for setting current->capture_control, and prevents page leaking with WRITE_ONCE/READ_ONCE in the proper order. Fixes: 5e1f0f098b46 ("mm, compaction: capture a page under direct compaction") Cc: stable@vger.kernel.org # 5.1+ Reported-by: Hugh Dickins Suggested-by: Hugh Dickins Signed-off-by: Vlastimil Babka --- mm/compaction.c | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/mm/compaction.c b/mm/compaction.c index fd988b7e5f2b..86375605faa9 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -2316,15 +2316,26 @@ static enum compact_result compact_zone_order(struct zone *zone, int order, .page = NULL, }; - current->capture_control = &capc; + /* + * Make sure the structs are really initialized before we expose the + * capture control, in case we are interrupted and the interrupt handler + * frees a page. + */ + barrier(); + WRITE_ONCE(current->capture_control, &capc); ret = compact_zone(&cc, &capc); VM_BUG_ON(!list_empty(&cc.freepages)); VM_BUG_ON(!list_empty(&cc.migratepages)); - *capture = capc.page; - current->capture_control = NULL; + /* + * Make sure we hide capture control first before we read the captured + * page pointer, otherwise an interrupt could free and capture a page + * and we would leak it. + */ + WRITE_ONCE(current->capture_control, NULL); + *capture = READ_ONCE(capc.page); return ret; } -- 2.27.0