Date: Fri, 2 Dec 2016 09:49:33 +0000
From: Mel Gorman
To: Michal Hocko
Cc: Andrew Morton, Christoph Lameter, Vlastimil Babka, Johannes Weiner,
    Jesper Dangaard Brouer, Linux-MM, Linux-Kernel
Subject: Re: [PATCH 1/2] mm, page_alloc: Keep pcp count and list contents in sync if struct page is corrupted
Message-ID: <20161202094933.jxcgvtth2poqdm3n@techsingularity.net>
In-Reply-To: <20161202081216.GA6830@dhcp22.suse.cz>

On Fri, Dec 02, 2016 at 09:12:17AM +0100, Michal Hocko wrote:
> On Fri 02-12-16 00:22:43, Mel Gorman wrote:
> > Vlastimil Babka pointed out that commit 479f854a207c ("mm, page_alloc:
> > defer debugging checks of pages allocated from the PCP") will allow the
> > per-cpu list counter to be out of sync with the per-cpu list contents
> > if a struct page is corrupted. This patch keeps the accounting in sync.
> >
> > Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
> > Signed-off-by: Mel Gorman
> > cc: stable@vger.kernel.org [4.7+]
>
> I am trying to think about what would happen if we did go out of sync
> and cannot spot a problem. Vlastimil has mentioned something about
> free_pcppages_bulk looping for ever but I cannot see it happening right
> now.

free_pcppages_bulk can loop forever if the page count is positive and
there are no pages on the lists. I've only seen this during development,
but with a corrupted count it spins here:

	do {
		batch_free++;
		if (++pindex == NR_PCP_LISTS)
			pindex = 0;
		list = &pcp->lists[pindex];
	} while (list_empty(list));

It would only be seen in a situation where struct page corruption was
detected, so it's rare.

-- 
Mel Gorman
SUSE Labs
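
For illustration, the hang described above can be modelled in userspace.
This is only a minimal sketch, not the kernel code: struct pcp_model,
pick_next_list(), pcp_out_of_sync(), the max_spins bound and the value
chosen for NR_PCP_LISTS are all hypothetical names and numbers made up
for this example. The model replaces the real lists with per-list
lengths and adds an iteration cap so the "loops forever" case can be
observed as a return value rather than as a hang.

```c
#include <stdbool.h>

#define NR_PCP_LISTS 3	/* hypothetical value, for this model only */

/* Userspace stand-in for the per-cpu page state (assumed layout). */
struct pcp_model {
	int count;			/* pages the accounting claims are queued */
	int nr_on_list[NR_PCP_LISTS];	/* pages actually on each list */
};

/*
 * Model of the round-robin list selection loop. Returns the index of the
 * next non-empty list after pindex, or -1 once max_spins iterations have
 * passed without finding one. The loop quoted in the mail has no such
 * bound, so the -1 case corresponds to spinning forever.
 */
static int pick_next_list(const struct pcp_model *pcp, int pindex,
			  int max_spins)
{
	int spins = 0;

	do {
		if (spins++ == max_spins)
			return -1;
		if (++pindex == NR_PCP_LISTS)
			pindex = 0;
	} while (pcp->nr_on_list[pindex] == 0);

	return pindex;
}

/* True when the count and the list contents disagree. */
static bool pcp_out_of_sync(const struct pcp_model *pcp)
{
	int total = 0;

	for (int i = 0; i < NR_PCP_LISTS; i++)
		total += pcp->nr_on_list[i];

	return total != pcp->count;
}
```

With count == 2 and both pages on the last list, pick_next_list() finds
that list. With count == 1 and every list empty (the out-of-sync state),
it never finds a list and only stops because of the artificial cap.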