Date: Fri, 14 Aug 2020 08:39:24 +0200
From: Michal Hocko
To: Charan Teja Kalla
Cc: akpm@linux-foundation.org, vbabka@suse.cz, david@redhat.com, rientjes@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, vinmenon@codeaurora.org
Subject: Re: [PATCH V2] mm, page_alloc: fix core hung in free_pcppages_bulk()
Message-ID: <20200814063924.GX9477@dhcp22.suse.cz>
References: <1597150703-19003-1-git-send-email-charante@codeaurora.org> <20200813114105.GI9477@dhcp22.suse.cz> <9ca76893-dfe8-9a46-f2ec-6b3c663e848e@codeaurora.org> <20200813163054.GR9477@dhcp22.suse.cz> <099b1a12-7fcd-f665-3f9d-e20d4e1396d3@codeaurora.org>
In-Reply-To: <099b1a12-7fcd-f665-3f9d-e20d4e1396d3@codeaurora.org>

On Thu 13-08-20 22:57:32, Charan Teja Kalla wrote:
> Thanks Michal.
>
> On 8/13/2020 10:00 PM, Michal Hocko wrote:
> > On Thu 13-08-20 21:51:29, Charan Teja Kalla wrote:
> >> Thanks Michal for comments.
> >>
> >> On 8/13/2020 5:11 PM, Michal Hocko wrote:
> >>> On Tue 11-08-20 18:28:23, Charan Teja Reddy wrote:
> >>> [...]
> >>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >>>> index e4896e6..839039f 100644
> >>>> --- a/mm/page_alloc.c
> >>>> +++ b/mm/page_alloc.c
> >>>> @@ -1304,6 +1304,11 @@ static void free_pcppages_bulk(struct zone *zone, int count,
> >>>>  	struct page *page, *tmp;
> >>>>  	LIST_HEAD(head);
> >>>>
> >>>> +	/*
> >>>> +	 * Ensure proper count is passed which otherwise would stuck in the
> >>>> +	 * below while (list_empty(list)) loop.
> >>>> +	 */
> >>>> +	count = min(pcp->count, count);
> >>>>  	while (count) {
> >>>>  		struct list_head *list;
> >>>
> >>>
> >>> How does this prevent the race actually?
> >>
> >> This doesn't prevent the race. This only fixes the core hung(as this is
> >> called with spin_lock_irq()) caused by the race condition. This core
> >> hung is because of incorrect count value is passed to the
> >> free_pcppages_bulk() function.
> >
> > Let me ask differently. What does enforce that the count and lists do
> > not get out of sync in the loop.
>
> count value is updated whenever an order-0 page is being added to the
> pcp lists through free_unref_page_commit(), which is being called with
> both interrupts, premption disabled.
> static void free_unref_page_commit(struct page *page, {
> 	....
> 	list_add(&page->lru, &pcp->lists[migratetype]);
> 	pcp->count++
> }
>
> As these are pcp lists, they only gets touched by another process when
> this process is context switched, which happens only after enabling
> premption or interrupts. So, as long as process is operating on these
> pcp lists in free_unref_page_commit function, the count and lists are
> always synced.
>
> However, the problem here is not that the count and lists are being out
> of sync. They do always in sync, as explained above. It is with the
> asking free_pcppages_bulk() to free the pages more than what is present
> in the pcp lists which is ending up in while(list_empty()).

You are right. I managed to confuse myself. The thing is that the batch
count is out of sync.

> > Your changelog says that the fix is to
> > use the proper value without any specifics.
> >
> Will change this to: Ensure the count value passed is not greater than
> the pcp lists count. Any better you suggest?

Yes, this makes it more clear. Feel free to add
Acked-by: Michal Hocko

-- 
Michal Hocko
SUSE Labs
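
For illustration, the hang discussed above can be modelled in user space.
This is only a sketch under stated assumptions, not kernel code:
struct pcp_model, drain_model() and NR_LISTS are hypothetical names, the
lists are reduced to length counters, and the batch_free handling of the
real free_pcppages_bulk() is left out. Where the real loop would spin
forever on empty lists, the model returns -1 instead.

```c
#include <assert.h>
#include <stddef.h>

#define NR_LISTS 3	/* hypothetical stand-in for MIGRATE_PCPTYPES */

/* Hypothetical model of a per-cpu page set: only list lengths are kept. */
struct pcp_model {
	int count;			/* total pages across all lists */
	int list_len[NR_LISTS];		/* pages queued on each list */
};

/*
 * Round-robin drain, one page per pass. Returns the number of pages
 * freed, or -1 when every list is empty while count is still non-zero,
 * which is the situation in which the real loop never terminates.
 * A non-zero clamp applies the proposed min(pcp->count, count).
 */
static int drain_model(struct pcp_model *pcp, int count, int clamp)
{
	int freed = 0;
	int idx = 0;

	assert(pcp != NULL && count >= 0);

	if (clamp && count > pcp->count)
		count = pcp->count;

	while (count) {
		int scanned = 0;

		/* Advance to the next non-empty list. */
		do {
			if (++idx == NR_LISTS)
				idx = 0;
			if (++scanned > NR_LISTS)
				return -1;	/* all lists empty */
		} while (pcp->list_len[idx] == 0);

		pcp->list_len[idx]--;
		pcp->count--;
		count--;
		freed++;
	}
	return freed;
}
```

With two pages queued and a request for five, the unclamped call reports
the would-be hang, while the clamped call frees the two pages and returns.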