Subject: Re: [PATCH v4 3/9] mm/lru: replace pgdat lru_lock with lruvec lock
From: Alex Shi
To: Johannes Weiner
Cc: cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, akpm@linux-foundation.org,
    mgorman@techsingularity.net, tj@kernel.org, hughd@google.com,
    khlebnikov@yandex-team.ru, daniel.m.jordan@oracle.com,
    yang.shi@linux.alibaba.com, willy@infradead.org, shakeelb@google.com,
    Michal Hocko, Vladimir Davydov, Roman Gushchin, Chris Down,
    Thomas Gleixner, Vlastimil Babka, Qian Cai, Andrey Ryabinin,
    "Kirill A. Shutemov", Jérôme Glisse, Andrea Arcangeli, David Rientjes,
    "Aneesh Kumar K.V", swkhack, "Potyra, Stefan", Mike Rapoport,
    Stephen Rothwell, Colin Ian King, Jason Gunthorpe,
    Mauro Carvalho Chehab, Peng Fan, Nikolay Borisov, Ira Weiny,
    Kirill Tkhai, Yafang Shao
Date: Mon, 25 Nov 2019 17:26:34 +0800
References: <1574166203-151975-1-git-send-email-alex.shi@linux.alibaba.com>
    <1574166203-151975-4-git-send-email-alex.shi@linux.alibaba.com>
    <20191119160456.GD382712@cmpxchg.org>
    <20191121220613.GB487872@cmpxchg.org>
    <20191122161652.GA489821@cmpxchg.org>
In-Reply-To: <20191122161652.GA489821@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org

> But that leaves me with one more worry: compaction. We locked out
> charge moving now, so between that and knowing that the page is alive,
> we have page->mem_cgroup stable. But compaction doesn't know whether
> the page is alive - it comes from a pfn and finds out using PageLRU.
>
> In the current code, pgdat->lru_lock remains the same before and after
> the page is charged to a cgroup, so once compaction has that locked
> and it observes PageLRU, it can go ahead and isolate the page.
>
> But lruvec->lru_lock changes during charging, and then compaction may
> hold the wrong lock during isolation:
>
> compaction:                             generic_file_buffered_read:
>
>                                         page_cache_alloc()
>
> !PageBuddy()
>
> lock_page_lruvec(page)
>   lruvec = mem_cgroup_page_lruvec()
>   spin_lock(&lruvec->lru_lock)
>   if lruvec != mem_cgroup_page_lruvec()
>     goto again
>
>                                         add_to_page_cache_lru()
>                                           mem_cgroup_commit_charge()
>                                             page->mem_cgroup = foo
>                                           lru_cache_add()
>                                             __pagevec_lru_add()
>                                               SetPageLRU()
>
> if PageLRU(page):
>   __isolate_lru_page()
>
> I don't see what prevents the lruvec from changing under compaction,
> neither in your patches nor in Hugh's. Maybe I'm missing something?
>

Hi Johannes,

It looks like my patch does the lruvec recheck/relock after the PageLRU
check in compaction.c, so it should address your question. I will do
more testing after all the changes.

Thanks
Alex

@@ -949,10 +959,26 @@ static bool too_many_isolated(pg_data_t *pgdat)
 		if (!(cc->gfp_mask & __GFP_FS) && page_mapping(page))
 			goto isolate_fail;
 
+		rcu_read_lock();
+reget_lruvec:
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
+
 		/* If we already hold the lock, we can skip some rechecking */
-		if (!locked) {
-			locked = compact_lock_irqsave(&pgdat->lru_lock,
-								&flags, cc);
+		if (lruvec != locked_lruvec) {
+			if (locked_lruvec) {
+				spin_unlock_irqrestore(&locked_lruvec->lru_lock,
+						flags);
+				locked_lruvec = NULL;
+			}
+			if (compact_lock_irqsave(&lruvec->lru_lock,
+							&flags, cc))
+				locked_lruvec = lruvec;
+
+
+			if (lruvec != mem_cgroup_page_lruvec(page, pgdat))
+				goto reget_lruvec;
+
+			rcu_read_unlock();
 
 			/* Try get exclusive access under lock */
 			if (!skip_updated) {
@@ -974,9 +1000,9 @@ static bool too_many_isolated(pg_data_t *pgdat)
 				low_pfn += compound_nr(page) - 1;
 				goto isolate_fail;
 			}
-		}
+		} else
+			rcu_read_unlock();
 
-		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
 		/* Try isolate the page */
 		if (__isolate_lru_page(page, isolate_mode) != 0)
@@ -1017,9 +1043,10 @@ static bool too_many_isolated(pg_data_t *pgdat)
 		 * page anyway.
 		 */
 		if (nr_isolated) {
-			if (locked) {
-				spin_unlock_irqrestore(&pgdat->lru_lock, flags);
-				locked = false;
+			if (locked_lruvec) {
+				spin_unlock_irqrestore(&locked_lruvec->lru_lock,
+						flags);
+				locked_lruvec = NULL;
 			}
 			putback_movable_pages(&cc->migratepages);
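
For reference, the lookup/lock/recheck loop discussed in this thread can be
sketched in plain user-space C. This is only an illustration, not kernel
code: struct lruvec and struct page here are hypothetical stand-ins, the
"lock" is a simple flag rather than a spinlock, and move_to is a made-up
hook that simulates a concurrent charge changing the page's lruvec between
the lookup and the lock. It only models a change inside that window; it
does not model a change that happens after the recheck has passed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the kernel definitions. */
struct lruvec {
	bool locked;		/* models lruvec->lru_lock */
};

struct page {
	struct lruvec *lruvec;	/* models the lruvec derived from page->mem_cgroup */
};

/*
 * Made-up hook: if non-NULL, the page is moved to this lruvec once,
 * in the window between the lookup and taking the lock.
 */
static struct lruvec *move_to;

static struct lruvec *page_lruvec(const struct page *page)
{
	return page->lruvec;
}

static void lruvec_lock(struct lruvec *lruvec)
{
	assert(!lruvec->locked);
	lruvec->locked = true;
}

static void lruvec_unlock(struct lruvec *lruvec)
{
	assert(lruvec->locked);
	lruvec->locked = false;
}

/* Look up the lruvec, lock it, then recheck; retry if it changed. */
static struct lruvec *lock_page_lruvec(struct page *page)
{
	for (;;) {
		struct lruvec *lruvec = page_lruvec(page);

		/* Simulated race: the page changes lruvec before we lock. */
		if (move_to) {
			page->lruvec = move_to;
			move_to = NULL;
		}

		lruvec_lock(lruvec);
		if (lruvec == page_lruvec(page))
			return lruvec;	/* lock matches the page's current lruvec */

		/* Stale lruvec: drop the wrong lock and look up again. */
		lruvec_unlock(lruvec);
	}
}
```

With move_to pointing at a second lruvec, the first pass locks the stale
lruvec, notices the mismatch, drops that lock, and the second pass returns
with the lock of the lruvec the page now belongs to.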