Date: Wed, 22 Jan 2020 09:14:06 +0100
From: Michal Hocko
To: David Rientjes
Cc: Andrew Morton, Wei Yang, hannes@cmpxchg.org, vdavydov.dev@gmail.com,
    ktkhai@virtuozzo.com, kirill.shutemov@linux.intel.com,
    yang.shi@linux.alibaba.com, cgroups@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, alexander.duyck@gmail.com,
    stable@vger.kernel.org
Subject: Re: [Patch v4] mm: thp: remove the defer list related code since this will not happen
Message-ID: <20200122081406.GO29276@dhcp22.suse.cz>
References: <20200117233836.3434-1-richardw.yang@linux.intel.com>
    <20200118145421.0ab96d5d9bea21a3339d52fe@linux-foundation.org>
    <20200120072237.GA18451@dhcp22.suse.cz>
    <20200120212726.GB29276@dhcp22.suse.cz>

On Tue 21-01-20 15:08:39, David Rientjes wrote:
> On Mon, 20 Jan 2020, Michal Hocko wrote:
>
> > > > > When migrating memcg charges of thp memory, there are two possibilities:
> > > > >
> > > > > (1) The underlying compound page is mapped by a pmd and thus is not
> > > > >     on a deferred split queue (it's mapped), or
> > > > >
> > > > > (2) The compound page is not mapped by a pmd and is awaiting split on a
> > > > >     deferred split queue.
> > > > >
> > > > > The current charge migration implementation does *not* migrate charges for
> > > > > thp memory on the deferred split queue, it only migrates charges for pages
> > > > > that are mapped by a pmd.
> > > > >
> > > > > Thus, to migrate charges, the underlying compound page cannot be on a
> > > > > deferred split queue; no list manipulation needs to be done in
> > > > > mem_cgroup_move_account().
> > > > >
> > > > > With the current code, the underlying compound page is moved to the
> > > > > deferred split queue of the memcg its memory is not charged to, so
> > > > > subsequent reclaim will consider these pages for the wrong memcg. Remove
> > > > > the deferred split queue handling in mem_cgroup_move_account() entirely.
> > > >
> > > > I believe this still doesn't describe the underlying problem to the full
> > > > extent. What happens with the page on the deferred list when it
> > > > shouldn't be there in fact? Unless I am missing something deferred_split_scan
> > > > will simply split that huge page. Which is a bit unfortunate but nothing
> > > > really critical. This should be mentioned in the changelog.
> > > >
> > >
> > > Are you referring to a compound page on the deferred split queue before a
> > > task is moved? I'm not sure this is within the scope of Wei's patch;
> > > this is simply preventing a page from being moved to the deferred split
> > > queue of a memcg that it is not charged to. Is there a concern about why
> > > this code can be removed or a suggestion on something else it should be
> > > doing instead?
> >
> > No, I do not have any concern about the patch itself. It is that the
> > changelog doesn't describe the user-visible effect. All I am saying is
> > that the current code splits THPs of moved pages under memory pressure
> > even if that is not needed. And that is a clear bug.
>
> Ah, gotcha. I tried to do this in the final paragraph of my amendment to
> Wei's patch and why it's important that this is marked as stable.

I considered
"subsequent reclaim will consider these pages for the wrong memcg."
quite unclear TBH.

> The current code in 5.4 from commit 87eaceb3faa59 places any migrated
> compound page onto the deferred split queue of the destination memcg
> regardless of whether it has a mapping pmd
> (list_empty(page_deferred_list()) was already false) or it does not have a
> mapping pmd (but is now on the wrong queue).
> For the latter, can_split_huge_page() can help for the actual split but
> not for the removal of the page that is now erroneously on the queue.

Does that mean that those fully mapped THPs are not going to be split?

> For the former, memcg reclaim would not see the pages that it should
> split under memcg pressure so we'll see the same memcg oom conditions we
> saw before the deferred split shrinker became SHRINKER_MEMCG_AWARE:
> unnecessary ooms.

OK, this is yet another user-visible effect and it would be better to
mention it explicitly in the changelog.

-- 
Michal Hocko
SUSE Labs