Message-ID: <14ae628d-a9ef-42f3-9201-e90c5c88c133@huawei.com>
Date: Fri, 19 Jan 2024 20:59:22 +0800
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2] mm: memory: move mem_cgroup_charge() into alloc_anon_folio()
To: Michal Hocko
CC: Andrew Morton, Matthew Wilcox, David Hildenbrand
References: <20240117103954.2756050-1-wangkefeng.wang@huawei.com>
From: Kefeng Wang

On 2024/1/19 16:00, Michal Hocko wrote:
> On Fri 19-01-24 10:05:15, Kefeng Wang wrote:
>>
>>
>> On 2024/1/18 23:59, Michal Hocko wrote:
>>> On Wed 17-01-24 18:39:54, Kefeng Wang wrote:
>>>> mem_cgroup_charge() uses the GFP flags in a fairly sophisticated way.
>>>> In addition to checking gfpflags_allow_blocking(), it pays attention
>>>> to __GFP_NORETRY and __GFP_RETRY_MAYFAIL to ensure that processes within
>>>> this memcg do not exceed their quotas. Using the same GFP flags ensures
>>>> that we handle large anonymous folios correctly, including falling back
>>>> to smaller orders when there is plenty of memory available in the system
>>>> but this memcg is close to its limits.
>>>
>>> The changelog is not really clear about the actual problem you are
>>> trying to fix. Is this a pure consistency fix or have you actually seen
>>> any misbehavior? From the patch I suspect you are interested in THPs
>>> much more than regular order-0 pages because those are GFP_KERNEL-like
>>> when it comes to charging. THPs have a variety of options on how
>>> aggressively the allocation should try. From that perspective NORETRY
>>> and RETRY_MAYFAIL are not all that interesting because costly
>>> allocations (which THPs are) already do imply MAYFAIL and NORETRY.
>>
>> I haven't hit an actual issue; this was found by code inspection.
>>
>> mTHP was introduced by Ryan (19eaf44954df "mm: thp: support allocation
>> of anonymous multi-size THP"), so alloc_anon_folio() now has checks for
>> mTHP similar to those for PMD THP. It tries to allocate a large-order
>> folio below PMD_ORDER and falls back to an order-0 folio if that fails.
>> Meanwhile, it gets its GFP flags from vma_thp_gfp_mask() according to
>> the user configuration, like PMD THP allocation, so:
>>
>> 1) The memory charge failure check should be moved into the fallback
>> logic, so that we allocate as large a folio as possible even when the
>> memcg's memory usage is close to its limit.
>>
>> 2) Using the same GFP flags for allocation and memcg charge is, first,
>> consistent with PMD THP. In addition, depending on the GFP flags
>> returned by vma_thp_gfp_mask(), GFP_TRANSHUGE_LIGHT lets us skip direct
>> reclaim, and __GFP_NORETRY lets us skip mem_cgroup_oom(), so charging a
>> large-order folio won't kill any process.
>
> OK, makes sense. Please turn that into the changelog.

Sure.

>
>>> GFP_TRANSHUGE_LIGHT is more interesting though because those do not dive
>>> into the direct reclaim at all. With the current code they will reclaim
>>> charges to free up the space for the allocated THP page and that defeats
>>> the light mode. I have a vague recollection of preparing a patch to
>>
>> We are interested in GFP_TRANSHUGE_LIGHT and __GFP_NORETRY, as
>> mentioned above.
>
> If mTHP can be smaller than COSTLY_ORDER then you are correct and
> NORETRY makes a difference. Please mention that in the changelog as
> well.
>

For the memory cgroup charge, __GFP_NORETRY is checked so that we skip
mem_cgroup_oom() directly. The __GFP_NORETRY check in try_charge_memcg()
does not depend on folio order or COSTLY_ORDER, so I think NORETRY should
always make a difference for all large-order folios.
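
To make the intended behaviour concrete, here is a minimal user-space
sketch of the fallback described above. It is only a model under stated
assumptions, not kernel code: every sk_* name and both flag values are
hypothetical, reclaim is assumed to free nothing, and the loop walks every
order instead of the set of orders enabled for the VMA.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GFP behaviour bits; not the kernel's values. */
#define SK_RECLAIM 0x1u	/* the charge may enter (modelled) direct reclaim */
#define SK_NORETRY 0x2u	/* give up after reclaim, never reach the OOM path */

/* Toy memcg: page counters plus statistics on which slow paths were hit. */
struct sk_memcg {
	unsigned long usage;		/* pages currently charged */
	unsigned long limit;		/* hard limit, in pages */
	unsigned int reclaim_passes;	/* times the reclaim path was entered */
	unsigned int oom_invocations;	/* times the OOM path was entered */
};

/*
 * Simplified charge: succeed if the pages fit under the limit. Otherwise,
 * without SK_RECLAIM fail immediately (the "light" mode); with it, count a
 * reclaim pass (assumed to free nothing), then fail quietly if SK_NORETRY
 * is set, or count an OOM invocation and fail.
 */
static bool sk_charge(struct sk_memcg *cg, unsigned long nr_pages,
		      unsigned int gfp)
{
	if (cg->usage + nr_pages <= cg->limit) {
		cg->usage += nr_pages;
		return true;
	}
	if (!(gfp & SK_RECLAIM))
		return false;
	cg->reclaim_passes++;
	if (gfp & SK_NORETRY)
		return false;
	cg->oom_invocations++;
	return false;
}

/*
 * Fallback loop: try each large order with the same flags that would be
 * used for the allocation, and treat a charge failure like an allocation
 * failure by moving to the next smaller order. Order 0 is charged with
 * reclaim allowed and no NORETRY, i.e. GFP_KERNEL-like in this model.
 * Returns the order that was charged, or -1 if even order 0 failed.
 */
static int sk_alloc_anon_order(struct sk_memcg *cg, int max_order,
			       unsigned int large_gfp)
{
	assert(max_order >= 0);

	for (int order = max_order; order > 0; order--) {
		if (sk_charge(cg, 1UL << order, large_gfp))
			return order;
	}
	return sk_charge(cg, 1, SK_RECLAIM) ? 0 : -1;
}
```

In this model, a memcg at 90 of 100 pages asked for order 4 with no flags
set (the light case) ends up with an order-3 folio, without any reclaim
pass or OOM invocation. With SK_RECLAIM | SK_NORETRY the large orders
enter reclaim but never the OOM path, whatever the order, which is the
point made above about NORETRY.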