Subject: Re: [PATCH] mm: cma: support sysfs
To: Minchan Kim
CC: Andrew Morton, LKML, linux-mm
From: John Hubbard
Date: Fri, 5 Feb 2021 13:57:03 -0800
X-Mailing-List: linux-kernel@vger.kernel.org

On 2/5/21 1:28 PM, Minchan Kim wrote:
> On Fri, Feb 05, 2021 at 12:25:52PM -0800, John Hubbard wrote:
>> On 2/5/21 8:15 AM, Minchan Kim wrote:
>> ...
>> OK. But...what *is* your goal, and why is this useless (that's what
>> orthogonal really means here) for your goal?
>
> As I mentioned, the goal is to monitor the failure from each of CMA
> since they have each own purpose.
>
> Let's have an example.
>
> System has 5 CMA area and each CMA is associated with each
> user scenario. They have exclusive CMA area to avoid
> fragmentation problem.
>
> CMA-1 depends on bluetooth
> CMA-2 depends on WIFI
> CMA-3 depends on sensor-A
> CMA-4 depends on sensor-B
> CMA-5 depends on sensor-C
>

aha, finally. I had no idea that sort of use case was happening.

This would be good to put in the patch commit description.

> With this, we could catch which module was affected but with global failure,
> I couldn't find who was affected.
>
>>
>> Also, would you be willing to try out something simple first,
>> such as providing indication that cma is active and its overall success
>> rate, like this:
>>
>> /proc/vmstat:
>>
>> cma_alloc_success 125
>> cma_alloc_failure 25
>>
>> ...or is the only way to provide the more detailed items, complete with
>> per-CMA details, in a non-debugfs location?
>>
>>
>>>>
>>>> ...and then, to see if more is needed, some questions:
>>>>
>>>> a) Do you know of an upper bound on how many cma areas there can be
>>>> (I think Matthew also asked that)?
>>>
>>> There is no upper bound since it's configurable.
>>>
>>
>> OK, thanks, so that pretty much rules out putting per-cma details into
>> anything other than a directory or something like it.
>>
>>>>
>>>> b) Is tracking the cma area really as valuable as other possibilities? We can put
>>>> "a few" to "several" items here, so really want to get your very favorite bits of
>>>> information in. If, for example, there can be *lots* of cma areas, then maybe tracking
>>>
>>> At this moment, allocation/failure for each CMA area since they have
>>> particular own usecase, which makes me easy to keep which module will
>>> be affected. I think it is very useful per-CMA statistics as minimum
>>> code change so I want to enable it by default under CONFIG_CMA && CONFIG_SYSFS.
>>>
>>>> by a range of allocation sizes is better...
>>>
>>> I takes your suggestion something like this.
>>>
>>> [alloc_range] could be order or range by interval
>>>
>>> /sys/kernel/mm/cma/cma-A/[alloc_range]/success
>>> /sys/kernel/mm/cma/cma-A/[alloc_range]/fail
>>> ..
>>> ..
>>> /sys/kernel/mm/cma/cma-Z/[alloc_range]/success
>>> /sys/kernel/mm/cma/cma-Z/[alloc_range]/fail
>>
>> Actually, I meant, "ranges instead of cma areas", like this:
>>
>> [per-range path listing not recoverable from this copy]
>> ...
>>
>> The idea is that knowing the allocation sizes that succeeded
>> and failed is maybe even more interesting and useful than
>> knowing the cma area that contains them.
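
To make the "range of allocation sizes" idea concrete, here is a minimal userspace sketch of counting successes and failures per allocation-size bucket. It is only an illustration of the bookkeeping, not the patch's code: the choice of power-of-two "order" buckets and the names `alloc_order` and `RangeStats` are assumptions made for this example.

```python
from collections import defaultdict


def alloc_order(nr_pages: int) -> int:
    """Return the smallest order such that 2**order >= nr_pages."""
    if nr_pages < 1:
        raise ValueError("nr_pages must be at least 1")
    return (nr_pages - 1).bit_length()


class RangeStats:
    """Hypothetical per-range counters, keyed by allocation order."""

    def __init__(self) -> None:
        self.success = defaultdict(int)
        self.fail = defaultdict(int)

    def record(self, nr_pages: int, ok: bool) -> None:
        # Pick the bucket from the request size, then bump one counter.
        order = alloc_order(nr_pages)
        if ok:
            self.success[order] += 1
        else:
            self.fail[order] += 1


stats = RangeStats()
stats.record(1, True)      # order 0
stats.record(4, True)      # order 2
stats.record(5, False)     # order 3 (rounded up to 8 pages)
stats.record(512, False)   # order 9
```

Bucketing by order keeps the number of counters small and fixed regardless of how many CMA areas exist, which is the trade-off being weighed above against per-area counters.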
>
> Understand your point but it would make hard to find who was
> affected by the failure. That's why I suggested to have your
> suggestion under additional config since per-cma metric with
> simple success/failure are enough.
>
>>
>>>
>>> I agree it would be also useful but I'd like to enable it under
>>> CONFIG_CMA_SYSFS_ALLOC_RANGE as separate patchset.
>>>
>>
>> I will stop harassing you very soon, just want to bottom out on
>> understanding the real goals first. :)
>>
>
> I hope my example makes the goal more clear for you.
>

Yes it does. Based on the (rather surprising) use of cma-area-per-device,
it seems clear that you will need this, so I'll drop my objections to
putting it in sysfs.

I still think the "number of allocation failures" needs refining, probably
via a range-based thing, as we've discussed. But the number of pages
failed per cma looks OK now.

thanks,
-- 
John Hubbard
NVIDIA
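
As a rough illustration of the cma-area-per-device use case discussed in this thread, the sketch below reads per-area counters from a directory tree and reports which users are behind the areas that saw failures. Everything about the layout is an assumption for the example: the one-directory-per-area structure, the file names `success` and `fail`, and the `owners` mapping are hypothetical and not taken from the patch.

```python
from pathlib import Path


def read_cma_counters(root):
    """Read (success, fail) counters for each area directory under root.

    Assumes a hypothetical layout: <root>/<area>/success and <root>/<area>/fail,
    each holding a single decimal integer.
    """
    stats = {}
    for area in sorted(Path(root).iterdir()):
        if not area.is_dir():
            continue
        try:
            ok = int((area / "success").read_text().strip())
            bad = int((area / "fail").read_text().strip())
        except (OSError, ValueError):
            # Skip areas whose counters are missing or unreadable.
            continue
        stats[area.name] = (ok, bad)
    return stats


def affected_users(stats, owners):
    """Return the users of every area that recorded at least one failure."""
    return [
        owners.get(name, "unknown")
        for name, (_, bad) in sorted(stats.items())
        if bad > 0
    ]
```

With a mapping such as `{"CMA-1": "bluetooth", "CMA-2": "WIFI"}`, a failure count on CMA-2 points directly at WIFI, which a single global counter cannot do.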