From: Pasha Tatashin
Date: Wed, 29 Nov 2023 14:45:03 -0500
Subject: Re: [PATCH 08/16] iommu/fsl: use page allocation function provided by iommu-pages.h
To: Robin Murphy
Cc: Jason Gunthorpe, akpm@linux-foundation.org, alex.williamson@redhat.com,
 alim.akhtar@samsung.com, alyssa@rosenzweig.io, asahi@lists.linux.dev,
 baolu.lu@linux.intel.com, bhelgaas@google.com, cgroups@vger.kernel.org,
 corbet@lwn.net, david@redhat.com, dwmw2@infradead.org, hannes@cmpxchg.org,
 heiko@sntech.de, iommu@lists.linux.dev, jasowang@redhat.com,
 jernej.skrabec@gmail.com, jonathanh@nvidia.com, joro@8bytes.org,
 kevin.tian@intel.com, krzysztof.kozlowski@linaro.org, kvm@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-rockchip@lists.infradead.org,
 linux-samsung-soc@vger.kernel.org, linux-sunxi@lists.linux.dev,
 linux-tegra@vger.kernel.org, lizefan.x@bytedance.com, marcan@marcan.st,
 mhiramat@kernel.org, mst@redhat.com, m.szyprowski@samsung.com,
 netdev@vger.kernel.org, paulmck@kernel.org, rdunlap@infradead.org,
 samuel@sholland.org, suravee.suthikulpanit@amd.com, sven@svenpeter.dev,
 thierry.reding@gmail.com, tj@kernel.org, tomas.mudrunka@gmail.com,
 vdumpa@nvidia.com, virtualization@lists.linux.dev, wens@csie.org,
 will@kernel.org, yu-cheng.yu@intel.com
In-Reply-To: <52de3aca-41b1-471e-8f87-1a77de547510@arm.com>
References: <20231128204938.1453583-1-pasha.tatashin@soleen.com>
 <20231128204938.1453583-9-pasha.tatashin@soleen.com>
 <1c6156de-c6c7-43a7-8c34-8239abee3978@arm.com>
 <20231128235037.GC1312390@ziepe.ca>
 <52de3aca-41b1-471e-8f87-1a77de547510@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> >> We can separate the metric into two:
> >> iommu pagetable only
> >> iommu everything
> >>
> >> or into three:
> >> iommu pagetable only
> >> iommu dma
> >> iommu everything
> >>
> >> What do you think?
> >
> > I think I said this at LPC - if you want to have fine grained
> > accounting of memory by owner you need to go talk to the cgroup people
> > and come up with something generic. Adding ever open coded finer
> > category breakdowns just for iommu doesn't make alot of sense.
> >
> > You can make some argument that the pagetable memory should be counted
> > because kvm counts it's shadow memory, but I wouldn't go into further
> > detail than that with hand coded counters..
>
> Right, pagetable memory is interesting since it's something that any
> random kernel user can indirectly allocate via iommu_domain_alloc() and
> iommu_map(), and some of those users may even be doing so on behalf of
> userspace. I have no objection to accounting and potentially applying
> limits to *that*.

Yes. In the next version I will separate pagetable-only memory from the
rest, so that limits can be applied to it.

> Beyond that, though, there is nothing special about "the IOMMU
> subsystem". The amount of memory an IOMMU driver needs to allocate for
> itself in order to function is not of interest beyond curiosity, it just
> is what it is; limiting it would only break the IOMMU, and if a user

Agreed about the memory an IOMMU driver allocates for itself, but that
amount should be small. If it is not, we should at least be able to show
where the memory is used.

> thinks it's "too much", the only actionable thing that might help is to
> physically remove devices from the system. Similar for DMA buffers; it
> might be intriguing to account those, but it's not really an actionable
> metric - in the overwhelming majority of cases you can't simply tell a
> driver to allocate less than what it needs. And that is of course
> assuming if we were to account *all* DMA buffers, since whether they
> happen to have an IOMMU translation or not is irrelevant (we'd have
> already accounted the pagetables as pagetables if so).

DMA mappings should be observable (they do not have to be limited). At
the very least, that helps explain kernel memory overhead anomalies on
production systems.

> I bet "the networking subsystem" also consumes significant memory on the

It does, and GPU drivers may also consume a significant amount of memory.

> same kind of big systems where IOMMU pagetables would be of any concern.
> I believe some of the some of the "serious" NICs can easily run up
> hundreds of megabytes if not gigabytes worth of queues, SKB pools, etc.
> - would you propose accounting those too?

Yes. Any kind of kernel memory that is proportional to the workload
should be accountable. Someone is using those resources, compared to an
idle system, and that someone should be charged.

Pasha