From: Pasha Tatashin
To: akpm@linux-foundation.org, alex.williamson@redhat.com,
    alim.akhtar@samsung.com, alyssa@rosenzweig.io, asahi@lists.linux.dev,
    baolu.lu@linux.intel.com, bhelgaas@google.com, cgroups@vger.kernel.org,
    corbet@lwn.net, david@redhat.com, dwmw2@infradead.org,
    hannes@cmpxchg.org, heiko@sntech.de, iommu@lists.linux.dev,
    jasowang@redhat.com, jernej.skrabec@gmail.com, jgg@ziepe.ca,
    jonathanh@nvidia.com, joro@8bytes.org, kevin.tian@intel.com,
    krzysztof.kozlowski@linaro.org, kvm@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-rockchip@lists.infradead.org,
    linux-samsung-soc@vger.kernel.org, linux-sunxi@lists.linux.dev,
    linux-tegra@vger.kernel.org, lizefan.x@bytedance.com, marcan@marcan.st,
    mhiramat@kernel.org, mst@redhat.com, m.szyprowski@samsung.com,
    netdev@vger.kernel.org, pasha.tatashin@soleen.com, paulmck@kernel.org,
    rdunlap@infradead.org, robin.murphy@arm.com, samuel@sholland.org,
    suravee.suthikulpanit@amd.com, sven@svenpeter.dev,
    thierry.reding@gmail.com, tj@kernel.org, tomas.mudrunka@gmail.com,
    vdumpa@nvidia.com, virtualization@lists.linux.dev, wens@csie.org,
    will@kernel.org, yu-cheng.yu@intel.com
Subject: [PATCH 00/16] IOMMU memory observability
Date: Tue, 28 Nov 2023 20:49:22 +0000
Message-ID: <20231128204938.1453583-1-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.43.0.rc2.451.g8631bc7472-goog
X-Mailing-List: linux-kernel@vger.kernel.org

From: Pasha Tatashin

The IOMMU subsystem may hold gigabytes of state, most of it in IOMMU
page tables. Yet there is currently no way to observe how much memory
the IOMMU subsystem actually uses.

This patch series solves the problem by adding observability for all
pages allocated by IOMMU, and accountability, so admins can limit the
amount via cgroups.

System-wide observability uses /proc/meminfo:
SecPageTables:    438176 kB

Contains IOMMU and KVM memory.

Per-node observability:
/sys/devices/system/node/nodeN/meminfo
Node N SecPageTables:    422204 kB

Contains IOMMU and KVM memory in the given NUMA node.

Per-node IOMMU-only observability:
/sys/devices/system/node/nodeN/vmstat
nr_iommu_pages 105555

Contains the number of pages IOMMU allocated in the given node.

Accountability: uses the sec_pagetables cgroup-v2 memory.stat entry.

With the change, iova_stress[1] stops as the limit is reached:

# ./iova_stress
iova space:     0T      free memory:   497G
iova space:     1T      free memory:   495G
iova space:     2T      free memory:   493G
iova space:     3T      free memory:   491G

stops as limit is reached.

This series incorporates suggestions that came from the discussion
at LPC [2].
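
For readers who want to poll the counters described above from userspace, here is a minimal Python sketch. It is not part of the series. The helper names (meminfo_kb, vmstat_counter, read_text) and the choice of node0 are assumptions made for illustration, and nr_iommu_pages is only expected to exist on a kernel with these patches applied, so the sketch returns None when a field or file is missing.

```python
import re
from pathlib import Path
from typing import Optional


def meminfo_kb(text: str, key: str = "SecPageTables") -> Optional[int]:
    """Return the kB value for `key` in meminfo-style text, or None if absent.

    Accepts both the /proc/meminfo form ("SecPageTables:  438176 kB") and
    the per-node form ("Node 0 SecPageTables:  422204 kB").
    """
    pattern = re.compile(
        r"^(?:Node\s+\d+\s+)?" + re.escape(key) + r":\s+(\d+)\s+kB\s*$",
        re.MULTILINE,
    )
    match = pattern.search(text)
    return int(match.group(1)) if match else None


def vmstat_counter(text: str, key: str = "nr_iommu_pages") -> Optional[int]:
    """Return the integer value of `key` in "name value" text, or None."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == key:
            return int(fields[1])
    return None


def read_text(path: str) -> str:
    """Read a procfs/sysfs file; return "" if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    # node0 is an assumption; enumerate /sys/devices/system/node/ on real systems.
    node0 = "/sys/devices/system/node/node0"
    print("system SecPageTables kB:", meminfo_kb(read_text("/proc/meminfo")))
    print("node0 SecPageTables kB:", meminfo_kb(read_text(node0 + "/meminfo")))
    print("node0 nr_iommu_pages:", vmstat_counter(read_text(node0 + "/vmstat")))
```

The cgroup-v2 memory.stat file uses the same "name value" layout, so vmstat_counter(text, "sec_pagetables") should also work on its contents, with the value reported in bytes rather than pages.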

[1] https://github.com/soleen/iova_stress
[2] https://lpc.events/event/17/contributions/1466

Pasha Tatashin (16):
  iommu/vt-d: add wrapper functions for page allocations
  iommu/amd: use page allocation function provided by iommu-pages.h
  iommu/io-pgtable-arm: use page allocation function provided by iommu-pages.h
  iommu/io-pgtable-dart: use page allocation function provided by iommu-pages.h
  iommu/io-pgtable-arm-v7s: use page allocation function provided by iommu-pages.h
  iommu/dma: use page allocation function provided by iommu-pages.h
  iommu/exynos: use page allocation function provided by iommu-pages.h
  iommu/fsl: use page allocation function provided by iommu-pages.h
  iommu/iommufd: use page allocation function provided by iommu-pages.h
  iommu/rockchip: use page allocation function provided by iommu-pages.h
  iommu/sun50i: use page allocation function provided by iommu-pages.h
  iommu/tegra-smmu: use page allocation function provided by iommu-pages.h
  iommu: observability of the IOMMU allocations
  iommu: account IOMMU allocated memory
  vhost-vdpa: account iommu allocations
  vfio: account iommu allocations

 Documentation/admin-guide/cgroup-v2.rst |   2 +-
 Documentation/filesystems/proc.rst      |   4 +-
 drivers/iommu/amd/amd_iommu.h           |   8 -
 drivers/iommu/amd/init.c                |  91 +++++-----
 drivers/iommu/amd/io_pgtable.c          |  13 +-
 drivers/iommu/amd/io_pgtable_v2.c       |  20 +-
 drivers/iommu/amd/iommu.c               |  13 +-
 drivers/iommu/dma-iommu.c               |   8 +-
 drivers/iommu/exynos-iommu.c            |  14 +-
 drivers/iommu/fsl_pamu.c                |   5 +-
 drivers/iommu/intel/dmar.c              |  10 +-
 drivers/iommu/intel/iommu.c             |  47 ++---
 drivers/iommu/intel/iommu.h             |   2 -
 drivers/iommu/intel/irq_remapping.c     |  10 +-
 drivers/iommu/intel/pasid.c             |  12 +-
 drivers/iommu/intel/svm.c               |   7 +-
 drivers/iommu/io-pgtable-arm-v7s.c      |   9 +-
 drivers/iommu/io-pgtable-arm.c          |   7 +-
 drivers/iommu/io-pgtable-dart.c         |  37 ++--
 drivers/iommu/iommu-pages.h             | 231 ++++++++++++++++++++++++
 drivers/iommu/iommufd/iova_bitmap.c     |   6 +-
 drivers/iommu/rockchip-iommu.c          |  14 +-
 drivers/iommu/sun50i-iommu.c            |   7 +-
 drivers/iommu/tegra-smmu.c              |  18 +-
 drivers/vfio/vfio_iommu_type1.c         |   8 +-
 drivers/vhost/vdpa.c                    |   3 +-
 include/linux/mmzone.h                  |   5 +-
 mm/vmstat.c                             |   3 +
 28 files changed, 415 insertions(+), 199 deletions(-)
 create mode 100644 drivers/iommu/iommu-pages.h

-- 
2.43.0.rc2.451.g8631bc7472-goog