From: David Stevens
To: Sean Christopherson
Cc: Yu Zhang, Isaku Yamahata, Zhi Wang, Maxim Levitsky,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
    kvm@vger.kernel.org, David Stevens
Subject: [PATCH v10 0/8] KVM: allow mapping non-refcounted pages
Date: Wed, 21 Feb 2024 16:25:18 +0900
Message-ID: <20240221072528.2702048-1-stevensd@google.com>
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog

This patch series adds support for
mapping VM_IO and VM_PFNMAP memory that is backed by struct pages that
aren't currently being refcounted (e.g. tail pages of non-compound
higher order allocations) into the guest.

Our use case is virtio-gpu blob resources [1], which directly map host
graphics buffers into the guest as "vram" for the virtio-gpu device.
This feature currently does not work on systems using the amdgpu
driver, as that driver allocates non-compound higher order pages via
ttm_pool_alloc_page().

First, this series replaces the gfn_to_pfn_memslot() API with a more
extensible kvm_follow_pfn() API. The updated API rearranges
gfn_to_pfn_memslot()'s args into a struct and where possible packs the
bool arguments into a FOLL_ flags argument. The refactoring changes do
not change any behavior.

From there, this series extends the kvm_follow_pfn() API so that
non-refcounted pages can be safely handled. This involves adding an
input parameter to indicate whether the caller can safely use
non-refcounted pfns and an output parameter to tell the caller whether
or not the returned page is refcounted. This is a breaking change:
non-refcounted pfn mappings are disallowed by default, as such mappings
are unsafe. To allow systems that rely on such mappings to continue to
function, an opt-in module parameter is added to allow the unsafe
behavior.

This series only adds support for non-refcounted pages to x86. Other
MMUs can likely be updated without too much difficulty, but it is not
needed at this point. Updating other parts of KVM (e.g. pfncache) is
not straightforward [2].

[1] https://patchwork.kernel.org/project/dri-devel/cover/20200814024000.2485-1-gurchetansingh@chromium.org/
[2] https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/

v9 -> v10:
 - Re-add FOLL_GET changes.
 - Split x86/mmu spte+non-refcount-page patch into two patches.
 - Rename 'foll' variables to 'kfp'.
 - Properly gate usage of refcount spte bit when it's not available.
 - Replace kvm_follow_pfn's is_refcounted_page output parameter with
   a struct page *refcounted_page pointing to the page in question.
 - Add patch downgrading BUG_ON to WARN_ON_ONCE.
v8 -> v9:
 - Make paying attention to is_refcounted_page mandatory. This means
   that FOLL_GET is no longer necessary. For compatibility with
   un-migrated callers, add a temporary parameter to sidestep
   ref-counting issues.
 - Add allow_unsafe_mappings, which is a breaking change.
 - Migrate kvm_vcpu_map and other callsites used by x86 to the new API.
 - Drop arm and ppc changes.
v7 -> v8:
 - Set access bits before releasing mmu_lock.
 - Pass FOLL_GET on 32-bit x86 or !tdp_enabled.
 - Refactor FOLL_GET handling, add kvm_follow_refcounted_pfn helper.
 - Set refcounted bit on >4k pages.
 - Add comments and apply formatting suggestions.
 - Rebase on kvm next branch.
v6 -> v7:
 - Replace __gfn_to_pfn_memslot with a more flexible
   __kvm_faultin_pfn, and extend that API to support non-refcounted
   pages (complete rewrite).

David Stevens (7):
  KVM: Relax BUG_ON argument validation
  KVM: mmu: Introduce kvm_follow_pfn()
  KVM: mmu: Improve handling of non-refcounted pfns
  KVM: Migrate kvm_vcpu_map() to kvm_follow_pfn()
  KVM: x86: Migrate to kvm_follow_pfn()
  KVM: x86/mmu: Track if sptes refer to refcounted pages
  KVM: x86/mmu: Handle non-refcounted pages

Sean Christopherson (1):
  KVM: Assert that a page's refcount is elevated when marking
    accessed/dirty

 arch/x86/kvm/mmu/mmu.c          | 104 +++++++---
 arch/x86/kvm/mmu/mmu_internal.h |   2 +
 arch/x86/kvm/mmu/paging_tmpl.h  |   7 +-
 arch/x86/kvm/mmu/spte.c         |   4 +-
 arch/x86/kvm/mmu/spte.h         |  22 +-
 arch/x86/kvm/mmu/tdp_mmu.c      |  22 +-
 arch/x86/kvm/x86.c              |  11 +-
 include/linux/kvm_host.h        |  53 ++++-
 virt/kvm/guest_memfd.c          |   8 +-
 virt/kvm/kvm_main.c             | 349 +++++++++++++++++++-------------
 virt/kvm/kvm_mm.h               |   3 +-
 virt/kvm/pfncache.c             |  11 +-
 12 files changed, 399 insertions(+), 197 deletions(-)

base-commit: 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478
-- 
2.44.0.rc0.258.g7320e95886-goog
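
For readers skimming the cover letter, the calling convention it describes (arguments gathered into a struct, bools packed into flags, an input that opts in to non-refcounted pfns, and a struct page pointer output that is NULL when the page is not refcounted) might look roughly like the userspace sketch below. This is only an illustration: every identifier in it (sketch_page, sketch_follow_pfn, sketch_lookup_pfn, SKETCH_FOLL_WRITE, SKETCH_PFN_ERR) is hypothetical, the odd/even gfn rule is a toy stand-in for a real lookup, and none of it is taken from the patches themselves.

```c
/*
 * Userspace sketch only. All names are hypothetical stand-ins and are
 * not the kernel's kvm_follow_pfn() definitions.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_page { int refcount; };

#define SKETCH_FOLL_WRITE (1u << 0)   /* stand-in for a FOLL_ flag */
#define SKETCH_PFN_ERR    UINT64_MAX  /* stand-in for an error pfn */

struct sketch_follow_pfn {
	/* Inputs */
	uint64_t gfn;
	unsigned int flags;          /* not interpreted by this toy lookup */
	bool allow_non_refcounted;   /* caller can handle non-refcounted pfns */
	/* Output: NULL when the returned pfn is not refcounted */
	struct sketch_page *refcounted_page;
};

/* A single toy page that backs every "refcounted" gfn. */
static struct sketch_page sketch_backing = { .refcount = 1 };

uint64_t sketch_lookup_pfn(struct sketch_follow_pfn *kfp)
{
	assert(kfp != NULL);
	kfp->refcounted_page = NULL;

	/* Toy rule: odd gfns stand in for non-refcounted pages. */
	if (kfp->gfn & 1) {
		if (!kfp->allow_non_refcounted)
			return SKETCH_PFN_ERR;   /* disallowed by default */
		return kfp->gfn + 0x1000;
	}

	/* Refcounted case: take a reference and report the page. */
	sketch_backing.refcount++;
	kfp->refcounted_page = &sketch_backing;
	return kfp->gfn + 0x1000;
}
```

In this sketch a caller would drop a reference only when refcounted_page comes back non-NULL, which is the reason for returning a page pointer rather than a bare bool.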