Date: Wed, 13 Sep 2023 09:28:55 -0700
Subject: Re: [RFC PATCH 2/6] KVM: guestmem_fd: Make error_remove_page callback to unmap guest memory
From: Sean Christopherson
To: isaku.yamahata@intel.com
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com, Michael Roth, Paolo Bonzini, erdemaktas@google.com, Sagi Shahar, David Matlack, Kai Huang, Zhi Wang, chen.bo@intel.com, linux-coco@lists.linux.dev, Chao Peng, Ackerley Tng, Vishal Annapurve, Yuan Yao, Jarkko Sakkinen, Xu Yilun, Quentin Perret, wei.w.wang@intel.com, Fuad Tabba
On Wed, Sep 13, 2023, isaku.yamahata@intel.com wrote:
> @@ -316,26 +316,43 @@ static int kvm_gmem_error_page(struct address_space *mapping, struct page *page)
>  	end = start + thp_nr_pages(page);
>  
>  	list_for_each_entry(gmem, gmem_list, entry) {
> +		struct kvm *kvm = gmem->kvm;
> +
> +		KVM_MMU_LOCK(kvm);
> +		kvm_mmu_invalidate_begin(kvm);
> +		KVM_MMU_UNLOCK(kvm);
> +
> +		flush = false;
>  		xa_for_each_range(&gmem->bindings, index, slot, start, end - 1) {
> -			for (gfn = start; gfn < end; gfn++) {
> -				if (WARN_ON_ONCE(gfn < slot->base_gfn ||
> -						 gfn >= slot->base_gfn + slot->npages))
> -					continue;
> -
> -				/*
> -				 * FIXME: Tell userspace that the *private*
> -				 * memory encountered an error.
> -				 */
> -				send_sig_mceerr(BUS_MCEERR_AR,
> -						(void __user *)gfn_to_hva_memslot(slot, gfn),
> -						PAGE_SHIFT, current);
> -			}
> +			pgoff_t pgoff;
> +
> +			if (WARN_ON_ONCE(end < slot->base_gfn ||
> +					 start >= slot->base_gfn + slot->npages))
> +				continue;
> +
> +			pgoff = slot->gmem.pgoff;
> +			struct kvm_gfn_range gfn_range = {
> +				.slot = slot,
> +				.start = slot->base_gfn + max(pgoff, start) - pgoff,
> +				.end = slot->base_gfn + min(pgoff + slot->npages, end) - pgoff,
> +				.arg.page = page,
> +				.may_block = true,
> +				.memory_error = true,

Why pass arg.page and memory_error?  There's no usage in this mini-series, and no explanation of what arch code would do with the information.  And I can't think of why arch code would need to do anything but zap the SPTEs.

If the memory error is directly related to the current instruction, the vCPU will fault on the zapped SPTE, see -HWPOISON, and exit to userspace.  If the memory is unrelated, then the delayed notification is less than ideal, but not fundamentally broken, e.g.
it's no worse than TDX's behavior of not signaling #MC until a poisoned cache line is actually accessed.

I don't get arg.page in particular, because having the gfn should be enough for arch code to take action beyond zapping SPTEs.  And _if_ we want to communicate the error to arch code, it would be much better to add a dedicated arch hook instead of piggybacking kvm_mmu_unmap_gfn_range() with a "memory_error" flag.

If we just zap SPTEs, then can't this simply be?

static int kvm_gmem_error_page(struct address_space *mapping, struct page *page)
{
	struct list_head *gmem_list = &mapping->private_list;
	struct kvm_gmem *gmem;
	pgoff_t start, end;

	filemap_invalidate_lock_shared(mapping);

	start = page->index;
	end = start + thp_nr_pages(page);

	list_for_each_entry(gmem, gmem_list, entry)
		kvm_gmem_invalidate_begin(gmem, start, end);

	/*
	 * Do not truncate the range, what action is taken in response to the
	 * error is userspace's decision (assuming the architecture supports
	 * gracefully handling memory errors).  If/when the guest attempts to
	 * access a poisoned page, kvm_gmem_get_pfn() will return -EHWPOISON,
	 * at which point KVM can either terminate the VM or propagate the
	 * error to userspace.
	 */
	list_for_each_entry(gmem, gmem_list, entry)
		kvm_gmem_invalidate_end(gmem, start, end);

	filemap_invalidate_unlock_shared(mapping);

	return MF_DELAYED;
}