Date: Thu, 2 Nov 2023 08:56:43 -0700
In-Reply-To: <64e3764e36ba7a00d94cc7db1dea1ef06b620aaf.camel@intel.com>
References: <20231027182217.3615211-1-seanjc@google.com>
 <20231027182217.3615211-10-seanjc@google.com>
 <482bfea6f54ea1bb7d1ad75e03541d0ba0e5be6f.camel@intel.com>
 <64e3764e36ba7a00d94cc7db1dea1ef06b620aaf.camel@intel.com>
Subject: Re: [PATCH v13 09/35] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace
From: Sean Christopherson
To: Kai Huang
Cc: Xiaoyao Li, kvm-riscv@lists.infradead.org, mic@digikod.net,
 liam.merwick@oracle.com, Isaku Yamahata, kvm@vger.kernel.org,
 pbonzini@redhat.com, kirill.shutemov@linux.intel.com, david@redhat.com,
 linux-fsdevel@vger.kernel.org, amoorthy@google.com,
 linuxppc-dev@lists.ozlabs.org, tabba@google.com, kvmarm@lists.linux.dev,
 linux-kernel@vger.kernel.org, michael.roth@amd.com,
 viro@zeniv.linux.org.uk, oliver.upton@linux.dev,
 chao.p.peng@linux.intel.com, palmer@dabbelt.com, chenhuacai@kernel.org,
 aou@eecs.berkeley.edu, linux-mips@vger.kernel.org, mpe@ellerman.id.au,
 Vishal Annapurve, vbabka@suse.cz, mail@maciej.szmigiero.name,
 linux-riscv@lists.infradead.org, maz@kernel.org, willy@infradead.org,
 dmatlack@google.com, anup@brainfault.org, yu.c.zhang@linux.intel.com,
 Yilun Xu, qperret@google.com, brauner@kernel.org,
 isaku.yamahata@gmail.com, ackerleytng@google.com, jarkko@kernel.org,
 paul.walmsley@sifive.com, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, Wei W Wang, akpm@linux-foundation.org

On Thu, Nov 02, 2023, Kai Huang wrote:
> On Wed, 2023-11-01 at 10:36 -0700, Sean Christopherson wrote:
> > On Wed, Nov 01, 2023, Kai Huang wrote:
> > >
> > > > +7.34 KVM_CAP_MEMORY_FAULT_INFO
> > > > +------------------------------
> > > > +
> > > > +:Architectures: x86
> > > > +:Returns: Informational only, -EINVAL on direct KVM_ENABLE_CAP.
> > > > +
> > > > +The presence of this capability indicates that KVM_RUN will fill
> > > > +kvm_run.memory_fault if KVM cannot resolve a guest page fault VM-Exit, e.g. if
> > > > +there is a valid memslot but no backing VMA for the corresponding host virtual
> > > > +address.
> > > > +
> > > > +The information in kvm_run.memory_fault is valid if and only if KVM_RUN returns
> > > > +an error with errno=EFAULT or errno=EHWPOISON *and* kvm_run.exit_reason is set
> > > > +to KVM_EXIT_MEMORY_FAULT.
> > >
> > > IIUC returning -EFAULT or whatever -errno is sort of KVM internal
> > > implementation.
> >
> > The errno that is returned to userspace is ABI. In KVM, it's a _very_ poorly
> > defined ABI for the vast majority of ioctls(), but it's still technically ABI.
> > KVM gets away with being cavalier with errno because the vast majority of errors
> > are considered fatal by userspace, i.e. in most cases, userspace simply doesn't
> > care about the exact errno.
> >
> > A good example is KVM_RUN with -EINTR; if KVM were to return something other than
> > -EINTR on a pending signal or vcpu->run->immediate_exit, userspace would fall over.
> >
> > > Is it better to relax the validity of kvm_run.memory_fault when
> > > KVM_RUN returns any -errno?
> >
> > Not unless there's a need to do so, and if there is then we can update the
> > documentation accordingly. If KVM's ABI is that kvm_run.memory_fault is valid
> > for any errno, then KVM would need to purge kvm_run.exit_reason super early in
> > KVM_RUN, e.g. to prevent an -EINTR return due to immediate_exit from being
> > misinterpreted as KVM_EXIT_MEMORY_FAULT. And purging exit_reason super early is
> > subtly tricky because KVM's (again, poorly documented) ABI is that *some* exit
> > reasons are preserved across KVM_RUN with vcpu->run->immediate_exit (or with a
> > pending signal).
> >
> > https://lore.kernel.org/all/ZFFbwOXZ5uI%2Fgdaf@google.com
> >
> >
>
> Agreed with not to relax to any errno. However using -EFAULT as part of ABI
> definition seems a little bit dangerous, e.g., someone could accidentally or
> mistakenly return -EFAULT in KVM_RUN at early time and/or in a completely
> different code path, etc. -EINTR has well defined meaning, but -EFAULT (which
> is "Bad address") seems doesn't but I am not sure either. :-)

KVM has returned -EFAULT since forever, i.e. it's effectively already part of the
ABI. I doubt there's a userspace that relies precisely on -EFAULT, but userspace
definitely will be confused if KVM returns '0' where KVM used to return -EFAULT.
And so if we want to return '0', it needs to be opt-in, which means forcing
userspace to enable a capability *and* requires code in KVM to conditionally
return '0' instead of -EFAULT/-EHWPOISON.

> One example is, for backing VMA with VM_IO | VM_PFNMAP, hva_to_pfn() returns
> KVM_PFN_ERR_FAULT when the kernel cannot get a valid PFN (e.g. when SGX vepc
> fault handler failed to allocate EPC) and kvm_handle_error_pfn() will just
> return -EFAULT. If kvm_run.exit_reason isn't purged early then is it possible
> to have some issue here?

Well, yeah, but that's exactly why this series has a patch to reset exit_reason.
The solution to "if KVM is buggy then bad things happen" is to not have KVM bugs :-)
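
For illustration only, the validity rule from the quoted documentation could be
checked from userspace roughly as sketched below. This is a minimal sketch, not
code from the series: memory_fault_info_valid() and memory_fault_selfcheck() are
hypothetical helpers, and the fallback values for EHWPOISON (133) and
KVM_EXIT_MEMORY_FAULT (39) are assumptions that apply only when the installed
headers do not already define them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Fallbacks for older or non-Linux headers; the values are assumptions. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif
#ifndef KVM_EXIT_MEMORY_FAULT
#define KVM_EXIT_MEMORY_FAULT 39
#endif

/*
 * Hypothetical helper (not a KVM API): decide whether kvm_run.memory_fault
 * may be consumed after a KVM_RUN ioctl.
 *   ret         - return value of the KVM_RUN ioctl
 *   err         - errno captured immediately after the ioctl
 *   exit_reason - kvm_run.exit_reason read from the shared run structure
 */
static bool memory_fault_info_valid(int ret, int err, uint32_t exit_reason)
{
	if (ret >= 0)
		return false;	/* KVM_RUN did not fail */
	if (err != EFAULT && err != EHWPOISON)
		return false;	/* e.g. EINTR may leave an old exit_reason behind */
	return exit_reason == KVM_EXIT_MEMORY_FAULT;
}

/* Self-check covering the cases discussed in the thread. */
static void memory_fault_selfcheck(void)
{
	/* Both conditions hold: the fault info is valid. */
	assert(memory_fault_info_valid(-1, EFAULT, KVM_EXIT_MEMORY_FAULT));
	assert(memory_fault_info_valid(-1, EHWPOISON, KVM_EXIT_MEMORY_FAULT));
	/* -EINTR with a preserved exit_reason must not be misinterpreted. */
	assert(!memory_fault_info_valid(-1, EINTR, KVM_EXIT_MEMORY_FAULT));
	/* -EFAULT from an unrelated path that did not set the exit reason. */
	assert(!memory_fault_info_valid(-1, EFAULT, 0));
}
```

Requiring both the errno and the exit reason is what lets userspace tell an
annotated fault apart from an -EINTR return that preserved an earlier
exit_reason, and from an -EFAULT raised by a path that never filled
kvm_run.memory_fault.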