Date: Fri, 13 Oct 2023 11:45:39 -0700
References: <20230914015531.1419405-8-seanjc@google.com> <117db856-9aec-e91c-b1d4-db2b90ae563d@intel.com>
Subject: Re: [RFC PATCH v12 07/33] KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace
From: Sean Christopherson
To: David Matlack
Cc: Anish Moorthy, Xiaoyao Li, Paolo Bonzini, Marc Zyngier, Oliver Upton,
 Huacai Chen, Michael Ellerman, Anup Patel, kvm@vger.kernel.org,
 kvmarm@lists.linux.dev, kvm-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, Chao Peng, Fuad Tabba, Jarkko Sakkinen,
 Yu Zhang, Isaku Yamahata, Xu Yilun, Vlastimil Babka, Vishal Annapurve,
 Ackerley Tng, Maciej Szmigiero, David Hildenbrand, Quentin Perret,
 Michael Roth, Wang, Liam Merwick, Isaku Yamahata

On Tue, Oct 10, 2023, David Matlack wrote:
> On Thu, Oct 5, 2023 at 3:46 PM Sean Christopherson wrote:
> >
> > On Thu, Oct 05, 2023, Anish Moorthy wrote:
> > > On Tue, Oct 3, 2023 at 4:46 PM Sean Christopherson wrote:
> > > >
> > > > The only way a KVM_EXIT_MEMORY_FAULT that actually reaches userspace could be
> > > > "unreliable" is if something other than a memory_fault exit clobbered the union,
> > > > but didn't signal its KVM_EXIT_* reason. And that would be an egregious bug that
> > > > isn't unique to KVM_EXIT_MEMORY_FAULT, i.e. the same data corruption would affect
> > > > each and every other KVM_EXIT_* reason.
> > >
> > > Keep in mind the case where an "unreliable" annotation sets up a
> > > KVM_EXIT_MEMORY_FAULT, KVM_RUN ends up continuing, then something
> > > unrelated comes up and causes KVM_RUN to EFAULT. Although this at
> > > least is a case of "outdated" information rather than blatant
> > > corruption.
> >
> > Drat, I managed to forget about that.
> >
> > > IIRC the last time this came up we said that there's minimal harm in
> > > userspace acting on the outdated info, but it seems like another good
> > > argument for just restricting the annotations to paths we know are
> > > reliable. What if the second EFAULT above is fatal (as I understand
> > > all are today) and sets up subsequent KVM_RUNs to crash and burn
> > > somehow? Seems like that'd be a safety issue.
> >
> > For your series, let's omit
> >
> >   KVM: Annotate -EFAULTs from kvm_vcpu_read/write_guest_page
> >
> > and just fill memory_fault for the page fault paths. That will be easier to
> > document too since we can simply say that if the exit reason is KVM_EXIT_MEMORY_FAULT,
> > then run->memory_fault is valid and fresh.
>
> +1
>
> And from a performance perspective, I don't think we care about
> kvm_vcpu_read/write_guest_page(). Our (Google) KVM Demand Paging
> implementation just sends any kvm_vcpu_read/write_guest_page()
> requests through the netlink socket, which is just a poor man's
> userfaultfd. So I think we'll be fine sending these callsites through
> uffd instead of exiting out to userspace.
>
> And with that out of the way, is there any reason to keep tying
> KVM_EXIT_MEMORY_FAULT to -EFAULT? As mentioned in the patch at the top
> of this thread, -EFAULT is just a hack to allow the emulator paths to
> return out to userspace. But that's no longer necessary.

Not forcing '0' makes handling other error codes simpler, e.g. if the memory is
poisoned, KVM can simply return -EHWPOISON instead of having to add a flag to
run->memory_fault[*].

KVM would also have to make returning '0' instead of -EFAULT conditional based
on a capability being enabled.

And again, committing to returning '0' will make it all but impossible to extend
KVM_EXIT_MEMORY_FAULT beyond the page fault handlers.
Well, I suppose we could have the top level kvm_arch_vcpu_ioctl_run() do

	if (r == -EFAULT && vcpu->kvm->enable_memory_fault_exits &&
	    kvm_run->exit_reason == KVM_EXIT_MEMORY_FAULT)
		r = 0;

but that's quite gross IMO.

> I just find it odd that some KVM_EXIT_* correspond with KVM_RUN returning an
> error and others don't.

FWIW, there is already precedent for run->exit_reason being valid with a non-zero
error code. E.g. KVM selftests rely on run->exit_reason being preserved when
forcing an immediate exit, which returns -EINTR, not '0'.

	if (kvm_run->immediate_exit) {
		r = -EINTR;
		goto out;
	}

And pre-immediate_exit code that relies on signalling vCPUs is even more explicit
in setting exit_reason with a non-zero errno:

	if (signal_pending(current)) {
		r = -EINTR;
		kvm_run->exit_reason = KVM_EXIT_INTR;
		++vcpu->stat.signal_exits;
	}

I agree that -EFAULT with KVM_EXIT_MEMORY_FAULT *looks* a little odd, but IMO the
existing KVM behavior of returning '0' is actually what's truly odd. E.g. returning
'0' + KVM_EXIT_MMIO if the guest accesses non-existent memory is downright weird.
KVM_RUN should arguably never return '0', because it can never actually completely
succeed.

> The exit_reason is sufficient to tell userspace what's going on and has a
> firm contract, unlike -EFAULT which anything KVM calls into can return.

Eh, I don't think it lessens the contract in a meaningful way. KVM is still
contractually obligated to fill run->exit_reason when KVM returns '0', and
userspace will still likely terminate the VM on an undocumented EFAULT/EHWPOISON.

E.g. if KVM has a bug and doesn't return KVM_EXIT_MEMORY_FAULT when handling a
page fault, then odds are very good that the bug would result in KVM returning a
"bare" -EFAULT regardless of whether KVM_EXIT_MEMORY_FAULT is paired with '0' or
-EFAULT.

[*] https://lore.kernel.org/all/ZQHzVOIsesTTysgf@google.com
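
For illustration only, here is a rough sketch of how a userspace run loop might
interpret KVM_RUN's result under the contract argued for above, i.e. a non-zero
errno does not invalidate run->exit_reason. It is not taken from any VMM or from
the series. The DEMO_* names and demo_classify_run() are hypothetical, the exit
reasons are placeholders for the KVM_EXIT_* values in <linux/kvm.h>, and the
chosen actions are an assumption about what a VMM would want to do.

```c
#include <errno.h>

/* EHWPOISON is Linux-specific; the fallback value is an assumption for non-Linux builds. */
#ifndef EHWPOISON
#define EHWPOISON 133
#endif

/*
 * Placeholder exit reasons for illustration only. Real code would compare
 * run->exit_reason against the KVM_EXIT_* constants from <linux/kvm.h>.
 */
enum demo_exit_reason {
	DEMO_EXIT_OTHER,
	DEMO_EXIT_INTR,
	DEMO_EXIT_MEMORY_FAULT,
};

/* Hypothetical actions a run loop might take after KVM_RUN returns. */
enum demo_action {
	DEMO_HANDLE_EXIT,	/* ret == 0: dispatch on exit_reason as usual */
	DEMO_RETRY,		/* interrupted: re-enter the guest */
	DEMO_RESOLVE_FAULT,	/* annotated -EFAULT: memory_fault payload is usable */
	DEMO_POISONED,		/* annotated -EHWPOISON: memory_fault payload is usable */
	DEMO_FATAL,		/* "bare" errno with no annotation: likely terminate the VM */
};

/*
 * ret and err mimic the return value of ioctl(KVM_RUN) and the resulting
 * errno; exit_reason mimics run->exit_reason as left behind by KVM.
 */
enum demo_action demo_classify_run(int ret, int err,
				   enum demo_exit_reason exit_reason)
{
	if (ret == 0)
		return DEMO_HANDLE_EXIT;

	if (err == EINTR)
		return DEMO_RETRY;

	/* The errno alone is ambiguous; the exit reason is what annotates it. */
	if (exit_reason == DEMO_EXIT_MEMORY_FAULT) {
		if (err == EFAULT)
			return DEMO_RESOLVE_FAULT;
		if (err == EHWPOISON)
			return DEMO_POISONED;
	}

	return DEMO_FATAL;
}
```

A run loop would call this right after KVM_RUN with the ioctl's return value,
errno, and a mapping of run->exit_reason. Note that the poisoned case needs no
extra flag in the payload because the errno carries that information.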