Received: by 2002:a05:6358:a55:b0:ec:fcf4:3ecf with SMTP id 21csp1261326rwb; Thu, 19 Jan 2023 08:27:25 -0800 (PST) X-Google-Smtp-Source: AMrXdXtMeZNY8Pj3w9ZSD8rPMd51ambnwkOkobYeYsKYMeWuOyWzUG0lAX3X6pXdSzsmO3TV689W X-Received: by 2002:a17:907:971c:b0:86f:d154:ef4f with SMTP id jg28-20020a170907971c00b0086fd154ef4fmr13397027ejc.28.1674145645544; Thu, 19 Jan 2023 08:27:25 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1674145645; cv=none; d=google.com; s=arc-20160816; b=KH/QFNhV1PrJb7Q49rdrcB83ycQp/Cl3CxN9JN0wwcB1GbwtdNdmCTs2V19luYMVxO IafTdqKdOEK8qNcao1jN/Y/utp9jq5fKCgQI31bQgdHamvTThO6dbEHUeefYLUMaOU2k BK9YYd8HbWgIJkYyu8hbchHxZw1yKrLanF/NkALtQvmpwT21+scsdWgPyppvuDhLZBgI zBbAIBI1acVjlJwHPLMFN2nlucMXppMW9oTIyJ+0WChpR6lkX8eC035LzeCsf6t0WzBZ 5xzgACpEO3b4rQoNAgKoqH7ILth06Q9T7h4W5Bo+YXbhcqj57Eg2BjoAE2SN2+9TSWVM y7Qg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=St2kBtgvxum7SrYgXTkZ6Xbz6xO368GgeLmznULzIQ4=; b=jMatuBSwfSPXPOIXJQW6RRAcZ89GTQZORhyJoK0OHCuLexSWx5f00Ijjc+feqbWUie sEWIMTR9Uq/3hXamovEm+sbnj2Mq5IHbu694KbDBnGXJId7i4rtSTqVyuy7rcm/iDpDD xBOlD63Mxe5XtQv0vBzNx1w2BUmla6dMIbYmeSMrtWbj1fvpuQAFhtsYzozOuhwvgnv4 tYyZIQpkcb/6tnwfaruDtMIlrA2OVskY1etc0XGF60lUndsCRU0OP1O1L5TsTtHLoWwL 25xaB8sKSX//gISQINEscjQX0tH3v7jaPwVa/MZXFJIIwAcd9GqbfXDVmtexEml9VQb3 +1Rg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=T9t1KZW6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i11-20020a1709064fcb00b0086cda17d40esi20846201ejw.766.2023.01.19.08.27.14; Thu, 19 Jan 2023 08:27:25 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=T9t1KZW6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229659AbjASPhs (ORCPT + 45 others); Thu, 19 Jan 2023 10:37:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229710AbjASPhn (ORCPT ); Thu, 19 Jan 2023 10:37:43 -0500 Received: from mail-pj1-x102b.google.com (mail-pj1-x102b.google.com [IPv6:2607:f8b0:4864:20::102b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DACF37E4B9 for ; Thu, 19 Jan 2023 07:37:42 -0800 (PST) Received: by mail-pj1-x102b.google.com with SMTP id v6-20020a17090ad58600b00229eec90a7fso1112490pju.0 for ; Thu, 19 Jan 2023 07:37:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=St2kBtgvxum7SrYgXTkZ6Xbz6xO368GgeLmznULzIQ4=; b=T9t1KZW6rZnUrXATe3hJpWVtiuLPXSBmSAiN3bmtyHq9LRPj86td2VCStQA6YJ3fxt iOxwOzMAevAVckpom+ueVAtT7EfzSGbozlWo2c9TbVOubp1qP0KOdFkcdYBvZ/JPnbul 4nhiJmgoAoGAKFQwVOzEV3i5evm4L9BwwW/+ZPND5qg5+MVl79as+4gz7spkkv/SwWBV qA63HJ0eJ0KcZoWLkKpKbOGS/THY5LdZIcHl/beZcfDq9THBUcmXvzzrC1Zh3ZTXRXf1 IE2b0S8w282IlybL2x1CChX51dz8pz40tXO5nZOCNcNDqZzx3dI9X8NeSKvJyB/n6K/c KDtw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=St2kBtgvxum7SrYgXTkZ6Xbz6xO368GgeLmznULzIQ4=; b=C9dDPDV3OiG3o/Y52sizjLLV9xg3uKmfEIG3pAOqt6oMAysV/U6uQw3ST8jp4VOUI/ 1fLVoPDDUcDTpTB658ygM2Rqpj6lY56p/h6atYIkpo8J4LACdfZMpUObS7kkT2Wnn3/o mFfvYPgFfOxa30L5FkuvlrpyOvB9n3XyyOGLyOgR7tAudT32pcFhgdPQ1y1+26YGbiwn RDrjMv+UbTgBTnFNHIEz3vzjYaCuU6FHhaKp47AiWR3qDNcMxObJIdYiXZ9t2iMHKL8r 9E4hKBwElw/DSwZ73GjMKjDGcxatATMlTon6E059GUp7z/5cweeBrc/MYpEVt7/hWzQW 7Kvg== X-Gm-Message-State: AFqh2koJrFO+y20wxN4M+Yu3msy1QIOBWuzdKPTahpU2RWslkGX1roWn 0+hCculcF+hcPUVIm4D5QjX0zA== X-Received: by 2002:a17:90b:274b:b0:219:f970:5119 with SMTP id qi11-20020a17090b274b00b00219f9705119mr3193015pjb.1.1674142662190; Thu, 19 Jan 2023 07:37:42 -0800 (PST) Received: from google.com (7.104.168.34.bc.googleusercontent.com. [34.168.104.7]) by smtp.gmail.com with ESMTPSA id y1-20020a17090a474100b002262dd8a39bsm3199885pjg.49.2023.01.19.07.37.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Jan 2023 07:37:41 -0800 (PST) Date: Thu, 19 Jan 2023 15:37:38 +0000 From: Sean Christopherson To: "Huang, Kai" Cc: "zhi.wang.linux@gmail.com" , "sean.j.christopherson@intel.com" , "linux-kernel@vger.kernel.org" , "Shahar, Sagi" , "isaku.yamahata@gmail.com" , "Aktas, Erdem" , "dmatlack@google.com" , "kvm@vger.kernel.org" , "Yamahata, Isaku" , "pbonzini@redhat.com" Subject: Re: [PATCH v11 018/113] KVM: TDX: create/destroy VM structure Message-ID: References: <68fa413e61d7471657174bc7c83bde5c842e251f.1673539699.git.isaku.yamahata@intel.com> <20230113151258.00006a6d@gmail.com> <20230114111621.00001840@gmail.com> <20230117214414.00003229@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jan 19, 2023, Huang, Kai wrote: > On Tue, 2023-01-17 at 21:01 +0000, Sean Christopherson wrote: > > On Tue, Jan 17, 2023, Sean Christopherson wrote: > > > On Tue, Jan 17, 2023, Zhi Wang wrote: > > > > 2) As TDX module doesn't provide contention-and-wait, I guess the following > > > > approach might have been discussed when designing this "retry". > > > > > > > > KERNEL TDX MODULE > > > > > > > > SEAMCALL A -> PATH A: Taking locks > > > > > > > > SEAMCALL B -> PATH B: Contention on a lock > > > > > > > > <- Return "operand busy" > > > > > > > > SEAMCALL B -| > > > > | <- Wait on a kernel waitqueue > > > > SEAMCALL B <-| > > > > > > > > SEAMCALL A <- PATH A: Return > > > > > > > > SEAMCALL A -| > > > > | <- Wake up the waitqueue > > > > SEMACALL A <-| > > > > > > > > SEAMCALL B -> PATH B: Taking the locks > > > > ... > > > > > > > > Why not this scheme wasn't chosen? > > > > > > AFAIK, I don't think a waitqueue approach as ever been discussed publicly. Intel > > > may have considered the idea internally, but I don't recall anything being proposed > > > publically (though it's entirely possible I just missed the discussion). > > > > > > Anways, I don't think a waitqueue would be a good fit, at least not for S-EPT > > > management, which AFAICT is the only scenario where KVM does the arbitrary "retry > > > X times and hope things work". If the contention occurs due to the TDX Module > > > taking an S-EPT lock in VM-Enter, then KVM won't get a chance to do the "Wake up > > > the waitqueue" action until the next VM-Exit, which IIUC is well after the TDX > > > Module drops the S-EPT lock. In other words, immediately retrying and then punting > > > the problem further up the stack in KVM does seem to be the least awful "solution" > > > if there's contention. > > > > Oh, the other important piece I forgot to mention is that dropping mmu_lock deep > > in KVM's MMU in order to wait isn't always an option. Most flows would play nice > > with dropping mmu_lock and sleeping, but some paths, e.g. from the mmu_notifier, > > (conditionally) disallow sleeping. > > Could we do something similar to tdp_mmu_iter_cond_resched() but not simple busy > retrying "X times", at least at those paths that can release mmu_lock()? That's effectively what happens by unwinding up the stak with an error code. Eventually the page fault handler will get the error and retry the guest. > Basically we treat TDX_OPERAND_BUSY as seamcall_needbreak(), similar to > rwlock_needbreak(). I haven't thought about details though. I am strongly opposed to that approach. I do not want to pollute KVM's MMU code with a bunch of retry logic and error handling just because the TDX module is ultra paranoid and hostile to hypervisors. The problematic scenario of faulting indefinitely on a single instruction should never happen under normal circumstances, and so KVM should treat such scenarios as attacks/breakage and pass the buck to userspace.