Received: by 2002:a05:6a10:af89:0:0:0:0 with SMTP id iu9csp1163775pxb; Fri, 21 Jan 2022 11:12:12 -0800 (PST) X-Google-Smtp-Source: ABdhPJyZtGA3T9b2AYB0eZgovslnHpJEsjgk1iFBmtT8hSa2fx3PNCF0gST2Xy8PqkNZ4s2KEOTZ X-Received: by 2002:a17:902:a50f:b0:149:bc1a:2c98 with SMTP id s15-20020a170902a50f00b00149bc1a2c98mr4966376plq.35.1642792331888; Fri, 21 Jan 2022 11:12:11 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1642792331; cv=none; d=google.com; s=arc-20160816; b=VTjlk0hhmC1Q/Zqt8yF9sn824LjV/lI05GnhAtDQcmITTMd33Np/PX88OlJPubgtLz 8btC7slvJx8gz0mQYcdcg6GYFa3E0Qu35JHhlM2elxwm5cVAGgCdAmmcQLy8OaTbRJtl zk1ubvn4CFDc56s1XYuvCJuocEb0mOZJHlDUZ+FE1ENBCehOgNgNuvbLvNTAdt8f3xSX UYqMyqHMeKN5AZMvbwEXOhOa0ICVtkxR8MkLwlwxfZhgrhYwCA+5BTQE6gp+ACyptCMA vLGNcLvY6p0z4ivCL4PJyH5oDKkFORVG10PEquGSHGECXOPUabNipEjMmMl3oMhaNNdZ eltQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=ljARmyxShU4zjUJ1U/PzDnkyeSKxC6end7kWkGe87W0=; b=oBi0ePd15hoRv1YxgCxRrsj+HsGq0kpHFIytK5YUFhj92C4HhNa2776NXZPju/L7H8 wFVGCYfm2Jg43JwM+kKGKTNoXjU0/GzlBK7KcPdnsG4Ztw7Mh8ENyhA68WY43K3WXdzF 0NljQeML+9OOKVR6g4FxRQwrZntcHVJXkTPbU9+oK2j4WrhiuoyWM+ybUZLgt7/Q+Cbn ObCukImXV49i0A34E5aAI3TSg1+Z0wwyfBBEE4xynLrnM5+S1RJlOLd09pRokDPMGa2G EIWBUfExYa5H/g54U0ts2YEFyM2rlMDndWj0W38HpJpfI9x/YyBuC8M5rMzAQdOg9I0t 3glw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id y13si2691674pfe.380.2022.01.21.11.11.59; Fri, 21 Jan 2022 11:12:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1354308AbiASLuK (ORCPT + 99 others); Wed, 19 Jan 2022 06:50:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233640AbiASLuF (ORCPT ); Wed, 19 Jan 2022 06:50:05 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 35FBDC061574 for ; Wed, 19 Jan 2022 03:50:05 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id CB60A61647 for ; Wed, 19 Jan 2022 11:50:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C0C78C004E1; Wed, 19 Jan 2022 11:50:01 +0000 (UTC) Date: Wed, 19 Jan 2022 11:49:58 +0000 From: Catalin Marinas To: Shameerali Kolothum Thodi Cc: "linux-arm-kernel@lists.infradead.org" , "kvmarm@lists.cs.columbia.edu" , "linux-kernel@vger.kernel.org" , "maz@kernel.org" , "will@kernel.org" , "james.morse@arm.com" , "julien.thierry.kdev@gmail.com" , "suzuki.poulose@arm.com" , "jean-philippe@linaro.org" , "Alexandru.Elisei@arm.com" , "qperret@google.com" , Jonathan Cameron , Linuxarm Subject: Re: [PATCH v4 0/4] kvm/arm: New VMID allocator based on asid Message-ID: References: <20211122121844.867-1-shameerali.kolothum.thodi@huawei.com> <207f800d1a67427a9771ffb06086365b@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <207f800d1a67427a9771ffb06086365b@huawei.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 19, 2022 at 09:23:31AM +0000, Shameerali Kolothum Thodi wrote: > > On Mon, Nov 22, 2021 at 12:18:40PM +0000, Shameer Kolothum wrote: > > > ?-TLA+ model. Modified the asidalloc model to incorporate the new > > > VMID algo. The main differences are, > > > ? -flush_tlb_all() instead of local_tlb_flush_all() on rollover. > > > ? -Introduced INVALID_VMID and vCPU Sched Out logic. > > > ? -No CnP (Removed UniqueASIDAllCPUs & UniqueASIDActiveTask invariants). > > > ? -Removed ?UniqueVMIDPerCPU invariant for now as it looks like > > > because of the speculative fetching with flush_tlb_all() there > > > is a small window where this gets triggered. If I change the > > > logic back to local_flush_tlb_all(), UniqueVMIDPerCPU seems to > > > be fine. With my limited knowledge on TLA+ model, it is not > > > clear to me whether this is a problem with the above logic > > > or the VMID model implementation. Really appreciate any help > > > with the model. > > > The initial VMID TLA+ model is here, > > > https://github.com/shamiali2008/kernel-tla/tree/private-vmidalloc-v1 > > > > I only had a brief look at the TLA+ model and I don't understand why you > > have a separate 'shed_out' process. It would run in parallel with the > > 'sched' but AFAICT you can't really schedule a guest out while you are > > in the middle of scheduling it in. I'd rather use the same 'sched' > > process and either schedule in an inactive task or schedule out an > > active one for a given CPU. > > > > Also active_vmids[] for example is defined on the CPUS domain but you > > call vcpu_sched_out() from a process that's not in the CPUS domain but > > the SCHED_OUT one. > > Many thanks for taking a look. My bad!. The 'sched_out' would indeed run in parallel > and defeat the purpose. I must say I was really confused by the TLA+ syntax and > is still not very confident about it. Yeah, it can be confusing. If you have time, you could give CBMC a try and the 'spec' would be pretty close to your C version. Each CPU would be modelled as a thread with an extra thread that simulates the speculative TLB look-ups for all CPUS together with the asserts for the invariants. The spinlocks would be pthread_mutexes. > Based on the above suggestion, I have modified it as below, > > \* vCPU is scheduled out by KVM > macro vcpu_sched_out() { > active_kvm[self].task := 0; > active_vmids[self] := INVALID_VMID; > } Could you call cpu_switch_kvm(0, INVALID_VMID) instead? You could do this directly below and avoid another macro. Well, whatever you find clearer. What confuses me is that your INVALID_VMID looks like a valid one: vmid 0, generation 1. Do you ensure that you never allocate VMID 0? > \* About to run a Guest VM > process (sched \in CPUS) > { > sched_loop: > while (TRUE) { > with (t \in TASKS) { > if (t # ActiveTask(self)) > call kvm_arm_vmid_update(t); > else > vcpu_sched_out(); > } > } > } Yes, that's what I meant. > > The corresponding > > UniqueASIDPerCPU meant that for any two TLB entries on a single CPU, if > > they correspond to different tasks (pgd), they should have different > > ASIDs. That's a strong requirement, otherwise we end up with the wrong > > translation. > > Yes, I understand that it is a strong requirement. Also, I thought this is something > that will trigger easily with the test setup I had with the real hardware. But testing > stayed on for days, so I was not sure it is a problem with the TLA+ implementation > or not. Well, you'd have to check the TLA+ state trace and see how it got there, whether the last state would be a valid one. It's either something missing in the spec that the hardware enforces or the algorithm is wrong and just hard to hit in practice. If this condition is violated in hardware for a very brief period, e.g. due to some TLBI, you'd not notice an issue under normal circumstances. But it's still incorrect. > > Why did you remove the CnP? Do we have this disabled for KVM guests? > > I removed CnP related Invariants to simplify things for the first version. Also not sure > what specific changes we need to do for CnP here. Do we still need that switching to > global pg_dir to prevent any speculative reading? I didn't see that being done in KVM > anywhere at the moment. Maybe I am missing something. It make sense to ignore CnP for now. Maybe KVM doesn't even bother and sets VTTBR_EL2.CnP to 0 (I haven't checked). > On a side note, In my setup, the CnP=TRUE case for asidalloc.tla now fails with, > "Error: Invariant TLBEmptyInvalPTE is violated.". Please check. This was added later as part of try-to-unmap and I only checked this with CnP = FALSE. I'll need to look into, it's possible that flush_tlb_all() doesn't take into account that the pte is unmapped (as cpu_switch_mm() does). I'll add a separate thread for speculative TLB loads, it's easier to have them in one place. Thanks. -- Catalin