Received: by 2002:a25:824b:0:0:0:0:0 with SMTP id d11csp200186ybn; Thu, 3 Oct 2019 03:54:41 -0700 (PDT) X-Google-Smtp-Source: APXvYqwpZIFEQnMMqoeZH3PlwemTkL8m7ELVWKnSKsxGx1CmN1YQPUZVexwZJAU3CUBZ2fFD2cd4 X-Received: by 2002:a50:dac2:: with SMTP id s2mr8838013edj.26.1570100081184; Thu, 03 Oct 2019 03:54:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570100081; cv=none; d=google.com; s=arc-20160816; b=f4POau0eQg6Rj9szNpAp8kAn/FU75AaCXG4MpqqPWTPEU9zSECcmvh64xUNWJGrxiu MaR3aLNd5m9N7E9IIH8ExRRGA3jKB9e6Murz7PCzHhgYCVKUj+4zOgRFujXXZx9dWpdo ZqrbiSus/nghohXgWdokcknlWlMsQtGLcwECPGA4VkOw7V/RsiX8uugj/8ZXwkKnjJiI 9sRIeAP6kersjWPsdhRb8695LMI3Yz6FOlLZGygmv60XdlgoFEo7tvTWsolCWRAMXdAa VNJ3NwQkB5glHb/7bzr70dl4ro560wtLMT3I2UjaEICfrWHTZZbUr9NkHlFezxUU9BHy 0+SA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:date:references :in-reply-to:subject:cc:to:from; bh=GffyjIVlPnvsrRSrHxajUWEZWrzULU9iPsHRwjh280A=; b=WefvnzXYCCwXgWx2LpcBqyx7Q7nEL2SrP1tEQoJIbESpiRXGdFuSLqMJ708jK1uc/X PF7t4uu6R0UoaoYVy8obn5E6H9nl7gT4yh5LyVKPI4UuSNYu7dTOtKx/zD43v4tk5Flv wDzST1U+0/LDGBrO2ASmvO9eftejBdd6PlzBXOBAEzuR/9u0scfRlUP4QwjlfLze178Z gTNVMZhIVtn3kwzYcAFLDYqkFRKZUT//1yKDpNlpcUj3wPSAfY1pVfIVJk5DCykBlkZ7 Xn/Yqq7eSbWC3rxTqlu58p3SGuXI1UJ1jrPw0s2J0OgQS17X9ezI+Va8GbRMpm/xSKSt ++Fw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c9si1081768eds.211.2019.10.03.03.54.16; Thu, 03 Oct 2019 03:54:41 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729574AbfJCKyJ (ORCPT + 99 others); Thu, 3 Oct 2019 06:54:09 -0400 Received: from mx1.redhat.com ([209.132.183.28]:45674 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729241AbfJCKyI (ORCPT ); Thu, 3 Oct 2019 06:54:08 -0400 Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6E1A587638 for ; Thu, 3 Oct 2019 10:54:07 +0000 (UTC) Received: by mail-wm1-f69.google.com with SMTP id n3so976506wmf.3 for ; Thu, 03 Oct 2019 03:54:07 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:in-reply-to:references:date :message-id:mime-version; bh=GffyjIVlPnvsrRSrHxajUWEZWrzULU9iPsHRwjh280A=; b=K49/nF1+FqGJ0B0VgBaCZH73m9vv+XvOMqdhUyBTGUcsuRmi6QY4G5bV/AMIbxjFLT jexUqmTu9uIQjg4gjQaLV+mvfq4mfLV+bHk85rNXmCZM5M3mWhWfaeoxmScQXtonqB48 r1kR3niizYjJa70gxXpUEnSxwpUgTfWzJcm8ETOM+UqrCYunrAUBjX+DXLTM9Lf8NAyK h+oBBlVA9BfyLtU0kPCfXHZjJjQQ381F2Jbpb/CgRDBxOMrK3iOKx/4HZ8YTmIzGEtZT Q1NDOBQaiIWlrzC6/xWoO6c38OU63G8sEHHbNhggVMl9Ux+QDi/3CkWBPmBDNWb9j+h2 G65g== X-Gm-Message-State: APjAAAVoDlqSwo6Q8dMDH009+5Ntw4O8c8m1selUKqGcFNI0ymkCyWMB qhMRcvlL1aneaf0Ed4224WLTHkAvlU7RqP6NH1uIN+DW1aVHtmsdVGcpDf5nRnUqZ989AK6zEk0 62urof7NB/j4n190/lxu9y8eE X-Received: by 2002:a5d:4ed0:: with SMTP id s16mr6708222wrv.248.1570100045802; Thu, 03 Oct 2019 03:54:05 -0700 (PDT) X-Received: by 2002:a5d:4ed0:: with SMTP id s16mr6708199wrv.248.1570100045516; Thu, 03 Oct 2019 03:54:05 -0700 (PDT) Received: from vitty.brq.redhat.com (nat-pool-brq-t.redhat.com. [213.175.37.10]) by smtp.gmail.com with ESMTPSA id z5sm3892422wrs.54.2019.10.03.03.54.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Oct 2019 03:54:04 -0700 (PDT) From: Vitaly Kuznetsov To: Roman Kagan Cc: "kvm\@vger.kernel.org" , Michael Kelley , Lan Tianyu , Joerg Roedel , "K. Y. Srinivasan" , Haiyang Zhang , Stephen Hemminger , Sasha Levin , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , "x86\@kernel.org" , "linux-hyperv\@vger.kernel.org" , "linux-kernel\@vger.kernel.org" Subject: Re: [PATCH v2] x86/hyperv: make vapic support x2apic mode In-Reply-To: <20191002101923.4981-1-rkagan@virtuozzo.com> References: <20191002101923.4981-1-rkagan@virtuozzo.com> Date: Thu, 03 Oct 2019 12:54:03 +0200 Message-ID: <87muei14ms.fsf@vitty.brq.redhat.com> MIME-Version: 1.0 Content-Type: text/plain Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Roman Kagan writes: > Now that there's Hyper-V IOMMU driver, Linux can switch to x2apic mode > when supported by the vcpus. > > However, the apic access functions for Hyper-V enlightened apic assume > xapic mode only. > > As a result, Linux fails to bring up secondary cpus when run as a guest > in QEMU/KVM with both hv_apic and x2apic enabled. > > I didn't manage to make my instance of Hyper-V expose x2apic to the > guest; nor does Hyper-V spec document the expected behavior. However, > a Windows guest running in QEMU/KVM with hv_apic and x2apic and a big > number of vcpus (so that it turns on x2apic mode) does use enlightened > apic MSRs passing unshifted 32bit destination id and falls back to the > regular x2apic MSRs for less frequently used apic fields. > > So implement the same behavior, by replacing enlightened apic access > functions (only those where it makes a difference) with their > x2apic-aware versions when x2apic is in use. > > Fixes: 29217a474683 ("iommu/hyper-v: Add Hyper-V stub IOMMU driver") > Fixes: 6b48cb5f8347 ("X86/Hyper-V: Enlighten APIC access") > Cc: stable@vger.kernel.org > Signed-off-by: Roman Kagan > --- > v1 -> v2: > - add ifdefs to handle !CONFIG_X86_X2APIC > > arch/x86/hyperv/hv_apic.c | 54 ++++++++++++++++++++++++++++++++++++--- > 1 file changed, 51 insertions(+), 3 deletions(-) > > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c > index 5c056b8aebef..eb1434ae9e46 100644 > --- a/arch/x86/hyperv/hv_apic.c > +++ b/arch/x86/hyperv/hv_apic.c > @@ -84,6 +84,44 @@ static void hv_apic_write(u32 reg, u32 val) > } > } > > +#ifdef CONFIG_X86_X2APIC > +static void hv_x2apic_icr_write(u32 low, u32 id) > +{ > + wrmsr(HV_X64_MSR_ICR, low, id); > +} AFAIU you're trying to mirror native_x2apic_icr_write() here but this is different from what hv_apic_icr_write() does (SET_APIC_DEST_FIELD(id)). Is it actually correct? (I think you've tested this and it is but) Michael, could you please shed some light here? > + > +static u32 hv_x2apic_read(u32 reg) > +{ > + u32 reg_val, hi; > + > + switch (reg) { > + case APIC_EOI: > + rdmsr(HV_X64_MSR_EOI, reg_val, hi); > + return reg_val; > + case APIC_TASKPRI: > + rdmsr(HV_X64_MSR_TPR, reg_val, hi); > + return reg_val; > + > + default: > + return native_apic_msr_read(reg); > + } > +} > + > +static void hv_x2apic_write(u32 reg, u32 val) > +{ > + switch (reg) { > + case APIC_EOI: > + wrmsr(HV_X64_MSR_EOI, val, 0); > + break; > + case APIC_TASKPRI: > + wrmsr(HV_X64_MSR_TPR, val, 0); > + break; > + default: > + native_apic_msr_write(reg, val); > + } > +} > +#endif /* CONFIG_X86_X2APIC */ > + > static void hv_apic_eoi_write(u32 reg, u32 val) > { > struct hv_vp_assist_page *hvp = hv_vp_assist_page[smp_processor_id()]; > @@ -262,9 +300,19 @@ void __init hv_apic_init(void) > if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) { > pr_info("Hyper-V: Using MSR based APIC access\n"); > apic_set_eoi_write(hv_apic_eoi_write); > - apic->read = hv_apic_read; > - apic->write = hv_apic_write; > - apic->icr_write = hv_apic_icr_write; > +#ifdef CONFIG_X86_X2APIC > + if (x2apic_enabled()) { > + apic->read = hv_x2apic_read; > + apic->write = hv_x2apic_write; > + apic->icr_write = hv_x2apic_icr_write; > + } else { > +#endif > + apic->read = hv_apic_read; > + apic->write = hv_apic_write; > + apic->icr_write = hv_apic_icr_write; (just wondering): Is it always safe to assume that we cannot switch between apic_flat/x2apic in runtime? Moreover, the only difference between hv_apic_read/hv_apic_write and hv_x2apic_read/hv_x2apic_write is native_apic_mem_{read,write} -> native_apic_msr_{read,write}. Would it make sense to move if (x2apic_enabled()) and merge these functions? > +#ifdef CONFIG_X86_X2APIC > + } > +#endif > apic->icr_read = hv_apic_icr_read; > } > } -- Vitaly