Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp1966725imu; Wed, 21 Nov 2018 04:53:50 -0800 (PST) X-Google-Smtp-Source: AFSGD/VYZGaCQb4g6kch2QQIswQA8L/e8StBpkTVFcuJGNUDkS3CB/O7jfmaddSNvi4OlWUCtjZ+ X-Received: by 2002:a17:902:42e4:: with SMTP id h91mr6811528pld.18.1542804830211; Wed, 21 Nov 2018 04:53:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1542804830; cv=none; d=google.com; s=arc-20160816; b=OQfqmDc48LhbR5gN5pORXgCPIY+kNIv/1xCjRFnTmeDXRmnsZy/qPUWUraOYWF3oDS me9jMZEQjKXMSete7Bh/LFyrbcEVjf6cw0Bjb4HZcshxpwhI8jrVdcYRkD0fr0wIAkAx JDaMOlr7oUUv9i/DwNEBD8jfjEIotrwaTOB06lMIaw+afjk9FCdY5aY1aDcrz50V3C/z DEx4uvzMwUF0rC02wxGZSnx9OqnGoZnOwWNIqEN451+y+02NYGrspVowaEYulTChQrSD b7bDPAfUYh00yGTSza1Y14RuIb1POnnBWFc1zlDS89lUEGO0hXuEP7lSIAox4EmT8aUX AjYw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=GaFePR4OVO8mps3YAHh7GP6Cy/JKdbadwaW1BE7Fask=; b=n0cpv82oqGqDBZeuqvn+Jao0hWrci+sEFZMvJOu57D01xwDpTR8z/uMtgoFuoDcJYe iDaMbYYyArUV3hYwZYyBqbwvEB5/XhFM+mQ7AUff1XEo3R87bsQ3SZZVAJQJLv1YBErJ iB3hnLJigRI7pk1zh8DACpt5Lfor5wy1W0V+gy/iZCUbfO/GXYJuMx3EJior7+ffa8jM k9+kptfSMnWfBILw0+vaVBCYN5VTaQo2p7Dcwp50m53P/9LLJwemFQxItmF7/eQznhuj kpgvS3mPcEpDqqVXUb1h54tU0GM87azgVSWobXiSkJuOaa1WGoyy4tbHl5G09o7srJjA j/hg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y12si16733789pgf.527.2018.11.21.04.53.34; Wed, 21 Nov 2018 04:53:50 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729986AbeKUVk0 (ORCPT + 99 others); Wed, 21 Nov 2018 16:40:26 -0500 Received: from foss.arm.com ([217.140.101.70]:46992 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729710AbeKUVkU (ORCPT ); Wed, 21 Nov 2018 16:40:20 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8D6E515AB; Wed, 21 Nov 2018 03:06:22 -0800 (PST) Received: from localhost (e113682-lin.copenhagen.arm.com [10.32.144.41]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 252363F5A0; Wed, 21 Nov 2018 03:06:21 -0800 (PST) Date: Wed, 21 Nov 2018 12:06:20 +0100 From: Christoffer Dall To: peng.hao2@zte.com.cn Cc: marc.zyngier@arm.com, andre.przywara@arm.com, linux-kernel@vger.kernel.org, leif.lindholm@linaro.org, ard.bieshseuvel@linaro.org, kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org Subject: Re: [PATCH] kvm: arm/arm64 : fix vm's hanging at startup time Message-ID: <20181121110620.GC18678@e113682-lin.lund.arm.com> References: <01746207-c9bb-dc91-58d7-a66c0d971f88@arm.com> <201811211656540883310@zte.com.cn> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201811211656540883310@zte.com.cn> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, On Wed, Nov 21, 2018 at 04:56:54PM +0800, peng.hao2@zte.com.cn wrote: > >On 19/11/2018 09:10, Mark Rutland wrote: > >> On Sat, Nov 17, 2018 at 10:58:37AM +0800, peng.hao2@zte.com.cn wrote: > >>>> On 16/11/18 00:23, peng.hao2@zte.com.cn wrote: > >>>>>> Hi, > >>>>>>> When virtual machine starts, hang up. > >>>>>> > >>>>>> I take it you mean the *guest* hangs? Because it doesn't get a timer > >>>>>> interrupt? > >>>>>> > >>>>>>> The kernel version of guest > >>>>>>> is 4.16. Host support vgic_v3. > >>>>>> > >>>>>> Your host kernel is something recent, I guess? > >>>>>> > >>>>>>> It was mainly due to the incorrect vgic_irq's(intid=27) group value > >>>>>>> during injection interruption. when kvm_vgic_vcpu_init is called, > >>>>>>> dist is not initialized at this time. Unable to get vgic V3 or V2 > >>>>>>> correctly, so group is not set. > >>>>>> > >>>>>> Mmh, that shouldn't happen with (v)GICv3. Do you use QEMU (which > >>>>>> version?) or some other userland tool? > >>>>>> > >>>>> > >>>>> QEMU emulator version 3.0.50 . > >>>>> > >>>>>>> group is setted to 1 when vgic_mmio_write_group is invoked at some > >>>>>>> time. > >>>>>>> when irq->group=0 (intid=27), No ICH_LR_GROUP flag was set and > >>>>>>> interrupt injection failed. > >>>>>>> > >>>>>>> Signed-off-by: Peng Hao > >>>>>>> --- > >>>>>>> virt/kvm/arm/vgic/vgic-v3.c | 2 +- > >>>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) > >>>>>>> > >>>>>>> diff --git a/virt/kvm/arm/vgic/vgic-v3.c b/virt/kvm/arm/vgic/vgic-v3.c > >>>>>>> index 9c0dd23..d101000 100644 > >>>>>>> --- a/virt/kvm/arm/vgic/vgic-v3.c > >>>>>>> +++ b/virt/kvm/arm/vgic/vgic-v3.c > >>>>>>> @@ -198,7 +198,7 @@ void vgic_v3_populate_lr(struct kvm_vcpu *vcpu, > >>>>>>> struct vgic_irq *irq, int lr) if (vgic_irq_is_mapped_level(irq) && > >>>>>>> (val & ICH_LR_PENDING_BIT)) irq->line_level = false; > >>>>>>> > >>>>>>> - if (irq->group) > >>>>>>> + if (model == KVM_DEV_TYPE_ARM_VGIC_V3) > >>>>>> > >>>>>> This is not the right fix, not only because it basically reverts the > >>>>>> GICv3 part of 87322099052 (KVM: arm/arm64: vgic: Signal IRQs using > >>>>>> their configured group). > >>>>>> > >>>>>> Can you try to work out why kvm_vgic_vcpu_init() is apparently called > >>>>>> before dist->vgic_model is set, also what value it has? > >>>>>> If I understand the code correctly, that shouldn't happen for a GICv3. > >>>>>> > >>>>> Even if the value of group is correctly assigned in kvm_vgic_vcpu_init, the group is then written 0 through vgic_mmio_write_group. > >>>>> If the interrupt comes at this time, the interrupt injection fails. > >>>> > >>>> Does that mean that the guest is configuring its interrupts as Group0? > >>>> That sounds wrong, Linux should configure all it's interrupts as > >>>> non-secure group1. > >>> > >>> no, I think that uefi dose this, not linux. > >>> 1. kvm_vgic_vcpu_init > >>> 2. vgic_create > >>> 3. kvm_vgic_dist_init > >>> 4.vgic_mmio_write_group: uefi as guest, write group=0 > >>> 5.vgic_mmio_write_group: linux as guest, write group=1 > >> > >> Is this the same issue fixed by EDK2 commit: > >> > >> 66127011a544b90e ("ArmPkg/ArmGicDxe ARM: fix encoding for GICv3 interrupt acknowledge") > >> > >> ... where EDK2 would try to use IAR0 rather than IAR1? > >> > >> The commit messages notes this lead to a boot-time hang. > > > >I managed to trigger an issue with a really old EFI implementation that > >doesn't configure its interrupts as Group1, and yet tries to ACK its > >interrupts using the Group1 accessor. Guess what? It is not going to work. > > > >Commit c7fefb690661f2e38afcb8200bd318ecf38ab961 in the edk2 tree seems > >to be the fix (I only assume it does, I haven't actually checked). A > >recent build, as found in Debian Buster, works perfectly (tested with > >both QEMU v2.12 and tip of tree). > > > >Now, I really don't get what you're saying about Linux not getting > >interrupts. How do you get to booting Linux if EFI is not making any > >forward progress? Are you trying them independently? > > > I start linux with bypassing uefi, the print info is the same. > [507107.748908] vgic_mmio_write_group:## intid/27 group=0 > [507107.752185] vgic_mmio_write_group:## intid/27 group=0 > [507107.899566] vgic_mmio_write_group:## intid/27 group=1 > [507107.907370] vgic_mmio_write_group:## intid/27 group=1 > the command line is like this: > /home/qemu-patch/linshi/qemu/aarch64-softmmu/qemu-system-aarch64 -machine virt-3.1,accel=kvm,usb=off,dump-guest-core=off,gic-version=3 -kernel /home/kernelboot/vmlinuz-4.16.0+ -initrd /home/kernelboot/initramfs-4.16.0+.img -append root=/dev/mapper/cla-root ro crashkernel=auto rd.lvm.lv=cla/root rd.lvm.lv=cla/swap.UTF-8 -drive file=/home/centos74-ph/boot.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0 -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -vnc 0.0.0.0:0 -k en-us -device virtio-gpu-pci,id=video0,max_outputs=1,bus=pci.3,addr=0x0 -device pvpanic-mmio -msg timestamp=on > > This is strange. That's probably the Linux 4.16 kernel setting group to 0, and I'll continue to track in guest. Could you try the following patch: diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c index c0c0b88af1d5..6fa858c7a5a6 100644 --- a/virt/kvm/arm/vgic/vgic-init.c +++ b/virt/kvm/arm/vgic/vgic-init.c @@ -231,13 +231,6 @@ int kvm_vgic_vcpu_init(struct kvm_vcpu *vcpu) irq->config = VGIC_CONFIG_LEVEL; } - /* - * GICv3 can only be created via the KVM_DEVICE_CREATE API and - * so we always know the emulation type at this point as it's - * either explicitly configured as GICv3, or explicitly - * configured as GICv2, or not configured yet which also - * implies GICv2. - */ if (dist->vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) irq->group = 1; else @@ -298,6 +291,16 @@ int vgic_init(struct kvm *kvm) if (ret) goto out; + /* Initialize groups on CPUs created before the VGIC type was known */ + kvm_for_each_vcpu(i, vcpu, kvm) { + struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu; + + for (i = 0; i < VGIC_NR_PRIVATE_IRQS; i++) { + struct vgic_irq *irq = &vgic_cpu->private_irqs[i]; + irq->group = 1; + } + } + if (vgic_has_its(kvm)) { ret = vgic_v4_init(kvm); if (ret) Thanks, Christoffer