Date: Mon, 22 Jun 2020 00:26:37 +0200
From: Igor Mammedov
To: Paolo Bonzini
Cc: Wanpeng Li, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, Sean Christopherson, Vitaly Kuznetsov, Jim Mattson, Joerg Roedel
Subject: Re: [PATCH v3] KVM: LAPIC: Recalculate apic map in batch
Message-ID: <20200622002637.33358827@redhat.com>
In-Reply-To: <3e025538-297b-74e5-f1b1-2193b614978b@redhat.com>
References: <1582684862-10880-1-git-send-email-wanpengli@tencent.com> <20200619143626.1b326566@redhat.com> <3e025538-297b-74e5-f1b1-2193b614978b@redhat.com>

On Fri, 19 Jun 2020 16:10:43 +0200
Paolo Bonzini wrote:

> On 19/06/20 14:36, Igor Mammedov wrote:
> > qemu-kvm -m 2G -smp 4,maxcpus=8 -monitor stdio
> > (qemu) device_add qemu64-x86_64-cpu,socket-id=4,core-id=0,thread-id=0
> >
> > in guest fails with:
> >
> >  smpboot: do_boot_cpu failed(-1) to wakeup CPU#4
> >
> > which makes me suspect that INIT/SIPI wasn't delivered
> >
> > Is it a know issue?
> >
>
> No, it isn't.  I'll revert.
>
> Paolo
>

The following fixes the immediate issue:

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 34a7e0533dad..6dc177da19da 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2567,6 +2567,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	}
 	memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s));
+	apic->vcpu->kvm->arch.apic_map_dirty = true;
 	kvm_recalculate_apic_map(vcpu->kvm);
 	kvm_apic_set_version(vcpu);

The problem is that during kvm_arch_vcpu_create() the new vcpu is not yet
visible to kvm_recalculate_apic_map(), so however many times the map update
was called during creation, it did not affect the apic map.

What broke hotplug is that kvm_vcpu_ioctl_set_lapic -> kvm_apic_set_state,
which is called after the new vcpu becomes visible, used to make an
unconditional update which pulled in the new vcpu. With this patch that map
update is gone, since the state hasn't actually changed, so we lost the one
call of kvm_recalculate_apic_map() which did actually matter.

It happens to work for vcpus present at boot just by luck (the BSP updates
SPIV after all vcpus have been created, which triggers
kvm_recalculate_apic_map()).

I'm not sending a formal patch yet, since I have doubts wrt the subject
patch itself. The following sequence looks like a race that can cause lost
map update events:

   cpu1                            cpu2

                                   apic_map_dirty = true
  ------------------------------------------------------------
                                   kvm_recalculate_apic_map:
                                     pass check
                                       mutex_lock(&kvm->arch.apic_map_lock);
                                       if (!kvm->arch.apic_map_dirty)
                                     and in process of updating map
  ------------------------------------------------------------
   other calls to
     apic_map_dirty = true         might be too late for affected cpu
  ------------------------------------------------------------
                                   apic_map_dirty = false
  ------------------------------------------------------------
   kvm_recalculate_apic_map:
   bail out on
     if (!kvm->arch.apic_map_dirty)

It's safer to revert this patch for now, as you suggested earlier.
If you prefer to keep it, I'll post the above fixup as a patch.
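
To make the lost-update sequence above easier to reason about, here is a small single-threaded userspace replay of it. This is only a sketch under stated assumptions, not KVM code: struct demo_vm and all demo_* functions are hypothetical names, the "map" is reduced to a single int, and the recalculation is split into a begin/end pair purely so that a second write can be placed in the middle of it. The clear_early variant (clear the dirty flag before reading the state instead of after) is one possible way to avoid losing the event in this toy model; a real concurrent implementation would additionally need atomics or barriers, which the sketch does not attempt.

```c
#include <stdbool.h>

/* Hypothetical stand-in for per-VM state; not the kernel's struct kvm. */
struct demo_vm {
	bool map_dirty;	/* plays the role of apic_map_dirty */
	int  reg;	/* some APIC state that the map is derived from */
	int  map;	/* what the last map rebuild captured */
};

/* Writer side: change state, then flag the map as needing a rebuild. */
static void demo_set_reg(struct demo_vm *vm, int v)
{
	vm->reg = v;
	vm->map_dirty = true;
}

/* First half of a rebuild: check the flag and snapshot the state. */
static bool demo_recalc_begin(struct demo_vm *vm, int *snap, bool clear_early)
{
	if (!vm->map_dirty)
		return false;		/* bail out: nothing to do */
	if (clear_early)
		vm->map_dirty = false;	/* later writes will re-set it */
	*snap = vm->reg;
	return true;
}

/* Second half of a rebuild: publish the map built from the snapshot. */
static void demo_recalc_end(struct demo_vm *vm, int snap, bool clear_early)
{
	vm->map = snap;
	if (!clear_early)
		vm->map_dirty = false;	/* wipes out any flag set meanwhile */
}

/* Replays the interleaving; returns true if the map ends up stale. */
static bool demo_replay(bool clear_early)
{
	struct demo_vm vm = { .map_dirty = false, .reg = 0, .map = 0 };
	int snap = 0;

	demo_set_reg(&vm, 1);				/* dirty = true */
	demo_recalc_begin(&vm, &snap, clear_early);	/* passes the check */
	demo_set_reg(&vm, 2);				/* write during rebuild */
	demo_recalc_end(&vm, snap, clear_early);	/* rebuild finishes */

	/* The late writer now asks for a rebuild of its own. */
	if (demo_recalc_begin(&vm, &snap, clear_early))
		demo_recalc_end(&vm, snap, clear_early);

	return vm.map != vm.reg;
}
```

Calling demo_replay(false) follows the ordering in the diagram: the second write sets the flag while the rebuild is in flight, the rebuild then clears it, and the follow-up call bails out with a stale map. demo_replay(true) keeps the flag set by the late write, so the follow-up call rebuilds the map.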