Subject: Re: [PATCH v2] KVM: x86: Sync the pending Posted-Interrupts
From: Paolo Bonzini <pbonzini@redhat.com>
To: "Kang, Luwei", "rkrcmar@redhat.com", "tglx@linutronix.de", "mingo@redhat.com", "bp@alien8.de", "hpa@zytor.com", "x86@kernel.org"
Cc: "kvm@vger.kernel.org", "linux-kernel@vger.kernel.org"
Date: Wed, 30 Jan 2019 12:18:03 +0100
Message-ID: <0562a907-a6be-49c3-2c1f-7dae4e94f4eb@redhat.com>
In-Reply-To: <82D7661F83C1A047AF7DC287873BF1E172CD60FB@SHSMSX101.ccr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 30/01/19 11:38, Kang, Luwei wrote:
>>>> This is not what I asked.  You should instead do the check after
>>>> pi_clear_sn.
>>>>
>>>
>>> I think the SN has been cleared here before testing the bitmap.
>>> The SN will be set when the vCPU is scheduled out. ID:
>>> 28b835d60fcc2498e717cf5e6f0c3691c24546f7
>>> But SN will be cleared when sched in.
>>>
>>> Another place is when the vCPU runs out of the vcpu_run() function:
>>> kvm_arch_vcpu_ioctl_run()
>>>     vcpu_load(vcpu);  -> kvm_arch_vcpu_load -> vmx_vcpu_load -> vmx_vcpu_pi_load -> new.sn = 0;
>>>     vcpu_run(vcpu);
>>>         for(;;)
>>>     vcpu_put(vcpu);   -> kvm_arch_vcpu_put -> vmx_vcpu_put -> vmx_vcpu_pi_put -> pi_set_sn()
>>> But SN will be cleared in vcpu_load() before going back to vcpu_run()
>>
>> Yes, but you're changing the wrong path.  The patch is affecting _all_
>> vmentries, not just those after PID.SN has been cleared.
>>
>> As I mentioned in the previous email, KVM relies on the SDM's invariant
>> that PID.ON=1 whenever PID.PIR!=0.  Invariants are your best friend when
>> dealing with complicated multi-processor code so I don't want to change
>> that.
>>
>> It's the VT-d pi_clear_sn path that I want to be changed, because it's
>> VT-d and specifically SN that complicates the very simple definition in
>> the SDM.
>> By modifying the pi_clear_sn path, you ensure the invariant is
>> respected and everyone is happy.
>
> Hi Paolo,
>     How about like this:
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 820a03b..dfc5e3d 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1219,6 +1219,9 @@ static void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu)
>                 new.ndst = (dest << 8) & 0xFF00;
>
>                 new.sn = 0;
> +
> +               if (!bitmap_empty((unsigned long *)new.pir, NR_VECTORS))
> +                       new.on = 1;

This is racy, the bitmap can change after the cmpxchg64; and the "if"
that calls pi_clear_sn needs to check the bitmap as well.  So you have
to remove that "if", and move the check outside the do/while.  You also
need an smp_mb__after_atomic() after the do/while.

Paolo

>         } while (cmpxchg64(&pi_desc->control, old.control,
>                            new.control) != old.control);
> }
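
For illustration only, the ordering asked for above (clear SN first, then a
full barrier, then look at the bitmap and set ON) can be sketched in
user-space C11.  This is not the kernel code and not a proposed patch:
struct pi_desc_model, the PID_ON/PID_SN bit positions, pir_empty() and
pi_load_model() are hypothetical names made up for the sketch, and
atomic_thread_fence() only stands in for smp_mb__after_atomic().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS 256
#define PIR_WORDS  (NR_VECTORS / 64)
static_assert(NR_VECTORS % 64 == 0, "PIR must be a whole number of words");

/* Hypothetical bit positions, chosen for this model only. */
#define PID_ON (UINT64_C(1) << 0)   /* outstanding notification */
#define PID_SN (UINT64_C(1) << 1)   /* suppress notification */

/* Simplified stand-in for the posted-interrupt descriptor. */
struct pi_desc_model {
	_Atomic uint64_t pir[PIR_WORDS];   /* posted-interrupt requests */
	_Atomic uint64_t control;          /* ON, SN, ... */
};

/* True if no vector is pending in the PIR bitmap. */
static bool pir_empty(struct pi_desc_model *pd)
{
	for (size_t i = 0; i < PIR_WORDS; i++)
		if (atomic_load_explicit(&pd->pir[i], memory_order_relaxed))
			return false;
	return true;
}

/* Model of the "load" path: clear SN, then restore ON if PIR != 0. */
static void pi_load_model(struct pi_desc_model *pd)
{
	uint64_t old = atomic_load(&pd->control);
	uint64_t new;

	/* Clear SN with a compare-and-swap loop; do not touch ON here. */
	do {
		new = old & ~PID_SN;
	} while (!atomic_compare_exchange_weak(&pd->control, &old, new));

	/* Full barrier so the PIR read below cannot move before the SN clear. */
	atomic_thread_fence(memory_order_seq_cst);

	/*
	 * The check sits outside the loop: a vector posted while SN was
	 * still set raised no notification, so ON is set here instead.
	 */
	if (!pir_empty(pd))
		atomic_fetch_or(&pd->control, PID_ON);
}
```

With the check after the loop, every path that clears SN reaches it, and
the bitmap is read only once SN is known to be clear.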