Subject: Re: [PATCH v4 5/5] x86, kvm: support vcpu preempted check
From: Pan Xinhui
To: Radim Krčmář, Pan Xinhui
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
 virtualization@lists.linux-foundation.org, linux-s390@vger.kernel.org,
 xen-devel-request@lists.xenproject.org, kvm@vger.kernel.org,
 benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au,
 mingo@redhat.com, peterz@infradead.org, paulmck@linux.vnet.ibm.com,
 will.deacon@arm.com, kernellwp@gmail.com, jgross@suse.com,
 pbonzini@redhat.com, bsingharora@gmail.com, boqun.feng@gmail.com,
 borntraeger@de.ibm.com
Date: Thu, 20 Oct 2016 02:45:33 +0800
References: <1476872416-42752-1-git-send-email-xinhui.pan@linux.vnet.ibm.com>
 <1476872416-42752-6-git-send-email-xinhui.pan@linux.vnet.ibm.com>
 <20161019172403.GA9240@potion>
In-Reply-To: <20161019172403.GA9240@potion>
Message-Id: <34aefe78-af86-19b3-9e2f-cb3ee6b5f735@linux.vnet.ibm.com>

On 2016/10/20 01:24, Radim Krčmář wrote:
> 2016-10-19 06:20-0400, Pan Xinhui:
>> This fixes some lock holder preemption issues. Some lock
>> implementations spin in a loop before acquiring the lock itself. The
>> kernel now has an interface, bool vcpu_is_preempted(int cpu), which
>> takes a cpu as its parameter and returns true if the vcpu running on
>> that cpu is preempted. The kernel can then break out of such spin
>> loops based on the return value of vcpu_is_preempted().
>>
>> Since the kernel already uses this interface, let's support it in
>> KVM as well.
>>
>> We use one field of struct kvm_steal_time to indicate whether a vcpu
>> is currently running.
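
[An aside on intended usage: the consumers this targets are the
owner-spinning loops in the lock slow paths. A simplified sketch of the
pattern, not lifted verbatim from the earlier patches in this series;
lock_owner_is() is a made-up placeholder for the real owner check:

	/*
	 * Spin while the owner still holds the lock, but give up
	 * early once the owner's vcpu has been preempted by the
	 * host -- spinning on a descheduled vcpu only burns cycles.
	 */
	while (lock_owner_is(lock, owner)) {
		if (need_resched() || vcpu_is_preempted(task_cpu(owner)))
			break;
		cpu_relax();
	}
]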
>>
>> unix benchmark result:
>> host:  kernel 4.8.1, i5-4570, 4 cpus
>> guest: kernel 4.8.1, 8 vcpus
>>
>> test-case                                  after-patch       before-patch
>> Execl Throughput                       |    18307.9 lps  |    11701.6 lps
>> File Copy 1024 bufsize 2000 maxblocks  |  1352407.3 KBps |   790418.9 KBps
>> File Copy 256 bufsize 500 maxblocks    |   367555.6 KBps |   222867.7 KBps
>> File Copy 4096 bufsize 8000 maxblocks  |  3675649.7 KBps |  1780614.4 KBps
>> Pipe Throughput                        | 11872208.7 lps  | 11855628.9 lps
>> Pipe-based Context Switching           |  1495126.5 lps  |  1490533.9 lps
>> Process Creation                       |    29881.2 lps  |    28572.8 lps
>> Shell Scripts (1 concurrent)           |    23224.3 lpm  |    22607.4 lpm
>> Shell Scripts (8 concurrent)           |     3531.4 lpm  |     3211.9 lpm
>> System Call Overhead                   | 10385653.0 lps  | 10419979.0 lps
>>
>> Signed-off-by: Pan Xinhui
>> ---
>> diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
>> @@ -98,6 +98,10 @@ struct pv_time_ops {
>>  	unsigned long long (*steal_clock)(int cpu);
>>  };
>>
>> +struct pv_vcpu_ops {
>> +	bool (*vcpu_is_preempted)(int cpu);
>> +};
>> +
>
> (I would put it into pv_lock_ops to save the plumbing.)
>
hi, Radim
	Thanks for your reply.

Yes, a new struct drags unnecessary changed lines into the patch. I only
added one because I was not sure which of the existing xxx_ops
vcpu_is_preempted should live in.

>> diff --git a/arch/x86/include/uapi/asm/kvm_para.h b/arch/x86/include/uapi/asm/kvm_para.h
>> @@ -45,7 +45,8 @@ struct kvm_steal_time {
>>  	__u64 steal;
>>  	__u32 version;
>>  	__u32 flags;
>> -	__u32 pad[12];
>> +	__u32 preempted;
>
> Why __u32 instead of __u8?

I thought it should be 32-bit aligned... but yes, a __u8 is enough to
store the preempted status.

>> +	__u32 pad[11];
>>  };
>
> Please document the change in Documentation/virtual/kvm/msr.txt, section
> MSR_KVM_STEAL_TIME.

Okay, I totally forgot to do that. Thanks!

>> diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
>> @@ -415,6 +415,15 @@ void kvm_disable_steal_time(void)
>> +static bool kvm_vcpu_is_preempted(int cpu)
>> +{
>> +	struct kvm_steal_time *src;
>> +
>> +	src = &per_cpu(steal_time, cpu);
>> +
>> +	return !!src->preempted;
>> +}
>> +
>>  #ifdef CONFIG_SMP
>>  static void __init kvm_smp_prepare_boot_cpu(void)
>>  {
>> @@ -488,6 +497,8 @@ void __init kvm_guest_init(void)
>>  	kvm_guest_cpu_init();
>>  #endif
>>
>> +	pv_vcpu_ops.vcpu_is_preempted = kvm_vcpu_is_preempted;
>
> Would be nicer to assign conditionally in the KVM_FEATURE_STEAL_TIME
> block. The steal_time structure has to be zeroed, so this code would
> work, but the native function (return false) is better if we know that
> kvm_vcpu_is_preempted() would always return false anyway.

Yes, agree. Will do that.

I once thought we could patch the code at runtime: replace the binary
code "call 0xXXXXXXXX	# pv_vcpu_ops.vcpu_is_preempted" with
"xor eax, eax". However, it is not worth doing; the performance
improvement would be very small.

> Old KVMs won't have the feature, so we could also assign only when KVM
> reports it, but that requires extra definitions and the performance gain
> is fairly small, so I'm ok with this.
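
Agreed, a separate feature bit does not seem worth the extra
definitions. For v5 I plan to move the assignment into the existing
KVM_FEATURE_STEAL_TIME block, roughly like this (a sketch only, still
using the pv_vcpu_ops name from this version; it may end up in
pv_lock_ops as you suggested above):

	if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) {
		has_steal_clock = 1;
		pv_time_ops.steal_clock = kvm_steal_clock;
		/*
		 * Only replace the native version (which always
		 * returns false) when steal time is available, so
		 * the op stays native on old hosts.
		 */
		pv_vcpu_ops.vcpu_is_preempted = kvm_vcpu_is_preempted;
	}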
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> @@ -2057,6 +2057,8 @@ static void record_steal_time(struct kvm_vcpu *vcpu)
>>  		&vcpu->arch.st.steal, sizeof(struct kvm_steal_time))))
>>  		return;
>>
>> +	vcpu->arch.st.steal.preempted = 0;
>> +
>>  	if (vcpu->arch.st.steal.version & 1)
>>  		vcpu->arch.st.steal.version += 1;  /* first time write, random junk */
>>
>> @@ -2812,6 +2814,16 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>>
>>  void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>  {
>> +	if (vcpu->arch.st.msr_val & KVM_MSR_ENABLED)
>> +		if (kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
>> +			&vcpu->arch.st.steal,
>> +			sizeof(struct kvm_steal_time)) == 0) {
>> +			vcpu->arch.st.steal.preempted = 1;
>> +			kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
>> +				&vcpu->arch.st.steal,
>> +				sizeof(struct kvm_steal_time));
>> +		}
>
> Please name this block of code. Something like
> 	kvm_steal_time_set_preempted(vcpu);

Yep, that block is ugly as it stands; I will factor it out into a helper
along those lines -- see the sketch below.

thanks
xinhui

> Thanks.
>
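
[A rough sketch of the factored-out helper, untested, using the
kvm_steal_time_set_preempted() name suggested above; the body is just
the block from the patch, moved out of kvm_arch_vcpu_put():

	static void kvm_steal_time_set_preempted(struct kvm_vcpu *vcpu)
	{
		if (!(vcpu->arch.st.msr_val & KVM_MSR_ENABLED))
			return;

		/* Re-read the guest's steal time area before flagging it. */
		if (kvm_read_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
				&vcpu->arch.st.steal,
				sizeof(struct kvm_steal_time)))
			return;

		/* Tell the guest this vcpu is no longer running. */
		vcpu->arch.st.steal.preempted = 1;
		kvm_write_guest_cached(vcpu->kvm, &vcpu->arch.st.stime,
				&vcpu->arch.st.steal,
				sizeof(struct kvm_steal_time));
	}

	void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
	{
		kvm_steal_time_set_preempted(vcpu);
		...
	}
]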