Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S965516AbdIYS3h (ORCPT ); Mon, 25 Sep 2017 14:29:37 -0400
Received: from goliath.siemens.de ([192.35.17.28]:36444 "EHLO
        goliath.siemens.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S935687AbdIYS3g (ORCPT );
        Mon, 25 Sep 2017 14:29:36 -0400
Subject: Re: [patch 3/3] x86: kvm guest side support for KVM_HC_RT_PRIO hypercall\
To: Thomas Gleixner, Marcelo Tosatti
Cc: Peter Zijlstra, Konrad Rzeszutek Wilk, mingo@redhat.com,
        kvm@vger.kernel.org, linux-kernel@vger.kernel.org
References: <20170921113835.031375194@redhat.com>
        <20170921114039.466130276@redhat.com>
        <20170921133653.GO26248@char.us.oracle.com>
        <20170921140628.zliqlz7mrlqs5pzz@hirez.programming.kicks-ass.net>
        <20170922011039.GB20133@amt.cnet>
        <20170922100004.ydmaxvgpc2zx7j25@hirez.programming.kicks-ass.net>
        <20170922121640.GA29589@amt.cnet>
        <20170922123107.fjh2yfwnej73trim@hirez.programming.kicks-ass.net>
        <20170922124005.GA30393@amt.cnet>
        <20170922130141.tz6f4gktihmbhqli@hirez.programming.kicks-ass.net>
        <20170925022238.GB5140@amt.cnet>
From: Jan Kiszka
Message-ID: <68b701fb-b58c-bca5-f002-8f11705a584d@siemens.com>
Date: Mon, 25 Sep 2017 20:28:58 +0200
User-Agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12)
        Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12
        Mnenhy/0.7.5.666
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2512
Lines: 54

On 2017-09-25 12:41, Thomas Gleixner wrote:
> On Sun, 24 Sep 2017, Marcelo Tosatti wrote:
>> On Fri, Sep 22, 2017 at 03:01:41PM +0200, Peter Zijlstra wrote:
>> What the patch does is the following:
>> It reduces the window where SCHED_FIFO is applied vcpu0
>> to those were a spinlock is shared between -RT vcpus and vcpu0
>> (why: because otherwise, when the emulator thread is sharing a
>> pCPU with vcpu0, its unable to generate interrupts vcpu0).
>>
>> And its being rejected because:
>> Please fill in.
>
> Your patch is just papering over one particular problem, but it's not
> fixing the root cause. That's the worst engineering approach and we all
> know how fast this kind of crap falls over.
>
> There are enough other issues which can cause starvation of the RT VCPUs
> when the housekeeping VCPU is preempted, not just the particular problem
> which you observed.
>
> Back then when I did the first prototype of RT in KVM, I made it entirely
> clear, that you have to spend one physical CPU for _each_ VCPU, independent
> whether the VCPU is reserved for RT workers or the housekeeping VCPU. The
> emulator thread needs to run on a separate physical CPU.
>
> If you want to run the housekeeping VCPU and the emulator thread on the
> same physical CPU then you have to make sure that both the emulator and the
> housekeeper side of affairs are designed and implemented with RT in
> mind. As long as that is not the case, you simply cannot run them on the
> same physical CPU. RT is about guarantees and guarantees cannot be achieved
> with bandaid engineering.

It's even more complicated for the guest: It needs to be aware of the
latencies its interaction with a VM - instead of a real machine - may
cause while being in whatever critical sections. That's an additional
design dimension that would be very hard to establish and maintain, even
in Linux.

The only way around that is to truly decouple guest CPUs via full core
isolation inside the Linux guest and have your RT guest application
exploit this partitioning, e.g. by using lock-less inter-core
communication without kernel help.

The reason I was playing with PV-sched back then was to explore how you
could map the guest's task prio dynamically on its host vcpu. That
involved boosting whenever an event (aka irq) came in for the guest
vcpu. It turned out to be a more or less working solution looking for a
real-world problem.

Jan

-- 
Siemens AG, Corporate Technology, CT RDA ITP SES-DE
Corporate Competence Center Embedded Linux