Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1428942AbdDYJti (ORCPT ); Tue, 25 Apr 2017 05:49:38 -0400 Received: from mail-wr0-f169.google.com ([209.85.128.169]:34378 "EHLO mail-wr0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1428902AbdDYJtc (ORCPT ); Tue, 25 Apr 2017 05:49:32 -0400 Date: Tue, 25 Apr 2017 11:49:27 +0200 From: Daniel Lezcano To: Marc Zyngier Cc: tglx@linutronix.de, Mark Rutland , Vineet Gupta , Patrice Chotard , Kukjin Kim , Krzysztof Kozlowski , Javier Martinez Canillas , Christoffer Dall , Paolo Bonzini , Radim =?utf-8?B?S3LEjW3DocWZ?= , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-snps-arc@lists.infradead.org, kernel@stlinux.com, linux-samsung-soc@vger.kernel.org, kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org Subject: Re: [PATCH V9 1/3] irq: Allow to pass the IRQF_TIMER flag with percpu irq request Message-ID: <20170425094927.GB16888@mai> References: <1493042494-14057-1-git-send-email-daniel.lezcano@linaro.org> <398f3f3d-c567-0f1f-1a43-9b8d5805d5fd@arm.com> <20170424185909.GD2137@mai> <92e2a022-93d4-d4f3-78af-c9b5d51bb867@arm.com> <20170424195948.GE2137@mai> <16042494-2e67-e1a5-b9f6-af57e349d8a7@arm.com> <20170425083451.GA16888@mai> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6021 Lines: 157 On Tue, Apr 25, 2017 at 10:10:12AM +0100, Marc Zyngier wrote: [ ... ] > >>>> Maybe you could explain why you think this interrupt is relevant to what > >>>> you're trying to achieve? > >>> > >>> If this interrupt does not happen on the host, we don't care. > >> > >> All interrupts happen on the host. There is no such thing as a HW > >> interrupt being directly delivered to a guest (at least so far). The > >> timer is under control of the guest, which uses as it sees fit. When > >> the HW timer expires, the interrupt fires on the host, which re-inject > >> the interrupt in the guest. > > > > Ah, thanks for the clarification. Interesting. > > > > How can the host know which guest to re-inject the interrupt? > > The timer can only fire when the vcpu is running. If it is not running, > a software timer is queued, with a pointer to the vcpu struct. I see, thanks. > >>> The flag IRQF_TIMER is used by the spurious irq handler in the try_one_irq() > >>> function. However the per cpu timer interrupt will be discarded in the function > >>> before because it is per cpu. > >> > >> Right. That's not because this is a timer, but because it is per-cpu. > >> So why do we need this IRQF_TIMER flag, instead of fixing try_one_irq()? > > > > When a timer is not per cpu, (eg. request_irq), we need this flag, no? > > Sure, but in this series, they all seem to be per-cpu. I think I was unclear. We need to tag an interrupt with IRQS_TIMINGS to record their occurences but discarding the timers interrupt. That is done by checking against IRQF_TIMER when setting up an interrupt. request_irq() has a flag parameter which has IRQF_TIMER set in case of the timers. request_percpu_irq has no flag parameter, so it is not possible to discard these interrupts as the IRQS_TIMINGS will be set. I don't understand how this is related to the the try_one_irq() fix you are proposing. Am I missing something? Regarding your description below, the host has no control at all on the virtual timer and is not able to know the next expiration time, so I don't see the point to add the IRQF_TIMER flag to the virtual timer. I will resend a new version without this change on the virtual timer. > >>> IMO, for consistency reason, adding the IRQF_TIMER makes sense. Other than > >>> that, as the interrupt is not happening on the host, this flag won't be used. > >>> > >>> Do you want to drop this change? > >> > >> No, I'd like to understand the above. Why isn't the following patch > >> doing the right thing? > > > > Actually, the explanation is in the next patch of the series (2/3) > > > > [ ... ] > > > > +static inline void setup_timings(struct irq_desc *desc, struct irqaction *act) > > +{ > > + /* > > + * We don't need the measurement because the idle code already > > + * knows the next expiry event. > > + */ > > + if (act->flags & __IRQF_TIMER) > > + return; > > And that's where this is really wrong for the KVM guest timer. As I > said, this timer is under complete control of the guest, and the rest of > the system doesn't know about it. KVM itself will only find out when the > vcpu does a VM exit for a reason or another, and will just save/restore > the state in order to be able to give the timer to another guest. > > The idle code is very much *not* aware of anything concerning that guest > timer. Just for my own curiosity, if there are two VM (VM1 and VM2). VM1 sets a timer1 at