Date: Tue, 19 Jan 2010 16:04:25 +0200
From: "Michael S. Tsirkin"
To: Jan Kiszka
Cc: Davide Libenzi, Avi Kivity, "kvm@vger.kernel.org", Linux Kernel Mailing List
Subject: Re: [PATCH 1/2] kvm: fix spurious interrupt with irqfd
Message-ID: <20100119140425.GA28410@redhat.com>
References: <20100113171230.GB19798@redhat.com> <4B55B2B8.5090105@siemens.com> <20100119134827.GA28191@redhat.com> <4B55BBB6.2020901@siemens.com>
In-Reply-To: <4B55BBB6.2020901@siemens.com>

On Tue, Jan 19, 2010 at 03:03:34PM +0100, Jan Kiszka wrote:
> Michael S. Tsirkin wrote:
> > On Tue, Jan 19, 2010 at 02:25:12PM +0100, Jan Kiszka wrote:
> >> Michael S. Tsirkin wrote:
> >>> kvm didn't clear irqfd counter on deassign, as a result we could get a
> >>> spurious interrupt when irqfd is assigned back. this leads to poor
> >>> performance and, in theory, guest crash.
> >>>
> >>> Signed-off-by: Michael S. Tsirkin
> >>> ---
> >>>  virt/kvm/eventfd.c |    3 ++-
> >>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> >>> index 62e4cd9..a9d3fc6 100644
> >>> --- a/virt/kvm/eventfd.c
> >>> +++ b/virt/kvm/eventfd.c
> >>> @@ -72,12 +72,13 @@ static void
> >>>  irqfd_shutdown(struct work_struct *work)
> >>>  {
> >>>      struct _irqfd *irqfd = container_of(work, struct _irqfd, shutdown);
> >>> +    u64 cnt;
> >>>
> >>>      /*
> >>>       * Synchronize with the wait-queue and unhook ourselves to prevent
> >>>       * further events.
> >>>       */
> >>> -    remove_wait_queue(irqfd->wqh, &irqfd->wait);
> >>> +    eventfd_ctx_remove_wait_queue(irqfd->eventfd, &irqfd->wait, &cnt);
> >>>
> >>>      /*
> >>>       * We know no new events will be scheduled at this point, so block
> >> For kvm-kmod, I'm fighting with compat support for
> >> eventfd_ctx_remove_wait_queue. I basically have a solution for kernels
> >> with CONFIG_KPROBES enabled (I need to look up unexported
> >> __wake_up_locked[_key]), but there will also be target kernels that do
> >> not have this. So there are three options for that case:
> >>
> >>  - Warn the user and fall back to the old racy approach
> >>  - (Somehow) disable KVM subsystems that use eventfd
> >>  - Refuse to start KVM
> >> As far as I understood, irqfd is interesting for device assignment and
> >> now also for vhost, right?
> >
> > At the moment, only vhost.
> >
> >> What about ioeventfd?
> >
> > Same thing.
> >
>
> OK...
>
> >> I just wonder how broad
> >> the impact of a broken or non-existent eventfd subsystem for kvm-kmod
> >> is. Any thoughts welcome.
> >
> > How do you handle kernels that don't export eventfd_ctx_fileget?
>
> Now that you mention it: not yet properly. So far we pass the file
> struct as pseudo eventfd_ctx around on < 2.6.31. But now that I peek
> into the struct in kvm_eventfd_ctx_remove_wait_queue, this should
> crash. Guess I need to look up that module the same way as I acquire
> __wake_up_locked[_key].

This won't work that well: upstream eventfd sends us POLLHUP so we can
close the structure. Old kernels don't, so the kernel will crash
when we try to reference the structure later.

> >
> >> Jan
> >>
> >> PS: If anyone forgot why Avi handed over this job, you should now
> >> remember why. :)
> >
> > Heh, I did the same kind of thing for infiniband for
> > several years. It's hard to forget.
> >
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux