Message-ID: <1430517819.4472.331.camel@redhat.com>
Subject: Re: [GIT PULL] VFIO fixes for v4.1-rc2
From: Alex Williamson
To: Linus Torvalds
Cc: Oleg Nesterov, linux-kernel, kvm
Date: Fri, 01 May 2015 16:03:39 -0600
References: <1430502057.4472.255.camel@redhat.com> <1430506137.4472.262.camel@redhat.com>

On Fri, 2015-05-01 at 13:23 -0700, Linus Torvalds wrote:
> On Fri, May 1, 2015 at 11:48 AM, Alex Williamson wrote:
> >
> > Ok.  It seemed like useful behavior to be able to provide some response
> > to the user in the event that a ->remove handler is blocked by a device
> > in-use and the user attempts to abort the action.
>
> Well, that kind of notification *might* be useful, but at the cost of
> saying "somebody tried to send you a signal, so I am now telling you
> about it, and then deleting that signal, and you'll never know what it
> actually was"?
>
> That's not useful, that's just wrong.

Yep, it was a bad idea.

> Now, what might in theory be useful - but I haven't actually seen
> anybody do anything like that - is to start out with an interruptible
> sleep, warn if you get interrupted, and then continue with an
> un-interruptible sleep (leaving the signal active).

I was considering doing exactly this.

> But even that sounds like a very special case, and I don't think
> anything has ever done that.
>
> In general, our signal handling falls into three distinct categories:
>
>  (a) interruptible (and you can cancel the operation and return "try again")
>
>  (b) killable (you can cancel the operation, knowing that the
> requester will be killed and won't try again)
>
>  (c) uninterruptible
>
> where that (b) tends to be a special case of an operation that
> technically isn't really interruptible (because the ABI doesn't allow
> for retrying or error returns), but knowing that the caller will never
> see the error case because it's killed means that you can do it. The
> classic example of that is an NFS mount that is mounted "nointr" - you
> can't return EINTR for a read or a write (because that invalidates
> POSIX) but you want to let SIGKILL just kill the process in the middle
> when the network is hung.

I think we're in that (c) case unless we want to change our driver API
to allow driver remove callbacks to return error.  Killing the caller
doesn't really help the situation without being able to back out of the
remove path.  Killing the task with the device open would help, but
seems rather harsh.

I expect we eventually want to be able to escalate to revoking the
device from the user, but currently we only have a notifier to request
the device from cooperative users.  In the event of an uncooperative
user, we block, which can be difficult to figure out, especially when
we're dealing with SR-IOV devices and a PF unbind implicitly induces a
VF unbind.

The interruptible component here is simply a logging mechanism which
should have turned into an "interruptible_once" rather than a signal
flush.  I try to avoid vfio being a special case, but maybe in this
instance it's worthwhile.  If you have other suggestions, please let me
know.
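
To make the "interruptible_once" idea concrete, here is a rough
userspace C sketch of the control flow only.  It is not vfio or kernel
code: the event enum, the result struct and wait_interruptible_once()
are all hypothetical names invented for illustration, and the wakeups
are simulated by an array rather than by real sleeps and signals.  The
point it tries to show is that the first signal produces one warning,
later signals are ignored as they would be in an uninterruptible sleep,
and the signal is left pending instead of being flushed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wakeup reasons a blocked ->remove path might observe. */
enum wait_event {
    EV_TIMEOUT,   /* woke up, device still in use */
    EV_SIGNAL,    /* a signal arrived for the waiting task */
    EV_RELEASED,  /* the last user released the device */
};

/* Hypothetical summary of what happened while waiting. */
struct wait_result {
    bool released;        /* the wait ended because the device was released */
    int warnings;         /* number of "still in use" messages logged */
    bool signal_pending;  /* signal is left for the caller, never flushed */
};

/*
 * Simulated "interruptible once" wait: behave interruptibly until the
 * first signal, log a single warning, then keep waiting as if
 * uninterruptible.  The signal is only recorded, not consumed.
 */
static struct wait_result
wait_interruptible_once(const enum wait_event *events, size_t n)
{
    struct wait_result res = { false, 0, false };
    bool interruptible = true;

    assert(events != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        switch (events[i]) {
        case EV_RELEASED:
            res.released = true;
            return res;
        case EV_SIGNAL:
            res.signal_pending = true;  /* remember it, do not discard it */
            if (interruptible) {
                fprintf(stderr, "remove blocked: device still in use\n");
                res.warnings++;
                interruptible = false;  /* from here on: uninterruptible */
            }
            break;
        case EV_TIMEOUT:
            break;                      /* nothing to do, keep waiting */
        }
    }
    return res;
}
```

Feeding it a timeout, two signals and then a release would log one
warning, report the device as released, and still show the signal as
pending for whoever handles it afterwards.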

Thanks,
Alex