From: Andy Lutomirski
Date: Wed, 15 Apr 2015 22:53:44 -0700
Subject: Re: [GIT PULL] kdbus for 4.1-rc1
To: Al Viro
Cc: Arnd Bergmann, "linux-kernel@vger.kernel.org", Jiri Kosina,
    Andrew Morton, Daniel Mack, One Thousand Gnomes, Linus Torvalds,
    Tom Gundersen, Richard Weinberger, Steven Rostedt,
    Greg Kroah-Hartman, David Herrmann, "Eric W. Biederman",
    Djalal Harouni
In-Reply-To: <20150416010402.GU889@ZenIV.linux.org.uk>

On Apr 15, 2015 6:04 PM, "Al Viro" wrote:
>
> On Wed, Apr 15, 2015 at 05:47:18PM -0700, Andy Lutomirski wrote:
>
> > I wonder if we could get away with having the receiver pre-allocate
> > some placeholder fds and then have the kernel replace a placeholder
> > with a passed fd immediately when the fd is sent and enqueue *that* in
> > the cmsg data. If you send an fd to someone who hasn't assigned any
> > placeholders to the receiving socket, then you get an error.
>
> *UGH*
>
> It's a really bad idea. The thing is, descriptor table that isn't shared
> is assumed to be unchanged.
> So when fdget() looks a file up, it doesn't
> have to bump its refcount - the reference in descriptor table itself will
> stay. Conversely, fdput() doesn't have to drop it in such case (we encode
> whether we need to drop into struct fd returned by fdget() and passed to
> fdput()).
>
> That relies on no third-party modifications of descriptor table and yes,
> the effect _is_ noticeable - playing with struct file refcounts does result
> in considerable overhead.
>
> If recipient sits in "gimme a descriptor", we are fine - if descriptor table
> was shared, the other users would be doing full refcount song and dance and
> if it wasn't, recipient is the sole user _and_ it isn't between fdget() and
> fdput() at the moment. With your "replace the dummies when sending" trick
> we break all of that - we don't know what the recipient is doing at the moment
> and for all we know they might be in the middle of something like e.g.
> fstat() on your placeholder. With rather unpleasant effects...

Hmm. I don't love the special blocking call either -- it breaks polling loops.

We could have the existence of a placeholderfd count as an extra
reference to the descriptor table, with the associated performance
hit.

Or we could allow each placeholderfd to collect one received fd but
not actually switch over. The latter is ugly and still has minor DoS
issues -- we'd have to prevent placeholderfds from being passed
through this mechanism or SCM_RIGHTS.

But wait... what about an evil trick? What if all placeholderfds are
the *same* struct file and that struct file is never deleted? Then
fdget on a placeholderfd is safe, since it's implicitly pinned.

--Andy
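
For anyone following along, here is a rough userspace sketch of the refcounting argument in this thread. It is not kernel code: every toy_* name is made up for illustration, and the "permanent pin" is modeled, as an assumption, by simply starting the placeholder with one extra reference. It only shows why a borrowed (non-refcounted) lookup is unsafe if a third party swaps the slot, and why a never-freed shared placeholder would sidestep that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are hypothetical names, not the kernel's types. */
struct toy_file { int refcount; };

struct toy_fdtable {
    int users;                    /* number of tasks sharing this table */
    struct toy_file *slot[8];
};

struct toy_fd {
    struct toy_file *file;
    bool need_put;                /* did toy_fdget() take a reference? */
};

/* Look up a file; only take a reference if the table is shared. */
static struct toy_fd toy_fdget(struct toy_fdtable *t, int fd)
{
    struct toy_fd r = { NULL, false };

    if (fd < 0 || fd >= 8 || t->slot[fd] == NULL)
        return r;
    r.file = t->slot[fd];
    if (t->users > 1) {           /* shared: full refcount song and dance */
        r.file->refcount++;
        r.need_put = true;
    }
    return r;
}

/* Drop the reference only if toy_fdget() actually took one. */
static void toy_fdput(struct toy_fd f)
{
    if (f.need_put)
        f.file->refcount--;
}

/*
 * Third-party replacement of a slot, as in "replace the dummies when
 * sending". Returns the number of references left on the old file.
 */
static int toy_replace(struct toy_fdtable *t, int fd, struct toy_file *new_file)
{
    struct toy_file *old = t->slot[fd];

    new_file->refcount++;
    t->slot[fd] = new_file;
    return --old->refcount;
}

/*
 * Simulate: the receiver is between fdget and fdput on the placeholder
 * while a sender replaces it. Returns true if the receiver would be left
 * holding a pointer to a file with no references (freed, in real life).
 */
static bool toy_hazard(int users, int initial_refs)
{
    struct toy_file placeholder = { initial_refs };
    struct toy_file passed = { 0 };
    struct toy_fdtable t = { .users = users };
    struct toy_fd f;
    int left;
    bool dangling;

    t.slot[3] = &placeholder;
    f = toy_fdget(&t, 3);
    left = toy_replace(&t, 3, &passed);
    dangling = (left == 0) && !f.need_put;
    toy_fdput(f);
    return dangling;
}
```

In this model toy_hazard(1, 1) is the problematic case (unshared table, borrowed lookup, slot swapped underneath), toy_hazard(2, 1) is the shared-table case where the lookup holds its own reference, and toy_hazard(1, 2) stands in for the "same struct file, never deleted" trick.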