Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754443AbbDQT1n (ORCPT ); Fri, 17 Apr 2015 15:27:43 -0400
Received: from bedivere.hansenpartnership.com ([66.63.167.143]:33515 "EHLO bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754218AbbDQT1k (ORCPT ); Fri, 17 Apr 2015 15:27:40 -0400
Message-ID: <1429298858.1079.22.camel@HansenPartnership.com>
Subject: Re: [GIT PULL] kdbus for 4.1-rc1
From: James Bottomley
To: David Herrmann
Cc: Greg Kroah-Hartman , Jiri Kosina , Steven Rostedt , John Stoffel , Andy Lutomirski , Linus Torvalds , Andrew Morton , Arnd Bergmann , "Eric W. Biederman" , One Thousand Gnomes , Tom Gundersen , "linux-kernel@vger.kernel.org" , Daniel Mack , Djalal Harouni , "Paul E. McKenney"
Date: Fri, 17 Apr 2015 12:27:38 -0700
In-Reply-To:
References: <20150413190350.GA9485@kroah.com> <20150413204547.GB1760@kroah.com> <20150414175019.GA2874@kroah.com> <20150414192357.GA6107@kroah.com> <21805.29994.968937.364993@quad.stoffel.home> <20150414215135.GC6801@home.goodmis.org> <20150415083714.GD16381@kroah.com> <1429121536.2187.44.camel@HansenPartnership.com>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.12.11
Mime-Version: 1.0
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4673
Lines: 90

On Thu, 2015-04-16 at 14:13 +0200, David Herrmann wrote:
> Hi
>
> On Wed, Apr 15, 2015 at 8:12 PM, James Bottomley
> wrote:
> > For me the biggest issue is the container problem: it's really hard to
> > containerise kdbus because of the stateful nature of the protocol and
> > the fact that it has a well known system bus. Separation into domains
> > works for OS containers, but application containers need more fluidity.
> > It's not unlike the same problem on Windows: Windows application
> > containers are very difficult to do because the global registry means
> > that OLE handlers all have to run inside your container as well
> > (effectively making it an OS container). I'm sure, since we already
> > have a lot of containers people going to plumbers, that we can get them
> > to turn up for the discussion.
>
> kdbus actually works very well in OS containers that mount a new
> kdbusfs inside the container. This new instance of kdbus will be
> entirely separated from any other on the system. We've designed it
> that way especially with OS containers in mind. This is explained in
> kdbus.fs(7). It's very similar to devpts' container support, where you
> mount a new instance of devpts into each container instance you run.
>
> For Docker-style (i.e. app-focused) containers, it's a more complex
> story.

Well, no, Docker-style is just one flavour of application containers.
I'm actually much more interested in something very different:
applications that use container features (like Docker, Rocket and
systemd). Facilitating them is an interesting exercise. Also,
applications inside containers were around long before Docker, in the
PaaS space at least.

> kdbus will not solve this for you, but at least one thing
> deserves being mentioned: for this kind of sandboxing kdbus certainly
> makes things *easier*, compared to dbus1.

So slightly better than really difficult isn't terribly useful.

> Why? because the kernel
> gains a notion of individual messages and method call transactions,
> something that is completely unavailable if you stick to dbus1 where
> all the kernel sees is a raw stream of AF_UNIX/SOCK_STREAM bytes. In
> fact, kdbus as it is right now even contains minimal but explicit
> support for sandboxing, by allowing creation of multiple bus endpoints
> to the same bus that carry additional, more restrictive policy.

Sandboxing is a minor (albeit very useful) use of containers.
You nicely ignored the actual problem I listed, which is the system bus.
And the specific example of what happens. Let me try again.

Just to provide the context, Virtuozzo has long supported containers on
both Windows and Linux. We have been doing application containers on
Linux for a long time, but we've been having issues doing the same thing
on Windows (in spite of the fact that our Windows container system is
very similar to the Linux one).

In Windows, OLE + the global registry is dbus on steroids. The idea
seems simple and elegant: remote system elements are provided to you via
an IPC interaction instead of being directly dynamically linked into
your virtual address space. It allows Windows applications to deal with
arbitrary objects of unknown type because the type handlers are provided
by the system via OLE. It's really elegant in a single-user desktop
environment because the system's job is to serve and protect only that
user. In a multi-user environment (as MS found with VDI) it's a lot more
problematic because now either the type handlers are global (meaning
local users can't modify them, unlike in the single-user case) or
they're all local, meaning we're back to OS containers again.

If you think abstractly of containers as a way to bring multi-user
features to single-user environments (essentially that's what OS
virtualization is), you can see immediately why we're having such issues
with non-OS containers on Windows: the single bus/global namespace idea
doesn't play well with multi-user.

This is why I think kdbus is a bad idea: it solidifies as a Linux kernel
API something which runs counter to granular OS virtualization (and
something which caused Windows to fall behind Linux in the container
space). Splitting out the acceleration problem and leaving the rest to
user space currently looks fine because the ideas Al and Andy are
kicking around don't cause problems with OS virtualization.

James