From: Andy Lutomirski
Date: Mon, 22 Jun 2015 23:06:09 -0700
Subject: kdbus: to merge or not to merge?
To: Linus Torvalds, "linux-kernel@vger.kernel.org", David Herrmann, Djalal Harouni, Greg KH, Havoc Pennington, "Eric W. Biederman", One Thousand Gnomes, Tom Gundersen, Daniel Mack

Hi Linus,

Can you opine as to whether you think that kdbus should be merged? I don't mean whether you'd accept a pull request that Greg may or may not send during this merge window -- I mean whether you think that kdbus should be merged if it had appropriate review and people were okay with the implementation.

The current state of uncertainty is problematic, I think. The kdbus team is spending a lot of time making things compatible with kdbus, and the latest systemd release makes kdbus userspace support mandatory. The kernel people who would review it (myself included) probably don't want to review new versions at a line-by-line level, because we (myself included) either don't know whether there's any point or don't think that it should be merged *even if the implementation were flawless*.

For my part, here's my argument why the answer should be "no, kdbus shouldn't be merged":

1. It's not necessary. kdbus is a giant API surface.
The problems it purports to solve are (very roughly) performance, ability to collect metadata in a manner that doesn't suck, sandbox support, better logging/monitoring, and availability very early in userspace startup. I think that the performance issues should be solved in userspace -- gdbus performance is atrocious for reasons that have nothing to do with the kernel or context switches [1]. The metadata problem, to the extent that it's a real problem, can and should be solved by improving AF_UNIX. The logging, monitoring, and early userspace problems can and should be solved in userspace. See #3 below for my thoughts on the sandbox. Right now, kdbus sounds awfully like Tux.

2. Kdbus introduces a novel buffering model. Receivers allocate a big chunk of what's essentially tmpfs space. Assuming that space is available (in a virtual memory sense), senders synchronously write to the receivers' tmpfs space. Broadcast senders synchronously write to *all* receivers' tmpfs space.

I think that, regardless of implementation, this is problematic if the sender and the receiver are in different memcgs. Suppose that the message is to be written to a page in the receivers' tmpfs space that is not currently resident. If the write happens in the sender's memcg context, then a receiver can effectively allocate an unlimited number of pages in the sender's memcg, which will, in practice, be the init memcg if the sender is systemd. This breaks the memcg model. If, on the other hand, the sender writes to the receiver's tmpfs space in the receiver's memcg context, then the sender will block (or fail? presumably unpredictable failures are a bad thing) if the receiver's memcg is at capacity.

3. The sandbox model is, in my opinion, an experiment that isn't going to succeed. It's a poor model: a "restricted endpoint" (i.e. a sandboxed kdbus client) sees a view of the world defined by a limited policy language implemented by the kernel.
This completely fails to express what I think should be common use cases. If a sandboxed app is given permission to access, say, /org/gnome/evolution/dataserver/CalendarView/3125/12, then it knows that it's looking at CalendarView/3125/12 (whatever that means) and there's no way to hide the name. If someone subsequently deletes that CalendarView and creates a new one with that name, racelessly blocking access to the new one for the app may be complicated. If a sandbox wants to prompt the user before allowing access to some resource, it has a problem: the policy language doesn't seem to be able to express request interception.

The sandbox model is also already starting to accumulate kludges. Apparently it was recently discovered that the kdbus connection lifetime model was incompatible with sandbox policy, so as of a recent change [2] connection lifetime messages completely bypass sandbox policy. Maybe this isn't obviously insecure, but it seems like a bad sign that "it's probably okay to poke this hole" is already happening before the thing is even merged.

I'll point out that a pure userspace implementation of sandboxed dbus connections would be straightforward to implement today, would have none of these problems, and would allow arbitrarily complex policy and the flexibility to redesign it in the future if the initial design turned out to be inappropriate for the sandbox being written. (You could even have two different implementations to go with two different sandboxes. Let a thousand sandboxes bloom, which is easy in userspace but not so great in the kernel.)

In summary, I think that a very high quality implementation of the kdbus concept and API would be a bit faster than a very high quality userspace implementation of dbus. Other than that, I think it would actually be worse. The kdbus proponents seem to be comparing the current kdbus implementation to the current userspace implementation, and a favorable comparison there is not a good reason to merge it.

--Andy

[1] I spent a while today trying to benchmark sd-bus. I gave up, because I couldn't get test code to build. I don't have the patience to try harder.

[2] https://git.kernel.org/cgit/linux/kernel/git/gregkh/char-misc.git/commit/?h=kdbus&id=d27c8057699d164648b7d8c1559fa6529998f89d
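
To make the userspace-sandbox argument in point 3 a little more concrete, here is a minimal sketch of what such a policy layer could look like. It is only an illustration under stated assumptions: `SandboxBroker`, its methods, and the `/sandbox/object/N` handle scheme are all hypothetical names invented for this example, not part of dbus, kdbus, or any existing sandbox. It shows three things the message says a kernel policy language struggles with: hiding the real object name behind an opaque handle, making access to a deleted object stay revoked even if the name is reused, and intercepting a request so the user can be prompted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class SandboxBroker:
    """Hypothetical userspace broker sitting between a sandboxed app and the bus."""

    # Interception hook (assumption): called with the real path before a
    # request is forwarded; returns True to allow it, False to deny it.
    prompt: Callable[[str], bool]
    _handles: Dict[str, str] = field(default_factory=dict)  # opaque handle -> real path
    _next_id: int = 1

    def grant(self, real_path: str) -> str:
        """Expose real_path to the app under an opaque handle that hides its name."""
        handle = f"/sandbox/object/{self._next_id}"
        self._next_id += 1
        self._handles[handle] = real_path
        return handle

    def revoke_path(self, real_path: str) -> None:
        """Drop every handle that points at real_path, e.g. when the object is deleted."""
        self._handles = {h: p for h, p in self._handles.items() if p != real_path}

    def route(self, handle: str) -> Optional[str]:
        """Return the real path to forward a request to, or None if it is denied."""
        real_path = self._handles.get(handle)
        if real_path is None:
            return None  # unknown or revoked handle
        if not self.prompt(real_path):
            return None  # the user (or policy) said no
        return real_path


if __name__ == "__main__":
    real = "/org/gnome/evolution/dataserver/CalendarView/3125/12"
    broker = SandboxBroker(prompt=lambda path: True)  # stand-in for a real user prompt
    handle = broker.grant(real)
    print(handle, "->", broker.route(handle))
    broker.revoke_path(real)  # the CalendarView is deleted
    broker.grant(real)        # a new object reuses the same name
    print(handle, "->", broker.route(handle))  # the old handle stays dead
```

Because handles are never reused, an app holding a handle to a deleted object cannot reach a new object that happens to get the same name. Since all of this is ordinary userspace code, a different sandbox could swap in an entirely different policy without any kernel change.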