From: Havoc Pennington
Date: Thu, 16 Apr 2015 15:01:29 -0400
Subject: Re: [GIT PULL] kdbus for 4.1-rc1
To: Tom Gundersen
Cc: Andy Lutomirski, Rik van Riel, One Thousand Gnomes,
    Greg Kroah-Hartman, Jiri Kosina, Linus Torvalds, Andrew Morton,
    Arnd Bergmann, "Eric W. Biederman", "linux-kernel@vger.kernel.org",
    Daniel Mack, David Herrmann, Djalal Harouni
References: <20150413190350.GA9485@kroah.com>
    <20150413204547.GB1760@kroah.com>
    <20150414175019.GA2874@kroah.com>
    <20150415085641.GH16381@kroah.com>
    <20150415120618.4d8d90ff@lxorguk.ukuu.org.uk>
    <552E8B11.4010803@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 16, 2015 at 9:13 AM, Tom Gundersen wrote:
> All types of messages (unicast and broadcast) are directly stored into
> a pool slice of the receiving connection, and this slice is not reused
> by the kernel until userspace is finished with it and frees it. Hence,
> a client which doesn't process its incoming messages will, at some
> point, run out of pool space. If that happens for unicast messages,
> the sender will get an EXFULL error. If it happens for a multicast
> message, all we can do is drop the message, and tell the receiver how
> many messages have been lost when it issues KDBUS_CMD_RECV the next
> time. There's more on that in kdbus.message(7).
>

Have you guys already grappled with what libraries/apps should do with
this information?

To handle the knowledge that "N messages have been lost," it seems like
the client must answer "are there any messages that, if lost, would put
any code using this connection into a confused state?" and then the
client has to recover from said confused state.

A library probably can't do this - it doesn't know what state matters
or how to recover it - so each app would have to... and are connections
ever shared between modules of an app? (For example: could a library
such as GTK+ or pulseaudio be using the connection, and then
application code is also using the connection, so none of those code
modules has the whole picture... at that point, none of the modules
knows what to do about lost messages... to try to handle lost messages
in a module, you'd need a private connection(?)... which might be fine
as long as each app having a number of connections isn't too bloated.)

How to handle a send error depends a lot on what's being sent... but if
I were writing a general-purpose library wrapper, I'd be very tempted
to hide EXFULL behind an unbounded (or very-high-bounded) userspace
send buffer, which of course is what you were trying to avoid, but I am
skeptical that the average app will handle this error sensibly.

The traditional userspace bus isn't any better than what you've
described here, of course - it's even worse - and it works well enough.
The limits are simply set high enough that they won't be hit unless
someone's broken or evil. Which is also the traditional approach to,
say, file descriptor limits or swap space: set the limit high and hope
you won't reach it. For the case of the X server, the limit on message
buffers appears to be "until malloc fails," so they have the limit
quite high, higher than userspace dbus does. "Set high limits and don't
hit them" is a tried-and-true approach.

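As a rough illustration of the "hide EXFULL behind a very-high-bounded
userspace send buffer" idea mentioned above, here is a minimal Python
sketch. Everything in it is hypothetical and only a toy model: PoolFull
stands in for the EXFULL error, ReceiverPool imitates a receiver pool
whose slots stay occupied until the receiver frees them, and
BufferedSender is not part of kdbus or of any D-Bus library.

```python
from collections import deque


class PoolFull(Exception):
    """Hypothetical stand-in for the EXFULL error discussed in the thread."""


class ReceiverPool:
    """Toy receiver pool: a slot stays in use until the receiver frees it."""

    def __init__(self, slots):
        self.slots = slots
        self.messages = deque()

    def deliver(self, msg):
        if len(self.messages) >= self.slots:
            raise PoolFull()
        self.messages.append(msg)

    def receive_and_free(self):
        # Reading a message also releases its slot in this simplified model.
        return self.messages.popleft()


class BufferedSender:
    """Hides PoolFull behind a userspace queue with a deliberately high bound."""

    def __init__(self, pool, max_pending=100_000):
        self.pool = pool
        self.max_pending = max_pending
        self.pending = deque()

    def send(self, msg):
        self.flush()  # older messages go first, so ordering is preserved
        if len(self.pending) >= self.max_pending:
            raise PoolFull()  # the high limit was hit after all
        self.pending.append(msg)
        self.flush()

    def flush(self):
        """Deliver as many queued messages as the pool accepts; return the count."""
        sent = 0
        while self.pending:
            try:
                self.pool.deliver(self.pending[0])
            except PoolFull:
                break
            self.pending.popleft()
            sent += 1
        return sent
```

The caller only sees an error once max_pending is exceeded, which is
the "set the limit high and hope you won't reach it" trade-off: the
limit has moved into userspace memory, it has not gone away.
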
With either the existing userspace bus or kdbus, I bet you could come
up with ways to use limit exhaustion to get various services and apps
into confused states as they miss messages they were relying on, simply
because this is too hard for apps to reliably get right. The lower the
limits, the easier it would be to cause trouble by forcing them to be
hit.

In a perfect world we could figure out which client is "at fault" for
filling a buffer - the slow receiver or the overzealous sender - so we
could throttle or disconnect the guilty party instead of throwing
errors that won't be handled well ... but not sure that's practical.

Havoc