Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752624AbbGaNGl (ORCPT ); Fri, 31 Jul 2015 09:06:41 -0400 Received: from relay4-d.mail.gandi.net ([217.70.183.196]:49233 "EHLO relay4-d.mail.gandi.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750941AbbGaNGh (ORCPT ); Fri, 31 Jul 2015 09:06:37 -0400 X-Originating-IP: 50.43.43.179 Date: Fri, 31 Jul 2015 06:06:21 -0700 From: Josh Triplett To: David Drysdale Cc: Michael Kerrisk , Linux API , Andrew Morton , Arnd Bergmann , Shuah Khan , Jonathan Corbet , Eric B Munson , Randy Dunlap , Andrea Arcangeli , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , Oleg Nesterov , Linus Torvalds , Greg Kroah-Hartman , Andy Lutomirski , Al Viro , Rusty Russell , Peter Zijlstra , Vivek Goyal , Alexei Starovoitov , David Herrmann , "Theodore Ts'o" , Kees Cook , Miklos Szeredi , Milosz Tanski , Fam Zheng , Mathieu Desnoyers , linux-doc@vger.kernel.org, "linux-kernel@vger.kernel.org" Subject: Re: [PATCHv2 1/1] Documentation: describe how to add a system call Message-ID: <20150731130620.GA16435@x> References: <1438242731-27756-1-git-send-email-drysdale@google.com> <1438242731-27756-2-git-send-email-drysdale@google.com> <20150730185007.GC16452@x> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5017 Lines: 101 On Fri, Jul 31, 2015 at 10:48:32AM +0100, David Drysdale wrote: > On Thu, Jul 30, 2015 at 7:50 PM, Josh Triplett wrote: > > On Thu, Jul 30, 2015 at 08:52:11AM +0100, David Drysdale wrote: > >> Add a document describing the process of adding a new system call, > >> including the need for a flags argument for future compatibility, and > >> covering 32-bit/64-bit concerns (albeit in an x86-centric way). > >> > >> Signed-off-by: David Drysdale > >> Reviewed-by: Michael Kerrisk > >> Reviewed-by: Eric B Munson > >> Reviewed-by: Kees Cook > >> Reviewed-by: Randy Dunlap > > > > A few comments below. I also agree with Ingo's comment about > > documenting the param-structure approach in addition to flags. > > Many thanks for looking through the doc in detail! > > I'll send out an updated doc today; I'm then on vacation for the next > week so I'll collate further feedback after I return. Thanks for the quick response and for incorporating these changes. > > Some things to add a this point, before talking about adding a new file > > descriptor: > > > > If you want to provide userspace with a handle to a kernel object, do so > > in the form of a file descriptor. Do not invent a new type of userspace > > object handle with its own namespace. The kernel already has numerous > > system calls and well-defined semantics for managing and passing around > > file descriptors, as well as for cleaning up when the file descriptor is > > no longer in use. > > > > If your new system call returns a file descriptor, think about what it > > means to use the poll(2) family of syscalls on that file descriptor. If > > you want to send an event to userspace associated with your kernel > > object, you should do so by making your file descriptor ready for > > reading or writing. > > Good points, I've added a couple of paragraphs along these lines: > > " > If your new system call allows userspace to refer to a kernel object, it > should use a file descriptor as the handle for that object -- don't invent a > new type of userspace object handle when the kernel already has mechanisms and > well-defined semantics for using file descriptors. > > [snip: existing text on CLOEXEC] > > If your system call returns a new file descriptor, you should also consider > what it means to use the poll(2) family of system calls on that file > descriptor. Making a file descriptor ready for reading or writing is the > normal way for the kernel to indicate to userspace that an event has > occurred on the corresponding kernel object. > " These look good to me. > >> +needs to be governed by the appropriate Linux capability bit (checked with a > >> +call to capable()), as described in the capabilities(7) man page. > >> + > >> + - If there is an existing capability that governs related functionality, then > >> + use that. However, avoid combining lots of only vaguely related functions > >> + together under the same bit, as this goes against capabilities' purpose of > >> + splitting the power of root. In particular, avoid adding new uses of the > >> + already overly-general CAP_SYS_ADMIN capability. > >> + - If there is no related capability, then consider adding a new capability > >> + bit -- but bear in mind that the numbering space is limited, and each new > >> + bit needs to be understood and administered by sysadmins. > > > > Many previous discussions on this topic seem to have concluded that it's > > almost impossible to add a new capability without breaking existing > > programs. I would suggest not even mentioning this possibility. > > I'm not particularly knowledgable about capabilities (at least, not the > POSIX.1e/Linux kind), so I'll confess that I got this suggestion from > Michael Kerrisk. > > Michael, do you agree that we should just drop the possibility of adding > new capability bits? > > Also, Josh, do you have any references to the earlier discussions on the > topic so I can update the Sources section? No direct links handy at the moment without some searching, but one iteration of it came up when Matthew Garrett suggested adding CAP_COMPROMISE_KERNEL, and the ensuing discussion of capability semantics suggested that the way the capability space was defined and controlled by userspace meant that adding any new capability would effectively break userspace ABI. The userspace ABI for capabilities is not clear; some applications drop all possible capabilities and could get surprised by a new capability being dropped, while other applications drop only capabilities they know about and could get surprised by a new capability *not* being dropped. - Josh Triplett -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/