Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751449AbbGaS4L (ORCPT ); Fri, 31 Jul 2015 14:56:11 -0400 Received: from mail-ig0-f176.google.com ([209.85.213.176]:36460 "EHLO mail-ig0-f176.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750957AbbGaS4I (ORCPT ); Fri, 31 Jul 2015 14:56:08 -0400 MIME-Version: 1.0 In-Reply-To: <20150731010234.GA7265@x> References: <1438242731-27756-1-git-send-email-drysdale@google.com> <1438242731-27756-2-git-send-email-drysdale@google.com> <20150730083831.GA22182@gmail.com> <20150730190434.GD16452@x> <20150731010234.GA7265@x> Date: Fri, 31 Jul 2015 11:56:06 -0700 X-Google-Sender-Auth: 5FCa06oks1C3KtNrNa14XM0KUFw Message-ID: Subject: Re: [PATCHv2 1/1] Documentation: describe how to add a system call From: Kees Cook To: Josh Triplett Cc: David Drysdale , Ingo Molnar , Linux API , Michael Kerrisk , Andrew Morton , Arnd Bergmann , Shuah Khan , Jonathan Corbet , Eric B Munson , Randy Dunlap , Andrea Arcangeli , Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , Oleg Nesterov , Linus Torvalds , Greg Kroah-Hartman , Andy Lutomirski , Al Viro , Rusty Russell , Peter Zijlstra , Vivek Goyal , Alexei Starovoitov , David Herrmann , "Theodore Ts'o" , Milosz Tanski , Fam Zheng , Mathieu Desnoyers , "linux-doc@vger.kernel.org" , "linux-kernel@vger.kernel.org" , Peter Zijlstra Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3729 Lines: 90 On Thu, Jul 30, 2015 at 6:02 PM, Josh Triplett wrote: > On Thu, Jul 30, 2015 at 01:03:43PM -0700, Kees Cook wrote: >> On Thu, Jul 30, 2015 at 12:04 PM, Josh Triplett wrote: >> > On Thu, Jul 30, 2015 at 11:21:54AM -0700, Kees Cook wrote: >> >> I like this, it's a good description of both options. I'm still biased >> >> about the approach: I prefer flags, since pointers to user structures >> >> complicate syscall filtering. ;) >> > >> > Seems like we should do two things to make that easier: >> > >> > 1) Create a standardized kernel mechanism for parameter-struct handling, >> > implementing the recommendations mentioned here. >> >> It's been suggested in the past that nlmsg is appropriate for such a >> thing, but I remain suspicious. :) > > Likewise. :) > >> > 2) Integrate into that mechanism a way to filter the resulting parameter >> > struct with BPF *after* it has been copied to kernel space (and thus >> > can no longer be tampered with). >> >> Yeah, this is a irritating part: the structures operated on are copied >> from userspace adhoc in each syscall. Doing argument checking would >> mean double copies initially, and perhaps teaching syscalls about >> optional "already copied" arguments or something as an optimization. > > No, double copies can't work for security reasons. Because otherwise > you could race the kernel from another thread, substituting different > values after the check and before the use. Right, the double copy method would require setting up a per-thread userspace memory mapping that was read-only from userspace but writable from kernel space. > I think the right API looks *roughly* like this: > > int _copy_param_struct(size_t kernel_len, void *kernel_struct, size_t user_len, void __user *user_struct) > { > if (user_len > kernel_len) > return -EINVAL; > if (user_len && copy_from_user(kernel_struct, user_struct, user_len)) > return -EFAULT; > if (user_len < kernel_len) > memset(kernel_struct + user_len, 0, kernel_len - user_len); > return 0; > } > > #define copy_param_struct(kernel_struct, user_len, user_struct) _copy_param_struct( \ > sizeof(*kernel_struct) + BUILD_BUG_ON_ZERO(!__same_type(*kernel_struct, *user_struct)), \ > kernel_struct, user_len, user_struct) > > > Then the syscall looks like this: > > SYSCALL_DEFINEn(xyzzy, ..., ..., size_t user_params_len, struct xyzzy_params __user *user_params) > { > int ret; > struct xyzzy_params params; > > ret = copy_param_struct(¶ms, user_params_len, user_params); > if (ret) > return ret; > ... > > > And you could then hook copy_params_struct to add arbitrary additional > syscall parameter validation. Bonus if there's some way to make the > copy and validation occur before the syscall is ever invoked, rather > than inside the syscall, but that would require adding fancier syscall > definition mechanisms that autogenerate such code. The trouble is that the hook for the syscall (both seccomp and ptrace) happens before the sys_* function executes. So the param extract suddenly becomes optional. As in, did ptrace/seccomp already extract the args? If so, use that copy, else copy them out myself now that I need them, etc. It's entirely doable, but it's going to require some careful design. -Kees -- Kees Cook Chrome OS Security -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/