Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752600AbaFYSZb (ORCPT ); Wed, 25 Jun 2014 14:25:31 -0400
Received: from mail-oa0-f46.google.com ([209.85.219.46]:61475 "EHLO
        mail-oa0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755144AbaFYSZ1 (ORCPT );
        Wed, 25 Jun 2014 14:25:27 -0400
MIME-Version: 1.0
In-Reply-To:
References: <1403642893-23107-1-git-send-email-keescook@chromium.org>
        <1403642893-23107-10-git-send-email-keescook@chromium.org>
        <20140625142121.GD7892@redhat.com>
        <20140625165209.GA14720@redhat.com>
        <20140625172410.GA17133@redhat.com>
Date: Wed, 25 Jun 2014 11:25:26 -0700
X-Google-Sender-Auth: dB6MJ5EzhM8zcf7qBytVy1FBcfE
Message-ID:
Subject: Re: [PATCH v8 9/9] seccomp: implement SECCOMP_FILTER_FLAG_TSYNC
From: Kees Cook
To: Andy Lutomirski
Cc: Oleg Nesterov, LKML, "Michael Kerrisk (man-pages)",
        Alexei Starovoitov, Andrew Morton, Daniel Borkmann, Will Drewry,
        Julien Tinnes, David Drysdale, Linux API, "x86@kernel.org",
        "linux-arm-kernel@lists.infradead.org", linux-mips@linux-mips.org,
        linux-arch, linux-security-module
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jun 25, 2014 at 11:09 AM, Andy Lutomirski wrote:
> On Wed, Jun 25, 2014 at 10:57 AM, Kees Cook wrote:
>> On Wed, Jun 25, 2014 at 10:24 AM, Oleg Nesterov wrote:
>>> On 06/25, Kees Cook wrote:
>>>>
>>>> On Wed, Jun 25, 2014 at 9:52 AM, Oleg Nesterov wrote:
>>>> >
>>>> > Yes, at least this should close the race with suid-exec. And there are no
>>>> > other users. Except apparmor, and I hope you will check it because I simply
>>>> > do not know what it does ;)
>>>> >
>>>> >> I wonder if changes to nnp need to be "flushed" during syscall entry
>>>> >> instead of getting updated externally/asynchronously? That way it
>>>> >> won't be out of sync with the seccomp mode/filters.
>>>> >>
>>>> >> Perhaps secure computing needs to check some (maybe seccomp-only)
>>>> >> atomic flags and flip on the "real" nnp if found?
>>>> >
>>>> > Not sure I understand you, could you clarify?
>>>>
>>>> Instead of having TSYNC change the nnp bit, it can set a new flag, say:
>>>>
>>>>     task->seccomp.flags |= SECCOMP_NEEDS_NNP;
>>>>
>>>> This would be set along with seccomp.mode, seccomp.filter, and
>>>> TIF_SECCOMP. Then, during the next secure_computing() call that thread
>>>> makes, it would check the flag:
>>>>
>>>>     if (task->seccomp.flags & SECCOMP_NEEDS_NNP)
>>>>         task->nnp = 1;
>>>>
>>>> This means that nnp couldn't change in the middle of a running syscall.
>>>
>>> Aha, so you were worried about the same thing. Not sure we need this,
>>> but at least I understand you and...
>>>
>>>> Hmmm. Perhaps this doesn't solve anything, though? Perhaps my proposal
>>>> above would actually make things worse, since now we'd have a thread
>>>> with seccomp set up, and no nnp. If it was in the middle of exec,
>>>> we're still causing a problem.
>>>
>>> Yes ;)
>>>
>>>> I think we'd also need a way to either delay the seccomp changes, or
>>>> to notice this condition during exec. Bleh.
>>>
>>> Hmm. confused again,
>>
>> I mean to suggest that the tsync changes would be stored in each
>> thread, but somewhere other than the true seccomp struct, but with
>> TIF_SECCOMP set. When entering secure_computing(), current would check
>> for the "changes to sync", and apply them, then start the syscall. In
>> this way, we can never race a syscall (like exec).
>
> I'm not sure that helps. If you set a pending filter part-way through
> exec, and exec copies that pending filter but doesn't notice NNP, then
> there's an exploitable race.
>
>>
>>>> What actually happens when a multi-threaded process calls exec? I
>>>> assume all the other threads are destroyed?
>>>
>>> Yes.
>>> But this is the point-of-no-return, de_thread() is called after the execing
>>> thread has already passed (say) check_unsafe_exec().
>>>
>>> However, do_execve() takes cred_guard_mutex at the start in prepare_bprm_creds()
>>> and drops it in install_exec_creds(), so it should solve the problem?
>>
>> I can't tell yet. I'm still trying to understand the order of
>> operations here. It looks like de_thread() takes the sighand lock.
>> do_execve_common does:
>>
>>     prepare_bprm_creds (takes cred_guard_mutex)
>>     check_unsafe_exec (checks nnp to set LSM_UNSAFE_NO_NEW_PRIVS)
>>     prepare_binprm (handles suid escalation, checks nnp separately)
>>         security_bprm_set_creds (checks LSM_UNSAFE_NO_NEW_PRIVS)
>>     exec_binprm
>>         load_elf_binary
>>             flush_old_exec
>>                 de_thread (takes and releases sighand->lock)
>>             install_exec_creds (releases cred_guard_mutex)
>>
>> I don't see a way to use cred_guard_mutex during tsync (which holds
>> sighand->lock) without dead-locking. What were you considering here?
>
> Grab cred_guard_mutex and then sighand->lock, perhaps?

Ah, yes, task->signal is like sighand: shared across all threads.

-- 
Kees Cook
Chrome OS Security
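
The ordering suggested in the last exchange (cred_guard_mutex first, then
sighand->lock) can be modelled in userspace C. This is only a sketch under
stated assumptions, not the patch: group_model, task_model, tsync_model()
and exec_model() are hypothetical names, pthread mutexes stand in for the
kernel locks, and the exec steps are reduced to the ones listed in the
thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are not the kernel's types. */
struct group_model {
    pthread_mutex_t cred_guard_mutex; /* models the per-group cred_guard_mutex */
    pthread_mutex_t siglock;          /* models sighand->lock as named in the thread */
};

struct task_model {
    struct group_model *group; /* shared by every thread in the group */
    bool no_new_privs;
    bool has_filter;
};

/*
 * TSYNC side: take the outer mutex first, then the inner lock, and only
 * then update nnp and the filter state for every thread in the group.
 * The caller is assumed to have initialised both mutexes, for example
 * with PTHREAD_MUTEX_INITIALIZER.
 */
static void tsync_model(struct task_model *threads, size_t n)
{
    if (n == 0)
        return;

    struct group_model *g = threads[0].group;

    pthread_mutex_lock(&g->cred_guard_mutex);
    pthread_mutex_lock(&g->siglock);
    for (size_t i = 0; i < n; i++) {
        assert(threads[i].group == g); /* all threads share one group */
        threads[i].no_new_privs = true;
        threads[i].has_filter = true;
    }
    pthread_mutex_unlock(&g->siglock);
    pthread_mutex_unlock(&g->cred_guard_mutex);
}

/*
 * exec side: same lock order. Returns the nnp value that the privilege
 * checks sampled while the outer mutex was held.
 */
static bool exec_model(struct task_model *t)
{
    struct group_model *g = t->group;

    pthread_mutex_lock(&g->cred_guard_mutex);   /* prepare_bprm_creds */
    bool sampled_nnp = t->no_new_privs;         /* check_unsafe_exec */
    pthread_mutex_lock(&g->siglock);            /* de_thread takes ... */
    pthread_mutex_unlock(&g->siglock);          /* ... and releases it */
    pthread_mutex_unlock(&g->cred_guard_mutex); /* install_exec_creds */
    return sampled_nnp;
}
```

In this model both paths take the outer mutex before the inner lock, so
neither can hold sighand->lock while waiting for cred_guard_mutex, and a
sync cannot land between exec's nnp check and install_exec_creds.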