References: <20200123152440.28956-1-kpsingh@chromium.org> <20200123152440.28956-5-kpsingh@chromium.org> <20200211031208.e6osrcathampoog7@ast-mbp> <20200211124334.GA96694@google.com> <20200211175825.szxaqaepqfbd2wmg@ast-mbp> <20200211190943.sysdbz2zuz5666nq@ast-mbp> <20200211201039.om6xqoscfle7bguz@ast-mbp>
In-Reply-To: <20200211201039.om6xqoscfle7bguz@ast-mbp>
From: Jann Horn
Date: Tue, 11 Feb 2020 21:33:49 +0100
Subject: Re: BPF LSM and fexit [was: [PATCH bpf-next v3 04/10] bpf: lsm: Add mutable hooks list for the BPF LSM]
To: Alexei Starovoitov
Cc: KP Singh, kernel list, bpf@vger.kernel.org, linux-security-module, Brendan Jackman, Florent Revest, Thomas Garnier, Alexei Starovoitov, Daniel Borkmann, James Morris, Kees Cook, Thomas Garnier, Michael Halcrow, Paul Turner, Brendan Gregg, Matthew Garrett, Christian Brauner, Mickaël Salaün, Florent Revest, Brendan Jackman, "Serge E. Hallyn", Mauro Carvalho Chehab, "David S.
Miller", Greg Kroah-Hartman, Kernel Team
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Feb 11, 2020 at 9:10 PM Alexei Starovoitov wrote:
> On Tue, Feb 11, 2020 at 08:36:18PM +0100, Jann Horn wrote:
> > On Tue, Feb 11, 2020 at 8:09 PM Alexei Starovoitov wrote:
> > > On Tue, Feb 11, 2020 at 07:44:05PM +0100, Jann Horn wrote:
> > > > On Tue, Feb 11, 2020 at 6:58 PM Alexei Starovoitov wrote:
> > > > > On Tue, Feb 11, 2020 at 01:43:34PM +0100, KP Singh wrote:
> > > > [...]
> > > > > > * When using the semantic provided by fexit, the BPF LSM program will
> > > > > > always be executed and will be able to override / clobber the
> > > > > > decision of LSMs which appear before it in the ordered list. This
> > > > > > semantic is very different from what we currently have (i.e. the BPF
> > > > > > LSM hook is only called if all the other LSMs allow the action) and
> > > > > > seems to be bypassing the LSM framework.
> > > > >
> > > > > If that's a concern it's trivial to add 'if (RC == 0)' check to fexit
> > > > > trampoline generator specific to lsm progs.
> > > > [...]
> > > > > Using fexit mechanism and bpf_sk_storage generalization is
> > > > > all that is needed. None of it should touch security/*.
> > > >
> > > > If I understand your suggestion correctly, that seems like a terrible
> > > > idea to me from the perspective of inspectability and debuggability.
> > > > If at runtime, a function can branch off elsewhere to modify its
> > > > decision, I want to see that in the source code. If someone e.g.
> > > > changes the parameters or the locking rules around a security hook,
> > > > how are they supposed to understand the implications if that happens
> > > > through some magic fexit trampoline that is injected at runtime?
> > >
> > > I'm not following the concern.
> > > There is error injection facility that is
> > > heavily used with and without bpf. In this case there is really no difference
> > > whether trampoline is used with direct call or indirect callback via function
> > > pointer. Both will jump to bpf prog. The _source code_ of bpf program will
> > > _always_ be available for humans to examine via "bpftool prog dump" since BTF
> > > is required. So from inspectability and debuggability point of view lsm+bpf
> > > stuff is way more visible than any builtin LSM. At any time people will be able
> > > to see what exactly is running on the system. Assuming folks can read C code.
> >
> > You said that you want to use fexit without touching security/, which
> > AFAIU means that the branch from security_*() to the BPF LSM will be
> > invisible in the *kernel's* source code unless the reader already
> > knows about the BPF LSM. But maybe I'm just misunderstanding your
> > idea.
> >
> > If a random developer is trying to change the locking rules around
> > security_blah(), and wants to e.g. figure out whether it's okay to
> > call that thing with a spinlock held, or whether one of the arguments
> > is actually used, or stuff like that, the obvious way to verify that
> > is to follow all the direct and indirect calls made from
> > security_blah(). It's tedious, but it works, unless something is
> > hooked up to it in a way that is visible in no way in the source code.
> >
> > I agree that the way in which the call happens behind the scenes
> > doesn't matter all that much - I don't really care all that much
> > whether it's an indirect call, a runtime-patched direct call in inline
> > assembly, or an fexit hook.
> > What I do care about is that someone
> > reading through any affected function can immediately see that the
> > branch exists - in other words, ideally, I'd like it to be something
> > happening in the method body, but if you think that's unacceptable, I
> > think there should at least be a function attribute that makes it very
> > clear what's going on.
>
> Got it. Then let's whitelist them ?
> All error injection points are marked with ALLOW_ERROR_INJECTION().
> We can do something similar here, but let's do it via BTF and avoid
> abusing yet another elf section for this mark.
> I think BTF_TYPE_EMIT() should work. Just need to pick explicit enough
> name and extensive comment about what is going on.

Sounds reasonable to me. :)

> Locking rules and cleanup around security_blah() shouldn't change though.
> Like security_task_alloc() should be paired with security_task_free().
> And so on. With bpf_sk_storage like logic the alloc/free of scratch
> space will be similar to the way socket and bpf progs deal with it.
>
> Some of the lsm hooks are in critical path. Like security_socket_sendmsg().
> retpoline hurts. If we go with indirect calls right now it will be harder to
> optimize later. It took us long time to come up with bpf trampoline and build
> bpf dispatcher on top of it to remove single indirect call from XDP runtime.
> For bpf+lsm would be good to avoid it from the start.

Just out of curiosity: Are fexit hooks really much cheaper than
indirect calls? AFAIK ftrace on x86-64 replaces the return pointer for
fexit instrumentation (see prepare_ftrace_return()). So when the
function returns, there is one return misprediction for branching into
return_to_handler(), and then the processor's internal return stack
will probably be misaligned so that after ftrace_return_to_handler()
is done running, all the following returns will also be mispredicted.
So I would've thought that fexit hooks would have at least roughly the
same impact as indirect calls - indirect calls via retpoline do one
mispredicted branch, fexit hooks do at least two AFAICS. But I guess
indirect calls could still be slower if fexit benefits from having all
the mispredicted pointers stored on the cache-hot stack while the
indirect branch target is too infrequently accessed to be in L1D, or
something like that?
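
For reference, the hook-ordering point quoted near the top of this
thread (ordered list that stops at the first denial, versus an exit
hook that always runs, versus an exit hook gated on 'if (RC == 0)')
can be sketched in plain userspace C. This is only a toy model, not
kernel code: hook_fn, call_hooks_in_order(), exit_hook_always(),
exit_hook_if_allowed() and the lsm_*/bpf_prog_* stubs are all
hypothetical names made up for illustration, and the "exit hook" here
is just an ordinary function call, not a real trampoline.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one hook implementation: 0 allows, -errno denies. */
typedef int (*hook_fn)(void);

/* Hypothetical stubs standing in for built-in LSMs and for an attached program. */
int lsm_allow(void)      { return 0; }
int lsm_deny(void)       { return -EPERM; }
int bpf_prog_allow(void) { return 0; }
int bpf_prog_deny(void)  { return -EACCES; }

/* Ordered-list semantics as described in the thread: stop at the first
 * denial, so a later hook only runs if every earlier hook allowed. */
int call_hooks_in_order(const hook_fn *hooks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int rc = hooks[i]();
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Unconditional exit hook: the program always runs and its verdict
 * replaces whatever was decided before. */
int exit_hook_always(int rc, hook_fn prog)
{
    (void)rc; /* the earlier decision is ignored, i.e. clobbered */
    return prog();
}

/* Exit hook gated on 'if (RC == 0)': the program only runs if nothing
 * has denied the action yet; an earlier denial is passed through. */
int exit_hook_if_allowed(int rc, hook_fn prog)
{
    return rc == 0 ? prog() : rc;
}

/* Walk through the case from the thread: a built-in LSM denies, and
 * the attached program would allow. */
void example(void)
{
    const hook_fn builtin[] = { lsm_allow, lsm_deny };
    int rc = call_hooks_in_order(builtin, 2);

    assert(rc == -EPERM);                                       /* built-in denial */
    assert(exit_hook_always(rc, bpf_prog_allow) == 0);          /* denial clobbered */
    assert(exit_hook_if_allowed(rc, bpf_prog_allow) == -EPERM); /* denial preserved */
    assert(exit_hook_if_allowed(0, bpf_prog_deny) == -EACCES);  /* prog may still deny */
}
```

Calling example() from a test harness exercises all three cases. The
gated variant reproduces the "only called if all the other LSMs allow
the action" behaviour; it says nothing about the visibility or
performance questions discussed above.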