Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp932540ybz; Wed, 22 Apr 2020 10:28:03 -0700 (PDT) X-Google-Smtp-Source: APiQypLkaACMQFIPbUxZcVobR/9W8lUlLoXtdRBZmdkvptu229uNaTFYTnwZpEuhsR1xD/7QOEJP X-Received: by 2002:aa7:c0d2:: with SMTP id j18mr11955844edp.283.1587576482895; Wed, 22 Apr 2020 10:28:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587576482; cv=none; d=google.com; s=arc-20160816; b=R1RrKCIBQaPen7TKmOmlEQ/lMgI75V/jmTaeYomRXN7v1R6DHnLW/bwBJU5GIJ5evZ 4NlK+wb4R7zmxVkUq9qFKT32/1soyykl+78Inwg2oZ22BgeYmwzZNrZASU/J8PW3OeF1 qCvrLqY/1gVQjjul0RE/GxsKXhonUW9KgjtpH+aHSxc3WZFh10ll98WoNaVqk3GyDaTk yXmJIRzGlmWHoHXwxsdtBAu+Jy4oM4HSlzLHssmBBQlEcYfeZhXO0Y070XmV33m4yzDP Dijn8c5tW4ziRj86wfEKNuUVdxcVz7vJmzSyimwudIOO9LNbEqDcXeCkwP7Xv9YcgUx1 8uhQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=APNUHzLUzqMTjpD++gENW9cK47rOtpcZ7nOkDc65bfc=; b=vTAB614gGxkBfM4SoOZt5zR6DlccG5B/uAmcYhg+z60wPP/cNsXvz7JaGMbu9NzJQq 2ABYkWoPYEeMXbvLGh77xXBJdjAgKxNDXcJQzra1KHXJfRBiCnZe3PmHfYO7IW3vdDGB Rv+JkpNEcsixADF20QqQdZk1RO32AyfX5X6qfBH57oroJxeNq3G4zLry1yQ164LRI52w 8zw25FCo6pdXgOq8lPo2sPg1yPRXj5Ewo/g72cw5QWRKaIoLqq/CBKA3UBOpD7wu/GRc gcGyx6Xsym6TZ+HgOTjCAQ0mVIlW1fFU9lcbCQ8fqynScCE6yUFqNDmTYgizZ3wZOYq0 RJ1Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@paul-moore-com.20150623.gappssmtp.com header.s=20150623 header.b=MByQJymp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a88si3822707edf.379.2020.04.22.10.27.39; Wed, 22 Apr 2020 10:28:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@paul-moore-com.20150623.gappssmtp.com header.s=20150623 header.b=MByQJymp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726006AbgDVRYY (ORCPT + 99 others); Wed, 22 Apr 2020 13:24:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44750 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1726057AbgDVRYY (ORCPT ); Wed, 22 Apr 2020 13:24:24 -0400 Received: from mail-ed1-x543.google.com (mail-ed1-x543.google.com [IPv6:2a00:1450:4864:20::543]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C5C25C03C1AA for ; Wed, 22 Apr 2020 10:24:23 -0700 (PDT) Received: by mail-ed1-x543.google.com with SMTP id d16so2102005edq.7 for ; Wed, 22 Apr 2020 10:24:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=paul-moore-com.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=APNUHzLUzqMTjpD++gENW9cK47rOtpcZ7nOkDc65bfc=; b=MByQJympuDJNS+hBn6LYtGOyVLLcMOcuvUgrtqzIoVLJ+mX9g39QScWOin2wZMBJQc YyvxjUFYlyQ7DZEFJc2mPjebsZso3N4MXF159pil7scVlahieg94VT6vw34UeQU6E1uF a63WznrDe8rjEZUHdPw3QAItXutLSrOqHYGeW7aG7IWoHHHbpB9ub2/ed2FHoersmKDz h6CM0ZqOU4IXr0950R3JFr52Z/2cK1U/d+IpJWO3coY9bekURc3HRHnnlxiiSxsLbuat He6h4RTpF4YGJmq08XgrnHlJmBYGdqGkT8b1lvL9C2HQRHT8NiXbaKhdHt79z16VVXQq u/8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=APNUHzLUzqMTjpD++gENW9cK47rOtpcZ7nOkDc65bfc=; b=RNfsve8NBtltcX3sM8qhQwNJO22+qVyxJb+Cy3eZupfiAIVm0g2uIn4tt9tAwk3GuZ wkE8LQxLZagO0JSSI0FrRrsRbNEiZd6KadcewHmjSClzygghfmjuxeQzEuX0qbdJ8xAT srDdz3ne4Xt8hQmvTqbkU+QvKBL0Sx+VwXZO3PiGVK+BMuRI0lJ+sSiljGTGRknBOxZ2 eDlFODZ+X60aWjqePXcb7wrck49/F8Bo7b3J6zUfr6YIcBCZqhF7YM3VcC6oqNlvJaS1 UvkT9+E3AHvng9Nvuvl+1CkdZ9fmDqDz8sKNMvKIc179GKeXpjKTKaiEVL5hhK4lGeDj Ss1A== X-Gm-Message-State: AGi0PuaDpUx3ctlFbtWV2XucoRGdbIUr5x605ps9nETXqoCxzBp1BRHu icu8AGU6g6rQr9IB1EGjIY7oV8myXL1aviP4/poC/KWkMA== X-Received: by 2002:a05:6402:1adc:: with SMTP id ba28mr11965299edb.12.1587576262305; Wed, 22 Apr 2020 10:24:22 -0700 (PDT) MIME-Version: 1.0 References: <20200318215550.es4stkjwnefrfen2@madcap2.tricolour.ca> <20200319220249.jyr6xmwvflya5mks@madcap2.tricolour.ca> <20200324210152.5uydf3zqi3dwshfu@madcap2.tricolour.ca> <20200330134705.jlrkoiqpgjh3rvoh@madcap2.tricolour.ca> <20200330162156.mzh2tsnovngudlx2@madcap2.tricolour.ca> <20200330174937.xalrsiev7q3yxsx2@madcap2.tricolour.ca> <871ronf9x2.fsf@x220.int.ebiederm.org> <871rol7nw3.fsf@x220.int.ebiederm.org> In-Reply-To: <871rol7nw3.fsf@x220.int.ebiederm.org> From: Paul Moore Date: Wed, 22 Apr 2020 13:24:10 -0400 Message-ID: Subject: Re: [PATCH ghak90 V8 07/16] audit: add contid support for signalling the audit daemon To: "Eric W. Biederman" Cc: Richard Guy Briggs , nhorman@tuxdriver.com, linux-api@vger.kernel.org, containers@lists.linux-foundation.org, LKML , dhowells@redhat.com, linux-audit@redhat.com, netfilter-devel@vger.kernel.org, simo@redhat.com, netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org, Eric Paris , mpatel@redhat.com, Serge Hallyn Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Apr 17, 2020 at 6:26 PM Eric W. Biederman wrote: > Paul Moore writes: > > On Thu, Apr 16, 2020 at 4:36 PM Eric W. Biederman wrote: > >> Paul Moore writes: > >> > On Mon, Mar 30, 2020 at 1:49 PM Richard Guy Briggs wrote: > >> >> On 2020-03-30 13:34, Paul Moore wrote: > >> >> > On Mon, Mar 30, 2020 at 12:22 PM Richard Guy Briggs wrote: > >> >> > > On 2020-03-30 10:26, Paul Moore wrote: > >> >> > > > On Mon, Mar 30, 2020 at 9:47 AM Richard Guy Briggs wrote: > >> >> > > > > On 2020-03-28 23:11, Paul Moore wrote: > >> >> > > > > > On Tue, Mar 24, 2020 at 5:02 PM Richard Guy Briggs wrote: > >> >> > > > > > > On 2020-03-23 20:16, Paul Moore wrote: > >> >> > > > > > > > On Thu, Mar 19, 2020 at 6:03 PM Richard Guy Briggs wrote: > >> >> > > > > > > > > On 2020-03-18 18:06, Paul Moore wrote: > >> > > >> > ... > >> > > >> >> > > Well, every time a record gets generated, *any* record gets generated, > >> >> > > we'll need to check for which audit daemons this record is in scope and > >> >> > > generate a different one for each depending on the content and whether > >> >> > > or not the content is influenced by the scope. > >> >> > > >> >> > That's the problem right there - we don't want to have to generate a > >> >> > unique record for *each* auditd on *every* record. That is a recipe > >> >> > for disaster. > >> >> > > >> >> > Solving this for all of the known audit records is not something we > >> >> > need to worry about in depth at the moment (although giving it some > >> >> > casual thought is not a bad thing), but solving this for the audit > >> >> > container ID information *is* something we need to worry about right > >> >> > now. > >> >> > >> >> If you think that a different nested contid value string per daemon is > >> >> not acceptable, then we are back to issuing a record that has only *one* > >> >> contid listed without any nesting information. This brings us back to > >> >> the original problem of keeping *all* audit log history since the boot > >> >> of the machine to be able to track the nesting of any particular contid. > >> > > >> > I'm not ruling anything out, except for the "let's just completely > >> > regenerate every record for each auditd instance". > >> > >> Paul I am a bit confused about what you are referring to when you say > >> regenerate every record. > >> > >> Are you saying that you don't want to repeat the sequence: > >> audit_log_start(...); > >> audit_log_format(...); > >> audit_log_end(...); > >> for every nested audit daemon? > > > > If it can be avoided yes. Audit performance is already not-awesome, > > this would make it even worse. > > As far as I can see not repeating sequences like that is fundamental > for making this work at all. Just because only the audit subsystem > should know about one or multiple audit daemons. Nothing else should > care. Yes, exactly, this has been mentioned in the past. Both the performance hit and the code complication in the caller are things we must avoid. > >> Or are you saying that you would like to literraly want to send the same > >> skb to each of the nested audit daemons? > > > > Ideally we would reuse the generated audit messages as much as > > possible. Less work is better. That's really my main concern here, > > let's make sure we aren't going to totally tank performance when we > > have a bunch of nested audit daemons. > > So I think there are two parts of this answer. Assuming we are talking > about nesting audit daemons in containers we will have different > rulesets and I expect most of the events for a nested audit daemon won't > be of interest to the outer audit daemon. Yes, this is another thing that Richard and I have discussed in the past. We will basically need to create per-daemon queues, rules, tracking state, etc.; that is easy enough. What will be slightly more tricky is the part where we apply the filters to the individual records and decide if that record is valid/desired for a given daemon. I think it can be done without too much pain, and any changes to the callers, but it will require a bit of work to make sure it is done well and that records are needlessly duplicated in the kernel. > Beyond that it should be very straight forward to keep a pointer and > leave the buffer as a scatter gather list until audit_log_end > and translate pids, and rewrite ACIDs attributes in audit_log_end > when we build the final packet. Either through collaboration with > audit_log_format or a special audit_log command that carefully sets > up the handful of things that need that information. In order to maximize record re-use I think we will want to hold off on assembling the final packet until it is sent to the daemons in the kauditd thread. We'll also likely need to create special audit_log_XXX functions to capture fields which we know will need translation, e.g. ACID information. (the reason for the new audit_log_XXX functions would be to mark the new sg element and ensure the buffer is handled correctly) Regardless of the details, I think the scatter gather approach is the key here - that seems like the best design idea I've seen thus far. It enables us to replace portions of the record as needed ... and possibly use the existing skb cow stuff ... it has been a while, but does the skb cow functions handle scatter gather skbs or do they need to be linear? > Hmm. I am seeing that we send skbs to kauditd and then kauditd > sends those skbs to userspace. I presume that is primary so that > sending messages to userspace does not block the process being audited. > Plus a little bit so that the retry logic will work. Long story short, it's a poor design. I'm not sure who came up with it, but I have about a 1000 questions that are variations on "why did this seem like a good idea?". I expect the audit_buffer definition to change significantly during the nested auditd work. > I think the naive implementation would be to simply have 1 kauditd > per auditd (strictly and audit context/namespace). Although that can be > optimized if that is a problem. > > Beyond that I think we would need to look at profiles to really > understand where the bottlenecks are. Agreed. This is a hidden implementation detail that doesn't affect the userspace API or the in-kernel callers. The first approach can be simple and we can complicate it as needed in future versions. > > I'm open to any ideas people may have. We have a problem, let's solve > > it. > > It definitely makes sense to look ahead to having audit daemons running > in containers, but in the grand scheme of things that is a nice to have. > Probably something we will and should get to, but we have lived a long > time without auditd running in containers so I expect we can live a > while longer. It looks like you are confusing my concern. I'm not pushing Richard to implement support for this in the current patchset, I'm pushing Richard to consider the design aspect of having multiple audit daemons so that we don't code ourselves into a corner with the audit record changes he is proposing. The audit record format is part of the kernel/userspace API and as a result requires great care when modifying/extending/etc. -- paul moore www.paul-moore.com