Subject: Re: RFC(v2): Audit Kernel Container IDs
From: Casey Schaufler
Date: Sat, 9 Dec 2017 10:28:08 -0800
To: Mickaël Salaün, Richard Guy Briggs, cgroups@vger.kernel.org,
 Linux Containers, Linux API, Linux Audit, Linux FS Devel, Linux Kernel,
 Linux Network Development
Cc: mszeredi@redhat.com, "Eric W. Biederman", Simo Sorce,
 jlayton@redhat.com, "Carlos O'Donell", David Howells, Al Viro,
 Andy Lutomirski, Eric Paris, trondmy@primarydata.com, Michael Kerrisk
References: <20171012141359.saqdtnodwmbz33b2@madcap2.tricolour.ca>
 <75b7d6a6-42ba-2dff-1836-1091c7c024e7@schaufler-ca.com>
 <7ebca85a-425c-2b95-9a5f-59d81707339e@digikod.net>
In-Reply-To: <7ebca85a-425c-2b95-9a5f-59d81707339e@digikod.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/9/2017 2:20 AM, Mickaël Salaün wrote:
> On 12/10/2017 18:33, Casey Schaufler wrote:
>> On 10/12/2017 7:14 AM, Richard Guy Briggs wrote:
>>> Containers are a userspace concept. The kernel knows nothing of them.
>>>
>>> The Linux audit system needs a way to be able to track the container
>>> provenance of events and actions. Audit needs the kernel's help to do
>>> this.
>>>
>>> Since the concept of a container is entirely a userspace concept, a
>>> registration from the userspace container orchestration system initiates
>>> this. This will define a point in time and a set of resources
>>> associated with a particular container with an audit container ID.
>>>
>>> The registration is a pseudo filesystem (proc, since PID tree already
>>> exists) write of a u8[16] UUID representing the container ID to a file
>>> representing a process that will become the first process in a new
>>> container. This write might place restrictions on mount namespaces
>>> required to define a container, or at least careful checking of
>>> namespaces in the kernel to verify permissions of the orchestrator so it
>>> can't change its own container ID.
>>> A bind mount of nsfs may be necessary in the container
>>> orchestrator's mntNS.
>>> Note: Use a 128-bit scalar rather than a string to make compares faster
>>> and simpler.
>>>
>>> Require a new CAP_CONTAINER_ADMIN to be able to carry out the
>>> registration.
>> Hang on. If containers are a user space concept, how can
>> you want CAP_CONTAINER_ANYTHING? If there's no such thing as
>> a container, how can you be asking for a capability to manage
>> them?
>>
>>> At that time, record the target container's user-supplied
>>> container identifier along with the target container's first process
>>> (which may become the target container's "init" process) process ID
>>> (referenced from the initial PID namespace), all namespace IDs (in the
>>> form of a nsfs device number and inode number tuple) in a new auxiliary
>>> record AUDIT_CONTAINER with a qualifying op=$action field.
> Here is an idea to avoid privilege problems or the need for a new
> capability: make it automatic. What makes a container a container seems
> to be the use of at least a namespace.

You might think so, but I am assured that you can have a container
without using namespaces. Intel's "Clear Containers", which use
virtualization technology, are one example. I have considered creating
"Smack Containers" using mandatory access control technology, more
to press the point that "containers" is a marketing concept, not
technology.

> What about automatically creating
> and assigning an ID to a process when it enters a namespace different from
> one of its parent process's? This delegates the (permission)
> responsibility to the use of namespaces (e.g. /proc/sys/user/max_* limit).

That gets ugly when you have a container that uses user, filesystem,
network and whatever else namespaces. If all containers used the same
set of namespaces I think this would be a fine idea, but they don't.

> One interesting side effect of this approach would be to be able to
> identify which processes are in the same set of namespaces, even if not
> spawned from the container but entered after its creation (i.e. using
> setns), by creating container IDs as a (deterministic) checksum from the
> /proc/self/ns/* IDs.
>
> Since the concern is to identify a container, I think the ability to
> audit the switch from one container ID to another is enough. I don't
> think we need nested IDs.

Because a container doesn't have to use namespaces to be a container
you still need a mechanism for a process to declare that it is in fact
in a container, and to identify the container.

>
> As a side note, you may want to take a look at the Linux-VServer's XID.
>
> Regards,
>  Mickaël
>
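
For illustration only, here is a minimal userspace sketch of the
"deterministic checksum from the /proc/self/ns/* IDs" idea quoted above.
It assumes the namespace IDs are the (device, inode) pairs that stat()
reports for the entries under /proc/<pid>/ns. The function names, the
choice of SHA-256, and the truncation to 16 bytes (to match the u8[16]
form in the proposal) are all assumptions made for this sketch, not
anything proposed in the thread.

```python
import hashlib
import os
import uuid


def read_ns_ids(pid="self"):
    """Return (name, st_dev, st_ino) tuples for each entry in /proc/<pid>/ns."""
    ns_dir = f"/proc/{pid}/ns"
    ids = []
    for name in sorted(os.listdir(ns_dir)):
        # os.stat() follows the symlink, so this is the nsfs device/inode pair.
        st = os.stat(os.path.join(ns_dir, name))
        ids.append((name, st.st_dev, st.st_ino))
    return ids


def container_id_from_ns(ns_ids):
    """Hash a set of namespace tuples into a 128-bit ID (hypothetical scheme)."""
    h = hashlib.sha256()
    # Sort first so the result does not depend on the order of the input.
    for name, dev, ino in sorted(ns_ids):
        h.update(f"{name}:{dev}:{ino}\n".encode())
    # Assumption: keep only 16 bytes so the ID fits a u8[16].
    return uuid.UUID(bytes=h.digest()[:16])


if __name__ == "__main__":
    # Only meaningful on a Linux system with /proc mounted.
    if os.path.isdir("/proc/self/ns"):
        print(container_id_from_ns(read_ns_ids()))
```

Two processes get the same ID only if every one of their namespaces
matches, so a scheme like this says nothing about a container that does
not use namespaces, which is the objection raised in the reply.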