2022-02-03 18:25:11

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 00/27] ima: Namespace IMA with audit support in IMA-ns

The goal of this series of patches is to start with the namespacing of
IMA and support auditing within an IMA namespace (IMA-ns) as the first
step.

In this series the IMA namespace is piggy backing on the user namespace
and therefore an IMA namespace is created when a user namespace is
created, although this is done late when SecurityFS is mounted inside
a user namespace. The advantage of piggy backing on the user namespace
is that the user namespace can provide the keys infrastructure that IMA
appraisal support will need later on.

We chose the goal of supporting auditing within an IMA namespace since it
requires the least changes to IMA. Following this series, auditing within
an IMA namespace can be activated by a user running the following lines
that rely on a statically linked busybox to be installed on the host for
execution within the minimal container environment:

mkdir -p rootfs/{bin,mnt,proc}
cp /sbin/busybox rootfs/bin
cp /sbin/busybox rootfs/bin/busybox2
echo >> rootfs/bin/busybox2
PATH=/bin unshare --user --map-root-user --mount-proc --pid --fork \
--root rootfs busybox sh -c \
"busybox mount -t securityfs /mnt /mnt; \
busybox echo 1 > /mnt/ima/active; \
busybox echo 'audit func=BPRM_CHECK mask=MAY_EXEC' > /mnt/ima/policy; \
busybox2 cat /mnt/ima/policy"

[busybox2 is used to demonstrate 2 audit messages; see below]

Following the audit log on the host the last line cat'ing the IMA policy
inside the namespace would have been audited. Unfortunately the auditing
line is not distinguishable from one stemming from actions on the host.
The hope here is that Richard Brigg's container id support for auditing
would help resolve the problem.

In the above the writing of '1' to the 'active' file is used to activate
the IMA namespace. Future extensions to IMA namespaces will make use of
the configuration stage after the mounting of securityfs and before the
activation to for example choose the measurement log template.

The following lines added to a suitable IMA policy on the host would
cause the execution of the commands inside the container (by uid 1000)
to be measured and audited as well on the host, thus leading to two
auditing messages for the 'busybox2 cat' above and log entries in IMA's
system log.

echo -e "measure func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
"audit func=BPRM_CHECK mask=MAY_EXEC uid=1000\n" \
> /sys/kernel/security/ima/policy

The goal of supporting measurement and auditing by the host, of actions
occurring within IMA namespaces, is that users, particularly root,
should not be able to evade the host's IMA policy just by spawning
new IMA namespaces, running programs there, and discarding the namespaces
again. This is achieved through 'hierarchical processing' of file
accesses that are evaluated against the policy of the namespace where
the action occurred and against all namespaces' and their policies leading
back to the root IMA namespace (init_ima_ns).

The patch series adds support for a virtualized SecurityFS with a few
new API calls that are used by IMA namespacing. Only the data relevant
to the IMA namespace are shown. The files and directories of other
security subsystems (TPM, evm, Tomoyo, safesetid) are not showing
up when secruityfs is mounted inside a user namespace.

Much of the code leading up to the virtualization of SecurityFS deals
with moving IMA's variables from various files into the IMA namespace
structure called 'ima_namespace'. When it comes to determining the
current IMA namespace I took the approach to get the current IMA
namespace (get_current_ns()) on the top level and pass the pointer all
the way down to those functions that now need access to the ima_namespace
to get to their variables. This later on comes in handy once hierarchical
processing is implemented in this series where we walk the list of
namespaces backwards and again need to pass the pointer into functions.

This patch also introduces usage of CAP_MAC_ADMIN to allow access to the
IMA policy via reduced capabilities. We would again later on use this
capability to allow users to set file extended attributes for IMA
appraisal support.

My tree with these patches is here:

git fetch https://github.com/stefanberger/linux-ima-namespaces v5.16+imans.v10.posted

Regards,
Stefan

Links to previous postings:
v1: https://lore.kernel.org/linux-integrity/[email protected]/T/#t
v2: https://lore.kernel.org/linux-integrity/[email protected]/T/#t
v3: https://lore.kernel.org/linux-integrity/[email protected]/T/#t
v4: https://lore.kernel.org/linux-integrity/[email protected]/T/#t
v5: https://lore.kernel.org/linux-integrity/[email protected]/T/#t
v6: https://lore.kernel.org/linux-integrity/[email protected]/T/#t
v7: https://lore.kernel.org/linux-integrity/20211217100659.2iah5prshavjk6v6@wittgenstein/T/#t
v8: https://lore.kernel.org/all/[email protected]/#r
v9: https://lore.kernel.org/linux-integrity/?t=20220131234353

v10:
- Added A-b's; addressed issues from v9
- Added 2 patches to support freeing of iint after namespace deletion
- Added patch to return error code from securityfs functions
- Added patch to limit number of policy rules in IMA-ns to 1024

v9:
- Rearranged order of patch that adds IMA-ns pointer to user_ns to be before
hierarchical processing patch
- Renamed ns_status variables from status to ns_status to avoid clashes
- Added bug fixing patches to top
- Added patch 'Move arch_policy_entry into ima_namespace'
- Added patch 'Move ima_lsm_policy_notifier into ima_namespace'
- Addressed comments to v8
- Added change comments to individual patches
- Formatted code following checkpatch.pl --strict

v8:
- Rearranged patches to support lazy creation of IMA namespaces
- Fixed issue related to re-auditing of a modified file. This required the
introduction of ns_status structure connected to list starting on an iint
- Fixed issue related to display of uid and gid in IMA policy to show uid
and gid values relative to the user namespace
- Handling of error code during hierarchical processing

v7:
- Dropped 2 patches related to key queues; using &init_ima_ns for all calls
from functions related to key queues where calls need ima_namespace
- Moved ima_namespace to security/integrity/ima/ima.h
- Extended API descriptions with ns parameter where needed
- Using init_ima_ns in functions related to appraisal and xattrs
- SecurityFS: Using ima_ns_from_file() to get ns pointer
- Reformatted to 80 columns per line

v6:
- Removed kref and pointer to user_ns in ima_namespace (patch 1)
- Moved only the policy file dentry into ima_namespace; other dentries are on
stack now and can be discarded
- Merged James's patch simplifying securityfs_remove and dropping dget()
- Added patch with Christian's suggestion to tie opened SecurityFS file to
the user/IMA namespace it belongs to
- Passing missing ima_namespace parameter in functions in ima_kexec.c (ppc64)
- Reverted v5's change to patch 4 related to protection of ima_namespace

v5:
- Followed Christian's suggestions on patch 1. Also, reverted increased reference
counter on init_user_ns since ima_ns doesn't take reference to its user_ns.
- No addtional reference is taken on securityfs dentries for user_ns != init_user_ns.
Updated documentation and removed cleanup of dentries on superblock kill.
(patches 12 & 16)
- Moved else branch to earlier patch (patch 11)
- Protect ima_namespace by taking reference on user namespace for delayed work queue.
(patch 4)

v4:
- For consistency moved 'ns = get_current_ns()' to top of functions
- Merge in James's latest SecurityFS patch

v3:
- Further modifications to virtualized SecurityFS following James's posted patch
- Dropping of early teardown for user_namespaces since not needed anymore

v2:
- Folllwed Christian's suggestion to virtualize securitytfs; no more securityfs_ns
- Followed James's advice for late 'population' of securityfs for IMA namespaces
- Squashed 2 patches dealing with capabilities
- Added missing 'depends on USER_NS' to Kconfig
- Added missing 'static' to several functions



Christian Brauner (1):
securityfs: rework dentry creation

Mehmet Kayaalp (2):
integrity/ima: Define ns_status for storing namespaced iint data
ima: Namespace audit status flags

Stefan Berger (24):
ima: Remove ima_policy file before directory
ima: Do not print policy rule with inactive LSM labels
ima: Return error code obtained from securityfs functions
ima: Define ima_namespace struct and start moving variables into it
ima: Move arch_policy_entry into ima_namespace
ima: Move ima_htable into ima_namespace
ima: Move measurement list related variables into ima_namespace
ima: Move some IMA policy and filesystem related variables into
ima_namespace
ima: Move IMA securityfs files into ima_namespace or onto stack
ima: Move ima_lsm_policy_notifier into ima_namespace
ima: Define mac_admin_ns_capable() as a wrapper for ns_capable()
ima: Only accept AUDIT rules for non-init_ima_ns namespaces for now
userns: Add pointer to ima_namespace to user_namespace
ima: Implement hierarchical processing of file accesses
ima: Implement ima_free_policy_rules() for freeing of an ima_namespace
ima: Add functions for creating and freeing of an ima_namespace
integrity: Add optional callback function to integrity_inode_free()
ima: Remove unused iints from the integrity_iint_cache
securityfs: Extend securityfs with namespacing support
ima: Setup securityfs for IMA namespace
ima: Introduce securityfs file to activate an IMA namespace
ima: Show owning user namespace's uid and gid when displaying policy
ima: Limit number of policy rules in non-init_ima_ns
ima: Enable IMA namespaces

include/linux/capability.h | 6 +
include/linux/ima.h | 36 ++
include/linux/integrity.h | 8 +-
include/linux/user_namespace.h | 4 +
init/Kconfig | 13 +
kernel/user.c | 4 +
kernel/user_namespace.c | 2 +
security/inode.c | 81 +++-
security/integrity/iint.c | 26 +-
security/integrity/ima/Makefile | 3 +-
security/integrity/ima/ima.h | 241 ++++++++++--
security/integrity/ima/ima_api.c | 34 +-
security/integrity/ima/ima_appraise.c | 31 +-
security/integrity/ima/ima_asymmetric_keys.c | 8 +-
security/integrity/ima/ima_fs.c | 299 +++++++++++---
security/integrity/ima/ima_init.c | 19 +-
security/integrity/ima/ima_init_ima_ns.c | 65 ++++
security/integrity/ima/ima_kexec.c | 15 +-
security/integrity/ima/ima_main.c | 223 ++++++++---
security/integrity/ima/ima_ns.c | 61 +++
security/integrity/ima/ima_ns_status.c | 385 +++++++++++++++++++
security/integrity/ima/ima_policy.c | 266 ++++++++-----
security/integrity/ima/ima_queue.c | 63 ++-
security/integrity/ima/ima_queue_keys.c | 11 +-
security/integrity/ima/ima_template.c | 5 +-
security/integrity/integrity.h | 39 +-
security/security.c | 2 +-
27 files changed, 1606 insertions(+), 344 deletions(-)
create mode 100644 security/integrity/ima/ima_init_ima_ns.c
create mode 100644 security/integrity/ima/ima_ns.c
create mode 100644 security/integrity/ima/ima_ns_status.c

--
2.31.1


2022-02-03 19:06:33

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 03/27] ima: Return error code obtained from securityfs functions

If an error occurs when creating a securityfs file, return the exact
error code to the caller.

Signed-off-by: Stefan Berger <[email protected]>
---
security/integrity/ima/ima_fs.c | 31 +++++++++++++++++++++++--------
1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 3ad8f7734208..cd1683dad3bf 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -452,47 +452,61 @@ static const struct file_operations ima_measure_policy_ops = {

int __init ima_fs_init(void)
{
+ int ret;
+
ima_dir = securityfs_create_dir("ima", integrity_dir);
if (IS_ERR(ima_dir))
- return -1;
+ return PTR_ERR(ima_dir);

ima_symlink = securityfs_create_symlink("ima", NULL, "integrity/ima",
NULL);
- if (IS_ERR(ima_symlink))
+ if (IS_ERR(ima_symlink)) {
+ ret = PTR_ERR(ima_symlink);
goto out;
+ }

binary_runtime_measurements =
securityfs_create_file("binary_runtime_measurements",
S_IRUSR | S_IRGRP, ima_dir, NULL,
&ima_measurements_ops);
- if (IS_ERR(binary_runtime_measurements))
+ if (IS_ERR(binary_runtime_measurements)) {
+ ret = PTR_ERR(binary_runtime_measurements);
goto out;
+ }

ascii_runtime_measurements =
securityfs_create_file("ascii_runtime_measurements",
S_IRUSR | S_IRGRP, ima_dir, NULL,
&ima_ascii_measurements_ops);
- if (IS_ERR(ascii_runtime_measurements))
+ if (IS_ERR(ascii_runtime_measurements)) {
+ ret = PTR_ERR(ascii_runtime_measurements);
goto out;
+ }

runtime_measurements_count =
securityfs_create_file("runtime_measurements_count",
S_IRUSR | S_IRGRP, ima_dir, NULL,
&ima_measurements_count_ops);
- if (IS_ERR(runtime_measurements_count))
+ if (IS_ERR(runtime_measurements_count)) {
+ ret = PTR_ERR(runtime_measurements_count);
goto out;
+ }

violations =
securityfs_create_file("violations", S_IRUSR | S_IRGRP,
ima_dir, NULL, &ima_htable_violations_ops);
- if (IS_ERR(violations))
+ if (IS_ERR(violations)) {
+ ret = PTR_ERR(violations);
goto out;
+ }

ima_policy = securityfs_create_file("policy", POLICY_FILE_FLAGS,
ima_dir, NULL,
&ima_measure_policy_ops);
- if (IS_ERR(ima_policy))
+ if (IS_ERR(ima_policy)) {
+ ret = PTR_ERR(ima_policy);
goto out;
+ }

return 0;
out:
@@ -503,5 +517,6 @@ int __init ima_fs_init(void)
securityfs_remove(binary_runtime_measurements);
securityfs_remove(ima_symlink);
securityfs_remove(ima_dir);
- return -1;
+
+ return ret;
}
--
2.31.1

2022-02-04 14:39:05

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 14/27] userns: Add pointer to ima_namespace to user_namespace

Add a pointer to ima_namespace to the user_namespace and initialize
the init_user_ns with a pointer to init_ima_ns.

Signed-off-by: Stefan Berger <[email protected]>

---
v9:
- Deferred implementation of ima_ns_from_user_ns() to later patch
---
include/linux/ima.h | 2 ++
include/linux/user_namespace.h | 4 ++++
kernel/user.c | 4 ++++
3 files changed, 10 insertions(+)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index b6ab66a546ae..0cf2a80c8b35 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -14,6 +14,8 @@
#include <crypto/hash_info.h>
struct linux_binprm;

+extern struct ima_namespace init_ima_ns;
+
#ifdef CONFIG_IMA
extern enum hash_algo ima_get_current_hash_algo(void);
extern int ima_bprm_check(struct linux_binprm *bprm);
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index 33a4240e6a6f..019e8cf7b633 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -36,6 +36,7 @@ struct uid_gid_map { /* 64 bytes -- 1 cache line */
#define USERNS_INIT_FLAGS USERNS_SETGROUPS_ALLOWED

struct ucounts;
+struct ima_namespace;

enum ucount_type {
UCOUNT_USER_NAMESPACES,
@@ -99,6 +100,9 @@ struct user_namespace {
#endif
struct ucounts *ucounts;
long ucount_max[UCOUNT_COUNTS];
+#ifdef CONFIG_IMA_NS
+ struct ima_namespace *ima_ns;
+#endif
} __randomize_layout;

struct ucounts {
diff --git a/kernel/user.c b/kernel/user.c
index e2cf8c22b539..e5d1f4b9b8ba 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -19,6 +19,7 @@
#include <linux/export.h>
#include <linux/user_namespace.h>
#include <linux/proc_ns.h>
+#include <linux/ima.h>

/*
* userns count is 1 for root user, 1 for init_uts_ns,
@@ -67,6 +68,9 @@ struct user_namespace init_user_ns = {
.keyring_name_list = LIST_HEAD_INIT(init_user_ns.keyring_name_list),
.keyring_sem = __RWSEM_INITIALIZER(init_user_ns.keyring_sem),
#endif
+#ifdef CONFIG_IMA_NS
+ .ima_ns = &init_ima_ns,
+#endif
};
EXPORT_SYMBOL_GPL(init_user_ns);

--
2.31.1

2022-02-04 17:08:21

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 22/27] securityfs: Extend securityfs with namespacing support

Enable multiple instances of securityfs by keying each instance with a
pointer to the user namespace it belongs to.

Since we do not need the pinning of the filesystem for the virtualization
case, limit the usage of simple_pin_fs() and simpe_release_fs() to the
case when the init_user_ns is active. This simplifies the cleanup for the
virtualization case where usage of securityfs_remove() to free dentries
is therefore not needed anymore.

For the initial securityfs, i.e. the one mounted in the host userns mount,
nothing changes. The rules for securityfs_remove() are as before and it is
still paired with securityfs_create(). Specifically, a file created via
securityfs_create_dentry() in the initial securityfs mount still needs to
be removed by a call to securityfs_remove(). Creating a new dentry in the
initial securityfs mount still pins the filesystem like it always did.
Consequently, the initial securityfs mount is not destroyed on
umount/shutdown as long as at least one user of it still has dentries that
it hasn't removed with a call to securityfs_remove().

Prevent mounting of an instance of securityfs in another user namespace
than it belongs to. Also, prevent accesses to files and directories by
a user namespace that is neither the user namespace it belongs to
nor an ancestor of the user namespace that the instance of securityfs
belongs to. Do not prevent access if securityfs was bind-mounted and
therefore the init_user_ns is the owning user namespace.

Suggested-by: Christian Brauner <[email protected]>
Signed-off-by: Stefan Berger <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
---
security/inode.c | 72 ++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 61 insertions(+), 11 deletions(-)

diff --git a/security/inode.c b/security/inode.c
index 13e6780c4444..e525ba960063 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -21,9 +21,37 @@
#include <linux/security.h>
#include <linux/lsm_hooks.h>
#include <linux/magic.h>
+#include <linux/user_namespace.h>

-static struct vfsmount *mount;
-static int mount_count;
+static struct vfsmount *init_securityfs_mount;
+static int init_securityfs_mount_count;
+
+static int securityfs_permission(struct user_namespace *mnt_userns,
+ struct inode *inode, int mask)
+{
+ int err;
+
+ err = generic_permission(&init_user_ns, inode, mask);
+ if (!err) {
+ /* Unless bind-mounted, deny access if current_user_ns() is not
+ * ancestor.
+ */
+ if (inode->i_sb->s_user_ns != &init_user_ns &&
+ !in_userns(current_user_ns(), inode->i_sb->s_user_ns))
+ err = -EACCES;
+ }
+
+ return err;
+}
+
+static const struct inode_operations securityfs_dir_inode_operations = {
+ .permission = securityfs_permission,
+ .lookup = simple_lookup,
+};
+
+static const struct inode_operations securityfs_file_inode_operations = {
+ .permission = securityfs_permission,
+};

static void securityfs_free_inode(struct inode *inode)
{
@@ -40,20 +68,25 @@ static const struct super_operations securityfs_super_operations = {
static int securityfs_fill_super(struct super_block *sb, struct fs_context *fc)
{
static const struct tree_descr files[] = {{""}};
+ struct user_namespace *ns = fc->user_ns;
int error;

+ if (WARN_ON(ns != current_user_ns()))
+ return -EINVAL;
+
error = simple_fill_super(sb, SECURITYFS_MAGIC, files);
if (error)
return error;

sb->s_op = &securityfs_super_operations;
+ sb->s_root->d_inode->i_op = &securityfs_dir_inode_operations;

return 0;
}

static int securityfs_get_tree(struct fs_context *fc)
{
- return get_tree_single(fc, securityfs_fill_super);
+ return get_tree_keyed(fc, securityfs_fill_super, fc->user_ns);
}

static const struct fs_context_operations securityfs_context_ops = {
@@ -71,6 +104,7 @@ static struct file_system_type fs_type = {
.name = "securityfs",
.init_fs_context = securityfs_init_fs_context,
.kill_sb = kill_litter_super,
+ .fs_flags = FS_USERNS_MOUNT,
};

/**
@@ -109,6 +143,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
const struct file_operations *fops,
const struct inode_operations *iops)
{
+ struct user_namespace *ns = current_user_ns();
struct dentry *dentry;
struct inode *dir, *inode;
int error;
@@ -118,12 +153,19 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,

pr_debug("securityfs: creating file '%s'\n",name);

- error = simple_pin_fs(&fs_type, &mount, &mount_count);
- if (error)
- return ERR_PTR(error);
+ if (ns == &init_user_ns) {
+ error = simple_pin_fs(&fs_type, &init_securityfs_mount,
+ &init_securityfs_mount_count);
+ if (error)
+ return ERR_PTR(error);
+ }

- if (!parent)
- parent = mount->mnt_root;
+ if (!parent) {
+ if (ns == &init_user_ns)
+ parent = init_securityfs_mount->mnt_root;
+ else
+ return ERR_PTR(-EINVAL);
+ }

dir = d_inode(parent);

@@ -148,7 +190,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
inode->i_private = data;
if (S_ISDIR(mode)) {
- inode->i_op = &simple_dir_inode_operations;
+ inode->i_op = &securityfs_dir_inode_operations;
inode->i_fop = &simple_dir_operations;
inc_nlink(inode);
inc_nlink(dir);
@@ -156,6 +198,7 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
inode->i_op = iops ? iops : &simple_symlink_inode_operations;
inode->i_link = data;
} else {
+ inode->i_op = &securityfs_file_inode_operations;
inode->i_fop = fops;
}
d_instantiate(dentry, inode);
@@ -167,7 +210,9 @@ static struct dentry *securityfs_create_dentry(const char *name, umode_t mode,
dentry = ERR_PTR(error);
out:
inode_unlock(dir);
- simple_release_fs(&mount, &mount_count);
+ if (ns == &init_user_ns)
+ simple_release_fs(&init_securityfs_mount,
+ &init_securityfs_mount_count);
return dentry;
}

@@ -293,11 +338,14 @@ EXPORT_SYMBOL_GPL(securityfs_create_symlink);
*/
void securityfs_remove(struct dentry *dentry)
{
+ struct user_namespace *ns;
struct inode *dir;

if (!dentry || IS_ERR(dentry))
return;

+ ns = dentry->d_sb->s_user_ns;
+
dir = d_inode(dentry->d_parent);
inode_lock(dir);
if (simple_positive(dentry)) {
@@ -310,7 +358,9 @@ void securityfs_remove(struct dentry *dentry)
dput(dentry);
}
inode_unlock(dir);
- simple_release_fs(&mount, &mount_count);
+ if (ns == &init_user_ns)
+ simple_release_fs(&init_securityfs_mount,
+ &init_securityfs_mount_count);
}
EXPORT_SYMBOL_GPL(securityfs_remove);

--
2.31.1

2022-02-04 17:44:46

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 24/27] ima: Introduce securityfs file to activate an IMA namespace

Introduce securityfs file 'active' that allows a user to activate an IMA
namespace by writing a "1" (precisely a '1\0' or '1\n') to it. When
reading from the file, it shows either '0' or '1'.

Also, introduce ns_is_active() to be used in those places where the
ima_namespace pointer may either be NULL or where the active field may not
have been set to '1', yet. An inactive IMA namespace should never be
accessed since it has not been initialized, yet.

Set the init_ima_ns's active field to '1' since it is considered active
right from the beginning.

The rationale for introducing a file to activate an IMA namespace is that
subsequent support for IMA-measurement and IMA-appraisal will add
configuration files for selecting for example the template that an IMA
namespace is supposed to use for logging measurements, which will add
an IMA namespace configuration stage (using securityfs files) before its
activation.

Signed-off-by: Stefan Berger <[email protected]>

---
v10:
- use memdup_user_nul to copy input from user
- propagating error returned from ima_fs_add_ns_files()
---
security/integrity/ima/ima.h | 7 +++
security/integrity/ima/ima_fs.c | 71 ++++++++++++++++++++++++
security/integrity/ima/ima_init_ima_ns.c | 1 +
security/integrity/ima/ima_main.c | 2 +-
4 files changed, 80 insertions(+), 1 deletion(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 1e3f9dda218d..05e2de7697da 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -123,6 +123,8 @@ struct ima_h_table {
};

struct ima_namespace {
+ atomic_t active; /* whether namespace is active */
+
struct rb_root ns_status_tree;
rwlock_t ns_tree_lock;
struct kmem_cache *ns_status_cache;
@@ -154,6 +156,11 @@ struct ima_namespace {
} __randomize_layout;
extern struct ima_namespace init_ima_ns;

+static inline bool ns_is_active(struct ima_namespace *ns)
+{
+ return (ns && atomic_read(&ns->active));
+}
+
extern const int read_idmap[];

#ifdef CONFIG_HAVE_IMA_KEXEC
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index 84cd02a9e19b..58d80884880f 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -451,6 +451,71 @@ static const struct file_operations ima_measure_policy_ops = {
.llseek = generic_file_llseek,
};

+static ssize_t ima_show_active(struct file *filp,
+ char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct ima_namespace *ns = &init_ima_ns;
+ char tmpbuf[2];
+ ssize_t len;
+
+ len = scnprintf(tmpbuf, sizeof(tmpbuf),
+ "%d\n", atomic_read(&ns->active));
+ return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
+}
+
+static ssize_t ima_write_active(struct file *filp,
+ const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct ima_namespace *ns = &init_ima_ns;
+ unsigned int active;
+ char *kbuf;
+ int err;
+
+ if (ns_is_active(ns))
+ return -EBUSY;
+
+ /* accepting '1\n' and '1\0' and no partial writes */
+ if (count >= 3 || *ppos != 0)
+ return -EINVAL;
+
+ kbuf = memdup_user_nul(buf, count);
+ if (IS_ERR(kbuf))
+ return PTR_ERR(kbuf);
+
+ err = kstrtouint(kbuf, 10, &active);
+ kfree(kbuf);
+ if (err)
+ return err;
+
+ if (active != 1)
+ return -EINVAL;
+
+ atomic_set(&ns->active, 1);
+
+ return count;
+}
+
+static const struct file_operations ima_active_ops = {
+ .read = ima_show_active,
+ .write = ima_write_active,
+};
+
+static int ima_fs_add_ns_files(struct dentry *ima_dir)
+{
+ struct dentry *active;
+
+ active =
+ securityfs_create_file("active",
+ S_IRUSR | S_IWUSR | S_IRGRP, ima_dir, NULL,
+ &ima_active_ops);
+ if (IS_ERR(active))
+ return PTR_ERR(active);
+
+ return 0;
+}
+
int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
{
struct ima_namespace *ns = ima_ns_from_user_ns(user_ns);
@@ -531,6 +596,12 @@ int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
}
}

+ if (ns != &init_ima_ns) {
+ ret = ima_fs_add_ns_files(ima_dir);
+ if (ret)
+ goto out;
+ }
+
return 0;
out:
securityfs_remove(ns->ima_policy);
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index d4ddfd1de60b..39ee0c2477a6 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -58,5 +58,6 @@ struct ima_namespace init_ima_ns = {
.ima_lsm_policy_notifier = {
.notifier_call = ima_lsm_policy_change,
},
+ .active = ATOMIC_INIT(1),
};
EXPORT_SYMBOL(init_ima_ns);
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 1dee8cb5afa2..9ca9223bbcf6 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -441,7 +441,7 @@ static int process_measurement(struct user_namespace *user_ns,

while (user_ns) {
ns = ima_ns_from_user_ns(user_ns);
- if (ns) {
+ if (ns_is_active(ns)) {
int rc;

rc = __process_measurement(ns, file, cred, secid, buf,
--
2.31.1

2022-02-04 23:53:55

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 08/27] ima: Move measurement list related variables into ima_namespace

Move measurement list related variables into the ima_namespace. This way
a front-end like securityfs can show the measurement list inside an IMA
namespace.

Signed-off-by: Stefan Berger <[email protected]>
---
security/integrity/ima/ima.h | 5 +++--
security/integrity/ima/ima_fs.c | 6 ++++--
security/integrity/ima/ima_init_ima_ns.c | 5 +++++
security/integrity/ima/ima_kexec.c | 12 ++++++-----
security/integrity/ima/ima_queue.c | 27 +++++++++++-------------
5 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 340a59174670..45706836a77b 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -106,7 +106,6 @@ struct ima_queue_entry {
struct list_head later; /* place in ima_measurements list */
struct ima_template_entry *entry;
};
-extern struct list_head ima_measurements; /* list of all measurements */

/* Some details preceding the binary serialized measurement list */
struct ima_kexec_hdr {
@@ -136,6 +135,8 @@ struct ima_namespace {
struct ima_rule_entry *arch_policy_entry;

struct ima_h_table ima_htable;
+ struct list_head ima_measurements; /* list of all measurements */
+ unsigned long binary_runtime_size; /* used by init_ima_ns */
} __randomize_layout;
extern struct ima_namespace init_ima_ns;

@@ -186,7 +187,7 @@ int ima_restore_measurement_entry(struct ima_namespace *ns,
int ima_restore_measurement_list(struct ima_namespace *ns,
loff_t bufsize, void *buf);
int ima_measurements_show(struct seq_file *m, void *v);
-unsigned long ima_get_binary_runtime_size(void);
+unsigned long ima_get_binary_runtime_size(struct ima_namespace *ns);
int ima_init_template(void);
void ima_init_template_list(void);
int __init ima_init_digests(void);
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index dca7fe32d65e..5ef0e2b2cf64 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -80,12 +80,13 @@ static const struct file_operations ima_measurements_count_ops = {
/* returns pointer to hlist_node */
static void *ima_measurements_start(struct seq_file *m, loff_t *pos)
{
+ struct ima_namespace *ns = &init_ima_ns;
loff_t l = *pos;
struct ima_queue_entry *qe;

/* we need a lock since pos could point beyond last element */
rcu_read_lock();
- list_for_each_entry_rcu(qe, &ima_measurements, later) {
+ list_for_each_entry_rcu(qe, &ns->ima_measurements, later) {
if (!l--) {
rcu_read_unlock();
return qe;
@@ -97,6 +98,7 @@ static void *ima_measurements_start(struct seq_file *m, loff_t *pos)

static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
{
+ struct ima_namespace *ns = &init_ima_ns;
struct ima_queue_entry *qe = v;

/* lock protects when reading beyond last element
@@ -107,7 +109,7 @@ static void *ima_measurements_next(struct seq_file *m, void *v, loff_t *pos)
rcu_read_unlock();
(*pos)++;

- return (&qe->later == &ima_measurements) ? NULL : qe;
+ return (&qe->later == &ns->ima_measurements) ? NULL : qe;
}

static void ima_measurements_stop(struct seq_file *m, void *v)
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 1945fa8cfc4d..a7477072c587 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -20,6 +20,11 @@ static int ima_init_namespace(struct ima_namespace *ns)
atomic_long_set(&ns->ima_htable.len, 0);
atomic_long_set(&ns->ima_htable.violations, 0);
memset(&ns->ima_htable.queue, 0, sizeof(ns->ima_htable.queue));
+ INIT_LIST_HEAD(&ns->ima_measurements);
+ if (IS_ENABLED(CONFIG_IMA_KEXEC) && ns == &init_ima_ns)
+ ns->binary_runtime_size = 0;
+ else
+ ns->binary_runtime_size = ULONG_MAX;

return 0;
}
diff --git a/security/integrity/ima/ima_kexec.c b/security/integrity/ima/ima_kexec.c
index f3ef8a0df992..e83b18492f46 100644
--- a/security/integrity/ima/ima_kexec.c
+++ b/security/integrity/ima/ima_kexec.c
@@ -15,7 +15,8 @@
#include "ima.h"

#ifdef CONFIG_IMA_KEXEC
-static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,
+static int ima_dump_measurement_list(struct ima_namespace *ns,
+ unsigned long *buffer_size, void **buffer,
unsigned long segment_size)
{
struct ima_queue_entry *qe;
@@ -36,7 +37,7 @@ static int ima_dump_measurement_list(unsigned long *buffer_size, void **buffer,

memset(&khdr, 0, sizeof(khdr));
khdr.version = 1;
- list_for_each_entry_rcu(qe, &ima_measurements, later) {
+ list_for_each_entry_rcu(qe, &ns->ima_measurements, later) {
if (file.count < file.size) {
khdr.count++;
ima_measurements_show(&file, qe);
@@ -84,6 +85,7 @@ void ima_add_kexec_buffer(struct kimage *image)
struct kexec_buf kbuf = { .image = image, .buf_align = PAGE_SIZE,
.buf_min = 0, .buf_max = ULONG_MAX,
.top_down = true };
+ struct ima_namespace *ns = &init_ima_ns;
unsigned long binary_runtime_size;

/* use more understandable variable names than defined in kbuf */
@@ -96,11 +98,11 @@ void ima_add_kexec_buffer(struct kimage *image)
* Reserve an extra half page of memory for additional measurements
* added during the kexec load.
*/
- binary_runtime_size = ima_get_binary_runtime_size();
+ binary_runtime_size = ima_get_binary_runtime_size(ns);
if (binary_runtime_size >= ULONG_MAX - PAGE_SIZE)
kexec_segment_size = ULONG_MAX;
else
- kexec_segment_size = ALIGN(ima_get_binary_runtime_size() +
+ kexec_segment_size = ALIGN(ima_get_binary_runtime_size(ns) +
PAGE_SIZE / 2, PAGE_SIZE);
if ((kexec_segment_size == ULONG_MAX) ||
((kexec_segment_size >> PAGE_SHIFT) > totalram_pages() / 2)) {
@@ -108,7 +110,7 @@ void ima_add_kexec_buffer(struct kimage *image)
return;
}

- ima_dump_measurement_list(&kexec_buffer_size, &kexec_buffer,
+ ima_dump_measurement_list(ns, &kexec_buffer_size, &kexec_buffer,
kexec_segment_size);
if (!kexec_buffer) {
pr_err("Not enough memory for the kexec measurement buffer.\n");
diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
index 43961d5cd2ef..0355c2b0932c 100644
--- a/security/integrity/ima/ima_queue.c
+++ b/security/integrity/ima/ima_queue.c
@@ -24,13 +24,6 @@
/* pre-allocated array of tpm_digest structures to extend a PCR */
static struct tpm_digest *digests;

-LIST_HEAD(ima_measurements); /* list of all measurements */
-#ifdef CONFIG_IMA_KEXEC
-static unsigned long binary_runtime_size;
-#else
-static unsigned long binary_runtime_size = ULONG_MAX;
-#endif
-
/* mutex protects atomicity of extending measurement list
* and extending the TPM PCR aggregate. Since tpm_extend can take
* long (and the tpm driver uses a mutex), we can't use the spinlock.
@@ -100,7 +93,7 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
qe->entry = entry;

INIT_LIST_HEAD(&qe->later);
- list_add_tail_rcu(&qe->later, &ima_measurements);
+ list_add_tail_rcu(&qe->later, &ns->ima_measurements);

atomic_long_inc(&ns->ima_htable.len);
if (update_htable) {
@@ -110,12 +103,14 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
INIT_HLIST_NODE(&qe->hnext);
}

- if (binary_runtime_size != ULONG_MAX) {
+ if (ns->binary_runtime_size != ULONG_MAX) {
int size;

size = get_binary_runtime_size(entry);
- binary_runtime_size = (binary_runtime_size < ULONG_MAX - size) ?
- binary_runtime_size + size : ULONG_MAX;
+ ns->binary_runtime_size =
+ (ns->binary_runtime_size < ULONG_MAX - size)
+ ? ns->binary_runtime_size + size
+ : ULONG_MAX;
}
return 0;
}
@@ -123,14 +118,16 @@ static int ima_add_digest_entry(struct ima_namespace *ns,
/*
* Return the amount of memory required for serializing the
* entire binary_runtime_measurement list, including the ima_kexec_hdr
- * structure.
+ * structure. Carrying the measurement list across kexec is limited
+ * to init_ima_ns.
*/
-unsigned long ima_get_binary_runtime_size(void)
+unsigned long ima_get_binary_runtime_size(struct ima_namespace *ns)
{
- if (binary_runtime_size >= (ULONG_MAX - sizeof(struct ima_kexec_hdr)))
+ if (ns->binary_runtime_size >=
+ (ULONG_MAX - sizeof(struct ima_kexec_hdr)))
return ULONG_MAX;
else
- return binary_runtime_size + sizeof(struct ima_kexec_hdr);
+ return ns->binary_runtime_size + sizeof(struct ima_kexec_hdr);
}

static int ima_pcr_extend(struct tpm_digest *digests_arg, int pcr)
--
2.31.1

2022-02-05 02:26:39

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 23/27] ima: Setup securityfs for IMA namespace

Setup securityfs with symlinks, directories, and files for IMA
namespacing support. The same directory structure that IMA uses on the
host is also created for the namespacing case.

The securityfs file and directory ownerships cannot be set when the
IMA namespace is initialized. Therefore, delay the setup of the file
system to a later point when securityfs is in securityfs_fill_super.

Introduce a variable ima_policy_removed in ima_namespace that is used to
remember whether the policy file has previously been removed and thus
should not be created again in case of unmounting and again mounting
securityfs inside an IMA namespace.

This filesystem can now be mounted as follows:

mount -t securityfs /sys/kernel/security/ /sys/kernel/security/

The following directories, symlinks, and files are available
when IMA namespacing is enabled, otherwise it will be empty:

$ ls -l sys/kernel/security/
total 0
lr--r--r--. 1 root root 0 Dec 2 00:18 ima -> integrity/ima
drwxr-xr-x. 3 root root 0 Dec 2 00:18 integrity

$ ls -l sys/kernel/security/ima/
total 0
-r--r-----. 1 root root 0 Dec 2 00:18 ascii_runtime_measurements
-r--r-----. 1 root root 0 Dec 2 00:18 binary_runtime_measurements
-rw-------. 1 root root 0 Dec 2 00:18 policy
-r--r-----. 1 root root 0 Dec 2 00:18 runtime_measurements_count
-r--r-----. 1 root root 0 Dec 2 00:18 violations

Signed-off-by: Stefan Berger <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
Acked-by: Christian Brauner <[email protected]>

---

v9:
- rename policy_dentry_removed to ima_policy_removed
---
include/linux/ima.h | 13 ++++++++++
security/inode.c | 6 ++++-
security/integrity/ima/ima.h | 1 +
security/integrity/ima/ima_fs.c | 46 +++++++++++++++++++++++----------
4 files changed, 52 insertions(+), 14 deletions(-)

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 2fe32f1bf84b..c584527c0f47 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -41,6 +41,7 @@ extern int ima_measure_critical_data(const char *event_label,
const char *event_name,
const void *buf, size_t buf_len,
bool hash, u8 *digest, size_t digest_len);
+extern int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root);

#ifdef CONFIG_IMA_APPRAISE_BOOTPARAM
extern void ima_appraise_parse_cmdline(void);
@@ -227,6 +228,12 @@ void free_ima_ns(struct user_namespace *ns);

void ima_free_ns_status_list(struct list_head *head, rwlock_t *ns_list_lock);

+static inline int ima_securityfs_init(struct user_namespace *user_ns,
+ struct dentry *root)
+{
+ return ima_fs_ns_init(user_ns, root);
+}
+
#else

static inline void free_ima_ns(struct user_namespace *user_ns)
@@ -238,6 +245,12 @@ static inline void ima_free_ns_status_list(struct list_head *head,
{
}

+static inline int ima_securityfs_init(struct user_namespace *ns,
+ struct dentry *root)
+{
+ return 0;
+}
+
#endif /* CONFIG_IMA_NS */

#endif /* _LINUX_IMA_H */
diff --git a/security/inode.c b/security/inode.c
index e525ba960063..cdb08520151c 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -16,6 +16,7 @@
#include <linux/fs_context.h>
#include <linux/mount.h>
#include <linux/pagemap.h>
+#include <linux/ima.h>
#include <linux/init.h>
#include <linux/namei.h>
#include <linux/security.h>
@@ -81,7 +82,10 @@ static int securityfs_fill_super(struct super_block *sb, struct fs_context *fc)
sb->s_op = &securityfs_super_operations;
sb->s_root->d_inode->i_op = &securityfs_dir_inode_operations;

- return 0;
+ if (ns != &init_user_ns)
+ error = ima_securityfs_init(ns, sb->s_root);
+
+ return error;
}

static int securityfs_get_tree(struct fs_context *fc)
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index ec0b81c7dbf5..1e3f9dda218d 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -148,6 +148,7 @@ struct ima_namespace {
int valid_policy;

struct dentry *ima_policy;
+ bool ima_policy_removed;

struct notifier_block ima_lsm_policy_notifier;
} __randomize_layout;
diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
index c41aa61b7393..84cd02a9e19b 100644
--- a/security/integrity/ima/ima_fs.c
+++ b/security/integrity/ima/ima_fs.c
@@ -21,6 +21,7 @@
#include <linux/rcupdate.h>
#include <linux/parser.h>
#include <linux/vmalloc.h>
+#include <linux/ima.h>

#include "ima.h"

@@ -433,6 +434,7 @@ static int ima_release_policy(struct inode *inode, struct file *file)
#if !defined(CONFIG_IMA_WRITE_POLICY) && !defined(CONFIG_IMA_READ_POLICY)
securityfs_remove(ns->ima_policy);
ns->ima_policy = NULL;
+ ns->ima_policy_removed = true;
#elif defined(CONFIG_IMA_WRITE_POLICY)
clear_bit(IMA_FS_BUSY, &ns->ima_fs_flags);
#elif defined(CONFIG_IMA_READ_POLICY)
@@ -449,9 +451,11 @@ static const struct file_operations ima_measure_policy_ops = {
.llseek = generic_file_llseek,
};

-static int __init ima_fs_ns_init(struct ima_namespace *ns)
+int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
{
- struct dentry *ima_dir;
+ struct ima_namespace *ns = ima_ns_from_user_ns(user_ns);
+ struct dentry *int_dir;
+ struct dentry *ima_dir = NULL;
struct dentry *ima_symlink = NULL;
struct dentry *binary_runtime_measurements = NULL;
struct dentry *ascii_runtime_measurements = NULL;
@@ -459,11 +463,22 @@ static int __init ima_fs_ns_init(struct ima_namespace *ns)
struct dentry *violations = NULL;
int ret;

- ima_dir = securityfs_create_dir("ima", integrity_dir);
- if (IS_ERR(ima_dir))
- return PTR_ERR(ima_dir);
+ /* FIXME: update when evm and integrity are namespaced */
+ if (user_ns != &init_user_ns) {
+ int_dir = securityfs_create_dir("integrity", root);
+ if (IS_ERR(int_dir))
+ return PTR_ERR(int_dir);
+ } else {
+ int_dir = integrity_dir;
+ }

- ima_symlink = securityfs_create_symlink("ima", NULL, "integrity/ima",
+ ima_dir = securityfs_create_dir("ima", int_dir);
+ if (IS_ERR(ima_dir)) {
+ ret = PTR_ERR(ima_dir);
+ goto out;
+ }
+
+ ima_symlink = securityfs_create_symlink("ima", root, "integrity/ima",
NULL);
if (IS_ERR(ima_symlink)) {
ret = PTR_ERR(ima_symlink);
@@ -505,12 +520,15 @@ static int __init ima_fs_ns_init(struct ima_namespace *ns)
goto out;
}

- ns->ima_policy = securityfs_create_file("policy", POLICY_FILE_FLAGS,
- ima_dir, NULL,
- &ima_measure_policy_ops);
- if (IS_ERR(ns->ima_policy)) {
- ret = PTR_ERR(ns->ima_policy);
- goto out;
+ if (!ns->ima_policy_removed) {
+ ns->ima_policy =
+ securityfs_create_file("policy", POLICY_FILE_FLAGS,
+ ima_dir, NULL,
+ &ima_measure_policy_ops);
+ if (IS_ERR(ns->ima_policy)) {
+ ret = PTR_ERR(ns->ima_policy);
+ goto out;
+ }
}

return 0;
@@ -522,11 +540,13 @@ static int __init ima_fs_ns_init(struct ima_namespace *ns)
securityfs_remove(binary_runtime_measurements);
securityfs_remove(ima_symlink);
securityfs_remove(ima_dir);
+ if (user_ns != &init_user_ns)
+ securityfs_remove(int_dir);

return ret;
}

int __init ima_fs_init(void)
{
- return ima_fs_ns_init(&init_ima_ns);
+ return ima_fs_ns_init(&init_user_ns, NULL);
}
--
2.31.1

2022-02-05 08:22:45

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 18/27] integrity/ima: Define ns_status for storing namespaced iint data

From: Mehmet Kayaalp <[email protected]>

Add an rbtree to the IMA namespace structure that stores a namespaced
version of iint->flags in ns_status struct. Similar to the
integrity_iint_cache, both the iint and ns_status are looked up using the
inode pointer value. The lookup, allocate, and insertion code is also
similar.

In subsequent patches we will have to find all ns_status entries an iint
is being used in and reset flags there. To do this, connect a list of
ns_status to the integrity_iint_cache and provide a reader-writer
lock in the integrity_iint_cache to lock access to the list.

To simplify the code in the non-namespaces case embed an ns_status in
the integrity_iint_cache and have it linked into the iint's ns_status list
when calling ima_get_ns_status().

When getting an ns_status first try to find it in the RB tree. Here we can
run into the situation that an ns_status found in the RB tree has a
different iint associated with it for the same inode. In this case we need
to delete the ns_status structure and get a new one.

There are two cases for freeing:
- when the iint is freed (inode deletion): Walk the list of ns_status
entries and disconnect each ns_status from the list; take the
writer lock to protect access to the list; also, take the item off the
per-namespace rbtree

- when the ima_namepace is freed: While walking the rbtree, remove the
ns_status from the list while also holding the iint's writer lock;
to be able to grab the lock we have to have a pointer to the iint on
the ns_status structure.

To avoid an ns_status to be freed by the two cases concurrently, prevent
these two cases to run concurrently. Therefore, groups of threads
deleting either inodes or ima_namespaces are allowed to run concurrently
but no two threads may run and one delete an inode and the other an
ima_namespace.

Signed-off-by: Mehmet Kayaalp <[email protected]>
Signed-off-by: Stefan Berger <[email protected]>

---
v9:
- Move ns_status into integrity.h and embedded it into integrity_iint_cache
for the non-CONFIG_IMA_NS case
---
include/linux/ima.h | 7 +
security/integrity/iint.c | 7 +
security/integrity/ima/Makefile | 2 +-
security/integrity/ima/ima.h | 17 ++
security/integrity/ima/ima_init_ima_ns.c | 5 +
security/integrity/ima/ima_ns.c | 1 +
security/integrity/ima/ima_ns_status.c | 344 +++++++++++++++++++++++
security/integrity/integrity.h | 35 ++-
8 files changed, 416 insertions(+), 2 deletions(-)
create mode 100644 security/integrity/ima/ima_ns_status.c

diff --git a/include/linux/ima.h b/include/linux/ima.h
index 964a08702573..2fe32f1bf84b 100644
--- a/include/linux/ima.h
+++ b/include/linux/ima.h
@@ -225,12 +225,19 @@ static inline bool ima_appraise_signature(enum kernel_read_file_id func)

void free_ima_ns(struct user_namespace *ns);

+void ima_free_ns_status_list(struct list_head *head, rwlock_t *ns_list_lock);
+
#else

static inline void free_ima_ns(struct user_namespace *user_ns)
{
}

+static inline void ima_free_ns_status_list(struct list_head *head,
+ rwlock_t *ns_list_lock)
+{
+}
+
#endif /* CONFIG_IMA_NS */

#endif /* _LINUX_IMA_H */
diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index 8638976f7990..371cbceaf479 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -19,6 +19,7 @@
#include <linux/uaccess.h>
#include <linux/security.h>
#include <linux/lsm_hooks.h>
+#include <linux/ima.h>
#include "integrity.h"

static struct rb_root integrity_iint_tree = RB_ROOT;
@@ -82,6 +83,8 @@ static void iint_free(struct integrity_iint_cache *iint)
iint->ima_creds_status = INTEGRITY_UNKNOWN;
iint->evm_status = INTEGRITY_UNKNOWN;
iint->measured_pcrs = 0;
+ rwlock_init(&iint->ns_list_lock);
+ INIT_LIST_HEAD(&iint->ns_list);
kmem_cache_free(iint_cache, iint);
}

@@ -155,6 +158,8 @@ void integrity_inode_free(struct inode *inode)
rb_erase(&iint->rb_node, &integrity_iint_tree);
write_unlock(&integrity_iint_lock);

+ ima_free_ns_status_list(&iint->ns_list, &iint->ns_list_lock);
+
iint_free(iint);
}

@@ -170,6 +175,8 @@ static void init_once(void *foo)
iint->ima_creds_status = INTEGRITY_UNKNOWN;
iint->evm_status = INTEGRITY_UNKNOWN;
mutex_init(&iint->mutex);
+ rwlock_init(&iint->ns_list_lock);
+ INIT_LIST_HEAD(&iint->ns_list);
}

static int __init integrity_iintcache_init(void)
diff --git a/security/integrity/ima/Makefile b/security/integrity/ima/Makefile
index b86a35fbed60..edfb0c30a063 100644
--- a/security/integrity/ima/Makefile
+++ b/security/integrity/ima/Makefile
@@ -14,7 +14,7 @@ ima-$(CONFIG_HAVE_IMA_KEXEC) += ima_kexec.o
ima-$(CONFIG_IMA_BLACKLIST_KEYRING) += ima_mok.o
ima-$(CONFIG_IMA_MEASURE_ASYMMETRIC_KEYS) += ima_asymmetric_keys.o
ima-$(CONFIG_IMA_QUEUE_EARLY_BOOT_KEYS) += ima_queue_keys.o
-ima-$(CONFIG_IMA_NS) += ima_ns.o
+ima-$(CONFIG_IMA_NS) += ima_ns.o ima_ns_status.o

ifeq ($(CONFIG_EFI),y)
ima-$(CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT) += ima_efi.o
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 751e936a6311..4af8f2c4df40 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -123,6 +123,10 @@ struct ima_h_table {
};

struct ima_namespace {
+ struct rb_root ns_status_tree;
+ rwlock_t ns_tree_lock;
+ struct kmem_cache *ns_status_cache;
+
/* policy rules */
struct list_head ima_default_rules;
struct list_head ima_policy_rules;
@@ -507,6 +511,12 @@ static inline struct ima_namespace

struct ima_namespace *create_ima_ns(void);

+struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+ struct inode *inode,
+ struct integrity_iint_cache *iint);
+
+void ima_free_ns_status_tree(struct ima_namespace *ns);
+
#else

static inline struct ima_namespace *create_ima_ns(void)
@@ -515,6 +525,13 @@ static inline struct ima_namespace *create_ima_ns(void)
return ERR_PTR(-EFAULT);
}

+static inline struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+ struct inode *inode,
+ struct integrity_iint_cache *iint)
+{
+ return NULL;
+}
+
#endif /* CONFIG_IMA_NS */

#endif /* __LINUX_IMA_H */
diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
index 166dab4a3126..d4ddfd1de60b 100644
--- a/security/integrity/ima/ima_init_ima_ns.c
+++ b/security/integrity/ima/ima_init_ima_ns.c
@@ -12,6 +12,11 @@ int ima_init_namespace(struct ima_namespace *ns)
{
int rc;

+ ns->ns_status_tree = RB_ROOT;
+ rwlock_init(&ns->ns_tree_lock);
+ /* Use KMEM_CACHE for simplicity */
+ ns->ns_status_cache = KMEM_CACHE(ns_status, SLAB_PANIC);
+
INIT_LIST_HEAD(&ns->ima_default_rules);
INIT_LIST_HEAD(&ns->ima_policy_rules);
INIT_LIST_HEAD(&ns->ima_temp_rules);
diff --git a/security/integrity/ima/ima_ns.c b/security/integrity/ima/ima_ns.c
index b3b81a1e313e..29af6fea2d74 100644
--- a/security/integrity/ima/ima_ns.c
+++ b/security/integrity/ima/ima_ns.c
@@ -29,6 +29,7 @@ static void destroy_ima_ns(struct ima_namespace *ns)
unregister_blocking_lsm_notifier(&ns->ima_lsm_policy_notifier);
kfree(ns->arch_policy_entry);
ima_free_policy_rules(ns);
+ ima_free_ns_status_tree(ns);
}

void free_ima_ns(struct user_namespace *user_ns)
diff --git a/security/integrity/ima/ima_ns_status.c b/security/integrity/ima/ima_ns_status.c
new file mode 100644
index 000000000000..9c753caad6ac
--- /dev/null
+++ b/security/integrity/ima/ima_ns_status.c
@@ -0,0 +1,344 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2016-2021 IBM Corporation
+ * Author:
+ * Yuqiong Sun <[email protected]>
+ * Stefan Berger <[email protected]>
+ */
+
+#include <linux/user_namespace.h>
+#include <linux/ima.h>
+
+#include "ima.h"
+
+/*
+ * An ns_status must be on a per-namespace rbtree and on a per-iint list.
+ *
+ * Locking order for ns_status:
+ * 1) ns->ns_tree_lock : Lock the rbtree
+ * 2) iint->ns_list_lock: Lock the list
+ *
+ * An ns_status can be freed either by walking the rbtree (namespace deletion)
+ * or by walking the linked list of ns_status (inode/iint deletion). There are
+ * two functions that implement each one of the cases. To avoid concurrent
+ * freeing of the same ns_status, the two freeing paths cannot be run
+ * concurrently but each path can be run by multiple threads since no two
+ * threads will free the same inode/iint and no two threads will free the same
+ * namespace. Grouping threads like this ensures that:
+ * - while walking the rbtree: all ns_status will be on their list and the iint
+ * will still exist
+ * - while walking the list: all ns_status will be on their rbtree
+ */
+enum lk_group {
+ GRP_NS_STATUS_LIST = 0,
+ GRP_NS_STATUS_TREE
+};
+
+static atomic_t lg_ctr[2] = {
+ ATOMIC_INIT(0),
+ ATOMIC_INIT(0)
+};
+
+static DEFINE_SPINLOCK(lg_ctr_lock);
+
+static struct wait_queue_head lg_wq[2] = {
+ __WAIT_QUEUE_HEAD_INITIALIZER(lg_wq[0]),
+ __WAIT_QUEUE_HEAD_INITIALIZER(lg_wq[1])
+};
+
+static atomic_t ns_list_waiters = ATOMIC_INIT(0);
+
+/*
+ * Any number of concurrent threads may free ns_status's in either one of the
+ * groups but the groups must not run concurrently. The GRP_NS_STATUS_TREE
+ * group yields to waiters in the GRP_NS_STATUS_LIST group since namespace
+ * deletion has more time.
+ */
+static void lock_group(enum lk_group group)
+{
+ unsigned long flags;
+ bool done = false;
+ int announced = 0;
+
+ while (1) {
+ spin_lock_irqsave(&lg_ctr_lock, flags);
+
+ switch (group) {
+ case GRP_NS_STATUS_LIST:
+ if (atomic_read(&lg_ctr[GRP_NS_STATUS_TREE]) == 0) {
+ if (announced)
+ atomic_dec(&ns_list_waiters);
+ done = true;
+ atomic_inc(&lg_ctr[GRP_NS_STATUS_LIST]);
+ } else {
+ /* rbtree being deleted; announce waiting */
+ if (!announced) {
+ atomic_inc(&ns_list_waiters);
+ announced = 1;
+ }
+ }
+ break;
+ case GRP_NS_STATUS_TREE:
+ if (atomic_read(&lg_ctr[GRP_NS_STATUS_LIST]) == 0 &&
+ atomic_read(&ns_list_waiters) == 0) {
+ done = true;
+ atomic_inc(&lg_ctr[GRP_NS_STATUS_TREE]);
+ }
+ break;
+ }
+
+ spin_unlock_irqrestore(&lg_ctr_lock, flags);
+
+ if (done)
+ break;
+
+ /* wait until opposite group is done */
+ switch (group) {
+ case GRP_NS_STATUS_LIST:
+ wait_event_interruptible
+ (lg_wq[GRP_NS_STATUS_LIST],
+ atomic_read(&lg_ctr[GRP_NS_STATUS_TREE]) == 0);
+ break;
+ case GRP_NS_STATUS_TREE:
+ wait_event_interruptible
+ (lg_wq[GRP_NS_STATUS_TREE],
+ atomic_read(&lg_ctr[GRP_NS_STATUS_LIST]) == 0 &&
+ atomic_read(&ns_list_waiters) == 0);
+ break;
+ }
+ }
+}
+
+static void unlock_group(enum lk_group group)
+{
+ switch (group) {
+ case GRP_NS_STATUS_LIST:
+ if (atomic_dec_and_test(&lg_ctr[GRP_NS_STATUS_LIST]))
+ wake_up_interruptible_all(&lg_wq[GRP_NS_STATUS_TREE]);
+ break;
+ case GRP_NS_STATUS_TREE:
+ if (atomic_dec_and_test(&lg_ctr[GRP_NS_STATUS_TREE]))
+ wake_up_interruptible_all(&lg_wq[GRP_NS_STATUS_LIST]);
+ break;
+ }
+}
+
+static void ns_status_free(struct ima_namespace *ns,
+ struct ns_status *ns_status)
+{
+ pr_debug("FREE ns_status: %p\n", ns_status);
+
+ kmem_cache_free(ns->ns_status_cache, ns_status);
+}
+
+/*
+ * ima_free_ns_status_tree - free all items on the ns_status_tree and take each
+ * one off the list; yield to ns_list free'ers
+ *
+ * This function is called when an ima_namespace is freed. All entries in the
+ * rbtree will be taken off their list and collected in a garbage collection
+ * list and freed at the end. This allows to walk the rbtree again.
+ */
+void ima_free_ns_status_tree(struct ima_namespace *ns)
+{
+ struct ns_status *ns_status, *next;
+ struct llist_node *node;
+ LLIST_HEAD(garbage);
+ unsigned int ctr;
+ bool restart;
+
+ do {
+ ctr = 0;
+ restart = false;
+
+ lock_group(GRP_NS_STATUS_TREE);
+ write_lock(&ns->ns_tree_lock);
+
+ rbtree_postorder_for_each_entry_safe(ns_status, next,
+ &ns->ns_status_tree,
+ rb_node) {
+ write_lock(&ns_status->iint->ns_list_lock);
+ if (!list_empty(&ns_status->ns_next)) {
+ list_del_init(&ns_status->ns_next);
+ llist_add(&ns_status->gc_llist, &garbage);
+ ctr++;
+ }
+ write_unlock(&ns_status->iint->ns_list_lock);
+
+ /*
+ * After some progress yield to any waiting ns_list
+ * free'ers.
+ */
+ if (atomic_read(&ns_list_waiters) > 0 && ctr >= 5) {
+ restart = true;
+ break;
+ }
+ }
+
+ write_unlock(&ns->ns_tree_lock);
+ unlock_group(GRP_NS_STATUS_TREE);
+ } while (restart);
+
+ node = llist_del_all(&garbage);
+ llist_for_each_entry_safe(ns_status, next, node, gc_llist)
+ ns_status_free(ns, ns_status);
+
+ kmem_cache_destroy(ns->ns_status_cache);
+}
+
+/*
+ * ima_free_ns_status_list: free the list of ns_status items and take
+ * each one off its namespace rbtree
+ */
+void ima_free_ns_status_list(struct list_head *head, rwlock_t *ns_list_lock)
+{
+ struct ns_status *ns_status;
+
+ lock_group(GRP_NS_STATUS_LIST);
+
+ while (1) {
+ write_lock(ns_list_lock);
+ ns_status = list_first_entry_or_null(head, struct ns_status,
+ ns_next);
+ if (ns_status)
+ list_del_init(&ns_status->ns_next);
+ write_unlock(ns_list_lock);
+
+ if (!ns_status)
+ break;
+
+ write_lock(&ns_status->ns->ns_tree_lock);
+
+ rb_erase(&ns_status->rb_node, &ns_status->ns->ns_status_tree);
+ RB_CLEAR_NODE(&ns_status->rb_node);
+
+ write_unlock(&ns_status->ns->ns_tree_lock);
+
+ ns_status_free(ns_status->ns, ns_status);
+ }
+
+ unlock_group(GRP_NS_STATUS_LIST);
+}
+
+/*
+ * ns_status_find - return the ns_status associated with an inode;
+ * caller must hold lock for tree
+ */
+static struct ns_status *ns_status_find(struct ima_namespace *ns,
+ struct inode *inode)
+{
+ struct ns_status *ns_status;
+ struct rb_node *n = ns->ns_status_tree.rb_node;
+
+ while (n) {
+ ns_status = rb_entry(n, struct ns_status, rb_node);
+
+ if (inode < ns_status->inode)
+ n = n->rb_left;
+ else if (inode > ns_status->inode)
+ n = n->rb_right;
+ else
+ break;
+ }
+ if (!n)
+ return NULL;
+
+ return ns_status;
+}
+
+static void insert_ns_status(struct ima_namespace *ns, struct inode *inode,
+ struct ns_status *ns_status)
+{
+ struct rb_node **p;
+ struct rb_node *node, *parent = NULL;
+ struct ns_status *test_status;
+
+ p = &ns->ns_status_tree.rb_node;
+ while (*p) {
+ parent = *p;
+ test_status = rb_entry(parent, struct ns_status, rb_node);
+ if (inode < test_status->inode)
+ p = &(*p)->rb_left;
+ else
+ p = &(*p)->rb_right;
+ }
+ node = &ns_status->rb_node;
+ rb_link_node(node, parent, p);
+ rb_insert_color(node, &ns->ns_status_tree);
+}
+
+static void ns_status_unlink(struct ima_namespace *ns,
+ struct ns_status *ns_status)
+{
+ write_lock(&ns_status->iint->ns_list_lock);
+ if (!list_empty(&ns_status->ns_next))
+ list_del_init(&ns_status->ns_next);
+ write_unlock(&ns_status->iint->ns_list_lock);
+
+ rb_erase(&ns_status->rb_node, &ns->ns_status_tree);
+ RB_CLEAR_NODE(&ns_status->rb_node);
+}
+
+struct ns_status *ima_get_ns_status(struct ima_namespace *ns,
+ struct inode *inode,
+ struct integrity_iint_cache *iint)
+{
+ struct ns_status *ns_status;
+ bool get_new = true;
+
+ /*
+ * Prevent finding the status via the list (inode/iint deletion) since
+ * we may free it.
+ */
+ lock_group(GRP_NS_STATUS_TREE);
+
+ write_lock(&ns->ns_tree_lock);
+
+ ns_status = ns_status_find(ns, inode);
+ if (ns_status) {
+ if (unlikely(ns_status->iint != iint)) {
+ /* Same inode but stale iint: free it and get new */
+ ns_status_unlink(ns, ns_status);
+ ns_status_free(ns, ns_status);
+ } else if (inode->i_ino == ns_status->i_ino &&
+ inode->i_generation == ns_status->i_generation) {
+ goto unlock;
+ } else {
+ /* Reuse of ns_status is possible but need reset */
+ ns_status_reset(ns_status);
+ get_new = false;
+ }
+ }
+
+ if (get_new) {
+ ns_status = kmem_cache_alloc(ns->ns_status_cache, GFP_NOFS);
+ if (!ns_status) {
+ ns_status = ERR_PTR(-ENOMEM);
+ goto unlock;
+ }
+
+ pr_debug("NEW ns_status: %p\n", ns_status);
+
+ ns_status_init(ns_status);
+ insert_ns_status(ns, inode, ns_status);
+ }
+
+ ns_status->iint = iint;
+ ns_status->inode = inode;
+ ns_status->ns = ns;
+ ns_status->i_ino = inode->i_ino;
+ ns_status->i_generation = inode->i_generation;
+
+ /* make visible on list */
+ write_lock(&iint->ns_list_lock);
+ if (list_empty(&ns_status->ns_next))
+ list_add_tail(&ns_status->ns_next, &iint->ns_list);
+ write_unlock(&iint->ns_list_lock);
+
+unlock:
+ write_unlock(&ns->ns_tree_lock);
+
+ unlock_group(GRP_NS_STATUS_TREE);
+
+ return ns_status;
+}
diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
index 547425c20e11..b7d5ca108900 100644
--- a/security/integrity/integrity.h
+++ b/security/integrity/integrity.h
@@ -122,13 +122,39 @@ struct signature_v2_hdr {
uint8_t sig[]; /* signature payload */
} __packed;

+/* integrity status of an inode in a namespace */
+struct ns_status {
+ struct list_head ns_next;
+ unsigned long flags; /* flags split with iint */
+#ifdef CONFIG_IMA_NS
+ struct rb_node rb_node;
+ struct integrity_iint_cache *iint;
+ struct inode *inode;
+ struct ima_namespace *ns;
+ ino_t i_ino;
+ u32 i_generation;
+ struct llist_node gc_llist; /* used while freeing */
+#endif
+};
+
+static inline void ns_status_reset(struct ns_status *ns_status)
+{
+ ns_status->flags = 0;
+}
+
+static inline void ns_status_init(struct ns_status *ns_status)
+{
+ INIT_LIST_HEAD(&ns_status->ns_next);
+ ns_status_reset(ns_status);
+}
+
/* integrity data associated with an inode */
struct integrity_iint_cache {
struct rb_node rb_node; /* rooted in integrity_iint_tree */
struct mutex mutex; /* protects: version, flags, digest */
struct inode *inode; /* back pointer to inode in question */
u64 version; /* track inode changes */
- unsigned long flags;
+ unsigned long flags; /* flags split with ns_status */
unsigned long measured_pcrs;
unsigned long atomic_flags;
enum integrity_status ima_file_status:4;
@@ -138,6 +164,13 @@ struct integrity_iint_cache {
enum integrity_status ima_creds_status:4;
enum integrity_status evm_status:4;
struct ima_digest_data *ima_hash;
+
+ /*
+ * Lock and list of ns_status for files shared by different
+ * namespaces
+ */
+ rwlock_t ns_list_lock;
+ struct list_head ns_list;
};

/* rbtree tree calls to lookup, insert, delete
--
2.31.1


2022-02-07 14:30:15

by Stefan Berger

[permalink] [raw]
Subject: [PATCH v10 21/27] ima: Remove unused iints from the integrity_iint_cache

When the rbtree of an IMA namespace is torn down, also remove those iints
that are completely unused since only the torn-down namespace stored data
about the associated inode in it.

An iint is unused when the following two conditions are met:

- Its ns_status list is empty which means that no IMA namespace
currently has auditing related state stored in it.

- The iint's flags don't contain any of the flags IMA_MEASURE,
IMA_APPRAISE or IMA_HASH that the host would still store there.
It doesn't need an ns_status list for these but also only for
IMA_AUDIT.

Introduce the #define IMA_IINT_FLAGS that represent the mask to test the
iint->flags with in this case. This test provides the reason to keep the
iint if any of these flags are set.

The IMA_IINT_FLAGS mask will loose its flags as more flags are namespaced
and can then be removed in the end and only the check for the empty list
will remain.

Process the list of garbage-collected ns_status outside the locking of
the ns_status tree and related lock-group and free any iint that was
previously found to be unused while walking the list. File accesses, that
may have happened in the meantime, could have re-activated the iint and
therefore pass along the test function to check whether the iint is still
unused.

Signed-off-by: Stefan Berger <[email protected]>
---
security/integrity/iint.c | 4 +++
security/integrity/ima/ima.h | 2 ++
security/integrity/ima/ima_ns_status.c | 43 +++++++++++++++++++++++++-
security/integrity/integrity.h | 1 +
4 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index 4580df0e716e..b0996bd0ee67 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -158,6 +158,10 @@ void integrity_inode_free(struct inode *inode, iint_removable_cb check)
write_lock(&integrity_iint_lock);

iint = __integrity_iint_find(inode);
+ if (!iint) {
+ write_unlock(&integrity_iint_lock);
+ return;
+ }

if (check)
freeit = check(iint);
diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
index 883aeb30590f..ec0b81c7dbf5 100644
--- a/security/integrity/ima/ima.h
+++ b/security/integrity/ima/ima.h
@@ -497,6 +497,8 @@ static inline int ima_filter_rule_match(u32 secid, u32 field, u32 op,
#define IMA_NS_STATUS_ACTIONS IMA_AUDIT
#define IMA_NS_STATUS_FLAGS (IMA_AUDIT | IMA_AUDITED)

+#define IMA_IINT_FLAGS (IMA_MEASURE | IMA_APPRAISE | IMA_HASH)
+
static inline unsigned long iint_flags(struct integrity_iint_cache *iint,
struct ns_status *ns_status)
{
diff --git a/security/integrity/ima/ima_ns_status.c b/security/integrity/ima/ima_ns_status.c
index 9c753caad6ac..d1ccae2c2313 100644
--- a/security/integrity/ima/ima_ns_status.c
+++ b/security/integrity/ima/ima_ns_status.c
@@ -131,6 +131,26 @@ static void ns_status_free(struct ima_namespace *ns,
kmem_cache_free(ns->ns_status_cache, ns_status);
}

+/* Test whether an iint is unused due to empty ns_status list AND the
+ * not-yet namespaced flags are not set on it.
+ */
+static bool __iint_is_unused(struct integrity_iint_cache *iint)
+{
+ return list_empty(&iint->ns_list) &&
+ (iint_flags(iint, NULL) & IMA_IINT_FLAGS) == 0;
+}
+
+static bool iint_is_unused(struct integrity_iint_cache *iint)
+{
+ bool ret;
+
+ write_lock(&iint->ns_list_lock);
+ ret = __iint_is_unused(iint);
+ write_unlock(&iint->ns_list_lock);
+
+ return ret;
+}
+
/*
* ima_free_ns_status_tree - free all items on the ns_status_tree and take each
* one off the list; yield to ns_list free'ers
@@ -161,6 +181,18 @@ void ima_free_ns_status_tree(struct ima_namespace *ns)
if (!list_empty(&ns_status->ns_next)) {
list_del_init(&ns_status->ns_next);
llist_add(&ns_status->gc_llist, &garbage);
+
+ /*
+ * While ns_status->iint is guaranteed to be
+ * there, check whether the iint is still in
+ * use by anyone at this moment.
+ */
+ if (__iint_is_unused(ns_status->iint)) {
+ ns_status->inode_to_remove =
+ ns_status->iint->inode;
+ } else {
+ ns_status->inode_to_remove = NULL;
+ }
ctr++;
}
write_unlock(&ns_status->iint->ns_list_lock);
@@ -180,8 +212,17 @@ void ima_free_ns_status_tree(struct ima_namespace *ns)
} while (restart);

node = llist_del_all(&garbage);
- llist_for_each_entry_safe(ns_status, next, node, gc_llist)
+ llist_for_each_entry_safe(ns_status, next, node, gc_llist) {
+ if (ns_status->inode_to_remove) {
+ /*
+ * Pass along the test function in case inode is in
+ * use now.
+ */
+ integrity_inode_free(ns_status->inode_to_remove,
+ iint_is_unused);
+ }
ns_status_free(ns, ns_status);
+ }

kmem_cache_destroy(ns->ns_status_cache);
}
diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
index dbe9f36d3692..6276e6a615b7 100644
--- a/security/integrity/integrity.h
+++ b/security/integrity/integrity.h
@@ -134,6 +134,7 @@ struct ns_status {
ino_t i_ino;
u32 i_generation;
struct llist_node gc_llist; /* used while freeing */
+ void *inode_to_remove; /* used while freeing */
#endif
};

--
2.31.1


2022-02-10 13:00:34

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 03/27] ima: Return error code obtained from securityfs functions

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> If an error occurs when creating a securityfs file, return the exact
> error code to the caller.
>
> Signed-off-by: Stefan Berger <[email protected]>

Thanks, Stefan. Nice cleanup.

Reviewed-by: Mimi Zohar <[email protected]>


2022-02-15 22:57:37

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 03/27] ima: Return error code obtained from securityfs functions

On Thu, 2022-02-10 at 07:02 -0500, Mimi Zohar wrote:
> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> > If an error occurs when creating a securityfs file, return the exact
> > error code to the caller.
> >
> > Signed-off-by: Stefan Berger <[email protected]>
>
> Thanks, Stefan. Nice cleanup.

This is now queued in next-integrity.

--
Thanks,

Mimi

2022-02-17 23:50:46

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 08/27] ima: Move measurement list related variables into ima_namespace

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> Move measurement list related variables into the ima_namespace. This way
> a front-end like securityfs can show the measurement list inside an IMA
> namespace.

Also, in order for kexec to allocate memory for the existing
measurement list, the measurement list memory size is stored in the
binary_runtime_size variable. To avoid special-casing init_ima_ns, as
much as possible, move it into the ima_namespace.

>
> Signed-off-by: Stefan Berger <[email protected]>

Reviewed-by: Mimi Zohar <[email protected]>

2022-02-19 11:35:36

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 14/27] userns: Add pointer to ima_namespace to user_namespace

Hi Stefan,

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> Add a pointer to ima_namespace to the user_namespace and initialize
> the init_user_ns with a pointer to init_ima_ns.
>
> Signed-off-by: Stefan Berger <[email protected]>

The patch description explains 'what' is being done, but not 'why'.
Please add a sentence before what you have providing the motivation.

Otherwise,

Reviewed-by: Mimi Zohar <[email protected]>

--
thanks,

Mimi

2022-02-23 02:31:46

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 22/27] securityfs: Extend securityfs with namespacing support

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> Enable multiple instances of securityfs by keying each instance with a
> pointer to the user namespace it belongs to.
>
> Since we do not need the pinning of the filesystem for the virtualization

^namespacing case

> case, limit the usage of simple_pin_fs() and simpe_release_fs() to the

^simple_release_fs

> case when the init_user_ns is active. This simplifies the cleanup for the
> virtualization case where usage of securityfs_remove() to free dentries

^namespacing

> is therefore not needed anymore.
>
> For the initial securityfs, i.e. the one mounted in the host userns mount,
> nothing changes. The rules for securityfs_remove() are as before and it is
> still paired with securityfs_create(). Specifically, a file created via
> securityfs_create_dentry() in the initial securityfs mount still needs to
> be removed by a call to securityfs_remove(). Creating a new dentry in the
> initial securityfs mount still pins the filesystem like it always did.
> Consequently, the initial securityfs mount is not destroyed on
> umount/shutdown as long as at least one user of it still has dentries that
> it hasn't removed with a call to securityfs_remove().
>
> Prevent mounting of an instance of securityfs in another user namespace
> than it belongs to. Also, prevent accesses to files and directories by
> a user namespace that is neither the user namespace it belongs to
> nor an ancestor of the user namespace that the instance of securityfs
> belongs to. Do not prevent access if securityfs was bind-mounted and
> therefore the init_user_ns is the owning user namespace.
>
> Suggested-by: Christian Brauner <[email protected]>
> Signed-off-by: Stefan Berger <[email protected]>
> Signed-off-by: James Bottomley <[email protected]>

Christian, I understand that "[PATCH v10 23/27] ima: Setup securityfs
for IMA namespace" needs to be deferred, but is there a reason for
deferring "[PATCH v10 22/27] securityfs: Extend securityfs with
namespacing support"? As the securityfs patches are really
independent of IMA namespacing, I would have thought "[PATCH v10
04/27] securityfs: rework dentry creation" and this patch should be co-
located at the beginning of the patch set.

--
thanks,

Mimi

2022-02-23 12:08:17

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 23/27] ima: Setup securityfs for IMA namespace

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> Setup securityfs with symlinks, directories, and files for IMA
> namespacing support. The same directory structure that IMA uses on the
> host is also created for the namespacing case.
>
> The securityfs file and directory ownerships cannot be set when the
> IMA namespace is initialized. Therefore, delay the setup of the file
> system to a later point when securityfs is in securityfs_fill_super.
>
> Introduce a variable ima_policy_removed in ima_namespace that is used to
> remember whether the policy file has previously been removed and thus
> should not be created again in case of unmounting and again mounting
> securityfs inside an IMA namespace.

When the ability of extending the custom IMA policy was added, support
for displaying the policy was added. (Refer to the IMA_READ_POLICY
Kconfig.) This patch set adds support for a user, true root in the
namespace, to be able to write a custom policy. If the
IMA_READ_POLICY is not enabled, then nobody, including host root, will
be able to view it.

Instead of continuing to support not being able to read the IMA policy,
updating the IMA_READ_POLICY Kconfig for the IMA_NS case to require it
seems preferable.

> This filesystem can now be mounted as follows:
>
> mount -t securityfs /sys/kernel/security/ /sys/kernel/security/
>
> The following directories, symlinks, and files are available
> when IMA namespacing is enabled, otherwise it will be empty:
>
> $ ls -l sys/kernel/security/
> total 0
> lr--r--r--. 1 root root 0 Dec 2 00:18 ima -> integrity/ima
> drwxr-xr-x. 3 root root 0 Dec 2 00:18 integrity
>
> $ ls -l sys/kernel/security/ima/
> total 0
> -r--r-----. 1 root root 0 Dec 2 00:18 ascii_runtime_measurements
> -r--r-----. 1 root root 0 Dec 2 00:18 binary_runtime_measurements
> -rw-------. 1 root root 0 Dec 2 00:18 policy
> -r--r-----. 1 root root 0 Dec 2 00:18 runtime_measurements_count
> -r--r-----. 1 root root 0 Dec 2 00:18 violations
>
> Signed-off-by: Stefan Berger <[email protected]>
> Signed-off-by: James Bottomley <[email protected]>
> Acked-by: Christian Brauner <[email protected]>

Otherwise,

Reviewed-by: Mimi Zohar <[email protected]>

2022-02-23 12:39:17

by Christian Brauner

[permalink] [raw]
Subject: Re: [PATCH v10 22/27] securityfs: Extend securityfs with namespacing support

On Tue, Feb 22, 2022 at 08:48:47PM -0500, Mimi Zohar wrote:
> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> > Enable multiple instances of securityfs by keying each instance with a
> > pointer to the user namespace it belongs to.
> >
> > Since we do not need the pinning of the filesystem for the virtualization
>
> ^namespacing case
>
> > case, limit the usage of simple_pin_fs() and simpe_release_fs() to the
>
> ^simple_release_fs
>
> > case when the init_user_ns is active. This simplifies the cleanup for the
> > virtualization case where usage of securityfs_remove() to free dentries
>
> ^namespacing
>
> > is therefore not needed anymore.
> >
> > For the initial securityfs, i.e. the one mounted in the host userns mount,
> > nothing changes. The rules for securityfs_remove() are as before and it is
> > still paired with securityfs_create(). Specifically, a file created via
> > securityfs_create_dentry() in the initial securityfs mount still needs to
> > be removed by a call to securityfs_remove(). Creating a new dentry in the
> > initial securityfs mount still pins the filesystem like it always did.
> > Consequently, the initial securityfs mount is not destroyed on
> > umount/shutdown as long as at least one user of it still has dentries that
> > it hasn't removed with a call to securityfs_remove().
> >
> > Prevent mounting of an instance of securityfs in another user namespace
> > than it belongs to. Also, prevent accesses to files and directories by
> > a user namespace that is neither the user namespace it belongs to
> > nor an ancestor of the user namespace that the instance of securityfs
> > belongs to. Do not prevent access if securityfs was bind-mounted and
> > therefore the init_user_ns is the owning user namespace.
> >
> > Suggested-by: Christian Brauner <[email protected]>
> > Signed-off-by: Stefan Berger <[email protected]>
> > Signed-off-by: James Bottomley <[email protected]>
>
> Christian, I understand that "[PATCH v10 23/27] ima: Setup securityfs
> for IMA namespace" needs to be deferred, but is there a reason for
> deferring "[PATCH v10 22/27] securityfs: Extend securityfs with
> namespacing support"? As the securityfs patches are really
> independent of IMA namespacing, I would have thought "[PATCH v10
> 04/27] securityfs: rework dentry creation" and this patch should be co-
> located at the beginning of the patch set.

It felt more natural to me to defer it until the end but I have no
strong thoughts on this as of right now. Since Stefan has mentioned
moving this earlier already himself and you seem to agree as well, feel
free to do so.

2022-02-23 16:07:49

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 24/27] ima: Introduce securityfs file to activate an IMA namespace

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> Introduce securityfs file 'active' that allows a user to activate an IMA
> namespace by writing a "1" (precisely a '1\0' or '1\n') to it. When
> reading from the file, it shows either '0' or '1'.

A patch description should start with the motivation for the change
being introduced. The last paragraph mentions "why" it will be needed
in the future, but there are other reasons for explicitly requiring
activation. Probably something along the lines of not every user
namespace requires an active IMA namespace. Please include those
reasons here.

>
> Also, introduce ns_is_active() to be used in those places where the
> ima_namespace pointer may either be NULL or where the active field may not
> have been set to '1', yet. An inactive IMA namespace should never be
> accessed since it has not been initialized, yet.
>
> Set the init_ima_ns's active field to '1' since it is considered active
> right from the beginning.
>
> The rationale for introducing a file to activate an IMA namespace is that
> subsequent support for IMA-measurement and IMA-appraisal will add
> configuration files for selecting for example the template that an IMA
> namespace is supposed to use for logging measurements, which will add
> an IMA namespace configuration stage (using securityfs files) before its
> activation.

This could be included at the beginning, as part of the motivation for
the patch, but it shouldn't be the only reason.

>
> Signed-off-by: Stefan Berger <[email protected]>
>
> ---
> v10:
> - use memdup_user_nul to copy input from user
> - propagating error returned from ima_fs_add_ns_files()
> ---
> security/integrity/ima/ima.h | 7 +++
> security/integrity/ima/ima_fs.c | 71 ++++++++++++++++++++++++
> security/integrity/ima/ima_init_ima_ns.c | 1 +
> security/integrity/ima/ima_main.c | 2 +-
> 4 files changed, 80 insertions(+), 1 deletion(-)
>
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index 1e3f9dda218d..05e2de7697da 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -123,6 +123,8 @@ struct ima_h_table {
> };
>
> struct ima_namespace {
> + atomic_t active; /* whether namespace is active */
> +
> struct rb_root ns_status_tree;
> rwlock_t ns_tree_lock;
> struct kmem_cache *ns_status_cache;
> @@ -154,6 +156,11 @@ struct ima_namespace {
> } __randomize_layout;
> extern struct ima_namespace init_ima_ns;
>
> +static inline bool ns_is_active(struct ima_namespace *ns)
> +{
> + return (ns && atomic_read(&ns->active));
> +}
> +
> extern const int read_idmap[];
>
> #ifdef CONFIG_HAVE_IMA_KEXEC
> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
> index 84cd02a9e19b..58d80884880f 100644
> --- a/security/integrity/ima/ima_fs.c
> +++ b/security/integrity/ima/ima_fs.c
> @@ -451,6 +451,71 @@ static const struct file_operations ima_measure_policy_ops = {
> .llseek = generic_file_llseek,
> };
>
> +static ssize_t ima_show_active(struct file *filp,
> + char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + struct ima_namespace *ns = &init_ima_ns;
> + char tmpbuf[2];
> + ssize_t len;
> +
> + len = scnprintf(tmpbuf, sizeof(tmpbuf),
> + "%d\n", atomic_read(&ns->active));
> + return simple_read_from_buffer(buf, count, ppos, tmpbuf, len);
> +}
> +
> +static ssize_t ima_write_active(struct file *filp,
> + const char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + struct ima_namespace *ns = &init_ima_ns;
> + unsigned int active;
> + char *kbuf;
> + int err;
> +
> + if (ns_is_active(ns))
> + return -EBUSY;
> +
> + /* accepting '1\n' and '1\0' and no partial writes */
> + if (count >= 3 || *ppos != 0)
> + return -EINVAL;
> +
> + kbuf = memdup_user_nul(buf, count);
> + if (IS_ERR(kbuf))
> + return PTR_ERR(kbuf);
> +
> + err = kstrtouint(kbuf, 10, &active);
> + kfree(kbuf);
> + if (err)
> + return err;
> +
> + if (active != 1)
> + return -EINVAL;
> +
> + atomic_set(&ns->active, 1);
> +
> + return count;
> +}
> +
> +static const struct file_operations ima_active_ops = {
> + .read = ima_show_active,
> + .write = ima_write_active,
> +};
> +
> +static int ima_fs_add_ns_files(struct dentry *ima_dir)
> +{
> + struct dentry *active;
> +
> + active =
> + securityfs_create_file("active",
> + S_IRUSR | S_IWUSR | S_IRGRP, ima_dir, NULL,
> + &ima_active_ops);
> + if (IS_ERR(active))
> + return PTR_ERR(active);
> +
> + return 0;
> +}
> +
> int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
> {
> struct ima_namespace *ns = ima_ns_from_user_ns(user_ns);
> @@ -531,6 +596,12 @@ int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
> }
> }
>
> + if (ns != &init_ima_ns) {
> + ret = ima_fs_add_ns_files(ima_dir);
> + if (ret)
> + goto out;
> + }
> +

In all other cases, the securityfs files are directly created in
ima_fs_ns_init(). What is different about "active" that a new
function is defined?

thanks,

Mimi

> return 0;
> out:
> securityfs_remove(ns->ima_policy);
> diff --git a/security/integrity/ima/ima_init_ima_ns.c b/security/integrity/ima/ima_init_ima_ns.c
> index d4ddfd1de60b..39ee0c2477a6 100644
> --- a/security/integrity/ima/ima_init_ima_ns.c
> +++ b/security/integrity/ima/ima_init_ima_ns.c
> @@ -58,5 +58,6 @@ struct ima_namespace init_ima_ns = {
> .ima_lsm_policy_notifier = {
> .notifier_call = ima_lsm_policy_change,
> },
> + .active = ATOMIC_INIT(1),
> };
> EXPORT_SYMBOL(init_ima_ns);
> diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> index 1dee8cb5afa2..9ca9223bbcf6 100644
> --- a/security/integrity/ima/ima_main.c
> +++ b/security/integrity/ima/ima_main.c
> @@ -441,7 +441,7 @@ static int process_measurement(struct user_namespace *user_ns,
>
> while (user_ns) {
> ns = ima_ns_from_user_ns(user_ns);
> - if (ns) {
> + if (ns_is_active(ns)) {
> int rc;
>
> rc = __process_measurement(ns, file, cred, secid, buf,


2022-02-23 17:23:25

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 24/27] ima: Introduce securityfs file to activate an IMA namespace

On Wed, 2022-02-23 at 12:08 -0500, Stefan Berger wrote:
> On 2/23/22 08:54, Mimi Zohar wrote:
> > On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> >
> >> int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
> >> {
> >> struct ima_namespace *ns = ima_ns_from_user_ns(user_ns);
> >> @@ -531,6 +596,12 @@ int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
> >> }
> >> }
> >>
> >> + if (ns != &init_ima_ns) {
> >> + ret = ima_fs_add_ns_files(ima_dir);
> >> + if (ret)
> >> + goto out;
> >> + }
> >> +
> > In all other cases, the securityfs files are directly created in
> > ima_fs_ns_init(). What is different about "active" that a new
> > function is defined?
>
>
> It was meant as a function to create namespace-specific files, if more
> were to come along. I can move the code from ima_fs_add_ns_files() into
> this function if you want.

Perhaps defer defining a new function until that happens.

thanks,

Mimi

2022-02-23 19:31:20

by Mimi Zohar

[permalink] [raw]
Subject: Re: [PATCH v10 18/27] integrity/ima: Define ns_status for storing namespaced iint data

On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
> From: Mehmet Kayaalp <[email protected]>
>
> Add an rbtree to the IMA namespace structure that stores a namespaced
> version of iint->flags in ns_status struct. Similar to the
> integrity_iint_cache, both the iint and ns_status are looked up using the
> inode pointer value. The lookup, allocate, and insertion code is also
> similar.
>
> In subsequent patches we will have to find all ns_status entries an iint
> is being used in and reset flags there. To do this, connect a list of
> ns_status to the integrity_iint_cache and provide a reader-writer
> lock in the integrity_iint_cache to lock access to the list.
>
> To simplify the code in the non-namespaces case embed an ns_status in
> the integrity_iint_cache and have it linked into the iint's ns_status list
> when calling ima_get_ns_status().
>
> When getting an ns_status first try to find it in the RB tree. Here we can
> run into the situation that an ns_status found in the RB tree has a
> different iint associated with it for the same inode. In this case we need
> to delete the ns_status structure and get a new one.
>
> There are two cases for freeing:
> - when the iint is freed (inode deletion): Walk the list of ns_status
> entries and disconnect each ns_status from the list; take the
> writer lock to protect access to the list; also, take the item off the
> per-namespace rbtree
>
> - when the ima_namepace is freed: While walking the rbtree, remove the
> ns_status from the list while also holding the iint's writer lock;
> to be able to grab the lock we have to have a pointer to the iint on
> the ns_status structure.
>
> To avoid an ns_status to be freed by the two cases concurrently, prevent
> these two cases to run concurrently. Therefore, groups of threads
> deleting either inodes or ima_namespaces are allowed to run concurrently
> but no two threads may run and one delete an inode and the other an
> ima_namespace.

The locking involved here is really complex. I'm sure you thought
about it a lot, but isn't there a better alternative?

>
> Signed-off-by: Mehmet Kayaalp <[email protected]>
> Signed-off-by: Stefan Berger <[email protected]>
>
> ---
> v9:
> - Move ns_status into integrity.h and embedded it into integrity_iint_cache
> for the non-CONFIG_IMA_NS case

<snip>

>
> +/*
> + * ima_free_ns_status_tree - free all items on the ns_status_tree and take each
> + * one off the list; yield to ns_list free'ers
> + *
> + * This function is called when an ima_namespace is freed. All entries in the
> + * rbtree will be taken off their list and collected in a garbage collection
> + * list and freed at the end. This allows to walk the rbtree again.
> + */
> +void ima_free_ns_status_tree(struct ima_namespace *ns)
> +{
> + struct ns_status *ns_status, *next;
> + struct llist_node *node;
> + LLIST_HEAD(garbage);
> + unsigned int ctr;
> + bool restart;
> +
> + do {
> + ctr = 0;
> + restart = false;
> +
> + lock_group(GRP_NS_STATUS_TREE);
> + write_lock(&ns->ns_tree_lock);
> +
> + rbtree_postorder_for_each_entry_safe(ns_status, next,
> + &ns->ns_status_tree,
> + rb_node) {
> + write_lock(&ns_status->iint->ns_list_lock);
> + if (!list_empty(&ns_status->ns_next)) {
> + list_del_init(&ns_status->ns_next);
> + llist_add(&ns_status->gc_llist, &garbage);
> + ctr++;
> + }

At this point when the namespace is being deleted, no entries are being
added to the rbtree, so it is safe to remove the nodes here. There's
no need to first create a list and then remove them.

> + write_unlock(&ns_status->iint->ns_list_lock);
> +
> + /*
> + * After some progress yield to any waiting ns_list
> + * free'ers.
> + */
> + if (atomic_read(&ns_list_waiters) > 0 && ctr >= 5) {
> + restart = true;
> + break;
> + }

Giving priority to removing entries in the iint cache is important, but
I wish there was a better alternative.

> + }
> +
> + write_unlock(&ns->ns_tree_lock);
> + unlock_group(GRP_NS_STATUS_TREE);
> + } while (restart);
> +
> + node = llist_del_all(&garbage);
> + llist_for_each_entry_safe(ns_status, next, node, gc_llist)
> + ns_status_free(ns, ns_status);
> +
> + kmem_cache_destroy(ns->ns_status_cache);
> +}

--
thanks,

Mimi

2022-02-23 20:09:01

by Stefan Berger

[permalink] [raw]
Subject: Re: [PATCH v10 24/27] ima: Introduce securityfs file to activate an IMA namespace


On 2/23/22 08:54, Mimi Zohar wrote:
> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
>
>> int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
>> {
>> struct ima_namespace *ns = ima_ns_from_user_ns(user_ns);
>> @@ -531,6 +596,12 @@ int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
>> }
>> }
>>
>> + if (ns != &init_ima_ns) {
>> + ret = ima_fs_add_ns_files(ima_dir);
>> + if (ret)
>> + goto out;
>> + }
>> +
> In all other cases, the securityfs files are directly created in
> ima_fs_ns_init(). What is different about "active" that a new
> function is defined?


It was meant as a function to create namespace-specific files, if more
were to come along.  I can move the code from ima_fs_add_ns_files() into
this function if you want.


>
> thanks,
>
> Mimi
>
>

2022-02-24 01:19:38

by Stefan Berger

[permalink] [raw]
Subject: Re: [PATCH v10 24/27] ima: Introduce securityfs file to activate an IMA namespace


On 2/23/22 12:12, Mimi Zohar wrote:
> On Wed, 2022-02-23 at 12:08 -0500, Stefan Berger wrote:
>> On 2/23/22 08:54, Mimi Zohar wrote:
>>> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
>>>
>>>> int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
>>>> {
>>>> struct ima_namespace *ns = ima_ns_from_user_ns(user_ns);
>>>> @@ -531,6 +596,12 @@ int ima_fs_ns_init(struct user_namespace *user_ns, struct dentry *root)
>>>> }
>>>> }
>>>>
>>>> + if (ns != &init_ima_ns) {
>>>> + ret = ima_fs_add_ns_files(ima_dir);
>>>> + if (ret)
>>>> + goto out;
>>>> + }
>>>> +
>>> In all other cases, the securityfs files are directly created in
>>> ima_fs_ns_init(). What is different about "active" that a new
>>> function is defined?
>>
>> It was meant as a function to create namespace-specific files, if more
>> were to come along. I can move the code from ima_fs_add_ns_files() into
>> this function if you want.
> Perhaps defer defining a new function until that happens.

I moved the code out of this function into ima_fs_ns_init() now.



2022-02-24 01:30:39

by Stefan Berger

[permalink] [raw]
Subject: Re: [PATCH v10 22/27] securityfs: Extend securityfs with namespacing support


On 2/23/22 03:14, Christian Brauner wrote:
> On Tue, Feb 22, 2022 at 08:48:47PM -0500, Mimi Zohar wrote:
>> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
>>> Enable multiple instances of securityfs by keying each instance with a
>>> pointer to the user namespace it belongs to.
>>>
>>> Since we do not need the pinning of the filesystem for the virtualization
>> ^namespacing case
>>
>>> case, limit the usage of simple_pin_fs() and simpe_release_fs() to the
>> ^simple_release_fs
>>
>>> case when the init_user_ns is active. This simplifies the cleanup for the
>>> virtualization case where usage of securityfs_remove() to free dentries
>> ^namespacing
>>
>>> is therefore not needed anymore.
>>>
>>> For the initial securityfs, i.e. the one mounted in the host userns mount,
>>> nothing changes. The rules for securityfs_remove() are as before and it is
>>> still paired with securityfs_create(). Specifically, a file created via
>>> securityfs_create_dentry() in the initial securityfs mount still needs to
>>> be removed by a call to securityfs_remove(). Creating a new dentry in the
>>> initial securityfs mount still pins the filesystem like it always did.
>>> Consequently, the initial securityfs mount is not destroyed on
>>> umount/shutdown as long as at least one user of it still has dentries that
>>> it hasn't removed with a call to securityfs_remove().
>>>
>>> Prevent mounting of an instance of securityfs in another user namespace
>>> than it belongs to. Also, prevent accesses to files and directories by
>>> a user namespace that is neither the user namespace it belongs to
>>> nor an ancestor of the user namespace that the instance of securityfs
>>> belongs to. Do not prevent access if securityfs was bind-mounted and
>>> therefore the init_user_ns is the owning user namespace.
>>>
>>> Suggested-by: Christian Brauner <[email protected]>
>>> Signed-off-by: Stefan Berger <[email protected]>
>>> Signed-off-by: James Bottomley <[email protected]>
>> Christian, I understand that "[PATCH v10 23/27] ima: Setup securityfs
>> for IMA namespace" needs to be deferred, but is there a reason for
>> deferring "[PATCH v10 22/27] securityfs: Extend securityfs with
>> namespacing support"? As the securityfs patches are really
>> independent of IMA namespacing, I would have thought "[PATCH v10
>> 04/27] securityfs: rework dentry creation" and this patch should be co-
>> located at the beginning of the patch set.
> It felt more natural to me to defer it until the end but I have no
> strong thoughts on this as of right now. Since Stefan has mentioned
> moving this earlier already himself and you seem to agree as well, feel
> free to do so.

I'll move it after 'securityfs: rework dentry creation' if that's ok.

   Stefan

2022-02-24 02:22:42

by Stefan Berger

[permalink] [raw]
Subject: Re: [PATCH v10 18/27] integrity/ima: Define ns_status for storing namespaced iint data


On 2/23/22 11:12, Mimi Zohar wrote:
> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
>> From: Mehmet Kayaalp <[email protected]>
>>
>> Add an rbtree to the IMA namespace structure that stores a namespaced
>> version of iint->flags in ns_status struct. Similar to the
>> integrity_iint_cache, both the iint and ns_status are looked up using the
>> inode pointer value. The lookup, allocate, and insertion code is also
>> similar.
>>
>> In subsequent patches we will have to find all ns_status entries an iint
>> is being used in and reset flags there. To do this, connect a list of
>> ns_status to the integrity_iint_cache and provide a reader-writer
>> lock in the integrity_iint_cache to lock access to the list.
>>
>> To simplify the code in the non-namespaces case embed an ns_status in
>> the integrity_iint_cache and have it linked into the iint's ns_status list
>> when calling ima_get_ns_status().
>>
>> When getting an ns_status first try to find it in the RB tree. Here we can
>> run into the situation that an ns_status found in the RB tree has a
>> different iint associated with it for the same inode. In this case we need
>> to delete the ns_status structure and get a new one.
>>
>> There are two cases for freeing:
>> - when the iint is freed (inode deletion): Walk the list of ns_status
>> entries and disconnect each ns_status from the list; take the
>> writer lock to protect access to the list; also, take the item off the
>> per-namespace rbtree
>>
>> - when the ima_namepace is freed: While walking the rbtree, remove the
>> ns_status from the list while also holding the iint's writer lock;
>> to be able to grab the lock we have to have a pointer to the iint on
>> the ns_status structure.
>>
>> To avoid an ns_status to be freed by the two cases concurrently, prevent
>> these two cases to run concurrently. Therefore, groups of threads
>> deleting either inodes or ima_namespaces are allowed to run concurrently
>> but no two threads may run and one delete an inode and the other an
>> ima_namespace.
> The locking involved here is really complex. I'm sure you thought
> about it a lot, but isn't there a better alternative?

I am afraid this is a difficult question and a short and concise answer
is not possible...

The complexity of the locking is driven by concurrency and the data
structures that are involved. The data structures (existing global iint
rbtree, ns_status structure, and per namespace rbtree for ns_status) and
how they are organized and connected (via linked lists) are a
consequence of the fact that we need to be able to handle files shared
between IMA namespaces (and the host) so that re-auditing, re-measuring
and re-appraisal of files after file modifications or modifications of
the security.ima xattr (by any namespaces) can be done efficiently.
Furthermore, it helps to efficiently remove all the status information
that an IMA namespace has kept for files it audited/measured/appraised.
The goal was to make this as scalable as possible by having each
namespace get out of the way of other namespaces by preventing them from
locking each other out too much. The single biggest problem are files
shared between IMA namespaces.

The best argument for the design I can come up with is the 'Big O
notation' describing the time complexity of operations.


The existing global iint rbtree maintains IMA status information for
each inode. Lookup and insertion of data into the gloab iint rbtree  is
O(log(n)), thus optimal.

To accommodate re-auditing/re-measurement/re-appraisal, which is driven
by resetting status flags, I connected a list of ns_status structures,
in which each namespace maintains its status information for each inode,
to the iint maintained in that global rbtree. The resetting of status
flags is fast because traversing the list after a lookup in the tree is
O(n). Lookup + resetting the flags therefore is O(log(n) + n). If the
list didn't exist we would have to search all IMA namespaces for the
inode to be able to reset the flags, resulting in O(n * log(n)) time
complexity, which is of course much worse. So, the list of ns_status
linked to an iint has a good reason: better time complexity to traverse
the list and reset status flags. Beside  that it also supports fast
handling of deletion of files where the iint has to be delete from the
global rbtree and the ns_status list it holds must also be deleted (each
ns_status also needs to be delete from a per IMA-namespace rbtree then)


There's also a per-IMA namespace rbtree for each inode that serves two
purposes:

a) Fast lookup of ns_status (O(log(n)) for an IMA namespace; at least to
insert an ns_status into this tree we need not write-lock the iint tree
but the initial iint creation required the write-locking of the iint tree

b) Maintaining a collection of inodes that the namespace has
audited/measured/appraised for efficient deletion upon IMA namespace
teardown: We can traverse this tree in O(n) time and determine which
iints have no more namespace users and delete them from the iint tree.


Now the dilemma with this is that an ns_status structure is connected to
a list hanging off the iint and on top of this it is part of an rbtree.
And this is where the 'group locking' is coming from. What we need to
prevent is that an ns_status is deleted from its iint list (when a file
is deleted) while it is also deleted from the per-IMA namespace rbtree
(when the namespace is deleted). Both must not be done concurrently.
What is possible is that a group of threads may tear down namespaces and
the other group may act on file deletion, but threads from both groups
must not run concurrently.


Now we can at least look at two alternatives for the per-IMA namespace
rbtree.

1) One alternative is to use a list instead of an rbtree. We would loose
the fast lookup via the per IMA namespace tree and get O(n) lookup times
but quick insertion into the list [O(1)]. We still would have the
collection of inodes. And we would still have the dilemma that an
ns_status would be connected to two lists, thus requiring the group
locking. I don't think using a list instead of an rbtree is a solution.

2) We could try to get rid of the per-IMA namespace rbtree altogether
and just use the global iint rbtree that exists today with a list of
ns_status connected to its iints. If we do this we would loose the
knowledge of which inodes a namespace has an ns_status structure for.
The only way we would find this is by traversing the global iint tree
(O(n)) and follow each iint list (O(m)) to see whether we find an
ns_status holding information about the iint. The time complexity for
this would be O(n*m) but much less than O(n^2). A downside would also be
that we would have to keep a lock on the global iint rbtree while
traversing it, thus locking out those that want to add inodes to the
tree. On the upside it would allow us to get rid of the group locking.
Lookup of an ns_status in the global iint tree would be O(n) + O(m) and
insertion would be O(n) + O(1).


Certainly, the alternative is 2) with its own trade-offs. My guess is
some sort of yielding could probably also be helpful there then to avoid
blocking higher priority operations than deleting of a namespace.


>> +/*
>> + * ima_free_ns_status_tree - free all items on the ns_status_tree and take each
>> + * one off the list; yield to ns_list free'ers
>> + *
>> + * This function is called when an ima_namespace is freed. All entries in the
>> + * rbtree will be taken off their list and collected in a garbage collection
>> + * list and freed at the end. This allows to walk the rbtree again.
>> + */
>> +void ima_free_ns_status_tree(struct ima_namespace *ns)
>> +{
>> + struct ns_status *ns_status, *next;
>> + struct llist_node *node;
>> + LLIST_HEAD(garbage);
>> + unsigned int ctr;
>> + bool restart;
>> +
>> + do {
>> + ctr = 0;
>> + restart = false;
>> +
>> + lock_group(GRP_NS_STATUS_TREE);
>> + write_lock(&ns->ns_tree_lock);
>> +
>> + rbtree_postorder_for_each_entry_safe(ns_status, next,
>> + &ns->ns_status_tree,
>> + rb_node) {
>> + write_lock(&ns_status->iint->ns_list_lock);
>> + if (!list_empty(&ns_status->ns_next)) {
>> + list_del_init(&ns_status->ns_next);
>> + llist_add(&ns_status->gc_llist, &garbage);
>> + ctr++;
>> + }
> At this point when the namespace is being deleted, no entries are being
> added to the rbtree, so it is safe to remove the nodes here. There's
> no need to first create a list and then remove them.


We are traversing the per-namespace rbtree here and later on in this
series (21/27) we try to determine which iints are now unused in the
global iint tree and delete the iints there. Due to the locking that's
necessary to remove the iint from the global rbtree all the ns_status
are collected in this list here already now. True, in this patch here it
is not necessary but in 21/27 it will be necessary to have them on this
list. I had thought about merging the patches but its complex enough as-is.


   Stefan


>
>> + write_unlock(&ns_status->iint->ns_list_lock);
>> +
>> + /*
>> + * After some progress yield to any waiting ns_list
>> + * free'ers.
>> + */
>> + if (atomic_read(&ns_list_waiters) > 0 && ctr >= 5) {
>> + restart = true;
>> + break;
>> + }
> Giving priority to removing entries in the iint cache is important, but
> I wish there was a better alternative.
>
>> + }
>> +
>> + write_unlock(&ns->ns_tree_lock);
>> + unlock_group(GRP_NS_STATUS_TREE);
>> + } while (restart);
>> +
>> + node = llist_del_all(&garbage);
>> + llist_for_each_entry_safe(ns_status, next, node, gc_llist)
>> + ns_status_free(ns, ns_status);
>> +
>> + kmem_cache_destroy(ns->ns_status_cache);
>> +}

2022-02-24 02:50:11

by Stefan Berger

[permalink] [raw]
Subject: Re: [PATCH v10 18/27] integrity/ima: Define ns_status for storing namespaced iint data


On 2/23/22 21:21, Stefan Berger wrote:
>
> On 2/23/22 11:12, Mimi Zohar wrote:
>> On Tue, 2022-02-01 at 15:37 -0500, Stefan Berger wrote:
>>> From: Mehmet Kayaalp <[email protected]>
>>>
>>> Add an rbtree to the IMA namespace structure that stores a namespaced
>>> version of iint->flags in ns_status struct. Similar to the
>>> integrity_iint_cache, both the iint and ns_status are looked up
>>> using the
>>> inode pointer value. The lookup, allocate, and insertion code is also
>>> similar.
>>>
>>> In subsequent patches we will have to find all ns_status entries an
>>> iint
>>> is being used in and reset flags there. To do this, connect a list of
>>> ns_status to the integrity_iint_cache and provide a reader-writer
>>> lock in the integrity_iint_cache to lock access to the list.
>>>
>>> To simplify the code in the non-namespaces case embed an ns_status in
>>> the integrity_iint_cache and have it linked into the iint's
>>> ns_status list
>>> when calling ima_get_ns_status().
>>>
>>> When getting an ns_status first try to find it in the RB tree. Here
>>> we can
>>> run into the situation that an ns_status found in the RB tree has a
>>> different iint associated with it for the same inode. In this case
>>> we need
>>> to delete the ns_status structure and get a new one.
>>>
>>> There are two cases for freeing:
>>> - when the iint is freed (inode deletion): Walk the list of ns_status
>>>    entries and disconnect each ns_status from the list; take the
>>>    writer lock to protect access to the list; also, take the item
>>> off the
>>>    per-namespace rbtree
>>>
>>> - when the ima_namepace is freed: While walking the rbtree, remove the
>>>    ns_status from the list while also holding the iint's writer lock;
>>>    to be able to grab the lock we have to have a pointer to the iint on
>>>    the ns_status structure.
>>>
>>> To avoid an ns_status to be freed by the two cases concurrently,
>>> prevent
>>> these two cases to run concurrently. Therefore, groups of threads
>>> deleting either inodes or ima_namespaces are allowed to run
>>> concurrently
>>> but no two threads may run and one delete an inode and the other an
>>> ima_namespace.
>> The locking involved here is really complex.  I'm sure you thought
>> about it a lot, but isn't there a better alternative?
>
> I am afraid this is a difficult question and a short and concise
> answer is not possible...
>
> The complexity of the locking is driven by concurrency and the data
> structures that are involved. The data structures (existing global
> iint rbtree, ns_status structure, and per namespace rbtree for
> ns_status) and how they are organized and connected (via linked lists)
> are a consequence of the fact that we need to be able to handle files
> shared between IMA namespaces (and the host) so that re-auditing,
> re-measuring and re-appraisal of files after file modifications or
> modifications of the security.ima xattr (by any namespaces) can be
> done efficiently. Furthermore, it helps to efficiently remove all the
> status information that an IMA namespace has kept for files it
> audited/measured/appraised. The goal was to make this as scalable as
> possible by having each namespace get out of the way of other
> namespaces by preventing them from locking each other out too much.
> The single biggest problem are files shared between IMA namespaces.
>
> The best argument for the design I can come up with is the 'Big O
> notation' describing the time complexity of operations.
>
>
> The existing global iint rbtree maintains IMA status information for
> each inode. Lookup and insertion of data into the gloab iint rbtree 
> is O(log(n)), thus optimal.
>
> To accommodate re-auditing/re-measurement/re-appraisal, which is
> driven by resetting status flags, I connected a list of ns_status
> structures, in which each namespace maintains its status information
> for each inode, to the iint maintained in that global rbtree. The
> resetting of status flags is fast because traversing the list after a
> lookup in the tree is O(n). Lookup + resetting the flags therefore is
> O(log(n) + n). If the list didn't exist we would have to search all
> IMA namespaces for the inode to be able to reset the flags, resulting
> in O(n * log(n)) time complexity, which is of course much worse. So,
> the list of ns_status linked to an iint has a good reason: better time
> complexity to traverse the list and reset status flags. Beside  that
> it also supports fast handling of deletion of files where the iint has
> to be delete from the global rbtree and the ns_status list it holds
> must also be deleted (each ns_status also needs to be delete from a
> per IMA-namespace rbtree then)
>
>
> There's also a per-IMA namespace rbtree for each inode that serves two
> purposes:
>
> a) Fast lookup of ns_status (O(log(n)) for an IMA namespace; at least
> to insert an ns_status into this tree we need not write-lock the iint
> tree but the initial iint creation required the write-locking of the
> iint tree
>
> b) Maintaining a collection of inodes that the namespace has
> audited/measured/appraised for efficient deletion upon IMA namespace
> teardown: We can traverse this tree in O(n) time and determine which
> iints have no more namespace users and delete them from the iint tree.
>
>
> Now the dilemma with this is that an ns_status structure is connected
> to a list hanging off the iint and on top of this it is part of an
> rbtree. And this is where the 'group locking' is coming from. What we
> need to prevent is that an ns_status is deleted from its iint list
> (when a file is deleted) while it is also deleted from the per-IMA
> namespace rbtree (when the namespace is deleted). Both must not be
> done concurrently. What is possible is that a group of threads may
> tear down namespaces and the other group may act on file deletion, but
> threads from both groups must not run concurrently.
>
>
> Now we can at least look at two alternatives for the per-IMA namespace
> rbtree.
>
> 1) One alternative is to use a list instead of an rbtree. We would
> loose the fast lookup via the per IMA namespace tree and get O(n)
> lookup times but quick insertion into the list [O(1)]. We still would
> have the collection of inodes. And we would still have the dilemma
> that an ns_status would be connected to two lists, thus requiring the
> group locking. I don't think using a list instead of an rbtree is a
> solution.
>
> 2) We could try to get rid of the per-IMA namespace rbtree altogether
> and just use the global iint rbtree that exists today with a list of
> ns_status connected to its iints. If we do this we would loose the
> knowledge of which inodes a namespace has an ns_status structure for.
> The only way we would find this is by traversing the global iint tree
> (O(n)) and follow each iint list (O(m)) to see whether we find an
> ns_status holding information about the iint. The time complexity for
> this would be O(n*m) but much less than O(n^2). A downside would also
> be that we would have to keep a lock on the global iint rbtree while
> traversing it, thus locking out those that want to add inodes to the
> tree. On the upside it would allow us to get rid of the group locking.
> Lookup of an ns_status in the global iint tree would be O(n) + O(m)
> and insertion would be O(n) + O(1).
>
>
> Certainly, the alternative is 2) with its own trade-offs. My guess is
> some sort of yielding could probably also be helpful there then to
> avoid blocking higher priority operations than deleting of a namespace.


I forgot to mention: It makes a difference if one has to walk the global
iint tree to find the few ns_status for the namespace among possibly
thousands of entries in that tree than having a per-IMA namespace rbtree
that has these few ns_status right there. So walking the iint tree is
more like O(N) versus O(n) walking the per-IMA namespace rbtree.