2018-06-06 17:07:44

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 00/10] audit: implement container identifier

Implement kernel audit container identifier.

This patchset is a third based on the proposal document (V3)
posted:
https://www.redhat.com/archives/linux-audit/2018-January/msg00014.html

The first patch implements the proc fs write to set the audit container
identifier of a process, emitting an AUDIT_CONTAINER_ID record to announce the
registration of that audit container identifier on that process. This patch
requires userspace support for record acceptance and proper type
display.

The second implements the auxiliary record AUDIT_CONTAINER if an
audit container identifier is identifiable with an event. This patch
requires userspace support for proper type display.

The third adds signal and ptrace support.

The 4th creates a local audit context to be able to bind a standalone
record with a locally created auxiliary record.

The 5th patch adds audit container identifier records to the tty
standalone record.

The 6th adds audit container identifier filtering to the exit,
exclude and user lists. This patch adds the AUDIT_CONTID field and
requires auditctl userspace support for the --contid option.

The 7th adds network namespace audit container identifier labelling
based on member tasks' audit container identifier labels.

The 8th adds audit container identifier support to standalone netfilter
records that don't have a task context and lists each container to which
that net namespace belongs.

The 9th implements reading the audit container identifier from the proc
filesystem for debugging. This patch isn't planned for upstream
inclusion.

The 10th fixes an rfkill spelling mistake in a comment that is otherwise
the only match for "contid" in the kernel.


Example: Set an audit container identifier of 123456 to the "sleep" task:

sleep 2&
child=$!
echo 123456 > /proc/$child/audit_containerid; echo $?
ausearch -ts recent -m container
echo child:$child contid:$( cat /proc/$child/audit_containerid)

This should produce a record such as:

type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes


Example: Set a filter on an audit container identifier 123459 on /tmp/tmpcontainerid:

contid=123459
key=tmpcontainerid
auditctl -a exit,always -F dir=/tmp -F perm=wa -F contid=$contid -F key=$key
perl -e "sleep 1; open(my \$tmpfile, '>', \"/tmp/$key\"); close(\$tmpfile);" &
child=$!
echo $contid > /proc/$child/audit_containerid
sleep 2
ausearch -i -ts recent -k $key
auditctl -d exit,always -F dir=/tmp -F perm=wa -F contid=$contid -F key=$key
rm -f /tmp/$key

This should produce an event such as:

type=CONTAINER msg=audit(2018-06-06 12:46:31.707:26953) : op=task contid=123459
type=PROCTITLE msg=audit(2018-06-06 12:46:31.707:26953) : proctitle=perl -e sleep 1; open(my $tmpfile, '>', "/tmp/tmpcontainerid"); close($tmpfile);
type=PATH msg=audit(2018-06-06 12:46:31.707:26953) : item=1 name=/tmp/tmpcontainerid inode=25656 dev=00:26 mode=file,644 ouid=root ogid=root rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE cap_fp=none cap_fi=none cap_fe=0 cap_fver=0
type=PATH msg=audit(2018-06-06 12:46:31.707:26953) : item=0 name=/tmp/ inode=8985 dev=00:26 mode=dir,sticky,777 ouid=root ogid=root rdev=00:00 obj=system_u:object_r:tmp_t:s0 nametype=PARENT cap_fp=none cap_fi=none cap_fe=0 cap_fver=0
type=CWD msg=audit(2018-06-06 12:46:31.707:26953) : cwd=/root
type=SYSCALL msg=audit(2018-06-06 12:46:31.707:26953) : arch=x86_64 syscall=openat success=yes exit=3 a0=0xffffffffffffff9c a1=0x5621f2b81900 a2=O_WRONLY|O_CREAT|O_TRUNC a3=0x1b6 items=2 ppid=628 pid=2232 auid=root uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=ttyS0 ses=1 comm=perl exe=/usr/bin/perl subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=tmpcontainerid


Requires: https://github.com/linux-audit/audit-kernel/issues/81
See: https://github.com/linux-audit/audit-kernel/issues/90
See: https://github.com/linux-audit/audit-userspace/issues/40
See: https://github.com/linux-audit/audit-testsuite/issues/64
See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID

Changelog:

v3
- switched from containerid in task_struct to audit_task_info (depends on ghak81)
- drop INVALID_CID in favour of only AUDIT_CID_UNSET
- check for !audit_task_info, throw -ENOPROTOOPT on set
- changed -EPERM to -EEXIST for parent check
- return AUDIT_CID_UNSET if !audit_enabled
- squash child/thread check patch into AUDIT_CONTAINER_ID patch
- changed -EPERM to -EBUSY for child check
- separate child and thread checks, use -EALREADY for latter
- move addition of op= from ptrace/signal patch to AUDIT_CONTAINER patch
- fix && to || bashism in ptrace/signal patch
- uninline and export function for audit_free_context()
- drop CONFIG_CHANGE, FEATURE_CHANGE, ANOM_ABEND, ANOM_SECCOMP patches
- move audit_enabled check (xt_AUDIT)
- switched from containerid list in struct net to net_generic's struct audit_net
- move containerid list iteration into audit (xt_AUDIT)
- create function to move namespace switch into audit
- switched /proc/PID/ entry from containerid to audit_containerid
- call kzalloc with GFP_ATOMIC on in_atomic() in audit_alloc_context()
- call kzalloc with GFP_ATOMIC on in_atomic() in audit_log_container_info()
- use xt_net(par) instead of sock_net(skb->sk) to get net
- switched record and field names: initial CONTAINER_ID, aux CONTAINER, field CONTID
- allow to set own contid
- open code audit_set_containerid
- add contid inherited flag
- ccontainerid and pcontainerid eliminated due to inherited flag
- change name of container list funcitons
- rename containerid to contid
- convert initial container record to syscall aux
- fix spelling mistake of contidion in net/rfkill/core.c to avoid contid name collision

v2
- add check for children and threads
- add network namespace container identifier list
- add NETFILTER_PKT audit container identifier logging
- patch description and documentation clean-up and example
- reap unused ppid

Richard Guy Briggs (10):
audit: add container id
audit: log container info of syscalls
audit: add containerid support for ptrace and signals
audit: add support for non-syscall auxiliary records
audit: add containerid support for tty_audit
audit: add containerid filtering
audit: add support for containerid to network namespaces
audit: NETFILTER_PKT: record each container ID associated with a netNS
debug audit: read container ID of a process
rfkill: fix spelling mistake contidion to condition

drivers/tty/tty_audit.c | 5 +-
fs/proc/base.c | 53 +++++++++++++++++++
include/linux/audit.h | 68 ++++++++++++++++++++++++
include/uapi/linux/audit.h | 8 ++-
kernel/audit.c | 114 ++++++++++++++++++++++++++++++++++++++++
kernel/audit.h | 3 ++
kernel/auditfilter.c | 47 +++++++++++++++++
kernel/auditsc.c | 128 ++++++++++++++++++++++++++++++++++++++++++---
kernel/nsproxy.c | 4 ++
net/netfilter/xt_AUDIT.c | 12 ++++-
net/rfkill/core.c | 2 +-
11 files changed, 433 insertions(+), 11 deletions(-)

--
1.8.3.1



2018-06-06 17:02:12

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 03/10] audit: add containerid support for ptrace and signals

Add audit container identifier support to ptrace and signals. In
particular, the "op" field provides a way to label the auxiliary record
to which it is associated.

Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 11 +++++------
kernel/audit.c | 13 +++++++------
kernel/audit.h | 2 ++
kernel/auditsc.c | 21 ++++++++++++++++-----
4 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 4e1e34e..ab50985 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -34,6 +34,7 @@ struct audit_sig_info {
uid_t uid;
pid_t pid;
char ctx[0];
+ u64 cid;
};

struct audit_buffer;
@@ -152,9 +153,8 @@ extern void audit_log_key(struct audit_buffer *ab,
extern int audit_log_task_context(struct audit_buffer *ab);
extern void audit_log_task_info(struct audit_buffer *ab,
struct task_struct *tsk);
-extern int audit_log_contid(struct task_struct *tsk,
- struct audit_context *context,
- char *op);
+extern int audit_log_contid(struct audit_context *context,
+ char *op, u64 contid);

extern int audit_update_lsm_rules(void);

@@ -205,9 +205,8 @@ static inline int audit_log_task_context(struct audit_buffer *ab)
static inline void audit_log_task_info(struct audit_buffer *ab,
struct task_struct *tsk)
{ }
-static inline int audit_log_contid(struct task_struct *tsk,
- struct audit_context *context,
- char *op)
+static inline int audit_log_contid(struct audit_context *context,
+ char *op, u64 contid)
{ }
#define audit_enabled 0
#endif /* CONFIG_AUDIT */
diff --git a/kernel/audit.c b/kernel/audit.c
index 5e150c6..ba304a8 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -142,6 +142,7 @@ struct audit_net {
kuid_t audit_sig_uid = INVALID_UID;
pid_t audit_sig_pid = -1;
u32 audit_sig_sid = 0;
+u64 audit_sig_cid = AUDIT_CID_UNSET;

/* Records can be lost in several ways:
0) [suppressed in audit_alloc]
@@ -1437,6 +1438,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
memcpy(sig_data->ctx, ctx, len);
security_release_secctx(ctx, len);
}
+ sig_data->cid = audit_sig_cid;
audit_send_reply(skb, seq, AUDIT_SIGNAL_INFO, 0, 0,
sig_data, sizeof(*sig_data) + len);
kfree(sig_data);
@@ -2050,23 +2052,22 @@ void audit_log_session_info(struct audit_buffer *ab)

/*
* audit_log_contid - report container info
- * @tsk: task to be recorded
* @context: task or local context for record
* @op: contid string description
+ * @contid: container ID to report
*/
-int audit_log_contid(struct task_struct *tsk,
- struct audit_context *context, char *op)
+int audit_log_contid(struct audit_context *context,
+ char *op, u64 contid)
{
struct audit_buffer *ab;

- if (!audit_contid_set(tsk))
+ if (!cid_valid(contid))
return 0;
/* Generate AUDIT_CONTAINER record with container ID */
ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
if (!ab)
return -ENOMEM;
- audit_log_format(ab, "op=%s contid=%llu",
- op, audit_get_contid(tsk));
+ audit_log_format(ab, "op=%s contid=%llu", op, contid);
audit_log_end(ab);
return 0;
}
diff --git a/kernel/audit.h b/kernel/audit.h
index 214e149..1cf1c35 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -147,6 +147,7 @@ struct audit_context {
kuid_t target_uid;
unsigned int target_sessionid;
u32 target_sid;
+ u64 target_cid;
char target_comm[TASK_COMM_LEN];

struct audit_tree_refs *trees, *first_trees;
@@ -329,6 +330,7 @@ extern void audit_log_d_path_exe(struct audit_buffer *ab,
extern pid_t audit_sig_pid;
extern kuid_t audit_sig_uid;
extern u32 audit_sig_sid;
+extern u64 audit_sig_cid;

extern int audit_filter(int msgtype, unsigned int listtype);

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index a3c946c..cface9d 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -113,6 +113,7 @@ struct audit_aux_data_pids {
kuid_t target_uid[AUDIT_AUX_PIDS];
unsigned int target_sessionid[AUDIT_AUX_PIDS];
u32 target_sid[AUDIT_AUX_PIDS];
+ u64 target_cid[AUDIT_AUX_PIDS];
char target_comm[AUDIT_AUX_PIDS][TASK_COMM_LEN];
int pid_count;
};
@@ -1456,21 +1457,27 @@ static void audit_log_exit(struct audit_context *context, struct task_struct *ts
for (aux = context->aux_pids; aux; aux = aux->next) {
struct audit_aux_data_pids *axs = (void *)aux;

- for (i = 0; i < axs->pid_count; i++)
+ for (i = 0; i < axs->pid_count; i++) {
+ char axsn[sizeof("aux0xN ")];
+
+ sprintf(axsn, "aux0x%x", i);
if (audit_log_pid_context(context, axs->target_pid[i],
axs->target_auid[i],
axs->target_uid[i],
axs->target_sessionid[i],
axs->target_sid[i],
- axs->target_comm[i]))
+ axs->target_comm[i])
+ || audit_log_contid(context, axsn, axs->target_cid[i]))
call_panic = 1;
+ }
}

if (context->target_pid &&
- audit_log_pid_context(context, context->target_pid,
+ (audit_log_pid_context(context, context->target_pid,
context->target_auid, context->target_uid,
context->target_sessionid,
- context->target_sid, context->target_comm))
+ context->target_sid, context->target_comm)
+ || audit_log_contid(context, "target", context->target_cid)))
call_panic = 1;

if (context->pwd.dentry && context->pwd.mnt) {
@@ -1490,7 +1497,7 @@ static void audit_log_exit(struct audit_context *context, struct task_struct *ts

audit_log_proctitle(tsk, context);

- audit_log_contid(tsk, context, "task");
+ audit_log_contid(context, "task", audit_get_contid(tsk));

/* Send end of event record to help user space know we are finished */
ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE);
@@ -2375,6 +2382,7 @@ void __audit_ptrace(struct task_struct *t)
context->target_uid = task_uid(t);
context->target_sessionid = audit_get_sessionid(t);
security_task_getsecid(t, &context->target_sid);
+ context->target_cid = audit_get_contid(t);
memcpy(context->target_comm, t->comm, TASK_COMM_LEN);
}

@@ -2402,6 +2410,7 @@ int audit_signal_info(int sig, struct task_struct *t)
else
audit_sig_uid = uid;
security_task_getsecid(current, &audit_sig_sid);
+ audit_sig_cid = audit_get_contid(current);
}

if (!audit_signals || audit_dummy_context())
@@ -2415,6 +2424,7 @@ int audit_signal_info(int sig, struct task_struct *t)
ctx->target_uid = t_uid;
ctx->target_sessionid = audit_get_sessionid(t);
security_task_getsecid(t, &ctx->target_sid);
+ ctx->target_cid = audit_get_contid(t);
memcpy(ctx->target_comm, t->comm, TASK_COMM_LEN);
return 0;
}
@@ -2436,6 +2446,7 @@ int audit_signal_info(int sig, struct task_struct *t)
axp->target_uid[axp->pid_count] = t_uid;
axp->target_sessionid[axp->pid_count] = audit_get_sessionid(t);
security_task_getsecid(t, &axp->target_sid[axp->pid_count]);
+ axp->target_cid[axp->pid_count] = audit_get_contid(t);
memcpy(axp->target_comm[axp->pid_count], t->comm, TASK_COMM_LEN);
axp->pid_count++;

--
1.8.3.1


2018-06-06 17:02:18

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records

Standalone audit records have the timestamp and serial number generated
on the fly and as such are unique, making them standalone. This new
function audit_alloc_local() generates a local audit context that will
be used only for a standalone record and its auxiliary record(s). The
context is discarded immediately after the local associated records are
produced.

Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 8 ++++++++
kernel/auditsc.c | 25 +++++++++++++++++++++++--
2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index ab50985..f549121 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -232,7 +232,9 @@ struct audit_task_info {
extern struct audit_task_info init_struct_audit;
extern void __init audit_task_init(void);
extern int audit_alloc(struct task_struct *task);
+extern struct audit_context *audit_alloc_local(void);
extern void audit_free(struct task_struct *task);
+extern void audit_free_context(struct audit_context *context);
extern void __audit_syscall_entry(int major, unsigned long a0, unsigned long a1,
unsigned long a2, unsigned long a3);
extern void __audit_syscall_exit(int ret_success, long ret_value);
@@ -493,6 +495,12 @@ static inline int audit_alloc(struct task_struct *task)
{
return 0;
}
+static inline struct audit_context *audit_alloc_local(void)
+{
+ return NULL;
+}
+static inline void audit_free_context(struct audit_context *context)
+{ }
static inline void audit_free(struct task_struct *task)
{ }
static inline void audit_syscall_entry(int major, unsigned long a0,
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index cface9d..81c9765 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -916,8 +916,11 @@ static inline void audit_free_aux(struct audit_context *context)
static inline struct audit_context *audit_alloc_context(enum audit_state state)
{
struct audit_context *context;
+ gfp_t gfpflags;

- context = kzalloc(sizeof(*context), GFP_KERNEL);
+ /* We can be called in atomic context via audit_tg() */
+ gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
+ context = kzalloc(sizeof(*context), gfpflags);
if (!context)
return NULL;
context->state = state;
@@ -993,8 +996,26 @@ struct audit_task_info init_struct_audit = {
.ctx = NULL,
};

-static inline void audit_free_context(struct audit_context *context)
+struct audit_context *audit_alloc_local(void)
{
+ struct audit_context *context;
+
+ if (!audit_ever_enabled)
+ return NULL; /* Return if not auditing. */
+
+ context = audit_alloc_context(AUDIT_RECORD_CONTEXT);
+ if (!context)
+ return NULL;
+ context->serial = audit_serial();
+ context->ctime = current_kernel_time64();
+ context->in_syscall = 1;
+ return context;
+}
+
+void audit_free_context(struct audit_context *context)
+{
+ if (!context)
+ return;
audit_free_names(context);
unroll_tree_refs(context, NULL, 0);
free_tree_refs(context);
--
1.8.3.1


2018-06-06 17:02:49

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 06/10] audit: add containerid filtering

Implement audit container identifier filtering using the AUDIT_CONTID
field name to send an 8-character string representing a u64 since the
value field is only u32.

Sending it as two u32 was considered, but gathering and comparing two
fields was more complex.

The feature indicator is AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER.

See: https://github.com/linux-audit/audit-kernel/issues/91
See: https://github.com/linux-audit/audit-userspace/issues/40
See: https://github.com/linux-audit/audit-testsuite/issues/64
See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 1 +
include/uapi/linux/audit.h | 5 ++++-
kernel/audit.h | 1 +
kernel/auditfilter.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++
kernel/auditsc.c | 3 +++
5 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index f549121..1e37abf 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -76,6 +76,7 @@ struct audit_field {
u32 type;
union {
u32 val;
+ u64 val64;
kuid_t uid;
kgid_t gid;
struct {
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 469ab25..b440558 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -262,6 +262,7 @@
#define AUDIT_LOGINUID_SET 24
#define AUDIT_SESSIONID 25 /* Session ID */
#define AUDIT_FSTYPE 26 /* FileSystem Type */
+#define AUDIT_CONTID 27 /* Container ID */

/* These are ONLY useful when checking
* at syscall exit time (AUDIT_AT_EXIT). */
@@ -342,6 +343,7 @@ enum {
#define AUDIT_FEATURE_BITMAP_SESSIONID_FILTER 0x00000010
#define AUDIT_FEATURE_BITMAP_LOST_RESET 0x00000020
#define AUDIT_FEATURE_BITMAP_FILTER_FS 0x00000040
+#define AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER 0x00000080

#define AUDIT_FEATURE_BITMAP_ALL (AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT | \
AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME | \
@@ -349,7 +351,8 @@ enum {
AUDIT_FEATURE_BITMAP_EXCLUDE_EXTEND | \
AUDIT_FEATURE_BITMAP_SESSIONID_FILTER | \
AUDIT_FEATURE_BITMAP_LOST_RESET | \
- AUDIT_FEATURE_BITMAP_FILTER_FS)
+ AUDIT_FEATURE_BITMAP_FILTER_FS | \
+ AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER)

/* deprecated: AUDIT_VERSION_* */
#define AUDIT_VERSION_LATEST AUDIT_FEATURE_BITMAP_ALL
diff --git a/kernel/audit.h b/kernel/audit.h
index 1cf1c35..743d445 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -235,6 +235,7 @@ static inline int audit_hash_ino(u32 ino)

extern int audit_match_class(int class, unsigned syscall);
extern int audit_comparator(const u32 left, const u32 op, const u32 right);
+extern int audit_comparator64(const u64 left, const u32 op, const u64 right);
extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right);
extern int audit_gid_comparator(kgid_t left, u32 op, kgid_t right);
extern int parent_len(const char *path);
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index eaa3201..a5f60ce 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -410,6 +410,7 @@ static int audit_field_valid(struct audit_entry *entry, struct audit_field *f)
/* FALL THROUGH */
case AUDIT_ARCH:
case AUDIT_FSTYPE:
+ case AUDIT_CONTID:
if (f->op != Audit_not_equal && f->op != Audit_equal)
return -EINVAL;
break;
@@ -584,6 +585,14 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
}
entry->rule.exe = audit_mark;
break;
+ case AUDIT_CONTID:
+ if (f->val != sizeof(u64))
+ goto exit_free;
+ str = audit_unpack_string(&bufp, &remain, f->val);
+ if (IS_ERR(str))
+ goto exit_free;
+ f->val64 = ((u64 *)str)[0];
+ break;
}
}

@@ -666,6 +675,11 @@ static struct audit_rule_data *audit_krule_to_data(struct audit_krule *krule)
data->buflen += data->values[i] =
audit_pack_string(&bufp, audit_mark_path(krule->exe));
break;
+ case AUDIT_CONTID:
+ data->buflen += data->values[i] = sizeof(u64);
+ for (i = 0; i < sizeof(u64); i++)
+ ((char *)bufp)[i] = ((char *)&f->val64)[i];
+ break;
case AUDIT_LOGINUID_SET:
if (krule->pflags & AUDIT_LOGINUID_LEGACY && !f->val) {
data->fields[i] = AUDIT_LOGINUID;
@@ -752,6 +766,10 @@ static int audit_compare_rule(struct audit_krule *a, struct audit_krule *b)
if (!gid_eq(a->fields[i].gid, b->fields[i].gid))
return 1;
break;
+ case AUDIT_CONTID:
+ if (a->fields[i].val64 != b->fields[i].val64)
+ return 1;
+ break;
default:
if (a->fields[i].val != b->fields[i].val)
return 1;
@@ -1208,6 +1226,31 @@ int audit_comparator(u32 left, u32 op, u32 right)
}
}

+int audit_comparator64(u64 left, u32 op, u64 right)
+{
+ switch (op) {
+ case Audit_equal:
+ return (left == right);
+ case Audit_not_equal:
+ return (left != right);
+ case Audit_lt:
+ return (left < right);
+ case Audit_le:
+ return (left <= right);
+ case Audit_gt:
+ return (left > right);
+ case Audit_ge:
+ return (left >= right);
+ case Audit_bitmask:
+ return (left & right);
+ case Audit_bittest:
+ return ((left & right) == right);
+ default:
+ BUG();
+ return 0;
+ }
+}
+
int audit_uid_comparator(kuid_t left, u32 op, kuid_t right)
{
switch (op) {
@@ -1346,6 +1389,10 @@ int audit_filter(int msgtype, unsigned int listtype)
result = audit_comparator(audit_loginuid_set(current),
f->op, f->val);
break;
+ case AUDIT_CONTID:
+ result = audit_comparator64(audit_get_contid(current),
+ f->op, f->val64);
+ break;
case AUDIT_MSGTYPE:
result = audit_comparator(msgtype, f->op, f->val);
break;
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 81c9765..ea1ee35 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -622,6 +622,9 @@ static int audit_filter_rules(struct task_struct *tsk,
case AUDIT_LOGINUID_SET:
result = audit_comparator(audit_loginuid_set(tsk), f->op, f->val);
break;
+ case AUDIT_CONTID:
+ result = audit_comparator64(audit_get_contid(tsk), f->op, f->val64);
+ break;
case AUDIT_SUBJ_USER:
case AUDIT_SUBJ_ROLE:
case AUDIT_SUBJ_TYPE:
--
1.8.3.1


2018-06-06 17:02:53

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces

Audit events could happen in a network namespace outside of a task
context due to packets received from the net that trigger an auditing
rule prior to being associated with a running task. The network
namespace could in use by multiple containers by association to the
tasks in that network namespace. We still want a way to attribute
these events to any potential containers. Keep a list per network
namespace to track these audit container identifiiers.

Add/increment the audit container identifier on:
- initial setting of the audit container identifier via /proc
- clone/fork call that inherits an audit container identifier
- unshare call that inherits an audit container identifier
- setns call that inherits an audit container identifier
Delete/decrement the audit container identifier on:
- an inherited audit container identifier dropped when child set
- process exit
- unshare call that drops a net namespace
- setns call that drops a net namespace

See: https://github.com/linux-audit/audit-kernel/issues/92
See: https://github.com/linux-audit/audit-testsuite/issues/64
See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 23 ++++++++++++++++
kernel/audit.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++
kernel/auditsc.c | 5 ++++
kernel/nsproxy.c | 4 +++
4 files changed, 104 insertions(+)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 1e37abf..7e2e51c 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -26,6 +26,7 @@
#include <linux/sched.h>
#include <linux/ptrace.h>
#include <uapi/linux/audit.h>
+#include <linux/refcount.h>

#define AUDIT_INO_UNSET ((unsigned long)-1)
#define AUDIT_DEV_UNSET ((dev_t)-1)
@@ -87,6 +88,12 @@ struct audit_field {
u32 op;
};

+struct audit_contid {
+ struct list_head list;
+ u64 id;
+ refcount_t refcount;
+};
+
extern int is_audit_feature_set(int which);

extern int __init audit_register_class(int class, unsigned *list);
@@ -156,6 +163,10 @@ extern void audit_log_task_info(struct audit_buffer *ab,
struct task_struct *tsk);
extern int audit_log_contid(struct audit_context *context,
char *op, u64 contid);
+extern struct list_head *audit_get_contid_list(const struct net *net);
+extern void audit_contid_add(struct net *net, u64 contid);
+extern void audit_contid_del(struct net *net, u64 contid);
+extern void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p);

extern int audit_update_lsm_rules(void);

@@ -209,6 +220,18 @@ static inline void audit_log_task_info(struct audit_buffer *ab,
static inline int audit_log_contid(struct audit_context *context,
char *op, u64 contid)
{ }
+static inline struct list_head *audit_get_contid_list(const struct net *net)
+{
+ static LIST_HEAD(list);
+ return &list;
+}
+static inline void audit_contid_add(struct net *net, u64 contid)
+{ }
+static inline void audit_contid_del(struct net *net, u64 contid)
+{ }
+static inline void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
+{ }
+
#define audit_enabled 0
#endif /* CONFIG_AUDIT */

diff --git a/kernel/audit.c b/kernel/audit.c
index ba304a8..ecd2de4 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -106,6 +106,7 @@
*/
struct audit_net {
struct sock *sk;
+ struct list_head contid_list;
};

/**
@@ -311,6 +312,76 @@ static struct sock *audit_get_sk(const struct net *net)
return aunet->sk;
}

+/**
+ * audit_get_contid_list - Return the audit container ID list for the given network namespace
+ * @net: the destination network namespace
+ *
+ * Description:
+ * Returns the list pointer if valid, NULL otherwise. The caller must ensure
+ * that a reference is held for the network namespace while the sock is in use.
+ */
+struct list_head *audit_get_contid_list(const struct net *net)
+{
+ struct audit_net *aunet = net_generic(net, audit_net_id);
+
+ return &aunet->contid_list;
+}
+
+void audit_contid_add(struct net *net, u64 contid)
+{
+ struct list_head *contid_list = audit_get_contid_list(net);
+ struct audit_contid *cont;
+
+ if (!cid_valid(contid))
+ return;
+ if (!list_empty(contid_list))
+ list_for_each_entry(cont, contid_list, list)
+ if (cont->id == contid) {
+ refcount_inc(&cont->refcount);
+ return;
+ }
+ cont = kmalloc(sizeof(struct audit_contid), GFP_KERNEL);
+ if (!cont)
+ return;
+ INIT_LIST_HEAD(&cont->list);
+ cont->id = contid;
+ refcount_set(&cont->refcount, 1);
+ list_add(&cont->list, contid_list);
+}
+
+void audit_contid_del(struct net *net, u64 contid)
+{
+ struct list_head *contid_list = audit_get_contid_list(net);
+ struct audit_contid *cont = NULL;
+ int found = 0;
+
+ if (!cid_valid(contid))
+ return;
+ if (!list_empty(contid_list))
+ list_for_each_entry(cont, contid_list, list)
+ if (cont->id == contid) {
+ found = 1;
+ break;
+ }
+ if (!found)
+ return;
+ list_del(&cont->list);
+ if (refcount_dec_and_test(&cont->refcount))
+ kfree(cont);
+}
+
+void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
+{
+ u64 contid = audit_get_contid(p);
+ struct nsproxy *new = p->nsproxy;
+
+ if (!cid_valid(contid))
+ return;
+ audit_contid_del(ns->net_ns, contid);
+ if (new)
+ audit_contid_add(new->net_ns, contid);
+}
+
void audit_panic(const char *message)
{
switch (audit_failure) {
@@ -1550,6 +1621,7 @@ static int __net_init audit_net_init(struct net *net)
return -ENOMEM;
}
aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
+ INIT_LIST_HEAD(&aunet->contid_list);

return 0;
}
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index ea1ee35..6ab5e5e 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -75,6 +75,7 @@
#include <linux/uaccess.h>
#include <linux/fsnotify_backend.h>
#include <uapi/linux/limits.h>
+#include <net/net_namespace.h>

#include "audit.h"

@@ -2165,6 +2166,7 @@ int audit_set_contid(struct task_struct *task, u64 contid)
uid_t uid;
struct tty_struct *tty;
char comm[sizeof(current->comm)];
+ struct net *net = task->nsproxy->net_ns;

/* Can't set if audit disabled */
if (!task->audit)
@@ -2185,10 +2187,13 @@ int audit_set_contid(struct task_struct *task, u64 contid)
else if (cid_valid(oldcontid) && !task->audit->inherited)
rc = -EEXIST;
if (!rc) {
+ if (cid_valid(oldcontid))
+ audit_contid_del(net, oldcontid);
task_lock(task);
task->audit->contid = contid;
task->audit->inherited = false;
task_unlock(task);
+ audit_contid_add(net, contid);
}

if (!audit_enabled)
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index f6c5d33..dcb69fe 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -27,6 +27,7 @@
#include <linux/syscalls.h>
#include <linux/cgroup.h>
#include <linux/perf_event.h>
+#include <linux/audit.h>

static struct kmem_cache *nsproxy_cachep;

@@ -140,6 +141,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
struct nsproxy *old_ns = tsk->nsproxy;
struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns);
struct nsproxy *new_ns;
+ u64 contid = audit_get_contid(tsk);

if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
CLONE_NEWPID | CLONE_NEWNET |
@@ -167,6 +169,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
return PTR_ERR(new_ns);

tsk->nsproxy = new_ns;
+ audit_contid_add(new_ns->net_ns, contid);
return 0;
}

@@ -224,6 +227,7 @@ void switch_task_namespaces(struct task_struct *p, struct nsproxy *new)
ns = p->nsproxy;
p->nsproxy = new;
task_unlock(p);
+ audit_switch_task_namespaces(ns, p);

if (ns && atomic_dec_and_test(&ns->count))
free_nsproxy(ns);
--
1.8.3.1


2018-06-06 17:03:05

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 09/10] debug audit: read container ID of a process

Add support for reading the audit container identifier from the proc
filesystem.

This is a read from the proc entry of the form
/proc/PID/audit_containerid where PID is the process ID of the task
whose audit container identifier is sought.

The read expects up to a u64 value (unset: 18446744073709551615).

Signed-off-by: Richard Guy Briggs <[email protected]>
---
fs/proc/base.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 318dff4..ca8bfe2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1303,6 +1303,21 @@ static ssize_t proc_sessionid_read(struct file * file, char __user * buf,
.llseek = generic_file_llseek,
};

+static ssize_t proc_contid_read(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct inode *inode = file_inode(file);
+ struct task_struct *task = get_proc_task(inode);
+ ssize_t length;
+ char tmpbuf[TMPBUFLEN*2];
+
+ if (!task)
+ return -ESRCH;
+ length = scnprintf(tmpbuf, TMPBUFLEN*2, "%llu", audit_get_contid(task));
+ put_task_struct(task);
+ return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
+}
+
static ssize_t proc_contid_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
@@ -1333,6 +1348,7 @@ static ssize_t proc_contid_write(struct file *file, const char __user *buf,
}

static const struct file_operations proc_contid_operations = {
+ .read = proc_contid_read,
.write = proc_contid_write,
.llseek = generic_file_llseek,
};
@@ -3030,7 +3046,7 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
#ifdef CONFIG_AUDITSYSCALL
REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
REG("sessionid", S_IRUGO, proc_sessionid_operations),
- REG("audit_containerid", S_IWUSR, proc_contid_operations),
+ REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations),
#endif
#ifdef CONFIG_FAULT_INJECTION
REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
@@ -3422,7 +3438,7 @@ static int proc_tid_comm_permission(struct inode *inode, int mask)
#ifdef CONFIG_AUDITSYSCALL
REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
REG("sessionid", S_IRUGO, proc_sessionid_operations),
- REG("audit_containerid", S_IWUSR, proc_contid_operations),
+ REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations),
#endif
#ifdef CONFIG_FAULT_INJECTION
REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
--
1.8.3.1


2018-06-06 17:03:12

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 10/10] rfkill: fix spelling mistake contidion to condition

Signed-off-by: Richard Guy Briggs <[email protected]>
---
net/rfkill/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rfkill/core.c b/net/rfkill/core.c
index 59d0eb9..e89a009 100644
--- a/net/rfkill/core.c
+++ b/net/rfkill/core.c
@@ -494,7 +494,7 @@ void rfkill_remove_epo_lock(void)
/**
* rfkill_is_epo_lock_active - returns true EPO is active
*
- * Returns 0 (false) if there is NOT an active EPO contidion,
+ * Returns 0 (false) if there is NOT an active EPO condition,
* and 1 (true) if there is an active EPO contition, which
* locks all radios in one of the BLOCKED states.
*
--
1.8.3.1


2018-06-06 17:04:40

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS

Add audit container identifier auxiliary record(s) to NETFILTER_PKT
event standalone records. Iterate through all potential audit container
identifiers associated with a network namespace.

Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 5 +++++
kernel/audit.c | 20 +++++++++++++++++++-
kernel/auditsc.c | 2 ++
net/netfilter/xt_AUDIT.c | 12 ++++++++++--
4 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 7e2e51c..4560a4e 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context *context,
extern void audit_contid_add(struct net *net, u64 contid);
extern void audit_contid_del(struct net *net, u64 contid);
extern void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p);
+extern void audit_log_contid_list(struct net *net,
+ struct audit_context *context);

extern int audit_update_lsm_rules(void);

@@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net, u64 contid)
{ }
static inline void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
{ }
+static inline void audit_log_contid_list(struct net *net,
+ struct audit_context *context)
+{ }

#define audit_enabled 0
#endif /* CONFIG_AUDIT */
diff --git a/kernel/audit.c b/kernel/audit.c
index ecd2de4..8cca41a 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
audit_contid_add(new->net_ns, contid);
}

+void audit_log_contid_list(struct net *net, struct audit_context *context)
+{
+ struct audit_contid *cont;
+ int i = 0;
+
+ list_for_each_entry(cont, audit_get_contid_list(net), list) {
+ char buf[14];
+
+ sprintf(buf, "net%u", i++);
+ audit_log_contid(context, buf, cont->id);
+ }
+}
+EXPORT_SYMBOL(audit_log_contid_list);
+
void audit_panic(const char *message)
{
switch (audit_failure) {
@@ -2132,17 +2146,21 @@ int audit_log_contid(struct audit_context *context,
char *op, u64 contid)
{
struct audit_buffer *ab;
+ gfp_t gfpflags;

if (!cid_valid(contid))
return 0;
+ /* We can be called in atomic context via audit_tg() */
+ gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
/* Generate AUDIT_CONTAINER record with container ID */
- ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
+ ab = audit_log_start(context, gfpflags, AUDIT_CONTAINER);
if (!ab)
return -ENOMEM;
audit_log_format(ab, "op=%s contid=%llu", op, contid);
audit_log_end(ab);
return 0;
}
+EXPORT_SYMBOL(audit_log_contid);

void audit_log_key(struct audit_buffer *ab, char *key)
{
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 6ab5e5e..e2a16d2 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1015,6 +1015,7 @@ struct audit_context *audit_alloc_local(void)
context->in_syscall = 1;
return context;
}
+EXPORT_SYMBOL(audit_alloc_local);

void audit_free_context(struct audit_context *context)
{
@@ -1029,6 +1030,7 @@ void audit_free_context(struct audit_context *context)
audit_proctitle_free(context);
kfree(context);
}
+EXPORT_SYMBOL(audit_free_context);

static int audit_log_pid_context(struct audit_context *context, pid_t pid,
kuid_t auid, kuid_t uid, unsigned int sessionid,
diff --git a/net/netfilter/xt_AUDIT.c b/net/netfilter/xt_AUDIT.c
index f368ee6..10d2707 100644
--- a/net/netfilter/xt_AUDIT.c
+++ b/net/netfilter/xt_AUDIT.c
@@ -71,10 +71,13 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)
{
struct audit_buffer *ab;
int fam = -1;
+ struct audit_context *context;
+ struct net *net;

if (audit_enabled == 0)
- goto errout;
- ab = audit_log_start(NULL, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
+ goto out;
+ context = audit_alloc_local();
+ ab = audit_log_start(context, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
if (ab == NULL)
goto errout;

@@ -104,7 +107,12 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)

audit_log_end(ab);

+ net = xt_net(par);
+ audit_log_contid_list(net, context);
+
errout:
+ audit_free_context(context);
+out:
return XT_CONTINUE;
}

--
1.8.3.1


2018-06-06 17:05:51

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 05/10] audit: add containerid support for tty_audit

Add audit container identifier auxiliary record to tty logging rule
event standalone records.

Signed-off-by: Richard Guy Briggs <[email protected]>
---
drivers/tty/tty_audit.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c
index e30aa6b..66bd850 100644
--- a/drivers/tty/tty_audit.c
+++ b/drivers/tty/tty_audit.c
@@ -66,8 +66,9 @@ static void tty_audit_log(const char *description, dev_t dev,
uid_t uid = from_kuid(&init_user_ns, task_uid(tsk));
uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk));
unsigned int sessionid = audit_get_sessionid(tsk);
+ struct audit_context *context = audit_alloc_local();

- ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY);
+ ab = audit_log_start(context, GFP_KERNEL, AUDIT_TTY);
if (ab) {
char name[sizeof(tsk->comm)];

@@ -80,6 +81,8 @@ static void tty_audit_log(const char *description, dev_t dev,
audit_log_n_hex(ab, data, size);
audit_log_end(ab);
}
+ audit_log_contid(context, "tty", audit_get_contid(tsk));
+ audit_free_context(context);
}

/**
--
1.8.3.1


2018-06-06 17:07:56

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

Implement the proc fs write to set the audit container identifier of a
process, emitting an AUDIT_CONTAINER_ID record to document the event.

This is a write from the container orchestrator task to a proc entry of
the form /proc/PID/audit_containerid where PID is the process ID of the
newly created task that is to become the first task in a container, or
an additional task added to a container.

The write expects up to a u64 value (unset: 18446744073709551615).

The writer must have capability CAP_AUDIT_CONTROL.

This will produce a record such as this:
type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes

The "op" field indicates an initial set. The "pid" to "ses" fields are
the orchestrator while the "opid" field is the object's PID, the process
being "contained". Old and new audit container identifier values are
given in the "contid" fields, while res indicates its success.

It is not permitted to unset or re-set the audit container identifier.
A child inherits its parent's audit container identifier, but then can
be set only once after.

See: https://github.com/linux-audit/audit-kernel/issues/90
See: https://github.com/linux-audit/audit-userspace/issues/51
See: https://github.com/linux-audit/audit-testsuite/issues/64
See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID

Signed-off-by: Richard Guy Briggs <[email protected]>
---
fs/proc/base.c | 37 ++++++++++++++++++++++++
include/linux/audit.h | 25 ++++++++++++++++
include/uapi/linux/audit.h | 2 ++
kernel/auditsc.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 135 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index eafa39a..318dff4 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1302,6 +1302,41 @@ static ssize_t proc_sessionid_read(struct file * file, char __user * buf,
.read = proc_sessionid_read,
.llseek = generic_file_llseek,
};
+
+static ssize_t proc_contid_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct inode *inode = file_inode(file);
+ u64 contid;
+ int rv;
+ struct task_struct *task = get_proc_task(inode);
+
+ if (!task)
+ return -ESRCH;
+ if (*ppos != 0) {
+ /* No partial writes. */
+ put_task_struct(task);
+ return -EINVAL;
+ }
+
+ rv = kstrtou64_from_user(buf, count, 10, &contid);
+ if (rv < 0) {
+ put_task_struct(task);
+ return rv;
+ }
+
+ rv = audit_set_contid(task, contid);
+ put_task_struct(task);
+ if (rv < 0)
+ return rv;
+ return count;
+}
+
+static const struct file_operations proc_contid_operations = {
+ .write = proc_contid_write,
+ .llseek = generic_file_llseek,
+};
+
#endif

#ifdef CONFIG_FAULT_INJECTION
@@ -2995,6 +3030,7 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
#ifdef CONFIG_AUDITSYSCALL
REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
REG("sessionid", S_IRUGO, proc_sessionid_operations),
+ REG("audit_containerid", S_IWUSR, proc_contid_operations),
#endif
#ifdef CONFIG_FAULT_INJECTION
REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
@@ -3386,6 +3422,7 @@ static int proc_tid_comm_permission(struct inode *inode, int mask)
#ifdef CONFIG_AUDITSYSCALL
REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
REG("sessionid", S_IRUGO, proc_sessionid_operations),
+ REG("audit_containerid", S_IWUSR, proc_contid_operations),
#endif
#ifdef CONFIG_FAULT_INJECTION
REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
diff --git a/include/linux/audit.h b/include/linux/audit.h
index 4f824c4..497cd81 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -219,6 +219,8 @@ static inline void audit_log_task_info(struct audit_buffer *ab,
struct audit_task_info {
kuid_t loginuid;
unsigned int sessionid;
+ u64 contid;
+ bool inherited; /* containerid inheritance */
struct audit_context *ctx;
};
extern struct audit_task_info init_struct_audit;
@@ -331,6 +333,7 @@ static inline void audit_ptrace(struct task_struct *t)
extern int auditsc_get_stamp(struct audit_context *ctx,
struct timespec64 *t, unsigned int *serial);
extern int audit_set_loginuid(kuid_t loginuid);
+extern int audit_set_contid(struct task_struct *tsk, u64 contid);

static inline kuid_t audit_get_loginuid(struct task_struct *tsk)
{
@@ -348,6 +351,14 @@ static inline unsigned int audit_get_sessionid(struct task_struct *tsk)
return AUDIT_SID_UNSET;
}

+static inline u64 audit_get_contid(struct task_struct *tsk)
+{
+ if (!tsk->audit)
+ return AUDIT_CID_UNSET;
+ else
+ return tsk->audit->contid;
+}
+
extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t gid, umode_t mode);
extern void __audit_bprm(struct linux_binprm *bprm);
@@ -542,6 +553,10 @@ static inline unsigned int audit_get_sessionid(struct task_struct *tsk)
{
return AUDIT_SID_UNSET;
}
+static inline kuid_t audit_get_contid(struct task_struct *tsk)
+{
+ return AUDIT_CID_UNSET;
+}
static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
{ }
static inline void audit_ipc_set_perm(unsigned long qbytes, uid_t uid,
@@ -606,6 +621,16 @@ static inline bool audit_loginuid_set(struct task_struct *tsk)
return uid_valid(audit_get_loginuid(tsk));
}

+static inline bool cid_valid(u64 contid)
+{
+ return contid != AUDIT_CID_UNSET;
+}
+
+static inline bool audit_contid_set(struct task_struct *tsk)
+{
+ return cid_valid(audit_get_contid(tsk));
+}
+
static inline void audit_log_string(struct audit_buffer *ab, const char *buf)
{
audit_log_n_string(ab, buf, strlen(buf));
diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 04f9bd2..c3b1aca 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -71,6 +71,7 @@
#define AUDIT_TTY_SET 1017 /* Set TTY auditing status */
#define AUDIT_SET_FEATURE 1018 /* Turn an audit feature on or off */
#define AUDIT_GET_FEATURE 1019 /* Get which features are enabled */
+#define AUDIT_CONTAINER_ID 1020 /* Define the container id and information */

#define AUDIT_FIRST_USER_MSG 1100 /* Userspace messages mostly uninteresting to kernel */
#define AUDIT_USER_AVC 1107 /* We filter this differently */
@@ -466,6 +467,7 @@ struct audit_tty_status {

#define AUDIT_UID_UNSET (unsigned int)-1
#define AUDIT_SID_UNSET ((unsigned int)-1)
+#define AUDIT_CID_UNSET ((u64)-1)

/* audit_rule_data supports filter rules with both integer and string
* fields. It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 59ef7a81..611e926 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -956,6 +956,8 @@ int audit_alloc(struct task_struct *tsk)
return -ENOMEM;
info->loginuid = audit_get_loginuid(current);
info->sessionid = audit_get_sessionid(current);
+ info->contid = audit_get_contid(current);
+ info->inherited = true;
tsk->audit = info;

if (likely(!audit_ever_enabled))
@@ -985,6 +987,8 @@ int audit_alloc(struct task_struct *tsk)
struct audit_task_info init_struct_audit = {
.loginuid = INVALID_UID,
.sessionid = AUDIT_SID_UNSET,
+ .contid = AUDIT_CID_UNSET,
+ .inherited = true,
.ctx = NULL,
};

@@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
}

/**
+ * audit_set_contid - set current task's audit_context contid
+ * @contid: contid value
+ *
+ * Returns 0 on success, -EPERM on permission failure.
+ *
+ * Called (set) from fs/proc/base.c::proc_contid_write().
+ */
+int audit_set_contid(struct task_struct *task, u64 contid)
+{
+ u64 oldcontid;
+ int rc = 0;
+ struct audit_buffer *ab;
+ uid_t uid;
+ struct tty_struct *tty;
+ char comm[sizeof(current->comm)];
+
+ /* Can't set if audit disabled */
+ if (!task->audit)
+ return -ENOPROTOOPT;
+ oldcontid = audit_get_contid(task);
+ /* Don't allow the audit containerid to be unset */
+ if (!cid_valid(contid))
+ rc = -EINVAL;
+ /* if we don't have caps, reject */
+ else if (!capable(CAP_AUDIT_CONTROL))
+ rc = -EPERM;
+ /* if task has children or is not single-threaded, deny */
+ else if (!list_empty(&task->children))
+ rc = -EBUSY;
+ else if (!(thread_group_leader(task) && thread_group_empty(task)))
+ rc = -EALREADY;
+ /* it is already set, and not inherited from the parent, reject */
+ else if (cid_valid(oldcontid) && !task->audit->inherited)
+ rc = -EEXIST;
+ if (!rc) {
+ task_lock(task);
+ task->audit->contid = contid;
+ task->audit->inherited = false;
+ task_unlock(task);
+ }
+
+ if (!audit_enabled)
+ return rc;
+
+ ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID);
+ if (!ab)
+ return rc;
+
+ uid = from_kuid(&init_user_ns, task_uid(current));
+ tty = audit_get_tty(current);
+ audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d uid=%u auid=%u tty=%s ses=%u",
+ task_tgid_nr(task), oldcontid, contid,
+ task_tgid_nr(current), uid
+ from_kuid(&init_user_ns, audit_get_loginuid(current)),
+ tty ? tty_name(tty) : "(none)",
+ audit_get_sessionid(current));
+ audit_put_tty(tty);
+ audit_log_task_context(ab);
+ audit_log_format(ab, " comm=");
+ audit_log_untrustedstring(ab, get_task_comm(comm, current));
+ audit_log_d_path_exe(ab, current->mm);
+ audit_log_format(ab, " res=%d", !rc);
+ audit_log_end(ab);
+ return rc;
+}
+
+/**
* __audit_mq_open - record audit data for a POSIX MQ open
* @oflag: open flag
* @mode: mode bits
--
1.8.3.1


2018-06-06 18:29:55

by Richard Guy Briggs

[permalink] [raw]
Subject: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

Create a new audit record AUDIT_CONTAINER to document the audit
container identifier of a process if it is present.

Called from audit_log_exit(), syscalls are covered.

A sample raw event:
type=SYSCALL msg=audit(1519924845.499:257): arch=c000003e syscall=257 success=yes exit=3 a0=ffffff9c a1=56374e1cef30 a2=241 a3=1b6 items=2 ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="tmpcontainerid"
type=CWD msg=audit(1519924845.499:257): cwd="/root"
type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863 dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
type=PATH msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid" inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
type=PROCTITLE msg=audit(1519924845.499:257): proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D702F746D70636F6E7461696E65726964
type=CONTAINER msg=audit(1519924845.499:257): op=task contid=123458

See: https://github.com/linux-audit/audit-kernel/issues/90
See: https://github.com/linux-audit/audit-userspace/issues/51
See: https://github.com/linux-audit/audit-testsuite/issues/64
See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 7 +++++++
include/uapi/linux/audit.h | 1 +
kernel/audit.c | 23 +++++++++++++++++++++++
kernel/auditsc.c | 3 +++
4 files changed, 34 insertions(+)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 497cd81..4e1e34e 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -152,6 +152,9 @@ extern void audit_log_key(struct audit_buffer *ab,
extern int audit_log_task_context(struct audit_buffer *ab);
extern void audit_log_task_info(struct audit_buffer *ab,
struct task_struct *tsk);
+extern int audit_log_contid(struct task_struct *tsk,
+ struct audit_context *context,
+ char *op);

extern int audit_update_lsm_rules(void);

@@ -202,6 +205,10 @@ static inline int audit_log_task_context(struct audit_buffer *ab)
static inline void audit_log_task_info(struct audit_buffer *ab,
struct task_struct *tsk)
{ }
+static inline int audit_log_contid(struct task_struct *tsk,
+ struct audit_context *context,
+ char *op)
+{ }
#define audit_enabled 0
#endif /* CONFIG_AUDIT */

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index c3b1aca..469ab25 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -115,6 +115,7 @@
#define AUDIT_REPLACE 1329 /* Replace auditd if this packet unanswerd */
#define AUDIT_KERN_MODULE 1330 /* Kernel Module events */
#define AUDIT_FANOTIFY 1331 /* Fanotify access decision */
+#define AUDIT_CONTAINER 1332 /* Container ID */

#define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
#define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
diff --git a/kernel/audit.c b/kernel/audit.c
index e7478cb..5e150c6 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab)
audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
}

+/*
+ * audit_log_contid - report container info
+ * @tsk: task to be recorded
+ * @context: task or local context for record
+ * @op: contid string description
+ */
+int audit_log_contid(struct task_struct *tsk,
+ struct audit_context *context, char *op)
+{
+ struct audit_buffer *ab;
+
+ if (!audit_contid_set(tsk))
+ return 0;
+ /* Generate AUDIT_CONTAINER record with container ID */
+ ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
+ if (!ab)
+ return -ENOMEM;
+ audit_log_format(ab, "op=%s contid=%llu",
+ op, audit_get_contid(tsk));
+ audit_log_end(ab);
+ return 0;
+}
+
void audit_log_key(struct audit_buffer *ab, char *key)
{
audit_log_format(ab, " key=");
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 611e926..a3c946c 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -1490,10 +1490,13 @@ static void audit_log_exit(struct audit_context *context, struct task_struct *ts

audit_log_proctitle(tsk, context);

+ audit_log_contid(tsk, context, "task");
+
/* Send end of event record to help user space know we are finished */
ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE);
if (ab)
audit_log_end(ab);
+
if (call_panic)
audit_panic("error converting sid to string");
}
--
1.8.3.1


2018-06-06 19:07:59

by Steve Grubb

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

On Wednesday, June 6, 2018 12:58:28 PM EDT Richard Guy Briggs wrote:
> Implement the proc fs write to set the audit container identifier of a
> process, emitting an AUDIT_CONTAINER_ID record to document the event.
>
> This is a write from the container orchestrator task to a proc entry of
> the form /proc/PID/audit_containerid where PID is the process ID of the
> newly created task that is to become the first task in a container, or
> an additional task added to a container.
>
> The write expects up to a u64 value (unset: 18446744073709551615).
>
> The writer must have capability CAP_AUDIT_CONTROL.
>
> This will produce a record such as this:
> type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set
> opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root
> uid=root tty=ttyS0 ses=1
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash
> exe=/usr/bin/bash res=yes
>
> The "op" field indicates an initial set. The "pid" to "ses" fields are
> the orchestrator while the "opid" field is the object's PID, the process
> being "contained". Old and new audit container identifier values are
> given in the "contid" fields, while res indicates its success.
>
> It is not permitted to unset or re-set the audit container identifier.
> A child inherits its parent's audit container identifier, but then can
> be set only once after.
>
> See: https://github.com/linux-audit/audit-kernel/issues/90
> See: https://github.com/linux-audit/audit-userspace/issues/51
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See:
> https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> fs/proc/base.c | 37 ++++++++++++++++++++++++
> include/linux/audit.h | 25 ++++++++++++++++
> include/uapi/linux/audit.h | 2 ++
> kernel/auditsc.c | 71
> ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 135
> insertions(+)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index eafa39a..318dff4 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1302,6 +1302,41 @@ static ssize_t proc_sessionid_read(struct file *
> file, char __user * buf, .read = proc_sessionid_read,
> .llseek = generic_file_llseek,
> };
> +
> +static ssize_t proc_contid_write(struct file *file, const char __user
> *buf, + size_t count, loff_t *ppos)
> +{
> + struct inode *inode = file_inode(file);
> + u64 contid;
> + int rv;
> + struct task_struct *task = get_proc_task(inode);
> +
> + if (!task)
> + return -ESRCH;
> + if (*ppos != 0) {
> + /* No partial writes. */
> + put_task_struct(task);
> + return -EINVAL;
> + }
> +
> + rv = kstrtou64_from_user(buf, count, 10, &contid);
> + if (rv < 0) {
> + put_task_struct(task);
> + return rv;
> + }
> +
> + rv = audit_set_contid(task, contid);
> + put_task_struct(task);
> + if (rv < 0)
> + return rv;
> + return count;
> +}
> +
> +static const struct file_operations proc_contid_operations = {
> + .write = proc_contid_write,
> + .llseek = generic_file_llseek,
> +};
> +
> #endif
>
> #ifdef CONFIG_FAULT_INJECTION
> @@ -2995,6 +3030,7 @@ static int proc_pid_patch_state(struct seq_file *m,
> struct pid_namespace *ns, #ifdef CONFIG_AUDITSYSCALL
> REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> REG("sessionid", S_IRUGO, proc_sessionid_operations),
> + REG("audit_containerid", S_IWUSR, proc_contid_operations),
> #endif
> #ifdef CONFIG_FAULT_INJECTION
> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> @@ -3386,6 +3422,7 @@ static int proc_tid_comm_permission(struct inode
> *inode, int mask) #ifdef CONFIG_AUDITSYSCALL
> REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> REG("sessionid", S_IRUGO, proc_sessionid_operations),
> + REG("audit_containerid", S_IWUSR, proc_contid_operations),
> #endif
> #ifdef CONFIG_FAULT_INJECTION
> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 4f824c4..497cd81 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -219,6 +219,8 @@ static inline void audit_log_task_info(struct
> audit_buffer *ab, struct audit_task_info {
> kuid_t loginuid;
> unsigned int sessionid;
> + u64 contid;
> + bool inherited; /* containerid inheritance */
> struct audit_context *ctx;
> };
> extern struct audit_task_info init_struct_audit;
> @@ -331,6 +333,7 @@ static inline void audit_ptrace(struct task_struct *t)
> extern int auditsc_get_stamp(struct audit_context *ctx,
> struct timespec64 *t, unsigned int *serial);
> extern int audit_set_loginuid(kuid_t loginuid);
> +extern int audit_set_contid(struct task_struct *tsk, u64 contid);
>
> static inline kuid_t audit_get_loginuid(struct task_struct *tsk)
> {
> @@ -348,6 +351,14 @@ static inline unsigned int audit_get_sessionid(struct
> task_struct *tsk) return AUDIT_SID_UNSET;
> }
>
> +static inline u64 audit_get_contid(struct task_struct *tsk)
> +{
> + if (!tsk->audit)
> + return AUDIT_CID_UNSET;
> + else
> + return tsk->audit->contid;
> +}
> +
> extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
> extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t
> gid, umode_t mode); extern void __audit_bprm(struct linux_binprm *bprm);
> @@ -542,6 +553,10 @@ static inline unsigned int audit_get_sessionid(struct
> task_struct *tsk) {
> return AUDIT_SID_UNSET;
> }
> +static inline kuid_t audit_get_contid(struct task_struct *tsk)
> +{
> + return AUDIT_CID_UNSET;
> +}
> static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
> { }
> static inline void audit_ipc_set_perm(unsigned long qbytes, uid_t uid,
> @@ -606,6 +621,16 @@ static inline bool audit_loginuid_set(struct
> task_struct *tsk) return uid_valid(audit_get_loginuid(tsk));
> }
>
> +static inline bool cid_valid(u64 contid)
> +{
> + return contid != AUDIT_CID_UNSET;
> +}
> +
> +static inline bool audit_contid_set(struct task_struct *tsk)
> +{
> + return cid_valid(audit_get_contid(tsk));
> +}
> +
> static inline void audit_log_string(struct audit_buffer *ab, const char
> *buf) {
> audit_log_n_string(ab, buf, strlen(buf));
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index 04f9bd2..c3b1aca 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -71,6 +71,7 @@
> #define AUDIT_TTY_SET 1017 /* Set TTY auditing status */
> #define AUDIT_SET_FEATURE 1018 /* Turn an audit feature on or off */
> #define AUDIT_GET_FEATURE 1019 /* Get which features are enabled */
> +#define AUDIT_CONTAINER_ID 1020 /* Define the container id and
information
> */
>
> #define AUDIT_FIRST_USER_MSG 1100 /* Userspace messages mostly
> uninteresting to kernel */ #define AUDIT_USER_AVC 1107 /* We filter this
> differently */
> @@ -466,6 +467,7 @@ struct audit_tty_status {
>
> #define AUDIT_UID_UNSET (unsigned int)-1
> #define AUDIT_SID_UNSET ((unsigned int)-1)
> +#define AUDIT_CID_UNSET ((u64)-1)
>
> /* audit_rule_data supports filter rules with both integer and string
> * fields. It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 59ef7a81..611e926 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -956,6 +956,8 @@ int audit_alloc(struct task_struct *tsk)
> return -ENOMEM;
> info->loginuid = audit_get_loginuid(current);
> info->sessionid = audit_get_sessionid(current);
> + info->contid = audit_get_contid(current);
> + info->inherited = true;
> tsk->audit = info;
>
> if (likely(!audit_ever_enabled))
> @@ -985,6 +987,8 @@ int audit_alloc(struct task_struct *tsk)
> struct audit_task_info init_struct_audit = {
> .loginuid = INVALID_UID,
> .sessionid = AUDIT_SID_UNSET,
> + .contid = AUDIT_CID_UNSET,
> + .inherited = true,
> .ctx = NULL,
> };
>
> @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> }
>
> /**
> + * audit_set_contid - set current task's audit_context contid
> + * @contid: contid value
> + *
> + * Returns 0 on success, -EPERM on permission failure.
> + *
> + * Called (set) from fs/proc/base.c::proc_contid_write().
> + */
> +int audit_set_contid(struct task_struct *task, u64 contid)
> +{
> + u64 oldcontid;
> + int rc = 0;
> + struct audit_buffer *ab;
> + uid_t uid;
> + struct tty_struct *tty;
> + char comm[sizeof(current->comm)];
> +
> + /* Can't set if audit disabled */
> + if (!task->audit)
> + return -ENOPROTOOPT;
> + oldcontid = audit_get_contid(task);
> + /* Don't allow the audit containerid to be unset */
> + if (!cid_valid(contid))
> + rc = -EINVAL;
> + /* if we don't have caps, reject */
> + else if (!capable(CAP_AUDIT_CONTROL))
> + rc = -EPERM;
> + /* if task has children or is not single-threaded, deny */
> + else if (!list_empty(&task->children))
> + rc = -EBUSY;
> + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> + rc = -EALREADY;
> + /* it is already set, and not inherited from the parent, reject */
> + else if (cid_valid(oldcontid) && !task->audit->inherited)
> + rc = -EEXIST;
> + if (!rc) {
> + task_lock(task);
> + task->audit->contid = contid;
> + task->audit->inherited = false;
> + task_unlock(task);
> + }
> +
> + if (!audit_enabled)
> + return rc;
> +
> + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID);
> + if (!ab)
> + return rc;
> +
> + uid = from_kuid(&init_user_ns, task_uid(current));
> + tty = audit_get_tty(current);
> + audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d
> uid=%u auid=%u tty=%s ses=%u", + task_tgid_nr(task), oldcontid,
contid,
> + task_tgid_nr(current), uid
> + from_kuid(&init_user_ns, audit_get_loginuid(current)),
> + tty ? tty_name(tty) : "(none)",
> + audit_get_sessionid(current));

The event code doesn't match the example event at the top. (uid and auid are
transposed.) But the code looks right.

Ack for the event format.

-Steve

> + audit_put_tty(tty);
> + audit_log_task_context(ab);
> + audit_log_format(ab, " comm=");
> + audit_log_untrustedstring(ab, get_task_comm(comm, current));
> + audit_log_d_path_exe(ab, current->mm);
> + audit_log_format(ab, " res=%d", !rc);
> + audit_log_end(ab);
> + return rc;
> +}
> +
> +/**
> * __audit_mq_open - record audit data for a POSIX MQ open
> * @oflag: open flag
> * @mode: mode bits





2018-06-06 20:02:44

by Steve Grubb

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Wednesday, June 6, 2018 12:58:29 PM EDT Richard Guy Briggs wrote:
> Create a new audit record AUDIT_CONTAINER to document the audit
> container identifier of a process if it is present.
>
> Called from audit_log_exit(), syscalls are covered.
>
> A sample raw event:
> type=SYSCALL msg=audit(1519924845.499:257): arch=c000003e syscall=257
> success=yes exit=3 a0=ffffff9c a1=56374e1cef30 a2=241 a3=1b6 items=2
> ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0
> fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash"
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> key="tmpcontainerid" type=CWD msg=audit(1519924845.499:257): cwd="/root"
> type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863
> dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00
> obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp=0000000000000000
> cap_fi=0000000000000000 cap_fe=0 cap_fver=0 type=PATH
> msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid"
> inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00
> obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE
> cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> type=PROCTITLE msg=audit(1519924845.499:257):
> proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D
> 702F746D70636F6E7461696E65726964 type=CONTAINER
> msg=audit(1519924845.499:257): op=task contid=123458

Ack for the audit record names.

-Steve

> See: https://github.com/linux-audit/audit-kernel/issues/90
> See: https://github.com/linux-audit/audit-userspace/issues/51
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See:
> https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 7 +++++++
> include/uapi/linux/audit.h | 1 +
> kernel/audit.c | 23 +++++++++++++++++++++++
> kernel/auditsc.c | 3 +++
> 4 files changed, 34 insertions(+)
>
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 497cd81..4e1e34e 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -152,6 +152,9 @@ extern void audit_log_key(struct audit_buffer
*ab,
> extern int audit_log_task_context(struct audit_buffer *ab);
> extern void audit_log_task_info(struct audit_buffer *ab,
> struct task_struct *tsk);
> +extern int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context,
> + char *op);
>
> extern int audit_update_lsm_rules(void);
>
> @@ -202,6 +205,10 @@ static inline int audit_log_task_context(struct
> audit_buffer *ab) static inline void audit_log_task_info(struct
> audit_buffer *ab,
> struct task_struct *tsk)
> { }
> +static inline int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context,
> + char *op)
> +{ }
> #define audit_enabled 0
> #endif /* CONFIG_AUDIT */
>
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index c3b1aca..469ab25 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -115,6 +115,7 @@
> #define AUDIT_REPLACE 1329 /* Replace auditd if this packet unanswerd
*/
> #define AUDIT_KERN_MODULE 1330 /* Kernel Module events */
> #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */
> +#define AUDIT_CONTAINER 1332 /* Container ID */
>
> #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
> #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index e7478cb..5e150c6 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab)
> audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
> }
>
> +/*
> + * audit_log_contid - report container info
> + * @tsk: task to be recorded
> + * @context: task or local context for record
> + * @op: contid string description
> + */
> +int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context, char *op)
> +{
> + struct audit_buffer *ab;
> +
> + if (!audit_contid_set(tsk))
> + return 0;
> + /* Generate AUDIT_CONTAINER record with container ID */
> + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> + if (!ab)
> + return -ENOMEM;
> + audit_log_format(ab, "op=%s contid=%llu",
> + op, audit_get_contid(tsk));
> + audit_log_end(ab);
> + return 0;
> +}
> +
> void audit_log_key(struct audit_buffer *ab, char *key)
> {
> audit_log_format(ab, " key=");
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 611e926..a3c946c 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1490,10 +1490,13 @@ static void audit_log_exit(struct audit_context
> *context, struct task_struct *ts
>
> audit_log_proctitle(tsk, context);
>
> + audit_log_contid(tsk, context, "task");
> +
> /* Send end of event record to help user space know we are finished */
> ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE);
> if (ab)
> audit_log_end(ab);
> +
> if (call_panic)
> audit_panic("error converting sid to string");
> }





2018-06-06 20:45:12

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

On 2018-06-06 13:56, Steve Grubb wrote:
> On Wednesday, June 6, 2018 12:58:28 PM EDT Richard Guy Briggs wrote:
> > Implement the proc fs write to set the audit container identifier of a
> > process, emitting an AUDIT_CONTAINER_ID record to document the event.
> >
> > This is a write from the container orchestrator task to a proc entry of
> > the form /proc/PID/audit_containerid where PID is the process ID of the
> > newly created task that is to become the first task in a container, or
> > an additional task added to a container.
> >
> > The write expects up to a u64 value (unset: 18446744073709551615).
> >
> > The writer must have capability CAP_AUDIT_CONTROL.
> >
> > This will produce a record such as this:
> > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set
> > opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root
> > uid=root tty=ttyS0 ses=1
> > subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash
> > exe=/usr/bin/bash res=yes
> >
> > The "op" field indicates an initial set. The "pid" to "ses" fields are
> > the orchestrator while the "opid" field is the object's PID, the process
> > being "contained". Old and new audit container identifier values are
> > given in the "contid" fields, while res indicates its success.
> >
> > It is not permitted to unset or re-set the audit container identifier.
> > A child inherits its parent's audit container identifier, but then can
> > be set only once after.
> >
> > See: https://github.com/linux-audit/audit-kernel/issues/90
> > See: https://github.com/linux-audit/audit-userspace/issues/51
> > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > See:
> > https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> >
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > fs/proc/base.c | 37 ++++++++++++++++++++++++
> > include/linux/audit.h | 25 ++++++++++++++++
> > include/uapi/linux/audit.h | 2 ++
> > kernel/auditsc.c | 71
> > ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 135
> > insertions(+)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index eafa39a..318dff4 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1302,6 +1302,41 @@ static ssize_t proc_sessionid_read(struct file *
> > file, char __user * buf, .read = proc_sessionid_read,
> > .llseek = generic_file_llseek,
> > };
> > +
> > +static ssize_t proc_contid_write(struct file *file, const char __user
> > *buf, + size_t count, loff_t *ppos)
> > +{
> > + struct inode *inode = file_inode(file);
> > + u64 contid;
> > + int rv;
> > + struct task_struct *task = get_proc_task(inode);
> > +
> > + if (!task)
> > + return -ESRCH;
> > + if (*ppos != 0) {
> > + /* No partial writes. */
> > + put_task_struct(task);
> > + return -EINVAL;
> > + }
> > +
> > + rv = kstrtou64_from_user(buf, count, 10, &contid);
> > + if (rv < 0) {
> > + put_task_struct(task);
> > + return rv;
> > + }
> > +
> > + rv = audit_set_contid(task, contid);
> > + put_task_struct(task);
> > + if (rv < 0)
> > + return rv;
> > + return count;
> > +}
> > +
> > +static const struct file_operations proc_contid_operations = {
> > + .write = proc_contid_write,
> > + .llseek = generic_file_llseek,
> > +};
> > +
> > #endif
> >
> > #ifdef CONFIG_FAULT_INJECTION
> > @@ -2995,6 +3030,7 @@ static int proc_pid_patch_state(struct seq_file *m,
> > struct pid_namespace *ns, #ifdef CONFIG_AUDITSYSCALL
> > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> > REG("sessionid", S_IRUGO, proc_sessionid_operations),
> > + REG("audit_containerid", S_IWUSR, proc_contid_operations),
> > #endif
> > #ifdef CONFIG_FAULT_INJECTION
> > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> > @@ -3386,6 +3422,7 @@ static int proc_tid_comm_permission(struct inode
> > *inode, int mask) #ifdef CONFIG_AUDITSYSCALL
> > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> > REG("sessionid", S_IRUGO, proc_sessionid_operations),
> > + REG("audit_containerid", S_IWUSR, proc_contid_operations),
> > #endif
> > #ifdef CONFIG_FAULT_INJECTION
> > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index 4f824c4..497cd81 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -219,6 +219,8 @@ static inline void audit_log_task_info(struct
> > audit_buffer *ab, struct audit_task_info {
> > kuid_t loginuid;
> > unsigned int sessionid;
> > + u64 contid;
> > + bool inherited; /* containerid inheritance */
> > struct audit_context *ctx;
> > };
> > extern struct audit_task_info init_struct_audit;
> > @@ -331,6 +333,7 @@ static inline void audit_ptrace(struct task_struct *t)
> > extern int auditsc_get_stamp(struct audit_context *ctx,
> > struct timespec64 *t, unsigned int *serial);
> > extern int audit_set_loginuid(kuid_t loginuid);
> > +extern int audit_set_contid(struct task_struct *tsk, u64 contid);
> >
> > static inline kuid_t audit_get_loginuid(struct task_struct *tsk)
> > {
> > @@ -348,6 +351,14 @@ static inline unsigned int audit_get_sessionid(struct
> > task_struct *tsk) return AUDIT_SID_UNSET;
> > }
> >
> > +static inline u64 audit_get_contid(struct task_struct *tsk)
> > +{
> > + if (!tsk->audit)
> > + return AUDIT_CID_UNSET;
> > + else
> > + return tsk->audit->contid;
> > +}
> > +
> > extern void __audit_ipc_obj(struct kern_ipc_perm *ipcp);
> > extern void __audit_ipc_set_perm(unsigned long qbytes, uid_t uid, gid_t
> > gid, umode_t mode); extern void __audit_bprm(struct linux_binprm *bprm);
> > @@ -542,6 +553,10 @@ static inline unsigned int audit_get_sessionid(struct
> > task_struct *tsk) {
> > return AUDIT_SID_UNSET;
> > }
> > +static inline kuid_t audit_get_contid(struct task_struct *tsk)
> > +{
> > + return AUDIT_CID_UNSET;
> > +}
> > static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
> > { }
> > static inline void audit_ipc_set_perm(unsigned long qbytes, uid_t uid,
> > @@ -606,6 +621,16 @@ static inline bool audit_loginuid_set(struct
> > task_struct *tsk) return uid_valid(audit_get_loginuid(tsk));
> > }
> >
> > +static inline bool cid_valid(u64 contid)
> > +{
> > + return contid != AUDIT_CID_UNSET;
> > +}
> > +
> > +static inline bool audit_contid_set(struct task_struct *tsk)
> > +{
> > + return cid_valid(audit_get_contid(tsk));
> > +}
> > +
> > static inline void audit_log_string(struct audit_buffer *ab, const char
> > *buf) {
> > audit_log_n_string(ab, buf, strlen(buf));
> > diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> > index 04f9bd2..c3b1aca 100644
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -71,6 +71,7 @@
> > #define AUDIT_TTY_SET 1017 /* Set TTY auditing status */
> > #define AUDIT_SET_FEATURE 1018 /* Turn an audit feature on or off */
> > #define AUDIT_GET_FEATURE 1019 /* Get which features are enabled */
> > +#define AUDIT_CONTAINER_ID 1020 /* Define the container id and
> information
> > */
> >
> > #define AUDIT_FIRST_USER_MSG 1100 /* Userspace messages mostly
> > uninteresting to kernel */ #define AUDIT_USER_AVC 1107 /* We filter this
> > differently */
> > @@ -466,6 +467,7 @@ struct audit_tty_status {
> >
> > #define AUDIT_UID_UNSET (unsigned int)-1
> > #define AUDIT_SID_UNSET ((unsigned int)-1)
> > +#define AUDIT_CID_UNSET ((u64)-1)
> >
> > /* audit_rule_data supports filter rules with both integer and string
> > * fields. It corresponds with AUDIT_ADD_RULE, AUDIT_DEL_RULE and
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index 59ef7a81..611e926 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -956,6 +956,8 @@ int audit_alloc(struct task_struct *tsk)
> > return -ENOMEM;
> > info->loginuid = audit_get_loginuid(current);
> > info->sessionid = audit_get_sessionid(current);
> > + info->contid = audit_get_contid(current);
> > + info->inherited = true;
> > tsk->audit = info;
> >
> > if (likely(!audit_ever_enabled))
> > @@ -985,6 +987,8 @@ int audit_alloc(struct task_struct *tsk)
> > struct audit_task_info init_struct_audit = {
> > .loginuid = INVALID_UID,
> > .sessionid = AUDIT_SID_UNSET,
> > + .contid = AUDIT_CID_UNSET,
> > + .inherited = true,
> > .ctx = NULL,
> > };
> >
> > @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> > }
> >
> > /**
> > + * audit_set_contid - set current task's audit_context contid
> > + * @contid: contid value
> > + *
> > + * Returns 0 on success, -EPERM on permission failure.
> > + *
> > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > + */
> > +int audit_set_contid(struct task_struct *task, u64 contid)
> > +{
> > + u64 oldcontid;
> > + int rc = 0;
> > + struct audit_buffer *ab;
> > + uid_t uid;
> > + struct tty_struct *tty;
> > + char comm[sizeof(current->comm)];
> > +
> > + /* Can't set if audit disabled */
> > + if (!task->audit)
> > + return -ENOPROTOOPT;
> > + oldcontid = audit_get_contid(task);
> > + /* Don't allow the audit containerid to be unset */
> > + if (!cid_valid(contid))
> > + rc = -EINVAL;
> > + /* if we don't have caps, reject */
> > + else if (!capable(CAP_AUDIT_CONTROL))
> > + rc = -EPERM;
> > + /* if task has children or is not single-threaded, deny */
> > + else if (!list_empty(&task->children))
> > + rc = -EBUSY;
> > + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > + rc = -EALREADY;
> > + /* it is already set, and not inherited from the parent, reject */
> > + else if (cid_valid(oldcontid) && !task->audit->inherited)
> > + rc = -EEXIST;
> > + if (!rc) {
> > + task_lock(task);
> > + task->audit->contid = contid;
> > + task->audit->inherited = false;
> > + task_unlock(task);
> > + }
> > +
> > + if (!audit_enabled)
> > + return rc;
> > +
> > + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID);
> > + if (!ab)
> > + return rc;
> > +
> > + uid = from_kuid(&init_user_ns, task_uid(current));
> > + tty = audit_get_tty(current);
> > + audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d
> > uid=%u auid=%u tty=%s ses=%u", + task_tgid_nr(task), oldcontid,
> contid,
> > + task_tgid_nr(current), uid
> > + from_kuid(&init_user_ns, audit_get_loginuid(current)),
> > + tty ? tty_name(tty) : "(none)",
> > + audit_get_sessionid(current));
>
> The event code doesn't match the example event at the top. (uid and auid are
> transposed.) But the code looks right.

Hmmm, I thought I checked that explicitly... That event sample must
have come from the previous compile before I fixed that.

> Ack for the event format.

Thanks!

> -Steve
>
> > + audit_put_tty(tty);
> > + audit_log_task_context(ab);
> > + audit_log_format(ab, " comm=");
> > + audit_log_untrustedstring(ab, get_task_comm(comm, current));
> > + audit_log_d_path_exe(ab, current->mm);
> > + audit_log_format(ab, " res=%d", !rc);
> > + audit_log_end(ab);
> > + return rc;
> > +}
> > +
> > +/**
> > * __audit_mq_open - record audit data for a POSIX MQ open
> > * @oflag: open flag
> > * @mode: mode bits

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-18 20:58:38

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 10/10] rfkill: fix spelling mistake contidion to condition

On Wed, Jun 6, 2018 at 1:02 PM Richard Guy Briggs <[email protected]> wrote:
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> net/rfkill/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

I would suggest splitting this patch from the audit container ID
patchset and sending it to the netdev folks. It really has nothing to
do with the audit container ID work, although I suspect I know why you
bumped into this spelling mistake ;)

> diff --git a/net/rfkill/core.c b/net/rfkill/core.c
> index 59d0eb9..e89a009 100644
> --- a/net/rfkill/core.c
> +++ b/net/rfkill/core.c
> @@ -494,7 +494,7 @@ void rfkill_remove_epo_lock(void)
> /**
> * rfkill_is_epo_lock_active - returns true EPO is active
> *
> - * Returns 0 (false) if there is NOT an active EPO contidion,
> + * Returns 0 (false) if there is NOT an active EPO condition,
> * and 1 (true) if there is an active EPO contition, which
> * locks all radios in one of the BLOCKED states.
> *
> --
> 1.8.3.1

--
paul moore
http://www.paul-moore.com

2018-07-20 22:14:50

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> Implement the proc fs write to set the audit container identifier of a
> process, emitting an AUDIT_CONTAINER_ID record to document the event.
>
> This is a write from the container orchestrator task to a proc entry of
> the form /proc/PID/audit_containerid where PID is the process ID of the
> newly created task that is to become the first task in a container, or
> an additional task added to a container.
>
> The write expects up to a u64 value (unset: 18446744073709551615).
>
> The writer must have capability CAP_AUDIT_CONTROL.
>
> This will produce a record such as this:
> type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes
>
> The "op" field indicates an initial set. The "pid" to "ses" fields are
> the orchestrator while the "opid" field is the object's PID, the process
> being "contained". Old and new audit container identifier values are
> given in the "contid" fields, while res indicates its success.
>
> It is not permitted to unset or re-set the audit container identifier.
> A child inherits its parent's audit container identifier, but then can
> be set only once after.
>
> See: https://github.com/linux-audit/audit-kernel/issues/90
> See: https://github.com/linux-audit/audit-userspace/issues/51
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> fs/proc/base.c | 37 ++++++++++++++++++++++++
> include/linux/audit.h | 25 ++++++++++++++++
> include/uapi/linux/audit.h | 2 ++
> kernel/auditsc.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 135 insertions(+)

...

> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -606,6 +621,16 @@ static inline bool audit_loginuid_set(struct task_struct *tsk)
> return uid_valid(audit_get_loginuid(tsk));
> }
>
> +static inline bool cid_valid(u64 contid)
> +{
> + return contid != AUDIT_CID_UNSET;
> +}
> +
> +static inline bool audit_contid_set(struct task_struct *tsk)
> +{
> + return cid_valid(audit_get_contid(tsk));
> +}

For the sake of consistency I think we should rename cid_valid() to
audit_contid_valid().

> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 59ef7a81..611e926 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -956,6 +956,8 @@ int audit_alloc(struct task_struct *tsk)
> return -ENOMEM;
> info->loginuid = audit_get_loginuid(current);
> info->sessionid = audit_get_sessionid(current);
> + info->contid = audit_get_contid(current);
> + info->inherited = true;

First see my others comments in this patch about inheritence, but if
we decide that flagging inherited values is important we should
probably rename the "inherited" field to indicate that it applies to
just the "contid" field.

> tsk->audit = info;
>
> if (likely(!audit_ever_enabled))
> @@ -985,6 +987,8 @@ int audit_alloc(struct task_struct *tsk)
> struct audit_task_info init_struct_audit = {
> .loginuid = INVALID_UID,
> .sessionid = AUDIT_SID_UNSET,
> + .contid = AUDIT_CID_UNSET,
> + .inherited = true,
> .ctx = NULL,
> };
>
> @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> }
>
> /**
> + * audit_set_contid - set current task's audit_context contid
> + * @contid: contid value
> + *
> + * Returns 0 on success, -EPERM on permission failure.
> + *
> + * Called (set) from fs/proc/base.c::proc_contid_write().
> + */
> +int audit_set_contid(struct task_struct *task, u64 contid)
> +{
> + u64 oldcontid;
> + int rc = 0;
> + struct audit_buffer *ab;
> + uid_t uid;
> + struct tty_struct *tty;
> + char comm[sizeof(current->comm)];
> +
> + /* Can't set if audit disabled */
> + if (!task->audit)
> + return -ENOPROTOOPT;
> + oldcontid = audit_get_contid(task);
> + /* Don't allow the audit containerid to be unset */
> + if (!cid_valid(contid))
> + rc = -EINVAL;
> + /* if we don't have caps, reject */
> + else if (!capable(CAP_AUDIT_CONTROL))
> + rc = -EPERM;
> + /* if task has children or is not single-threaded, deny */
> + else if (!list_empty(&task->children))
> + rc = -EBUSY;

Is this safe without holding tasklist_lock? I worry we might be
vulnerable to a race with fork().

> + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> + rc = -EALREADY;

Similar concern here as well, although related to threads.

> + /* it is already set, and not inherited from the parent, reject */
> + else if (cid_valid(oldcontid) && !task->audit->inherited)
> + rc = -EEXIST;

Maybe I'm missing something, but why do we care about preventing
reassigning the audit container ID in this case? The task is single
threaded and has no descendants at this point so it should be safe,
yes? So long as the task changing the audit container ID has
capable(CAP_AUDIT_CONTOL) it shouldn't matter, right?

Related, I'm questioning if we would ever care if the audit container
ID was inherited or not?

> + if (!rc) {
> + task_lock(task);
> + task->audit->contid = contid;
> + task->audit->inherited = false;
> + task_unlock(task);

I suspect the task_lock() may not be what we want here, but if we are
using task_lock() to protect the audit fields two things come to mind:

1. We should update the header comments for task_lock() in task.h to
indicate that it also protects ->audit.
2. Where else do we need to worry about taking this lock? At the very
least we should take this lock near the top of this function before we
check task->audit and not drop it until after we have set it, or
failed the operation for one of the reasons above.

> + }
> +
> + if (!audit_enabled)
> + return rc;
> +
> + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID);
> + if (!ab)
> + return rc;
> +
> + uid = from_kuid(&init_user_ns, task_uid(current));
> + tty = audit_get_tty(current);
> + audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d uid=%u auid=%u tty=%s ses=%u",
> + task_tgid_nr(task), oldcontid, contid,
> + task_tgid_nr(current), uid
> + from_kuid(&init_user_ns, audit_get_loginuid(current)),
> + tty ? tty_name(tty) : "(none)",
> + audit_get_sessionid(current));
> + audit_put_tty(tty);
> + audit_log_task_context(ab);
> + audit_log_format(ab, " comm=");
> + audit_log_untrustedstring(ab, get_task_comm(comm, current));
> + audit_log_d_path_exe(ab, current->mm);
> + audit_log_format(ab, " res=%d", !rc);
> + audit_log_end(ab);
> + return rc;
> +}
> +
> +/**
> * __audit_mq_open - record audit data for a POSIX MQ open
> * @oflag: open flag
> * @mode: mode bits

--
paul moore
http://www.paul-moore.com

2018-07-20 22:15:07

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> Create a new audit record AUDIT_CONTAINER to document the audit
> container identifier of a process if it is present.
>
> Called from audit_log_exit(), syscalls are covered.
>
> A sample raw event:
> type=SYSCALL msg=audit(1519924845.499:257): arch=c000003e syscall=257 success=yes exit=3 a0=ffffff9c a1=56374e1cef30 a2=241 a3=1b6 items=2 ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="tmpcontainerid"
> type=CWD msg=audit(1519924845.499:257): cwd="/root"
> type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863 dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> type=PATH msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid" inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> type=PROCTITLE msg=audit(1519924845.499:257): proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D702F746D70636F6E7461696E65726964
> type=CONTAINER msg=audit(1519924845.499:257): op=task contid=123458
>
> See: https://github.com/linux-audit/audit-kernel/issues/90
> See: https://github.com/linux-audit/audit-userspace/issues/51
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 7 +++++++
> include/uapi/linux/audit.h | 1 +
> kernel/audit.c | 23 +++++++++++++++++++++++
> kernel/auditsc.c | 3 +++
> 4 files changed, 34 insertions(+)

...

> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -115,6 +115,7 @@
> #define AUDIT_REPLACE 1329 /* Replace auditd if this packet unanswerd */
> #define AUDIT_KERN_MODULE 1330 /* Kernel Module events */
> #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */
> +#define AUDIT_CONTAINER 1332 /* Container ID */

I'm not sure I'm completely sold on the AUDIT_CONTAINER_ID and
AUDIT_CONTAINER record type names. From what I can tell
AUDIT_CONTAINER_ID seems to be used for audit container ID management
operations, e.g. setting the ID, whereas the AUDIT_CONTAINER is used
to tag events with the corresponding audit container ID. Assuming
that is correct, it seems like AUDIT_CONTAINER might be better served
if it was named AUDIT_CONTAINER_ID and if we could change
AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc. Thoughts?

> #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
> #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index e7478cb..5e150c6 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab)
> audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
> }
>
> +/*
> + * audit_log_contid - report container info
> + * @tsk: task to be recorded
> + * @context: task or local context for record
> + * @op: contid string description
> + */
> +int audit_log_contid(struct task_struct *tsk,
> + struct audit_context *context, char *op)
> +{
> + struct audit_buffer *ab;
> +
> + if (!audit_contid_set(tsk))
> + return 0;
> + /* Generate AUDIT_CONTAINER record with container ID */
> + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> + if (!ab)
> + return -ENOMEM;
> + audit_log_format(ab, "op=%s contid=%llu",
> + op, audit_get_contid(tsk));

Can you explain your reason for including an "op" field in this record
type? I've been looking at the rest of the patches in this patchset
and it seems to be used more as an indicator of the record's
generating context rather than any sort of audit container ID
operation.

> + audit_log_end(ab);
> + return 0;
> +}

--
paul moore
http://www.paul-moore.com

2018-07-20 22:15:25

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 03/10] audit: add containerid support for ptrace and signals

On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs <[email protected]> wrote:
> Add audit container identifier support to ptrace and signals. In
> particular, the "op" field provides a way to label the auxiliary record
> to which it is associated.
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 11 +++++------
> kernel/audit.c | 13 +++++++------
> kernel/audit.h | 2 ++
> kernel/auditsc.c | 21 ++++++++++++++++-----
> 4 files changed, 30 insertions(+), 17 deletions(-)

...

> diff --git a/kernel/audit.c b/kernel/audit.c
> index 5e150c6..ba304a8 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -142,6 +142,7 @@ struct audit_net {
> kuid_t audit_sig_uid = INVALID_UID;
> pid_t audit_sig_pid = -1;
> u32 audit_sig_sid = 0;
> +u64 audit_sig_cid = AUDIT_CID_UNSET;
>
> /* Records can be lost in several ways:
> 0) [suppressed in audit_alloc]
> @@ -1437,6 +1438,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
> memcpy(sig_data->ctx, ctx, len);
> security_release_secctx(ctx, len);
> }
> + sig_data->cid = audit_sig_cid;
> audit_send_reply(skb, seq, AUDIT_SIGNAL_INFO, 0, 0,
> sig_data, sizeof(*sig_data) + len);
> kfree(sig_data);
> @@ -2050,23 +2052,22 @@ void audit_log_session_info(struct audit_buffer *ab)
>
> /*
> * audit_log_contid - report container info
> - * @tsk: task to be recorded
> * @context: task or local context for record
> * @op: contid string description
> + * @contid: container ID to report
> */
> -int audit_log_contid(struct task_struct *tsk,
> - struct audit_context *context, char *op)
> +int audit_log_contid(struct audit_context *context,
> + char *op, u64 contid)
> {
> struct audit_buffer *ab;
>
> - if (!audit_contid_set(tsk))
> + if (!cid_valid(contid))
> return 0;
> /* Generate AUDIT_CONTAINER record with container ID */
> ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> if (!ab)
> return -ENOMEM;
> - audit_log_format(ab, "op=%s contid=%llu",
> - op, audit_get_contid(tsk));
> + audit_log_format(ab, "op=%s contid=%llu", op, contid);
> audit_log_end(ab);
> return 0;
> }

You might as well just make these changes to audit_log_contid() in
patch 2 when you first define audit_log_contid().

--
paul moore
http://www.paul-moore.com

2018-07-20 22:15:43

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records

On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs <[email protected]> wrote:
> Standalone audit records have the timestamp and serial number generated
> on the fly and as such are unique, making them standalone. This new
> function audit_alloc_local() generates a local audit context that will
> be used only for a standalone record and its auxiliary record(s). The
> context is discarded immediately after the local associated records are
> produced.
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 8 ++++++++
> kernel/auditsc.c | 25 +++++++++++++++++++++++--
> 2 files changed, 31 insertions(+), 2 deletions(-)

...

> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -916,8 +916,11 @@ static inline void audit_free_aux(struct audit_context *context)
> static inline struct audit_context *audit_alloc_context(enum audit_state state)
> {
> struct audit_context *context;
> + gfp_t gfpflags;
>
> - context = kzalloc(sizeof(*context), GFP_KERNEL);
> + /* We can be called in atomic context via audit_tg() */
> + gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;

Instead of trying to guess the proper gfp flags, and likely getting it
wrong at some point (the in_atomic() comment in preempt.h don't give
me the warm fuzzies), why not pass the gfp flags as an argument?

Right now it looks like we would only have two callers: audit_alloc()
and audit_audit_local(). The audit_alloc() invocation would simply
pass GFP_KERNEL and we could allow the audit_alloc_local() callers to
specify the gfp flags when calling audit_alloc_local() (although I
suspect that will always be GFP_ATOMIC since we should only be calling
audit_alloc_local() from interrupt-like context, in all other cases we
should use the audit_context from the current task_struct.

> + context = kzalloc(sizeof(*context), gfpflags);
> if (!context)
> return NULL;
> context->state = state;
> @@ -993,8 +996,26 @@ struct audit_task_info init_struct_audit = {
> .ctx = NULL,
> };
>
> -static inline void audit_free_context(struct audit_context *context)
> +struct audit_context *audit_alloc_local(void)
> {

Let's see where this goes, but we may want to rename this slightly to
indicate that this should only be called from interrupt context when
we can't rely on current's audit_context. Bonus points if we can find
a way to enforce this with a WARN() assertion so we can better catch
abuse.

> + struct audit_context *context;
> +
> + if (!audit_ever_enabled)
> + return NULL; /* Return if not auditing. */
> +
> + context = audit_alloc_context(AUDIT_RECORD_CONTEXT);
> + if (!context)
> + return NULL;
> + context->serial = audit_serial();
> + context->ctime = current_kernel_time64();
> + context->in_syscall = 1;

Setting in_syscall is both interesting and a bit troubling, if for no
other reason than I expect most (all?) callers to be in an interrupt
context when audit_alloc_local() is called. Setting in_syscall would
appear to be conceptually in this case. Can you help explain why this
is the right thing to do, or necessary to ensure things are handled
correctly?

Looking quickly at the audit code, it seems to only be used on record
and/or syscall termination to end things properly as well as in some
of the PATH record code paths to limit filename collection to actual
syscalls. However, this was just a quick look so I could be missing
some important points.

> + return context;
> +}
> +
> +void audit_free_context(struct audit_context *context)
> +{
> + if (!context)
> + return;
> audit_free_names(context);
> unroll_tree_refs(context, NULL, 0);
> free_tree_refs(context);
> --
> 1.8.3.1
>
> --
> Linux-audit mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-audit

--
paul moore
http://www.paul-moore.com

2018-07-20 22:16:42

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 05/10] audit: add containerid support for tty_audit

On Wed, Jun 6, 2018 at 1:04 PM Richard Guy Briggs <[email protected]> wrote:
> Add audit container identifier auxiliary record to tty logging rule
> event standalone records.
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> drivers/tty/tty_audit.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c
> index e30aa6b..66bd850 100644
> --- a/drivers/tty/tty_audit.c
> +++ b/drivers/tty/tty_audit.c
> @@ -66,8 +66,9 @@ static void tty_audit_log(const char *description, dev_t dev,
> uid_t uid = from_kuid(&init_user_ns, task_uid(tsk));
> uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk));
> unsigned int sessionid = audit_get_sessionid(tsk);
> + struct audit_context *context = audit_alloc_local();

We should be using current's audit_context in tty_audit_log().
Actually, we should probably just get rid of the tsk variable in
tty_audit_log() and use current directly to make things a bit more
obvious.

<time passes>

I did some digging and I have a two year old, half-baked patch that
cleans up this tsk/current usage as well as a few others. I just
rebased it against audit/next and surprisingly it seems to pass a
basic smoke test (kernel boots and audit-testsuite passes); I'll post
it to the list as a RFC once I'm done reviewing these patches.

> - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY);
> + ab = audit_log_start(context, GFP_KERNEL, AUDIT_TTY);
> if (ab) {
> char name[sizeof(tsk->comm)];
>
> @@ -80,6 +81,8 @@ static void tty_audit_log(const char *description, dev_t dev,
> audit_log_n_hex(ab, data, size);
> audit_log_end(ab);
> }
> + audit_log_contid(context, "tty", audit_get_contid(tsk));
> + audit_free_context(context);
> }

--
paul moore
http://www.paul-moore.com

2018-07-20 22:16:49

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 06/10] audit: add containerid filtering

On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> Implement audit container identifier filtering using the AUDIT_CONTID
> field name to send an 8-character string representing a u64 since the
> value field is only u32.
>
> Sending it as two u32 was considered, but gathering and comparing two
> fields was more complex.
>
> The feature indicator is AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER.

We've had some discussions about starting to be a bit more "stingy"
with our feature bitmap fields, and normally I'm not sure I would want
to burn one on a new filter (we should be able to fail safely here),
but considering the significance of the audit container ID work I
think it might be okay to burn a bit :)

How about renaming this to just AUDIT_FEATURE_BITMAP_CONTAINERID?

> See: https://github.com/linux-audit/audit-kernel/issues/91
> See: https://github.com/linux-audit/audit-userspace/issues/40
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 1 +
> include/uapi/linux/audit.h | 5 ++++-
> kernel/audit.h | 1 +
> kernel/auditfilter.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++
> kernel/auditsc.c | 3 +++
> 5 files changed, 56 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index f549121..1e37abf 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -76,6 +76,7 @@ struct audit_field {
> u32 type;
> union {
> u32 val;
> + u64 val64;
> kuid_t uid;
> kgid_t gid;
> struct {
> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
> index 469ab25..b440558 100644
> --- a/include/uapi/linux/audit.h
> +++ b/include/uapi/linux/audit.h
> @@ -262,6 +262,7 @@
> #define AUDIT_LOGINUID_SET 24
> #define AUDIT_SESSIONID 25 /* Session ID */
> #define AUDIT_FSTYPE 26 /* FileSystem Type */
> +#define AUDIT_CONTID 27 /* Container ID */
>
> /* These are ONLY useful when checking
> * at syscall exit time (AUDIT_AT_EXIT). */
> @@ -342,6 +343,7 @@ enum {
> #define AUDIT_FEATURE_BITMAP_SESSIONID_FILTER 0x00000010
> #define AUDIT_FEATURE_BITMAP_LOST_RESET 0x00000020
> #define AUDIT_FEATURE_BITMAP_FILTER_FS 0x00000040
> +#define AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER 0x00000080
>
> #define AUDIT_FEATURE_BITMAP_ALL (AUDIT_FEATURE_BITMAP_BACKLOG_LIMIT | \
> AUDIT_FEATURE_BITMAP_BACKLOG_WAIT_TIME | \
> @@ -349,7 +351,8 @@ enum {
> AUDIT_FEATURE_BITMAP_EXCLUDE_EXTEND | \
> AUDIT_FEATURE_BITMAP_SESSIONID_FILTER | \
> AUDIT_FEATURE_BITMAP_LOST_RESET | \
> - AUDIT_FEATURE_BITMAP_FILTER_FS)
> + AUDIT_FEATURE_BITMAP_FILTER_FS | \
> + AUDIT_FEATURE_BITMAP_CONTAINERID_FILTER)
>
> /* deprecated: AUDIT_VERSION_* */
> #define AUDIT_VERSION_LATEST AUDIT_FEATURE_BITMAP_ALL
> diff --git a/kernel/audit.h b/kernel/audit.h
> index 1cf1c35..743d445 100644
> --- a/kernel/audit.h
> +++ b/kernel/audit.h
> @@ -235,6 +235,7 @@ static inline int audit_hash_ino(u32 ino)
>
> extern int audit_match_class(int class, unsigned syscall);
> extern int audit_comparator(const u32 left, const u32 op, const u32 right);
> +extern int audit_comparator64(const u64 left, const u32 op, const u64 right);
> extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right);
> extern int audit_gid_comparator(kgid_t left, u32 op, kgid_t right);
> extern int parent_len(const char *path);
> diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
> index eaa3201..a5f60ce 100644
> --- a/kernel/auditfilter.c
> +++ b/kernel/auditfilter.c
> @@ -410,6 +410,7 @@ static int audit_field_valid(struct audit_entry *entry, struct audit_field *f)
> /* FALL THROUGH */
> case AUDIT_ARCH:
> case AUDIT_FSTYPE:
> + case AUDIT_CONTID:
> if (f->op != Audit_not_equal && f->op != Audit_equal)
> return -EINVAL;
> break;
> @@ -584,6 +585,14 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
> }
> entry->rule.exe = audit_mark;
> break;
> + case AUDIT_CONTID:
> + if (f->val != sizeof(u64))
> + goto exit_free;
> + str = audit_unpack_string(&bufp, &remain, f->val);
> + if (IS_ERR(str))
> + goto exit_free;
> + f->val64 = ((u64 *)str)[0];
> + break;
> }
> }
>
> @@ -666,6 +675,11 @@ static struct audit_rule_data *audit_krule_to_data(struct audit_krule *krule)
> data->buflen += data->values[i] =
> audit_pack_string(&bufp, audit_mark_path(krule->exe));
> break;
> + case AUDIT_CONTID:
> + data->buflen += data->values[i] = sizeof(u64);
> + for (i = 0; i < sizeof(u64); i++)
> + ((char *)bufp)[i] = ((char *)&f->val64)[i];
> + break;
> case AUDIT_LOGINUID_SET:
> if (krule->pflags & AUDIT_LOGINUID_LEGACY && !f->val) {
> data->fields[i] = AUDIT_LOGINUID;
> @@ -752,6 +766,10 @@ static int audit_compare_rule(struct audit_krule *a, struct audit_krule *b)
> if (!gid_eq(a->fields[i].gid, b->fields[i].gid))
> return 1;
> break;
> + case AUDIT_CONTID:
> + if (a->fields[i].val64 != b->fields[i].val64)
> + return 1;
> + break;
> default:
> if (a->fields[i].val != b->fields[i].val)
> return 1;
> @@ -1208,6 +1226,31 @@ int audit_comparator(u32 left, u32 op, u32 right)
> }
> }
>
> +int audit_comparator64(u64 left, u32 op, u64 right)
> +{
> + switch (op) {
> + case Audit_equal:
> + return (left == right);
> + case Audit_not_equal:
> + return (left != right);
> + case Audit_lt:
> + return (left < right);
> + case Audit_le:
> + return (left <= right);
> + case Audit_gt:
> + return (left > right);
> + case Audit_ge:
> + return (left >= right);
> + case Audit_bitmask:
> + return (left & right);
> + case Audit_bittest:
> + return ((left & right) == right);
> + default:
> + BUG();
> + return 0;
> + }
> +}
> +
> int audit_uid_comparator(kuid_t left, u32 op, kuid_t right)
> {
> switch (op) {
> @@ -1346,6 +1389,10 @@ int audit_filter(int msgtype, unsigned int listtype)
> result = audit_comparator(audit_loginuid_set(current),
> f->op, f->val);
> break;
> + case AUDIT_CONTID:
> + result = audit_comparator64(audit_get_contid(current),
> + f->op, f->val64);
> + break;
> case AUDIT_MSGTYPE:
> result = audit_comparator(msgtype, f->op, f->val);
> break;
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 81c9765..ea1ee35 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -622,6 +622,9 @@ static int audit_filter_rules(struct task_struct *tsk,
> case AUDIT_LOGINUID_SET:
> result = audit_comparator(audit_loginuid_set(tsk), f->op, f->val);
> break;
> + case AUDIT_CONTID:
> + result = audit_comparator64(audit_get_contid(tsk), f->op, f->val64);
> + break;
> case AUDIT_SUBJ_USER:
> case AUDIT_SUBJ_ROLE:
> case AUDIT_SUBJ_TYPE:
> --
> 1.8.3.1
>
> --
> Linux-audit mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-audit



--
paul moore
http://www.paul-moore.com

2018-07-20 22:16:56

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces

On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> Audit events could happen in a network namespace outside of a task
> context due to packets received from the net that trigger an auditing
> rule prior to being associated with a running task. The network
> namespace could in use by multiple containers by association to the
> tasks in that network namespace. We still want a way to attribute
> these events to any potential containers. Keep a list per network
> namespace to track these audit container identifiiers.
>
> Add/increment the audit container identifier on:
> - initial setting of the audit container identifier via /proc
> - clone/fork call that inherits an audit container identifier
> - unshare call that inherits an audit container identifier
> - setns call that inherits an audit container identifier
> Delete/decrement the audit container identifier on:
> - an inherited audit container identifier dropped when child set
> - process exit
> - unshare call that drops a net namespace
> - setns call that drops a net namespace
>
> See: https://github.com/linux-audit/audit-kernel/issues/92
> See: https://github.com/linux-audit/audit-testsuite/issues/64
> See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 23 ++++++++++++++++
> kernel/audit.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++
> kernel/auditsc.c | 5 ++++
> kernel/nsproxy.c | 4 +++
> 4 files changed, 104 insertions(+)

...

> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 1e37abf..7e2e51c 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -87,6 +88,12 @@ struct audit_field {
> u32 op;
> };
>
> +struct audit_contid {
> + struct list_head list;
> + u64 id;
> + refcount_t refcount;
> +};

Do we need to worry about locking the audit container ID list? Does
the network namespace code (or some other namespace code) ensure that
add/deletes are serialized?


> @@ -156,6 +163,10 @@ extern void audit_log_task_info(struct audit_buffer *ab,
> struct task_struct *tsk);
> extern int audit_log_contid(struct audit_context *context,
> char *op, u64 contid);
> +extern struct list_head *audit_get_contid_list(const struct net *net);
> +extern void audit_contid_add(struct net *net, u64 contid);
> +extern void audit_contid_del(struct net *net, u64 contid);

I wonder if we should change these function names to indicate that
they are managing the netns/cid list? Right now there is no mention
of networking other than the first parameter.

Maybe audit_netns_contid_*()?

> +extern void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p);
>
> extern int audit_update_lsm_rules(void);
>
> @@ -209,6 +220,18 @@ static inline void audit_log_task_info(struct audit_buffer *ab,
> static inline int audit_log_contid(struct audit_context *context,
> char *op, u64 contid)
> { }
> +static inline struct list_head *audit_get_contid_list(const struct net *net)
> +{
> + static LIST_HEAD(list);
> + return &list;
> +}

Why can't we just return NULL here like a normal dummy function? It's
only ever used inside audit. Actually, why is this even in
include/linux/audit.h, couldn't we put it in kernel/audit.h or even
just make it a static in audit.c?

> +static inline void audit_contid_add(struct net *net, u64 contid)
> +{ }
> +static inline void audit_contid_del(struct net *net, u64 contid)
> +{ }
> +static inline void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> +{ }
> +
> #define audit_enabled 0
> #endif /* CONFIG_AUDIT */
>
> diff --git a/kernel/audit.c b/kernel/audit.c
> index ba304a8..ecd2de4 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -106,6 +106,7 @@
> */
> struct audit_net {
> struct sock *sk;
> + struct list_head contid_list;
> };
>
> /**
> @@ -311,6 +312,76 @@ static struct sock *audit_get_sk(const struct net *net)
> return aunet->sk;
> }
>
> +/**
> + * audit_get_contid_list - Return the audit container ID list for the given network namespace
> + * @net: the destination network namespace
> + *
> + * Description:
> + * Returns the list pointer if valid, NULL otherwise. The caller must ensure
> + * that a reference is held for the network namespace while the sock is in use.
> + */
> +struct list_head *audit_get_contid_list(const struct net *net)
> +{
> + struct audit_net *aunet = net_generic(net, audit_net_id);
> +
> + return &aunet->contid_list;
> +}
> +
> +void audit_contid_add(struct net *net, u64 contid)
> +{
> + struct list_head *contid_list = audit_get_contid_list(net);
> + struct audit_contid *cont;
> +
> + if (!cid_valid(contid))
> + return;
> + if (!list_empty(contid_list))
> + list_for_each_entry(cont, contid_list, list)
> + if (cont->id == contid) {
> + refcount_inc(&cont->refcount);
> + return;
> + }

<thinking out loud>I think this is fine for right now, but we may need
to be a bit clever about how we store the IDs - walking an unsorted
list with lots of entries may prove to be too painful.</thinking out
loud>

> + cont = kmalloc(sizeof(struct audit_contid), GFP_KERNEL);
> + if (!cont)
> + return;
> + INIT_LIST_HEAD(&cont->list);
> + cont->id = contid;
> + refcount_set(&cont->refcount, 1);
> + list_add(&cont->list, contid_list);
> +}
> +
> +void audit_contid_del(struct net *net, u64 contid)
> +{
> + struct list_head *contid_list = audit_get_contid_list(net);
> + struct audit_contid *cont = NULL;
> + int found = 0;
> +
> + if (!cid_valid(contid))
> + return;
> + if (!list_empty(contid_list))
> + list_for_each_entry(cont, contid_list, list)
> + if (cont->id == contid) {
> + found = 1;
> + break;

You don't really need the found variable, you can just move all of the
work you need to do up into the if statement and return from inside
the if statement.

> + }
> + if (!found)
> + return;
> + list_del(&cont->list);
> + if (refcount_dec_and_test(&cont->refcount))
> + kfree(cont);

Don't you want to dec_and_test first and only remove it from the list
if there are no other references?

> +}
>
> +void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> +{
> + u64 contid = audit_get_contid(p);
> + struct nsproxy *new = p->nsproxy;
> +
> + if (!cid_valid(contid))
> + return;
> + audit_contid_del(ns->net_ns, contid);
> + if (new)
> + audit_contid_add(new->net_ns, contid);
> +}
> +
> void audit_panic(const char *message)
> {
> switch (audit_failure) {
> @@ -1550,6 +1621,7 @@ static int __net_init audit_net_init(struct net *net)
> return -ENOMEM;
> }
> aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
> + INIT_LIST_HEAD(&aunet->contid_list);
>
> return 0;
> }
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index ea1ee35..6ab5e5e 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -75,6 +75,7 @@
> #include <linux/uaccess.h>
> #include <linux/fsnotify_backend.h>
> #include <uapi/linux/limits.h>
> +#include <net/net_namespace.h>
>
> #include "audit.h"
>
> @@ -2165,6 +2166,7 @@ int audit_set_contid(struct task_struct *task, u64 contid)
> uid_t uid;
> struct tty_struct *tty;
> char comm[sizeof(current->comm)];
> + struct net *net = task->nsproxy->net_ns;
>
> /* Can't set if audit disabled */
> if (!task->audit)
> @@ -2185,10 +2187,13 @@ int audit_set_contid(struct task_struct *task, u64 contid)
> else if (cid_valid(oldcontid) && !task->audit->inherited)
> rc = -EEXIST;
> if (!rc) {
> + if (cid_valid(oldcontid))
> + audit_contid_del(net, oldcontid);
> task_lock(task);
> task->audit->contid = contid;
> task->audit->inherited = false;
> task_unlock(task);
> + audit_contid_add(net, contid);
> }
>
> if (!audit_enabled)
> diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> index f6c5d33..dcb69fe 100644
> --- a/kernel/nsproxy.c
> +++ b/kernel/nsproxy.c
> @@ -27,6 +27,7 @@
> #include <linux/syscalls.h>
> #include <linux/cgroup.h>
> #include <linux/perf_event.h>
> +#include <linux/audit.h>
>
> static struct kmem_cache *nsproxy_cachep;
>
> @@ -140,6 +141,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
> struct nsproxy *old_ns = tsk->nsproxy;
> struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns);
> struct nsproxy *new_ns;
> + u64 contid = audit_get_contid(tsk);
>
> if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
> CLONE_NEWPID | CLONE_NEWNET |
> @@ -167,6 +169,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
> return PTR_ERR(new_ns);
>
> tsk->nsproxy = new_ns;
> + audit_contid_add(new_ns->net_ns, contid);
> return 0;
> }
>
> @@ -224,6 +227,7 @@ void switch_task_namespaces(struct task_struct *p, struct nsproxy *new)
> ns = p->nsproxy;
> p->nsproxy = new;
> task_unlock(p);
> + audit_switch_task_namespaces(ns, p);
>
> if (ns && atomic_dec_and_test(&ns->count))
> free_nsproxy(ns);
> --
> 1.8.3.1
>
> --
> Linux-audit mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-audit

--
paul moore
http://www.paul-moore.com

2018-07-20 22:17:14

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS

On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> Add audit container identifier auxiliary record(s) to NETFILTER_PKT
> event standalone records. Iterate through all potential audit container
> identifiers associated with a network namespace.
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 5 +++++
> kernel/audit.c | 20 +++++++++++++++++++-
> kernel/auditsc.c | 2 ++
> net/netfilter/xt_AUDIT.c | 12 ++++++++++--
> 4 files changed, 36 insertions(+), 3 deletions(-)

...

> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 7e2e51c..4560a4e 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context *context,
> extern void audit_contid_add(struct net *net, u64 contid);
> extern void audit_contid_del(struct net *net, u64 contid);
> extern void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p);
> +extern void audit_log_contid_list(struct net *net,
> + struct audit_context *context);

See my comment in previous patches about changing the function name to
better indicate it's dedicate use for network namespaces.

> extern int audit_update_lsm_rules(void);
>
> @@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net, u64 contid)
> { }
> static inline void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> { }
> +static inline void audit_log_contid_list(struct net *net,
> + struct audit_context *context)
> +{ }
>
> #define audit_enabled 0
> #endif /* CONFIG_AUDIT */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index ecd2de4..8cca41a 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> audit_contid_add(new->net_ns, contid);
> }
>
> +void audit_log_contid_list(struct net *net, struct audit_context *context)
> +{
> + struct audit_contid *cont;
> + int i = 0;
> +
> + list_for_each_entry(cont, audit_get_contid_list(net), list) {
> + char buf[14];
> +
> + sprintf(buf, "net%u", i++);
> + audit_log_contid(context, buf, cont->id);

Hmm. It looks like this will generate multiple audit container ID
records with "op=netX contid=Y" (X=netns number, Y=audit container
ID), is that what we want? I've mentioned my concern around the "op"
values in these records earlier in the patchset, that still applies
here, but now I'm also concerned about the multiple records. I'm
thinking we might be better served with a single record with either
multiple "contid" fields, or a single "contid" field with a set of
comma separated values (or some other delimiter that Steve's tools
will tolerate).

Steve, thoughts?

> + }
> +}
> +EXPORT_SYMBOL(audit_log_contid_list);
> +
> void audit_panic(const char *message)
> {
> switch (audit_failure) {
> @@ -2132,17 +2146,21 @@ int audit_log_contid(struct audit_context *context,
> char *op, u64 contid)
> {
> struct audit_buffer *ab;
> + gfp_t gfpflags;
>
> if (!cid_valid(contid))
> return 0;
> + /* We can be called in atomic context via audit_tg() */
> + gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;

See my previous comments in the earlier patches about guessing at
gfpflags; let's just add a gfpflags parameter to audit_log_contid().

> /* Generate AUDIT_CONTAINER record with container ID */
> - ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> + ab = audit_log_start(context, gfpflags, AUDIT_CONTAINER);
> if (!ab)
> return -ENOMEM;
> audit_log_format(ab, "op=%s contid=%llu", op, contid);
> audit_log_end(ab);
> return 0;
> }
> +EXPORT_SYMBOL(audit_log_contid);

Move the EXPORT_SYMBOL() to earlier in the patchset when you first
define audit_log_contid().

> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 6ab5e5e..e2a16d2 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1015,6 +1015,7 @@ struct audit_context *audit_alloc_local(void)
> context->in_syscall = 1;
> return context;
> }
> +EXPORT_SYMBOL(audit_alloc_local);

Same as above.

> void audit_free_context(struct audit_context *context)
> {
> @@ -1029,6 +1030,7 @@ void audit_free_context(struct audit_context *context)
> audit_proctitle_free(context);
> kfree(context);
> }
> +EXPORT_SYMBOL(audit_free_context);

Same.

> static int audit_log_pid_context(struct audit_context *context, pid_t pid,
> kuid_t auid, kuid_t uid, unsigned int sessionid,
> diff --git a/net/netfilter/xt_AUDIT.c b/net/netfilter/xt_AUDIT.c
> index f368ee6..10d2707 100644
> --- a/net/netfilter/xt_AUDIT.c
> +++ b/net/netfilter/xt_AUDIT.c
> @@ -71,10 +71,13 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)
> {
> struct audit_buffer *ab;
> int fam = -1;
> + struct audit_context *context;
> + struct net *net;
>
> if (audit_enabled == 0)
> - goto errout;
> - ab = audit_log_start(NULL, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
> + goto out;
> + context = audit_alloc_local();
> + ab = audit_log_start(context, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
> if (ab == NULL)
> goto errout;
>
> @@ -104,7 +107,12 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)
>
> audit_log_end(ab);
>
> + net = xt_net(par);
> + audit_log_contid_list(net, context);
> +
> errout:
> + audit_free_context(context);
> +out:
> return XT_CONTINUE;
> }
>
> --
> 1.8.3.1
>
> --
> Linux-audit mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/linux-audit



--
paul moore
http://www.paul-moore.com

2018-07-20 22:17:38

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 09/10] debug audit: read container ID of a process

On Wed, Jun 6, 2018 at 1:02 PM Richard Guy Briggs <[email protected]> wrote:
> Add support for reading the audit container identifier from the proc
> filesystem.
>
> This is a read from the proc entry of the form
> /proc/PID/audit_containerid where PID is the process ID of the task
> whose audit container identifier is sought.
>
> The read expects up to a u64 value (unset: 18446744073709551615).
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> fs/proc/base.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 318dff4..ca8bfe2 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1303,6 +1303,21 @@ static ssize_t proc_sessionid_read(struct file * file, char __user * buf,
> .llseek = generic_file_llseek,
> };
>
> +static ssize_t proc_contid_read(struct file *file, char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + struct inode *inode = file_inode(file);
> + struct task_struct *task = get_proc_task(inode);
> + ssize_t length;
> + char tmpbuf[TMPBUFLEN*2];
> +
> + if (!task)
> + return -ESRCH;
> + length = scnprintf(tmpbuf, TMPBUFLEN*2, "%llu", audit_get_contid(task));
> + put_task_struct(task);
> + return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
> +}

While I still remain very nervous about opening the audit container ID
up for abuse by making it accessible, I understand that this would
make things a lot easier us (e.g. testing) and perhaps the container
engines as well. In order to limit the potential for abuse, what do
you think about restricting read access to those processes which have
CAP_AUDIT_CONTROL, similar to what we do for setting the audit
container ID?

> static ssize_t proc_contid_write(struct file *file, const char __user *buf,
> size_t count, loff_t *ppos)
> {
> @@ -1333,6 +1348,7 @@ static ssize_t proc_contid_write(struct file *file, const char __user *buf,
> }
>
> static const struct file_operations proc_contid_operations = {
> + .read = proc_contid_read,
> .write = proc_contid_write,
> .llseek = generic_file_llseek,
> };
> @@ -3030,7 +3046,7 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
> #ifdef CONFIG_AUDITSYSCALL
> REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> REG("sessionid", S_IRUGO, proc_sessionid_operations),
> - REG("audit_containerid", S_IWUSR, proc_contid_operations),
> + REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations),
> #endif
> #ifdef CONFIG_FAULT_INJECTION
> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> @@ -3422,7 +3438,7 @@ static int proc_tid_comm_permission(struct inode *inode, int mask)
> #ifdef CONFIG_AUDITSYSCALL
> REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> REG("sessionid", S_IRUGO, proc_sessionid_operations),
> - REG("audit_containerid", S_IWUSR, proc_contid_operations),
> + REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations),
> #endif
> #ifdef CONFIG_FAULT_INJECTION
> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),

--
paul moore
http://www.paul-moore.com

2018-07-21 15:33:47

by Laura Garcia

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS

CC'ing Netfilter.

On Wed, Jun 6, 2018 at 6:58 PM, Richard Guy Briggs <[email protected]> wrote:
> Add audit container identifier auxiliary record(s) to NETFILTER_PKT
> event standalone records. Iterate through all potential audit container
> identifiers associated with a network namespace.
>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/audit.h | 5 +++++
> kernel/audit.c | 20 +++++++++++++++++++-
> kernel/auditsc.c | 2 ++
> net/netfilter/xt_AUDIT.c | 12 ++++++++++--
> 4 files changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/audit.h b/include/linux/audit.h
> index 7e2e51c..4560a4e 100644
> --- a/include/linux/audit.h
> +++ b/include/linux/audit.h
> @@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context *context,
> extern void audit_contid_add(struct net *net, u64 contid);
> extern void audit_contid_del(struct net *net, u64 contid);
> extern void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p);
> +extern void audit_log_contid_list(struct net *net,
> + struct audit_context *context);
>
> extern int audit_update_lsm_rules(void);
>
> @@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net, u64 contid)
> { }
> static inline void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> { }
> +static inline void audit_log_contid_list(struct net *net,
> + struct audit_context *context)
> +{ }
>
> #define audit_enabled 0
> #endif /* CONFIG_AUDIT */
> diff --git a/kernel/audit.c b/kernel/audit.c
> index ecd2de4..8cca41a 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> audit_contid_add(new->net_ns, contid);
> }
>
> +void audit_log_contid_list(struct net *net, struct audit_context *context)
> +{
> + struct audit_contid *cont;
> + int i = 0;
> +
> + list_for_each_entry(cont, audit_get_contid_list(net), list) {
> + char buf[14];
> +
> + sprintf(buf, "net%u", i++);
> + audit_log_contid(context, buf, cont->id);
> + }
> +}
> +EXPORT_SYMBOL(audit_log_contid_list);
> +
> void audit_panic(const char *message)
> {
> switch (audit_failure) {
> @@ -2132,17 +2146,21 @@ int audit_log_contid(struct audit_context *context,
> char *op, u64 contid)
> {
> struct audit_buffer *ab;
> + gfp_t gfpflags;
>
> if (!cid_valid(contid))
> return 0;
> + /* We can be called in atomic context via audit_tg() */
> + gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
> /* Generate AUDIT_CONTAINER record with container ID */
> - ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> + ab = audit_log_start(context, gfpflags, AUDIT_CONTAINER);
> if (!ab)
> return -ENOMEM;
> audit_log_format(ab, "op=%s contid=%llu", op, contid);
> audit_log_end(ab);
> return 0;
> }
> +EXPORT_SYMBOL(audit_log_contid);
>
> void audit_log_key(struct audit_buffer *ab, char *key)
> {
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 6ab5e5e..e2a16d2 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1015,6 +1015,7 @@ struct audit_context *audit_alloc_local(void)
> context->in_syscall = 1;
> return context;
> }
> +EXPORT_SYMBOL(audit_alloc_local);
>
> void audit_free_context(struct audit_context *context)
> {
> @@ -1029,6 +1030,7 @@ void audit_free_context(struct audit_context *context)
> audit_proctitle_free(context);
> kfree(context);
> }
> +EXPORT_SYMBOL(audit_free_context);
>
> static int audit_log_pid_context(struct audit_context *context, pid_t pid,
> kuid_t auid, kuid_t uid, unsigned int sessionid,
> diff --git a/net/netfilter/xt_AUDIT.c b/net/netfilter/xt_AUDIT.c
> index f368ee6..10d2707 100644
> --- a/net/netfilter/xt_AUDIT.c
> +++ b/net/netfilter/xt_AUDIT.c
> @@ -71,10 +71,13 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)
> {
> struct audit_buffer *ab;
> int fam = -1;
> + struct audit_context *context;
> + struct net *net;
>
> if (audit_enabled == 0)
> - goto errout;
> - ab = audit_log_start(NULL, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
> + goto out;
> + context = audit_alloc_local();
> + ab = audit_log_start(context, GFP_ATOMIC, AUDIT_NETFILTER_PKT);
> if (ab == NULL)
> goto errout;
>
> @@ -104,7 +107,12 @@ static bool audit_ip6(struct audit_buffer *ab, struct sk_buff *skb)
>
> audit_log_end(ab);
>
> + net = xt_net(par);
> + audit_log_contid_list(net, context);
> +
> errout:
> + audit_free_context(context);
> +out:
> return XT_CONTINUE;
> }
>
> --
> 1.8.3.1
>

2018-07-21 19:25:54

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 09/10] debug audit: read container ID of a process

On 2018-07-20 18:15, Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:02 PM Richard Guy Briggs <[email protected]> wrote:
> > Add support for reading the audit container identifier from the proc
> > filesystem.
> >
> > This is a read from the proc entry of the form
> > /proc/PID/audit_containerid where PID is the process ID of the task
> > whose audit container identifier is sought.
> >
> > The read expects up to a u64 value (unset: 18446744073709551615).
> >
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > fs/proc/base.c | 20 ++++++++++++++++++--
> > 1 file changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 318dff4..ca8bfe2 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1303,6 +1303,21 @@ static ssize_t proc_sessionid_read(struct file * file, char __user * buf,
> > .llseek = generic_file_llseek,
> > };
> >
> > +static ssize_t proc_contid_read(struct file *file, char __user *buf,
> > + size_t count, loff_t *ppos)
> > +{
> > + struct inode *inode = file_inode(file);
> > + struct task_struct *task = get_proc_task(inode);
> > + ssize_t length;
> > + char tmpbuf[TMPBUFLEN*2];
> > +
> > + if (!task)
> > + return -ESRCH;
> > + length = scnprintf(tmpbuf, TMPBUFLEN*2, "%llu", audit_get_contid(task));
> > + put_task_struct(task);
> > + return simple_read_from_buffer(buf, count, ppos, tmpbuf, length);
> > +}
>
> While I still remain very nervous about opening the audit container ID
> up for abuse by making it accessible, I understand that this would
> make things a lot easier us (e.g. testing) and perhaps the container
> engines as well. In order to limit the potential for abuse, what do
> you think about restricting read access to those processes which have
> CAP_AUDIT_CONTROL, similar to what we do for setting the audit
> container ID?

That seems like a reasonable restriction.

> > static ssize_t proc_contid_write(struct file *file, const char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > @@ -1333,6 +1348,7 @@ static ssize_t proc_contid_write(struct file *file, const char __user *buf,
> > }
> >
> > static const struct file_operations proc_contid_operations = {
> > + .read = proc_contid_read,
> > .write = proc_contid_write,
> > .llseek = generic_file_llseek,
> > };
> > @@ -3030,7 +3046,7 @@ static int proc_pid_patch_state(struct seq_file *m, struct pid_namespace *ns,
> > #ifdef CONFIG_AUDITSYSCALL
> > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> > REG("sessionid", S_IRUGO, proc_sessionid_operations),
> > - REG("audit_containerid", S_IWUSR, proc_contid_operations),
> > + REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations),
> > #endif
> > #ifdef CONFIG_FAULT_INJECTION
> > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> > @@ -3422,7 +3438,7 @@ static int proc_tid_comm_permission(struct inode *inode, int mask)
> > #ifdef CONFIG_AUDITSYSCALL
> > REG("loginuid", S_IWUSR|S_IRUGO, proc_loginuid_operations),
> > REG("sessionid", S_IRUGO, proc_sessionid_operations),
> > - REG("audit_containerid", S_IWUSR, proc_contid_operations),
> > + REG("audit_containerid", S_IWUSR|S_IRUSR, proc_contid_operations),
> > #endif
> > #ifdef CONFIG_FAULT_INJECTION
> > REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
>
> --
> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-21 20:33:26

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On 2018-07-20 18:13, Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> > Create a new audit record AUDIT_CONTAINER to document the audit
> > container identifier of a process if it is present.
> >
> > Called from audit_log_exit(), syscalls are covered.
> >
> > A sample raw event:
> > type=SYSCALL msg=audit(1519924845.499:257): arch=c000003e syscall=257 success=yes exit=3 a0=ffffff9c a1=56374e1cef30 a2=241 a3=1b6 items=2 ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="tmpcontainerid"
> > type=CWD msg=audit(1519924845.499:257): cwd="/root"
> > type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863 dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> > type=PATH msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid" inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> > type=PROCTITLE msg=audit(1519924845.499:257): proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D702F746D70636F6E7461696E65726964
> > type=CONTAINER msg=audit(1519924845.499:257): op=task contid=123458
> >
> > See: https://github.com/linux-audit/audit-kernel/issues/90
> > See: https://github.com/linux-audit/audit-userspace/issues/51
> > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > include/linux/audit.h | 7 +++++++
> > include/uapi/linux/audit.h | 1 +
> > kernel/audit.c | 23 +++++++++++++++++++++++
> > kernel/auditsc.c | 3 +++
> > 4 files changed, 34 insertions(+)
>
> ...
>
> > --- a/include/uapi/linux/audit.h
> > +++ b/include/uapi/linux/audit.h
> > @@ -115,6 +115,7 @@
> > #define AUDIT_REPLACE 1329 /* Replace auditd if this packet unanswerd */
> > #define AUDIT_KERN_MODULE 1330 /* Kernel Module events */
> > #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */
> > +#define AUDIT_CONTAINER 1332 /* Container ID */
>
> I'm not sure I'm completely sold on the AUDIT_CONTAINER_ID and
> AUDIT_CONTAINER record type names. From what I can tell
> AUDIT_CONTAINER_ID seems to be used for audit container ID management
> operations, e.g. setting the ID, whereas the AUDIT_CONTAINER is used
> to tag events with the corresponding audit container ID. Assuming
> that is correct, it seems like AUDIT_CONTAINER might be better served
> if it was named AUDIT_CONTAINER_ID and if we could change
> AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc. Thoughts?

Please see discussion at:
https://www.redhat.com/archives/linux-audit/2018-May/msg00101.html

I'm fine with changing AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc.

> > #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
> > #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index e7478cb..5e150c6 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab)
> > audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
> > }
> >
> > +/*
> > + * audit_log_contid - report container info
> > + * @tsk: task to be recorded
> > + * @context: task or local context for record
> > + * @op: contid string description
> > + */
> > +int audit_log_contid(struct task_struct *tsk,
> > + struct audit_context *context, char *op)
> > +{
> > + struct audit_buffer *ab;
> > +
> > + if (!audit_contid_set(tsk))
> > + return 0;
> > + /* Generate AUDIT_CONTAINER record with container ID */
> > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > + if (!ab)
> > + return -ENOMEM;
> > + audit_log_format(ab, "op=%s contid=%llu",
> > + op, audit_get_contid(tsk));
>
> Can you explain your reason for including an "op" field in this record
> type? I've been looking at the rest of the patches in this patchset
> and it seems to be used more as an indicator of the record's
> generating context rather than any sort of audit container ID
> operation.

"action" might work, but that's netfilter and numeric... "kind"?
Nothing else really seems to fit from a field name, type or lack of
searchability perspective.

Steve, do you have an opinion?

> > + audit_log_end(ab);
> > + return 0;
> > +}
>
> --
> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-22 13:33:15

by Steve Grubb

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > + * audit_log_contid - report container info
> > > + * @tsk: task to be recorded
> > > + * @context: task or local context for record
> > > + * @op: contid string description
> > > + */
> > > +int audit_log_contid(struct task_struct *tsk,
> > > + struct audit_context *context, char *op)
> > > +{
> > > + struct audit_buffer *ab;
> > > +
> > > + if (!audit_contid_set(tsk))
> > > + return 0;
> > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > + if (!ab)
> > > + return -ENOMEM;
> > > + audit_log_format(ab, "op=%s contid=%llu",
> > > + op, audit_get_contid(tsk));
> >
> > Can you explain your reason for including an "op" field in this record
> > type? I've been looking at the rest of the patches in this patchset
> > and it seems to be used more as an indicator of the record's
> > generating context rather than any sort of audit container ID
> > operation.
>
> "action" might work, but that's netfilter and numeric... "kind"?
> Nothing else really seems to fit from a field name, type or lack of
> searchability perspective.
>
> Steve, do you have an opinion?

We only have 1 sample event where we have op=task. What are the other
possible values?

-Steve




2018-07-22 21:00:52

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On 2018-07-22 09:32, Steve Grubb wrote:
> On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > + * audit_log_contid - report container info
> > > > + * @tsk: task to be recorded
> > > > + * @context: task or local context for record
> > > > + * @op: contid string description
> > > > + */
> > > > +int audit_log_contid(struct task_struct *tsk,
> > > > + struct audit_context *context, char *op)
> > > > +{
> > > > + struct audit_buffer *ab;
> > > > +
> > > > + if (!audit_contid_set(tsk))
> > > > + return 0;
> > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > > + if (!ab)
> > > > + return -ENOMEM;
> > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > + op, audit_get_contid(tsk));
> > >
> > > Can you explain your reason for including an "op" field in this record
> > > type? I've been looking at the rest of the patches in this patchset
> > > and it seems to be used more as an indicator of the record's
> > > generating context rather than any sort of audit container ID
> > > operation.
> >
> > "action" might work, but that's netfilter and numeric... "kind"?
> > Nothing else really seems to fit from a field name, type or lack of
> > searchability perspective.
> >
> > Steve, do you have an opinion?
>
> We only have 1 sample event where we have op=task. What are the other
> possible values?

For the AUDIT_CONTAINER record we have op= "task", "target" (from the
ptrace and signals patch), "tty".

For the AUDIT_CONTAINER_ID record we have "op=set".

> -Steve

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-22 21:07:44

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On 2018-07-22 16:55, Richard Guy Briggs wrote:
> On 2018-07-22 09:32, Steve Grubb wrote:
> > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > + * audit_log_contid - report container info
> > > > > + * @tsk: task to be recorded
> > > > > + * @context: task or local context for record
> > > > > + * @op: contid string description
> > > > > + */
> > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > + struct audit_context *context, char *op)
> > > > > +{
> > > > > + struct audit_buffer *ab;
> > > > > +
> > > > > + if (!audit_contid_set(tsk))
> > > > > + return 0;
> > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > > > + if (!ab)
> > > > > + return -ENOMEM;
> > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > + op, audit_get_contid(tsk));
> > > >
> > > > Can you explain your reason for including an "op" field in this record
> > > > type? I've been looking at the rest of the patches in this patchset
> > > > and it seems to be used more as an indicator of the record's
> > > > generating context rather than any sort of audit container ID
> > > > operation.
> > >
> > > "action" might work, but that's netfilter and numeric... "kind"?
> > > Nothing else really seems to fit from a field name, type or lack of
> > > searchability perspective.
> > >
> > > Steve, do you have an opinion?
> >
> > We only have 1 sample event where we have op=task. What are the other
> > possible values?
>
> For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> ptrace and signals patch), "tty".

Sorry, pressed "send" too quickly. Also "aux0x%x" (also from the
ptrace/signals patch), "net%u" (from the AUDIT_NETFILTER_PKT patch).

> For the AUDIT_CONTAINER_ID record we have "op=set".
>
> > -Steve
>
> - RGB

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-23 13:18:47

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Sat, Jul 21, 2018 at 4:32 PM Richard Guy Briggs <[email protected]> wrote:
> On 2018-07-20 18:13, Paul Moore wrote:
> > On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> > > Create a new audit record AUDIT_CONTAINER to document the audit
> > > container identifier of a process if it is present.
> > >
> > > Called from audit_log_exit(), syscalls are covered.
> > >
> > > A sample raw event:
> > > type=SYSCALL msg=audit(1519924845.499:257): arch=c000003e syscall=257 success=yes exit=3 a0=ffffff9c a1=56374e1cef30 a2=241 a3=1b6 items=2 ppid=606 pid=635 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=3 comm="bash" exe="/usr/bin/bash" subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key="tmpcontainerid"
> > > type=CWD msg=audit(1519924845.499:257): cwd="/root"
> > > type=PATH msg=audit(1519924845.499:257): item=0 name="/tmp/" inode=13863 dev=00:27 mode=041777 ouid=0 ogid=0 rdev=00:00 obj=system_u:object_r:tmp_t:s0 nametype= PARENT cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> > > type=PATH msg=audit(1519924845.499:257): item=1 name="/tmp/tmpcontainerid" inode=17729 dev=00:27 mode=0100644 ouid=0 ogid=0 rdev=00:00 obj=unconfined_u:object_r:user_tmp_t:s0 nametype=CREATE cap_fp=0000000000000000 cap_fi=0000000000000000 cap_fe=0 cap_fver=0
> > > type=PROCTITLE msg=audit(1519924845.499:257): proctitle=62617368002D6300736C65657020313B206563686F2074657374203E202F746D702F746D70636F6E7461696E65726964
> > > type=CONTAINER msg=audit(1519924845.499:257): op=task contid=123458
> > >
> > > See: https://github.com/linux-audit/audit-kernel/issues/90
> > > See: https://github.com/linux-audit/audit-userspace/issues/51
> > > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > include/linux/audit.h | 7 +++++++
> > > include/uapi/linux/audit.h | 1 +
> > > kernel/audit.c | 23 +++++++++++++++++++++++
> > > kernel/auditsc.c | 3 +++
> > > 4 files changed, 34 insertions(+)
> >
> > ...
> >
> > > --- a/include/uapi/linux/audit.h
> > > +++ b/include/uapi/linux/audit.h
> > > @@ -115,6 +115,7 @@
> > > #define AUDIT_REPLACE 1329 /* Replace auditd if this packet unanswerd */
> > > #define AUDIT_KERN_MODULE 1330 /* Kernel Module events */
> > > #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */
> > > +#define AUDIT_CONTAINER 1332 /* Container ID */
> >
> > I'm not sure I'm completely sold on the AUDIT_CONTAINER_ID and
> > AUDIT_CONTAINER record type names. From what I can tell
> > AUDIT_CONTAINER_ID seems to be used for audit container ID management
> > operations, e.g. setting the ID, whereas the AUDIT_CONTAINER is used
> > to tag events with the corresponding audit container ID. Assuming
> > that is correct, it seems like AUDIT_CONTAINER might be better served
> > if it was named AUDIT_CONTAINER_ID and if we could change
> > AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc. Thoughts?
>
> Please see discussion at:
> https://www.redhat.com/archives/linux-audit/2018-May/msg00101.html
>
> I'm fine with changing AUDIT_CONTAINER_ID to AUDIT_CONTAINER_OP/MGMT/etc.

Noted, and while I'm generally a big fan of consistency for things
like this, I think these things are different enough (the loginuid is
recorded as a field, the audit container ID is recorded in a dedicated
record) that we don't need to be bound by LOGINUID's naming
convention.

> > > #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */
> > > #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */
> > > diff --git a/kernel/audit.c b/kernel/audit.c
> > > index e7478cb..5e150c6 100644
> > > --- a/kernel/audit.c
> > > +++ b/kernel/audit.c
> > > @@ -2048,6 +2048,29 @@ void audit_log_session_info(struct audit_buffer *ab)
> > > audit_log_format(ab, " auid=%u ses=%u", auid, sessionid);
> > > }
> > >
> > > +/*
> > > + * audit_log_contid - report container info
> > > + * @tsk: task to be recorded
> > > + * @context: task or local context for record
> > > + * @op: contid string description
> > > + */
> > > +int audit_log_contid(struct task_struct *tsk,
> > > + struct audit_context *context, char *op)
> > > +{
> > > + struct audit_buffer *ab;
> > > +
> > > + if (!audit_contid_set(tsk))
> > > + return 0;
> > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > + if (!ab)
> > > + return -ENOMEM;
> > > + audit_log_format(ab, "op=%s contid=%llu",
> > > + op, audit_get_contid(tsk));
> >
> > Can you explain your reason for including an "op" field in this record
> > type? I've been looking at the rest of the patches in this patchset
> > and it seems to be used more as an indicator of the record's
> > generating context rather than any sort of audit container ID
> > operation.
>
> "action" might work, but that's netfilter and numeric... "kind"?
> Nothing else really seems to fit from a field name, type or lack of
> searchability perspective.

My concern isn't so much the name of the "op" field, although that
does seem wrong, but rather the existence of the field in the first
place. This audit container ID record (whatever we end up calling it)
exists to attach an audit container ID to an audit event, that's it;
an audit event should have other records which provide the context
(granted, the exact number of records depends on the event and the
system's configuration). If we are relying on this record to provide
critical information about the audit event other than the audit
container ID, I believe this is a strong indicator that the existing
audit records are lacking and should be augmented.

--
paul moore
http://www.paul-moore.com

2018-07-23 13:21:10

by Steve Grubb

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote:
> On 2018-07-22 09:32, Steve Grubb wrote:
> > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > + * audit_log_contid - report container info
> > > > > + * @tsk: task to be recorded
> > > > > + * @context: task or local context for record
> > > > > + * @op: contid string description
> > > > > + */
> > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > + struct audit_context *context, char
> > > > > *op)
> > > > > +{
> > > > > + struct audit_buffer *ab;
> > > > > +
> > > > > + if (!audit_contid_set(tsk))
> > > > > + return 0;
> > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > > > + if (!ab)
> > > > > + return -ENOMEM;
> > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > + op, audit_get_contid(tsk));
> > > >
> > > > Can you explain your reason for including an "op" field in this
> > > > record
> > > > type? I've been looking at the rest of the patches in this patchset
> > > > and it seems to be used more as an indicator of the record's
> > > > generating context rather than any sort of audit container ID
> > > > operation.
> > >
> > > "action" might work, but that's netfilter and numeric... "kind"?
> > > Nothing else really seems to fit from a field name, type or lack of
> > > searchability perspective.
> > >
> > > Steve, do you have an opinion?
> >
> > We only have 1 sample event where we have op=task. What are the other
> > possible values?
>
> For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> ptrace and signals patch), "tty".
>
> For the AUDIT_CONTAINER_ID record we have "op=set".

Since the purpose of this record is to log the container id, I think that is
all that is needed. We can get the context from the other records in the
event. I'd suggest dropping the "op" field.

-Steve



2018-07-23 15:16:22

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On 2018-07-23 09:19, Steve Grubb wrote:
> On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote:
> > On 2018-07-22 09:32, Steve Grubb wrote:
> > > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > > + * audit_log_contid - report container info
> > > > > > + * @tsk: task to be recorded
> > > > > > + * @context: task or local context for record
> > > > > > + * @op: contid string description
> > > > > > + */
> > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > + struct audit_context *context, char
> > > > > > *op)
> > > > > > +{
> > > > > > + struct audit_buffer *ab;
> > > > > > +
> > > > > > + if (!audit_contid_set(tsk))
> > > > > > + return 0;
> > > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_CONTAINER);
> > > > > > + if (!ab)
> > > > > > + return -ENOMEM;
> > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > + op, audit_get_contid(tsk));
> > > > >
> > > > > Can you explain your reason for including an "op" field in this
> > > > > record
> > > > > type? I've been looking at the rest of the patches in this patchset
> > > > > and it seems to be used more as an indicator of the record's
> > > > > generating context rather than any sort of audit container ID
> > > > > operation.
> > > >
> > > > "action" might work, but that's netfilter and numeric... "kind"?
> > > > Nothing else really seems to fit from a field name, type or lack of
> > > > searchability perspective.
> > > >
> > > > Steve, do you have an opinion?
> > >
> > > We only have 1 sample event where we have op=task. What are the other
> > > possible values?
> >
> > For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> > ptrace and signals patch), "tty".
> >
> > For the AUDIT_CONTAINER_ID record we have "op=set".
>
> Since the purpose of this record is to log the container id, I think that is
> all that is needed. We can get the context from the other records in the
> event. I'd suggest dropping the "op" field.

Ok, the information above it for two different audit container
identifier records. Which one should drop the "op=" field? Both? Or
just the AUDIT_CONTAINER record? The AUDIT_CONTAINER_ID record (which
might be renamed) could use it to distinguish a "set" record from a
dropped audit container identifier that is no longer registered by any
task or namespace.

> -Steve

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-23 16:50:28

by Steve Grubb

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Monday, July 23, 2018 11:11:48 AM EDT Richard Guy Briggs wrote:
> On 2018-07-23 09:19, Steve Grubb wrote:
> > On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote:
> > > On 2018-07-22 09:32, Steve Grubb wrote:
> > > > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > > > + * audit_log_contid - report container info
> > > > > > > + * @tsk: task to be recorded
> > > > > > > + * @context: task or local context for record
> > > > > > > + * @op: contid string description
> > > > > > > + */
> > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > + struct audit_context *context,
> > > > > > > char
> > > > > > > *op)
> > > > > > > +{
> > > > > > > + struct audit_buffer *ab;
> > > > > > > +
> > > > > > > + if (!audit_contid_set(tsk))
> > > > > > > + return 0;
> > > > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > > > + ab = audit_log_start(context, GFP_KERNEL,
> > > > > > > AUDIT_CONTAINER);
> > > > > > > + if (!ab)
> > > > > > > + return -ENOMEM;
> > > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > + op, audit_get_contid(tsk));
> > > > > >
> > > > > > Can you explain your reason for including an "op" field in this
> > > > > > record
> > > > > > type? I've been looking at the rest of the patches in this
> > > > > > patchset
> > > > > > and it seems to be used more as an indicator of the record's
> > > > > > generating context rather than any sort of audit container ID
> > > > > > operation.
> > > > >
> > > > > "action" might work, but that's netfilter and numeric... "kind"?
> > > > > Nothing else really seems to fit from a field name, type or lack of
> > > > > searchability perspective.
> > > > >
> > > > > Steve, do you have an opinion?
> > > >
> > > > We only have 1 sample event where we have op=task. What are the other
> > > > possible values?
> > >
> > > For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> > > ptrace and signals patch), "tty".
> > >
> > > For the AUDIT_CONTAINER_ID record we have "op=set".
> >
> > Since the purpose of this record is to log the container id, I think that
> > is all that is needed. We can get the context from the other records in
> > the event. I'd suggest dropping the "op" field.
>
> Ok, the information above it for two different audit container
> identifier records. Which one should drop the "op=" field? Both? Or
> just the AUDIT_CONTAINER record? The AUDIT_CONTAINER_ID record (which
> might be renamed) could use it to distinguish a "set" record from a
> dropped audit container identifier that is no longer registered by any
> task or namespace.

Neither of them need it. All they need to do is state the container that is
being acted upon.

-Steve




2018-07-23 18:32:32

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On Mon, Jul 23, 2018 at 12:48 PM Steve Grubb <[email protected]> wrote:
> On Monday, July 23, 2018 11:11:48 AM EDT Richard Guy Briggs wrote:
> > On 2018-07-23 09:19, Steve Grubb wrote:
> > > On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote:
> > > > On 2018-07-22 09:32, Steve Grubb wrote:
> > > > > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > > > > + * audit_log_contid - report container info
> > > > > > > > + * @tsk: task to be recorded
> > > > > > > > + * @context: task or local context for record
> > > > > > > > + * @op: contid string description
> > > > > > > > + */
> > > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > > + struct audit_context *context,
> > > > > > > > char
> > > > > > > > *op)
> > > > > > > > +{
> > > > > > > > + struct audit_buffer *ab;
> > > > > > > > +
> > > > > > > > + if (!audit_contid_set(tsk))
> > > > > > > > + return 0;
> > > > > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > > > > + ab = audit_log_start(context, GFP_KERNEL,
> > > > > > > > AUDIT_CONTAINER);
> > > > > > > > + if (!ab)
> > > > > > > > + return -ENOMEM;
> > > > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > > + op, audit_get_contid(tsk));
> > > > > > >
> > > > > > > Can you explain your reason for including an "op" field in this
> > > > > > > record
> > > > > > > type? I've been looking at the rest of the patches in this
> > > > > > > patchset
> > > > > > > and it seems to be used more as an indicator of the record's
> > > > > > > generating context rather than any sort of audit container ID
> > > > > > > operation.
> > > > > >
> > > > > > "action" might work, but that's netfilter and numeric... "kind"?
> > > > > > Nothing else really seems to fit from a field name, type or lack of
> > > > > > searchability perspective.
> > > > > >
> > > > > > Steve, do you have an opinion?
> > > > >
> > > > > We only have 1 sample event where we have op=task. What are the other
> > > > > possible values?
> > > >
> > > > For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> > > > ptrace and signals patch), "tty".
> > > >
> > > > For the AUDIT_CONTAINER_ID record we have "op=set".
> > >
> > > Since the purpose of this record is to log the container id, I think that
> > > is all that is needed. We can get the context from the other records in
> > > the event. I'd suggest dropping the "op" field.
> >
> > Ok, the information above it for two different audit container
> > identifier records. Which one should drop the "op=" field? Both? Or
> > just the AUDIT_CONTAINER record? The AUDIT_CONTAINER_ID record (which
> > might be renamed) could use it to distinguish a "set" record from a
> > dropped audit container identifier that is no longer registered by any
> > task or namespace.
>
> Neither of them need it. All they need to do is state the container that is
> being acted upon.

I think we should keep the "op" field for audit container ID
management operations, even though we really only have a "set"
operation at the moment, but the others should drop the "op" field
(see my previous emails in this thread).

--
paul moore
http://www.paul-moore.com

2018-07-24 14:07:22

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces

On 2018-07-20 18:14, Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> > Audit events could happen in a network namespace outside of a task
> > context due to packets received from the net that trigger an auditing
> > rule prior to being associated with a running task. The network
> > namespace could in use by multiple containers by association to the
> > tasks in that network namespace. We still want a way to attribute
> > these events to any potential containers. Keep a list per network
> > namespace to track these audit container identifiiers.
> >
> > Add/increment the audit container identifier on:
> > - initial setting of the audit container identifier via /proc
> > - clone/fork call that inherits an audit container identifier
> > - unshare call that inherits an audit container identifier
> > - setns call that inherits an audit container identifier
> > Delete/decrement the audit container identifier on:
> > - an inherited audit container identifier dropped when child set
> > - process exit
> > - unshare call that drops a net namespace
> > - setns call that drops a net namespace
> >
> > See: https://github.com/linux-audit/audit-kernel/issues/92
> > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > include/linux/audit.h | 23 ++++++++++++++++
> > kernel/audit.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > kernel/auditsc.c | 5 ++++
> > kernel/nsproxy.c | 4 +++
> > 4 files changed, 104 insertions(+)
>
> ...
>
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index 1e37abf..7e2e51c 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -87,6 +88,12 @@ struct audit_field {
> > u32 op;
> > };
> >
> > +struct audit_contid {
> > + struct list_head list;
> > + u64 id;
> > + refcount_t refcount;
> > +};
>
> Do we need to worry about locking the audit container ID list? Does
> the network namespace code (or some other namespace code) ensure that
> add/deletes are serialized?

Now that you mention it, I don't have any idea. I'll need to look into this.

> > @@ -156,6 +163,10 @@ extern void audit_log_task_info(struct audit_buffer *ab,
> > struct task_struct *tsk);
> > extern int audit_log_contid(struct audit_context *context,
> > char *op, u64 contid);
> > +extern struct list_head *audit_get_contid_list(const struct net *net);
> > +extern void audit_contid_add(struct net *net, u64 contid);
> > +extern void audit_contid_del(struct net *net, u64 contid);
>
> I wonder if we should change these function names to indicate that
> they are managing the netns/cid list? Right now there is no mention
> of networking other than the first parameter.
>
> Maybe audit_netns_contid_*()?

I was going to protest that they should be more generally named
functions to deal with namespaces rather than specifically network
namespaces, but each type of namespace will need its own accessor
functions since each type of namespace has a different namespace type
pointer. It is tempting to try to generalize it, but that could be an
excercise for the reader if another type of namespace needs this sort of
support.

> > +extern void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p);
> >
> > extern int audit_update_lsm_rules(void);
> >
> > @@ -209,6 +220,18 @@ static inline void audit_log_task_info(struct audit_buffer *ab,
> > static inline int audit_log_contid(struct audit_context *context,
> > char *op, u64 contid)
> > { }
> > +static inline struct list_head *audit_get_contid_list(const struct net *net)
> > +{
> > + static LIST_HEAD(list);
> > + return &list;
> > +}
>
> Why can't we just return NULL here like a normal dummy function? It's
> only ever used inside audit. Actually, why is this even in
> include/linux/audit.h, couldn't we put it in kernel/audit.h or even
> just make it a static in audit.c?

You are right, static in kernel/audit.c is sufficient.

> > +static inline void audit_contid_add(struct net *net, u64 contid)
> > +{ }
> > +static inline void audit_contid_del(struct net *net, u64 contid)
> > +{ }
> > +static inline void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> > +{ }
> > +
> > #define audit_enabled 0
> > #endif /* CONFIG_AUDIT */
> >
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index ba304a8..ecd2de4 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -106,6 +106,7 @@
> > */
> > struct audit_net {
> > struct sock *sk;
> > + struct list_head contid_list;
> > };
> >
> > /**
> > @@ -311,6 +312,76 @@ static struct sock *audit_get_sk(const struct net *net)
> > return aunet->sk;
> > }
> >
> > +/**
> > + * audit_get_contid_list - Return the audit container ID list for the given network namespace
> > + * @net: the destination network namespace
> > + *
> > + * Description:
> > + * Returns the list pointer if valid, NULL otherwise. The caller must ensure
> > + * that a reference is held for the network namespace while the sock is in use.
> > + */
> > +struct list_head *audit_get_contid_list(const struct net *net)
> > +{
> > + struct audit_net *aunet = net_generic(net, audit_net_id);
> > +
> > + return &aunet->contid_list;
> > +}
> > +
> > +void audit_contid_add(struct net *net, u64 contid)
> > +{
> > + struct list_head *contid_list = audit_get_contid_list(net);
> > + struct audit_contid *cont;
> > +
> > + if (!cid_valid(contid))
> > + return;
> > + if (!list_empty(contid_list))
> > + list_for_each_entry(cont, contid_list, list)
> > + if (cont->id == contid) {
> > + refcount_inc(&cont->refcount);
> > + return;
> > + }
>
> <thinking out loud>I think this is fine for right now, but we may need
> to be a bit clever about how we store the IDs - walking an unsorted
> list with lots of entries may prove to be too painful.</thinking out
> loud>

Ok, agreed, it may want a hash array list... That should be a
straightforward optimization later.

> > + cont = kmalloc(sizeof(struct audit_contid), GFP_KERNEL);
> > + if (!cont)
> > + return;
> > + INIT_LIST_HEAD(&cont->list);
> > + cont->id = contid;
> > + refcount_set(&cont->refcount, 1);
> > + list_add(&cont->list, contid_list);
> > +}
> > +
> > +void audit_contid_del(struct net *net, u64 contid)
> > +{
> > + struct list_head *contid_list = audit_get_contid_list(net);
> > + struct audit_contid *cont = NULL;
> > + int found = 0;
> > +
> > + if (!cid_valid(contid))
> > + return;
> > + if (!list_empty(contid_list))
> > + list_for_each_entry(cont, contid_list, list)
> > + if (cont->id == contid) {
> > + found = 1;
> > + break;
>
> You don't really need the found variable, you can just move all of the
> work you need to do up into the if statement and return from inside
> the if statement.

Fine, sure.

> > + }
> > + if (!found)
> > + return;
> > + list_del(&cont->list);
> > + if (refcount_dec_and_test(&cont->refcount))
> > + kfree(cont);
>
> Don't you want to dec_and_test first and only remove it from the list
> if there are no other references?

I don't think so. Let me try to describe it in prose to see if I
understood this properly and see if this makes more sense: I want to
remove this audit_contid list member from this net's audit_contid list
and decrement unconditionally this member's refcount so it knows there
is one less thing pointing at it and when there is no longer anything
pointing at it, free it.

> > +}
> >
> > +void audit_switch_task_namespaces(struct nsproxy *ns, struct task_struct *p)
> > +{
> > + u64 contid = audit_get_contid(p);
> > + struct nsproxy *new = p->nsproxy;
> > +
> > + if (!cid_valid(contid))
> > + return;
> > + audit_contid_del(ns->net_ns, contid);
> > + if (new)
> > + audit_contid_add(new->net_ns, contid);
> > +}
> > +
> > void audit_panic(const char *message)
> > {
> > switch (audit_failure) {
> > @@ -1550,6 +1621,7 @@ static int __net_init audit_net_init(struct net *net)
> > return -ENOMEM;
> > }
> > aunet->sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
> > + INIT_LIST_HEAD(&aunet->contid_list);
> >
> > return 0;
> > }
> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index ea1ee35..6ab5e5e 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -75,6 +75,7 @@
> > #include <linux/uaccess.h>
> > #include <linux/fsnotify_backend.h>
> > #include <uapi/linux/limits.h>
> > +#include <net/net_namespace.h>
> >
> > #include "audit.h"
> >
> > @@ -2165,6 +2166,7 @@ int audit_set_contid(struct task_struct *task, u64 contid)
> > uid_t uid;
> > struct tty_struct *tty;
> > char comm[sizeof(current->comm)];
> > + struct net *net = task->nsproxy->net_ns;
> >
> > /* Can't set if audit disabled */
> > if (!task->audit)
> > @@ -2185,10 +2187,13 @@ int audit_set_contid(struct task_struct *task, u64 contid)
> > else if (cid_valid(oldcontid) && !task->audit->inherited)
> > rc = -EEXIST;
> > if (!rc) {
> > + if (cid_valid(oldcontid))
> > + audit_contid_del(net, oldcontid);
> > task_lock(task);
> > task->audit->contid = contid;
> > task->audit->inherited = false;
> > task_unlock(task);
> > + audit_contid_add(net, contid);
> > }
> >
> > if (!audit_enabled)
> > diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
> > index f6c5d33..dcb69fe 100644
> > --- a/kernel/nsproxy.c
> > +++ b/kernel/nsproxy.c
> > @@ -27,6 +27,7 @@
> > #include <linux/syscalls.h>
> > #include <linux/cgroup.h>
> > #include <linux/perf_event.h>
> > +#include <linux/audit.h>
> >
> > static struct kmem_cache *nsproxy_cachep;
> >
> > @@ -140,6 +141,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
> > struct nsproxy *old_ns = tsk->nsproxy;
> > struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns);
> > struct nsproxy *new_ns;
> > + u64 contid = audit_get_contid(tsk);
> >
> > if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
> > CLONE_NEWPID | CLONE_NEWNET |
> > @@ -167,6 +169,7 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
> > return PTR_ERR(new_ns);
> >
> > tsk->nsproxy = new_ns;
> > + audit_contid_add(new_ns->net_ns, contid);
> > return 0;
> > }
> >
> > @@ -224,6 +227,7 @@ void switch_task_namespaces(struct task_struct *p, struct nsproxy *new)
> > ns = p->nsproxy;
> > p->nsproxy = new;
> > task_unlock(p);
> > + audit_switch_task_namespaces(ns, p);
> >
> > if (ns && atomic_dec_and_test(&ns->count))
> > free_nsproxy(ns);
> > --
> > 1.8.3.1
> >
> > --
> > Linux-audit mailing list
> > [email protected]
> > https://www.redhat.com/mailman/listinfo/linux-audit
>
> --
> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-24 14:11:24

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 05/10] audit: add containerid support for tty_audit

On 2018-07-20 18:14, Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:04 PM Richard Guy Briggs <[email protected]> wrote:
> > Add audit container identifier auxiliary record to tty logging rule
> > event standalone records.
> >
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > drivers/tty/tty_audit.c | 5 ++++-
> > 1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c
> > index e30aa6b..66bd850 100644
> > --- a/drivers/tty/tty_audit.c
> > +++ b/drivers/tty/tty_audit.c
> > @@ -66,8 +66,9 @@ static void tty_audit_log(const char *description, dev_t dev,
> > uid_t uid = from_kuid(&init_user_ns, task_uid(tsk));
> > uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk));
> > unsigned int sessionid = audit_get_sessionid(tsk);
> > + struct audit_context *context = audit_alloc_local();
>
> We should be using current's audit_context in tty_audit_log().
> Actually, we should probably just get rid of the tsk variable in
> tty_audit_log() and use current directly to make things a bit more
> obvious.

Ok, agreed. At this point, it it current passed in anyways so no harm
other than efficiency.

> <time passes>
>
> I did some digging and I have a two year old, half-baked patch that
> cleans up this tsk/current usage as well as a few others. I just
> rebased it against audit/next and surprisingly it seems to pass a
> basic smoke test (kernel boots and audit-testsuite passes); I'll post
> it to the list as a RFC once I'm done reviewing these patches.

I'll leave this patch the way it is since there should be no difference
and trust this other patch will work its way through the system and
solve that.

> > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY);
> > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_TTY);
> > if (ab) {
> > char name[sizeof(tsk->comm)];
> >
> > @@ -80,6 +81,8 @@ static void tty_audit_log(const char *description, dev_t dev,
> > audit_log_n_hex(ab, data, size);
> > audit_log_end(ab);
> > }
> > + audit_log_contid(context, "tty", audit_get_contid(tsk));
> > + audit_free_context(context);
> > }
>
> --
> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-24 19:10:27

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

On 2018-07-20 18:13, Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> > Implement the proc fs write to set the audit container identifier of a
> > process, emitting an AUDIT_CONTAINER_ID record to document the event.
> >
> > This is a write from the container orchestrator task to a proc entry of
> > the form /proc/PID/audit_containerid where PID is the process ID of the
> > newly created task that is to become the first task in a container, or
> > an additional task added to a container.
> >
> > The write expects up to a u64 value (unset: 18446744073709551615).
> >
> > The writer must have capability CAP_AUDIT_CONTROL.
> >
> > This will produce a record such as this:
> > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes
> >
> > The "op" field indicates an initial set. The "pid" to "ses" fields are
> > the orchestrator while the "opid" field is the object's PID, the process
> > being "contained". Old and new audit container identifier values are
> > given in the "contid" fields, while res indicates its success.
> >
> > It is not permitted to unset or re-set the audit container identifier.
> > A child inherits its parent's audit container identifier, but then can
> > be set only once after.
> >
> > See: https://github.com/linux-audit/audit-kernel/issues/90
> > See: https://github.com/linux-audit/audit-userspace/issues/51
> > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> >
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > fs/proc/base.c | 37 ++++++++++++++++++++++++
> > include/linux/audit.h | 25 ++++++++++++++++
> > include/uapi/linux/audit.h | 2 ++
> > kernel/auditsc.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++
> > 4 files changed, 135 insertions(+)
>
> ...
>
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -606,6 +621,16 @@ static inline bool audit_loginuid_set(struct task_struct *tsk)
> > return uid_valid(audit_get_loginuid(tsk));
> > }
> >
> > +static inline bool cid_valid(u64 contid)
> > +{
> > + return contid != AUDIT_CID_UNSET;
> > +}
> > +
> > +static inline bool audit_contid_set(struct task_struct *tsk)
> > +{
> > + return cid_valid(audit_get_contid(tsk));
> > +}
>
> For the sake of consistency I think we should rename cid_valid() to
> audit_contid_valid().

Ok.

> > diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> > index 59ef7a81..611e926 100644
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -956,6 +956,8 @@ int audit_alloc(struct task_struct *tsk)
> > return -ENOMEM;
> > info->loginuid = audit_get_loginuid(current);
> > info->sessionid = audit_get_sessionid(current);
> > + info->contid = audit_get_contid(current);
> > + info->inherited = true;
>
> First see my others comments in this patch about inheritence, but if
> we decide that flagging inherited values is important we should
> probably rename the "inherited" field to indicate that it applies to
> just the "contid" field.

Ok.

> > tsk->audit = info;
> >
> > if (likely(!audit_ever_enabled))
> > @@ -985,6 +987,8 @@ int audit_alloc(struct task_struct *tsk)
> > struct audit_task_info init_struct_audit = {
> > .loginuid = INVALID_UID,
> > .sessionid = AUDIT_SID_UNSET,
> > + .contid = AUDIT_CID_UNSET,
> > + .inherited = true,
> > .ctx = NULL,
> > };
> >
> > @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> > }
> >
> > /**
> > + * audit_set_contid - set current task's audit_context contid
> > + * @contid: contid value
> > + *
> > + * Returns 0 on success, -EPERM on permission failure.
> > + *
> > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > + */
> > +int audit_set_contid(struct task_struct *task, u64 contid)
> > +{
> > + u64 oldcontid;
> > + int rc = 0;
> > + struct audit_buffer *ab;
> > + uid_t uid;
> > + struct tty_struct *tty;
> > + char comm[sizeof(current->comm)];
> > +
> > + /* Can't set if audit disabled */
> > + if (!task->audit)
> > + return -ENOPROTOOPT;
> > + oldcontid = audit_get_contid(task);
> > + /* Don't allow the audit containerid to be unset */
> > + if (!cid_valid(contid))
> > + rc = -EINVAL;
> > + /* if we don't have caps, reject */
> > + else if (!capable(CAP_AUDIT_CONTROL))
> > + rc = -EPERM;
> > + /* if task has children or is not single-threaded, deny */
> > + else if (!list_empty(&task->children))
> > + rc = -EBUSY;
>
> Is this safe without holding tasklist_lock? I worry we might be
> vulnerable to a race with fork().
>
> > + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > + rc = -EALREADY;
>
> Similar concern here as well, although related to threads.

I think you are correct here and tasklist_lock should cover both. Do we
also want rcu_read_lock() immediately preceeding it?

> > + /* it is already set, and not inherited from the parent, reject */
> > + else if (cid_valid(oldcontid) && !task->audit->inherited)
> > + rc = -EEXIST;
>
> Maybe I'm missing something, but why do we care about preventing
> reassigning the audit container ID in this case? The task is single
> threaded and has no descendants at this point so it should be safe,
> yes? So long as the task changing the audit container ID has
> capable(CAP_AUDIT_CONTOL) it shouldn't matter, right?

Because we hammered out this idea 6 months ago in the design phase and I
thought we all firmly agreed that the audit container identifier could
only be set once. Has any significant discussion happenned since then
to change that wisdom? I just wonder why this is coming up now.

> Related, I'm questioning if we would ever care if the audit container
> ID was inherited or not?

We do since that is the only way we can tell if the value has been set
once already or inherited unless we check if the parent's audit
container identifier is identical (which tells us it was inherited).

> > + if (!rc) {
> > + task_lock(task);
> > + task->audit->contid = contid;
> > + task->audit->inherited = false;
> > + task_unlock(task);
>
> I suspect the task_lock() may not be what we want here, but if we are
> using task_lock() to protect the audit fields two things come to mind:
>
> 1. We should update the header comments for task_lock() in task.h to
> indicate that it also protects ->audit.

Fair enough.

> 2. Where else do we need to worry about taking this lock? At the very
> least we should take this lock near the top of this function before we
> check task->audit and not drop it until after we have set it, or
> failed the operation for one of the reasons above.

Agreed, since another process on another CPU could race attempting this
same operation. However, the task_lock() comment precludes using it
with write_lock_irq(&task_lock) that might be required above.

> > + }
> > +
> > + if (!audit_enabled)
> > + return rc;
> > +
> > + ab = audit_log_start(audit_context(), GFP_KERNEL, AUDIT_CONTAINER_ID);
> > + if (!ab)
> > + return rc;
> > +
> > + uid = from_kuid(&init_user_ns, task_uid(current));
> > + tty = audit_get_tty(current);
> > + audit_log_format(ab, "op=set opid=%d old-contid=%llu contid=%llu pid=%d uid=%u auid=%u tty=%s ses=%u",
> > + task_tgid_nr(task), oldcontid, contid,
> > + task_tgid_nr(current), uid
> > + from_kuid(&init_user_ns, audit_get_loginuid(current)),
> > + tty ? tty_name(tty) : "(none)",
> > + audit_get_sessionid(current));
> > + audit_put_tty(tty);
> > + audit_log_task_context(ab);
> > + audit_log_format(ab, " comm=");
> > + audit_log_untrustedstring(ab, get_task_comm(comm, current));
> > + audit_log_d_path_exe(ab, current->mm);
> > + audit_log_format(ab, " res=%d", !rc);
> > + audit_log_end(ab);
> > + return rc;
> > +}
> > +
> > +/**
> > * __audit_mq_open - record audit data for a POSIX MQ open
> > * @oflag: open flag
> > * @mode: mode bits
>
> --
> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-24 19:41:52

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records

On 2018-07-20 18:14, Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs <[email protected]> wrote:
> > Standalone audit records have the timestamp and serial number generated
> > on the fly and as such are unique, making them standalone. This new
> > function audit_alloc_local() generates a local audit context that will
> > be used only for a standalone record and its auxiliary record(s). The
> > context is discarded immediately after the local associated records are
> > produced.
> >
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > include/linux/audit.h | 8 ++++++++
> > kernel/auditsc.c | 25 +++++++++++++++++++++++--
> > 2 files changed, 31 insertions(+), 2 deletions(-)
>
> ...
>
> > --- a/kernel/auditsc.c
> > +++ b/kernel/auditsc.c
> > @@ -916,8 +916,11 @@ static inline void audit_free_aux(struct audit_context *context)
> > static inline struct audit_context *audit_alloc_context(enum audit_state state)
> > {
> > struct audit_context *context;
> > + gfp_t gfpflags;
> >
> > - context = kzalloc(sizeof(*context), GFP_KERNEL);
> > + /* We can be called in atomic context via audit_tg() */
> > + gfpflags = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL;
>
> Instead of trying to guess the proper gfp flags, and likely getting it
> wrong at some point (the in_atomic() comment in preempt.h don't give
> me the warm fuzzies), why not pass the gfp flags as an argument?
>
> Right now it looks like we would only have two callers: audit_alloc()
> and audit_audit_local(). The audit_alloc() invocation would simply
> pass GFP_KERNEL and we could allow the audit_alloc_local() callers to
> specify the gfp flags when calling audit_alloc_local() (although I
> suspect that will always be GFP_ATOMIC since we should only be calling
> audit_alloc_local() from interrupt-like context, in all other cases we
> should use the audit_context from the current task_struct.

Ok, I'll explicitly pass it in.

> > + context = kzalloc(sizeof(*context), gfpflags);
> > if (!context)
> > return NULL;
> > context->state = state;
> > @@ -993,8 +996,26 @@ struct audit_task_info init_struct_audit = {
> > .ctx = NULL,
> > };
> >
> > -static inline void audit_free_context(struct audit_context *context)
> > +struct audit_context *audit_alloc_local(void)
> > {
>
> Let's see where this goes, but we may want to rename this slightly to
> indicate that this should only be called from interrupt context when
> we can't rely on current's audit_context. Bonus points if we can find
> a way to enforce this with a WARN() assertion so we can better catch
> abuse.

I'll see what I can come up with.

> > + struct audit_context *context;
> > +
> > + if (!audit_ever_enabled)
> > + return NULL; /* Return if not auditing. */
> > +
> > + context = audit_alloc_context(AUDIT_RECORD_CONTEXT);
> > + if (!context)
> > + return NULL;
> > + context->serial = audit_serial();
> > + context->ctime = current_kernel_time64();
> > + context->in_syscall = 1;
>
> Setting in_syscall is both interesting and a bit troubling, if for no
> other reason than I expect most (all?) callers to be in an interrupt
> context when audit_alloc_local() is called. Setting in_syscall would
> appear to be conceptually in this case. Can you help explain why this
> is the right thing to do, or necessary to ensure things are handled
> correctly?

I'll admit this is cheating a bit, but seemed harmless. It is needed so
that auditsc_get_stamp() from audit_get_stamp() from audit_log_start()
doesn't bail on me without giving me its already assigned time and
serial values rather than generating a new one. I did look to see if
there were any other undesireable side effects and found none, so I'm
tmepted to rename the ->in_syscall to something a bit more helpful. I
could add a new audit_context structure member to make
auditsc_get_stamp() co-operative, but this seems wasteful and
unnecessary.

> Looking quickly at the audit code, it seems to only be used on record
> and/or syscall termination to end things properly as well as in some
> of the PATH record code paths to limit filename collection to actual
> syscalls. However, this was just a quick look so I could be missing
> some important points.
>
> > + return context;
> > +}
> > +
> > +void audit_free_context(struct audit_context *context)
> > +{
> > + if (!context)
> > + return;
> > audit_free_names(context);
> > unroll_tree_refs(context, NULL, 0);
> > free_tree_refs(context);
> > --
> > 1.8.3.1
> >
> > --
> > Linux-audit mailing list
> > [email protected]
> > https://www.redhat.com/mailman/listinfo/linux-audit
>
> --
> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-24 19:49:28

by Steve Grubb

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS

On Friday, July 20, 2018 6:15:00 PM EDT Paul Moore wrote:
> On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> > Add audit container identifier auxiliary record(s) to NETFILTER_PKT
> > event standalone records. Iterate through all potential audit container
> > identifiers associated with a network namespace.
> >
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > include/linux/audit.h | 5 +++++
> > kernel/audit.c | 20 +++++++++++++++++++-
> > kernel/auditsc.c | 2 ++
> > net/netfilter/xt_AUDIT.c | 12 ++++++++++--
> > 4 files changed, 36 insertions(+), 3 deletions(-)
>
> ...
>
> > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > index 7e2e51c..4560a4e 100644
> > --- a/include/linux/audit.h
> > +++ b/include/linux/audit.h
> > @@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context
> > *context, extern void audit_contid_add(struct net *net, u64 contid);
> > extern void audit_contid_del(struct net *net, u64 contid);
> > extern void audit_switch_task_namespaces(struct nsproxy *ns, struct
> > task_struct *p); +extern void audit_log_contid_list(struct net *net,
> > + struct audit_context *context);
>
> See my comment in previous patches about changing the function name to
> better indicate it's dedicate use for network namespaces.
>
> > extern int audit_update_lsm_rules(void);
> >
> > @@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net,
> > u64 contid) { }
> > static inline void audit_switch_task_namespaces(struct nsproxy *ns,
> > struct task_struct *p) { }
> > +static inline void audit_log_contid_list(struct net *net,
> > + struct audit_context *context)
> > +{ }
> >
> > #define audit_enabled 0
> > #endif /* CONFIG_AUDIT */
> > diff --git a/kernel/audit.c b/kernel/audit.c
> > index ecd2de4..8cca41a 100644
> > --- a/kernel/audit.c
> > +++ b/kernel/audit.c
> > @@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy
> > *ns, struct task_struct *p) audit_contid_add(new->net_ns, contid);
> > }
> >
> > +void audit_log_contid_list(struct net *net, struct audit_context
> > *context) +{
> > + struct audit_contid *cont;
> > + int i = 0;
> > +
> > + list_for_each_entry(cont, audit_get_contid_list(net), list) {
> > + char buf[14];
> > +
> > + sprintf(buf, "net%u", i++);
> > + audit_log_contid(context, buf, cont->id);
>
> Hmm. It looks like this will generate multiple audit container ID
> records with "op=netX contid=Y" (X=netns number, Y=audit container
> ID), is that what we want? I've mentioned my concern around the "op"
> values in these records earlier in the patchset, that still applies
> here, but now I'm also concerned about the multiple records. I'm
> thinking we might be better served with a single record with either
> multiple "contid" fields, or a single "contid" field with a set of
> comma separated values (or some other delimiter that Steve's tools
> will tolerate).
>
> Steve, thoughts?

A single record is best. Maybe pattern this after the args listed in an
execve record.

-Steve




2018-07-24 20:23:24

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS

On Tue, Jul 24, 2018 at 3:48 PM Steve Grubb <[email protected]> wrote:
> On Friday, July 20, 2018 6:15:00 PM EDT Paul Moore wrote:
> > On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> > > Add audit container identifier auxiliary record(s) to NETFILTER_PKT
> > > event standalone records. Iterate through all potential audit container
> > > identifiers associated with a network namespace.
> > >
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > include/linux/audit.h | 5 +++++
> > > kernel/audit.c | 20 +++++++++++++++++++-
> > > kernel/auditsc.c | 2 ++
> > > net/netfilter/xt_AUDIT.c | 12 ++++++++++--
> > > 4 files changed, 36 insertions(+), 3 deletions(-)
> >
> > ...
> >
> > > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > > index 7e2e51c..4560a4e 100644
> > > --- a/include/linux/audit.h
> > > +++ b/include/linux/audit.h
> > > @@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context
> > > *context, extern void audit_contid_add(struct net *net, u64 contid);
> > > extern void audit_contid_del(struct net *net, u64 contid);
> > > extern void audit_switch_task_namespaces(struct nsproxy *ns, struct
> > > task_struct *p); +extern void audit_log_contid_list(struct net *net,
> > > + struct audit_context *context);
> >
> > See my comment in previous patches about changing the function name to
> > better indicate it's dedicate use for network namespaces.
> >
> > > extern int audit_update_lsm_rules(void);
> > >
> > > @@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net,
> > > u64 contid) { }
> > > static inline void audit_switch_task_namespaces(struct nsproxy *ns,
> > > struct task_struct *p) { }
> > > +static inline void audit_log_contid_list(struct net *net,
> > > + struct audit_context *context)
> > > +{ }
> > >
> > > #define audit_enabled 0
> > > #endif /* CONFIG_AUDIT */
> > > diff --git a/kernel/audit.c b/kernel/audit.c
> > > index ecd2de4..8cca41a 100644
> > > --- a/kernel/audit.c
> > > +++ b/kernel/audit.c
> > > @@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy
> > > *ns, struct task_struct *p) audit_contid_add(new->net_ns, contid);
> > > }
> > >
> > > +void audit_log_contid_list(struct net *net, struct audit_context
> > > *context) +{
> > > + struct audit_contid *cont;
> > > + int i = 0;
> > > +
> > > + list_for_each_entry(cont, audit_get_contid_list(net), list) {
> > > + char buf[14];
> > > +
> > > + sprintf(buf, "net%u", i++);
> > > + audit_log_contid(context, buf, cont->id);
> >
> > Hmm. It looks like this will generate multiple audit container ID
> > records with "op=netX contid=Y" (X=netns number, Y=audit container
> > ID), is that what we want? I've mentioned my concern around the "op"
> > values in these records earlier in the patchset, that still applies
> > here, but now I'm also concerned about the multiple records. I'm
> > thinking we might be better served with a single record with either
> > multiple "contid" fields, or a single "contid" field with a set of
> > comma separated values (or some other delimiter that Steve's tools
> > will tolerate).
> >
> > Steve, thoughts?
>
> A single record is best. Maybe pattern this after the args listed in an
> execve record.

I'm concerned that an execve-like approach might not scale very well
as would could potentially have a lot of containers sharing a single
network namespace ("a%d=%d" vs ",%d"). Further, with execve we log
the argument position in addition to the argument itself, that isn't
something we need to worry about with the audit container IDs.

--
paul moore
http://www.paul-moore.com

2018-07-24 20:36:01

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces

On Tue, Jul 24, 2018 at 10:06 AM Richard Guy Briggs <[email protected]> wrote:
> On 2018-07-20 18:14, Paul Moore wrote:
> > On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> > > Audit events could happen in a network namespace outside of a task
> > > context due to packets received from the net that trigger an auditing
> > > rule prior to being associated with a running task. The network
> > > namespace could in use by multiple containers by association to the
> > > tasks in that network namespace. We still want a way to attribute
> > > these events to any potential containers. Keep a list per network
> > > namespace to track these audit container identifiiers.
> > >
> > > Add/increment the audit container identifier on:
> > > - initial setting of the audit container identifier via /proc
> > > - clone/fork call that inherits an audit container identifier
> > > - unshare call that inherits an audit container identifier
> > > - setns call that inherits an audit container identifier
> > > Delete/decrement the audit container identifier on:
> > > - an inherited audit container identifier dropped when child set
> > > - process exit
> > > - unshare call that drops a net namespace
> > > - setns call that drops a net namespace
> > >
> > > See: https://github.com/linux-audit/audit-kernel/issues/92
> > > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > include/linux/audit.h | 23 ++++++++++++++++
> > > kernel/audit.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > kernel/auditsc.c | 5 ++++
> > > kernel/nsproxy.c | 4 +++
> > > 4 files changed, 104 insertions(+)

...

> > > + }
> > > + if (!found)
> > > + return;
> > > + list_del(&cont->list);
> > > + if (refcount_dec_and_test(&cont->refcount))
> > > + kfree(cont);
> >
> > Don't you want to dec_and_test first and only remove it from the list
> > if there are no other references?
>
> I don't think so. Let me try to describe it in prose to see if I
> understood this properly and see if this makes more sense: I want to
> remove this audit_contid list member from this net's audit_contid list
> and decrement unconditionally this member's refcount so it knows there
> is one less thing pointing at it and when there is no longer anything
> pointing at it, free it.

Yep, sorry, my mistake, I was thinking the other way around (netns
going away) ... which actually, this patchset doesn't handle that does
it (I don't see any new code in audit_net_exit())? Is is in a later
patch? If so, it really should be in the same patch as this code to
prevent bisect nasties.

--
paul moore
http://www.paul-moore.com

2018-07-24 20:38:18

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 05/10] audit: add containerid support for tty_audit

On Tue, Jul 24, 2018 at 10:10 AM Richard Guy Briggs <[email protected]> wrote:
> On 2018-07-20 18:14, Paul Moore wrote:
> > On Wed, Jun 6, 2018 at 1:04 PM Richard Guy Briggs <[email protected]> wrote:
> > > Add audit container identifier auxiliary record to tty logging rule
> > > event standalone records.
> > >
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > drivers/tty/tty_audit.c | 5 ++++-
> > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c
> > > index e30aa6b..66bd850 100644
> > > --- a/drivers/tty/tty_audit.c
> > > +++ b/drivers/tty/tty_audit.c
> > > @@ -66,8 +66,9 @@ static void tty_audit_log(const char *description, dev_t dev,
> > > uid_t uid = from_kuid(&init_user_ns, task_uid(tsk));
> > > uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk));
> > > unsigned int sessionid = audit_get_sessionid(tsk);
> > > + struct audit_context *context = audit_alloc_local();
> >
> > We should be using current's audit_context in tty_audit_log().
> > Actually, we should probably just get rid of the tsk variable in
> > tty_audit_log() and use current directly to make things a bit more
> > obvious.
>
> Ok, agreed. At this point, it it current passed in anyways so no harm
> other than efficiency.
>
> > <time passes>
> >
> > I did some digging and I have a two year old, half-baked patch that
> > cleans up this tsk/current usage as well as a few others. I just
> > rebased it against audit/next and surprisingly it seems to pass a
> > basic smoke test (kernel boots and audit-testsuite passes); I'll post
> > it to the list as a RFC once I'm done reviewing these patches.
>
> I'll leave this patch the way it is since there should be no difference
> and trust this other patch will work its way through the system and
> solve that.

Yep, that's a merge issue I'll deal with when we get to that point.
Although, I would expect that when you post an updated patchset that
it applies cleanly to the then-current audit/next tree (or Linus'
tree, but audit/next is preferable). In other words, please rebase
your development branch before doing your final dev testing and
posting.

> > > - ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_TTY);
> > > + ab = audit_log_start(context, GFP_KERNEL, AUDIT_TTY);
> > > if (ab) {
> > > char name[sizeof(tsk->comm)];
> > >
> > > @@ -80,6 +81,8 @@ static void tty_audit_log(const char *description, dev_t dev,
> > > audit_log_n_hex(ab, data, size);
> > > audit_log_end(ab);
> > > }
> > > + audit_log_contid(context, "tty", audit_get_contid(tsk));
> > > + audit_free_context(context);
> > > }
> >
> > --
> > paul moore
> > http://www.paul-moore.com
>
> - RGB
>
> --
> Richard Guy Briggs <[email protected]>
> Sr. S/W Engineer, Kernel Security, Base Operating Systems
> Remote, Ottawa, Red Hat Canada
> IRC: rgb, SunRaycer
> Voice: +1.647.777.2635, Internal: (81) 32635

--
paul moore
http://www.paul-moore.com

2018-07-24 20:59:58

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 08/10] audit: NETFILTER_PKT: record each container ID associated with a netNS

On 2018-07-24 16:22, Paul Moore wrote:
> On Tue, Jul 24, 2018 at 3:48 PM Steve Grubb <[email protected]> wrote:
> > On Friday, July 20, 2018 6:15:00 PM EDT Paul Moore wrote:
> > > On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> > > > Add audit container identifier auxiliary record(s) to NETFILTER_PKT
> > > > event standalone records. Iterate through all potential audit container
> > > > identifiers associated with a network namespace.
> > > >
> > > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > > ---
> > > > include/linux/audit.h | 5 +++++
> > > > kernel/audit.c | 20 +++++++++++++++++++-
> > > > kernel/auditsc.c | 2 ++
> > > > net/netfilter/xt_AUDIT.c | 12 ++++++++++--
> > > > 4 files changed, 36 insertions(+), 3 deletions(-)
> > >
> > > ...
> > >
> > > > diff --git a/include/linux/audit.h b/include/linux/audit.h
> > > > index 7e2e51c..4560a4e 100644
> > > > --- a/include/linux/audit.h
> > > > +++ b/include/linux/audit.h
> > > > @@ -167,6 +167,8 @@ extern int audit_log_contid(struct audit_context
> > > > *context, extern void audit_contid_add(struct net *net, u64 contid);
> > > > extern void audit_contid_del(struct net *net, u64 contid);
> > > > extern void audit_switch_task_namespaces(struct nsproxy *ns, struct
> > > > task_struct *p); +extern void audit_log_contid_list(struct net *net,
> > > > + struct audit_context *context);
> > >
> > > See my comment in previous patches about changing the function name to
> > > better indicate it's dedicate use for network namespaces.
> > >
> > > > extern int audit_update_lsm_rules(void);
> > > >
> > > > @@ -231,6 +233,9 @@ static inline void audit_contid_del(struct net *net,
> > > > u64 contid) { }
> > > > static inline void audit_switch_task_namespaces(struct nsproxy *ns,
> > > > struct task_struct *p) { }
> > > > +static inline void audit_log_contid_list(struct net *net,
> > > > + struct audit_context *context)
> > > > +{ }
> > > >
> > > > #define audit_enabled 0
> > > > #endif /* CONFIG_AUDIT */
> > > > diff --git a/kernel/audit.c b/kernel/audit.c
> > > > index ecd2de4..8cca41a 100644
> > > > --- a/kernel/audit.c
> > > > +++ b/kernel/audit.c
> > > > @@ -382,6 +382,20 @@ void audit_switch_task_namespaces(struct nsproxy
> > > > *ns, struct task_struct *p) audit_contid_add(new->net_ns, contid);
> > > > }
> > > >
> > > > +void audit_log_contid_list(struct net *net, struct audit_context
> > > > *context) +{
> > > > + struct audit_contid *cont;
> > > > + int i = 0;
> > > > +
> > > > + list_for_each_entry(cont, audit_get_contid_list(net), list) {
> > > > + char buf[14];
> > > > +
> > > > + sprintf(buf, "net%u", i++);
> > > > + audit_log_contid(context, buf, cont->id);
> > >
> > > Hmm. It looks like this will generate multiple audit container ID
> > > records with "op=netX contid=Y" (X=netns number, Y=audit container
> > > ID), is that what we want? I've mentioned my concern around the "op"
> > > values in these records earlier in the patchset, that still applies
> > > here, but now I'm also concerned about the multiple records. I'm
> > > thinking we might be better served with a single record with either
> > > multiple "contid" fields, or a single "contid" field with a set of
> > > comma separated values (or some other delimiter that Steve's tools
> > > will tolerate).
> > >
> > > Steve, thoughts?
> >
> > A single record is best. Maybe pattern this after the args listed in an
> > execve record.
>
> I'm concerned that an execve-like approach might not scale very well
> as would could potentially have a lot of containers sharing a single
> network namespace ("a%d=%d" vs ",%d"). Further, with execve we log
> the argument position in addition to the argument itself, that isn't
> something we need to worry about with the audit container IDs.

I think a comma-separated list would be most efficient, but could
potentially overload one record. The "netX" labels are pretty
meaningless unless they are that netNS' inode number (with qualifying
dev, of course), but that would be elsewhere in another record.

> paul moore

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-24 21:55:26

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

On Tue, Jul 24, 2018 at 3:09 PM Richard Guy Briggs <[email protected]> wrote:
> On 2018-07-20 18:13, Paul Moore wrote:
> > On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> > > Implement the proc fs write to set the audit container identifier of a
> > > process, emitting an AUDIT_CONTAINER_ID record to document the event.
> > >
> > > This is a write from the container orchestrator task to a proc entry of
> > > the form /proc/PID/audit_containerid where PID is the process ID of the
> > > newly created task that is to become the first task in a container, or
> > > an additional task added to a container.
> > >
> > > The write expects up to a u64 value (unset: 18446744073709551615).
> > >
> > > The writer must have capability CAP_AUDIT_CONTROL.
> > >
> > > This will produce a record such as this:
> > > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes
> > >
> > > The "op" field indicates an initial set. The "pid" to "ses" fields are
> > > the orchestrator while the "opid" field is the object's PID, the process
> > > being "contained". Old and new audit container identifier values are
> > > given in the "contid" fields, while res indicates its success.
> > >
> > > It is not permitted to unset or re-set the audit container identifier.
> > > A child inherits its parent's audit container identifier, but then can
> > > be set only once after.
> > >
> > > See: https://github.com/linux-audit/audit-kernel/issues/90
> > > See: https://github.com/linux-audit/audit-userspace/issues/51
> > > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > >
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > fs/proc/base.c | 37 ++++++++++++++++++++++++
> > > include/linux/audit.h | 25 ++++++++++++++++
> > > include/uapi/linux/audit.h | 2 ++
> > > kernel/auditsc.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++
> > > 4 files changed, 135 insertions(+)

...

> > > @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> > > }
> > >
> > > /**
> > > + * audit_set_contid - set current task's audit_context contid
> > > + * @contid: contid value
> > > + *
> > > + * Returns 0 on success, -EPERM on permission failure.
> > > + *
> > > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > > + */
> > > +int audit_set_contid(struct task_struct *task, u64 contid)
> > > +{
> > > + u64 oldcontid;
> > > + int rc = 0;
> > > + struct audit_buffer *ab;
> > > + uid_t uid;
> > > + struct tty_struct *tty;
> > > + char comm[sizeof(current->comm)];
> > > +
> > > + /* Can't set if audit disabled */
> > > + if (!task->audit)
> > > + return -ENOPROTOOPT;
> > > + oldcontid = audit_get_contid(task);
> > > + /* Don't allow the audit containerid to be unset */
> > > + if (!cid_valid(contid))
> > > + rc = -EINVAL;
> > > + /* if we don't have caps, reject */
> > > + else if (!capable(CAP_AUDIT_CONTROL))
> > > + rc = -EPERM;
> > > + /* if task has children or is not single-threaded, deny */
> > > + else if (!list_empty(&task->children))
> > > + rc = -EBUSY;
> >
> > Is this safe without holding tasklist_lock? I worry we might be
> > vulnerable to a race with fork().
> >
> > > + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > > + rc = -EALREADY;
> >
> > Similar concern here as well, although related to threads.
>
> I think you are correct here and tasklist_lock should cover both. Do we
> also want rcu_read_lock() immediately preceeding it?

You'll need to take a closer look and determine the locking scheme. I
simply took a quick look while reviewing this patch to see what of the
existing locks, if any, would be most applicable here; tasklist_lock
seemed like a good starting point.

It looks like tasklist_lock is defined as a rwlock_t so I'm not sure
it would make sense to use it with a RCU protected structure
(typically it's RCU+spinlock), but maybe that is the case with a
task_struct, you'll need to check.

> > > + /* it is already set, and not inherited from the parent, reject */
> > > + else if (cid_valid(oldcontid) && !task->audit->inherited)
> > > + rc = -EEXIST;
> >
> > Maybe I'm missing something, but why do we care about preventing
> > reassigning the audit container ID in this case? The task is single
> > threaded and has no descendants at this point so it should be safe,
> > yes? So long as the task changing the audit container ID has
> > capable(CAP_AUDIT_CONTOL) it shouldn't matter, right?
>
> Because we hammered out this idea 6 months ago in the design phase and I
> thought we all firmly agreed that the audit container identifier could
> only be set once. Has any significant discussion happenned since then
> to change that wisdom? I just wonder why this is coming up now.

Implementation, and time, can change how one looks at an earlier
design. I believe this is why most well reasoned specifications have
a reference design.

Remind me why the design had the restriction of write once for the
audit container ID? At this point given the CAP_AUDIT_CONTROL and the
single-thread, no-children restrictions I'm not sure what harm there
is in allowing the value to be written multiple times (so long as the
changes are audited of course).

> > Related, I'm questioning if we would ever care if the audit container
> > ID was inherited or not?
>
> We do since that is the only way we can tell if the value has been set
> once already or inherited unless we check if the parent's audit
> container identifier is identical (which tells us it was inherited).

Tied to the above question. If we don't care about multiple changes,
given the other constraints, we probably don't need the inherited
flag.

--
paul moore
http://www.paul-moore.com

2018-07-24 21:58:59

by Paul Moore

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records

On Tue, Jul 24, 2018 at 3:40 PM Richard Guy Briggs <[email protected]> wrote:
> On 2018-07-20 18:14, Paul Moore wrote:
> > On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs <[email protected]> wrote:
> > > Standalone audit records have the timestamp and serial number generated
> > > on the fly and as such are unique, making them standalone. This new
> > > function audit_alloc_local() generates a local audit context that will
> > > be used only for a standalone record and its auxiliary record(s). The
> > > context is discarded immediately after the local associated records are
> > > produced.
> > >
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > include/linux/audit.h | 8 ++++++++
> > > kernel/auditsc.c | 25 +++++++++++++++++++++++--
> > > 2 files changed, 31 insertions(+), 2 deletions(-)

...

> > > + struct audit_context *context;
> > > +
> > > + if (!audit_ever_enabled)
> > > + return NULL; /* Return if not auditing. */
> > > +
> > > + context = audit_alloc_context(AUDIT_RECORD_CONTEXT);
> > > + if (!context)
> > > + return NULL;
> > > + context->serial = audit_serial();
> > > + context->ctime = current_kernel_time64();
> > > + context->in_syscall = 1;
> >
> > Setting in_syscall is both interesting and a bit troubling, if for no
> > other reason than I expect most (all?) callers to be in an interrupt
> > context when audit_alloc_local() is called. Setting in_syscall would
> > appear to be conceptually in this case. Can you help explain why this
> > is the right thing to do, or necessary to ensure things are handled
> > correctly?
>
> I'll admit this is cheating a bit, but seemed harmless. It is needed so
> that auditsc_get_stamp() from audit_get_stamp() from audit_log_start()
> doesn't bail on me without giving me its already assigned time and
> serial values rather than generating a new one. I did look to see if
> there were any other undesireable side effects and found none, so I'm
> tmepted to rename the ->in_syscall to something a bit more helpful. I
> could add a new audit_context structure member to make
> auditsc_get_stamp() co-operative, but this seems wasteful and
> unnecessary.

That's what I suspected.

Let's look into renaming the "in_syscall" field, it borderline
confusing now, and hijacking it for something which is very obviously
not "in syscall" is A Very Bad Thing.

--
paul moore
http://www.paul-moore.com

2018-07-26 00:55:45

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On 2018-07-23 14:31, Paul Moore wrote:
> On Mon, Jul 23, 2018 at 12:48 PM Steve Grubb <[email protected]> wrote:
> > On Monday, July 23, 2018 11:11:48 AM EDT Richard Guy Briggs wrote:
> > > On 2018-07-23 09:19, Steve Grubb wrote:
> > > > On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote:
> > > > > On 2018-07-22 09:32, Steve Grubb wrote:
> > > > > > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > > > > > + * audit_log_contid - report container info
> > > > > > > > > + * @tsk: task to be recorded
> > > > > > > > > + * @context: task or local context for record
> > > > > > > > > + * @op: contid string description
> > > > > > > > > + */
> > > > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > > > + struct audit_context *context,
> > > > > > > > > char
> > > > > > > > > *op)
> > > > > > > > > +{
> > > > > > > > > + struct audit_buffer *ab;
> > > > > > > > > +
> > > > > > > > > + if (!audit_contid_set(tsk))
> > > > > > > > > + return 0;
> > > > > > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > > > > > + ab = audit_log_start(context, GFP_KERNEL,
> > > > > > > > > AUDIT_CONTAINER);
> > > > > > > > > + if (!ab)
> > > > > > > > > + return -ENOMEM;
> > > > > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > > > + op, audit_get_contid(tsk));
> > > > > > > >
> > > > > > > > Can you explain your reason for including an "op" field in this
> > > > > > > > record
> > > > > > > > type? I've been looking at the rest of the patches in this
> > > > > > > > patchset
> > > > > > > > and it seems to be used more as an indicator of the record's
> > > > > > > > generating context rather than any sort of audit container ID
> > > > > > > > operation.
> > > > > > >
> > > > > > > "action" might work, but that's netfilter and numeric... "kind"?
> > > > > > > Nothing else really seems to fit from a field name, type or lack of
> > > > > > > searchability perspective.
> > > > > > >
> > > > > > > Steve, do you have an opinion?
> > > > > >
> > > > > > We only have 1 sample event where we have op=task. What are the other
> > > > > > possible values?
> > > > >
> > > > > For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> > > > > ptrace and signals patch), "tty".
> > > > >
> > > > > For the AUDIT_CONTAINER_ID record we have "op=set".
> > > >
> > > > Since the purpose of this record is to log the container id, I think that
> > > > is all that is needed. We can get the context from the other records in
> > > > the event. I'd suggest dropping the "op" field.
> > >
> > > Ok, the information above it for two different audit container
> > > identifier records. Which one should drop the "op=" field? Both? Or
> > > just the AUDIT_CONTAINER record? The AUDIT_CONTAINER_ID record (which
> > > might be renamed) could use it to distinguish a "set" record from a
> > > dropped audit container identifier that is no longer registered by any
> > > task or namespace.
> >
> > Neither of them need it. All they need to do is state the container that is
> > being acted upon.
>
> I think we should keep the "op" field for audit container ID
> management operations, even though we really only have a "set"
> operation at the moment, but the others should drop the "op" field
> (see my previous emails in this thread).

In fact, I'd like to question the wisdom of dropping this field and
perhaps fine a better or new name for it, since these contid records
could be multiple and different from the primary task that is generating
this record. In particular, there are extra contid records generated by
the ptrace/signals patch that could be from other processes in other
containers that I mentioned in a branch thread that got dropped
including the auc_pids data and the target_pid also attached to the
primary task's audit context.

> paul moore
> http://www.paul-moore.com

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-26 13:38:13

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 07/10] audit: add support for containerid to network namespaces

On 2018-07-24 16:33, Paul Moore wrote:
> On Tue, Jul 24, 2018 at 10:06 AM Richard Guy Briggs <[email protected]> wrote:
> > On 2018-07-20 18:14, Paul Moore wrote:
> > > On Wed, Jun 6, 2018 at 1:03 PM Richard Guy Briggs <[email protected]> wrote:
> > > > Audit events could happen in a network namespace outside of a task
> > > > context due to packets received from the net that trigger an auditing
> > > > rule prior to being associated with a running task. The network
> > > > namespace could in use by multiple containers by association to the
> > > > tasks in that network namespace. We still want a way to attribute
> > > > these events to any potential containers. Keep a list per network
> > > > namespace to track these audit container identifiiers.
> > > >
> > > > Add/increment the audit container identifier on:
> > > > - initial setting of the audit container identifier via /proc
> > > > - clone/fork call that inherits an audit container identifier
> > > > - unshare call that inherits an audit container identifier
> > > > - setns call that inherits an audit container identifier
> > > > Delete/decrement the audit container identifier on:
> > > > - an inherited audit container identifier dropped when child set
> > > > - process exit
> > > > - unshare call that drops a net namespace
> > > > - setns call that drops a net namespace
> > > >
> > > > See: https://github.com/linux-audit/audit-kernel/issues/92
> > > > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > > > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > > ---
> > > > include/linux/audit.h | 23 ++++++++++++++++
> > > > kernel/audit.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > kernel/auditsc.c | 5 ++++
> > > > kernel/nsproxy.c | 4 +++
> > > > 4 files changed, 104 insertions(+)
>
> ...
>
> > > > + }
> > > > + if (!found)
> > > > + return;
> > > > + list_del(&cont->list);
> > > > + if (refcount_dec_and_test(&cont->refcount))
> > > > + kfree(cont);
> > >
> > > Don't you want to dec_and_test first and only remove it from the list
> > > if there are no other references?
> >
> > I don't think so. Let me try to describe it in prose to see if I
> > understood this properly and see if this makes more sense: I want to
> > remove this audit_contid list member from this net's audit_contid list
> > and decrement unconditionally this member's refcount so it knows there
> > is one less thing pointing at it and when there is no longer anything
> > pointing at it, free it.
>
> Yep, sorry, my mistake, I was thinking the other way around (netns
> going away) ... which actually, this patchset doesn't handle that does
> it (I don't see any new code in audit_net_exit())? Is is in a later
> patch? If so, it really should be in the same patch as this code to
> prevent bisect nasties.

I don't think any code is needed in audit_net_exit() other than a WARN()
that the contid list is empty, since if a netns is going away, all its
tasks would have exited (removing its contid from the list) or moved to
a new nstns (moving its contid from this list to another netns).
At worst, this is a memory leak and not a bad pointer dereference.

> paul moore

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-26 14:34:33

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 04/10] audit: add support for non-syscall auxiliary records

On 2018-07-24 17:57, Paul Moore wrote:
> On Tue, Jul 24, 2018 at 3:40 PM Richard Guy Briggs <[email protected]> wrote:
> > On 2018-07-20 18:14, Paul Moore wrote:
> > > On Wed, Jun 6, 2018 at 1:01 PM Richard Guy Briggs <[email protected]> wrote:
> > > > Standalone audit records have the timestamp and serial number generated
> > > > on the fly and as such are unique, making them standalone. This new
> > > > function audit_alloc_local() generates a local audit context that will
> > > > be used only for a standalone record and its auxiliary record(s). The
> > > > context is discarded immediately after the local associated records are
> > > > produced.
> > > >
> > > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > > ---
> > > > include/linux/audit.h | 8 ++++++++
> > > > kernel/auditsc.c | 25 +++++++++++++++++++++++--
> > > > 2 files changed, 31 insertions(+), 2 deletions(-)
>
> ...
>
> > > > + struct audit_context *context;
> > > > +
> > > > + if (!audit_ever_enabled)
> > > > + return NULL; /* Return if not auditing. */
> > > > +
> > > > + context = audit_alloc_context(AUDIT_RECORD_CONTEXT);
> > > > + if (!context)
> > > > + return NULL;
> > > > + context->serial = audit_serial();
> > > > + context->ctime = current_kernel_time64();
> > > > + context->in_syscall = 1;
> > >
> > > Setting in_syscall is both interesting and a bit troubling, if for no
> > > other reason than I expect most (all?) callers to be in an interrupt
> > > context when audit_alloc_local() is called. Setting in_syscall would
> > > appear to be conceptually in this case. Can you help explain why this
> > > is the right thing to do, or necessary to ensure things are handled
> > > correctly?
> >
> > I'll admit this is cheating a bit, but seemed harmless. It is needed so
> > that auditsc_get_stamp() from audit_get_stamp() from audit_log_start()
> > doesn't bail on me without giving me its already assigned time and
> > serial values rather than generating a new one. I did look to see if
> > there were any other undesireable side effects and found none, so I'm
> > tmepted to rename the ->in_syscall to something a bit more helpful. I
> > could add a new audit_context structure member to make
> > auditsc_get_stamp() co-operative, but this seems wasteful and
> > unnecessary.
>
> That's what I suspected.
>
> Let's look into renaming the "in_syscall" field, it borderline
> confusing now, and hijacking it for something which is very obviously
> not "in syscall" is A Very Bad Thing.

Ok, looking more carefully, I'm not going to touch in_syscall, since it
does more than I remember discovering when investigating why the
existing stamp wasn't being used. I don't want to change the existing
behaviour. I'll somewhat reluctantly grow the context struct and add a
"local" boolean to it so that auditsc_get_stamp knows to use the
existing stamp in both the in_syscall and local cases.

> paul moore

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-30 18:53:05

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 01/10] audit: add container id

On 2018-07-24 17:54, Paul Moore wrote:
> On Tue, Jul 24, 2018 at 3:09 PM Richard Guy Briggs <[email protected]> wrote:
> > On 2018-07-20 18:13, Paul Moore wrote:
> > > On Wed, Jun 6, 2018 at 1:00 PM Richard Guy Briggs <[email protected]> wrote:
> > > > Implement the proc fs write to set the audit container identifier of a
> > > > process, emitting an AUDIT_CONTAINER_ID record to document the event.
> > > >
> > > > This is a write from the container orchestrator task to a proc entry of
> > > > the form /proc/PID/audit_containerid where PID is the process ID of the
> > > > newly created task that is to become the first task in a container, or
> > > > an additional task added to a container.
> > > >
> > > > The write expects up to a u64 value (unset: 18446744073709551615).
> > > >
> > > > The writer must have capability CAP_AUDIT_CONTROL.
> > > >
> > > > This will produce a record such as this:
> > > > type=CONTAINER_ID msg=audit(2018-06-06 12:39:29.636:26949) : op=set opid=2209 old-contid=18446744073709551615 contid=123456 pid=628 auid=root uid=root tty=ttyS0 ses=1 subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 comm=bash exe=/usr/bin/bash res=yes
> > > >
> > > > The "op" field indicates an initial set. The "pid" to "ses" fields are
> > > > the orchestrator while the "opid" field is the object's PID, the process
> > > > being "contained". Old and new audit container identifier values are
> > > > given in the "contid" fields, while res indicates its success.
> > > >
> > > > It is not permitted to unset or re-set the audit container identifier.
> > > > A child inherits its parent's audit container identifier, but then can
> > > > be set only once after.
> > > >
> > > > See: https://github.com/linux-audit/audit-kernel/issues/90
> > > > See: https://github.com/linux-audit/audit-userspace/issues/51
> > > > See: https://github.com/linux-audit/audit-testsuite/issues/64
> > > > See: https://github.com/linux-audit/audit-kernel/wiki/RFE-Audit-Container-ID
> > > >
> > > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > > ---
> > > > fs/proc/base.c | 37 ++++++++++++++++++++++++
> > > > include/linux/audit.h | 25 ++++++++++++++++
> > > > include/uapi/linux/audit.h | 2 ++
> > > > kernel/auditsc.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++
> > > > 4 files changed, 135 insertions(+)
>
> ...
>
> > > > @@ -2112,6 +2116,73 @@ int audit_set_loginuid(kuid_t loginuid)
> > > > }
> > > >
> > > > /**
> > > > + * audit_set_contid - set current task's audit_context contid
> > > > + * @contid: contid value
> > > > + *
> > > > + * Returns 0 on success, -EPERM on permission failure.
> > > > + *
> > > > + * Called (set) from fs/proc/base.c::proc_contid_write().
> > > > + */
> > > > +int audit_set_contid(struct task_struct *task, u64 contid)
> > > > +{
> > > > + u64 oldcontid;
> > > > + int rc = 0;
> > > > + struct audit_buffer *ab;
> > > > + uid_t uid;
> > > > + struct tty_struct *tty;
> > > > + char comm[sizeof(current->comm)];
> > > > +
> > > > + /* Can't set if audit disabled */
> > > > + if (!task->audit)
> > > > + return -ENOPROTOOPT;
> > > > + oldcontid = audit_get_contid(task);
> > > > + /* Don't allow the audit containerid to be unset */
> > > > + if (!cid_valid(contid))
> > > > + rc = -EINVAL;
> > > > + /* if we don't have caps, reject */
> > > > + else if (!capable(CAP_AUDIT_CONTROL))
> > > > + rc = -EPERM;
> > > > + /* if task has children or is not single-threaded, deny */
> > > > + else if (!list_empty(&task->children))
> > > > + rc = -EBUSY;
> > >
> > > Is this safe without holding tasklist_lock? I worry we might be
> > > vulnerable to a race with fork().
> > >
> > > > + else if (!(thread_group_leader(task) && thread_group_empty(task)))
> > > > + rc = -EALREADY;
> > >
> > > Similar concern here as well, although related to threads.
> >
> > I think you are correct here and tasklist_lock should cover both. Do we
> > also want rcu_read_lock() immediately preceeding it?
>
> You'll need to take a closer look and determine the locking scheme. I
> simply took a quick look while reviewing this patch to see what of the
> existing locks, if any, would be most applicable here; tasklist_lock
> seemed like a good starting point.
>
> It looks like tasklist_lock is defined as a rwlock_t so I'm not sure
> it would make sense to use it with a RCU protected structure
> (typically it's RCU+spinlock), but maybe that is the case with a
> task_struct, you'll need to check.

All I need is a read rather than write tasklist_lock since I'm not
changing any inter-task relationships, which makes it possible to nest
it inside or outside the task_lock(). I don't think I need the RCU
lock.

> > > > + /* it is already set, and not inherited from the parent, reject */
> > > > + else if (cid_valid(oldcontid) && !task->audit->inherited)
> > > > + rc = -EEXIST;
> > >
> > > Maybe I'm missing something, but why do we care about preventing
> > > reassigning the audit container ID in this case? The task is single
> > > threaded and has no descendants at this point so it should be safe,
> > > yes? So long as the task changing the audit container ID has
> > > capable(CAP_AUDIT_CONTOL) it shouldn't matter, right?
> >
> > Because we hammered out this idea 6 months ago in the design phase and I
> > thought we all firmly agreed that the audit container identifier could
> > only be set once. Has any significant discussion happenned since then
> > to change that wisdom? I just wonder why this is coming up now.
>
> Implementation, and time, can change how one looks at an earlier
> design. I believe this is why most well reasoned specifications have
> a reference design.
>
> Remind me why the design had the restriction of write once for the
> audit container ID? At this point given the CAP_AUDIT_CONTROL and the
> single-thread, no-children restrictions I'm not sure what harm there
> is in allowing the value to be written multiple times (so long as the
> changes are audited of course).

Looking back through the conversations, I think you may be right that we
no longer need it, but it is easy to re-add if we find it necessary.

> > > Related, I'm questioning if we would ever care if the audit container
> > > ID was inherited or not?
> >
> > We do since that is the only way we can tell if the value has been set
> > once already or inherited unless we check if the parent's audit
> > container identifier is identical (which tells us it was inherited).
>
> Tied to the above question. If we don't care about multiple changes,
> given the other constraints, we probably don't need the inherited
> flag.

Agreed.

> paul moore

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635

2018-07-31 20:12:18

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [RFC PATCH ghak90 (was ghak32) V3 02/10] audit: log container info of syscalls

On 2018-07-25 20:51, Richard Guy Briggs wrote:
> On 2018-07-23 14:31, Paul Moore wrote:
> > On Mon, Jul 23, 2018 at 12:48 PM Steve Grubb <[email protected]> wrote:
> > > On Monday, July 23, 2018 11:11:48 AM EDT Richard Guy Briggs wrote:
> > > > On 2018-07-23 09:19, Steve Grubb wrote:
> > > > > On Sunday, July 22, 2018 4:55:10 PM EDT Richard Guy Briggs wrote:
> > > > > > On 2018-07-22 09:32, Steve Grubb wrote:
> > > > > > > On Saturday, July 21, 2018 4:29:30 PM EDT Richard Guy Briggs wrote:
> > > > > > > > > > + * audit_log_contid - report container info
> > > > > > > > > > + * @tsk: task to be recorded
> > > > > > > > > > + * @context: task or local context for record
> > > > > > > > > > + * @op: contid string description
> > > > > > > > > > + */
> > > > > > > > > > +int audit_log_contid(struct task_struct *tsk,
> > > > > > > > > > + struct audit_context *context,
> > > > > > > > > > char
> > > > > > > > > > *op)
> > > > > > > > > > +{
> > > > > > > > > > + struct audit_buffer *ab;
> > > > > > > > > > +
> > > > > > > > > > + if (!audit_contid_set(tsk))
> > > > > > > > > > + return 0;
> > > > > > > > > > + /* Generate AUDIT_CONTAINER record with container ID */
> > > > > > > > > > + ab = audit_log_start(context, GFP_KERNEL,
> > > > > > > > > > AUDIT_CONTAINER);
> > > > > > > > > > + if (!ab)
> > > > > > > > > > + return -ENOMEM;
> > > > > > > > > > + audit_log_format(ab, "op=%s contid=%llu",
> > > > > > > > > > + op, audit_get_contid(tsk));
> > > > > > > > >
> > > > > > > > > Can you explain your reason for including an "op" field in this
> > > > > > > > > record
> > > > > > > > > type? I've been looking at the rest of the patches in this
> > > > > > > > > patchset
> > > > > > > > > and it seems to be used more as an indicator of the record's
> > > > > > > > > generating context rather than any sort of audit container ID
> > > > > > > > > operation.
> > > > > > > >
> > > > > > > > "action" might work, but that's netfilter and numeric... "kind"?
> > > > > > > > Nothing else really seems to fit from a field name, type or lack of
> > > > > > > > searchability perspective.
> > > > > > > >
> > > > > > > > Steve, do you have an opinion?
> > > > > > >
> > > > > > > We only have 1 sample event where we have op=task. What are the other
> > > > > > > possible values?
> > > > > >
> > > > > > For the AUDIT_CONTAINER record we have op= "task", "target" (from the
> > > > > > ptrace and signals patch), "tty".
> > > > > >
> > > > > > For the AUDIT_CONTAINER_ID record we have "op=set".
> > > > >
> > > > > Since the purpose of this record is to log the container id, I think that
> > > > > is all that is needed. We can get the context from the other records in
> > > > > the event. I'd suggest dropping the "op" field.
> > > >
> > > > Ok, the information above it for two different audit container
> > > > identifier records. Which one should drop the "op=" field? Both? Or
> > > > just the AUDIT_CONTAINER record? The AUDIT_CONTAINER_ID record (which
> > > > might be renamed) could use it to distinguish a "set" record from a
> > > > dropped audit container identifier that is no longer registered by any
> > > > task or namespace.
> > >
> > > Neither of them need it. All they need to do is state the container that is
> > > being acted upon.
> >
> > I think we should keep the "op" field for audit container ID
> > management operations, even though we really only have a "set"
> > operation at the moment, but the others should drop the "op" field
> > (see my previous emails in this thread).
>
> In fact, I'd like to question the wisdom of dropping this field and
> perhaps fine a better or new name for it, since these contid records
> could be multiple and different from the primary task that is generating
> this record. In particular, there are extra contid records generated by
> the ptrace/signals patch that could be from other processes in other
> containers that I mentioned in a branch thread that got dropped
> including the auc_pids data and the target_pid also attached to the
> primary task's audit context.

Ok, not seeing any further follow-up on this item in almost a week, I'll
not delay any more and post rev 4 of the patchset.

> > paul moore
>
> - RGB

- RGB

--
Richard Guy Briggs <[email protected]>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635