2013-08-20 21:33:26

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 00/12] RFC: steps to make audit pid namespace-safe

This patchset is a revival of some of Eric Biederman's work to make audit
pid-namespace-safe.

In a couple of places, audit was printing PIDs in the task's pid namespace
rather than relative to the audit daemon's pid namespace, which currently is
init_pid_ns.

It also allows processes to log audit user messages in their own pid
namespaces, which was not previously permitted. Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=947530
https://bugs.launchpad.net/ubuntu/+source/vsftpd/+bug/1160372
https://bugzilla.novell.com/show_bug.cgi?id=786024

Part of the cleanup here involves deprecating task->pid and task->tgid, which
are error-prone duplicates of the task->pids structure

The next step which I hope to add to this patchset will be to purge task->pid
and task->tgid from the rest of the kernel if possible. Once that is done,
task_pid_nr_init_ns() and task_tgid_nr_init_ns() that were introduced in patch
05/12 and used in patches 06/12 and 08/12 could be replaced with task_pid_nr()
and task_tgid_nr(). Eric B. did take a stab at that, but checking all the
subtleties will be non-trivial.

Does anyone have any opinions or better yet hard data on cache line misses
between pid_nr(struct pid*) and pid_nr_ns(struct pid*, &init_pid_ns)? I'd
like to see pid_nr() use pid_nr_ns(struct pid*, &init_pid_ns), or
pid_nr_init_ns() eliminated in favour of the original pid_nr(). pid_nr()
currently accesses the first level of the pid structure without having to
dereference the level number. If there is an actual speed difference, it could
be worth keeping, otherwise, I'd prefer to simplify that code.

Eric also had a patch to add a printk option to format a struct pid pointer
which was PID namespace-aware. I don't see the point, but I'll let him explain
it.

Discuss.

Eric W. Biederman (5):
audit: Kill the unused struct audit_aux_data_capset
audit: Simplify and correct audit_log_capset

Richard Guy Briggs (7):
audit: fix netlink portid naming and types
pid: get ppid pid_t of task in init_pid_ns safely
audit: convert PPIDs to the inital PID namespace.
pid: get pid_t of task in init_pid_ns correctly
audit: store audit_pid as a struct pid pointer
audit: anchor all pid references in the initial pid namespace
pid: modify task_pid_nr to work without task->pid.
pid: modify task_tgid_nr to work without task->tgid.
pid: rewrite task helper functions avoiding task->pid and task->tgid
pid: mark struct task const in helper functions

drivers/tty/tty_audit.c | 3 +-
include/linux/audit.h | 8 ++--
include/linux/pid.h | 6 +++
include/linux/sched.h | 81 ++++++++++++++++++++++++----------
kernel/audit.c | 76 +++++++++++++++++++------------
kernel/audit.h | 12 +++---
kernel/auditfilter.c | 35 +++++++++++----
kernel/auditsc.c | 36 ++++++---------
kernel/capability.c | 2 +-
kernel/pid.c | 4 +-
security/apparmor/audit.c | 7 +--
security/integrity/integrity_audit.c | 2 +-
security/lsm_audit.c | 11 +++--
security/tomoyo/audit.c | 2 +-
14 files changed, 177 insertions(+), 108 deletions(-)


2013-08-20 21:33:28

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 02/12] audit: fix netlink portid naming and types

Normally, netlink ports use the PID of the userspace process as the port ID.
If the PID is already in use by a port, the kernel will allocate another port
ID to avoid conflict. Re-name all references to netlink ports from pid to
portid to reflect this reality and avoid confusion with actual PIDs. Ports
use the __u32 type, so re-type all portids accordingly.

(This patch is very similar to ebiederman's 5deadd69)

Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 2 +-
kernel/audit.c | 32 ++++++++++++++++----------------
kernel/audit.h | 8 ++++----
kernel/auditfilter.c | 18 ++++++++++--------
4 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index 729a4d1..a3af0fa 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -462,7 +462,7 @@ extern int audit_update_lsm_rules(void);
/* Private API (for audit.c only) */
extern int audit_filter_user(int type);
extern int audit_filter_type(int type);
-extern int audit_receive_filter(int type, int pid, int seq,
+extern int audit_receive_filter(int type, __u32 portid, int seq,
void *data, size_t datasz);
extern int audit_enabled;
#else /* CONFIG_AUDIT */
diff --git a/kernel/audit.c b/kernel/audit.c
index 91e53d0..2476334 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -94,7 +94,7 @@ static int audit_failure = AUDIT_FAIL_PRINTK;
* the portid to use to send netlink messages to that process.
*/
int audit_pid;
-static int audit_nlk_portid;
+static __u32 audit_nlk_portid;

/* If audit_rate_limit is non-zero, limit the rate of sending audit records
* to that number per second. This prevents DoS attacks, but results in
@@ -165,15 +165,15 @@ struct audit_buffer {
};

struct audit_reply {
- int pid;
+ __u32 portid;
struct sk_buff *skb;
};

-static void audit_set_pid(struct audit_buffer *ab, pid_t pid)
+static void audit_set_portid(struct audit_buffer *ab, __u32 portid)
{
if (ab) {
struct nlmsghdr *nlh = nlmsg_hdr(ab->skb);
- nlh->nlmsg_pid = pid;
+ nlh->nlmsg_pid = portid;
}
}

@@ -472,7 +472,7 @@ static int kauditd_thread(void *dummy)
int audit_send_list(void *_dest)
{
struct audit_netlink_list *dest = _dest;
- int pid = dest->pid;
+ __u32 portid = dest->portid;
struct sk_buff *skb;

/* wait for parent to finish and send an ACK */
@@ -480,14 +480,14 @@ int audit_send_list(void *_dest)
mutex_unlock(&audit_cmd_mutex);

while ((skb = __skb_dequeue(&dest->q)) != NULL)
- netlink_unicast(audit_sock, skb, pid, 0);
+ netlink_unicast(audit_sock, skb, portid, 0);

kfree(dest);

return 0;
}

-struct sk_buff *audit_make_reply(int pid, int seq, int type, int done,
+struct sk_buff *audit_make_reply(__u32 portid, int seq, int type, int done,
int multi, const void *payload, int size)
{
struct sk_buff *skb;
@@ -500,7 +500,7 @@ struct sk_buff *audit_make_reply(int pid, int seq, int type, int done,
if (!skb)
return NULL;

- nlh = nlmsg_put(skb, pid, seq, t, size, flags);
+ nlh = nlmsg_put(skb, portid, seq, t, size, flags);
if (!nlh)
goto out_kfree_skb;
data = nlmsg_data(nlh);
@@ -521,13 +521,13 @@ static int audit_send_reply_thread(void *arg)

/* Ignore failure. It'll only happen if the sender goes away,
because our timeout is set to infinite. */
- netlink_unicast(audit_sock, reply->skb, reply->pid, 0);
+ netlink_unicast(audit_sock, reply->skb, reply->portid, 0);
kfree(reply);
return 0;
}
/**
* audit_send_reply - send an audit reply message via netlink
- * @pid: process id to send reply to
+ * @portid: netlink port to which to send reply
* @seq: sequence number
* @type: audit message type
* @done: done (last) flag
@@ -535,11 +535,11 @@ static int audit_send_reply_thread(void *arg)
* @payload: payload data
* @size: payload size
*
- * Allocates an skb, builds the netlink message, and sends it to the pid.
+ * Allocates an skb, builds the netlink message, and sends it to the port id.
* No failure notifications.
*/
-static void audit_send_reply(int pid, int seq, int type, int done, int multi,
- const void *payload, int size)
+static void audit_send_reply(__u32 portid, int seq, int type, int done,
+ int multi, const void *payload, int size)
{
struct sk_buff *skb;
struct task_struct *tsk;
@@ -549,11 +549,11 @@ static void audit_send_reply(int pid, int seq, int type, int done, int multi,
if (!reply)
return;

- skb = audit_make_reply(pid, seq, type, done, multi, payload, size);
+ skb = audit_make_reply(portid, seq, type, done, multi, payload, size);
if (!skb)
goto out;

- reply->pid = pid;
+ reply->portid = portid;
reply->skb = skb;

tsk = kthread_run(audit_send_reply_thread, reply, "audit_send_reply");
@@ -727,7 +727,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
size--;
audit_log_n_untrustedstring(ab, data, size);
}
- audit_set_pid(ab, NETLINK_CB(skb).portid);
+ audit_set_portid(ab, NETLINK_CB(skb).portid);
audit_log_end(ab);
}
break;
diff --git a/kernel/audit.h b/kernel/audit.h
index 123c9b7..36edcf5 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -237,13 +237,13 @@ extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right);
extern int audit_gid_comparator(kgid_t left, u32 op, kgid_t right);
extern int parent_len(const char *path);
extern int audit_compare_dname_path(const char *dname, const char *path, int plen);
-extern struct sk_buff * audit_make_reply(int pid, int seq, int type,
- int done, int multi,
- const void *payload, int size);
+extern struct sk_buff *audit_make_reply(__u32 portid, int seq, int type,
+ int done, int multi,
+ const void *payload, int size);
extern void audit_panic(const char *message);

struct audit_netlink_list {
- int pid;
+ __u32 portid;
struct sk_buff_head q;
};

diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index f7aee8b..381d3de 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -971,7 +971,7 @@ out:
}

/* List rules using struct audit_rule_data. */
-static void audit_list_rules(int pid, int seq, struct sk_buff_head *q)
+static void audit_list_rules(__u32 portid, int seq, struct sk_buff_head *q)
{
struct sk_buff *skb;
struct audit_krule *r;
@@ -986,14 +986,15 @@ static void audit_list_rules(int pid, int seq, struct sk_buff_head *q)
data = audit_krule_to_data(r);
if (unlikely(!data))
break;
- skb = audit_make_reply(pid, seq, AUDIT_LIST_RULES, 0, 1,
- data, sizeof(*data) + data->buflen);
+ skb = audit_make_reply(portid, seq, AUDIT_LIST_RULES,
+ 0, 1, data,
+ sizeof(*data) + data->buflen);
if (skb)
skb_queue_tail(q, skb);
kfree(data);
}
}
- skb = audit_make_reply(pid, seq, AUDIT_LIST_RULES, 1, 1, NULL, 0);
+ skb = audit_make_reply(portid, seq, AUDIT_LIST_RULES, 1, 1, NULL, 0);
if (skb)
skb_queue_tail(q, skb);
}
@@ -1023,12 +1024,13 @@ static void audit_log_rule_change(char *action, struct audit_krule *rule, int re
/**
* audit_receive_filter - apply all rules to the specified message type
* @type: audit message type
- * @pid: target pid for netlink audit messages
+ * @portid: target port id for netlink audit messages
* @seq: netlink audit message sequence (serial) number
* @data: payload data
* @datasz: size of payload data
*/
-int audit_receive_filter(int type, int pid, int seq, void *data, size_t datasz)
+int audit_receive_filter(int type, __u32 portid, int seq, void *data,
+ size_t datasz)
{
struct task_struct *tsk;
struct audit_netlink_list *dest;
@@ -1046,11 +1048,11 @@ int audit_receive_filter(int type, int pid, int seq, void *data, size_t datasz)
dest = kmalloc(sizeof(struct audit_netlink_list), GFP_KERNEL);
if (!dest)
return -ENOMEM;
- dest->pid = pid;
+ dest->portid = portid;
skb_queue_head_init(&dest->q);

mutex_lock(&audit_filter_mutex);
- audit_list_rules(pid, seq, &dest->q);
+ audit_list_rules(portid, seq, &dest->q);
mutex_unlock(&audit_filter_mutex);

tsk = kthread_run(audit_send_list, dest, "audit_send_list");
--
1.7.1

2013-08-20 21:33:30

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 03/12] pid: get ppid pid_t of task in init_pid_ns safely

Added the functions
task_ppid()
task_ppid_nr_ns()
task_ppid_nr_init_ns()
to safely abstract the lookup of the PPID (real_parent's tgid) of a process,
including rcu locking, in any required pid namespace. This provides an
alternative to sys_getppid(), which is relative to the child process' pid
namespace.

(informed by ebiederman's 6c621b7e)
Cc: [email protected]
Cc: Eric W. Biederman <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/sched.h | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d722490..ec04090 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1448,6 +1448,11 @@ static inline struct pid *task_session(struct task_struct *task)
return task->group_leader->pids[PIDTYPE_SID].pid;
}

+static inline struct pid *task_ppid(struct task_struct *task)
+{
+ return task_tgid(rcu_dereference(current->real_parent));
+}
+
struct pid_namespace;

/*
@@ -1496,6 +1501,24 @@ static inline pid_t task_tgid_vnr(struct task_struct *tsk)
}


+static inline pid_t task_ppid_nr_ns(struct task_struct *tsk,
+ struct pid_namespace *ns)
+{
+ pid_t pid;
+
+ rcu_read_lock();
+ pid = pid_nr_ns(task_ppid(current), ns);
+ rcu_read_unlock();
+
+ return pid;
+}
+
+static inline pid_t task_ppid_nr_init_ns(struct task_struct *tsk)
+{
+ return task_ppid_nr_ns(tsk, &init_pid_ns);
+}
+
+
static inline pid_t task_pgrp_nr_ns(struct task_struct *tsk,
struct pid_namespace *ns)
{
--
1.7.1

2013-08-20 21:33:36

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 06/12] audit: Simplify and correct audit_log_capset

From: Eric W. Biederman <[email protected]>

- Always report the current process as capset now always only works on
the current process. This prevents reporting 0 or a random pid in
a random pid namespace.

- Don't bother to pass the pid as is available.

Signed-off-by: "Eric W. Biederman" <[email protected]>
(cherry picked from commit bcc85f0af31af123e32858069eb2ad8f39f90e67)
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/audit.h | 6 +++---
kernel/auditsc.c | 6 ++----
kernel/capability.c | 2 +-
3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/include/linux/audit.h b/include/linux/audit.h
index a3af0fa..49223a6 100644
--- a/include/linux/audit.h
+++ b/include/linux/audit.h
@@ -218,7 +218,7 @@ extern void __audit_mq_getsetattr(mqd_t mqdes, struct mq_attr *mqstat);
extern int __audit_log_bprm_fcaps(struct linux_binprm *bprm,
const struct cred *new,
const struct cred *old);
-extern void __audit_log_capset(pid_t pid, const struct cred *new, const struct cred *old);
+extern void __audit_log_capset(const struct cred *new, const struct cred *old);
extern void __audit_mmap_fd(int fd, int flags);

static inline void audit_ipc_obj(struct kern_ipc_perm *ipcp)
@@ -284,11 +284,11 @@ static inline int audit_log_bprm_fcaps(struct linux_binprm *bprm,
return 0;
}

-static inline void audit_log_capset(pid_t pid, const struct cred *new,
+static inline void audit_log_capset(const struct cred *new,
const struct cred *old)
{
if (unlikely(!audit_dummy_context()))
- __audit_log_capset(pid, new, old);
+ __audit_log_capset(new, old);
}

static inline void audit_mmap_fd(int fd, int flags)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 74a0748..60a966d 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -2316,18 +2316,16 @@ int __audit_log_bprm_fcaps(struct linux_binprm *bprm,

/**
* __audit_log_capset - store information about the arguments to the capset syscall
- * @pid: target pid of the capset call
* @new: the new credentials
* @old: the old (current) credentials
*
* Record the aguments userspace sent to sys_capset for later printing by the
* audit system if applicable
*/
-void __audit_log_capset(pid_t pid,
- const struct cred *new, const struct cred *old)
+void __audit_log_capset(const struct cred *new, const struct cred *old)
{
struct audit_context *context = current->audit_context;
- context->capset.pid = pid;
+ context->capset.pid = task_pid_nr_init_ns(current);
context->capset.cap.effective = new->cap_effective;
context->capset.cap.inheritable = new->cap_effective;
context->capset.cap.permitted = new->cap_permitted;
diff --git a/kernel/capability.c b/kernel/capability.c
index f6c2ce5..7f60311 100644
--- a/kernel/capability.c
+++ b/kernel/capability.c
@@ -277,7 +277,7 @@ SYSCALL_DEFINE2(capset, cap_user_header_t, header, const cap_user_data_t, data)
if (ret < 0)
goto error;

- audit_log_capset(pid, new, current_cred());
+ audit_log_capset(new, current_cred());

return commit_creds(new);

--
1.7.1

2013-08-20 21:33:40

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 09/12] pid: modify task_pid_nr to work without task->pid.

task->pid is an error prone construct and results in duplicate maintenance.
Start it's demise by modifying task_pid_nr to not use it.

(informed by ebiederman's 3a2e8c59)
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/sched.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 9651d95..fb09b93 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1473,7 +1473,7 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,

static inline pid_t task_pid_nr(struct task_struct *tsk)
{
- return tsk->pid;
+ return pid_nr(task_pid(tsk));
}

static inline pid_t task_pid_nr_ns(struct task_struct *tsk,
--
1.7.1

2013-08-20 21:33:45

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 08/12] audit: anchor all pid references in the initial pid namespace

Store and log all PIDs with reference to the initial PID namespace and
deprecate the use of the error-prone duplicity of task->pid and task->tgid.

Still only permit the audit logging daemon and control to operate from the
initial PID namespace, but allow processes to log from another PID namespace.

Cc: "Eric W. Biederman" <[email protected]>
(informed by ebiederman's c776b5d2)

Signed-off-by: Richard Guy Briggs <[email protected]>
---
drivers/tty/tty_audit.c | 3 ++-
kernel/audit.c | 15 ++++++++++-----
kernel/auditfilter.c | 17 ++++++++++++++++-
kernel/auditsc.c | 16 +++++++++-------
security/apparmor/audit.c | 2 +-
security/integrity/integrity_audit.c | 2 +-
security/lsm_audit.c | 11 +++++++----
security/tomoyo/audit.c | 2 +-
8 files changed, 47 insertions(+), 21 deletions(-)

diff --git a/drivers/tty/tty_audit.c b/drivers/tty/tty_audit.c
index a4fdce7..a0ae01f 100644
--- a/drivers/tty/tty_audit.c
+++ b/drivers/tty/tty_audit.c
@@ -65,6 +65,7 @@ static void tty_audit_log(const char *description, int major, int minor,
{
struct audit_buffer *ab;
struct task_struct *tsk = current;
+ pid_t pid = task_pid_nr_init_ns(tsk);
uid_t uid = from_kuid(&init_user_ns, task_uid(tsk));
uid_t loginuid = from_kuid(&init_user_ns, audit_get_loginuid(tsk));
u32 sessionid = audit_get_sessionid(tsk);
@@ -74,7 +75,7 @@ static void tty_audit_log(const char *description, int major, int minor,
char name[sizeof(tsk->comm)];

audit_log_format(ab, "%s pid=%u uid=%u auid=%u ses=%u major=%d"
- " minor=%d comm=", description, tsk->pid, uid,
+ " minor=%d comm=", description, pid, uid,
loginuid, sessionid, major, minor);
get_task_comm(name, tsk);
audit_log_untrustedstring(ab, name);
diff --git a/kernel/audit.c b/kernel/audit.c
index fa1d5f5..8e4958b 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -574,9 +574,8 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
{
int err = 0;

- /* Only support the initial namespaces for now. */
- if ((current_user_ns() != &init_user_ns) ||
- (task_active_pid_ns(current) != &init_pid_ns))
+ /* Only support initial user namespace for now. */
+ if ((current_user_ns() != &init_user_ns))
return -EPERM;

switch (msg_type) {
@@ -594,6 +593,11 @@ static int audit_netlink_ok(struct sk_buff *skb, u16 msg_type)
case AUDIT_TTY_SET:
case AUDIT_TRIM:
case AUDIT_MAKE_EQUIV:
+ /* Only support auditd and auditctl in initial pid namespace
+ * for now. */
+ if ((task_active_pid_ns(current) != &init_pid_ns))
+ return -EPERM;
+
if (!capable(CAP_AUDIT_CONTROL))
err = -EPERM;
break;
@@ -614,6 +618,7 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
{
int rc = 0;
uid_t uid = from_kuid(&init_user_ns, current_uid());
+ pid_t pid = task_tgid_nr_init_ns(current);

if (!audit_enabled) {
*ab = NULL;
@@ -623,7 +628,7 @@ static int audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
*ab = audit_log_start(NULL, GFP_KERNEL, msg_type);
if (unlikely(!*ab))
return rc;
- audit_log_format(*ab, "pid=%d uid=%u", task_tgid_vnr(current), uid);
+ audit_log_format(*ab, "pid=%d uid=%u", pid, uid);
audit_log_session_info(*ab);
audit_log_task_context(*ab);

@@ -1605,7 +1610,7 @@ void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk)
" euid=%u suid=%u fsuid=%u"
" egid=%u sgid=%u fsgid=%u ses=%u tty=%s",
task_ppid_nr_init_ns(tsk),
- tsk->pid,
+ task_pid_nr_init_ns(tsk),
from_kuid(&init_user_ns, audit_get_loginuid(tsk)),
from_kuid(&init_user_ns, cred->uid),
from_kgid(&init_user_ns, cred->gid),
diff --git a/kernel/auditfilter.c b/kernel/auditfilter.c
index 381d3de..2a60455 100644
--- a/kernel/auditfilter.c
+++ b/kernel/auditfilter.c
@@ -428,6 +428,19 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
f->val = 0;
}

+ if ((f->type == AUDIT_PID) || (f->type == AUDIT_PPID)) {
+ struct pid *pid;
+ rcu_read_lock();
+ pid = find_vpid(f->val);
+ if (!pid) {
+ rcu_read_unlock();
+ err = -ESRCH;
+ goto exit_free;
+ }
+ f->val = pid_nr_init_ns(pid);
+ rcu_read_unlock();
+ }
+
err = audit_field_valid(entry, f);
if (err)
goto exit_free;
@@ -1223,12 +1236,14 @@ static int audit_filter_user_rules(struct audit_krule *rule, int type,

for (i = 0; i < rule->field_count; i++) {
struct audit_field *f = &rule->fields[i];
+ pid_t pid;
int result = 0;
u32 sid;

switch (f->type) {
case AUDIT_PID:
- result = audit_comparator(task_pid_vnr(current), f->op, f->val);
+ pid = task_pid_nr_init_ns(current);
+ result = audit_comparator(pid, f->op, f->val);
break;
case AUDIT_UID:
result = audit_uid_comparator(current_uid(), f->op, f->uid);
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 2dcf67d..5cde81c 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -458,10 +458,12 @@ static int audit_filter_rules(struct task_struct *tsk,
struct audit_field *f = &rule->fields[i];
struct audit_names *n;
int result = 0;
+ pid_t pid;

switch (f->type) {
case AUDIT_PID:
- result = audit_comparator(tsk->pid, f->op, f->val);
+ pid = task_pid_nr_init_ns(tsk);
+ result = audit_comparator(pid, f->op, f->val);
break;
case AUDIT_PPID:
if (ctx) {
@@ -1989,7 +1991,7 @@ int audit_set_loginuid(kuid_t loginuid)
audit_log_format(ab, "login pid=%d uid=%u "
"old auid=%u new auid=%u"
" old ses=%u new ses=%u",
- task->pid,
+ task_pid_nr_init_ns(task),
from_kuid(&init_user_ns, task_uid(task)),
from_kuid(&init_user_ns, task->loginuid),
from_kuid(&init_user_ns, loginuid),
@@ -2197,7 +2199,7 @@ void __audit_ptrace(struct task_struct *t)
{
struct audit_context *context = current->audit_context;

- context->target_pid = t->pid;
+ context->target_pid = task_pid_nr_init_ns(t);
context->target_auid = audit_get_loginuid(t);
context->target_uid = task_uid(t);
context->target_sessionid = audit_get_sessionid(t);
@@ -2222,7 +2224,7 @@ int __audit_signal_info(int sig, struct task_struct *t)

if (audit_pid && task_tgid(t) == audit_pid) {
if (sig == SIGTERM || sig == SIGHUP || sig == SIGUSR1 || sig == SIGUSR2) {
- audit_sig_pid = tsk->pid;
+ audit_sig_pid = task_pid_nr_init_ns(tsk);
if (uid_valid(tsk->loginuid))
audit_sig_uid = tsk->loginuid;
else
@@ -2236,7 +2238,7 @@ int __audit_signal_info(int sig, struct task_struct *t)
/* optimize the common case by putting first signal recipient directly
* in audit_context */
if (!ctx->target_pid) {
- ctx->target_pid = t->tgid;
+ ctx->target_pid = task_tgid_nr_init_ns(t);
ctx->target_auid = audit_get_loginuid(t);
ctx->target_uid = t_uid;
ctx->target_sessionid = audit_get_sessionid(t);
@@ -2257,7 +2259,7 @@ int __audit_signal_info(int sig, struct task_struct *t)
}
BUG_ON(axp->pid_count >= AUDIT_AUX_PIDS);

- axp->target_pid[axp->pid_count] = t->tgid;
+ axp->target_pid[axp->pid_count] = task_tgid_nr_init_ns(t);
axp->target_auid[axp->pid_count] = audit_get_loginuid(t);
axp->target_uid[axp->pid_count] = t_uid;
axp->target_sessionid[axp->pid_count] = audit_get_sessionid(t);
@@ -2356,7 +2358,7 @@ static void audit_log_task(struct audit_buffer *ab)
from_kgid(&init_user_ns, gid),
sessionid);
audit_log_task_context(ab);
- audit_log_format(ab, " pid=%d comm=", current->pid);
+ audit_log_format(ab, " pid=%d comm=", task_pid_nr_init_ns(current));
audit_log_untrustedstring(ab, current->comm);
}

diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index 497b306..afe71f5 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -148,7 +148,7 @@ static void audit_pre(struct audit_buffer *ab, void *ca)
}

if (sa->aad->tsk) {
- audit_log_format(ab, " pid=%d comm=", tsk->pid);
+ audit_log_format(ab, " pid=%d comm=", task_pid_nr_init_ns(tsk));
audit_log_untrustedstring(ab, tsk->comm);
}

diff --git a/security/integrity/integrity_audit.c b/security/integrity/integrity_audit.c
index d7efb30..1e7291a 100644
--- a/security/integrity/integrity_audit.c
+++ b/security/integrity/integrity_audit.c
@@ -39,7 +39,7 @@ void integrity_audit_msg(int audit_msgno, struct inode *inode,

ab = audit_log_start(current->audit_context, GFP_KERNEL, audit_msgno);
audit_log_format(ab, "pid=%d uid=%u auid=%u ses=%u",
- current->pid,
+ task_pid_nr_init_ns(current),
from_kuid(&init_user_ns, current_cred()->uid),
from_kuid(&init_user_ns, audit_get_loginuid(current)),
audit_get_sessionid(current));
diff --git a/security/lsm_audit.c b/security/lsm_audit.c
index 8d8d97d..4f43488 100644
--- a/security/lsm_audit.c
+++ b/security/lsm_audit.c
@@ -220,7 +220,7 @@ static void dump_common_audit_data(struct audit_buffer *ab,
*/
BUILD_BUG_ON(sizeof(a->u) > sizeof(void *)*2);

- audit_log_format(ab, " pid=%d comm=", tsk->pid);
+ audit_log_format(ab, " pid=%d comm=", task_pid_nr_init_ns(tsk));
audit_log_untrustedstring(ab, tsk->comm);

switch (a->type) {
@@ -278,9 +278,12 @@ static void dump_common_audit_data(struct audit_buffer *ab,
}
case LSM_AUDIT_DATA_TASK:
tsk = a->u.tsk;
- if (tsk && tsk->pid) {
- audit_log_format(ab, " pid=%d comm=", tsk->pid);
- audit_log_untrustedstring(ab, tsk->comm);
+ if (tsk) {
+ pid_t pid = task_pid_nr_init_ns(tsk);
+ if (pid) {
+ audit_log_format(ab, " pid=%d comm=", pid);
+ audit_log_untrustedstring(ab, tsk->comm);
+ }
}
break;
case LSM_AUDIT_DATA_NET:
diff --git a/security/tomoyo/audit.c b/security/tomoyo/audit.c
index c1b0037..6657607 100644
--- a/security/tomoyo/audit.c
+++ b/security/tomoyo/audit.c
@@ -147,7 +147,7 @@ static inline const char *tomoyo_filetype(const umode_t mode)
static char *tomoyo_print_header(struct tomoyo_request_info *r)
{
struct tomoyo_time stamp;
- const pid_t gpid = task_pid_nr(current);
+ const pid_t gpid = task_pid_nr_init_ns(current);
struct tomoyo_obj_info *obj = r->obj;
static const int tomoyo_buffer_len = 4096;
char *buffer = kmalloc(tomoyo_buffer_len, GFP_NOFS);
--
1.7.1

2013-08-20 21:33:50

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 12/12] pid: mark struct task const in helper functions

It doesn't make any sense to recallers to pass in a non-const struct
task so update the function signatures to only require a const struct
task.

(informed by ebiederman's c76b2526)
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/sched.h | 44 ++++++++++++++++++++++----------------------
kernel/pid.c | 4 ++--
2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 46e739d..a585b51 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1423,12 +1423,12 @@ static inline void set_numabalancing_state(bool enabled)
}
#endif

-static inline struct pid *task_pid(struct task_struct *task)
+static inline struct pid *task_pid(const struct task_struct *task)
{
return task->pids[PIDTYPE_PID].pid;
}

-static inline struct pid *task_tgid(struct task_struct *task)
+static inline struct pid *task_tgid(const struct task_struct *task)
{
return task->group_leader->pids[PIDTYPE_PID].pid;
}
@@ -1438,12 +1438,12 @@ static inline struct pid *task_tgid(struct task_struct *task)
* the result of task_pgrp/task_session even if task == current,
* we can race with another thread doing sys_setsid/sys_setpgid.
*/
-static inline struct pid *task_pgrp(struct task_struct *task)
+static inline struct pid *task_pgrp(const struct task_struct *task)
{
return task->group_leader->pids[PIDTYPE_PGID].pid;
}

-static inline struct pid *task_session(struct task_struct *task)
+static inline struct pid *task_session(const struct task_struct *task)
{
return task->group_leader->pids[PIDTYPE_SID].pid;
}
@@ -1468,50 +1468,50 @@ struct pid_namespace;
*
* see also pid_nr() etc in include/linux/pid.h
*/
-pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
+pid_t __task_pid_nr_ns(const struct task_struct *task, enum pid_type type,
struct pid_namespace *ns);

-static inline pid_t task_pid_nr(struct task_struct *tsk)
+static inline pid_t task_pid_nr(const struct task_struct *tsk)
{
return pid_nr(task_pid(tsk));
}

-static inline pid_t task_pid_nr_ns(struct task_struct *tsk,
+static inline pid_t task_pid_nr_ns(const struct task_struct *tsk,
struct pid_namespace *ns)
{
return __task_pid_nr_ns(tsk, PIDTYPE_PID, ns);
}

-static inline pid_t task_pid_nr_init_ns(struct task_struct *tsk)
+static inline pid_t task_pid_nr_init_ns(const struct task_struct *tsk)
{
return task_pid_nr_ns(tsk, &init_pid_ns);
}

-static inline pid_t task_pid_vnr(struct task_struct *tsk)
+static inline pid_t task_pid_vnr(const struct task_struct *tsk)
{
return __task_pid_nr_ns(tsk, PIDTYPE_PID, NULL);
}


-static inline pid_t task_tgid_nr(struct task_struct *tsk)
+static inline pid_t task_tgid_nr(const struct task_struct *tsk)
{
return pid_nr(task_tgid(tsk));
}

-pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns);
+pid_t task_tgid_nr_ns(const struct task_struct *tsk, struct pid_namespace *ns);

-static inline pid_t task_tgid_nr_init_ns(struct task_struct *tsk)
+static inline pid_t task_tgid_nr_init_ns(const struct task_struct *tsk)
{
return task_tgid_nr_ns(tsk, &init_pid_ns);
}

-static inline pid_t task_tgid_vnr(struct task_struct *tsk)
+static inline pid_t task_tgid_vnr(const struct task_struct *tsk)
{
return pid_vnr(task_tgid(tsk));
}


-static inline pid_t task_ppid_nr_ns(struct task_struct *tsk,
+static inline pid_t task_ppid_nr_ns(const struct task_struct *tsk,
struct pid_namespace *ns)
{
pid_t pid;
@@ -1523,37 +1523,37 @@ static inline pid_t task_ppid_nr_ns(struct task_struct *tsk,
return pid;
}

-static inline pid_t task_ppid_nr_init_ns(struct task_struct *tsk)
+static inline pid_t task_ppid_nr_init_ns(const struct task_struct *tsk)
{
return task_ppid_nr_ns(tsk, &init_pid_ns);
}


-static inline pid_t task_pgrp_nr_ns(struct task_struct *tsk,
+static inline pid_t task_pgrp_nr_ns(const struct task_struct *tsk,
struct pid_namespace *ns)
{
return __task_pid_nr_ns(tsk, PIDTYPE_PGID, ns);
}

-static inline pid_t task_pgrp_vnr(struct task_struct *tsk)
+static inline pid_t task_pgrp_vnr(const struct task_struct *tsk)
{
return __task_pid_nr_ns(tsk, PIDTYPE_PGID, NULL);
}


-static inline pid_t task_session_nr_ns(struct task_struct *tsk,
+static inline pid_t task_session_nr_ns(const struct task_struct *tsk,
struct pid_namespace *ns)
{
return __task_pid_nr_ns(tsk, PIDTYPE_SID, ns);
}

-static inline pid_t task_session_vnr(struct task_struct *tsk)
+static inline pid_t task_session_vnr(const struct task_struct *tsk)
{
return __task_pid_nr_ns(tsk, PIDTYPE_SID, NULL);
}

/* obsolete, do not use */
-static inline pid_t task_pgrp_nr(struct task_struct *tsk)
+static inline pid_t task_pgrp_nr(const struct task_struct *tsk)
{
return task_pgrp_nr_ns(tsk, &init_pid_ns);
}
@@ -1566,7 +1566,7 @@ static inline pid_t task_pgrp_nr(struct task_struct *tsk)
* If pid_alive fails, then pointers within the task structure
* can be stale and must not be dereferenced.
*/
-static inline int pid_alive(struct task_struct *p)
+static inline int pid_alive(const struct task_struct *p)
{
return p->pids[PIDTYPE_PID].pid != NULL;
}
@@ -1577,7 +1577,7 @@ static inline int pid_alive(struct task_struct *p)
*
* Check if a task structure is the first user space task the kernel created.
*/
-static inline int is_global_init(struct task_struct *tsk)
+static inline int is_global_init(const struct task_struct *tsk)
{
return task_pid_nr(tsk) == 1;
}
diff --git a/kernel/pid.c b/kernel/pid.c
index 66505c1..17755ae 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -506,7 +506,7 @@ pid_t pid_vnr(struct pid *pid)
}
EXPORT_SYMBOL_GPL(pid_vnr);

-pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
+pid_t __task_pid_nr_ns(const struct task_struct *task, enum pid_type type,
struct pid_namespace *ns)
{
pid_t nr = 0;
@@ -525,7 +525,7 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type,
}
EXPORT_SYMBOL(__task_pid_nr_ns);

-pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
+pid_t task_tgid_nr_ns(const struct task_struct *tsk, struct pid_namespace *ns)
{
return pid_nr_ns(task_tgid(tsk), ns);
}
--
1.7.1

2013-08-20 21:34:16

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

This stops these four task helper functions from using the deprecated and
error-prone task->pid and task->tgid.

(informed by ebiederman's ea5a4d01)
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/sched.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8e69807..46e739d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
*/
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}

extern struct pid *cad_pid;
@@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
*/
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
extern struct task_struct *curr_task(int cpu);
extern void set_curr_task(int cpu, struct task_struct *p);
@@ -2203,13 +2203,13 @@ static inline bool thread_group_leader(struct task_struct *p)
*/
static inline int has_group_leader_pid(struct task_struct *p)
{
- return p->pid == p->tgid;
+ return task_pid(p) == task_tgid(p);
}

static inline
int same_thread_group(struct task_struct *p1, struct task_struct *p2)
{
- return p1->tgid == p2->tgid;
+ return task_tgid(p1) == task_tgid(p2);
}

static inline struct task_struct *next_thread(const struct task_struct *p)
--
1.7.1

2013-08-20 21:34:18

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 10/12] pid: modify task_tgid_nr to work without task->tgid.

task->tgid is an error prone construct and results in duplicate maintenance.
Start it's demise by modifying task_tgid_nr to not use it.

Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/sched.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index fb09b93..8e69807 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1495,7 +1495,7 @@ static inline pid_t task_pid_vnr(struct task_struct *tsk)

static inline pid_t task_tgid_nr(struct task_struct *tsk)
{
- return tsk->tgid;
+ return pid_nr(task_tgid(tsk));
}

pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns);
--
1.7.1

2013-08-20 21:34:51

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 07/12] audit: store audit_pid as a struct pid pointer

- PID will be reported in the relevant querying PID namespace.

- Refuse to change the current audit_pid if the new value does not
point to an existing process.

- Refuse to set the current audit_pid if the new value is not its own PID
(unless it is being unset).

- Convert audit_pid into the initial pid namespace for reports

(informed by ebiederman's 5bf431da)
Cc: "Eric W. Biederman" <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
kernel/audit.c | 25 +++++++++++++++++++------
kernel/audit.h | 4 ++--
kernel/auditsc.c | 6 +++---
3 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 75b0989..fa1d5f5 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -93,7 +93,7 @@ static int audit_failure = AUDIT_FAIL_PRINTK;
* contains the pid of the auditd process and audit_nlk_portid contains
* the portid to use to send netlink messages to that process.
*/
-int audit_pid;
+struct pid *audit_pid;
static __u32 audit_nlk_portid;

/* If audit_rate_limit is non-zero, limit the rate of sending audit records
@@ -388,9 +388,11 @@ static void kauditd_send_skb(struct sk_buff *skb)
err = netlink_unicast(audit_sock, skb, audit_nlk_portid, 0);
if (err < 0) {
BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
- printk(KERN_ERR "audit: *NO* daemon at audit_pid=%d\n", audit_pid);
+ pr_err("audit: *NO* daemon at audit_pid=%d\n"
+ , pid_nr_init_ns(audit_pid));
audit_log_lost("auditd disappeared\n");
- audit_pid = 0;
+ put_pid(audit_pid);
+ audit_pid = NULL;
/* we might get lucky and get this in the next auditd */
audit_hold_skb(skb);
} else
@@ -661,7 +663,7 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
case AUDIT_GET:
status_set.enabled = audit_enabled;
status_set.failure = audit_failure;
- status_set.pid = audit_pid;
+ status_set.pid = pid_vnr(audit_pid);
status_set.rate_limit = audit_rate_limit;
status_set.backlog_limit = audit_backlog_limit;
status_set.lost = atomic_read(&audit_lost);
@@ -684,10 +686,21 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
return err;
}
if (status_get->mask & AUDIT_STATUS_PID) {
- int new_pid = status_get->pid;
+ struct pid *new_pid = find_get_pid(status_get->pid);
+ if (status_get->pid && !new_pid)
+ return -ESRCH;
+
+ /* check that non-zero pid it is trying to set is its
+ * own*/
+ if (status_get->pid &&
+ (status_get->pid != task_pid_vnr(current)))
+ return -EPERM;

if (audit_enabled != AUDIT_OFF)
- audit_log_config_change("audit_pid", new_pid, audit_pid, 1);
+ audit_log_config_change("audit_pid",
+ pid_nr_init_ns(new_pid),
+ pid_nr_init_ns(audit_pid),
+ 1);
audit_pid = new_pid;
audit_nlk_portid = NETLINK_CB(skb).portid;
}
diff --git a/kernel/audit.h b/kernel/audit.h
index 36edcf5..3932a3b 100644
--- a/kernel/audit.h
+++ b/kernel/audit.h
@@ -218,7 +218,7 @@ extern void audit_log_name(struct audit_context *context,
struct audit_names *n, struct path *path,
int record_num, int *call_panic);

-extern int audit_pid;
+extern struct pid *audit_pid;

#define AUDIT_INODE_BUCKETS 32
extern struct list_head audit_inode_hash[AUDIT_INODE_BUCKETS];
@@ -310,7 +310,7 @@ extern u32 audit_sig_sid;
extern int __audit_signal_info(int sig, struct task_struct *t);
static inline int audit_signal_info(int sig, struct task_struct *t)
{
- if (unlikely((audit_pid && t->tgid == audit_pid) ||
+ if (unlikely((audit_pid && task_tgid(t) == audit_pid) ||
(audit_signals && !audit_dummy_context())))
return __audit_signal_info(sig, t);
return 0;
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 60a966d..2dcf67d 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -739,7 +739,7 @@ static enum audit_state audit_filter_syscall(struct task_struct *tsk,
struct audit_entry *e;
enum audit_state state;

- if (audit_pid && tsk->tgid == audit_pid)
+ if (audit_pid && task_tgid(tsk) == audit_pid)
return AUDIT_DISABLED;

rcu_read_lock();
@@ -800,7 +800,7 @@ void audit_filter_inodes(struct task_struct *tsk, struct audit_context *ctx)
{
struct audit_names *n;

- if (audit_pid && tsk->tgid == audit_pid)
+ if (audit_pid && task_tgid(tsk) == audit_pid)
return;

rcu_read_lock();
@@ -2220,7 +2220,7 @@ int __audit_signal_info(int sig, struct task_struct *t)
struct audit_context *ctx = tsk->audit_context;
kuid_t uid = current_uid(), t_uid = task_uid(t);

- if (audit_pid && t->tgid == audit_pid) {
+ if (audit_pid && task_tgid(t) == audit_pid) {
if (sig == SIGTERM || sig == SIGHUP || sig == SIGUSR1 || sig == SIGUSR2) {
audit_sig_pid = tsk->pid;
if (uid_valid(tsk->loginuid))
--
1.7.1

2013-08-20 21:33:35

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 05/12] pid: get pid_t of task in init_pid_ns correctly

Added the functions
task_pid_nr_init_ns()
task_tgid_nr_init_ns()
to avoid the use of the error-prone duplicity of task->pid and task->tgid,
avoid changing the existing usage of task_pid_nr() and task_tgid_nr() while it
gets converted away from task->pid and task->tgid, and provide a clear
abstraction of the frequently used init_pid_ns in task_pid_nr_ns() and
task_tgid_nr_ns().
Also added pid_nr_init_ns() to explicitly use init_pid_ns.

(informed by ebiederman's 3a2e8c59)
Cc: Eric W. Biederman <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
include/linux/pid.h | 6 ++++++
include/linux/sched.h | 10 ++++++++++
2 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/include/linux/pid.h b/include/linux/pid.h
index 23705a5..051e134 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -172,6 +172,12 @@ static inline pid_t pid_nr(struct pid *pid)
pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns);
pid_t pid_vnr(struct pid *pid);

+static inline pid_t pid_nr_init_ns(struct pid *pid)
+{
+ return pid_nr_ns(pid, &init_pid_ns);
+}
+
+
#define do_each_pid_task(pid, type, task) \
do { \
if ((pid) != NULL) \
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ec04090..9651d95 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1482,6 +1482,11 @@ static inline pid_t task_pid_nr_ns(struct task_struct *tsk,
return __task_pid_nr_ns(tsk, PIDTYPE_PID, ns);
}

+static inline pid_t task_pid_nr_init_ns(struct task_struct *tsk)
+{
+ return task_pid_nr_ns(tsk, &init_pid_ns);
+}
+
static inline pid_t task_pid_vnr(struct task_struct *tsk)
{
return __task_pid_nr_ns(tsk, PIDTYPE_PID, NULL);
@@ -1495,6 +1500,11 @@ static inline pid_t task_tgid_nr(struct task_struct *tsk)

pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns);

+static inline pid_t task_tgid_nr_init_ns(struct task_struct *tsk)
+{
+ return task_tgid_nr_ns(tsk, &init_pid_ns);
+}
+
static inline pid_t task_tgid_vnr(struct task_struct *tsk)
{
return pid_vnr(task_tgid(tsk));
--
1.7.1

2013-08-20 21:35:55

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 04/12] audit: convert PPIDs to the inital PID namespace.

sys_getppid() returns the parent pid of the current process in its own pid
namespace. Since audit filters are based in the init pid namespace, a process
could avoid a filter or trigger an unintended one by being in an alternate pid
namespace or log meaningless information.

Switch to task_ppid_nr_init_ns() for PPIDs to anchor all audit filters in the
init_pid_ns.

(informed by ebiederman's 6c621b7e)
Cc: [email protected]
Cc: Eric W. Biederman <[email protected]>
Signed-off-by: Richard Guy Briggs <[email protected]>
---
kernel/audit.c | 4 ++--
kernel/auditsc.c | 2 +-
security/apparmor/audit.c | 5 +----
3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 2476334..75b0989 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1588,10 +1588,10 @@ void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk)
spin_unlock_irq(&tsk->sighand->siglock);

audit_log_format(ab,
- " ppid=%ld pid=%d auid=%u uid=%u gid=%u"
+ " ppid=%d pid=%d auid=%u uid=%u gid=%u"
" euid=%u suid=%u fsuid=%u"
" egid=%u sgid=%u fsgid=%u ses=%u tty=%s",
- sys_getppid(),
+ task_ppid_nr_init_ns(tsk),
tsk->pid,
from_kuid(&init_user_ns, audit_get_loginuid(tsk)),
from_kuid(&init_user_ns, cred->uid),
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 8c644c5..74a0748 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -466,7 +466,7 @@ static int audit_filter_rules(struct task_struct *tsk,
case AUDIT_PPID:
if (ctx) {
if (!ctx->ppid)
- ctx->ppid = sys_getppid();
+ ctx->ppid = task_ppid_nr_init_ns(current);
result = audit_comparator(ctx->ppid, f->op, f->val);
}
break;
diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index 031d2d9..497b306 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -132,10 +132,7 @@ static void audit_pre(struct audit_buffer *ab, void *ca)

if (sa->aad->profile) {
struct aa_profile *profile = sa->aad->profile;
- pid_t pid;
- rcu_read_lock();
- pid = rcu_dereference(tsk->real_parent)->pid;
- rcu_read_unlock();
+ pid_t pid = task_ppid_nr_init_ns(tsk);
audit_log_format(ab, " parent=%d", pid);
if (profile->ns != root_ns) {
audit_log_format(ab, " namespace=");
--
1.7.1

2013-08-20 21:33:25

by Richard Guy Briggs

[permalink] [raw]
Subject: [PATCH 01/12] audit: Kill the unused struct audit_aux_data_capset

From: Eric W. Biederman <[email protected]>

Signed-off-by: "Eric W. Biederman" <[email protected]>
(cherry picked from commit 6904431d6b41190e42d6b94430b67cb4e7e6a4b7)
Signed-off-by: Richard Guy Briggs <[email protected]>
---
kernel/auditsc.c | 6 ------
1 files changed, 0 insertions(+), 6 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 9845cb3..8c644c5 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -121,12 +121,6 @@ struct audit_aux_data_bprm_fcaps {
struct audit_cap_data new_pcap;
};

-struct audit_aux_data_capset {
- struct audit_aux_data d;
- pid_t pid;
- struct audit_cap_data cap;
-};
-
struct audit_tree_refs {
struct audit_tree_refs *next;
struct audit_chunk *c[31];
--
1.7.1

2013-08-22 19:15:23

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On 08/20, Richard Guy Briggs wrote:
>
> static inline int is_global_init(struct task_struct *tsk)
> {
> - return tsk->pid == 1;
> + return task_pid_nr(tsk) == 1;
> }

Probably it would be better to simply kill it. Almost every usage is
wrong.

> static inline bool is_idle_task(const struct task_struct *p)
> {
> - return p->pid == 0;
> + return task_pid(p) == &init_struct_pid;
> }

hmm. there should be a simpler check for this...

> static inline int has_group_leader_pid(struct task_struct *p)
> {
> - return p->pid == p->tgid;
> + return task_pid(p) == task_tgid(p);
> }
>
> static inline
> int same_thread_group(struct task_struct *p1, struct task_struct *p2)
> {
> - return p1->tgid == p2->tgid;
> + return task_tgid(p1) == task_tgid(p2);

This is suboptinal. See the attached
include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid.patch
from -mm below.

Oleg.


--- a/include/linux/sched.h~include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid
+++ a/include/linux/sched.h
@@ -2179,15 +2179,15 @@ static inline bool thread_group_leader(s
* all we care about is that we have a task with the appropriate
* pid, we don't actually care if we have the right task.
*/
-static inline int has_group_leader_pid(struct task_struct *p)
+static inline bool has_group_leader_pid(struct task_struct *p)
{
- return p->pid == p->tgid;
+ return task_pid(p) == p->signal->leader_pid;
}

static inline
-int same_thread_group(struct task_struct *p1, struct task_struct *p2)
+bool same_thread_group(struct task_struct *p1, struct task_struct *p2)
{
- return p1->tgid == p2->tgid;
+ return p1->signal == p2->signal;
}

static inline struct task_struct *next_thread(const struct task_struct *p)

2013-08-22 20:06:16

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Tue, Aug 20, 2013 at 05:32:03PM -0400, Richard Guy Briggs wrote:
> This stops these four task helper functions from using the deprecated and
> error-prone task->pid and task->tgid.
>
> (informed by ebiederman's ea5a4d01)
> Cc: "Eric W. Biederman" <[email protected]>
> Signed-off-by: Richard Guy Briggs <[email protected]>
> ---
> include/linux/sched.h | 8 ++++----
> 1 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 8e69807..46e739d 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
> */
> static inline int is_global_init(struct task_struct *tsk)
> {
> - return tsk->pid == 1;
> + return task_pid_nr(tsk) == 1;
> }
>
> extern struct pid *cad_pid;
> @@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
> */
> static inline bool is_idle_task(const struct task_struct *p)
> {
> - return p->pid == 0;
> + return task_pid(p) == &init_struct_pid;
> }
> extern struct task_struct *curr_task(int cpu);
> extern void set_curr_task(int cpu, struct task_struct *p);

Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.

2013-08-22 21:44:21

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Thu, Aug 22, 2013 at 10:05:55PM +0200, Peter Zijlstra wrote:
> On Tue, Aug 20, 2013 at 05:32:03PM -0400, Richard Guy Briggs wrote:
> > This stops these four task helper functions from using the deprecated and
> > error-prone task->pid and task->tgid.
> >
> > (informed by ebiederman's ea5a4d01)
> > Cc: "Eric W. Biederman" <[email protected]>
> > Signed-off-by: Richard Guy Briggs <[email protected]>
> > ---
> > include/linux/sched.h | 8 ++++----
> > 1 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 8e69807..46e739d 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
> > */
> > static inline int is_global_init(struct task_struct *tsk)
> > {
> > - return tsk->pid == 1;
> > + return task_pid_nr(tsk) == 1;
> > }
> >
> > extern struct pid *cad_pid;
> > @@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
> > */
> > static inline bool is_idle_task(const struct task_struct *p)
> > {
> > - return p->pid == 0;
> > + return task_pid(p) == &init_struct_pid;
> > }
> > extern struct task_struct *curr_task(int cpu);
> > extern void set_curr_task(int cpu, struct task_struct *p);
>
> Why would you ever want to do this? It just makes these tests more
> expensive for no gain what so ff'ing ever.

Backups are generally considered a good idea, but in this case, I'd
quote:
"A man with one watch knows what time it is. A man with two is
never certain."

Reminds me of the twist of a phrase frequently seen in the US gov:
"Government Duplicity, Do Not Propagate" ;-)


Can you suggest a safe way to live with this duplicity?


- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-08-23 06:36:30

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Thu, Aug 22, 2013 at 05:43:47PM -0400, Richard Guy Briggs wrote:
> On Thu, Aug 22, 2013 at 10:05:55PM +0200, Peter Zijlstra wrote:
> > On Tue, Aug 20, 2013 at 05:32:03PM -0400, Richard Guy Briggs wrote:
> > > This stops these four task helper functions from using the deprecated and
> > > error-prone task->pid and task->tgid.
> > >
> > > (informed by ebiederman's ea5a4d01)
> > > Cc: "Eric W. Biederman" <[email protected]>
> > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > ---
> > > include/linux/sched.h | 8 ++++----
> > > 1 files changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index 8e69807..46e739d 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
> > > */
> > > static inline int is_global_init(struct task_struct *tsk)
> > > {
> > > - return tsk->pid == 1;
> > > + return task_pid_nr(tsk) == 1;
> > > }
> > >
> > > extern struct pid *cad_pid;
> > > @@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
> > > */
> > > static inline bool is_idle_task(const struct task_struct *p)
> > > {
> > > - return p->pid == 0;
> > > + return task_pid(p) == &init_struct_pid;
> > > }
> > > extern struct task_struct *curr_task(int cpu);
> > > extern void set_curr_task(int cpu, struct task_struct *p);
> >
> > Why would you ever want to do this? It just makes these tests more
> > expensive for no gain what so ff'ing ever.
>
> Backups are generally considered a good idea, but in this case, I'd
> quote:
> "A man with one watch knows what time it is. A man with two is
> never certain."

Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.

Furthermore idle threads really are special and it doesn't make sense to
address them in any but the root namespace, doubly so because only
kernel space does this.

As for the init thread, that function is called is_global_init() for
crying out loud, what numb nut doesn't get that that's supposed to be
using the root namespace?

Seriously, you namespace guys should stop messing things up and
confusing yourselves and others.

2013-08-23 19:34:28

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On 08/22, Richard Guy Briggs wrote:
>
> On Thu, Aug 22, 2013 at 10:05:55PM +0200, Peter Zijlstra wrote:
> >
> > Why would you ever want to do this? It just makes these tests more
> > expensive for no gain what so ff'ing ever.
>
> Backups are generally considered a good idea, but in this case, I'd
> quote:

And perhaps you are right. At least we can probably kill task->tgid.
And I agree, it would be nice to kill task->pid.

But: I also agree with Peter, we should try to not make the current
code more expensive.

Anyway. Imho, you should not mix the different things in one series.
If you want to fix audit, do not add the patches like 10/12 or 11/12
into this series.

Or 3/12. OK, I agree sys_getppid() in audit_log_task_info() looks
strange at least. Just fix it using the helpers we already have and
add the new helpers later. Or send the patch(es) which adds the new
helpers first.

Or task_pid_nr_init_ns()... For what? We already have task_pid_nr().
Use the helper we already have, or introduce the new one first and
change the current users of task_pid_nr().

In short. Fortunately you do not need to convince me, I do not
maintain audit or namespaces ;) But imho this series looks a bit
confusing.

Oleg.

2013-08-26 22:08:12

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Thu, Aug 22, 2013 at 09:08:48PM +0200, Oleg Nesterov wrote:
> On 08/20, Richard Guy Briggs wrote:
> >
> > static inline int is_global_init(struct task_struct *tsk)
> > {
> > - return tsk->pid == 1;
> > + return task_pid_nr(tsk) == 1;
> > }
>
> Probably it would be better to simply kill it. Almost every usage is
> wrong.

Can you be more clear? I don't follow. It should instead return a
boolean. Usage of is_global_init() or task_pid_nr()?

If is_global_init(), is that because they could be unaware of pid
namespaces?

If task_pid_nr(), is that for the same reason?

> > static inline bool is_idle_task(const struct task_struct *p)
> > {
> > - return p->pid == 0;
> > + return task_pid(p) == &init_struct_pid;
> > }
>
> hmm. there should be a simpler check for this...

Other than the original, this one is pretty simple. What did you have
in mind?

> > static inline int has_group_leader_pid(struct task_struct *p)
> > {
> > - return p->pid == p->tgid;
> > + return task_pid(p) == task_tgid(p);
> > }
> >
> > static inline
> > int same_thread_group(struct task_struct *p1, struct task_struct *p2)
> > {
> > - return p1->tgid == p2->tgid;
> > + return task_tgid(p1) == task_tgid(p2);
>
> This is suboptinal. See the attached
> include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid.patch
> from -mm below.

I'm fine with that if it is deemed better. The point was to remove the
dependence on task_struct::tgid.

> Oleg.

- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-08-27 02:37:55

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Fri, Aug 23, 2013 at 08:36:21AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 22, 2013 at 05:43:47PM -0400, Richard Guy Briggs wrote:
> > On Thu, Aug 22, 2013 at 10:05:55PM +0200, Peter Zijlstra wrote:
> > > On Tue, Aug 20, 2013 at 05:32:03PM -0400, Richard Guy Briggs wrote:
> > > > This stops these four task helper functions from using the deprecated and
> > > > error-prone task->pid and task->tgid.
> > > >
> > > > (informed by ebiederman's ea5a4d01)
> > > > Cc: "Eric W. Biederman" <[email protected]>
> > > > Signed-off-by: Richard Guy Briggs <[email protected]>
> > > > ---
> > > > include/linux/sched.h | 8 ++++----
> > > > 1 files changed, 4 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index 8e69807..46e739d 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
> > > > */
> > > > static inline int is_global_init(struct task_struct *tsk)
> > > > {
> > > > - return tsk->pid == 1;
> > > > + return task_pid_nr(tsk) == 1;
> > > > }
> > > >
> > > > extern struct pid *cad_pid;
> > > > @@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
> > > > */
> > > > static inline bool is_idle_task(const struct task_struct *p)
> > > > {
> > > > - return p->pid == 0;
> > > > + return task_pid(p) == &init_struct_pid;
> > > > }
> > > > extern struct task_struct *curr_task(int cpu);
> > > > extern void set_curr_task(int cpu, struct task_struct *p);
> > >
> > > Why would you ever want to do this? It just makes these tests more
> > > expensive for no gain what so ff'ing ever.
> >
> > Backups are generally considered a good idea, but in this case, I'd
> > quote:
> > "A man with one watch knows what time it is. A man with two is
> > never certain."
>
> Except that's not the case, with namespaces there's a clear hierarchy
> and the task_struct::pid is the one true value aka. root namespace.

Peter, I agonized over the access efficiency of dropping this one or the
duplicate in task_struct::pids and this one was far easier to drop in
terms of somehow always forcing
task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.

It should be possible to audit the kernel to make certain task->pid is
only ever written at the time of task creation and copied to its own
task->pids[PIDTYPE_PID].pid->numbers[0].nr at the time of task creation
so that the two values are consistent. Continuously auditing the kernel
so this is the case would be a bit more of a challenge.

Would it be reasonable to suggest task_struct::pid only be accessed by
the existing inlined task_pid_nr() converted to const?

The goal is to gain assurance that any PIDs referred to in audit logs
are indisputable.

Would you be alright with doing away with task_struct::tgid?

> Furthermore idle threads really are special and it doesn't make sense to
> address them in any but the root namespace, doubly so because only
> kernel space does this.

If that is the case, and the above holds true, then we don't need the
second hunk, I agree.

> As for the init thread, that function is called is_global_init() for
> crying out loud, what numb nut doesn't get that that's supposed to be
> using the root namespace?

A process in another pid namespace? If that's never going to happen,
then drop it.

> Seriously, you namespace guys should stop messing things up and
> confusing yourselves and others.

"you namespace guys"? I'm not a namespace guy. I'm a rusty kernel
network security guy taking on audit, trying to prepare it for moving
into a more and more namespace-enabled place of
containerization/light-virtualization.

We aren't ready for it yet, but there is demand to run additional auditd
daemons in other pid namespaces and some of this work is trying to move
in that direction.

This patchset certainly wasn't intended to be ready for adoption just
yet. It was this sort of discussion I was hoping to have.


- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-08-27 03:05:26

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Fri, Aug 23, 2013 at 09:28:07PM +0200, Oleg Nesterov wrote:
> On 08/22, Richard Guy Briggs wrote:
> >
> > On Thu, Aug 22, 2013 at 10:05:55PM +0200, Peter Zijlstra wrote:
> > >
> > > Why would you ever want to do this? It just makes these tests more
> > > expensive for no gain what so ff'ing ever.
> >
> > Backups are generally considered a good idea, but in this case, I'd
> > quote:
>
> And perhaps you are right. At least we can probably kill task->tgid.
> And I agree, it would be nice to kill task->pid.
>
> But: I also agree with Peter, we should try to not make the current
> code more expensive.

I don't disagree. I was given some hope by Eric Biederman that it might
not cause cache-line misses...

> Anyway. Imho, you should not mix the different things in one series.
> If you want to fix audit, do not add the patches like 10/12 or 11/12
> into this series.

They were included to gain reassurance that PIDs reported in audit logs
were accurate. If those assurances can be made, then I can limit my
work to audit itself.

> Or 3/12.

That is a cleanup to make clear what parts are actually pid-related and
what isn't.

> OK, I agree sys_getppid() in audit_log_task_info() looks
> strange at least. Just fix it using the helpers we already have and
> add the new helpers later. Or send the patch(es) which adds the new
> helpers first.

Patch 04/12 is that helper. It is used in only two places in audit (and
once in apparmor), so I could have duplicated the code, but since it
needs rcu locking, an inline funciton in the audit namespace seemed
somewhat pointless when it could just as easily be shared with other
subsystems.

> Or task_pid_nr_init_ns()... For what? We already have task_pid_nr().
> Use the helper we already have, or introduce the new one first and
> change the current users of task_pid_nr().

If task_struct::pid is definitely not going away, then that whole part
is moot and we'll just use task_pid_nr() as is.

> In short. Fortunately you do not need to convince me, I do not
> maintain audit or namespaces ;) But imho this series looks a bit
> confusing.

It is incomplete. The last step(s) were intended to purge
task_struct::pid (or abstract it using task_pid_nr()), which haven't
been submitted yet. The whole point of the patchset was to give
confidence in the PIDs reported in any audit logs.

> Oleg.

- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-08-27 12:11:44

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Mon, Aug 26, 2013 at 10:37:22PM -0400, Richard Guy Briggs wrote:
> On Fri, Aug 23, 2013 at 08:36:21AM +0200, Peter Zijlstra wrote:
> > Except that's not the case, with namespaces there's a clear hierarchy
> > and the task_struct::pid is the one true value aka. root namespace.
>
> Peter, I agonized over the access efficiency of dropping this one or the
> duplicate in task_struct::pids and this one was far easier to drop in
> terms of somehow always forcing
> task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.

You mean there's more than 1 site that sets task_struct::pid? I thought
we only assign that thing once in fork.c someplace.

> It should be possible to audit the kernel to make certain task->pid is
> only ever written at the time of task creation and copied to its own
> task->pids[PIDTYPE_PID].pid->numbers[0].nr at the time of task creation
> so that the two values are consistent. Continuously auditing the kernel
> so this is the case would be a bit more of a challenge.

I know there's people running scripts over git commits to catch abuse,
if this is scriptable that might be doable.

> Would it be reasonable to suggest task_struct::pid only be accessed by
> the existing inlined task_pid_nr() converted to const?

Sure that works for me, alternatively what's wrong with making
task_struct::pid itself a const pid_t ? Then assignment will need an
ugly cast to even work.

> The goal is to gain assurance that any PIDs referred to in audit logs
> are indisputable.
>
> Would you be alright with doing away with task_struct::tgid?

So I don't particularly use that one much -- if at all. So no I don't
mind it too much.

> > Furthermore idle threads really are special and it doesn't make sense to
> > address them in any but the root namespace, doubly so because only
> > kernel space does this.
>
> If that is the case, and the above holds true, then we don't need the
> second hunk, I agree.

It should be the case -- not entirely sure it is the case, but in any
case pid-0 should be 'special' by all accounts.

> > As for the init thread, that function is called is_global_init() for
> > crying out loud, what numb nut doesn't get that that's supposed to be
> > using the root namespace?
>
> A process in another pid namespace? If that's never going to happen,
> then drop it.

That'd be a bug I suppose, you want the 'global' init when using that
function. I don't use this function, never have. So I really don't know
_that_ much about it. It just seems to me that the name really implies
it wants the root init process and not any other.

> > Seriously, you namespace guys should stop messing things up and
> > confusing yourselves and others.
>
> "you namespace guys"? I'm not a namespace guy. I'm a rusty kernel
> network security guy taking on audit, trying to prepare it for moving
> into a more and more namespace-enabled place of
> containerization/light-virtualization.

Well, you let yourself in with 'those' people ;-)

> We aren't ready for it yet, but there is demand to run additional auditd
> daemons in other pid namespaces and some of this work is trying to move
> in that direction.

So afaict that's 'simply' writing the 'right' pid to your logger, right?
Your additional concern that the pid space isn't corrupted sounds a bit
superfluous to me, we don't typically muck about with pids, stuff would
horribly break if we did that.

There's a few special cases, like the idle threads having pid-0 and
'simple' people like myself prefer to use task_struct::pid for debugging
when we run our simple kernels without all this namespace stuff enabled.

The entire task->pids[PIDTYPE_PID].pid->numbers[0].nr thing just seems
increddibly unwieldy and double dereferences, even if the lines are
'hot' are worse than single derefs.

2013-08-27 16:22:05

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On 08/26, Richard Guy Briggs wrote:
>
> On Thu, Aug 22, 2013 at 09:08:48PM +0200, Oleg Nesterov wrote:
> > On 08/20, Richard Guy Briggs wrote:
> > >
> > > static inline int is_global_init(struct task_struct *tsk)
> > > {
> > > - return tsk->pid == 1;
> > > + return task_pid_nr(tsk) == 1;
> > > }
> >
> > Probably it would be better to simply kill it. Almost every usage is
> > wrong.
>
> Can you be more clear? I don't follow. It should instead return a
> boolean. Usage of is_global_init() or task_pid_nr()?

Just look at the callers. For example, how is_global_init() can save
/sbin/init from oom-killer if it is multithreaded ?

> If is_global_init(), is that because they could be unaware of pid
> namespaces?

Because I think nobody actually needs is_a_group_leader_of_global_init(),
and this is what this helper actually is.

> > > static inline bool is_idle_task(const struct task_struct *p)
> > > {
> > > - return p->pid == 0;
> > > + return task_pid(p) == &init_struct_pid;
> > > }
> >
> > hmm. there should be a simpler check for this...
>
> Other than the original, this one is pretty simple.

I meant that the original check is cheaper,

> What did you have
> in mind?

Well. I agree that it would be nice to avoid the dependence on task->pid
if possible. And perhaps even kill it eventually. But I am not sure how
much we should try.

If it was the last user of ->pid, then I would agree with this change.
Although we can make it cheaper, say, we can change idle_init() to
nullify tasks->next and use ->next == NULL.

But we have a lot more ->pid users, perhaps we should change them first.

And more importantly, let me repeat. I do not think that this change
should be mixed with other changes in this series.

> > > static inline int has_group_leader_pid(struct task_struct *p)
> > > {
> > > - return p->pid == p->tgid;
> > > + return task_pid(p) == task_tgid(p);
> > > }
> > >
> > > static inline
> > > int same_thread_group(struct task_struct *p1, struct task_struct *p2)
> > > {
> > > - return p1->tgid == p2->tgid;
> > > + return task_tgid(p1) == task_tgid(p2);
> >
> > This is suboptinal. See the attached
> > include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid.patch
> > from -mm below.
>
> I'm fine with that if it is deemed better. The point was to remove the
> dependence on task_struct::tgid.

But I agree! My only point was, this conflicts with the patch we already
have and that patch is more optimal. p1->leader == p2->leader is cheaper.

Oleg.

2013-08-27 17:17:59

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On 08/26, Richard Guy Briggs wrote:
>
> On Fri, Aug 23, 2013 at 09:28:07PM +0200, Oleg Nesterov wrote:
>
> > Or 3/12.
>
> That is a cleanup to make clear what parts are actually pid-related and
> what isn't.

You know, I decided to send another email about this patch. This cleanup
doesn't look even correct.

> > OK, I agree sys_getppid() in audit_log_task_info() looks
> > strange at least. Just fix it using the helpers we already have and
> > add the new helpers later. Or send the patch(es) which adds the new
> > helpers first.
>
> Patch 04/12 is that helper. It is used in only two places in audit

I see what 3/4 do. But I am not sure we need this. At least in this
series.

OK, why do we need 3 new helper? audit needs only one,
task_ppid_nr_init_ns(). And who else needs this task_ppid* stuff?

> once in apparmor),

OK, apparmor. So perhaps a single new helper in sched.c makes sense,
I dunno. But see above.

> > Or task_pid_nr_init_ns()... For what? We already have task_pid_nr().
> > Use the helper we already have, or introduce the new one first and
> > change the current users of task_pid_nr().
>
> If task_struct::pid is definitely not going away, then that whole part
> is moot and we'll just use task_pid_nr() as is.

Can't understand. We already have task_tgid_nr(). This helper can be
changed to avoid ->tgig. Why task_ppid_nr_init_ns() can't use the
helper we already have?

But let me repeat. I am not maintainer and I do not really care.
You should convince Eric, I am not going to argue.


Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at

http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664

?

Oleg.

2013-08-27 17:28:33

by Oleg Nesterov

[permalink] [raw]
Subject: Re: [PATCH 03/12] pid: get ppid pid_t of task in init_pid_ns safely

On 08/20, Richard Guy Briggs wrote:
>
> Added the functions
> task_ppid()
> task_ppid_nr_ns()
> task_ppid_nr_init_ns()
> to safely abstract the lookup of the PPID

but it is not safe.

> +static inline struct pid *task_ppid(struct task_struct *task)
> +{
> + return task_tgid(rcu_dereference(current->real_parent));
^^^^^^^
task?

> +static inline pid_t task_ppid_nr_ns(struct task_struct *tsk,
> + struct pid_namespace *ns)
> +{
> + pid_t pid;
> +
> + rcu_read_lock();
> + pid = pid_nr_ns(task_ppid(current), ns);
^^^^^^^
again.

> + rcu_read_unlock();

And why this is safe?

rcu_read_lock() can't help if tsk was already dead _before_ it takes
the rcu lock. ->real_parent can point the already freed/reused/unmapped
memory.

This is safe if, for example, the caller alredy holds rcu_read_lock()
and tsk was found by find_task_by*(), or tsk is current.

Richard, just in case... I am going to vacation, I will be completely
offline till Sep 10.

Oleg.

2013-08-27 21:35:26

by Eric W. Biederman

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

Peter Zijlstra <[email protected]> writes:

> On Mon, Aug 26, 2013 at 10:37:22PM -0400, Richard Guy Briggs wrote:
>> On Fri, Aug 23, 2013 at 08:36:21AM +0200, Peter Zijlstra wrote:
>> > Except that's not the case, with namespaces there's a clear hierarchy
>> > and the task_struct::pid is the one true value aka. root namespace.
>>
>> Peter, I agonized over the access efficiency of dropping this one or the
>> duplicate in task_struct::pids and this one was far easier to drop in
>> terms of somehow always forcing
>> task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.
>
> You mean there's more than 1 site that sets task_struct::pid? I thought
> we only assign that thing once in fork.c someplace.

No there is not and that is not a concern.

Now I had thought given how the perf subsystem was constructed that only
the god like root was even allowed to use the code. But it turns out
there is sysctl_perf_event_paranoid that can bet twiddled that in some
circumstance that unprivileged users are allowed to use perf. Which
ultimately means perf will return the wrong data.

Meaning that perf is broken by design and perf has no excuse to be using
task->pid. Similarly every other place in the kernel that has made the
same mistake. I mention perf explicitly for two reasons. perf gets the
namespace handling horribly wrong, perf is the only place in the kernel
where we are accessing pids frequently enough for an extra cache line
miss to be a concern.

When really pids in the kernel what we care about is not some stupid
number but the stuct pid which points to that tasks, process groups, and
sessions. You know the object that a pid is the name for.

So yes as a long term direction task->pid and task->tgid need to be
killed because we keep getting subsystems like perf that return the
wrong data to userspace, or perform the wrong checks, because the
current structure makes it seem like it is ok to do the wrong thing.

Now that should not be Richard's fight. Hopefully he can focus on
fixing audit.

> There's a few special cases, like the idle threads having pid-0 and
> 'simple' people like myself prefer to use task_struct::pid for debugging
> when we run our simple kernels without all this namespace stuff
> enabled.

Which is why a special printf format is likely a good idea because it
means you can easily print pids without needing to call ungainly helper
functions.

Of course you can't run kernels without this ``namespace'' stuff
enabled. The best you can do is run kernels without the ability to
create new instances of the namespaces.

> The entire task->pids[PIDTYPE_PID].pid->numbers[0].nr thing just seems
> increddibly unwieldy and double dereferences, even if the lines are
> 'hot' are worse than single derefs.

But it is so much better than having to look up task->pid in a hash
table to get anything done, which is the previous state of affairs.

When the pid namespace support was merged except for a few overlooked
corner cases everything was converted except a bunch of printk
statements. Now I look in the kernel and we have had subsystems added
that totally get the namespace handling wrong because it is easy and
apparently socially acceptable to mess up other peoples hard work.

Apparently even Linus yelling at people a few years back wasn't enough
for people to wake up and be responsible developers and use proper
abstractions. So the only valid long term direction I can see is to
remove all of the abstractions that make getting pid handling wrong,
and to fix all of the code that gets it wrong today. So that there are
no more bad examples in the kernel.

This isn't Richard's fight, and this isn't what needs to happen with
audit. Audit just needs to be fixed so that so that it reports pid
numbers the audit daemon can make sense of, and to do that the audit
should use helper functions that are pid namespace aware and make it
clear that the proper pid namespace is being used.

In the long term ->pid and ->tgid must die, and take all of this wrong
think with it.

Eric

2013-08-28 08:16:39

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid

On Tue, Aug 27, 2013 at 02:35:11PM -0700, Eric W. Biederman wrote:
> Peter Zijlstra <[email protected]> writes:
>
> > On Mon, Aug 26, 2013 at 10:37:22PM -0400, Richard Guy Briggs wrote:
> >> On Fri, Aug 23, 2013 at 08:36:21AM +0200, Peter Zijlstra wrote:
> >> > Except that's not the case, with namespaces there's a clear hierarchy
> >> > and the task_struct::pid is the one true value aka. root namespace.
> >>
> >> Peter, I agonized over the access efficiency of dropping this one or the
> >> duplicate in task_struct::pids and this one was far easier to drop in
> >> terms of somehow always forcing
> >> task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.
> >
> > You mean there's more than 1 site that sets task_struct::pid? I thought
> > we only assign that thing once in fork.c someplace.
>
> No there is not and that is not a concern.
>
> Now I had thought given how the perf subsystem was constructed that only
> the god like root was even allowed to use the code. But it turns out
> there is sysctl_perf_event_paranoid that can bet twiddled that in some
> circumstance that unprivileged users are allowed to use perf.

Even without poking at that, a user is always allowed to use perf on his
own tasks.

> Which
> ultimately means perf will return the wrong data.

How so, perf uses pid-namespaces, have a look at
kernel/events/core.c:perf_event_[pt]id(). We store the namespace of the
task creating the event (which is also -- hopefully -- the consumer of
the data it generates). If you create an event and then switch
namespaces you've bigger issues I suppose.

> Meaning that perf is broken by design and perf has no excuse to be using
> task->pid.

It doesn't.

> Similarly every other place in the kernel that has made the
> same mistake. I mention perf explicitly for two reasons. perf gets the
> namespace handling horribly wrong,

Do tell.

> perf is the only place in the kernel
> where we are accessing pids frequently enough for an extra cache line
> miss to be a concern.
>
> When really pids in the kernel what we care about is not some stupid
> number but the stuct pid which points to that tasks, process groups, and
> sessions. You know the object that a pid is the name for.
>
> So yes as a long term direction task->pid and task->tgid need to be
> killed because we keep getting subsystems like perf that return the
> wrong data to userspace, or perform the wrong checks, because the
> current structure makes it seem like it is ok to do the wrong thing.

I think you should have a look at code before you start raving.. makes
you looks silly.

2013-08-30 19:06:50

by Richard Guy Briggs

[permalink] [raw]
Subject: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On Tue, Aug 27, 2013 at 07:11:34PM +0200, Oleg Nesterov wrote:
> Btw. audit looks unmaintained... if you are going to take care of
> this code, perhaps you can look at
>
> http://marc.info/?l=linux-kernel&m=137589907108485
> http://marc.info/?l=linux-kernel&m=137590271809664

(I don't want to lose these refs... First I've seen these.)

Why do you say this? Could you elaborate? Due to the state of the code
itself, or the lack of response from audit folks? (Like most, I'm not
subscribed to that firehose, so I don't have archives that show
addressees.) Most of the kernel audit folks are on
[email protected] list.

> Oleg.

- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-08-30 19:54:57

by Steve Grubb

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On Friday, August 30, 2013 03:06:46 PM Richard Guy Briggs wrote:
> On Tue, Aug 27, 2013 at 07:11:34PM +0200, Oleg Nesterov wrote:
> > Btw. audit looks unmaintained... if you are going to take care of
> > this code, perhaps you can look at
> >
> > http://marc.info/?l=linux-kernel&m=137589907108485
> > http://marc.info/?l=linux-kernel&m=137590271809664

You don't want to clear the TIF audit flag when context == NULL. What that will
do is make a bunch of inauditable processes. There are times when audit is
disabled and then re-enabled later. If the flag gets cleared, then a task's
syscall will never enter the auditing framework from kernel/entry_64.S.

That flag is 0 when auditing has never ever been enabled. If auditing is
enabled, it should always be a 1 unless the task filter has determined that
this process should not be audited ever. In practice, this is almost never
used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why we
have the boot argument. Not setting audit=1 on the boot arguments means that
any process running before the audit daemon enables auditing can never ever be
audited because the only place its set is when processes are cloned.

Hope this clears up the use. NAK to the patch, it'll break auditing.

-Steve

2013-08-30 19:57:24

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH 03/12] pid: get ppid pid_t of task in init_pid_ns safely

On Tue, Aug 27, 2013 at 07:21:55PM +0200, Oleg Nesterov wrote:
> On 08/20, Richard Guy Briggs wrote:
> >
> > Added the functions
> > task_ppid()
> > task_ppid_nr_ns()
> > task_ppid_nr_init_ns()
> > to safely abstract the lookup of the PPID
>
> but it is not safe.
>
> > +static inline struct pid *task_ppid(struct task_struct *task)
> > +{
> > + return task_tgid(rcu_dereference(current->real_parent));
> ^^^^^^^
> task?

Yup, thanks for those two catches.

> > + rcu_read_unlock();
>
> And why this is safe?
>
> rcu_read_lock() can't help if tsk was already dead _before_ it takes
> the rcu lock. ->real_parent can point the already freed/reused/unmapped
> memory.

Does it not bump a refcount if it is holding a pointer to it? So the
parent task might be dead, but it won't cause a pointer dereference
issue.

> This is safe if, for example, the caller alredy holds rcu_read_lock()
> and tsk was found by find_task_by*(), or tsk is current.

Fair enough, I'll have a more careful look at this. Thanks.

Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...

So what is a reliable way of keeping a reference to a task? I had
assumed that the best way was to keep a pointer to its task_struct,
making sure its refcount had been bumped by something like
get_task_struct(). Another way would be to do the same with its struct
pid. The third that I'm trying to avoid is using its init_pid_namespace
pid_t since it could refer to a completely different task if the pid_t
rolls over.

> Oleg.

- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-08-30 20:37:19

by John Johansen

[permalink] [raw]
Subject: Re: [PATCH 03/12] pid: get ppid pid_t of task in init_pid_ns safely

On 08/30/2013 12:56 PM, Richard Guy Briggs wrote:
> On Tue, Aug 27, 2013 at 07:21:55PM +0200, Oleg Nesterov wrote:
>> On 08/20, Richard Guy Briggs wrote:
>>>
>>> Added the functions
>>> task_ppid()
>>> task_ppid_nr_ns()
>>> task_ppid_nr_init_ns()
>>> to safely abstract the lookup of the PPID
>>
>> but it is not safe.
>>
>>> +static inline struct pid *task_ppid(struct task_struct *task)
>>> +{
>>> + return task_tgid(rcu_dereference(current->real_parent));
>> ^^^^^^^
>> task?
>
> Yup, thanks for those two catches.
>
>>> + rcu_read_unlock();
>>
>> And why this is safe?
>>
>> rcu_read_lock() can't help if tsk was already dead _before_ it takes
>> the rcu lock. ->real_parent can point the already freed/reused/unmapped
>> memory.
>
> Does it not bump a refcount if it is holding a pointer to it? So the
> parent task might be dead, but it won't cause a pointer dereference
> issue.
>
>> This is safe if, for example, the caller alredy holds rcu_read_lock()
>> and tsk was found by find_task_by*(), or tsk is current.
>
> Fair enough, I'll have a more careful look at this. Thanks.
>
> Most of the instances are current, but the one called from apparmour is
> stored. I've just learned that this is bad and someone else just chimed
> in that they have a patch to remove it...

the apparmor case isn't actually stored long term. The stored task will be
a parameter that was passed into an lsm hook and the buffer that it is
stored in dies before the hook is done. Its temporarily stored in the
struct so that it can be passed into the lsm_audit fn, and printed into an
allocated audit buffer. The text version in the audit buffer is what will
exist beyond the hook.

There are three patches, I'll reply them below once I have finished rebasing
them to apply to the current tree instead of my dev tree.

2013-08-30 22:41:13

by John Johansen

[permalink] [raw]
Subject: Re: [PATCH 1/3] apparmor: fix capability to not use the current task, during reporting

Mediation is based off of the cred but auditing includes the current
task which may not be related to the actual request.

Signed-off-by: John Johansen <[email protected]>
---
security/apparmor/capability.c | 15 +++++----------
security/apparmor/domain.c | 2 +-
security/apparmor/include/capability.h | 5 ++---
security/apparmor/include/ipc.h | 4 ++--
security/apparmor/ipc.c | 9 ++++-----
security/apparmor/lsm.c | 2 +-
6 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 887a5e9..98a73eb 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -48,8 +48,7 @@ static void audit_cb(struct audit_buffer *ab, void *va)

/**
* audit_caps - audit a capability
- * @profile: profile confining task (NOT NULL)
- * @task: task capability test was performed against (NOT NULL)
+ * @profile: profile being tested for confinement (NOT NULL)
* @cap: capability tested
* @error: error code returned by test
*
@@ -58,8 +57,7 @@ static void audit_cb(struct audit_buffer *ab, void *va)
*
* Returns: 0 or sa->error on success, error code on failure
*/
-static int audit_caps(struct aa_profile *profile, struct task_struct *task,
- int cap, int error)
+static int audit_caps(struct aa_profile *profile, int cap, int error)
{
struct audit_cache *ent;
int type = AUDIT_APPARMOR_AUTO;
@@ -68,7 +66,6 @@ static int audit_caps(struct aa_profile *profile, struct task_struct *task,
sa.type = LSM_AUDIT_DATA_CAP;
sa.aad = &aad;
sa.u.cap = cap;
- sa.aad->tsk = task;
sa.aad->op = OP_CAPABLE;
sa.aad->error = error;

@@ -119,8 +116,7 @@ static int profile_capable(struct aa_profile *profile, int cap)

/**
* aa_capable - test permission to use capability
- * @task: task doing capability test against (NOT NULL)
- * @profile: profile confining @task (NOT NULL)
+ * @profile: profile being tested against (NOT NULL)
* @cap: capability to be tested
* @audit: whether an audit record should be generated
*
@@ -128,8 +124,7 @@ static int profile_capable(struct aa_profile *profile, int cap)
*
* Returns: 0 on success, or else an error code.
*/
-int aa_capable(struct task_struct *task, struct aa_profile *profile, int cap,
- int audit)
+int aa_capable(struct aa_profile *profile, int cap, int audit)
{
int error = profile_capable(profile, cap);

@@ -139,5 +134,5 @@ int aa_capable(struct task_struct *task, struct aa_profile *profile, int cap,
return error;
}

- return audit_caps(profile, task, cap, error);
+ return audit_caps(profile, cap, error);
}
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 01b7bd6..f037c57 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -75,7 +75,7 @@ static int may_change_ptraced_domain(struct task_struct *task,
if (!tracer || unconfined(tracerp))
goto out;

- error = aa_may_ptrace(tracer, tracerp, to_profile, PTRACE_MODE_ATTACH);
+ error = aa_may_ptrace(tracerp, to_profile, PTRACE_MODE_ATTACH);

out:
rcu_read_unlock();
diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
index c24d295..e4fea19 100644
--- a/security/apparmor/include/capability.h
+++ b/security/apparmor/include/capability.h
@@ -4,7 +4,7 @@
* This file contains AppArmor capability mediation definitions.
*
* Copyright (C) 1998-2008 Novell/SUSE
- * Copyright 2009-2010 Canonical Ltd.
+ * Copyright 2009-2013 Canonical Ltd.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
@@ -34,8 +34,7 @@ struct aa_caps {
kernel_cap_t extended;
};

-int aa_capable(struct task_struct *task, struct aa_profile *profile, int cap,
- int audit);
+int aa_capable(struct aa_profile *profile, int cap, int audit);

static inline void aa_free_cap_rules(struct aa_caps *caps)
{
diff --git a/security/apparmor/include/ipc.h b/security/apparmor/include/ipc.h
index aeda0fb..288ca76 100644
--- a/security/apparmor/include/ipc.h
+++ b/security/apparmor/include/ipc.h
@@ -19,8 +19,8 @@

struct aa_profile;

-int aa_may_ptrace(struct task_struct *tracer_task, struct aa_profile *tracer,
- struct aa_profile *tracee, unsigned int mode);
+int aa_may_ptrace(struct aa_profile *tracer, struct aa_profile *tracee,
+ unsigned int mode);

int aa_ptrace(struct task_struct *tracer, struct task_struct *tracee,
unsigned int mode);
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index c51d226..777ac1c 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -54,15 +54,14 @@ static int aa_audit_ptrace(struct aa_profile *profile,

/**
* aa_may_ptrace - test if tracer task can trace the tracee
- * @tracer_task: task who will do the tracing (NOT NULL)
* @tracer: profile of the task doing the tracing (NOT NULL)
* @tracee: task to be traced
* @mode: whether PTRACE_MODE_READ || PTRACE_MODE_ATTACH
*
* Returns: %0 else error code if permission denied or error
*/
-int aa_may_ptrace(struct task_struct *tracer_task, struct aa_profile *tracer,
- struct aa_profile *tracee, unsigned int mode)
+int aa_may_ptrace(struct aa_profile *tracer, struct aa_profile *tracee,
+ unsigned int mode)
{
/* TODO: currently only based on capability, not extended ptrace
* rules,
@@ -72,7 +71,7 @@ int aa_may_ptrace(struct task_struct *tracer_task, struct aa_profile *tracer,
if (unconfined(tracer) || tracer == tracee)
return 0;
/* log this capability request */
- return aa_capable(tracer_task, tracer, CAP_SYS_PTRACE, 1);
+ return aa_capable(tracer, CAP_SYS_PTRACE, 1);
}

/**
@@ -101,7 +100,7 @@ int aa_ptrace(struct task_struct *tracer, struct task_struct *tracee,
if (!unconfined(tracer_p)) {
struct aa_profile *tracee_p = aa_get_task_profile(tracee);

- error = aa_may_ptrace(tracer, tracer_p, tracee_p, mode);
+ error = aa_may_ptrace(tracer_p, tracee_p, mode);
error = aa_audit_ptrace(tracer_p, tracee_p, error);

aa_put_profile(tracee_p);
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 2e2a0dd..69c54c8 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -145,7 +145,7 @@ static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
if (!error) {
profile = aa_cred_profile(cred);
if (!unconfined(profile))
- error = aa_capable(current, profile, cap, audit);
+ error = aa_capable(profile, cap, audit);
}
return error;
}
--
1.8.3.2

2013-08-30 22:42:31

by John Johansen

[permalink] [raw]
Subject: Re: [PATCH 2/3] apparmor: remove tsk field from the apparmor_audit_struct

Now that aa_capabile no longer sets the task field it can be removed
and the lsm_audit version of the field can be used.

Signed-off-by: John Johansen <[email protected]>
---
security/apparmor/audit.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index 031d2d9..e32c448 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -111,7 +111,7 @@ static const char *const aa_audit_type[] = {
static void audit_pre(struct audit_buffer *ab, void *ca)
{
struct common_audit_data *sa = ca;
- struct task_struct *tsk = sa->aad->tsk ? sa->aad->tsk : current;
+ struct task_struct *tsk = sa->u.tsk ? sa->u.tsk : current;

if (aa_g_audit_header) {
audit_log_format(ab, "apparmor=");
@@ -149,12 +149,6 @@ static void audit_pre(struct audit_buffer *ab, void *ca)
audit_log_format(ab, " name=");
audit_log_untrustedstring(ab, sa->aad->name);
}
-
- if (sa->aad->tsk) {
- audit_log_format(ab, " pid=%d comm=", tsk->pid);
- audit_log_untrustedstring(ab, tsk->comm);
- }
-
}

/**
@@ -212,7 +206,7 @@ int aa_audit(int type, struct aa_profile *profile, gfp_t gfp,

if (sa->aad->type == AUDIT_APPARMOR_KILL)
(void)send_sig_info(SIGKILL, NULL,
- sa->aad->tsk ? sa->aad->tsk : current);
+ sa->u.tsk ? sa->u.tsk : current);

if (sa->aad->type == AUDIT_APPARMOR_ALLOWED)
return complain_error(sa->aad->error);
--
1.8.3.2

2013-08-30 22:43:47

by John Johansen

[permalink] [raw]
Subject: Re: [PATCH 03/3] apparmor: remove parent task info from audit logging

The reporting of the parent task info is a vestage from old versions of
apparmor. The need for this information was removed by unique null-
profiles before apparmor was upstreamed so remove this info from logging.

Signed-off-by: John Johansen <[email protected]>
---
security/apparmor/audit.c | 6 ------
security/apparmor/include/audit.h | 1 -
2 files changed, 7 deletions(-)

diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index e32c448..89c7865 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -111,7 +111,6 @@ static const char *const aa_audit_type[] = {
static void audit_pre(struct audit_buffer *ab, void *ca)
{
struct common_audit_data *sa = ca;
- struct task_struct *tsk = sa->u.tsk ? sa->u.tsk : current;

if (aa_g_audit_header) {
audit_log_format(ab, "apparmor=");
@@ -132,11 +131,6 @@ static void audit_pre(struct audit_buffer *ab, void *ca)

if (sa->aad->profile) {
struct aa_profile *profile = sa->aad->profile;
- pid_t pid;
- rcu_read_lock();
- pid = rcu_dereference(tsk->real_parent)->pid;
- rcu_read_unlock();
- audit_log_format(ab, " parent=%d", pid);
if (profile->ns != root_ns) {
audit_log_format(ab, " namespace=");
audit_log_untrustedstring(ab, profile->ns->base.hname);
diff --git a/security/apparmor/include/audit.h b/security/apparmor/include/audit.h
index 69d8cae..cf65443 100644
--- a/security/apparmor/include/audit.h
+++ b/security/apparmor/include/audit.h
@@ -110,7 +110,6 @@ struct apparmor_audit_data {
void *profile;
const char *name;
const char *info;
- struct task_struct *tsk;
union {
void *target;
struct {
--
1.8.3.2

2013-09-03 18:32:32

by Richard Guy Briggs

[permalink] [raw]
Subject: Re: [PATCH 03/12] pid: get ppid pid_t of task in init_pid_ns safely

On Fri, Aug 30, 2013 at 01:37:09PM -0700, John Johansen wrote:
> On 08/30/2013 12:56 PM, Richard Guy Briggs wrote:
> > On Tue, Aug 27, 2013 at 07:21:55PM +0200, Oleg Nesterov wrote:
> >> On 08/20, Richard Guy Briggs wrote:
> > Most of the instances are current, but the one called from apparmour is
> > stored. I've just learned that this is bad and someone else just chimed
> > in that they have a patch to remove it...
>
> the apparmor case isn't actually stored long term. The stored task will be
> a parameter that was passed into an lsm hook and the buffer that it is
> stored in dies before the hook is done. Its temporarily stored in the
> struct so that it can be passed into the lsm_audit fn, and printed into an
> allocated audit buffer. The text version in the audit buffer is what will
> exist beyond the hook.
>
> There are three patches, I'll reply them below once I have finished rebasing
> them to apply to the current tree instead of my dev tree.

John, thanks for this context and fix. That helps simplify things.

- RGB

--
Richard Guy Briggs <[email protected]>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545

2013-09-08 16:00:44

by Oleg Nesterov

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

Sorry for delay, vacation.

First of all, I do not pretend I understand this code. This was mostly
the question, and in fact I mostly asked about audit_bprm() in 0/1.

However,

On 08/30, Steve Grubb wrote:
> On Friday, August 30, 2013 03:06:46 PM Richard Guy Briggs wrote:
> > On Tue, Aug 27, 2013 at 07:11:34PM +0200, Oleg Nesterov wrote:
> > > Btw. audit looks unmaintained... if you are going to take care of
> > > this code, perhaps you can look at
> > >
> > > http://marc.info/?l=linux-kernel&m=137589907108485
> > > http://marc.info/?l=linux-kernel&m=137590271809664
>
> You don't want to clear the TIF audit flag when context == NULL. What that will
> do is make a bunch of inauditable processes. There are times when audit is
> disabled and then re-enabled later. If the flag gets cleared, then a task's
> syscall will never enter the auditing framework from kernel/entry_64.S.
>
> That flag is 0 when auditing has never ever been enabled. If auditing is
> enabled, it should always be a 1 unless the task filter has determined that
> this process should not be audited ever. In practice, this is almost never
> used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why we
> have the boot argument. Not setting audit=1 on the boot arguments means that
> any process running before the audit daemon enables auditing can never ever be
> audited because the only place its set is when processes are cloned.

Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?

And I do not understand "when context == NULL" above. Say, audit_syscall_entry()
does nothing if !audit_context, and nobody except copy_process() does
audit_alloc(). So why do we need to trigger the audit's paths if it is NULL?

> Hope this clears up the use. NAK to the patch, it'll break auditing.

Not really, but thanks for your reply anyway.

Oleg.

2013-09-10 17:26:44

by Oleg Nesterov

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On 09/08, Oleg Nesterov wrote:
>
> First of all, I do not pretend I understand this code. This was mostly
> the question, and in fact I mostly asked about audit_bprm() in 0/1.
>
> However,
>
> On 08/30, Steve Grubb wrote:
> > On Friday, August 30, 2013 03:06:46 PM Richard Guy Briggs wrote:
> > > On Tue, Aug 27, 2013 at 07:11:34PM +0200, Oleg Nesterov wrote:
> > > > Btw. audit looks unmaintained... if you are going to take care of
> > > > this code, perhaps you can look at
> > > >
> > > > http://marc.info/?l=linux-kernel&m=137589907108485
> > > > http://marc.info/?l=linux-kernel&m=137590271809664
> >
> > You don't want to clear the TIF audit flag when context == NULL. What that will
> > do is make a bunch of inauditable processes. There are times when audit is
> > disabled and then re-enabled later. If the flag gets cleared, then a task's
> > syscall will never enter the auditing framework from kernel/entry_64.S.
> >
> > That flag is 0 when auditing has never ever been enabled. If auditing is
> > enabled, it should always be a 1 unless the task filter has determined that
> > this process should not be audited ever. In practice, this is almost never
> > used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why we
> > have the boot argument. Not setting audit=1 on the boot arguments means that
> > any process running before the audit daemon enables auditing can never ever be
> > audited because the only place its set is when processes are cloned.
>
> Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
>
> And I do not understand "when context == NULL" above. Say, audit_syscall_entry()
> does nothing if !audit_context, and nobody except copy_process() does
> audit_alloc(). So why do we need to trigger the audit's paths if it is NULL?
>
> > Hope this clears up the use. NAK to the patch, it'll break auditing.
>
> Not really, but thanks for your reply anyway.

So, Steve, do you still think that patch was wrong? Attached below
just in case.

Oleg.

[PATCH 1/1] audit_alloc: clear TIF_SYSCALL_AUDIT if !audit_context

If audit_filter_task() nacks the new thread it makes sense
to clear TIF_SYSCALL_AUDIT which can be copied from parent
by dup_task_struct().

A wrong TIF_SYSCALL_AUDIT is not really bad, but it triggers
the "slow" audit paths in entry.S.

Signed-off-by: Oleg Nesterov <[email protected]>
---
kernel/auditsc.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 9845cb3..95293ab 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -943,8 +943,10 @@ int audit_alloc(struct task_struct *tsk)
return 0; /* Return if not auditing. */

state = audit_filter_task(tsk, &key);
- if (state == AUDIT_DISABLED)
+ if (state == AUDIT_DISABLED) {
+ clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return 0;
+ }

if (!(context = audit_alloc_context(state))) {
kfree(key);
--
1.5.5.1

2013-09-13 18:28:22

by Steve Grubb

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On Sunday, September 08, 2013 05:54:35 PM Oleg Nesterov wrote:
> Sorry for delay, vacation.
>
> First of all, I do not pretend I understand this code. This was mostly
> the question, and in fact I mostly asked about audit_bprm() in 0/1.
>
> However,
>
> On 08/30, Steve Grubb wrote:
> > On Friday, August 30, 2013 03:06:46 PM Richard Guy Briggs wrote:
> > > On Tue, Aug 27, 2013 at 07:11:34PM +0200, Oleg Nesterov wrote:
> > > > Btw. audit looks unmaintained... if you are going to take care of
> > > > this code, perhaps you can look at
> > > >
> > > > http://marc.info/?l=linux-kernel&m=137589907108485
> > > > http://marc.info/?l=linux-kernel&m=137590271809664
> >
> > You don't want to clear the TIF audit flag when context == NULL. What that
> > will do is make a bunch of inauditable processes. There are times when
> > audit is disabled and then re-enabled later. If the flag gets cleared,
> > then a task's syscall will never enter the auditing framework from
> > kernel/entry_64.S.
> >
> > That flag is 0 when auditing has never ever been enabled. If auditing is
> > enabled, it should always be a 1 unless the task filter has determined
> > that
> > this process should not be audited ever. In practice, this is almost never
> > used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why
> > we have the boot argument. Not setting audit=1 on the boot arguments
> > means that any process running before the audit daemon enables auditing
> > can never ever be audited because the only place its set is when
> > processes are cloned.
>
> Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?

The code I'm looking at does right at the end of the function.


> And I do not understand "when context == NULL" above. Say,
> audit_syscall_entry() does nothing if !audit_context, and nobody except
> copy_process() does audit_alloc(). So why do we need to trigger the audit's
> paths if it is NULL?

Because if you enter the audit framework, that means auditing has been turned
on at some point in the past, and could be turned back on at some point in the
future. If auditing has never been enabled, then you never enter the audit
framework at all. If the context is NULL, then audit is not currently enabled.

-Steve

2013-09-13 18:43:02

by Steve Grubb

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On Tuesday, September 10, 2013 07:20:33 PM Oleg Nesterov wrote:
> On 09/08, Oleg Nesterov wrote:
> > First of all, I do not pretend I understand this code. This was mostly
> > the question, and in fact I mostly asked about audit_bprm() in 0/1.
> >
> > However,
> >
> > On 08/30, Steve Grubb wrote:
> > > On Friday, August 30, 2013 03:06:46 PM Richard Guy Briggs wrote:
> > > > On Tue, Aug 27, 2013 at 07:11:34PM +0200, Oleg Nesterov wrote:
> > > > > Btw. audit looks unmaintained... if you are going to take care of
> > > > > this code, perhaps you can look at
> > > > >
> > > > > http://marc.info/?l=linux-kernel&m=137589907108485
> > > > > http://marc.info/?l=linux-kernel&m=137590271809664
> > >
> > > You don't want to clear the TIF audit flag when context == NULL. What
> > > that will do is make a bunch of inauditable processes. There are times
> > > when audit is disabled and then re-enabled later. If the flag gets
> > > cleared, then a task's syscall will never enter the auditing framework
> > > from kernel/entry_64.S.
> > >
> > > That flag is 0 when auditing has never ever been enabled. If auditing is
> > > enabled, it should always be a 1 unless the task filter has determined
> > > that
> > > this process should not be audited ever. In practice, this is almost
> > > never
> > > used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is
> > > why we have the boot argument. Not setting audit=1 on the boot
> > > arguments means that any process running before the audit daemon
> > > enables auditing can never ever be audited because the only place its
> > > set is when processes are cloned.>
> > Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
> >
> > And I do not understand "when context == NULL" above. Say,
> > audit_syscall_entry() does nothing if !audit_context, and nobody except
> > copy_process() does audit_alloc(). So why do we need to trigger the
> > audit's paths if it is NULL?>
> > > Hope this clears up the use. NAK to the patch, it'll break auditing.
> >
> > Not really, but thanks for your reply anyway.
>
> So, Steve, do you still think that patch was wrong? Attached below
> just in case.

I think this looks OK. If the task filter NACK's auditing the process, then
clearing the flag is probably correct. I have design notes from back around the
2.6.7 kernel saying this was the intention.

ACK.

-Steve


> [PATCH 1/1] audit_alloc: clear TIF_SYSCALL_AUDIT if !audit_context
>
> If audit_filter_task() nacks the new thread it makes sense
> to clear TIF_SYSCALL_AUDIT which can be copied from parent
> by dup_task_struct().
>
> A wrong TIF_SYSCALL_AUDIT is not really bad, but it triggers
> the "slow" audit paths in entry.S.
>
> Signed-off-by: Oleg Nesterov <[email protected]>
> ---
> kernel/auditsc.c | 4 +++-
> 1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 9845cb3..95293ab 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -943,8 +943,10 @@ int audit_alloc(struct task_struct *tsk)
> return 0; /* Return if not auditing. */
>
> state = audit_filter_task(tsk, &key);
> - if (state == AUDIT_DISABLED)
> + if (state == AUDIT_DISABLED) {
> + clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
> return 0;
> + }
>
> if (!(context = audit_alloc_context(state))) {
> kfree(key);

2013-09-14 18:15:08

by Oleg Nesterov

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On 09/13, Steve Grubb wrote:
>
> On Sunday, September 08, 2013 05:54:35 PM Oleg Nesterov wrote:
> >
> > Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
>
> The code I'm looking at does right at the end of the function.

The code I'm looking at does right at the end too ;) but it also
returns at the start if audit_filter_task() returns AUDIT_DISABLED.

> > And I do not understand "when context == NULL" above. Say,
> > audit_syscall_entry() does nothing if !audit_context, and nobody except
> > copy_process() does audit_alloc(). So why do we need to trigger the audit's
> > paths if it is NULL?
>
> Because if you enter the audit framework,

framework? TIF_SYSCALL_AUDIT has only meaning in entry.S, we need it
to ensure that the audited task can't miss audit_syscall_*().

> that means auditing has been turned
> on at some point in the past, and could be turned back on at some point in the
> future.

And this will change nothing, afaics (wrt TIF_SYSCALL_AUDIT).

Oleg.

2013-09-14 18:16:15

by Oleg Nesterov

[permalink] [raw]
Subject: Re: audit looks unmaintained? [was: Re: [PATCH 11/12] pid: rewrite task helper functions avoiding task->pid and task->tgid]

On 09/13, Steve Grubb wrote:
>
> On Tuesday, September 10, 2013 07:20:33 PM Oleg Nesterov wrote:
> >
> > So, Steve, do you still think that patch was wrong? Attached below
> > just in case.
>
> I think this looks OK. If the task filter NACK's auditing the process, then
> clearing the flag is probably correct. I have design notes from back around the
> 2.6.7 kernel saying this was the intention.

Then I do not really understand your previous email... Nevermind ;)

> ACK.

Thanks. I'll resend this patch with your ack applied.

Oleg.