2018-07-05 06:02:17

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 0/5] ipc: cleanups & bugfixes

Hi,

Dmitry convinced me that I should properly review the initialization
of new ipc objects, and I found another issue.

The series corrects 3 issues with ipc_addid(), and also renames
two functions and corrects a wrong comment.

0001-ipc-reorganize-initialization-of-kern_ipc_perm.id:
Access kern_ipc_perm.id under the IPC spinlock.
My original idea of removing kern_ipc_perm entirely
is not possible, e.g. the proc interface needs the id.

0002-ipc-reorganize-initialization-of-kern_ipc_perm.seq:
Bugfix for the syzbot finding

0003-ipc-util.c-Use-ipc_rcu_putref-for-failues-in-ipc_add:
Bugfix from code review

0004-ipc-Rename-ipcctl_pre_down_nolock.patch:
Comment update & function rename from code review

0005-ipc-rename-ipc_lock-to-ipc_lock_idr:
Function rename from code review

0006-ipc-util.c-correct-comment-in-ipc_obtain_object_che
Comment correction from code review

The patches are lightly tested, especially I have not tested
the checkpoint/restore code or tested the failure cases.

--
Manfred


2018-07-05 06:00:34

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 1/6] ipc: reorganize initialization of kern_ipc_perm.id

ipc_addid() initializes kern_ipc_perm.id after having called
ipc_idr_alloc().

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_idr()
may see an uninitialized value.

The patch moves all accesses to kern_ipc_perm.id under the spin_lock().

The issues is related to the finding of
[email protected]:
syzbot found an issue with kern_ipc_perm.seq

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Signed-off-by: Manfred Spraul <[email protected]>
---
ipc/msg.c | 19 ++++++++++++++-----
ipc/sem.c | 18 +++++++++++++-----
ipc/shm.c | 19 ++++++++++++++-----
3 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 3b6545302598..829c2062ded4 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -491,7 +491,6 @@ static int msgctl_stat(struct ipc_namespace *ns, int msqid,
int cmd, struct msqid64_ds *p)
{
struct msg_queue *msq;
- int id = 0;
int err;

memset(p, 0, sizeof(*p));
@@ -503,7 +502,6 @@ static int msgctl_stat(struct ipc_namespace *ns, int msqid,
err = PTR_ERR(msq);
goto out_unlock;
}
- id = msq->q_perm.id;
} else { /* IPC_STAT */
msq = msq_obtain_object_check(ns, msqid);
if (IS_ERR(msq)) {
@@ -548,10 +546,21 @@ static int msgctl_stat(struct ipc_namespace *ns, int msqid,
p->msg_lspid = pid_vnr(msq->q_lspid);
p->msg_lrpid = pid_vnr(msq->q_lrpid);

- ipc_unlock_object(&msq->q_perm);
- rcu_read_unlock();
- return id;
+ if (cmd == IPC_STAT) {
+ /*
+ * As defined in SUS:
+ * Return 0 on success
+ */
+ err = 0;
+ } else {
+ /*
+ * MSG_STAT and MSG_STAT_ANY (both Linux specific)
+ * Return the full id, including the sequence counter
+ */
+ err = msq->q_perm.id;
+ }

+ ipc_unlock_object(&msq->q_perm);
out_unlock:
rcu_read_unlock();
return err;
diff --git a/ipc/sem.c b/ipc/sem.c
index 5af1943ad782..e8971fa1d847 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1222,7 +1222,6 @@ static int semctl_stat(struct ipc_namespace *ns, int semid,
{
struct sem_array *sma;
time64_t semotime;
- int id = 0;
int err;

memset(semid64, 0, sizeof(*semid64));
@@ -1234,7 +1233,6 @@ static int semctl_stat(struct ipc_namespace *ns, int semid,
err = PTR_ERR(sma);
goto out_unlock;
}
- id = sma->sem_perm.id;
} else { /* IPC_STAT */
sma = sem_obtain_object_check(ns, semid);
if (IS_ERR(sma)) {
@@ -1274,10 +1272,20 @@ static int semctl_stat(struct ipc_namespace *ns, int semid,
#endif
semid64->sem_nsems = sma->sem_nsems;

+ if (cmd == IPC_STAT) {
+ /*
+ * As defined in SUS:
+ * Return 0 on success
+ */
+ err = 0;
+ } else {
+ /*
+ * SEM_STAT and SEM_STAT_ANY (both Linux specific)
+ * Return the full id, including the sequence counter
+ */
+ err = sma->sem_perm.id;
+ }
ipc_unlock_object(&sma->sem_perm);
- rcu_read_unlock();
- return id;
-
out_unlock:
rcu_read_unlock();
return err;
diff --git a/ipc/shm.c b/ipc/shm.c
index 051a3e1fb8df..59fe8b3b3794 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -949,7 +949,6 @@ static int shmctl_stat(struct ipc_namespace *ns, int shmid,
int cmd, struct shmid64_ds *tbuf)
{
struct shmid_kernel *shp;
- int id = 0;
int err;

memset(tbuf, 0, sizeof(*tbuf));
@@ -961,7 +960,6 @@ static int shmctl_stat(struct ipc_namespace *ns, int shmid,
err = PTR_ERR(shp);
goto out_unlock;
}
- id = shp->shm_perm.id;
} else { /* IPC_STAT */
shp = shm_obtain_object_check(ns, shmid);
if (IS_ERR(shp)) {
@@ -1011,10 +1009,21 @@ static int shmctl_stat(struct ipc_namespace *ns, int shmid,
tbuf->shm_lpid = pid_vnr(shp->shm_lprid);
tbuf->shm_nattch = shp->shm_nattch;

- ipc_unlock_object(&shp->shm_perm);
- rcu_read_unlock();
- return id;
+ if (cmd == IPC_STAT) {
+ /*
+ * As defined in SUS:
+ * Return 0 on success
+ */
+ err = 0;
+ } else {
+ /*
+ * SHM_STAT and SHM_STAT_ANY (both Linux specific)
+ * Return the full id, including the sequence counter
+ */
+ err = shp->shm_perm.id;
+ }

+ ipc_unlock_object(&shp->shm_perm);
out_unlock:
rcu_read_unlock();
return err;
--
2.17.1


2018-07-05 06:00:42

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 5/6] ipc: rename ipc_lock() to ipc_lock_idr()

ipc/util.c contains multiple functions to get the ipc object
pointer given an id number.

There are two sets of function: One set verifies the sequence
counter part of the id number, other functions do not check
the sequence counter.

The standard for function names in ipc/util.c is
- ..._check() functions verify the sequence counter
- ..._idr() functions do not verify the sequence counter

ipc_lock() is an exception: It does not verify the sequence
counter value, but this is not obvious from the function name.

Therefore: Rename the function to ipc_lock_idr(), to make it
obvious that it does not check the sequence counter.

Signed-off-by: Manfred Spraul <[email protected]>
---
ipc/shm.c | 4 ++--
ipc/util.c | 10 ++++++----
ipc/util.h | 2 +-
3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/ipc/shm.c b/ipc/shm.c
index 426ba1039a7b..cd8655c7bb77 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -179,11 +179,11 @@ static inline struct shmid_kernel *shm_obtain_object_check(struct ipc_namespace
*/
static inline struct shmid_kernel *shm_lock(struct ipc_namespace *ns, int id)
{
- struct kern_ipc_perm *ipcp = ipc_lock(&shm_ids(ns), id);
+ struct kern_ipc_perm *ipcp = ipc_lock_idr(&shm_ids(ns), id);

/*
* Callers of shm_lock() must validate the status of the returned ipc
- * object pointer (as returned by ipc_lock()), and error out as
+ * object pointer (as returned by ipc_lock_idr()), and error out as
* appropriate.
*/
if (IS_ERR(ipcp))
diff --git a/ipc/util.c b/ipc/util.c
index 751d39baaf38..4f2db913acf9 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -604,15 +604,17 @@ struct kern_ipc_perm *ipc_obtain_object_idr(struct ipc_ids *ids, int id)
}

/**
- * ipc_lock - lock an ipc structure without rwsem held
+ * ipc_lock_idr - lock an ipc structure without rwsem held
* @ids: ipc identifier set
* @id: ipc id to look for
*
* Look for an id in the ipc ids idr and lock the associated ipc object.
+ * The function does not check if the sequence counter matches the
+ * found ipc object.
*
* The ipc object is locked on successful exit.
*/
-struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
+struct kern_ipc_perm *ipc_lock_idr(struct ipc_ids *ids, int id)
{
struct kern_ipc_perm *out;

@@ -624,8 +626,8 @@ struct kern_ipc_perm *ipc_lock(struct ipc_ids *ids, int id)
spin_lock(&out->lock);

/*
- * ipc_rmid() may have already freed the ID while ipc_lock()
- * was spinning: here verify that the structure is still valid.
+ * ipc_rmid() may have already freed the ID while waiting for
+ * the lock. Here verify that the structure is still valid.
* Upon races with RMID, return -EIDRM, thus indicating that
* the ID points to a removed identifier.
*/
diff --git a/ipc/util.h b/ipc/util.h
index fcf81425ae98..ed74b0fc68c9 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -142,7 +142,7 @@ int ipc_rcu_getref(struct kern_ipc_perm *ptr);
void ipc_rcu_putref(struct kern_ipc_perm *ptr,
void (*func)(struct rcu_head *head));

-struct kern_ipc_perm *ipc_lock(struct ipc_ids *, int);
+struct kern_ipc_perm *ipc_lock_idr(struct ipc_ids *, int);
struct kern_ipc_perm *ipc_obtain_object_idr(struct ipc_ids *ids, int id);

void kernel_to_ipc64_perm(struct kern_ipc_perm *in, struct ipc64_perm *out);
--
2.17.1


2018-07-05 06:01:02

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 6/6] ipc/util.c: correct comment in ipc_obtain_object_check

The comment that explains ipc_obtain_object_check is wrong:
The function checks the sequence number, not the reference
counter.
Note that checking the reference counter would be meaningless:
The reference counter is decreased without holding any locks,
thus an object with kern_ipc_perm.deleted=true may disappear at
the end of the next rcu grace period.

Signed-off-by: Manfred Spraul <[email protected]>
---
ipc/util.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index 4f2db913acf9..776a9ce2905f 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -646,8 +646,8 @@ struct kern_ipc_perm *ipc_lock_idr(struct ipc_ids *ids, int id)
* @ids: ipc identifier set
* @id: ipc id to look for
*
- * Similar to ipc_obtain_object_idr() but also checks
- * the ipc object reference counter.
+ * Similar to ipc_obtain_object_idr() but also checks the ipc object
+ * sequence number.
*
* Call inside the RCU critical section.
* The ipc object is *not* locked on exit.
--
2.17.1


2018-07-05 06:01:21

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 4/6] ipc: Rename ipcctl_pre_down_nolock().

Both the comment and the name of ipcctl_pre_down_nolock()
are misleading: The function must be called while holdling
the rw semaphore.
Therefore the patch renames the function to ipcctl_obtain_check():
This name matches the other names used in util.c:
- "obtain" function look up a pointer in the idr, without
acquiring the object lock.
- The caller is responsible for locking.
- _check means that some checks are made.

Signed-off-by: Manfred Spraul <[email protected]>
---
ipc/msg.c | 2 +-
ipc/sem.c | 2 +-
ipc/shm.c | 2 +-
ipc/util.c | 6 +++---
ipc/util.h | 2 +-
5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 5bf5cb8017ea..ba85d8849e8d 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -385,7 +385,7 @@ static int msgctl_down(struct ipc_namespace *ns, int msqid, int cmd,
down_write(&msg_ids(ns).rwsem);
rcu_read_lock();

- ipcp = ipcctl_pre_down_nolock(ns, &msg_ids(ns), msqid, cmd,
+ ipcp = ipcctl_obtain_check(ns, &msg_ids(ns), msqid, cmd,
&msqid64->msg_perm, msqid64->msg_qbytes);
if (IS_ERR(ipcp)) {
err = PTR_ERR(ipcp);
diff --git a/ipc/sem.c b/ipc/sem.c
index 9d49efeac2e5..9742e9a1c0c2 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1595,7 +1595,7 @@ static int semctl_down(struct ipc_namespace *ns, int semid,
down_write(&sem_ids(ns).rwsem);
rcu_read_lock();

- ipcp = ipcctl_pre_down_nolock(ns, &sem_ids(ns), semid, cmd,
+ ipcp = ipcctl_obtain_check(ns, &sem_ids(ns), semid, cmd,
&semid64->sem_perm, 0);
if (IS_ERR(ipcp)) {
err = PTR_ERR(ipcp);
diff --git a/ipc/shm.c b/ipc/shm.c
index 06b7bf11a011..426ba1039a7b 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -868,7 +868,7 @@ static int shmctl_down(struct ipc_namespace *ns, int shmid, int cmd,
down_write(&shm_ids(ns).rwsem);
rcu_read_lock();

- ipcp = ipcctl_pre_down_nolock(ns, &shm_ids(ns), shmid, cmd,
+ ipcp = ipcctl_obtain_check(ns, &shm_ids(ns), shmid, cmd,
&shmid64->shm_perm, 0);
if (IS_ERR(ipcp)) {
err = PTR_ERR(ipcp);
diff --git a/ipc/util.c b/ipc/util.c
index 8b09496ed720..751d39baaf38 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -703,7 +703,7 @@ int ipc_update_perm(struct ipc64_perm *in, struct kern_ipc_perm *out)
}

/**
- * ipcctl_pre_down_nolock - retrieve an ipc and check permissions for some IPC_XXX cmd
+ * ipcctl_obtain_check - retrieve an ipc and check permissions for some IPC_XXX cmd
* @ns: ipc namespace
* @ids: the table of ids where to look for the ipc
* @id: the id of the ipc to retrieve
@@ -713,7 +713,7 @@ int ipc_update_perm(struct ipc64_perm *in, struct kern_ipc_perm *out)
*
* This function does some common audit and permissions check for some IPC_XXX
* cmd and is called from semctl_down, shmctl_down and msgctl_down.
- * It must be called without any lock held and:
+ * It:
*
* - retrieves the ipc with the given id in the given table.
* - performs some audit and permission check, depending on the given cmd
@@ -722,7 +722,7 @@ int ipc_update_perm(struct ipc64_perm *in, struct kern_ipc_perm *out)
*
* Call holding the both the rwsem and the rcu read lock.
*/
-struct kern_ipc_perm *ipcctl_pre_down_nolock(struct ipc_namespace *ns,
+struct kern_ipc_perm *ipcctl_obtain_check(struct ipc_namespace *ns,
struct ipc_ids *ids, int id, int cmd,
struct ipc64_perm *perm, int extra_perm)
{
diff --git a/ipc/util.h b/ipc/util.h
index 0aba3230d007..fcf81425ae98 100644
--- a/ipc/util.h
+++ b/ipc/util.h
@@ -148,7 +148,7 @@ struct kern_ipc_perm *ipc_obtain_object_idr(struct ipc_ids *ids, int id);
void kernel_to_ipc64_perm(struct kern_ipc_perm *in, struct ipc64_perm *out);
void ipc64_perm_to_ipc_perm(struct ipc64_perm *in, struct ipc_perm *out);
int ipc_update_perm(struct ipc64_perm *in, struct kern_ipc_perm *out);
-struct kern_ipc_perm *ipcctl_pre_down_nolock(struct ipc_namespace *ns,
+struct kern_ipc_perm *ipcctl_obtain_check(struct ipc_namespace *ns,
struct ipc_ids *ids, int id, int cmd,
struct ipc64_perm *perm, int extra_perm);

--
2.17.1


2018-07-05 06:01:37

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 2/6] ipc: reorganize initialization of kern_ipc_perm.seq

ipc_addid() initializes kern_ipc_perm.seq after having called
ipc_idr_alloc().

Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
may see an uninitialized value.

The patch moves the initialization of kern_ipc_perm.seq before the
calls of ipc_idr_alloc().

Notes:
1) This patch has a user space visible side effect:
If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
if semget()/msgget()/shmget() fails in the final step of adding the id
to the rhash tree, then .._next_id is cleared. Before the patch, is
remained unmodified.

There is no change of the behavior after a successful ..get() call:
It always clears .._next_id, there is no impact to non checkpoint/restore
code as that code does not use .._next_id.

2) The patch correctly documents that after a call to ipc_idr_alloc(),
the full tear-down sequence must be used. The callers of ipc_addid()
do not fullfill that, i.e. more bugfixes are required.

Reported-by: [email protected]
Signed-off-by: Manfred Spraul <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Signed-off-by: Manfred Spraul <[email protected]>
---
Documentation/sysctl/kernel.txt | 3 ++-
ipc/util.c | 45 +++++++++++++++++++++++----------
2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index eded671d55eb..b2d4a8f8fe97 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -440,7 +440,8 @@ Notes:
1) kernel doesn't guarantee, that new object will have desired id. So,
it's up to userspace, how to handle an object with "wrong" id.
2) Toggle with non-default value will be set back to -1 by kernel after
-successful IPC object allocation.
+successful IPC object allocation. If an IPC object allocation syscall
+fails, it is undefined if the value remains unmodified or is reset to -1.

==============================================================

diff --git a/ipc/util.c b/ipc/util.c
index 4e81182fa0ac..662c28c6c9fa 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -197,13 +197,24 @@ static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
/*
* Specify desired id for next allocated IPC object.
*/
-#define ipc_idr_alloc(ids, new) \
- idr_alloc(&(ids)->ipcs_idr, (new), \
- (ids)->next_id < 0 ? 0 : ipcid_to_idx((ids)->next_id),\
- 0, GFP_NOWAIT)
+static inline int ipc_idr_alloc(struct ipc_ids *ids,
+ struct kern_ipc_perm *new)
+{
+ int key;

-static inline int ipc_buildid(int id, struct ipc_ids *ids,
- struct kern_ipc_perm *new)
+ if (ids->next_id < 0) {
+ key = idr_alloc(&ids->ipcs_idr, new, 0, 0, GFP_NOWAIT);
+ } else {
+ key = idr_alloc(&ids->ipcs_idr, new,
+ ipcid_to_idx(ids->next_id),
+ 0, GFP_NOWAIT);
+ ids->next_id = -1;
+ }
+ return key;
+}
+
+static inline void ipc_set_seq(struct ipc_ids *ids,
+ struct kern_ipc_perm *new)
{
if (ids->next_id < 0) { /* default, behave as !CHECKPOINT_RESTORE */
new->seq = ids->seq++;
@@ -211,24 +222,19 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
ids->seq = 0;
} else {
new->seq = ipcid_to_seqx(ids->next_id);
- ids->next_id = -1;
}
-
- return SEQ_MULTIPLIER * new->seq + id;
}

#else
#define ipc_idr_alloc(ids, new) \
idr_alloc(&(ids)->ipcs_idr, (new), 0, 0, GFP_NOWAIT)

-static inline int ipc_buildid(int id, struct ipc_ids *ids,
+static inline void ipc_set_seq(struct ipc_ids *ids,
struct kern_ipc_perm *new)
{
new->seq = ids->seq++;
if (ids->seq > IPCID_SEQ_MAX)
ids->seq = 0;
-
- return SEQ_MULTIPLIER * new->seq + id;
}

#endif /* CONFIG_CHECKPOINT_RESTORE */
@@ -270,6 +276,19 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
new->cuid = new->uid = euid;
new->gid = new->cgid = egid;

+ ipc_set_seq(ids, new);
+
+ /*
+ * As soon as a new object is inserted into the idr,
+ * ipc_obtain_object_idr() or ipc_obtain_object_check() can find it,
+ * and the lockless preparations for ipc operations can start.
+ * This means especially: permission checks, audit calls, allocation
+ * of undo structures, ...
+ *
+ * Thus the object must be fully initialized, and if something fails,
+ * then the full tear-down sequence must be followed.
+ * (i.e.: set new->deleted, reduce refcount, call_rcu())
+ */
id = ipc_idr_alloc(ids, new);
idr_preload_end();

@@ -291,7 +310,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
if (id > ids->max_id)
ids->max_id = id;

- new->id = ipc_buildid(id, ids, new);
+ new->id = SEQ_MULTIPLIER * new->seq + id;

return id;
}
--
2.17.1


2018-07-05 06:02:29

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 3/6] ipc/util.c: Use ipc_rcu_putref() for failues in ipc_addid()

ipc_addid() is impossible to use:
- for certain failures, the caller must not use ipc_rcu_putref(),
because the reference counter is not yet initialized.
- for other failures, the caller must use ipc_rcu_putref(),
because parallel operations could be ongoing already.

The patch cleans that up, by initializing the refcount early,
and by modifying all callers.

The issues is related to the finding of
[email protected]:
syzbot found an issue with reading kern_ipc_perm.seq,
here both read and write to already released memory could happen.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
---
ipc/msg.c | 2 +-
ipc/sem.c | 2 +-
ipc/shm.c | 2 ++
ipc/util.c | 12 ++++++++++--
4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/ipc/msg.c b/ipc/msg.c
index 829c2062ded4..5bf5cb8017ea 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -162,7 +162,7 @@ static int newque(struct ipc_namespace *ns, struct ipc_params *params)
/* ipc_addid() locks msq upon success. */
retval = ipc_addid(&msg_ids(ns), &msq->q_perm, ns->msg_ctlmni);
if (retval < 0) {
- call_rcu(&msq->q_perm.rcu, msg_rcu_free);
+ ipc_rcu_putref(&msq->q_perm, msg_rcu_free);
return retval;
}

diff --git a/ipc/sem.c b/ipc/sem.c
index e8971fa1d847..9d49efeac2e5 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -556,7 +556,7 @@ static int newary(struct ipc_namespace *ns, struct ipc_params *params)
/* ipc_addid() locks sma upon success. */
retval = ipc_addid(&sem_ids(ns), &sma->sem_perm, ns->sc_semmni);
if (retval < 0) {
- call_rcu(&sma->sem_perm.rcu, sem_rcu_free);
+ ipc_rcu_putref(&sma->sem_perm, sem_rcu_free);
return retval;
}
ns->used_sems += nsems;
diff --git a/ipc/shm.c b/ipc/shm.c
index 59fe8b3b3794..06b7bf11a011 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -671,6 +671,8 @@ static int newseg(struct ipc_namespace *ns, struct ipc_params *params)
if (is_file_hugepages(file) && shp->mlock_user)
user_shm_unlock(size, shp->mlock_user);
fput(file);
+ ipc_rcu_putref(&shp->shm_perm, shm_rcu_free);
+ return error;
no_file:
call_rcu(&shp->shm_perm.rcu, shm_rcu_free);
return error;
diff --git a/ipc/util.c b/ipc/util.c
index 662c28c6c9fa..8b09496ed720 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -248,7 +248,9 @@ static inline void ipc_set_seq(struct ipc_ids *ids,
* Add an entry 'new' to the ipc ids idr. The permissions object is
* initialised and the first free entry is set up and the id assigned
* is returned. The 'new' entry is returned in a locked state on success.
+ *
* On failure the entry is not locked and a negative err-code is returned.
+ * The caller must use ipc_rcu_putref() to free the identifier.
*
* Called with writer ipc_ids.rwsem held.
*/
@@ -258,6 +260,9 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
kgid_t egid;
int id, err;

+ /* 1) Initialize the refcount so that ipc_rcu_putref works */
+ refcount_set(&new->refcount, 1);
+
if (limit > IPCMNI)
limit = IPCMNI;

@@ -266,9 +271,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)

idr_preload(GFP_KERNEL);

- refcount_set(&new->refcount, 1);
spin_lock_init(&new->lock);
- new->deleted = false;
rcu_read_lock();
spin_lock(&new->lock);

@@ -277,6 +280,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
new->gid = new->cgid = egid;

ipc_set_seq(ids, new);
+ new->deleted = false;

/*
* As soon as a new object is inserted into the idr,
@@ -288,6 +292,9 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
* Thus the object must be fully initialized, and if something fails,
* then the full tear-down sequence must be followed.
* (i.e.: set new->deleted, reduce refcount, call_rcu())
+ *
+ * This function sets new->deleted, the caller must use ipc_rcu_putef()
+ * for the remaining steps.
*/
id = ipc_idr_alloc(ids, new);
idr_preload_end();
@@ -301,6 +308,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
}
}
if (id < 0) {
+ new->deleted = true;
spin_unlock(&new->lock);
rcu_read_unlock();
return id;
--
2.17.1


2018-07-05 08:37:36

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH 2/6] ipc: reorganize initialization of kern_ipc_perm.seq

On Thu, Jul 5, 2018 at 7:59 AM, Manfred Spraul <[email protected]> wrote:
> ipc_addid() initializes kern_ipc_perm.seq after having called
> ipc_idr_alloc().
>
> Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
> may see an uninitialized value.
>
> The patch moves the initialization of kern_ipc_perm.seq before the
> calls of ipc_idr_alloc().
>
> Notes:
> 1) This patch has a user space visible side effect:
> If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
> if semget()/msgget()/shmget() fails in the final step of adding the id
> to the rhash tree, then .._next_id is cleared. Before the patch, is
> remained unmodified.
>
> There is no change of the behavior after a successful ..get() call:
> It always clears .._next_id, there is no impact to non checkpoint/restore
> code as that code does not use .._next_id.
>
> 2) The patch correctly documents that after a call to ipc_idr_alloc(),
> the full tear-down sequence must be used. The callers of ipc_addid()
> do not fullfill that, i.e. more bugfixes are required.
>
> Reported-by: [email protected]
> Signed-off-by: Manfred Spraul <[email protected]>
> Cc: Dmitry Vyukov <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Davidlohr Bueso <[email protected]>
> Cc: Michael Kerrisk <[email protected]>
> Signed-off-by: Manfred Spraul <[email protected]>
> ---
> Documentation/sysctl/kernel.txt | 3 ++-
> ipc/util.c | 45 +++++++++++++++++++++++----------
> 2 files changed, 34 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
> index eded671d55eb..b2d4a8f8fe97 100644
> --- a/Documentation/sysctl/kernel.txt
> +++ b/Documentation/sysctl/kernel.txt
> @@ -440,7 +440,8 @@ Notes:
> 1) kernel doesn't guarantee, that new object will have desired id. So,
> it's up to userspace, how to handle an object with "wrong" id.
> 2) Toggle with non-default value will be set back to -1 by kernel after
> -successful IPC object allocation.
> +successful IPC object allocation. If an IPC object allocation syscall
> +fails, it is undefined if the value remains unmodified or is reset to -1.
>
> ==============================================================
>
> diff --git a/ipc/util.c b/ipc/util.c
> index 4e81182fa0ac..662c28c6c9fa 100644
> --- a/ipc/util.c
> +++ b/ipc/util.c
> @@ -197,13 +197,24 @@ static struct kern_ipc_perm *ipc_findkey(struct ipc_ids *ids, key_t key)
> /*
> * Specify desired id for next allocated IPC object.
> */
> -#define ipc_idr_alloc(ids, new) \
> - idr_alloc(&(ids)->ipcs_idr, (new), \
> - (ids)->next_id < 0 ? 0 : ipcid_to_idx((ids)->next_id),\
> - 0, GFP_NOWAIT)
> +static inline int ipc_idr_alloc(struct ipc_ids *ids,
> + struct kern_ipc_perm *new)
> +{
> + int key;
>
> -static inline int ipc_buildid(int id, struct ipc_ids *ids,
> - struct kern_ipc_perm *new)
> + if (ids->next_id < 0) {
> + key = idr_alloc(&ids->ipcs_idr, new, 0, 0, GFP_NOWAIT);
> + } else {
> + key = idr_alloc(&ids->ipcs_idr, new,
> + ipcid_to_idx(ids->next_id),
> + 0, GFP_NOWAIT);
> + ids->next_id = -1;
> + }
> + return key;
> +}
> +
> +static inline void ipc_set_seq(struct ipc_ids *ids,
> + struct kern_ipc_perm *new)
> {
> if (ids->next_id < 0) { /* default, behave as !CHECKPOINT_RESTORE */
> new->seq = ids->seq++;
> @@ -211,24 +222,19 @@ static inline int ipc_buildid(int id, struct ipc_ids *ids,
> ids->seq = 0;
> } else {
> new->seq = ipcid_to_seqx(ids->next_id);
> - ids->next_id = -1;
> }
> -
> - return SEQ_MULTIPLIER * new->seq + id;
> }
>
> #else
> #define ipc_idr_alloc(ids, new) \
> idr_alloc(&(ids)->ipcs_idr, (new), 0, 0, GFP_NOWAIT)
>
> -static inline int ipc_buildid(int id, struct ipc_ids *ids,
> +static inline void ipc_set_seq(struct ipc_ids *ids,
> struct kern_ipc_perm *new)
> {
> new->seq = ids->seq++;
> if (ids->seq > IPCID_SEQ_MAX)
> ids->seq = 0;
> -
> - return SEQ_MULTIPLIER * new->seq + id;
> }
>
> #endif /* CONFIG_CHECKPOINT_RESTORE */
> @@ -270,6 +276,19 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
> new->cuid = new->uid = euid;
> new->gid = new->cgid = egid;
>
> + ipc_set_seq(ids, new);
> +
> + /*
> + * As soon as a new object is inserted into the idr,
> + * ipc_obtain_object_idr() or ipc_obtain_object_check() can find it,
> + * and the lockless preparations for ipc operations can start.
> + * This means especially: permission checks, audit calls, allocation
> + * of undo structures, ...
> + *
> + * Thus the object must be fully initialized, and if something fails,
> + * then the full tear-down sequence must be followed.
> + * (i.e.: set new->deleted, reduce refcount, call_rcu())
> + */
> id = ipc_idr_alloc(ids, new);
> idr_preload_end();
>
> @@ -291,7 +310,7 @@ int ipc_addid(struct ipc_ids *ids, struct kern_ipc_perm *new, int limit)
> if (id > ids->max_id)
> ids->max_id = id;
>
> - new->id = ipc_buildid(id, ids, new);
> + new->id = SEQ_MULTIPLIER * new->seq + id;
>
> return id;
> }


Hi Manfred,

The series looks like a significant improvement to me. Thanks!

I feel that this code can be further simplified (unless I am missing
something here). Please take a look at this version:

https://github.com/dvyukov/linux/commit/f77aeaf80f3c4ab524db92184d874b03063fea3a?diff=split

This is on top of your patches. It basically does the same as your
code, but consolidates all id/seq assignment and dealing with next_id,
and deduplicates code re CONFIG_CHECKPOINT_RESTORE. Currently it's a
bit tricky to follow e.g. where exactly next_id is consumed and where
it needs to be left intact.
The only difference is that my code assigns new->id earlier. Not sure
if it can lead to anything bad. But if yes, then it seems that
currently uninitialized new->id is exposed. If necessary (?) we could
reset new->id in the same place where we set new->deleted.

Other patches in the series look good to me.

p.s. I've uploaded full context diffs of all patches here:
https://github.com/dvyukov/linux/commits/ipc1


Attachments:
ipc.txt (2.42 kB)

2018-07-05 15:14:45

by Manfred Spraul

[permalink] [raw]
Subject: Re: [PATCH 2/6] ipc: reorganize initialization of kern_ipc_perm.seq

Hi Dmitry,

On 07/05/2018 10:36 AM, Dmitry Vyukov wrote:
> [...]
> Hi Manfred,
>
> The series looks like a significant improvement to me. Thanks!
>
> I feel that this code can be further simplified (unless I am missing
> something here). Please take a look at this version:
>
> https://github.com/dvyukov/linux/commit/f77aeaf80f3c4ab524db92184d874b03063fea3a?diff=split
>
> This is on top of your patches. It basically does the same as your
> code, but consolidates all id/seq assignment and dealing with next_id,
> and deduplicates code re CONFIG_CHECKPOINT_RESTORE. Currently it's a
> bit tricky to follow e.g. where exactly next_id is consumed and where
> it needs to be left intact.
> The only difference is that my code assigns new->id earlier. Not sure
> if it can lead to anything bad. But if yes, then it seems that
> currently uninitialized new->id is exposed. If necessary (?) we could
> reset new->id in the same place where we set new->deleted.
Everything looks correct for me, it is better than the current code.
Except that you didn't sign off your last patch.

As next step: Who can merge the patches towards linux-next?
The only open point that I see are stress tests of the error codepaths.

And:
I don't think that the patches are relevant for linux-stable, correct?

--
    Manfred

2018-07-05 16:10:19

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH 2/6] ipc: reorganize initialization of kern_ipc_perm.seq

On Thu, Jul 5, 2018 at 5:12 PM, Manfred Spraul <[email protected]> wrote:
> Hi Dmitry,
>
> On 07/05/2018 10:36 AM, Dmitry Vyukov wrote:
>>
>> [...]
>> Hi Manfred,
>>
>> The series looks like a significant improvement to me. Thanks!
>>
>> I feel that this code can be further simplified (unless I am missing
>> something here). Please take a look at this version:
>>
>>
>> https://github.com/dvyukov/linux/commit/f77aeaf80f3c4ab524db92184d874b03063fea3a?diff=split
>>
>> This is on top of your patches. It basically does the same as your
>> code, but consolidates all id/seq assignment and dealing with next_id,
>> and deduplicates code re CONFIG_CHECKPOINT_RESTORE. Currently it's a
>> bit tricky to follow e.g. where exactly next_id is consumed and where
>> it needs to be left intact.
>> The only difference is that my code assigns new->id earlier. Not sure
>> if it can lead to anything bad. But if yes, then it seems that
>> currently uninitialized new->id is exposed. If necessary (?) we could
>> reset new->id in the same place where we set new->deleted.
>
> Everything looks correct for me, it is better than the current code.
> Except that you didn't sign off your last patch.

It was meant more like a review comment expressed in code. But I signed it now.

> As next step: Who can merge the patches towards linux-next?
> The only open point that I see are stress tests of the error codepaths.
>
> And:
> I don't think that the patches are relevant for linux-stable, correct?

I don't have a pressing need for stable (simply because KMSAN/KTSAN
are not backported to any stable releases, so we won't discover it
there).


Attachments:
ipc.txt (2.48 kB)

2018-07-05 21:58:47

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 2/6] ipc: reorganize initialization of kern_ipc_perm.seq

On Thu, 5 Jul 2018 17:12:36 +0200 Manfred Spraul <[email protected]> wrote:

> Hi Dmitry,
>
> On 07/05/2018 10:36 AM, Dmitry Vyukov wrote:
> > [...]
> > Hi Manfred,
> >
> > The series looks like a significant improvement to me. Thanks!
> >
> > I feel that this code can be further simplified (unless I am missing
> > something here). Please take a look at this version:
> >
> > https://github.com/dvyukov/linux/commit/f77aeaf80f3c4ab524db92184d874b03063fea3a?diff=split
> >
> > This is on top of your patches. It basically does the same as your
> > code, but consolidates all id/seq assignment and dealing with next_id,
> > and deduplicates code re CONFIG_CHECKPOINT_RESTORE. Currently it's a
> > bit tricky to follow e.g. where exactly next_id is consumed and where
> > it needs to be left intact.
> > The only difference is that my code assigns new->id earlier. Not sure
> > if it can lead to anything bad. But if yes, then it seems that
> > currently uninitialized new->id is exposed. If necessary (?) we could
> > reset new->id in the same place where we set new->deleted.
> Everything looks correct for me, it is better than the current code.
> Except that you didn't sign off your last patch.
>
> As next step: Who can merge the patches towards linux-next?

Me.

But it's unclear which patchset we're talking about. What's the plan
here? To combine both efforts?


> The only open point that I see are stress tests of the error codepaths.
>
> And:
> I don't think that the patches are relevant for linux-stable, correct?


2018-07-06 17:19:26

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH 2/6] ipc: reorganize initialization of kern_ipc_perm.seq

On Thu, 05 Jul 2018, Andrew Morton wrote:

>On Thu, 5 Jul 2018 17:12:36 +0200 Manfred Spraul <[email protected]> wrote:
>
>> Hi Dmitry,
>>
>> On 07/05/2018 10:36 AM, Dmitry Vyukov wrote:
>> > [...]
>> > Hi Manfred,
>> >
>> > The series looks like a significant improvement to me. Thanks!
>> >
>> > I feel that this code can be further simplified (unless I am missing
>> > something here). Please take a look at this version:
>> >
>> > https://github.com/dvyukov/linux/commit/f77aeaf80f3c4ab524db92184d874b03063fea3a?diff=split
>> >
>> > This is on top of your patches. It basically does the same as your
>> > code, but consolidates all id/seq assignment and dealing with next_id,
>> > and deduplicates code re CONFIG_CHECKPOINT_RESTORE. Currently it's a
>> > bit tricky to follow e.g. where exactly next_id is consumed and where
>> > it needs to be left intact.
>> > The only difference is that my code assigns new->id earlier. Not sure
>> > if it can lead to anything bad. But if yes, then it seems that
>> > currently uninitialized new->id is exposed. If necessary (?) we could
>> > reset new->id in the same place where we set new->deleted.
>> Everything looks correct for me, it is better than the current code.
>> Except that you didn't sign off your last patch.
>>
>> As next step: Who can merge the patches towards linux-next?
>
>Me.
>
>But it's unclear which patchset we're talking about. What's the plan
>here? To combine both efforts?

I'm thinking this series will conflict with the ipc rhashtable fixes I
sent out earlier (https://lkml.org/lkml/2018/6/21/727).

Thanks,
Davidlohr