2016-11-09 16:27:26

by Davidlohr Bueso

Subject: [PATCH -next 0/2] ipc/sem: semop updates

Hi,

Here are two small updates to semop(2), suggested a while ago by Manfred.

Both patches have passed the usual regression tests, including ltp and
some sysvsem benchmarks.

Thanks!

Davidlohr Bueso (2):
ipc/sem: simplify wait-wake loop
ipc/sem: avoid idr tree lookup for interrupted semop

ipc/sem.c | 124 +++++++++++++++++++++++---------------------------------------
1 file changed, 46 insertions(+), 78 deletions(-)

--
2.6.6


2016-11-09 16:27:27

by Davidlohr Bueso

Subject: [PATCH -next 2/2] ipc/sem: avoid idr tree lookup for interrupted semop

We can avoid the idr tree lookup (albeit possibly only the
idr_find_fast() path) when being awoken with -EINTR, as the
semid cannot change in this context while we are blocked. Use
the sma pointer directly and take the sem_lock, then re-check
for RMID races. We continue to re-check queue.status with the
lock held, such that we can detect situations where we are
dealing with a spurious wakeup but another task that holds the
sem_lock updated queue.status while we were spinning for it.
Once we take the lock it obviously won't change again.

Being the only caller, get rid of sem_obtain_lock() altogether.

Signed-off-by: Davidlohr Bueso <[email protected]>
---
ipc/sem.c | 37 +++++--------------------------------
1 file changed, 5 insertions(+), 32 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index a5eaf517c8b4..11cdec301167 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -431,29 +431,6 @@ static inline void sem_unlock(struct sem_array *sma, int locknum)
*
* The caller holds the RCU read lock.
*/
-static inline struct sem_array *sem_obtain_lock(struct ipc_namespace *ns,
- int id, struct sembuf *sops, int nsops, int *locknum)
-{
- struct kern_ipc_perm *ipcp;
- struct sem_array *sma;
-
- ipcp = ipc_obtain_object_idr(&sem_ids(ns), id);
- if (IS_ERR(ipcp))
- return ERR_CAST(ipcp);
-
- sma = container_of(ipcp, struct sem_array, sem_perm);
- *locknum = sem_lock(sma, sops, nsops);
-
- /* ipc_rmid() may have already freed the ID while sem_lock
- * was spinning: verify that the structure is still valid
- */
- if (ipc_valid_object(ipcp))
- return container_of(ipcp, struct sem_array, sem_perm);
-
- sem_unlock(sma, *locknum);
- return ERR_PTR(-EINVAL);
-}
-
static inline struct sem_array *sem_obtain_object(struct ipc_namespace *ns, int id)
{
struct kern_ipc_perm *ipcp = ipc_obtain_object_idr(&sem_ids(ns), id);
@@ -2016,16 +1993,12 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
}

rcu_read_lock();
- sma = sem_obtain_lock(ns, semid, sops, nsops, &locknum);
- error = READ_ONCE(queue.status);
+ sem_lock(sma, sops, nsops);

- /*
- * Array removed? If yes, leave without sem_unlock().
- */
- if (IS_ERR(sma)) {
- rcu_read_unlock();
- goto out_free;
- }
+ if (!ipc_valid_object(&sma->sem_perm))
+ goto out_unlock_free;
+
+ error = READ_ONCE(queue.status);

/*
* If queue.status != -EINTR we are woken up by another process.
--
2.6.6

2016-11-09 16:27:26

by Davidlohr Bueso

Subject: [PATCH -next 1/2] ipc/sem: simplify wait-wake loop

Instead of using the reverse goto, we can simplify the flow and
make it read more naturally by using a do-while loop instead.
This is the standard way (a do-while, or simply a while loop)
of doing wait/wakeup handling in the kernel. The exact same
logic is kept, just indented one level more.

Signed-off-by: Davidlohr Bueso <[email protected]>
---
ipc/sem.c | 107 ++++++++++++++++++++++++++++++--------------------------------
1 file changed, 51 insertions(+), 56 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index ebd18a7104fd..a5eaf517c8b4 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -1980,71 +1980,66 @@ SYSCALL_DEFINE4(semtimedop, int, semid, struct sembuf __user *, tsops,
sma->complex_count++;
}

-sleep_again:
- queue.status = -EINTR;
- queue.sleeper = current;
+ do {
+ queue.status = -EINTR;
+ queue.sleeper = current;

- __set_current_state(TASK_INTERRUPTIBLE);
- sem_unlock(sma, locknum);
- rcu_read_unlock();
+ __set_current_state(TASK_INTERRUPTIBLE);
+ sem_unlock(sma, locknum);
+ rcu_read_unlock();

- if (timeout)
- jiffies_left = schedule_timeout(jiffies_left);
- else
- schedule();
+ if (timeout)
+ jiffies_left = schedule_timeout(jiffies_left);
+ else
+ schedule();

- /*
- * fastpath: the semop has completed, either successfully or not, from
- * the syscall pov, is quite irrelevant to us at this point; we're done.
- *
- * We _do_ care, nonetheless, about being awoken by a signal or
- * spuriously. The queue.status is checked again in the slowpath (aka
- * after taking sem_lock), such that we can detect scenarios where we
- * were awakened externally, during the window between wake_q_add() and
- * wake_up_q().
- */
- error = READ_ONCE(queue.status);
- if (error != -EINTR) {
/*
- * User space could assume that semop() is a memory barrier:
- * Without the mb(), the cpu could speculatively read in user
- * space stale data that was overwritten by the previous owner
- * of the semaphore.
+ * fastpath: the semop has completed, either successfully or not, from
+ * the syscall pov, is quite irrelevant to us at this point; we're done.
+ *
+ * We _do_ care, nonetheless, about being awoken by a signal or
+ * spuriously. The queue.status is checked again in the slowpath (aka
+ * after taking sem_lock), such that we can detect scenarios where we
+ * were awakened externally, during the window between wake_q_add() and
+ * wake_up_q().
*/
- smp_mb();
- goto out_free;
- }
-
- rcu_read_lock();
- sma = sem_obtain_lock(ns, semid, sops, nsops, &locknum);
- error = READ_ONCE(queue.status);
+ error = READ_ONCE(queue.status);
+ if (error != -EINTR) {
+ /*
+ * User space could assume that semop() is a memory barrier:
+ * Without the mb(), the cpu could speculatively read in user
+ * space stale data that was overwritten by the previous owner
+ * of the semaphore.
+ */
+ smp_mb();
+ goto out_free;
+ }

- /*
- * Array removed? If yes, leave without sem_unlock().
- */
- if (IS_ERR(sma)) {
- rcu_read_unlock();
- goto out_free;
- }
+ rcu_read_lock();
+ sma = sem_obtain_lock(ns, semid, sops, nsops, &locknum);
+ error = READ_ONCE(queue.status);

- /*
- * If queue.status != -EINTR we are woken up by another process.
- * Leave without unlink_queue(), but with sem_unlock().
- */
- if (error != -EINTR)
- goto out_unlock_free;
+ /*
+ * Array removed? If yes, leave without sem_unlock().
+ */
+ if (IS_ERR(sma)) {
+ rcu_read_unlock();
+ goto out_free;
+ }

- /*
- * If an interrupt occurred we have to clean up the queue.
- */
- if (timeout && jiffies_left == 0)
- error = -EAGAIN;
+ /*
+ * If queue.status != -EINTR we are woken up by another process.
+ * Leave without unlink_queue(), but with sem_unlock().
+ */
+ if (error != -EINTR)
+ goto out_unlock_free;

- /*
- * If the wakeup was spurious, just retry.
- */
- if (error == -EINTR && !signal_pending(current))
- goto sleep_again;
+ /*
+ * If an interrupt occurred we have to clean up the queue.
+ */
+ if (timeout && jiffies_left == 0)
+ error = -EAGAIN;
+ } while (error == -EINTR && !signal_pending(current)); /* spurious */

unlink_queue(sma, &queue);

--
2.6.6