2019-10-11 11:21:04

by Manfred Spraul

[permalink] [raw]
Subject: [PATCH 4/5] ipc/sem.c: Document and update memory barriers

The patch documents and updates the memory barriers in ipc/sem.c:
- Document that the WRITE_ONCE for q->status relies on a barrier
inside wake_q_add().

- Read q->status using READ_ONCE+smp_acquire__after_ctrl_dep().
as the pair for the barrier inside wake_q_add()

- Remove READ_ONCE & WRITE_ONCE for the situations where spinlocks
provide exclusion.

- Add comments to all barriers, and mention the rules in the block
regarding locking.

Signed-off-by: Manfred Spraul <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
---
ipc/sem.c | 64 ++++++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 51 insertions(+), 13 deletions(-)

diff --git a/ipc/sem.c b/ipc/sem.c
index ec97a7072413..53d970c4e60d 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -205,7 +205,9 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);
*
* Memory ordering:
* Most ordering is enforced by using spin_lock() and spin_unlock().
- * The special case is use_global_lock:
+ *
+ * Exceptions:
+ * 1) use_global_lock:
* Setting it from non-zero to 0 is a RELEASE, this is ensured by
* using smp_store_release().
* Testing if it is non-zero is an ACQUIRE, this is ensured by using
@@ -214,6 +216,24 @@ static int sysvipc_sem_proc_show(struct seq_file *s, void *it);
* this smp_load_acquire(), this is guaranteed because the smp_load_acquire()
* is inside a spin_lock() and after a write from 0 to non-zero a
* spin_lock()+spin_unlock() is done.
+ *
+ * 2) queue.status:
+ * Initialization is done while holding sem_lock(), so no further barrier is
+ * required.
+ * Setting it to a result code is a RELEASE, this is ensured by both the
+ * barrier inside wake_q_add() (for case a) and while holding sem_lock()
+ * (for case b).
+ * The AQUIRE when reading the result code without holding sem_lock() is
+ * achieved by using READ_ONCE() + smp_acquire__after_ctrl_dep().
+ * (case a above).
+ * Reading the result code while holding sem_lock() needs no further barriers,
+ * the locks inside sem_lock() enforce ordering (case b above)
+ *
+ * 3) current->state:
+ * current->state is set to TASK_INTERRUPTIBLE while holding sem_lock().
+ * The wakeup is handled using the wake_q infrastructure. wake_q wakeups may
+ * happen immediately after calling wake_q_add. As wake_q_add() is called
+ * when holding sem_lock(), no further barriers are required.
*/

#define sc_semmsl sem_ctls[0]
@@ -766,13 +786,21 @@ static int perform_atomic_semop(struct sem_array *sma, struct sem_queue *q)
static inline void wake_up_sem_queue_prepare(struct sem_queue *q, int error,
struct wake_q_head *wake_q)
{
+ /*
+ * When the wakeup is performed, q->sleeper->state is read and later
+ * set to TASK_RUNNING. This may happen at any time, even before
+ * wake_q_add() returns. Memory ordering for q->sleeper->state is
+ * enforced by sem_lock(): we own sem_lock now (that was the ACQUIRE),
+ * and q->sleeper wrote q->sleeper->state before calling sem_unlock()
+ * (->RELEASE).
+ */
wake_q_add(wake_q, q->sleeper);
/*
- * Rely on the above implicit barrier, such that we can
- * ensure that we hold reference to the task before setting
- * q->status. Otherwise we could race with do_exit if the
- * task is awoken by an external event before calling
- * wake_up_process().
+ * Memory barrier pairing:
+ * case a: The barrier inside wake_q_add() pairs with
+ * READ_ONCE(q->status) + smp_acquire__after_ctrl_dep() in
+ * do_semtimedop().
+ * case b: nothing, ordering is enforced by the locks in sem_lock().
*/
WRITE_ONCE(q->status, error);
}
@@ -2148,9 +2176,11 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
}

do {
- WRITE_ONCE(queue.status, -EINTR);
+ /* memory ordering ensured by the lock in sem_lock() */
+ queue.status = EINTR;
queue.sleeper = current;

+ /* memory ordering is ensured by the lock in sem_lock() */
__set_current_state(TASK_INTERRUPTIBLE);
sem_unlock(sma, locknum);
rcu_read_unlock();
@@ -2174,12 +2204,16 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
error = READ_ONCE(queue.status);
if (error != -EINTR) {
/*
- * User space could assume that semop() is a memory
- * barrier: Without the mb(), the cpu could
- * speculatively read in userspace stale data that was
- * overwritten by the previous owner of the semaphore.
+ * Memory barrier for queue.status, case a):
+ * The smp_acquire__after_ctrl_dep(), together with the
+ * READ_ONCE() above pairs with the barrier inside
+ * wake_q_add().
+ * The barrier protects user space, too: User space may
+ * assume that all data from the CPU that did the wakeup
+ * semop() is visible on the wakee CPU when the sleeping
+ * semop() returns.
*/
- smp_mb();
+ smp_acquire__after_ctrl_dep();
goto out_free;
}

@@ -2189,7 +2223,11 @@ static long do_semtimedop(int semid, struct sembuf __user *tsops,
if (!ipc_valid_object(&sma->sem_perm))
goto out_unlock_free;

- error = READ_ONCE(queue.status);
+ /*
+ * No necessity for any barrier:
+ * We are protect by sem_lock() (case b)
+ */
+ error = queue.status;

/*
* If queue.status != -EINTR we are woken up by another process.
--
2.21.0


2019-10-21 08:38:46

by Chen, Rong A

[permalink] [raw]
Subject: [ipc/sem.c] 6394de3b86: BUG:kernel_NULL_pointer_dereference,address

FYI, we noticed the following commit (built with gcc-7):

commit: 6394de3b868537a90dd9128607192b0e97109f6b ("[PATCH 4/5] ipc/sem.c: Document and update memory barriers")
url: https://github.com/0day-ci/linux/commits/Manfred-Spraul/wake_q-Cleanup-Documentation-update/20191014-055627


in testcase: ltp
with following parameters:

disk: 1HDD
fs: xfs
test: syscalls_part4

test-description: The LTP testsuite contains a collection of tools for testing the Linux kernel and related features.
test-url: http://linux-test-project.github.io/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 8G

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


+-----------------------------------------------------+------------+------------+
| | e14fb72a58 | 6394de3b86 |
+-----------------------------------------------------+------------+------------+
| boot_successes | 12 | 0 |
| boot_failures | 0 | 12 |
| BUG:kernel_NULL_pointer_dereference,address | 0 | 9 |
| Oops:#[##] | 0 | 12 |
| RIP:__list_del_entry_valid | 0 | 10 |
| Kernel_panic-not_syncing:Fatal_exception | 0 | 12 |
| BUG:unable_to_handle_page_fault_for_address | 0 | 3 |
| RIP:update_queue | 0 | 2 |
| WARNING:at_lib/list_debug.c:#__list_del_entry_valid | 0 | 2 |
| RIP:wake_q_add | 0 | 2 |
+-----------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 482.792516] tst_test.c:1118: INFO: Timeout per run is 0h 05m 00s
[ 482.792520]
[ 482.802139] tst_timer_test.c:345: INFO: CLOCK_MONOTONIC resolution 1ns
[ 482.802143]
[ 482.810179] tst_timer_test.c:357: INFO: prctl(PR_GET_TIMERSLACK) = 50us
[ 482.810183]
[ 482.820803] select_var.h:66: INFO: Testing libc select()
[ 482.820807]
[ 482.830021] tst_timer_test.c:264: INFO: select() sleeping for 1000us 500 iterations, threshold 450.01us
[ 482.830024]
[ 482.847596] tst_timer_test.c:307: INFO: min 1059us, max 7003us, median 1165us, trunc mean 1469.95us (discarded 25)
[ 482.847600]
[ 482.858297] tst_timer_test.c:310: FAIL: select() slept for too long
[ 482.858302]
[ 482.864980]
[ 482.870867] Time: us | Frequency
[ 482.870870]
[ 482.881126] --------------------------------------------------------------------------------
[ 482.881130]
[ 482.895705] 1059 | ********************************************************************
[ 482.895711]
[ 482.903498] 1372 | *+
[ 482.903501]
[ 482.908663] 1685 | +
[ 482.908666]
[ 482.913050] 1998 | *-
[ 482.913053]
[ 482.916768] 2311 | +
[ 482.916771]
[ 482.921066] 2624 | *+
[ 482.921069]
[ 482.925993] 2937 | *+
[ 482.925996]
[ 482.930684] 3250 | *+
[ 482.930687]
[ 482.935946] 3563 | **-
[ 482.935949]
[ 482.941233] 3876 | *-
[ 482.941238]
[ 482.946576] 4189 | *
[ 482.946579]
[ 482.950791] 4502 | +
[ 482.950820]
[ 482.954670] 4815 | *
[ 482.954673]
[ 482.958516] 5128 | +
[ 482.958519]
[ 482.962408] 5441 | .
[ 482.962411]
[ 482.966423] 5754 | +
[ 482.966426]
[ 482.970379] 6067 | .
[ 482.970382]
[ 482.974289] 6380 | .
[ 482.974292]
[ 482.978140] 6693 | +
[ 482.978143]
[ 482.985114] --------------------------------------------------------------------------------
[ 482.985117]
[ 482.992572] 313us | 1 sample = 0.17481 '*', 0.34961 '+', 0.69923 '-', non-zero '.'
[ 482.992576]
[ 482.996417]
[ 483.002115] tst_timer_test.c:264: INFO: select() sleeping for 2000us 500 iterations, threshold 450.01us
[ 483.002118]
[ 483.011234] tst_timer_test.c:307: INFO: min 2031us, max 7845us, median 2148us, trunc mean 2348.76us (discarded 25)
[ 483.011237]
[ 483.018878] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 483.018881]
[ 483.026893] tst_timer_test.c:264: INFO: select() sleeping for 5000us 300 iterations, threshold 450.04us
[ 483.026895]
[ 483.036121] tst_timer_test.c:307: INFO: min 5046us, max 10579us, median 5168us, trunc mean 5493.15us (discarded 15)
[ 483.036124]
[ 483.043537] tst_timer_test.c:310: FAIL: select() slept for too long
[ 483.043540]
[ 483.047267]
[ 483.050003] Time: us | Frequency
[ 483.050006]
[ 483.057332] --------------------------------------------------------------------------------
[ 483.057335]
[ 483.065553] 5046 | ********************************************************************
[ 483.065556]
[ 483.075327] 5338 | ****-
[ 483.075330]
[ 483.081463] 5630 | **+
[ 483.081466]
[ 483.089971] 5922 | ***
[ 483.089974]
[ 483.093897] 6214 | *+
[ 483.093900]
[ 483.099142] 6506 | **
[ 483.099145]
[ 483.103055] 6798 | +
[ 483.103058]
[ 483.107398] 7090 | +
[ 483.107401]
[ 483.111596] 7382 | +
[ 483.111599]
[ 483.115623] 7674 | -
[ 483.115627]
[ 483.119716] 7966 | **
[ 483.119720]
[ 483.123760] 8258 | -
[ 483.123763]
[ 483.127902] 8550 | +
[ 483.127904]
[ 483.131916] 8842 | *
[ 483.131919]
[ 483.136585] 9134 | +
[ 483.136588]
[ 483.142237] 9426 | *+
[ 483.142243]
[ 483.149548] 9718 | +
[ 483.149554]
[ 483.153138] 10010 | +
[ 483.153141]
[ 483.157302] 10302 | +
[ 483.157305]
[ 483.164111] --------------------------------------------------------------------------------
[ 483.164114]
[ 483.171562] 292us | 1 sample = 0.31050 '*', 0.62100 '+', 1.24201 '-', non-zero '.'
[ 483.171565]
[ 483.175390]
[ 483.181145] tst_timer_test.c:264: INFO: select() sleeping for 10000us 100 iterations, threshold 450.33us
[ 483.181149]
[ 483.190439] tst_timer_test.c:307: INFO: min 10053us, max 14724us, median 10197us, trunc mean 10749.59us (discarded 5)
[ 483.190442]
[ 483.197834] tst_timer_test.c:310: FAIL: select() slept for too long
[ 483.197837]
[ 483.201303]
[ 483.203916] Time: us | Frequency
[ 483.203919]
[ 483.211027] --------------------------------------------------------------------------------
[ 483.211030]
[ 483.219870] 10053 | ********************************************************************
[ 483.219873]
[ 483.225281] 10299 | *********+
[ 483.225284]
[ 483.230064] 10545 | **-
[ 483.230068]
[ 483.234029] 10791 | **-
[ 483.234032]
[ 483.237921] 11037 | ***+
[ 483.237924]
[ 483.241894] 11283 | *****+
[ 483.241897]
[ 483.245752] 11529 | ***+
[ 483.245755]
[ 483.249736] 11775 | ***+
[ 483.249739]
[ 483.253790] 12021 | ***+
[ 483.253811]
[ 483.257606] 12267 |
[ 483.257609]
[ 483.261554] 12513 | **-
[ 483.261557]
[ 483.265496] 12759 | **-
[ 483.265499]
[ 483.269462] 13005 | *
[ 483.269465]
[ 483.273456] 13251 | *
[ 483.273459]
[ 483.277520] 13497 | **-
[ 483.277523]
[ 483.281319] 13743 | **-
[ 483.281322]
[ 483.285722] 13989 | *
[ 483.285725]
[ 483.288929] 14235 | *
[ 483.288932]
[ 483.292662] 14481 | **-
[ 483.292665]
[ 483.299588] --------------------------------------------------------------------------------
[ 483.299591]
[ 483.306884] 246us | 1 sample = 1.19298 '*', 2.38596 '+', 4.77193 '-', non-zero '.'
[ 483.306887]
[ 483.312268]
[ 483.317644] tst_timer_test.c:264: INFO: select() sleeping for 25000us 50 iterations, threshold 451.29us
[ 483.317647]
[ 483.326674] tst_timer_test.c:307: INFO: min 25065us, max 31973us, median 25175us, trunc mean 25657.06us (discarded 2)
[ 483.326678]
[ 483.333754] tst_timer_test.c:310: FAIL: select() slept for too long
[ 483.333758]
[ 483.339116]
[ 483.341544] Time: us | Frequency
[ 483.341547]
[ 483.348868] --------------------------------------------------------------------------------
[ 483.348872]
[ 483.356606] 25065 | ********************************************************************
[ 483.356610]
[ 483.365158] 25429 | ***+
[ 483.365161]
[ 483.369295] 25793 | *+
[ 483.369298]
[ 483.373778] 26157 | ***+
[ 483.373781]
[ 483.377857] 26521 | ***+
[ 483.377860]
[ 483.381337] 26885 | ***+
[ 483.381341]
[ 483.385648] 27249 | *+
[ 483.385679]
[ 483.388959] 27613 |
[ 483.388961]
[ 483.393088] 27977 | *+
[ 483.393091]
[ 483.399322] 28341 | *+
[ 483.399328]
[ 483.408035] 28705 |
[ 483.408041]
[ 483.414130] 29069 |
[ 483.414134]
[ 483.417933] 29433 |
[ 483.417936]
[ 483.421615] 29797 |
[ 483.421619]
[ 483.424957] 30161 | *+
[ 483.424960]
[ 483.429455] 30525 | *+
[ 483.429458]
[ 483.435853] 30889 |
[ 483.435859]
[ 483.439883] 31253 |
[ 483.439886]
[ 483.443367] 31617 | *+
[ 483.443370]
[ 483.449449] --------------------------------------------------------------------------------
[ 483.449452]
[ 483.456743] 364us | 1 sample = 1.94286 '*', 3.88571 '+', 7.77143 '-', non-zero '.'
[ 483.456746]
[ 483.462420]
[ 483.470113] tst_timer_test.c:264: INFO: select() sleeping for 100000us 10 iterations, threshold 537.00us
[ 483.470119]
[ 483.481094] tst_timer_test.c:307: INFO: min 100131us, max 103539us, median 100220us, trunc mean 100670.44us (discarded 1)
[ 483.481097]
[ 483.488255] tst_timer_test.c:310: FAIL: select() slept for too long
[ 483.488258]
[ 483.493307]
[ 483.495878] Time: us | Frequency
[ 483.495882]
[ 483.502201] --------------------------------------------------------------------------------
[ 483.502204]
[ 483.509908] 100131 | ********************************************************************
[ 483.509912]
[ 483.516527] 100311 |
[ 483.516530]
[ 483.519777] 100491 |
[ 483.519780]
[ 483.523031] 100671 |
[ 483.523034]
[ 483.526706] 100851 |
[ 483.526710]
[ 483.531098] 101031 | ***********-
[ 483.531101]
[ 483.534626] 101211 |
[ 483.534629]
[ 483.539392] 101391 | ***********-
[ 483.539395]
[ 483.543898] 101571 |
[ 483.543901]
[ 483.548411] 101751 |
[ 483.548415]
[ 483.553914] 101931 |
[ 483.553919]
[ 483.559862] 102111 | ***********-
[ 483.559865]
[ 483.563383] 102291 |
[ 483.563386]
[ 483.567519] 102471 |
[ 483.567522]
[ 483.571286] 102651 |
[ 483.571290]
[ 483.574563] 102831 |
[ 483.574566]
[ 483.578421] 103011 |
[ 483.578427]
[ 483.581578] 103191 |
[ 483.581582]
[ 483.586016] 103371 | ***********-
[ 483.586019]
[ 483.592335] --------------------------------------------------------------------------------
[ 483.592338]
[ 483.599755] 180us | 1 sample = 11.33333 '*', 22.66667 '+', 45.33333 '-', non-zero '.'
[ 483.599759]
[ 483.605208]
[ 483.610713] tst_timer_test.c:264: INFO: select() sleeping for 1000000us 2 iterations, threshold 4400.00us
[ 483.610716]
[ 483.620198] tst_timer_test.c:307: INFO: min 1000207us, max 1002366us, median 1000207us, trunc mean 1000207.00us (discarded 1)
[ 483.620201]
[ 483.627885] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 483.627888]
[ 483.635436] tst_test.c:1118: INFO: Timeout per run is 0h 05m 00s
[ 483.635439]
[ 483.641490] tst_timer_test.c:345: INFO: CLOCK_MONOTONIC resolution 1ns
[ 483.641494]
[ 483.647824] tst_timer_test.c:357: INFO: prctl(PR_GET_TIMERSLACK) = 50us
[ 483.647827]
[ 483.656692] select_var.h:69: INFO: Testing SYS_select syscall
[ 483.656695]
[ 483.672043] tst_timer_test.c:264: INFO: select() sleeping for 1000us 500 iterations, threshold 450.01us
[ 483.672047]
[ 483.681266] tst_timer_test.c:307: INFO: min 1026us, max 9860us, median 1143us, trunc mean 1303.48us (discarded 25)
[ 483.681269]
[ 483.691056] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 483.691060]
[ 483.699289] tst_timer_test.c:264: INFO: select() sleeping for 2000us 500 iterations, threshold 450.01us
[ 483.699292]
[ 483.707203] tst_timer_test.c:285: INFO: Found 2 outliners in [27564,23935] range
[ 483.707206]
[ 483.717665] tst_timer_test.c:307: INFO: min 2030us, max 27564us, median 2149us, trunc mean 2432.32us (discarded 25)
[ 483.717668]
[ 483.725838] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 483.725841]
[ 483.736025] tst_timer_test.c:264: INFO: select() sleeping for 5000us 300 iterations, threshold 450.04us
[ 483.736028]
[ 483.745866] tst_timer_test.c:307: INFO: min 5023us, max 10096us, median 5164us, trunc mean 5361.09us (discarded 15)
[ 483.745869]
[ 483.754268] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 483.754271]
[ 483.764744] tst_timer_test.c:264: INFO: select() sleeping for 10000us 100 iterations, threshold 450.33us
[ 483.764747]
[ 483.774978] tst_timer_test.c:307: INFO: min 10069us, max 14123us, median 10169us, trunc mean 10315.24us (discarded 5)
[ 483.774981]
[ 483.785770] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 483.785773]
[ 483.794917] tst_timer_test.c:264: INFO: select() sleeping for 25000us 50 iterations, threshold 451.29us
[ 483.794920]
[ 484.008296] tst_timer_test.c:307: INFO: min 25065us, max 29510us, median 25233us, trunc mean 25540.21us (discarded 2)
[ 484.008299]
[ 484.016840] tst_timer_test.c:310: FAIL: select() slept for too long
[ 484.016843]
[ 484.024443]
[ 484.028091] Time: us | Frequency
[ 484.028094]
[ 484.040652] --------------------------------------------------------------------------------
[ 484.040657]
[ 484.052910] 25065 | ********************************************************************
[ 484.052913]
[ 484.063459] 25299 | **************
[ 484.063462]
[ 484.068515] 25533 |
[ 484.068519]
[ 484.074365] 25767 | *******
[ 484.074368]
[ 484.079386] 26001 | *********-
[ 484.079389]
[ 484.085273] 26235 | ****+
[ 484.085277]
[ 484.089746] 26469 | **-
[ 484.089749]
[ 484.094720] 26703 |
[ 484.094723]
[ 484.098897] 26937 | **-
[ 484.098901]
[ 484.103199] 27171 |
[ 484.103203]
[ 484.107318] 27405 |
[ 484.107322]
[ 484.113051] 27639 | ****+
[ 484.113054]
[ 484.120068] 27873 |
[ 484.120073]
[ 484.124407] 28107 | **-
[ 484.124411]
[ 484.127982] 28341 |
[ 484.127985]
[ 484.131838] 28575 |
[ 484.131841]
[ 484.136283] 28809 |
[ 484.136287]
[ 484.141330] 29043 |
[ 484.141335]
[ 484.147183] 29277 | **-
[ 484.147186]
[ 484.153769] --------------------------------------------------------------------------------
[ 484.153772]
[ 484.161029] 234us | 1 sample = 2.34483 '*', 4.68966 '+', 9.37931 '-', non-zero '.'
[ 484.161032]
[ 484.166668]
[ 484.172311] tst_timer_test.c:264: INFO: select() sleeping for 100000us 10 iterations, threshold 537.00us
[ 484.172314]
[ 485.015313] tst_timer_test.c:307: INFO: min 100208us, max 100294us, median 100275us, trunc mean 100265.11us (discarded 1)
[ 485.015319]
[ 485.023959] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 485.023963]
[ 485.043016] tst_timer_test.c:264: INFO: select() sleeping for 1000000us 2 iterations, threshold 4400.00us
[ 485.043020]
[ 487.019884] tst_timer_test.c:307: INFO: min 1001184us, max 1001186us, median 1001184us, trunc mean 1001184.00us (discarded 1)
[ 487.019890]
[ 487.035395] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 487.035401]
[ 487.045448] tst_test.c:1118: INFO: Timeout per run is 0h 05m 00s
[ 487.045453]
[ 487.056039] tst_timer_test.c:345: INFO: CLOCK_MONOTONIC resolution 1ns
[ 487.056043]
[ 487.066416] tst_timer_test.c:357: INFO: prctl(PR_GET_TIMERSLACK) = 50us
[ 487.066421]
[ 487.077159] select_var.h:72: INFO: Testing SYS_pselect6 syscall
[ 487.077163]
[ 487.088250] tst_timer_test.c:264: INFO: select() sleeping for 1000us 500 iterations, threshold 450.01us
[ 487.088254]
[ 487.648115] tst_timer_test.c:307: INFO: min 1060us, max 7520us, median 1188us, trunc mean 1187.02us (discarded 25)
[ 487.648121]
[ 487.660865] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 487.660871]
[ 487.681755] tst_timer_test.c:264: INFO: select() sleeping for 2000us 500 iterations, threshold 450.01us
[ 487.681760]
[ 488.765527] tst_timer_test.c:307: INFO: min 2072us, max 5461us, median 2217us, trunc mean 2206.84us (discarded 25)
[ 488.765532]
[ 488.777841] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 488.777844]
[ 488.789015] tst_timer_test.c:264: INFO: select() sleeping for 5000us 300 iterations, threshold 450.04us
[ 488.789018]
[ 490.358620] tst_timer_test.c:307: INFO: min 5090us, max 9929us, median 5230us, trunc mean 5229.22us (discarded 15)
[ 490.358624]
[ 490.370107] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 490.370110]
[ 490.381126] tst_timer_test.c:264: INFO: select() sleeping for 10000us 100 iterations, threshold 450.33us
[ 490.381130]
[ 491.389762] tst_timer_test.c:307: INFO: min 10071us, max 16644us, median 10234us, trunc mean 10230.05us (discarded 5)
[ 491.389768]
[ 491.400810] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 491.400812]
[ 491.412920] tst_timer_test.c:264: INFO: select() sleeping for 25000us 50 iterations, threshold 451.29us
[ 491.412923]
[ 492.654871] tst_timer_test.c:307: INFO: min 25155us, max 28435us, median 25232us, trunc mean 25232.15us (discarded 2)
[ 492.654877]
[ 492.665458] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 492.665461]
[ 492.676978] tst_timer_test.c:264: INFO: select() sleeping for 100000us 10 iterations, threshold 537.00us
[ 492.676981]
[ 493.658209] tst_timer_test.c:307: INFO: min 100273us, max 100291us, median 100288us, trunc mean 100284.78us (discarded 1)
[ 493.658215]
[ 493.669206] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 493.669209]
[ 493.681098] tst_timer_test.c:264: INFO: select() sleeping for 1000000us 2 iterations, threshold 4400.00us
[ 493.681102]
[ 495.654195] LTP: starting semctl01
[ 495.665460] tst_timer_test.c:307: INFO: min 1001178us, max 1001182us, median 1001178us, trunc mean 1001178.00us (discarded 1)
[ 495.665465]
[ 495.678498] tst_timer_test.c:322: PASS: Measured times are within thresholds
[ 495.678501]
[ 495.686271] tst_test.c:1118: INFO: Timeout per run is 0h 05m 00s
[ 495.686275]
[ 495.695265] tst_timer_test.c:345: INFO: CLOCK_MONOTONIC resolution 1ns
[ 495.695269]
[ 495.703086] tst_timer_test.c:357: INFO: prctl(PR_GET_TIMERSLACK) = 50us
[ 495.703089]
[ 495.712458] select_var.h:75: INFO: Testing SYS__newselect syscall
[ 495.712462]
[ 495.725790] tst_timer_test.c:264: INFO: select() sleeping for 1000us 500 iterations, threshold 450.01us
[ 495.725812]
[ 495.737856] select_var.h:52: CONF: syscall(-1) __NR__newselect not supported
[ 495.737860]
[ 495.743433]
[ 495.746020] Summary:
[ 495.746023]
[ 495.749409] BUG: kernel NULL pointer dereference, address: 0000000000000002
[ 495.751192] passed 15
[ 495.751195]
[ 495.754769] #PF: supervisor read access in kernel mode
[ 495.754771] #PF: error_code(0x0000) - not-present page
[ 495.754773] PGD 0 P4D 0
[ 495.754777] Oops: 0000 [#1] SMP PTI
[ 495.754782] CPU: 0 PID: 7442 Comm: semctl01 Not tainted 5.4.0-rc2-00382-g6394de3b86853 #1
[ 495.757491] failed 6
[ 495.757495]
[ 495.759023] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 495.759035] RIP: 0010:__list_del_entry_valid+0x25/0x90
[ 495.759040] Code: c3 0f 1f 40 00 48 8b 07 48 b9 00 01 00 00 00 00 ad de 48 8b 57 08 48 39 c8 74 26 48 b9 22 01 00 00 00 00 ad de 48 39 ca 74 2e <48> 8b 32 48 39 fe 75 3a 48 8b 50 08 48 39 f2 75 48 b8 01 00 00 00
[ 495.762252] skipped 0
[ 495.762255]
[ 495.764857] RSP: 0018:ffffab238045fc28 EFLAGS: 00010203
[ 495.764860] RAX: 0000000000000020 RBX: 0000000000000020 RCX: dead000000000122
[ 495.764861] RDX: 0000000000000002 RSI: ffffab2380a7bd20 RDI: ffffab2380a7bd20
[ 495.764862] RBP: ffff898b5bed7400 R08: ffffab238045fd10 R09: 0000000000000001
[ 495.764863] R10: 0000000000000001 R11: ffffffff85132ae0 R12: ffffab2380a7bd20
[ 495.764864] R13: 0000000000000004 R14: 0000000000000000 R15: ffffab2380a7bd20
[ 495.764866] FS: 00007f013c3f3500(0000) GS:ffff898b7fc00000(0000) knlGS:0000000000000000
[ 495.764868] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 495.764869] CR2: 0000000000000002 CR3: 0000000197550000 CR4: 00000000000006f0
[ 495.764876] Call Trace:
[ 495.767458] warnings 0
[ 495.767461]
[ 495.769346] update_queue+0xac/0x1a0
[ 495.773856] <<<execution_status>>>
[ 495.773859]
[ 495.774808] do_smart_update+0x8b/0x110
[ 495.774813] do_semtimedop+0xc77/0x1270
[ 495.777521] initiation_status="ok"
[ 495.777524]
[ 495.780101] ? reuse_swap_page+0x106/0x350
[ 495.780107] ? ptep_set_access_flags+0x23/0x30
[ 495.780114] ? wp_page_reuse+0x58/0x70
[ 495.780117] ? do_wp_page+0x141/0x3e0
[ 495.780120] ? __handle_mm_fault+0x9ea/0xf70
[ 495.785671] duration=27 termination_type=exited termination_id=1 corefile=no
[ 495.785674]
[ 495.790152] ? handle_mm_fault+0xdd/0x210
[ 495.790159] ? __do_page_fault+0x310/0x520
[ 495.790169] ? do_syscall_64+0x5b/0x1d0
[ 495.793079] cutime=3 cstime=11
[ 495.793084]
[ 495.794012] ? do_semtimedop+0x1270/0x1270
[ 495.794014] do_syscall_64+0x5b/0x1d0
[ 495.794024] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 495.794029] RIP: 0033:0x7f013bf23587
[ 495.797426] <<<test_end>>>
[ 495.797429]
[ 495.800048] Code: 73 01 c3 48 8b 0d 11 e9 2a 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 41 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e1 e8 2a 00 f7 d8 64 89 01 48
[ 495.800050] RSP: 002b:00007ffdfdaafba8 EFLAGS: 00000202 ORIG_RAX: 0000000000000041
[ 495.800052] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f013bf23587
[ 495.800053] RDX: 0000000000000001 RSI: 00005608192d16f4 RDI: 0000000000000000
[ 495.800054] RBP: 00005608192cff10 R08: 0000000000000000 R09: 00007ffdfdaae9d0
[ 495.800055] R10: 0000000000000853 R11: 0000000000000202 R12: 0000000000000006
[ 495.800056] R13: 00005608190b55d0 R14: 00007f013c3f3498 R15: 00005608192d1574
[ 495.800058] Modules linked in: overlay fuse vfat fat btrfs xor zstd_decompress zstd_compress raid6_pq ext4 mbcache jbd2 loop xfs libcrc32c dm_mod intel_rapl_msr intel_rapl_common crct10dif_pclmul sr_mod crc32_pclmul cdrom crc32c_intel sg ghash_clmulni_intel bochs_drm drm_vram_helper ppdev ttm ata_generic pata_acpi drm_kms_helper snd_pcm aesni_intel syscopyarea sysfillrect snd_timer sysimgblt fb_sys_fops crypto_simd drm snd cryptd glue_helper joydev soundcore pcspkr serio_raw i2c_piix4 ata_piix parport_pc parport floppy ip_tables
[ 495.803953] <<<test_start>>>
[ 495.803955]
[ 495.806468] CR2: 0000000000000002
[ 495.806474] ---[ end trace 84cb6cfa04b40df0 ]---


To reproduce:

# build kernel
cd linux
cp config-5.4.0-rc2-00382-g6394de3b86853 .config
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-7 CC=gcc-7 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email



Thanks,
Rong Chen


Attachments:
(No filename) (23.91 kB)
config-5.4.0-rc2-00382-g6394de3b86853 (203.86 kB)
job-script (5.19 kB)
dmesg.xz (99.78 kB)
Download all attachments

2019-10-24 09:09:28

by Manfred Spraul

[permalink] [raw]
Subject: Re: [ipc/sem.c] 6394de3b86: BUG:kernel_NULL_pointer_dereference,address

Hello,

On 10/21/19 10:35 AM, kernel test robot wrote:
> FYI, we noticed the following commit (built with gcc-7):
>
> commit: 6394de3b868537a90dd9128607192b0e97109f6b ("[PATCH 4/5] ipc/sem.c: Document and update memory barriers")
> url: https://github.com/0day-ci/linux/commits/Manfred-Spraul/wake_q-Cleanup-Documentation-update/20191014-055627

Yes, known issue:

> @@ -2148,9 +2176,11 @@ static long do_semtimedop(int semid, struct
> sembuf __user *tsops,
> ??????? }
>
> ??????? do {
> -?????????????? WRITE_ONCE(queue.status, -EINTR);
> +?????????????? /* memory ordering ensured by the lock in sem_lock() */
> +?????????????? queue.status = EINTR;
> ??????????????? queue.sleeper = current;
>
> +?????????????? /* memory ordering is ensured by the lock in sem_lock() */
> ??????????????? __set_current_state(TASK_INTERRUPTIBLE);
> ??????????????? sem_unlock(sma, locknum);
> ??????????????? rcu_read_unlock();
It must be "-EINTR", not "EINTR".

If there is a timeout or a spurious wakeup, then the do_semtimedop()
returns to user space without unlinking everything properly.

I was able to reproduce the issue: V1 of the series ends up with the
shown error.

V3 as now merged doesn't fail.

--

??? Manfred