2021-03-29 08:06:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 00/53] 4.9.264-rc1 review

This is the start of the stable review cycle for the 4.9.264 release.
There are 53 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.264-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.264-rc1

Markus Theil <[email protected]>
mac80211: fix double free in ibss_leave

Eric Dumazet <[email protected]>
net: qrtr: fix a kernel-infoleak in qrtr_recvmsg()

Eric Dumazet <[email protected]>
net: sched: validate stab values

Martin Willi <[email protected]>
can: dev: Move device back to init netns on owning netns delete

Mike Galbraith <[email protected]>
futex: Handle transient "ownerless" rtmutex state correctly

Mateusz Nosek <[email protected]>
futex: Fix incorrect should_fail_futex() handling

Yang Tao <[email protected]>
futex: Prevent robust futex exit race

Will Deacon <[email protected]>
arm64: futex: Bound number of LDXR/STXR loops in FUTEX_WAKE_OP

Will Deacon <[email protected]>
locking/futex: Allow low-level atomic operations to return -EAGAIN

Peter Zijlstra <[email protected]>
futex: Fix (possible) missed wakeup

Thomas Gleixner <[email protected]>
futex: Handle early deadlock return correctly

Peter Zijlstra <[email protected]>
futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()

Thomas Gleixner <[email protected]>
futex: Avoid freeing an active timer

Peter Zijlstra <[email protected]>
futex: Drop hb->lock before enqueueing on the rtmutex

Peter Zijlstra <[email protected]>
futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()

Peter Zijlstra <[email protected]>
futex,rt_mutex: Introduce rt_mutex_init_waiter()

Peter Zijlstra <[email protected]>
futex: Use smp_store_release() in mark_wake_futex()

Matthew Wilcox <[email protected]>
idr: add ida_is_empty

Adrian Hunter <[email protected]>
perf auxtrace: Fix auxtrace queue conflict

Andy Shevchenko <[email protected]>
ACPI: scan: Use unique number for instance_no

Rafael J. Wysocki <[email protected]>
ACPI: scan: Rearrange memory allocation in acpi_device_add()

Potnuri Bharat Teja <[email protected]>
RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server

Johan Hovold <[email protected]>
net: cdc-phonet: fix data-interface release on probe failure

Johannes Berg <[email protected]>
mac80211: fix rate mask reset

Torin Cooper-Bennun <[email protected]>
can: m_can: m_can_do_rx_poll(): fix extraneous msg loss warning

Tong Zhang <[email protected]>
can: c_can: move runtime PM enable/disable to c_can_platform

Tong Zhang <[email protected]>
can: c_can_pci: c_can_pci_remove(): fix use-after-free

Lv Yunlong <[email protected]>
net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template

Dinghao Liu <[email protected]>
e1000e: Fix error handling in e1000_set_d0_lplu_state_82571

Vitaly Lifshits <[email protected]>
e1000e: add rtnl_lock() to e1000_reset_task

Florian Fainelli <[email protected]>
net: dsa: bcm_sf2: Qualify phydev->dev_flags based on port

Eric Dumazet <[email protected]>
macvlan: macvlan_count_rx() needs to be aware of preemption

Grygorii Strashko <[email protected]>
bus: omap_l3_noc: mark l3 irqs as IRQF_NO_THREAD

Horia Geantă <[email protected]>
arm64: dts: ls1043a: mark crypto engine dma coherent

Phillip Lougher <[email protected]>
squashfs: fix xattr id and id lookup sanity checks

Sean Nyekjaer <[email protected]>
squashfs: fix inode lookup sanity checks

Borislav Petkov <[email protected]>
x86/tlb: Flush global mappings when KAISER is disabled

Sergei Trofimovich <[email protected]>
ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign

Sergei Trofimovich <[email protected]>
ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls

J. Bruce Fields <[email protected]>
nfs: we don't support removing system.nfs4_acl

Peter Zijlstra <[email protected]>
u64_stats,lockdep: Fix u64_stats_init() vs lockdep

Tong Zhang <[email protected]>
atm: idt77252: fix null-ptr-dereference

Tong Zhang <[email protected]>
atm: uPD98402: fix incorrect allocation

Jia-Ju Bai <[email protected]>
net: wan: fix error return code of uhdlc_init()

Frank Sorenson <[email protected]>
NFS: Correct size calculation for create reply length

Timo Rothenpieler <[email protected]>
nfs: fix PNFS_FLEXFILE_LAYOUT Kconfig default

Denis Efremov <[email protected]>
sun/niu: fix wrong RXMAC_BC_FRM_CNT_COUNT count

Jia-Ju Bai <[email protected]>
net: tehuti: fix error return code in bdx_probe()

Dinghao Liu <[email protected]>
ixgbe: Fix memleak in ixgbe_configure_clsu32

Tong Zhang <[email protected]>
atm: lanai: dont run lanai_dev_close if not open

Tong Zhang <[email protected]>
atm: eni: dont release is never initialized

Michael Ellerman <[email protected]>
powerpc/4xx: Fix build errors from mfdcr()

Heiko Thiery <[email protected]>
net: fec: ptp: avoid register access when ipg clock is disabled


-------------

Diffstat:

Makefile | 4 +-
arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 1 +
arch/arm64/include/asm/futex.h | 59 ++--
arch/ia64/include/asm/syscall.h | 2 +-
arch/ia64/kernel/ptrace.c | 24 +-
arch/powerpc/include/asm/dcr-native.h | 8 +-
arch/x86/include/asm/tlbflush.h | 11 +-
drivers/acpi/internal.h | 6 +-
drivers/acpi/scan.c | 88 +++--
drivers/atm/eni.c | 3 +-
drivers/atm/idt77105.c | 4 +-
drivers/atm/lanai.c | 5 +-
drivers/atm/uPD98402.c | 2 +-
drivers/bus/omap_l3_noc.c | 4 +-
drivers/infiniband/hw/cxgb4/cm.c | 4 +-
drivers/net/can/c_can/c_can.c | 24 +-
drivers/net/can/c_can/c_can_pci.c | 3 +-
drivers/net/can/c_can/c_can_platform.c | 6 +-
drivers/net/can/dev.c | 1 +
drivers/net/can/m_can/m_can.c | 3 -
drivers/net/dsa/bcm_sf2.c | 6 +-
drivers/net/ethernet/freescale/fec_ptp.c | 7 +
drivers/net/ethernet/intel/e1000e/82571.c | 2 +
drivers/net/ethernet/intel/e1000e/netdev.c | 6 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 6 +-
.../net/ethernet/qlogic/qlcnic/qlcnic_minidump.c | 3 +
drivers/net/ethernet/sun/niu.c | 2 -
drivers/net/ethernet/tehuti/tehuti.c | 1 +
drivers/net/usb/cdc-phonet.c | 2 +
drivers/net/wan/fsl_ucc_hdlc.c | 8 +-
drivers/usb/gadget/function/f_hid.c | 6 +-
drivers/usb/gadget/function/f_printer.c | 6 +-
fs/nfs/Kconfig | 2 +-
fs/nfs/nfs3xdr.c | 3 +-
fs/nfs/nfs4proc.c | 3 +
fs/squashfs/export.c | 8 +-
fs/squashfs/id.c | 6 +-
fs/squashfs/squashfs_fs.h | 1 +
fs/squashfs/xattr_id.c | 6 +-
include/acpi/acpi_bus.h | 1 +
include/linux/idr.h | 5 +
include/linux/if_macvlan.h | 3 +-
include/linux/u64_stats_sync.h | 7 +-
include/net/red.h | 10 +-
include/net/rtnetlink.h | 2 +
kernel/futex.c | 393 ++++++++++++++-------
kernel/locking/rtmutex.c | 106 ++++--
kernel/locking/rtmutex_common.h | 5 +-
net/core/dev.c | 2 +-
net/mac80211/cfg.c | 4 +-
net/mac80211/ibss.c | 2 +
net/qrtr/qrtr.c | 5 +
net/sched/sch_choke.c | 7 +-
net/sched/sch_gred.c | 2 +-
net/sched/sch_red.c | 7 +-
net/sched/sch_sfq.c | 2 +-
tools/perf/util/auxtrace.c | 4 -
57 files changed, 607 insertions(+), 306 deletions(-)



2021-03-29 08:06:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 45/53] locking/futex: Allow low-level atomic operations to return -EAGAIN

From: Will Deacon <[email protected]>

commit 6b4f4bc9cb22875f97023984a625386f0c7cc1c0 upstream.

Some futex() operations, including FUTEX_WAKE_OP, require the kernel to
perform an atomic read-modify-write of the futex word via the userspace
mapping. These operations are implemented by each architecture in
arch_futex_atomic_op_inuser() and futex_atomic_cmpxchg_inatomic(), which
are called in atomic context with the relevant hash bucket locks held.

Although these routines may return -EFAULT in response to a page fault
generated when accessing userspace, they are expected to succeed (i.e.
return 0) in all other cases. This poses a problem for architectures
that do not provide bounded forward progress guarantees or fairness of
contended atomic operations and can lead to starvation in some cases.

In these problematic scenarios, we must return back to the core futex
code so that we can drop the hash bucket locks and reschedule if
necessary, much like we do in the case of a page fault.

Allow architectures to return -EAGAIN from their implementations of
arch_futex_atomic_op_inuser() and futex_atomic_cmpxchg_inatomic(), which
will cause the core futex code to reschedule if necessary and return
back to the architecture code later on.

Cc: <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/futex.c | 185 +++++++++++++++++++++++++++++++++++----------------------
1 file changed, 115 insertions(+), 70 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1407,13 +1407,15 @@ static int lookup_pi_state(u32 __user *u

static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval)
{
+ int err;
u32 uninitialized_var(curval);

if (unlikely(should_fail_futex(true)))
return -EFAULT;

- if (unlikely(cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)))
- return -EFAULT;
+ err = cmpxchg_futex_value_locked(&curval, uaddr, uval, newval);
+ if (unlikely(err))
+ return err;

/* If user space value changed, let the caller retry */
return curval != uval ? -EAGAIN : 0;
@@ -1606,10 +1608,8 @@ static int wake_futex_pi(u32 __user *uad
if (unlikely(should_fail_futex(true)))
ret = -EFAULT;

- if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval)) {
- ret = -EFAULT;
-
- } else if (curval != uval) {
+ ret = cmpxchg_futex_value_locked(&curval, uaddr, uval, newval);
+ if (!ret && (curval != uval)) {
/*
* If a unconditional UNLOCK_PI operation (user space did not
* try the TID->0 transition) raced with a waiter setting the
@@ -1795,32 +1795,32 @@ retry_private:
double_lock_hb(hb1, hb2);
op_ret = futex_atomic_op_inuser(op, uaddr2);
if (unlikely(op_ret < 0)) {
-
double_unlock_hb(hb1, hb2);

-#ifndef CONFIG_MMU
- /*
- * we don't get EFAULT from MMU faults if we don't have an MMU,
- * but we might get them from range checking
- */
- ret = op_ret;
- goto out_put_keys;
-#endif
-
- if (unlikely(op_ret != -EFAULT)) {
+ if (!IS_ENABLED(CONFIG_MMU) ||
+ unlikely(op_ret != -EFAULT && op_ret != -EAGAIN)) {
+ /*
+ * we don't get EFAULT from MMU faults if we don't have
+ * an MMU, but we might get them from range checking
+ */
ret = op_ret;
goto out_put_keys;
}

- ret = fault_in_user_writeable(uaddr2);
- if (ret)
- goto out_put_keys;
+ if (op_ret == -EFAULT) {
+ ret = fault_in_user_writeable(uaddr2);
+ if (ret)
+ goto out_put_keys;
+ }

- if (!(flags & FLAGS_SHARED))
+ if (!(flags & FLAGS_SHARED)) {
+ cond_resched();
goto retry_private;
+ }

put_futex_key(&key2);
put_futex_key(&key1);
+ cond_resched();
goto retry;
}

@@ -2516,14 +2516,17 @@ retry:
if (!pi_state->owner)
newtid |= FUTEX_OWNER_DIED;

- if (get_futex_value_locked(&uval, uaddr))
- goto handle_fault;
+ err = get_futex_value_locked(&uval, uaddr);
+ if (err)
+ goto handle_err;

for (;;) {
newval = (uval & FUTEX_OWNER_DIED) | newtid;

- if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval))
- goto handle_fault;
+ err = cmpxchg_futex_value_locked(&curval, uaddr, uval, newval);
+ if (err)
+ goto handle_err;
+
if (curval == uval)
break;
uval = curval;
@@ -2538,23 +2541,36 @@ retry:
return argowner == current;

/*
- * To handle the page fault we need to drop the locks here. That gives
- * the other task (either the highest priority waiter itself or the
- * task which stole the rtmutex) the chance to try the fixup of the
- * pi_state. So once we are back from handling the fault we need to
- * check the pi_state after reacquiring the locks and before trying to
- * do another fixup. When the fixup has been done already we simply
- * return.
+ * In order to reschedule or handle a page fault, we need to drop the
+ * locks here. In the case of a fault, this gives the other task
+ * (either the highest priority waiter itself or the task which stole
+ * the rtmutex) the chance to try the fixup of the pi_state. So once we
+ * are back from handling the fault we need to check the pi_state after
+ * reacquiring the locks and before trying to do another fixup. When
+ * the fixup has been done already we simply return.
*
* Note: we hold both hb->lock and pi_mutex->wait_lock. We can safely
* drop hb->lock since the caller owns the hb -> futex_q relation.
* Dropping the pi_mutex->wait_lock requires the state revalidate.
*/
-handle_fault:
+handle_err:
raw_spin_unlock_irq(&pi_state->pi_mutex.wait_lock);
spin_unlock(q->lock_ptr);

- err = fault_in_user_writeable(uaddr);
+ switch (err) {
+ case -EFAULT:
+ err = fault_in_user_writeable(uaddr);
+ break;
+
+ case -EAGAIN:
+ cond_resched();
+ err = 0;
+ break;
+
+ default:
+ WARN_ON_ONCE(1);
+ break;
+ }

spin_lock(q->lock_ptr);
raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
@@ -3128,10 +3144,8 @@ retry:
* A unconditional UNLOCK_PI op raced against a waiter
* setting the FUTEX_WAITERS bit. Try again.
*/
- if (ret == -EAGAIN) {
- put_futex_key(&key);
- goto retry;
- }
+ if (ret == -EAGAIN)
+ goto pi_retry;
/*
* wake_futex_pi has detected invalid state. Tell user
* space.
@@ -3146,9 +3160,19 @@ retry:
* preserve the WAITERS bit not the OWNER_DIED one. We are the
* owner.
*/
- if (cmpxchg_futex_value_locked(&curval, uaddr, uval, 0)) {
+ if ((ret = cmpxchg_futex_value_locked(&curval, uaddr, uval, 0))) {
spin_unlock(&hb->lock);
- goto pi_faulted;
+ switch (ret) {
+ case -EFAULT:
+ goto pi_faulted;
+
+ case -EAGAIN:
+ goto pi_retry;
+
+ default:
+ WARN_ON_ONCE(1);
+ goto out_putkey;
+ }
}

/*
@@ -3162,6 +3186,11 @@ out_putkey:
put_futex_key(&key);
return ret;

+pi_retry:
+ put_futex_key(&key);
+ cond_resched();
+ goto retry;
+
pi_faulted:
put_futex_key(&key);

@@ -3504,6 +3533,7 @@ err_unlock:
static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr, int pi)
{
u32 uval, uninitialized_var(nval), mval;
+ int err;

/* Futex address must be 32bit aligned */
if ((((unsigned long)uaddr) % sizeof(*uaddr)) != 0)
@@ -3513,42 +3543,57 @@ retry:
if (get_user(uval, uaddr))
return -1;

- if ((uval & FUTEX_TID_MASK) == task_pid_vnr(curr)) {
- /*
- * Ok, this dying thread is truly holding a futex
- * of interest. Set the OWNER_DIED bit atomically
- * via cmpxchg, and if the value had FUTEX_WAITERS
- * set, wake up a waiter (if any). (We have to do a
- * futex_wake() even if OWNER_DIED is already set -
- * to handle the rare but possible case of recursive
- * thread-death.) The rest of the cleanup is done in
- * userspace.
- */
- mval = (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
- /*
- * We are not holding a lock here, but we want to have
- * the pagefault_disable/enable() protection because
- * we want to handle the fault gracefully. If the
- * access fails we try to fault in the futex with R/W
- * verification via get_user_pages. get_user() above
- * does not guarantee R/W access. If that fails we
- * give up and leave the futex locked.
- */
- if (cmpxchg_futex_value_locked(&nval, uaddr, uval, mval)) {
+ if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
+ return 0;
+
+ /*
+ * Ok, this dying thread is truly holding a futex
+ * of interest. Set the OWNER_DIED bit atomically
+ * via cmpxchg, and if the value had FUTEX_WAITERS
+ * set, wake up a waiter (if any). (We have to do a
+ * futex_wake() even if OWNER_DIED is already set -
+ * to handle the rare but possible case of recursive
+ * thread-death.) The rest of the cleanup is done in
+ * userspace.
+ */
+ mval = (uval & FUTEX_WAITERS) | FUTEX_OWNER_DIED;
+
+ /*
+ * We are not holding a lock here, but we want to have
+ * the pagefault_disable/enable() protection because
+ * we want to handle the fault gracefully. If the
+ * access fails we try to fault in the futex with R/W
+ * verification via get_user_pages. get_user() above
+ * does not guarantee R/W access. If that fails we
+ * give up and leave the futex locked.
+ */
+ if ((err = cmpxchg_futex_value_locked(&nval, uaddr, uval, mval))) {
+ switch (err) {
+ case -EFAULT:
if (fault_in_user_writeable(uaddr))
return -1;
goto retry;
- }
- if (nval != uval)
+
+ case -EAGAIN:
+ cond_resched();
goto retry;

- /*
- * Wake robust non-PI futexes here. The wakeup of
- * PI futexes happens in exit_pi_state():
- */
- if (!pi && (uval & FUTEX_WAITERS))
- futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
+ default:
+ WARN_ON_ONCE(1);
+ return err;
+ }
}
+
+ if (nval != uval)
+ goto retry;
+
+ /*
+ * Wake robust non-PI futexes here. The wakeup of
+ * PI futexes happens in exit_pi_state():
+ */
+ if (!pi && (uval & FUTEX_WAITERS))
+ futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
+
return 0;
}



2021-03-29 08:06:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 46/53] arm64: futex: Bound number of LDXR/STXR loops in FUTEX_WAKE_OP

From: Will Deacon <[email protected]>

commit 03110a5cb2161690ae5ac04994d47ed0cd6cef75 upstream.

Our futex implementation makes use of LDXR/STXR loops to perform atomic
updates to user memory from atomic context. This can lead to latency
problems if we end up spinning around the LL/SC sequence at the expense
of doing something useful.

Rework our futex atomic operations so that we return -EAGAIN if we fail
to update the futex word after 128 attempts. The core futex code will
reschedule if necessary and we'll try again later.

Fixes: 6170a97460db ("arm64: Atomic operations")
Signed-off-by: Will Deacon <[email protected]>
[bwh: Backported to 4.9: adjust context]
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/include/asm/futex.h | 59 +++++++++++++++++++++++++----------------
1 file changed, 37 insertions(+), 22 deletions(-)

--- a/arch/arm64/include/asm/futex.h
+++ b/arch/arm64/include/asm/futex.h
@@ -26,7 +26,12 @@
#include <asm/errno.h>
#include <asm/sysreg.h>

+#define FUTEX_MAX_LOOPS 128 /* What's the largest number you can think of? */
+
#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg) \
+do { \
+ unsigned int loops = FUTEX_MAX_LOOPS; \
+ \
asm volatile( \
ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN) \
@@ -34,21 +39,26 @@
"1: ldxr %w1, %2\n" \
insn "\n" \
"2: stlxr %w0, %w3, %2\n" \
-" cbnz %w0, 1b\n" \
-" dmb ish\n" \
+" cbz %w0, 3f\n" \
+" sub %w4, %w4, %w0\n" \
+" cbnz %w4, 1b\n" \
+" mov %w0, %w7\n" \
"3:\n" \
+" dmb ish\n" \
" .pushsection .fixup,\"ax\"\n" \
" .align 2\n" \
-"4: mov %w0, %w5\n" \
+"4: mov %w0, %w6\n" \
" b 3b\n" \
" .popsection\n" \
_ASM_EXTABLE(1b, 4b) \
_ASM_EXTABLE(2b, 4b) \
ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, \
CONFIG_ARM64_PAN) \
- : "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp) \
- : "r" (oparg), "Ir" (-EFAULT) \
- : "memory")
+ : "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp), \
+ "+r" (loops) \
+ : "r" (oparg), "Ir" (-EFAULT), "Ir" (-EAGAIN) \
+ : "memory"); \
+} while (0)

static inline int
arch_futex_atomic_op_inuser(int op, int oparg, int *oval, u32 __user *uaddr)
@@ -59,23 +69,23 @@ arch_futex_atomic_op_inuser(int op, int

switch (op) {
case FUTEX_OP_SET:
- __futex_atomic_op("mov %w3, %w4",
+ __futex_atomic_op("mov %w3, %w5",
ret, oldval, uaddr, tmp, oparg);
break;
case FUTEX_OP_ADD:
- __futex_atomic_op("add %w3, %w1, %w4",
+ __futex_atomic_op("add %w3, %w1, %w5",
ret, oldval, uaddr, tmp, oparg);
break;
case FUTEX_OP_OR:
- __futex_atomic_op("orr %w3, %w1, %w4",
+ __futex_atomic_op("orr %w3, %w1, %w5",
ret, oldval, uaddr, tmp, oparg);
break;
case FUTEX_OP_ANDN:
- __futex_atomic_op("and %w3, %w1, %w4",
+ __futex_atomic_op("and %w3, %w1, %w5",
ret, oldval, uaddr, tmp, ~oparg);
break;
case FUTEX_OP_XOR:
- __futex_atomic_op("eor %w3, %w1, %w4",
+ __futex_atomic_op("eor %w3, %w1, %w5",
ret, oldval, uaddr, tmp, oparg);
break;
default:
@@ -95,6 +105,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval,
u32 oldval, u32 newval)
{
int ret = 0;
+ unsigned int loops = FUTEX_MAX_LOOPS;
u32 val, tmp;
u32 __user *uaddr;

@@ -106,21 +117,25 @@ futex_atomic_cmpxchg_inatomic(u32 *uval,
ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
" prfm pstl1strm, %2\n"
"1: ldxr %w1, %2\n"
-" sub %w3, %w1, %w4\n"
-" cbnz %w3, 3f\n"
-"2: stlxr %w3, %w5, %2\n"
-" cbnz %w3, 1b\n"
-" dmb ish\n"
+" sub %w3, %w1, %w5\n"
+" cbnz %w3, 4f\n"
+"2: stlxr %w3, %w6, %2\n"
+" cbz %w3, 3f\n"
+" sub %w4, %w4, %w3\n"
+" cbnz %w4, 1b\n"
+" mov %w0, %w8\n"
"3:\n"
+" dmb ish\n"
+"4:\n"
" .pushsection .fixup,\"ax\"\n"
-"4: mov %w0, %w6\n"
-" b 3b\n"
+"5: mov %w0, %w7\n"
+" b 4b\n"
" .popsection\n"
- _ASM_EXTABLE(1b, 4b)
- _ASM_EXTABLE(2b, 4b)
+ _ASM_EXTABLE(1b, 5b)
+ _ASM_EXTABLE(2b, 5b)
ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN, CONFIG_ARM64_PAN)
- : "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
- : "r" (oldval), "r" (newval), "Ir" (-EFAULT)
+ : "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp), "+r" (loops)
+ : "r" (oldval), "r" (newval), "Ir" (-EFAULT), "Ir" (-EAGAIN)
: "memory");

*uval = val;


2021-03-29 08:07:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 48/53] futex: Fix incorrect should_fail_futex() handling

From: Mateusz Nosek <[email protected]>

commit 921c7ebd1337d1a46783d7e15a850e12aed2eaa0 upstream.

If should_futex_fail() returns true in futex_wake_pi(), then the 'ret'
variable is set to -EFAULT and then immediately overwritten. So the failure
injection is non-functional.

Fix it by actually leaving the function and returning -EFAULT.

The Fixes tag is kinda blury because the initial commit which introduced
failure injection was already sloppy, but the below mentioned commit broke
it completely.

[ tglx: Massaged changelog ]

Fixes: 6b4f4bc9cb22 ("locking/futex: Allow low-level atomic operations to return -EAGAIN")
Signed-off-by: Mateusz Nosek <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/futex.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1605,8 +1605,10 @@ static int wake_futex_pi(u32 __user *uad
*/
newval = FUTEX_WAITERS | task_pid_vnr(new_owner);

- if (unlikely(should_fail_futex(true)))
+ if (unlikely(should_fail_futex(true))) {
ret = -EFAULT;
+ goto out_unlock;
+ }

ret = cmpxchg_futex_value_locked(&curval, uaddr, uval, newval);
if (!ret && (curval != uval)) {


2021-03-29 08:07:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 47/53] futex: Prevent robust futex exit race

From: Yang Tao <[email protected]>

commit ca16d5bee59807bf04deaab0a8eccecd5061528c upstream.

Robust futexes utilize the robust_list mechanism to allow the kernel to
release futexes which are held when a task exits. The exit can be voluntary
or caused by a signal or fault. This prevents that waiters block forever.

The futex operations in user space store a pointer to the futex they are
either locking or unlocking in the op_pending member of the per task robust
list.

After a lock operation has succeeded the futex is queued in the robust list
linked list and the op_pending pointer is cleared.

After an unlock operation has succeeded the futex is removed from the
robust list linked list and the op_pending pointer is cleared.

The robust list exit code checks for the pending operation and any futex
which is queued in the linked list. It carefully checks whether the futex
value is the TID of the exiting task. If so, it sets the OWNER_DIED bit and
tries to wake up a potential waiter.

This is race free for the lock operation but unlock has two race scenarios
where waiters might not be woken up. These issues can be observed with
regular robust pthread mutexes. PI aware pthread mutexes are not affected.

(1) Unlocking task is killed after unlocking the futex value in user space
before being able to wake a waiter.

pthread_mutex_unlock()
|
V
atomic_exchange_rel (&mutex->__data.__lock, 0)
<------------------------killed
lll_futex_wake () |
|
|(__lock = 0)
|(enter kernel)
|
V
do_exit()
exit_mm()
mm_release()
exit_robust_list()
handle_futex_death()
|
|(__lock = 0)
|(uval = 0)
|
V
if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
return 0;

The sanity check which ensures that the user space futex is owned by
the exiting task prevents the wakeup of waiters which in consequence
block infinitely.

(2) Waiting task is killed after a wakeup and before it can acquire the
futex in user space.

OWNER WAITER
futex_wait()
pthread_mutex_unlock() |
| |
|(__lock = 0) |
| |
V |
futex_wake() ------------> wakeup()
|
|(return to userspace)
|(__lock = 0)
|
V
oldval = mutex->__data.__lock
<-----------------killed
atomic_compare_and_exchange_val_acq (&mutex->__data.__lock, |
id | assume_other_futex_waiters, 0) |
|
|
(enter kernel)|
|
V
do_exit()
|
|
V
handle_futex_death()
|
|(__lock = 0)
|(uval = 0)
|
V
if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
return 0;

The sanity check which ensures that the user space futex is owned
by the exiting task prevents the wakeup of waiters, which seems to
be correct as the exiting task does not own the futex value, but
the consequence is that other waiters wont be woken up and block
infinitely.

In both scenarios the following conditions are true:

- task->robust_list->list_op_pending != NULL
- user space futex value == 0
- Regular futex (not PI)

If these conditions are met then it is reasonably safe to wake up a
potential waiter in order to prevent the above problems.

As this might be a false positive it can cause spurious wakeups, but the
waiter side has to handle other types of unrelated wakeups, e.g. signals
gracefully anyway. So such a spurious wakeup will not affect the
correctness of these operations.

This workaround must not touch the user space futex value and cannot set
the OWNER_DIED bit because the lock value is 0, i.e. uncontended. Setting
OWNER_DIED in this case would result in inconsistent state and subsequently
in malfunction of the owner died handling in user space.

The rest of the user space state is still consistent as no other task can
observe the list_op_pending entry in the exiting tasks robust list.

The eventually woken up waiter will observe the uncontended lock value and
take it over.

[ tglx: Massaged changelog and comment. Made the return explicit and not
depend on the subsequent check and added constants to hand into
handle_futex_death() instead of plain numbers. Fixed a few coding
style issues. ]

Fixes: 0771dfefc9e5 ("[PATCH] lightweight robust futexes: core")
Signed-off-by: Yang Tao <[email protected]>
Signed-off-by: Yi Wang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/futex.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 51 insertions(+), 7 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3526,11 +3526,16 @@ err_unlock:
return ret;
}

+/* Constants for the pending_op argument of handle_futex_death */
+#define HANDLE_DEATH_PENDING true
+#define HANDLE_DEATH_LIST false
+
/*
* Process a futex-list entry, check whether it's owned by the
* dying task, and do notification if so:
*/
-static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr, int pi)
+static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr,
+ bool pi, bool pending_op)
{
u32 uval, uninitialized_var(nval), mval;
int err;
@@ -3543,6 +3548,42 @@ retry:
if (get_user(uval, uaddr))
return -1;

+ /*
+ * Special case for regular (non PI) futexes. The unlock path in
+ * user space has two race scenarios:
+ *
+ * 1. The unlock path releases the user space futex value and
+ * before it can execute the futex() syscall to wake up
+ * waiters it is killed.
+ *
+ * 2. A woken up waiter is killed before it can acquire the
+ * futex in user space.
+ *
+ * In both cases the TID validation below prevents a wakeup of
+ * potential waiters which can cause these waiters to block
+ * forever.
+ *
+ * In both cases the following conditions are met:
+ *
+ * 1) task->robust_list->list_op_pending != NULL
+ * @pending_op == true
+ * 2) User space futex value == 0
+ * 3) Regular futex: @pi == false
+ *
+ * If these conditions are met, it is safe to attempt waking up a
+ * potential waiter without touching the user space futex value and
+ * trying to set the OWNER_DIED bit. The user space futex value is
+ * uncontended and the rest of the user space mutex state is
+ * consistent, so a woken waiter will just take over the
+ * uncontended futex. Setting the OWNER_DIED bit would create
+ * inconsistent state and malfunction of the user space owner died
+ * handling.
+ */
+ if (pending_op && !pi && !uval) {
+ futex_wake(uaddr, 1, 1, FUTEX_BITSET_MATCH_ANY);
+ return 0;
+ }
+
if ((uval & FUTEX_TID_MASK) != task_pid_vnr(curr))
return 0;

@@ -3662,10 +3703,11 @@ static void exit_robust_list(struct task
* A pending lock might already be on the list, so
* don't process it twice:
*/
- if (entry != pending)
+ if (entry != pending) {
if (handle_futex_death((void __user *)entry + futex_offset,
- curr, pi))
+ curr, pi, HANDLE_DEATH_LIST))
return;
+ }
if (rc)
return;
entry = next_entry;
@@ -3679,9 +3721,10 @@ static void exit_robust_list(struct task
cond_resched();
}

- if (pending)
+ if (pending) {
handle_futex_death((void __user *)pending + futex_offset,
- curr, pip);
+ curr, pip, HANDLE_DEATH_PENDING);
+ }
}

static void futex_cleanup(struct task_struct *tsk)
@@ -3964,7 +4007,8 @@ void compat_exit_robust_list(struct task
if (entry != pending) {
void __user *uaddr = futex_uaddr(entry, futex_offset);

- if (handle_futex_death(uaddr, curr, pi))
+ if (handle_futex_death(uaddr, curr, pi,
+ HANDLE_DEATH_LIST))
return;
}
if (rc)
@@ -3983,7 +4027,7 @@ void compat_exit_robust_list(struct task
if (pending) {
void __user *uaddr = futex_uaddr(pending, futex_offset);

- handle_futex_death(uaddr, curr, pip);
+ handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
}
}



2021-03-29 08:07:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 49/53] futex: Handle transient "ownerless" rtmutex state correctly

From: Mike Galbraith <[email protected]>

commit 9f5d1c336a10c0d24e83e40b4c1b9539f7dba627 upstream.

Gratian managed to trigger the BUG_ON(!newowner) in fixup_pi_state_owner().
This is one possible chain of events leading to this:

Task Prio Operation
T1 120 lock(F)
T2 120 lock(F) -> blocks (top waiter)
T3 50 (RT) lock(F) -> boosts T1 and blocks (new top waiter)
XX timeout/ -> wakes T2
signal
T1 50 unlock(F) -> wakes T3 (rtmutex->owner == NULL, waiter bit is set)
T2 120 cleanup -> try_to_take_mutex() fails because T3 is the top waiter
and the lower priority T2 cannot steal the lock.
-> fixup_pi_state_owner() sees newowner == NULL -> BUG_ON()

The comment states that this is invalid and rt_mutex_real_owner() must
return a non NULL owner when the trylock failed, but in case of a queued
and woken up waiter rt_mutex_real_owner() == NULL is a valid transient
state. The higher priority waiter has simply not yet managed to take over
the rtmutex.

The BUG_ON() is therefore wrong and this is just another retry condition in
fixup_pi_state_owner().

Drop the locks, so that T3 can make progress, and then try the fixup again.

Gratian provided a great analysis, traces and a reproducer. The analysis is
to the point, but it confused the hell out of that tglx dude who had to
page in all the futex horrors again. Condensed version is above.

[ tglx: Wrote comment and changelog ]

Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Reported-by: Gratian Crisan <[email protected]>
Signed-off-by: Mike Galbraith <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/futex.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2497,10 +2497,22 @@ retry:
}

/*
- * Since we just failed the trylock; there must be an owner.
+ * The trylock just failed, so either there is an owner or
+ * there is a higher priority waiter than this one.
*/
newowner = rt_mutex_owner(&pi_state->pi_mutex);
- BUG_ON(!newowner);
+ /*
+ * If the higher priority waiter has not yet taken over the
+ * rtmutex then newowner is NULL. We can't return here with
+ * that state because it's inconsistent vs. the user space
+ * state. So drop the locks and try again. It's a valid
+ * situation and not any different from the other retry
+ * conditions.
+ */
+ if (unlikely(!newowner)) {
+ err = -EAGAIN;
+ goto handle_err;
+ }
} else {
WARN_ON_ONCE(argowner != current);
if (oldowner == current) {


2021-03-29 08:07:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 51/53] net: sched: validate stab values

From: Eric Dumazet <[email protected]>

commit e323d865b36134e8c5c82c834df89109a5c60dab upstream.

iproute2 package is well behaved, but malicious user space can
provide illegal shift values and trigger UBSAN reports.

Add stab parameter to red_check_params() to validate user input.

syzbot reported:

UBSAN: shift-out-of-bounds in ./include/net/red.h:312:18
shift exponent 111 is too large for 64-bit type 'long unsigned int'
CPU: 1 PID: 14662 Comm: syz-executor.3 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x141/0x1d7 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:327
red_calc_qavg_from_idle_time include/net/red.h:312 [inline]
red_calc_qavg include/net/red.h:353 [inline]
choke_enqueue.cold+0x18/0x3dd net/sched/sch_choke.c:221
__dev_xmit_skb net/core/dev.c:3837 [inline]
__dev_queue_xmit+0x1943/0x2e00 net/core/dev.c:4150
neigh_hh_output include/net/neighbour.h:499 [inline]
neigh_output include/net/neighbour.h:508 [inline]
ip6_finish_output2+0x911/0x1700 net/ipv6/ip6_output.c:117
__ip6_finish_output net/ipv6/ip6_output.c:182 [inline]
__ip6_finish_output+0x4c1/0xe10 net/ipv6/ip6_output.c:161
ip6_finish_output+0x35/0x200 net/ipv6/ip6_output.c:192
NF_HOOK_COND include/linux/netfilter.h:290 [inline]
ip6_output+0x1e4/0x530 net/ipv6/ip6_output.c:215
dst_output include/net/dst.h:448 [inline]
NF_HOOK include/linux/netfilter.h:301 [inline]
NF_HOOK include/linux/netfilter.h:295 [inline]
ip6_xmit+0x127e/0x1eb0 net/ipv6/ip6_output.c:320
inet6_csk_xmit+0x358/0x630 net/ipv6/inet6_connection_sock.c:135
dccp_transmit_skb+0x973/0x12c0 net/dccp/output.c:138
dccp_send_reset+0x21b/0x2b0 net/dccp/output.c:535
dccp_finish_passive_close net/dccp/proto.c:123 [inline]
dccp_finish_passive_close+0xed/0x140 net/dccp/proto.c:118
dccp_terminate_connection net/dccp/proto.c:958 [inline]
dccp_close+0xb3c/0xe60 net/dccp/proto.c:1028
inet_release+0x12e/0x280 net/ipv4/af_inet.c:431
inet6_release+0x4c/0x70 net/ipv6/af_inet6.c:478
__sock_release+0xcd/0x280 net/socket.c:599
sock_close+0x18/0x20 net/socket.c:1258
__fput+0x288/0x920 fs/file_table.c:280
task_work_run+0xdd/0x1a0 kernel/task_work.c:140
tracehook_notify_resume include/linux/tracehook.h:189 [inline]

Fixes: 8afa10cbe281 ("net_sched: red: Avoid illegal values")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/net/red.h | 10 +++++++++-
net/sched/sch_choke.c | 7 ++++---
net/sched/sch_gred.c | 2 +-
net/sched/sch_red.c | 7 +++++--
net/sched/sch_sfq.c | 2 +-
5 files changed, 20 insertions(+), 8 deletions(-)

--- a/include/net/red.h
+++ b/include/net/red.h
@@ -167,7 +167,8 @@ static inline void red_set_vars(struct r
v->qcount = -1;
}

-static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog, u8 Scell_log)
+static inline bool red_check_params(u32 qth_min, u32 qth_max, u8 Wlog,
+ u8 Scell_log, u8 *stab)
{
if (fls(qth_min) + Wlog > 32)
return false;
@@ -177,6 +178,13 @@ static inline bool red_check_params(u32
return false;
if (qth_max < qth_min)
return false;
+ if (stab) {
+ int i;
+
+ for (i = 0; i < RED_STAB_SIZE; i++)
+ if (stab[i] >= 32)
+ return false;
+ }
return true;
}

--- a/net/sched/sch_choke.c
+++ b/net/sched/sch_choke.c
@@ -409,6 +409,7 @@ static int choke_change(struct Qdisc *sc
struct sk_buff **old = NULL;
unsigned int mask;
u32 max_P;
+ u8 *stab;

if (opt == NULL)
return -EINVAL;
@@ -424,8 +425,8 @@ static int choke_change(struct Qdisc *sc
max_P = tb[TCA_CHOKE_MAX_P] ? nla_get_u32(tb[TCA_CHOKE_MAX_P]) : 0;

ctl = nla_data(tb[TCA_CHOKE_PARMS]);
-
- if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log))
+ stab = nla_data(tb[TCA_CHOKE_STAB]);
+ if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log, stab))
return -EINVAL;

if (ctl->limit > CHOKE_MAX_QUEUE)
@@ -478,7 +479,7 @@ static int choke_change(struct Qdisc *sc

red_set_parms(&q->parms, ctl->qth_min, ctl->qth_max, ctl->Wlog,
ctl->Plog, ctl->Scell_log,
- nla_data(tb[TCA_CHOKE_STAB]),
+ stab,
max_P);
red_set_vars(&q->vars);

--- a/net/sched/sch_gred.c
+++ b/net/sched/sch_gred.c
@@ -356,7 +356,7 @@ static inline int gred_change_vq(struct
struct gred_sched *table = qdisc_priv(sch);
struct gred_sched_data *q = table->tab[dp];

- if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log))
+ if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log, stab))
return -EINVAL;

if (!q) {
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -169,6 +169,7 @@ static int red_change(struct Qdisc *sch,
struct Qdisc *child = NULL;
int err;
u32 max_P;
+ u8 *stab;

if (opt == NULL)
return -EINVAL;
@@ -184,7 +185,9 @@ static int red_change(struct Qdisc *sch,
max_P = tb[TCA_RED_MAX_P] ? nla_get_u32(tb[TCA_RED_MAX_P]) : 0;

ctl = nla_data(tb[TCA_RED_PARMS]);
- if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog, ctl->Scell_log))
+ stab = nla_data(tb[TCA_RED_STAB]);
+ if (!red_check_params(ctl->qth_min, ctl->qth_max, ctl->Wlog,
+ ctl->Scell_log, stab))
return -EINVAL;

if (ctl->limit > 0) {
@@ -206,7 +209,7 @@ static int red_change(struct Qdisc *sch,
red_set_parms(&q->parms,
ctl->qth_min, ctl->qth_max, ctl->Wlog,
ctl->Plog, ctl->Scell_log,
- nla_data(tb[TCA_RED_STAB]),
+ stab,
max_P);
red_set_vars(&q->vars);

--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -645,7 +645,7 @@ static int sfq_change(struct Qdisc *sch,
}

if (ctl_v1 && !red_check_params(ctl_v1->qth_min, ctl_v1->qth_max,
- ctl_v1->Wlog, ctl_v1->Scell_log))
+ ctl_v1->Wlog, ctl_v1->Scell_log, NULL))
return -EINVAL;
if (ctl_v1 && ctl_v1->qth_min) {
p = kmalloc(sizeof(*p), GFP_KERNEL);


2021-03-29 08:07:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 52/53] net: qrtr: fix a kernel-infoleak in qrtr_recvmsg()

From: Eric Dumazet <[email protected]>

commit 50535249f624d0072cd885bcdce4e4b6fb770160 upstream.

struct sockaddr_qrtr has a 2-byte hole, and qrtr_recvmsg() currently
does not clear it before copying kernel data to user space.

It might be too late to name the hole since sockaddr_qrtr structure is uapi.

BUG: KMSAN: kernel-infoleak in kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
CPU: 0 PID: 29705 Comm: syz-executor.3 Not tainted 5.11.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x21c/0x280 lib/dump_stack.c:120
kmsan_report+0xfb/0x1e0 mm/kmsan/kmsan_report.c:118
kmsan_internal_check_memory+0x202/0x520 mm/kmsan/kmsan.c:402
kmsan_copy_to_user+0x9c/0xb0 mm/kmsan/kmsan_hooks.c:249
instrument_copy_to_user include/linux/instrumented.h:121 [inline]
_copy_to_user+0x1ac/0x270 lib/usercopy.c:33
copy_to_user include/linux/uaccess.h:209 [inline]
move_addr_to_user+0x3a2/0x640 net/socket.c:237
____sys_recvmsg+0x696/0xd50 net/socket.c:2575
___sys_recvmsg net/socket.c:2610 [inline]
do_recvmmsg+0xa97/0x22d0 net/socket.c:2710
__sys_recvmmsg net/socket.c:2789 [inline]
__do_sys_recvmmsg net/socket.c:2812 [inline]
__se_sys_recvmmsg+0x24a/0x410 net/socket.c:2805
__x64_sys_recvmmsg+0x62/0x80 net/socket.c:2805
do_syscall_64+0x9f/0x140 arch/x86/entry/common.c:48
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x465f69
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f43659d6188 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 000000000056bf60 RCX: 0000000000465f69
RDX: 0000000000000008 RSI: 0000000020003e40 RDI: 0000000000000003
RBP: 00000000004bfa8f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000010060 R11: 0000000000000246 R12: 000000000056bf60
R13: 0000000000a9fb1f R14: 00007f43659d6300 R15: 0000000000022000

Local variable ----addr@____sys_recvmsg created at:
____sys_recvmsg+0x168/0xd50 net/socket.c:2550
____sys_recvmsg+0x168/0xd50 net/socket.c:2550

Bytes 2-3 of 12 are uninitialized
Memory access of size 12 starts at ffff88817c627b40
Data copied to user address 0000000020000140

Fixes: bdabad3e363d ("net: Add Qualcomm IPC router")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Courtney Cavin <[email protected]>
Reported-by: syzbot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/qrtr/qrtr.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/net/qrtr/qrtr.c
+++ b/net/qrtr/qrtr.c
@@ -728,6 +728,11 @@ static int qrtr_recvmsg(struct socket *s
rc = copied;

if (addr) {
+ /* There is an anonymous 2-byte hole after sq_family,
+ * make sure to clear it.
+ */
+ memset(addr, 0, sizeof(*addr));
+
addr->sq_family = AF_QIPCRTR;
addr->sq_node = le32_to_cpu(phdr->src_node_id);
addr->sq_port = le32_to_cpu(phdr->src_port_id);


2021-03-29 08:07:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 53/53] mac80211: fix double free in ibss_leave

From: Markus Theil <[email protected]>

commit 3bd801b14e0c5d29eeddc7336558beb3344efaa3 upstream.

Clear beacon ie pointer and ie length after free
in order to prevent double free.

==================================================================
BUG: KASAN: double-free or invalid-free \
in ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876

CPU: 0 PID: 8472 Comm: syz-executor100 Not tainted 5.11.0-rc6-syzkaller #0
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
print_address_description.constprop.0.cold+0x5b/0x2c6 mm/kasan/report.c:230
kasan_report_invalid_free+0x51/0x80 mm/kasan/report.c:355
____kasan_slab_free+0xcc/0xe0 mm/kasan/common.c:341
kasan_slab_free include/linux/kasan.h:192 [inline]
__cache_free mm/slab.c:3424 [inline]
kfree+0xed/0x270 mm/slab.c:3760
ieee80211_ibss_leave+0x83/0xe0 net/mac80211/ibss.c:1876
rdev_leave_ibss net/wireless/rdev-ops.h:545 [inline]
__cfg80211_leave_ibss+0x19a/0x4c0 net/wireless/ibss.c:212
__cfg80211_leave+0x327/0x430 net/wireless/core.c:1172
cfg80211_leave net/wireless/core.c:1221 [inline]
cfg80211_netdev_notifier_call+0x9e8/0x12c0 net/wireless/core.c:1335
notifier_call_chain+0xb5/0x200 kernel/notifier.c:83
call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:2040
call_netdevice_notifiers_extack net/core/dev.c:2052 [inline]
call_netdevice_notifiers net/core/dev.c:2066 [inline]
__dev_close_many+0xee/0x2e0 net/core/dev.c:1586
__dev_close net/core/dev.c:1624 [inline]
__dev_change_flags+0x2cb/0x730 net/core/dev.c:8476
dev_change_flags+0x8a/0x160 net/core/dev.c:8549
dev_ifsioc+0x210/0xa70 net/core/dev_ioctl.c:265
dev_ioctl+0x1b1/0xc40 net/core/dev_ioctl.c:511
sock_do_ioctl+0x148/0x2d0 net/socket.c:1060
sock_ioctl+0x477/0x6a0 net/socket.c:1177
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported-by: [email protected]
Signed-off-by: Markus Theil <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/mac80211/ibss.c | 2 ++
1 file changed, 2 insertions(+)

--- a/net/mac80211/ibss.c
+++ b/net/mac80211/ibss.c
@@ -1862,6 +1862,8 @@ int ieee80211_ibss_leave(struct ieee8021

/* remove beacon */
kfree(sdata->u.ibss.ie);
+ sdata->u.ibss.ie = NULL;
+ sdata->u.ibss.ie_len = 0;

/* on the next join, re-program HT parameters */
memset(&ifibss->ht_capa, 0, sizeof(ifibss->ht_capa));


2021-03-29 08:07:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 26/53] net/qlcnic: Fix a use after free in qlcnic_83xx_get_minidump_template

From: Lv Yunlong <[email protected]>

[ Upstream commit db74623a3850db99cb9692fda9e836a56b74198d ]

In qlcnic_83xx_get_minidump_template, fw_dump->tmpl_hdr was freed by
vfree(). But unfortunately, it is used when extended is true.

Fixes: 7061b2bdd620e ("qlogic: Deletion of unnecessary checks before two function calls")
Signed-off-by: Lv Yunlong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
index 5174e0bd75d1..625336264a44 100644
--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c
@@ -1426,6 +1426,7 @@ void qlcnic_83xx_get_minidump_template(struct qlcnic_adapter *adapter)

if (fw_dump->tmpl_hdr == NULL || current_version > prev_version) {
vfree(fw_dump->tmpl_hdr);
+ fw_dump->tmpl_hdr = NULL;

if (qlcnic_83xx_md_check_extended_dump_capability(adapter))
extended = !qlcnic_83xx_extend_md_capab(adapter);
@@ -1444,6 +1445,8 @@ void qlcnic_83xx_get_minidump_template(struct qlcnic_adapter *adapter)
struct qlcnic_83xx_dump_template_hdr *hdr;

hdr = fw_dump->tmpl_hdr;
+ if (!hdr)
+ return;
hdr->drv_cap_mask = 0x1f;
fw_dump->cap_mask = 0x1f;
dev_info(&pdev->dev,
--
2.30.1



2021-03-29 08:08:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 50/53] can: dev: Move device back to init netns on owning netns delete

From: Martin Willi <[email protected]>

commit 3a5ca857079ea022e0b1b17fc154f7ad7dbc150f upstream.

When a non-initial netns is destroyed, the usual policy is to delete
all virtual network interfaces contained, but move physical interfaces
back to the initial netns. This keeps the physical interface visible
on the system.

CAN devices are somewhat special, as they define rtnl_link_ops even
if they are physical devices. If a CAN interface is moved into a
non-initial netns, destroying that netns lets the interface vanish
instead of moving it back to the initial netns. default_device_exit()
skips CAN interfaces due to having rtnl_link_ops set. Reproducer:

ip netns add foo
ip link set can0 netns foo
ip netns delete foo

WARNING: CPU: 1 PID: 84 at net/core/dev.c:11030 ops_exit_list+0x38/0x60
CPU: 1 PID: 84 Comm: kworker/u4:2 Not tainted 5.10.19 #1
Workqueue: netns cleanup_net
[<c010e700>] (unwind_backtrace) from [<c010a1d8>] (show_stack+0x10/0x14)
[<c010a1d8>] (show_stack) from [<c086dc10>] (dump_stack+0x94/0xa8)
[<c086dc10>] (dump_stack) from [<c086b938>] (__warn+0xb8/0x114)
[<c086b938>] (__warn) from [<c086ba10>] (warn_slowpath_fmt+0x7c/0xac)
[<c086ba10>] (warn_slowpath_fmt) from [<c0629f20>] (ops_exit_list+0x38/0x60)
[<c0629f20>] (ops_exit_list) from [<c062a5c4>] (cleanup_net+0x230/0x380)
[<c062a5c4>] (cleanup_net) from [<c0142c20>] (process_one_work+0x1d8/0x438)
[<c0142c20>] (process_one_work) from [<c0142ee4>] (worker_thread+0x64/0x5a8)
[<c0142ee4>] (worker_thread) from [<c0148a98>] (kthread+0x148/0x14c)
[<c0148a98>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)

To properly restore physical CAN devices to the initial netns on owning
netns exit, introduce a flag on rtnl_link_ops that can be set by drivers.
For CAN devices setting this flag, default_device_exit() considers them
non-virtual, applying the usual namespace move.

The issue was introduced in the commit mentioned below, as at that time
CAN devices did not have a dellink() operation.

Fixes: e008b5fc8dc7 ("net: Simplfy default_device_exit and improve batching.")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin Willi <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/can/dev.c | 1 +
include/net/rtnetlink.h | 2 ++
net/core/dev.c | 2 +-
3 files changed, 4 insertions(+), 1 deletion(-)

--- a/drivers/net/can/dev.c
+++ b/drivers/net/can/dev.c
@@ -1084,6 +1084,7 @@ static void can_dellink(struct net_devic

static struct rtnl_link_ops can_link_ops __read_mostly = {
.kind = "can",
+ .netns_refund = true,
.maxtype = IFLA_CAN_MAX,
.policy = can_policy,
.setup = can_setup,
--- a/include/net/rtnetlink.h
+++ b/include/net/rtnetlink.h
@@ -28,6 +28,7 @@ static inline int rtnl_msg_family(const
*
* @list: Used internally
* @kind: Identifier
+ * @netns_refund: Physical device, move to init_net on netns exit
* @maxtype: Highest device specific netlink attribute number
* @policy: Netlink policy for device specific attribute validation
* @validate: Optional validation function for netlink/changelink parameters
@@ -84,6 +85,7 @@ struct rtnl_link_ops {
unsigned int (*get_num_tx_queues)(void);
unsigned int (*get_num_rx_queues)(void);

+ bool netns_refund;
int slave_maxtype;
const struct nla_policy *slave_policy;
int (*slave_validate)(struct nlattr *tb[],
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8300,7 +8300,7 @@ static void __net_exit default_device_ex
continue;

/* Leave virtual devices for the generic cleanup */
- if (dev->rtnl_link_ops)
+ if (dev->rtnl_link_ops && !dev->rtnl_link_ops->netns_refund)
continue;

/* Push remaining network devices to init_net */


2021-03-29 08:08:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 29/53] can: m_can: m_can_do_rx_poll(): fix extraneous msg loss warning

From: Torin Cooper-Bennun <[email protected]>

[ Upstream commit c0e399f3baf42279f48991554240af8c457535d1 ]

Message loss from RX FIFO 0 is already handled in
m_can_handle_lost_msg(), with netdev output included.

Removing this warning also improves driver performance under heavy
load, where m_can_do_rx_poll() may be called many times before this
interrupt is cleared, causing this message to be output many
times (thanks Mariusz Madej for this report).

Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Link: https://lore.kernel.org/r/[email protected]
Reported-by: Mariusz Madej <[email protected]>
Signed-off-by: Torin Cooper-Bennun <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/can/m_can/m_can.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
index 0bd7e7164796..197c27d8f584 100644
--- a/drivers/net/can/m_can/m_can.c
+++ b/drivers/net/can/m_can/m_can.c
@@ -428,9 +428,6 @@ static int m_can_do_rx_poll(struct net_device *dev, int quota)
}

while ((rxfs & RXFS_FFL_MASK) && (quota > 0)) {
- if (rxfs & RXFS_RFL)
- netdev_warn(dev, "Rx FIFO 0 Message Lost\n");
-
m_can_read_fifo(dev, rxfs);

quota--;
--
2.30.1



2021-03-29 08:08:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 23/53] net: dsa: bcm_sf2: Qualify phydev->dev_flags based on port

From: Florian Fainelli <[email protected]>

[ Upstream commit 47142ed6c34d544ae9f0463e58d482289cbe0d46 ]

Similar to commit 92696286f3bb37ba50e4bd8d1beb24afb759a799 ("net:
bcmgenet: Set phydev->dev_flags only for internal PHYs") we need to
qualify the phydev->dev_flags based on whether the port is connected to
an internal or external PHY otherwise we risk having a flags collision
with a completely different interpretation depending on the driver.

Fixes: aa9aef77c761 ("net: dsa: bcm_sf2: communicate integrated PHY revision to PHY driver")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/dsa/bcm_sf2.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 0c69d5858558..40b3adf7ad99 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -588,8 +588,10 @@ static u32 bcm_sf2_sw_get_phy_flags(struct dsa_switch *ds, int port)
* in bits 15:8 and the patch level in bits 7:0 which is exactly what
* the REG_PHY_REVISION register layout is.
*/
-
- return priv->hw_params.gphy_rev;
+ if (priv->int_phy_mask & BIT(port))
+ return priv->hw_params.gphy_rev;
+ else
+ return 0;
}

static void bcm_sf2_sw_adjust_link(struct dsa_switch *ds, int port,
--
2.30.1



2021-03-29 08:08:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 28/53] can: c_can: move runtime PM enable/disable to c_can_platform

From: Tong Zhang <[email protected]>

[ Upstream commit 6e2fe01dd6f98da6cae8b07cd5cfa67abc70d97d ]

Currently doing modprobe c_can_pci will make the kernel complain:

Unbalanced pm_runtime_enable!

this is caused by pm_runtime_enable() called before pm is initialized.

This fix is similar to 227619c3ff7c, move those pm_enable/disable code
to c_can_platform.

Fixes: 4cdd34b26826 ("can: c_can: Add runtime PM support to Bosch C_CAN/D_CAN controller")
Link: http://lore.kernel.org/r/[email protected]
Signed-off-by: Tong Zhang <[email protected]>
Tested-by: Uwe Kleine-König <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/can/c_can/c_can.c | 24 +-----------------------
drivers/net/can/c_can/c_can_platform.c | 6 +++++-
2 files changed, 6 insertions(+), 24 deletions(-)

diff --git a/drivers/net/can/c_can/c_can.c b/drivers/net/can/c_can/c_can.c
index 4ead5a18b794..c41ab2cb272e 100644
--- a/drivers/net/can/c_can/c_can.c
+++ b/drivers/net/can/c_can/c_can.c
@@ -212,18 +212,6 @@ static const struct can_bittiming_const c_can_bittiming_const = {
.brp_inc = 1,
};

-static inline void c_can_pm_runtime_enable(const struct c_can_priv *priv)
-{
- if (priv->device)
- pm_runtime_enable(priv->device);
-}
-
-static inline void c_can_pm_runtime_disable(const struct c_can_priv *priv)
-{
- if (priv->device)
- pm_runtime_disable(priv->device);
-}
-
static inline void c_can_pm_runtime_get_sync(const struct c_can_priv *priv)
{
if (priv->device)
@@ -1318,7 +1306,6 @@ static const struct net_device_ops c_can_netdev_ops = {

int register_c_can_dev(struct net_device *dev)
{
- struct c_can_priv *priv = netdev_priv(dev);
int err;

/* Deactivate pins to prevent DRA7 DCAN IP from being
@@ -1328,28 +1315,19 @@ int register_c_can_dev(struct net_device *dev)
*/
pinctrl_pm_select_sleep_state(dev->dev.parent);

- c_can_pm_runtime_enable(priv);
-
dev->flags |= IFF_ECHO; /* we support local echo */
dev->netdev_ops = &c_can_netdev_ops;

err = register_candev(dev);
- if (err)
- c_can_pm_runtime_disable(priv);
- else
+ if (!err)
devm_can_led_init(dev);
-
return err;
}
EXPORT_SYMBOL_GPL(register_c_can_dev);

void unregister_c_can_dev(struct net_device *dev)
{
- struct c_can_priv *priv = netdev_priv(dev);
-
unregister_candev(dev);
-
- c_can_pm_runtime_disable(priv);
}
EXPORT_SYMBOL_GPL(unregister_c_can_dev);

diff --git a/drivers/net/can/c_can/c_can_platform.c b/drivers/net/can/c_can/c_can_platform.c
index 717530eac70c..c6a03f565e3f 100644
--- a/drivers/net/can/c_can/c_can_platform.c
+++ b/drivers/net/can/c_can/c_can_platform.c
@@ -29,6 +29,7 @@
#include <linux/list.h>
#include <linux/io.h>
#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
#include <linux/clk.h>
#include <linux/of.h>
#include <linux/of_device.h>
@@ -385,6 +386,7 @@ static int c_can_plat_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, dev);
SET_NETDEV_DEV(dev, &pdev->dev);

+ pm_runtime_enable(priv->device);
ret = register_c_can_dev(dev);
if (ret) {
dev_err(&pdev->dev, "registering %s failed (err=%d)\n",
@@ -397,6 +399,7 @@ static int c_can_plat_probe(struct platform_device *pdev)
return 0;

exit_free_device:
+ pm_runtime_disable(priv->device);
free_c_can_dev(dev);
exit:
dev_err(&pdev->dev, "probe failed\n");
@@ -407,9 +410,10 @@ exit:
static int c_can_plat_remove(struct platform_device *pdev)
{
struct net_device *dev = platform_get_drvdata(pdev);
+ struct c_can_priv *priv = netdev_priv(dev);

unregister_c_can_dev(dev);
-
+ pm_runtime_disable(priv->device);
free_c_can_dev(dev);

return 0;
--
2.30.1



2021-03-29 08:09:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 44/53] futex: Fix (possible) missed wakeup

From: Peter Zijlstra <[email protected]>

commit b061c38bef43406df8e73c5be06cbfacad5ee6ad upstream.

We must not rely on wake_q_add() to delay the wakeup; in particular
commit:

1d0dcb3ad9d3 ("futex: Implement lockless wakeups")

moved wake_q_add() before smp_store_release(&q->lock_ptr, NULL), which
could result in futex_wait() waking before observing ->lock_ptr ==
NULL and going back to sleep again.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 1d0dcb3ad9d3 ("futex: Implement lockless wakeups")
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/futex.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1553,11 +1553,7 @@ static void mark_wake_futex(struct wake_
if (WARN(q->pi_state || q->rt_waiter, "refusing to wake PI futex\n"))
return;

- /*
- * Queue the task for later wakeup for after we've released
- * the hb->lock. wake_q_add() grabs reference to p.
- */
- wake_q_add(wake_q, p);
+ get_task_struct(p);
__unqueue_futex(q);
/*
* The waiting task can free the futex_q as soon as
@@ -1566,6 +1562,13 @@ static void mark_wake_futex(struct wake_
* store to lock_ptr from getting ahead of the plist_del.
*/
smp_store_release(&q->lock_ptr, NULL);
+
+ /*
+ * Queue the task for later wakeup for after we've released
+ * the hb->lock. wake_q_add() grabs reference to p.
+ */
+ wake_q_add(wake_q, p);
+ put_task_struct(p);
}

/*


2021-03-29 08:09:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 20/53] arm64: dts: ls1043a: mark crypto engine dma coherent

From: Horia Geantă <[email protected]>

commit 4fb3a074755b7737c4081cffe0ccfa08c2f2d29d upstream.

Crypto engine (CAAM) on LS1043A platform is configured HW-coherent,
mark accordingly the DT node.

Lack of "dma-coherent" property for an IP that is configured HW-coherent
can lead to problems, similar to what has been reported for LS1046A.

Cc: <[email protected]> # v4.8+
Fixes: 63dac35b58f4 ("arm64: dts: ls1043a: add crypto node")
Link: https://lore.kernel.org/linux-crypto/[email protected]
Signed-off-by: Horia Geantă <[email protected]>
Acked-by: Li Yang <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi | 1 +
1 file changed, 1 insertion(+)

--- a/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi
@@ -177,6 +177,7 @@
ranges = <0x0 0x00 0x1700000 0x100000>;
reg = <0x00 0x1700000 0x0 0x100000>;
interrupts = <0 75 0x4>;
+ dma-coherent;

sec_jr0: jr@10000 {
compatible = "fsl,sec-v5.4-job-ring",


2021-03-29 08:09:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 42/53] futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()

From: Peter Zijlstra <[email protected]>

commit 04dc1b2fff4e96cb4142227fbdc63c8871ad4ed9 upstream.

Markus reported that the glibc/nptl/tst-robustpi8 test was failing after
commit:

cfafcd117da0 ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()")

The following trace shows the problem:

ld-linux-x86-64-2161 [019] .... 410.760971: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_LOCK_PI
ld-linux-x86-64-2161 [019] ...1 410.760972: lock_pi_update_atomic: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000875 ret=0
ld-linux-x86-64-2165 [011] .... 410.760978: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_UNLOCK_PI
ld-linux-x86-64-2165 [011] d..1 410.760979: do_futex: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000871 ret=0
ld-linux-x86-64-2165 [011] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=0000
ld-linux-x86-64-2161 [019] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=ETIMEDOUT

Task 2165 does an UNLOCK_PI, assigning the lock to the waiter task 2161
which then returns with -ETIMEDOUT. That wrecks the lock state, because now
the owner isn't aware it acquired the lock and removes the pending robust
list entry.

If 2161 is killed, the robust list will not clear out this futex and the
subsequent acquire on this futex will then (correctly) result in -ESRCH
which is unexpected by glibc, triggers an internal assertion and dies.

Task 2161 Task 2165

rt_mutex_wait_proxy_lock()
timeout();
/* T2161 is still queued in the waiter list */
return -ETIMEDOUT;

futex_unlock_pi()
spin_lock(hb->lock);
rtmutex_unlock()
remove_rtmutex_waiter(T2161);
mark_lock_available();
/* Make the next waiter owner of the user space side */
futex_uval = 2161;
spin_unlock(hb->lock);
spin_lock(hb->lock);
rt_mutex_cleanup_proxy_lock()
if (rtmutex_owner() !== current)
...
return FAIL;
....
return -ETIMEOUT;

This means that rt_mutex_cleanup_proxy_lock() needs to call
try_to_take_rt_mutex() so it can take over the rtmutex correctly which was
assigned by the waker. If the rtmutex is owned by some other task then this
call is harmless and just confirmes that the waiter is not able to acquire
it.

While there, fix what looks like a merge error which resulted in
rt_mutex_cleanup_proxy_lock() having two calls to
fixup_rt_mutex_waiters() and rt_mutex_wait_proxy_lock() not having any.
Both should have one, since both potentially touch the waiter list.

Fixes: 38d589f2fd08 ("futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()")
Reported-by: Markus Trippelsdorf <[email protected]>
Bug-Spotted-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Florian Weimer <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: Markus Trippelsdorf <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/locking/rtmutex.c | 24 ++++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1796,12 +1796,14 @@ int rt_mutex_wait_proxy_lock(struct rt_m
int ret;

raw_spin_lock_irq(&lock->wait_lock);
-
- set_current_state(TASK_INTERRUPTIBLE);
-
/* sleep on the mutex */
+ set_current_state(TASK_INTERRUPTIBLE);
ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
-
+ /*
+ * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
+ * have to fix that up.
+ */
+ fixup_rt_mutex_waiters(lock);
raw_spin_unlock_irq(&lock->wait_lock);

return ret;
@@ -1833,15 +1835,25 @@ bool rt_mutex_cleanup_proxy_lock(struct

raw_spin_lock_irq(&lock->wait_lock);
/*
+ * Do an unconditional try-lock, this deals with the lock stealing
+ * state where __rt_mutex_futex_unlock() -> mark_wakeup_next_waiter()
+ * sets a NULL owner.
+ *
+ * We're not interested in the return value, because the subsequent
+ * test on rt_mutex_owner() will infer that. If the trylock succeeded,
+ * we will own the lock and it will have removed the waiter. If we
+ * failed the trylock, we're still not owner and we need to remove
+ * ourselves.
+ */
+ try_to_take_rt_mutex(lock, current, waiter);
+ /*
* Unless we're the owner; we're still enqueued on the wait_list.
* So check if we became owner, if not, take us off the wait_list.
*/
if (rt_mutex_owner(lock) != current) {
remove_waiter(lock, waiter);
- fixup_rt_mutex_waiters(lock);
cleanup = true;
}
-
/*
* try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
* have to fix that up.


2021-03-29 08:09:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 24/53] e1000e: add rtnl_lock() to e1000_reset_task

From: Vitaly Lifshits <[email protected]>

[ Upstream commit 21f857f0321d0d0ea9b1a758bd55dc63d1cb2437 ]

A possible race condition was found in e1000_reset_task,
after discovering a similar issue in igb driver via
commit 024a8168b749 ("igb: reinit_locked() should be called
with rtnl_lock").

Added rtnl_lock() and rtnl_unlock() to avoid this.

Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Vitaly Lifshits <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/e1000e/netdev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 3c01bc43889a..46323019aa63 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5920,15 +5920,19 @@ static void e1000_reset_task(struct work_struct *work)
struct e1000_adapter *adapter;
adapter = container_of(work, struct e1000_adapter, reset_task);

+ rtnl_lock();
/* don't run the task if already down */
- if (test_bit(__E1000_DOWN, &adapter->state))
+ if (test_bit(__E1000_DOWN, &adapter->state)) {
+ rtnl_unlock();
return;
+ }

if (!(adapter->flags & FLAG_RESTART_NOW)) {
e1000e_dump(adapter);
e_err("Reset adapter unexpectedly\n");
}
e1000e_reinit_locked(adapter);
+ rtnl_unlock();
}

/**
--
2.30.1



2021-03-29 08:09:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 27/53] can: c_can_pci: c_can_pci_remove(): fix use-after-free

From: Tong Zhang <[email protected]>

[ Upstream commit 0429d6d89f97ebff4f17f13f5b5069c66bde8138 ]

There is a UAF in c_can_pci_remove(). dev is released by
free_c_can_dev() and is used by pci_iounmap(pdev, priv->base) later.
To fix this issue, save the mmio address before releasing dev.

Fixes: 5b92da0443c2 ("c_can_pci: generic module for C_CAN/D_CAN on PCI")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Tong Zhang <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/can/c_can/c_can_pci.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/can/c_can/c_can_pci.c b/drivers/net/can/c_can/c_can_pci.c
index d065c0e2d18e..f3e0b2124a37 100644
--- a/drivers/net/can/c_can/c_can_pci.c
+++ b/drivers/net/can/c_can/c_can_pci.c
@@ -239,12 +239,13 @@ static void c_can_pci_remove(struct pci_dev *pdev)
{
struct net_device *dev = pci_get_drvdata(pdev);
struct c_can_priv *priv = netdev_priv(dev);
+ void __iomem *addr = priv->base;

unregister_c_can_dev(dev);

free_c_can_dev(dev);

- pci_iounmap(pdev, priv->base);
+ pci_iounmap(pdev, addr);
pci_disable_msi(pdev);
pci_clear_master(pdev);
pci_release_regions(pdev);
--
2.30.1



2021-03-29 08:09:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 25/53] e1000e: Fix error handling in e1000_set_d0_lplu_state_82571

From: Dinghao Liu <[email protected]>

[ Upstream commit b52912b8293f2c496f42583e65599aee606a0c18 ]

There is one e1e_wphy() call in e1000_set_d0_lplu_state_82571
that we have caught its return value but lack further handling.
Check and terminate the execution flow just like other e1e_wphy()
in this function.

Fixes: bc7f75fa9788 ("[E1000E]: New pci-express e1000 driver (currently for ICH9 devices only)")
Signed-off-by: Dinghao Liu <[email protected]>
Acked-by: Sasha Neftin <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/e1000e/82571.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/e1000e/82571.c b/drivers/net/ethernet/intel/e1000e/82571.c
index 6b03c8553e59..65deaf8f3004 100644
--- a/drivers/net/ethernet/intel/e1000e/82571.c
+++ b/drivers/net/ethernet/intel/e1000e/82571.c
@@ -917,6 +917,8 @@ static s32 e1000_set_d0_lplu_state_82571(struct e1000_hw *hw, bool active)
} else {
data &= ~IGP02E1000_PM_D0_LPLU;
ret_val = e1e_wphy(hw, IGP02E1000_PHY_POWER_MGMT, data);
+ if (ret_val)
+ return ret_val;
/* LPLU and SmartSpeed are mutually exclusive. LPLU is used
* during Dx states where the power conservation is most
* important. During driver activity we should enable
--
2.30.1



2021-03-29 18:49:22

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/53] 4.9.264-rc1 review

On 3/29/21 12:57 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.264 release.
> There are 53 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.264-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>
>
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:

Tested-by: Florian Fainelli <[email protected]>

and thanks to Ben's latest back ports, no longer seeing the do_futex
warning that I was seeing, thanks a lot Ben!
--
Florian

2021-03-29 21:34:54

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/53] 4.9.264-rc1 review

On Mon, Mar 29, 2021 at 09:57:35AM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.264 release.
> There are 53 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000.
> Anything received after that time might be too late.
>

Build results:
total: 163 pass: 163 fail: 0
Qemu test results:
total: 381 pass: 381 fail: 0

Tested-by: Guenter Roeck <[email protected]>

Guenter

2021-03-30 01:31:44

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/53] 4.9.264-rc1 review

On 3/29/21 1:57 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.264 release.
> There are 53 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.264-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2021-03-30 07:07:36

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/53] 4.9.264-rc1 review

On Mon, 29 Mar 2021 at 13:33, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 4.9.264 release.
> There are 53 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed, 31 Mar 2021 07:55:56 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.264-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <[email protected]>

Summary
------------------------------------------------------------------------

kernel: 4.9.264-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: 3c2295cc6be320f7598a3ae9da3abc98230bd931
git describe: v4.9.263-54-g3c2295cc6be3
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-4.9.y/build/v4.9.263-54-g3c2295cc6be3

No regressions (compared to build v4.9.263)

No fixes (compared to build v4.9.263)

Ran 49047 total tests in the following environments and test suites.

Environments
--------------
- arm
- arm64
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-64k_page_size
- juno-r2 - arm64
- juno-r2-compat
- juno-r2-kasan
- mips
- qemu-arm-debug
- qemu-arm64-debug
- qemu-arm64-kasan
- qemu-i386-debug
- qemu-x86_64-debug
- qemu-x86_64-kasan
- qemu_arm
- qemu_arm64
- qemu_arm64-compat
- qemu_i386
- qemu_x86_64
- qemu_x86_64-compat
- sparc
- x15 - arm
- x86_64
- x86-kasan
- x86_64

Test Suites
-----------
* build
* linux-log-parser
* igt-gpu-tools
* install-android-platform-tools-r2600
* kselftest-android
* kselftest-bpf
* kselftest-capabilities
* kselftest-cgroup
* kselftest-clone3
* kselftest-core
* kselftest-cpu-hotplug
* kselftest-cpufreq
* kselftest-efivarfs
* kselftest-filesystems
* kselftest-firmware
* kselftest-fpu
* kselftest-futex
* kselftest-gpio
* kselftest-intel_pstate
* kselftest-ipc
* kselftest-ir
* kselftest-kcmp
* kselftest-kvm
* kselftest-livepatch
* kselftest-lkdtm
* kselftest-ptrace
* kselftest-rseq
* kselftest-rtc
* kselftest-seccomp
* kselftest-sigaltstack
* kselftest-size
* kselftest-splice
* kselftest-static_keys
* kselftest-sysctl
* kvm-unit-tests
* libhugetlbfs
* ltp-commands-tests
* ltp-containers-tests
* ltp-controllers-tests
* ltp-cve-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-mm-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* v4l2-compliance
* fwts
* kselftest-lib
* kselftest-membarrier
* kselftest-timens
* kselftest-timers
* kselftest-tmpfs
* kselftest-tpm2
* kselftest-user
* kselftest-zram
* ltp-cap_bounds-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* network-basic-tests
* perf
* kselftest-kexec
* kselftest-sync
* kselftest-vm
* kselftest-x86
* ltp-open-posix-tests
* ssuite

--
Linaro LKFT
https://lkft.linaro.org