2018-11-19 16:54:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 00/83] 4.9.138-stable review

This is the start of the stable review cycle for the 4.9.138 release.
There are 83 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Wed Nov 21 16:25:13 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.138-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.138-rc1

Mark Rutland <[email protected]>
KVM: arm64: Fix caching of host MDCR_EL2 value

Chris Wilson <[email protected]>
drm/i915/execlists: Force write serialisation into context image vs execution

Clint Taylor <[email protected]>
drm/i915/hdmi: Add HDMI 2.0 audio clock recovery N values

Stanislav Lisovskiy <[email protected]>
drm/dp_mst: Check if primary mstb is null

Marc Zyngier <[email protected]>
drm/rockchip: Allow driver to be shutdown on reboot/kexec

Mike Kravetz <[email protected]>
mm: migration: fix migration of huge PMD shared pages

Mike Kravetz <[email protected]>
hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!

Arnd Bergmann <[email protected]>
lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn

Guenter Roeck <[email protected]>
configfs: replace strncpy with memcpy

Miklos Szeredi <[email protected]>
fuse: fix leaked notify reply

Lukas Czerner <[email protected]>
fuse: fix use-after-free in fuse_direct_IO()

Maciej W. Rozycki <[email protected]>
rtc: hctosys: Add missing range error reporting

Scott Mayhew <[email protected]>
nfsd: COPY and CLONE operations require the saved filehandle to be set

Frank Sorenson <[email protected]>
sunrpc: correct the computation for page_ptr when truncating

Eric W. Biederman <[email protected]>
mount: Prevent MNT_DETACH from disconnecting locked mounts

Eric W. Biederman <[email protected]>
mount: Don't allow copying MNT_UNBINDABLE|MNT_LOCKED mounts

Eric W. Biederman <[email protected]>
mount: Retest MNT_LOCKED in do_umount

Vasily Averin <[email protected]>
ext4: fix buffer leak in __ext4_read_dirblock() on error path

Vasily Averin <[email protected]>
ext4: fix buffer leak in ext4_xattr_move_to_block() on error path

Vasily Averin <[email protected]>
ext4: release bs.bh before re-using in ext4_xattr_block_find()

Vasily Averin <[email protected]>
ext4: fix possible leak of s_journal_flag_rwsem in error path

Theodore Ts'o <[email protected]>
ext4: fix possible leak of sbi->s_group_desc_leak in error path

Theodore Ts'o <[email protected]>
ext4: avoid possible double brelse() in add_new_gdb() on error path

Vasily Averin <[email protected]>
ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing

Vasily Averin <[email protected]>
ext4: avoid buffer leak in ext4_orphan_add() after prior errors

Vasily Averin <[email protected]>
ext4: fix possible inode leak in the retry loop of ext4_resize_fs()

Vasily Averin <[email protected]>
ext4: avoid potential extra brelse in setup_new_flex_group_blocks()

Vasily Averin <[email protected]>
ext4: add missing brelse() add_new_gdb_meta_bg()'s error path

Vasily Averin <[email protected]>
ext4: add missing brelse() in set_flexbg_block_bitmap()'s error path

Vasily Averin <[email protected]>
ext4: add missing brelse() update_backups()'s error path

Michael Kelley <[email protected]>
clockevents/drivers/i8253: Add support for PIT shutdown quirk

Filipe Manana <[email protected]>
Btrfs: fix data corruption due to cloning of eof block

Robbie Ko <[email protected]>
Btrfs: fix cur_offset in the error case for nocow

H. Peter Anvin (Intel) <[email protected]>
arch/alpha, termios: implement BOTHER, IBSHIFT and termios2

H. Peter Anvin <[email protected]>
termios, tty/tty_baudrate.c: fix buffer overrun

John Garry <[email protected]>
of, numa: Validate some distance map rules

Arnd Bergmann <[email protected]>
mtd: docg3: don't set conflicting BCH_CONST_PARAMS option

Vasily Khoruzhick <[email protected]>
netfilter: conntrack: fix calculation of next bucket number in early_drop

Andrea Arcangeli <[email protected]>
mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

Changwei Ge <[email protected]>
ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry

Greg Edwards <[email protected]>
vhost/scsi: truncate T10 PI iov_iter to prot_bytes

Gustavo A. R. Silva <[email protected]>
reset: hisilicon: fix potential NULL pointer dereference

Mikulas Patocka <[email protected]>
mach64: fix image corruption due to reading accelerator registers

Mikulas Patocka <[email protected]>
mach64: fix display corruption on big endian machines

Yan, Zheng <[email protected]>
Revert "ceph: fix dentry leak in splice_dentry()"

Ilya Dryomov <[email protected]>
libceph: bump CEPH_MSG_MAX_DATA_LEN

Enric Balletbo i Serra <[email protected]>
clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call

Ronald Wahl <[email protected]>
clk: at91: Fix division by zero in PLL recalc_rate()

Krzysztof Kozlowski <[email protected]>
clk: s2mps11: Fix matching when built as module and DT node contains compatible

Max Filippov <[email protected]>
xtensa: fix boot parameters address translation

Max Filippov <[email protected]>
xtensa: make sure bFLT stack is 16 byte aligned

Max Filippov <[email protected]>
xtensa: add NOTES section to the linker script

Huacai Chen <[email protected]>
MIPS: Loongson-3: Fix BRIDGE irq delivery problem

Huacai Chen <[email protected]>
MIPS: Loongson-3: Fix CPU UART irq delivery problem

Helge Deller <[email protected]>
parisc: Fix exported address of os_hpmc handler

Helge Deller <[email protected]>
parisc: Fix HPMC handler by increasing size to multiple of 16 bytes

Helge Deller <[email protected]>
parisc: Align os_hpmc_size on word boundary

Kees Cook <[email protected]>
bna: ethtool: Avoid reading past end of buffer

Vincenzo Maffione <[email protected]>
e1000: fix race condition between e1000_down() and e1000_watchdog

Colin Ian King <[email protected]>
e1000: avoid null pointer dereference on invalid stat type

Michal Hocko <[email protected]>
mm: do not bug_on on incorrect length in __mm_populate()

Miklos Szeredi <[email protected]>
fuse: set FR_SENT while locked

Miklos Szeredi <[email protected]>
fuse: fix blocked_waitq wakeup

Kirill Tkhai <[email protected]>
fuse: Fix use-after-free in fuse_dev_do_write()

Kirill Tkhai <[email protected]>
fuse: Fix use-after-free in fuse_dev_do_read()

Quinn Tran <[email protected]>
scsi: qla2xxx: shutdown chip if reset fail

Himanshu Madhani <[email protected]>
scsi: qla2xxx: Fix incorrect port speed being set for FC adapters

Young_X <[email protected]>
cdrom: fix improper type cast, which can leat to information leak.

Dominique Martinet <[email protected]>
9p: clear dangling pointers in p9stat_free

Dominique Martinet <[email protected]>
9p locks: fix glock.client_id leak in do_lock

Breno Leitao <[email protected]>
powerpc/selftests: Wait all threads to join

Marco Felsch <[email protected]>
media: tvp5150: fix width alignment during set_selection()

Phil Elwell <[email protected]>
sc16is7xx: Fix for multi-channel stall

Huacai Chen <[email protected]>
MIPS/PCI: Call pcie_bus_configure_settings() to set MPS/MRRS

Joel Stanley <[email protected]>
powerpc/boot: Ensure _zimage_start is a weak symbol

Dengcheng Zhu <[email protected]>
MIPS: kexec: Mark CPU offline before disabling local IRQ

Nicholas Mc Guire <[email protected]>
media: pci: cx23885: handle adding to list failure

Tomi Valkeinen <[email protected]>
drm/omap: fix memory barrier bug in DMM driver

Daniel Axtens <[email protected]>
powerpc/nohash: fix undefined behaviour when testing page size support

Fabio Estevam <[email protected]>
ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL

Miles Chen <[email protected]>
tty: check name length in tty_find_polling_driver()

Sam Bobroff <[email protected]>
powerpc/eeh: Fix possible null deref in eeh_dump_dev_log()


-------------

Diffstat:

Makefile | 4 +-
arch/alpha/include/asm/termios.h | 8 +++-
arch/alpha/include/uapi/asm/ioctls.h | 5 ++
arch/alpha/include/uapi/asm/termbits.h | 17 +++++++
arch/arm/configs/imx_v6_v7_defconfig | 1 +
arch/arm/kvm/arm.c | 4 +-
arch/mips/include/asm/mach-loongson64/irq.h | 2 +-
arch/mips/kernel/crash.c | 3 ++
arch/mips/kernel/machine_kexec.c | 3 ++
arch/mips/loongson64/loongson-3/irq.c | 56 +++-------------------
arch/mips/pci/pci-legacy.c | 4 ++
arch/parisc/kernel/hpmc.S | 10 ++--
arch/powerpc/boot/crt0.S | 4 +-
arch/powerpc/kernel/eeh.c | 5 ++
arch/powerpc/mm/tlb_nohash.c | 3 ++
arch/xtensa/boot/Makefile | 2 +-
arch/xtensa/include/asm/processor.h | 6 ++-
arch/xtensa/kernel/head.S | 7 ++-
arch/xtensa/kernel/vmlinux.lds.S | 1 +
drivers/cdrom/cdrom.c | 2 +-
drivers/clk/at91/clk-pll.c | 3 ++
drivers/clk/clk-s2mps11.c | 30 ++++++++++++
drivers/clk/hisilicon/reset.c | 5 +-
drivers/clk/rockchip/clk-ddr.c | 4 --
drivers/clocksource/i8253.c | 14 +++++-
drivers/gpu/drm/drm_dp_mst_topology.c | 3 ++
drivers/gpu/drm/i915/intel_audio.c | 17 +++++++
drivers/gpu/drm/i915/intel_lrc.c | 14 +++++-
drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 11 +++++
drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 6 +++
drivers/media/i2c/tvp5150.c | 14 ++++--
drivers/media/pci/cx23885/altera-ci.c | 10 ++++
drivers/mtd/devices/Kconfig | 2 +-
drivers/net/ethernet/brocade/bna/bnad_ethtool.c | 4 +-
drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 9 ++--
drivers/net/ethernet/intel/e1000/e1000_main.c | 11 ++++-
drivers/of/of_numa.c | 9 +++-
drivers/rtc/hctosys.c | 4 +-
drivers/scsi/qla2xxx/qla_init.c | 2 +-
drivers/scsi/qla2xxx/qla_mbx.c | 5 +-
drivers/tty/serial/sc16is7xx.c | 19 +++++---
drivers/tty/tty_io.c | 2 +-
drivers/tty/tty_ioctl.c | 4 +-
drivers/vhost/scsi.c | 4 +-
drivers/video/fbdev/aty/mach64_accel.c | 28 +++++------
fs/9p/vfs_file.c | 16 ++++++-
fs/btrfs/inode.c | 5 +-
fs/btrfs/ioctl.c | 12 ++++-
fs/ceph/inode.c | 8 +++-
fs/configfs/symlink.c | 2 +-
fs/ext4/namei.c | 5 +-
fs/ext4/resize.c | 28 ++++++-----
fs/ext4/super.c | 17 +++----
fs/ext4/xattr.c | 4 ++
fs/fuse/dev.c | 29 +++++++++---
fs/fuse/file.c | 4 +-
fs/namespace.c | 22 +++++++--
fs/nfsd/nfs4proc.c | 3 ++
fs/ocfs2/dir.c | 3 +-
include/linux/ceph/libceph.h | 8 +++-
include/linux/hugetlb.h | 14 ++++++
include/linux/i8253.h | 1 +
include/linux/mm.h | 6 +++
lib/ubsan.c | 3 +-
mm/gup.c | 2 -
mm/hugetlb.c | 60 +++++++++++++++++++++---
mm/mempolicy.c | 32 ++++++++++++-
mm/mmap.c | 19 ++++----
mm/rmap.c | 56 ++++++++++++++++++++++
net/9p/protocol.c | 5 ++
net/netfilter/nf_conntrack_core.c | 13 +++--
net/sunrpc/xdr.c | 5 +-
tools/testing/selftests/powerpc/tm/tm-tmspr.c | 27 +++++++----
73 files changed, 580 insertions(+), 210 deletions(-)




2018-11-19 16:53:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 01/83] powerpc/eeh: Fix possible null deref in eeh_dump_dev_log()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sam Bobroff <[email protected]>

[ Upstream commit f9bc28aedfb5bbd572d2d365f3095c1becd7209b ]

If an error occurs during an unplug operation, it's possible for
eeh_dump_dev_log() to be called when edev->pdn is null, which
currently leads to dereferencing a null pointer.

Handle this by skipping the error log for those devices.

Signed-off-by: Sam Bobroff <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/kernel/eeh.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -169,6 +169,11 @@ static size_t eeh_dump_dev_log(struct ee
int n = 0, l = 0;
char buffer[128];

+ if (!pdn) {
+ pr_warn("EEH: Note: No error log for absent device.\n");
+ return 0;
+ }
+
n += scnprintf(buf+n, len-n, "%04x:%02x:%02x.%01x\n",
edev->phb->global_number, pdn->busno,
PCI_SLOT(pdn->devfn), PCI_FUNC(pdn->devfn));



2018-11-19 16:53:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 11/83] media: tvp5150: fix width alignment during set_selection()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marco Felsch <[email protected]>

[ Upstream commit bd24db04101f45a9c1d874fe21b0c7eab7bcadec ]

The driver ignored the width alignment which exists due to the UYVY
colorspace format. Fix the width alignment and make use of the the
provided v4l2 helper function to set the width, height and all
alignments in one.

Fixes: 963ddc63e20d ("[media] media: tvp5150: Add cropping support")

Signed-off-by: Marco Felsch <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/i2c/tvp5150.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

--- a/drivers/media/i2c/tvp5150.c
+++ b/drivers/media/i2c/tvp5150.c
@@ -897,9 +897,6 @@ static int tvp5150_set_selection(struct

/* tvp5150 has some special limits */
rect.left = clamp(rect.left, 0, TVP5150_MAX_CROP_LEFT);
- rect.width = clamp_t(unsigned int, rect.width,
- TVP5150_H_MAX - TVP5150_MAX_CROP_LEFT - rect.left,
- TVP5150_H_MAX - rect.left);
rect.top = clamp(rect.top, 0, TVP5150_MAX_CROP_TOP);

/* Calculate height based on current standard */
@@ -913,9 +910,16 @@ static int tvp5150_set_selection(struct
else
hmax = TVP5150_V_MAX_OTHERS;

- rect.height = clamp_t(unsigned int, rect.height,
+ /*
+ * alignments:
+ * - width = 2 due to UYVY colorspace
+ * - height, image = no special alignment
+ */
+ v4l_bound_align_image(&rect.width,
+ TVP5150_H_MAX - TVP5150_MAX_CROP_LEFT - rect.left,
+ TVP5150_H_MAX - rect.left, 1, &rect.height,
hmax - TVP5150_MAX_CROP_TOP - rect.top,
- hmax - rect.top);
+ hmax - rect.top, 0, 0);

tvp5150_write(sd, TVP5150_VERT_BLANKING_START, rect.top);
tvp5150_write(sd, TVP5150_VERT_BLANKING_STOP,



2018-11-19 16:53:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 12/83] powerpc/selftests: Wait all threads to join

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Breno Leitao <[email protected]>

[ Upstream commit 693b31b2fc1636f0aa7af53136d3b49f6ad9ff39 ]

Test tm-tmspr might exit before all threads stop executing, because it just
waits for the very last thread to join before proceeding/exiting.

This patch makes sure that all threads that were created will join before
proceeding/exiting.

This patch also guarantees that the amount of threads being created is equal
to thread_num.

Signed-off-by: Breno Leitao <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
tools/testing/selftests/powerpc/tm/tm-tmspr.c | 27 ++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)

--- a/tools/testing/selftests/powerpc/tm/tm-tmspr.c
+++ b/tools/testing/selftests/powerpc/tm/tm-tmspr.c
@@ -98,7 +98,7 @@ void texasr(void *in)

int test_tmspr()
{
- pthread_t thread;
+ pthread_t *thread;
int thread_num;
unsigned long i;

@@ -107,21 +107,28 @@ int test_tmspr()
/* To cause some context switching */
thread_num = 10 * sysconf(_SC_NPROCESSORS_ONLN);

+ thread = malloc(thread_num * sizeof(pthread_t));
+ if (thread == NULL)
+ return EXIT_FAILURE;
+
/* Test TFIAR and TFHAR */
- for (i = 0 ; i < thread_num ; i += 2){
- if (pthread_create(&thread, NULL, (void*)tfiar_tfhar, (void *)i))
+ for (i = 0; i < thread_num; i += 2) {
+ if (pthread_create(&thread[i], NULL, (void *)tfiar_tfhar,
+ (void *)i))
return EXIT_FAILURE;
}
- if (pthread_join(thread, NULL) != 0)
- return EXIT_FAILURE;
-
/* Test TEXASR */
- for (i = 0 ; i < thread_num ; i++){
- if (pthread_create(&thread, NULL, (void*)texasr, (void *)i))
+ for (i = 1; i < thread_num; i += 2) {
+ if (pthread_create(&thread[i], NULL, (void *)texasr, (void *)i))
return EXIT_FAILURE;
}
- if (pthread_join(thread, NULL) != 0)
- return EXIT_FAILURE;
+
+ for (i = 0; i < thread_num; i++) {
+ if (pthread_join(thread[i], NULL) != 0)
+ return EXIT_FAILURE;
+ }
+
+ free(thread);

if (passed)
return 0;



2018-11-19 16:54:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 13/83] 9p locks: fix glock.client_id leak in do_lock

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dominique Martinet <[email protected]>

[ Upstream commit b4dc44b3cac9e8327e0655f530ed0c46f2e6214c ]

the 9p client code overwrites our glock.client_id pointing to a static
buffer by an allocated string holding the network provided value which
we do not care about; free and reset the value as appropriate.

This is almost identical to the leak in v9fs_file_getlock() fixed by
Al Viro in commit ce85dd58ad5a6 ("9p: we are leaking glock.client_id
in v9fs_file_getlock()"), which was returned as an error by a coverity
false positive -- while we are here attempt to make the code slightly
more robust to future change of the net/9p/client code and hopefully
more clear to coverity that there is no problem.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dominique Martinet <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
fs/9p/vfs_file.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -204,6 +204,14 @@ static int v9fs_file_do_lock(struct file
break;
if (schedule_timeout_interruptible(P9_LOCK_TIMEOUT) != 0)
break;
+ /*
+ * p9_client_lock_dotl overwrites flock.client_id with the
+ * server message, free and reuse the client name
+ */
+ if (flock.client_id != fid->clnt->name) {
+ kfree(flock.client_id);
+ flock.client_id = fid->clnt->name;
+ }
}

/* map 9p status to VFS status */
@@ -235,6 +243,8 @@ out_unlock:
locks_lock_file_wait(filp, fl);
fl->fl_type = fl_type;
}
+ if (flock.client_id != fid->clnt->name)
+ kfree(flock.client_id);
out:
return res;
}
@@ -269,7 +279,7 @@ static int v9fs_file_getlock(struct file

res = p9_client_getlock_dotl(fid, &glock);
if (res < 0)
- return res;
+ goto out;
/* map 9p lock type to os lock type */
switch (glock.type) {
case P9_LOCK_TYPE_RDLCK:
@@ -290,7 +300,9 @@ static int v9fs_file_getlock(struct file
fl->fl_end = glock.start + glock.length - 1;
fl->fl_pid = glock.proc_id;
}
- kfree(glock.client_id);
+out:
+ if (glock.client_id != fid->clnt->name)
+ kfree(glock.client_id);
return res;
}




2018-11-19 16:54:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 16/83] scsi: qla2xxx: Fix incorrect port speed being set for FC adapters

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Himanshu Madhani <[email protected]>

commit 4c1458df9635c7e3ced155f594d2e7dfd7254e21 upstream.

Fixes: 6246b8a1d26c7c ("[SCSI] qla2xxx: Enhancements to support ISP83xx.")
Fixes: 1bb395485160d2 ("qla2xxx: Correct iiDMA-update calling conventions.")
Cc: <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/qla2xxx/qla_mbx.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -3580,10 +3580,7 @@ qla2x00_set_idma_speed(scsi_qla_host_t *
mcp->mb[0] = MBC_PORT_PARAMS;
mcp->mb[1] = loop_id;
mcp->mb[2] = BIT_0;
- if (IS_CNA_CAPABLE(vha->hw))
- mcp->mb[3] = port_speed & (BIT_5|BIT_4|BIT_3|BIT_2|BIT_1|BIT_0);
- else
- mcp->mb[3] = port_speed & (BIT_2|BIT_1|BIT_0);
+ mcp->mb[3] = port_speed & (BIT_5|BIT_4|BIT_3|BIT_2|BIT_1|BIT_0);
mcp->mb[9] = vha->vp_idx;
mcp->out_mb = MBX_9|MBX_3|MBX_2|MBX_1|MBX_0;
mcp->in_mb = MBX_3|MBX_1|MBX_0;



2018-11-19 16:54:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 18/83] fuse: Fix use-after-free in fuse_dev_do_read()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kirill Tkhai <[email protected]>

commit bc78abbd55dd28e2287ec6d6502b842321a17c87 upstream.

We may pick freed req in this way:

[cpu0] [cpu1]
fuse_dev_do_read() fuse_dev_do_write()
list_move_tail(&req->list, ...); ...
spin_unlock(&fpq->lock); ...
... request_end(fc, req);
... fuse_put_request(fc, req);
if (test_bit(FR_INTERRUPTED, ...))
queue_interrupt(fiq, req);

Fix that by keeping req alive until we finish all manipulations.

Reported-by: [email protected]
Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts")
Cc: <[email protected]> # v4.2
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/fuse/dev.c | 2 ++
1 file changed, 2 insertions(+)

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1303,12 +1303,14 @@ static ssize_t fuse_dev_do_read(struct f
goto out_end;
}
list_move_tail(&req->list, &fpq->processing);
+ __fuse_get_request(req);
spin_unlock(&fpq->lock);
set_bit(FR_SENT, &req->flags);
/* matches barrier in request_wait_answer() */
smp_mb__after_atomic();
if (test_bit(FR_INTERRUPTED, &req->flags))
queue_interrupt(fiq, req);
+ fuse_put_request(fc, req);

return reqsize;




2018-11-19 16:54:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 21/83] fuse: set FR_SENT while locked

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <[email protected]>

commit 4c316f2f3ff315cb48efb7435621e5bfb81df96d upstream.

Otherwise fuse_dev_do_write() could come in and finish off the request, and
the set_bit(FR_SENT, ...) could trigger the WARN_ON(test_bit(FR_SENT, ...))
in request_end().

Signed-off-by: Miklos Szeredi <[email protected]>
Reported-by: [email protected]
Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts")
Cc: <[email protected]> # v4.2
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/fuse/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1311,8 +1311,8 @@ static ssize_t fuse_dev_do_read(struct f
}
list_move_tail(&req->list, &fpq->processing);
__fuse_get_request(req);
- spin_unlock(&fpq->lock);
set_bit(FR_SENT, &req->flags);
+ spin_unlock(&fpq->lock);
/* matches barrier in request_wait_answer() */
smp_mb__after_atomic();
if (test_bit(FR_INTERRUPTED, &req->flags))



2018-11-19 16:54:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 22/83] mm: do not bug_on on incorrect length in __mm_populate()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

commit bb177a732c4369bb58a1fe1df8f552b6f0f7db5f upstream.

syzbot has noticed that a specially crafted library can easily hit
VM_BUG_ON in __mm_populate

kernel BUG at mm/gup.c:1242!
invalid opcode: 0000 [#1] SMP
CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
RIP: 0010:__mm_populate+0x1e2/0x1f0
Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
Call Trace:
vm_brk_flags+0xc3/0x100
vm_brk+0x1f/0x30
load_elf_library+0x281/0x2e0
__ia32_sys_uselib+0x170/0x1e0
do_fast_syscall_32+0xca/0x420
entry_SYSENTER_compat+0x70/0x7f

The reason is that the length of the new brk is not page aligned when we
try to populate the it. There is no reason to bug on that though.
do_brk_flags already aligns the length properly so the mapping is
expanded as it should. All we need is to tell mm_populate about it.
Besides that there is absolutely no reason to to bug_on in the first
place. The worst thing that could happen is that the last page wouldn't
get populated and that is far from putting system into an inconsistent
state.

Fix the issue by moving the length sanitization code from do_brk_flags
up to vm_brk_flags. The only other caller of do_brk_flags is brk
syscall entry and it makes sure to provide the proper length so t here
is no need for sanitation and so we can use do_brk_flags without it.

Also remove the bogus BUG_ONs.

[[email protected]: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: syzbot <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
Reviewed-by: Oscar Salvador <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Al Viro <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
[bwh: Backported to 4.9:
- There is no do_brk_flags() function; update do_brk()
- Adjust context]
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
mm/gup.c | 2 --
mm/mmap.c | 19 ++++++++++---------
2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index be4ccddac26f..d71da7216c6e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1122,8 +1122,6 @@ int __mm_populate(unsigned long start, unsigned long len, int ignore_errors)
int locked = 0;
long ret = 0;

- VM_BUG_ON(start & ~PAGE_MASK);
- VM_BUG_ON(len != PAGE_ALIGN(len));
end = start + len;

for (nstart = start; nstart < end; nstart = nend) {
diff --git a/mm/mmap.c b/mm/mmap.c
index aa97074a4a99..283755645d17 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2876,21 +2876,15 @@ static inline void verify_mm_writelocked(struct mm_struct *mm)
* anonymous maps. eventually we may be able to do some
* brk-specific accounting here.
*/
-static int do_brk(unsigned long addr, unsigned long request)
+static int do_brk(unsigned long addr, unsigned long len)
{
struct mm_struct *mm = current->mm;
struct vm_area_struct *vma, *prev;
- unsigned long flags, len;
+ unsigned long flags;
struct rb_node **rb_link, *rb_parent;
pgoff_t pgoff = addr >> PAGE_SHIFT;
int error;

- len = PAGE_ALIGN(request);
- if (len < request)
- return -ENOMEM;
- if (!len)
- return 0;
-
flags = VM_DATA_DEFAULT_FLAGS | VM_ACCOUNT | mm->def_flags;

error = get_unmapped_area(NULL, addr, len, 0, MAP_FIXED);
@@ -2959,12 +2953,19 @@ static int do_brk(unsigned long addr, unsigned long request)
return 0;
}

-int vm_brk(unsigned long addr, unsigned long len)
+int vm_brk(unsigned long addr, unsigned long request)
{
struct mm_struct *mm = current->mm;
+ unsigned long len;
int ret;
bool populate;

+ len = PAGE_ALIGN(request);
+ if (len < request)
+ return -ENOMEM;
+ if (!len)
+ return 0;
+
if (down_write_killable(&mm->mmap_sem))
return -EINTR;

--
2.17.1




2018-11-19 16:54:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 03/83] ARM: imx_v6_v7_defconfig: Select CONFIG_TMPFS_POSIX_ACL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Fabio Estevam <[email protected]>

[ Upstream commit 35d3cbe84544da74e39e1cec01374092467e3119 ]

Andreas Müller reports:

"Fixes:

| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[220]: Failed to apply ACL on /dev/v4l-subdev0: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[224]: Failed to apply ACL on /dev/v4l-subdev1: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[215]: Failed to apply ACL on /dev/v4l-subdev10: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[228]: Failed to apply ACL on /dev/v4l-subdev2: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[232]: Failed to apply ACL on /dev/v4l-subdev5: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[217]: Failed to apply ACL on /dev/v4l-subdev11: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[214]: Failed to apply ACL on /dev/dri/card1: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[216]: Failed to apply ACL on /dev/v4l-subdev8: Operation not supported
| Sep 04 09:05:10 imx6qdl-variscite-som systemd-udevd[226]: Failed to apply ACL on /dev/v4l-subdev9: Operation not supported

and nasty follow-ups: Starting weston from sddm as unpriviledged user fails
with some hints on missing access rights."

Select the CONFIG_TMPFS_POSIX_ACL option to fix these issues.

Reported-by: Andreas Müller <[email protected]>
Signed-off-by: Fabio Estevam <[email protected]>
Acked-by: Otavio Salvador <[email protected]>
Signed-off-by: Shawn Guo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/arm/configs/imx_v6_v7_defconfig | 1 +
1 file changed, 1 insertion(+)

--- a/arch/arm/configs/imx_v6_v7_defconfig
+++ b/arch/arm/configs/imx_v6_v7_defconfig
@@ -361,6 +361,7 @@ CONFIG_ZISOFS=y
CONFIG_UDF_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=y
+CONFIG_TMPFS_POSIX_ACL=y
CONFIG_JFFS2_FS=y
CONFIG_UBIFS_FS=y
CONFIG_NFS_FS=y



2018-11-19 16:54:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 09/83] MIPS/PCI: Call pcie_bus_configure_settings() to set MPS/MRRS

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Huacai Chen <[email protected]>

[ Upstream commit 2794f688b2c336e0da85e9f91fed33febbd9f54a ]

Call pcie_bus_configure_settings() on MIPS, like for other platforms.
The function pcie_bus_configure_settings() makes sure the MPS (Max
Payload Size) across the bus is uniform and provides the ability to
tune the MRSS (Max Read Request Size) and MPS (Max Payload Size) to
higher performance values. Some devices will not operate properly if
these aren't set correctly because the firmware doesn't always do it.

Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Patchwork: https://patchwork.linux-mips.org/patch/20649/
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Cc: Huacai Chen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/pci/pci-legacy.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/mips/pci/pci-legacy.c
+++ b/arch/mips/pci/pci-legacy.c
@@ -116,8 +116,12 @@ static void pcibios_scanbus(struct pci_c
if (pci_has_flag(PCI_PROBE_ONLY)) {
pci_bus_claim_resources(bus);
} else {
+ struct pci_bus *child;
+
pci_bus_size_bridges(bus);
pci_bus_assign_resources(bus);
+ list_for_each_entry(child, &bus->children, node)
+ pcie_bus_configure_settings(child);
}
pci_bus_add_devices(bus);
}



2018-11-19 16:54:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 40/83] mach64: fix image corruption due to reading accelerator registers

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <[email protected]>

commit c09bcc91bb94ed91f1391bffcbe294963d605732 upstream.

Reading the registers without waiting for engine idle returns
unpredictable values. These unpredictable values result in display
corruption - if atyfb_imageblit reads the content of DP_PIX_WIDTH with the
bit DP_HOST_TRIPLE_EN set (from previous invocation), the driver would
never ever clear the bit, resulting in display corruption.

We don't want to wait for idle because it would degrade performance, so
this patch modifies the driver so that it never reads accelerator
registers.

HOST_CNTL doesn't have to be read, we can just write it with
HOST_BYTE_ALIGN because no other part of the driver cares if
HOST_BYTE_ALIGN is set.

DP_PIX_WIDTH is written in the functions atyfb_copyarea and atyfb_fillrect
with the default value and in atyfb_imageblit with the value set according
to the source image data.

Signed-off-by: Mikulas Patocka <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Cc: [email protected]
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/video/fbdev/aty/mach64_accel.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)

--- a/drivers/video/fbdev/aty/mach64_accel.c
+++ b/drivers/video/fbdev/aty/mach64_accel.c
@@ -126,7 +126,7 @@ void aty_init_engine(struct atyfb_par *p

/* set host attributes */
wait_for_fifo(13, par);
- aty_st_le32(HOST_CNTL, 0, par);
+ aty_st_le32(HOST_CNTL, HOST_BYTE_ALIGN, par);

/* set pattern attributes */
aty_st_le32(PAT_REG0, 0, par);
@@ -232,7 +232,8 @@ void atyfb_copyarea(struct fb_info *info
rotation = rotation24bpp(dx, direction);
}

- wait_for_fifo(4, par);
+ wait_for_fifo(5, par);
+ aty_st_le32(DP_PIX_WIDTH, par->crtc.dp_pix_width, par);
aty_st_le32(DP_SRC, FRGD_SRC_BLIT, par);
aty_st_le32(SRC_Y_X, (sx << 16) | sy, par);
aty_st_le32(SRC_HEIGHT1_WIDTH1, (width << 16) | area->height, par);
@@ -268,7 +269,8 @@ void atyfb_fillrect(struct fb_info *info
rotation = rotation24bpp(dx, DST_X_LEFT_TO_RIGHT);
}

- wait_for_fifo(3, par);
+ wait_for_fifo(4, par);
+ aty_st_le32(DP_PIX_WIDTH, par->crtc.dp_pix_width, par);
aty_st_le32(DP_FRGD_CLR, color, par);
aty_st_le32(DP_SRC,
BKGD_SRC_BKGD_CLR | FRGD_SRC_FRGD_CLR | MONO_SRC_ONE,
@@ -283,7 +285,7 @@ void atyfb_imageblit(struct fb_info *inf
{
struct atyfb_par *par = (struct atyfb_par *) info->par;
u32 src_bytes, dx = image->dx, dy = image->dy, width = image->width;
- u32 pix_width_save, pix_width, host_cntl, rotation = 0, src, mix;
+ u32 pix_width, rotation = 0, src, mix;

if (par->asleep)
return;
@@ -295,8 +297,7 @@ void atyfb_imageblit(struct fb_info *inf
return;
}

- pix_width = pix_width_save = aty_ld_le32(DP_PIX_WIDTH, par);
- host_cntl = aty_ld_le32(HOST_CNTL, par) | HOST_BYTE_ALIGN;
+ pix_width = par->crtc.dp_pix_width;

switch (image->depth) {
case 1:
@@ -369,12 +370,11 @@ void atyfb_imageblit(struct fb_info *inf
mix = FRGD_MIX_D_XOR_S | BKGD_MIX_D;
}

- wait_for_fifo(6, par);
- aty_st_le32(DP_WRITE_MASK, 0xFFFFFFFF, par);
+ wait_for_fifo(5, par);
aty_st_le32(DP_PIX_WIDTH, pix_width, par);
aty_st_le32(DP_MIX, mix, par);
aty_st_le32(DP_SRC, src, par);
- aty_st_le32(HOST_CNTL, host_cntl, par);
+ aty_st_le32(HOST_CNTL, HOST_BYTE_ALIGN, par);
aty_st_le32(DST_CNTL, DST_Y_TOP_TO_BOTTOM | DST_X_LEFT_TO_RIGHT | rotation, par);

draw_rect(dx, dy, width, image->height, par);
@@ -423,8 +423,4 @@ void atyfb_imageblit(struct fb_info *inf
aty_st_le32(HOST_DATA0, get_unaligned_le32(pbitmap), par);
}
}
-
- /* restore pix_width */
- wait_for_fifo(1, par);
- aty_st_le32(DP_PIX_WIDTH, pix_width_save, par);
}



2018-11-19 16:55:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 43/83] ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Changwei Ge <[email protected]>

commit 29aa30167a0a2e6045a0d6d2e89d8168132333d5 upstream.

Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().

According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But
there is obviouse misuse of brelse around related code.

After failure of ocfs2_check_dir_entry(), current code just moves to
next position and uses the problematic buffer head again and again
during which the problematic buffer head is released for multiple times.
I suppose, this a serious issue which is long-lived in ocfs2. This may
cause other file systems which is also used in a the same host insane.

So we should also consider about bakcporting this patch into linux
-stable.

Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB0452.apcprd06.prod.outlook.com
Signed-off-by: Changwei Ge <[email protected]>
Suggested-by: Changkuo Shi <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Joseph Qi <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -1896,8 +1896,7 @@ static int ocfs2_dir_foreach_blk_el(stru
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;



2018-11-19 16:55:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 49/83] arch/alpha, termios: implement BOTHER, IBSHIFT and termios2

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: H. Peter Anvin (Intel) <[email protected]>

commit d0ffb805b729322626639336986bc83fc2e60871 upstream.

Alpha has had c_ispeed and c_ospeed, but still set speeds in c_cflags
using arbitrary flags. Because BOTHER is not defined, the general
Linux code doesn't allow setting arbitrary baud rates, and because
CBAUDEX == 0, we can have an array overrun of the baud_rate[] table in
drivers/tty/tty_baudrate.c if (c_cflags & CBAUD) == 037.

Resolve both problems by #defining BOTHER to 037 on Alpha.

However, userspace still needs to know if setting BOTHER is actually
safe given legacy kernels (does anyone actually care about that on
Alpha anymore?), so enable the TCGETS2/TCSETS*2 ioctls on Alpha, even
though they use the same structure. Define struct termios2 just for
compatibility; it is the exact same structure as struct termios. In a
future patchset, this will be cleaned up so the uapi headers are
usable from libc.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Kate Stewart <[email protected]>
Cc: Philippe Ombredanne <[email protected]>
Cc: Eugene Syromiatnikov <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: Johan Hovold <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/alpha/include/asm/termios.h | 8 +++++++-
arch/alpha/include/uapi/asm/ioctls.h | 5 +++++
arch/alpha/include/uapi/asm/termbits.h | 17 +++++++++++++++++
3 files changed, 29 insertions(+), 1 deletion(-)

--- a/arch/alpha/include/asm/termios.h
+++ b/arch/alpha/include/asm/termios.h
@@ -72,9 +72,15 @@
})

#define user_termios_to_kernel_termios(k, u) \
- copy_from_user(k, u, sizeof(struct termios))
+ copy_from_user(k, u, sizeof(struct termios2))

#define kernel_termios_to_user_termios(u, k) \
+ copy_to_user(u, k, sizeof(struct termios2))
+
+#define user_termios_to_kernel_termios_1(k, u) \
+ copy_from_user(k, u, sizeof(struct termios))
+
+#define kernel_termios_to_user_termios_1(u, k) \
copy_to_user(u, k, sizeof(struct termios))

#endif /* _ALPHA_TERMIOS_H */
--- a/arch/alpha/include/uapi/asm/ioctls.h
+++ b/arch/alpha/include/uapi/asm/ioctls.h
@@ -31,6 +31,11 @@
#define TCXONC _IO('t', 30)
#define TCFLSH _IO('t', 31)

+#define TCGETS2 _IOR('T', 42, struct termios2)
+#define TCSETS2 _IOW('T', 43, struct termios2)
+#define TCSETSW2 _IOW('T', 44, struct termios2)
+#define TCSETSF2 _IOW('T', 45, struct termios2)
+
#define TIOCSWINSZ _IOW('t', 103, struct winsize)
#define TIOCGWINSZ _IOR('t', 104, struct winsize)
#define TIOCSTART _IO('t', 110) /* start output, like ^Q */
--- a/arch/alpha/include/uapi/asm/termbits.h
+++ b/arch/alpha/include/uapi/asm/termbits.h
@@ -25,6 +25,19 @@ struct termios {
speed_t c_ospeed; /* output speed */
};

+/* Alpha has identical termios and termios2 */
+
+struct termios2 {
+ tcflag_t c_iflag; /* input mode flags */
+ tcflag_t c_oflag; /* output mode flags */
+ tcflag_t c_cflag; /* control mode flags */
+ tcflag_t c_lflag; /* local mode flags */
+ cc_t c_cc[NCCS]; /* control characters */
+ cc_t c_line; /* line discipline (== c_cc[19]) */
+ speed_t c_ispeed; /* input speed */
+ speed_t c_ospeed; /* output speed */
+};
+
/* Alpha has matching termios and ktermios */

struct ktermios {
@@ -147,6 +160,7 @@ struct ktermios {
#define B3000000 00034
#define B3500000 00035
#define B4000000 00036
+#define BOTHER 00037

#define CSIZE 00001400
#define CS5 00000000
@@ -164,6 +178,9 @@ struct ktermios {
#define CMSPAR 010000000000 /* mark or space (stick) parity */
#define CRTSCTS 020000000000 /* flow control */

+#define CIBAUD 07600000
+#define IBSHIFT 16
+
/* c_lflag bits */
#define ISIG 0x00000080
#define ICANON 0x00000100



2018-11-19 16:55:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 53/83] ext4: add missing brelse() update_backups()s error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit ea0abbb648452cdb6e1734b702b6330a7448fcf8 upstream.

Fixes: ac27a0ec112a ("ext4: initial copy of files from ext3")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.19
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1095,8 +1095,10 @@ static void update_backups(struct super_
backup_block, backup_block -
ext4_group_first_block_no(sb, group));
BUFFER_TRACE(bh, "get_write_access");
- if ((err = ext4_journal_get_write_access(handle, bh)))
+ if ((err = ext4_journal_get_write_access(handle, bh))) {
+ brelse(bh);
break;
+ }
lock_buffer(bh);
memcpy(bh->b_data, data, size);
if (rest)



2018-11-19 16:55:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 15/83] cdrom: fix improper type cast, which can leat to information leak.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Young_X <[email protected]>

commit e4f3aa2e1e67bb48dfbaaf1cad59013d5a5bc276 upstream.

There is another cast from unsigned long to int which causes
a bounds check to fail with specially crafted input. The value is
then used as an index in the slot array in cdrom_slot_status().

This issue is similar to CVE-2018-16658 and CVE-2018-10940.

Signed-off-by: Young_X <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Cc: Ben Hutchings <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/cdrom/cdrom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -2435,7 +2435,7 @@ static int cdrom_ioctl_select_disc(struc
return -ENOSYS;

if (arg != CDSL_CURRENT && arg != CDSL_NONE) {
- if ((int)arg >= cdi->capacity)
+ if (arg >= cdi->capacity)
return -EINVAL;
}




2018-11-19 16:55:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 51/83] Btrfs: fix data corruption due to cloning of eof block

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Filipe Manana <[email protected]>

commit ac765f83f1397646c11092a032d4f62c3d478b81 upstream.

We currently allow cloning a range from a file which includes the last
block of the file even if the file's size is not aligned to the block
size. This is fine and useful when the destination file has the same size,
but when it does not and the range ends somewhere in the middle of the
destination file, it leads to corruption because the bytes between the EOF
and the end of the block have undefined data (when there is support for
discard/trimming they have a value of 0x00).

Example:

$ mkfs.btrfs -f /dev/sdb
$ mount /dev/sdb /mnt

$ export foo_size=$((256 * 1024 + 100))
$ xfs_io -f -c "pwrite -S 0x3c 0 $foo_size" /mnt/foo
$ xfs_io -f -c "pwrite -S 0xb5 0 1M" /mnt/bar

$ xfs_io -c "reflink /mnt/foo 0 512K $foo_size" /mnt/bar

$ od -A d -t x1 /mnt/bar
0000000 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5
*
0524288 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c 3c
*
0786528 3c 3c 3c 3c 00 00 00 00 00 00 00 00 00 00 00 00
0786544 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
*
0790528 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5 b5
*
1048576

The bytes in the range from 786532 (512Kb + 256Kb + 100 bytes) to 790527
(512Kb + 256Kb + 4Kb - 1) got corrupted, having now a value of 0x00 instead
of 0xb5.

This is similar to the problem we had for deduplication that got recently
fixed by commit de02b9f6bb65 ("Btrfs: fix data corruption when
deduplicating between different files").

Fix this by not allowing such operations to be performed and return the
errno -EINVAL to user space. This is what XFS is doing as well at the VFS
level. This change however now makes us return -EINVAL instead of
-EOPNOTSUPP for cases where the source range maps to an inline extent and
the destination range's end is smaller then the destination file's size,
since the detection of inline extents is done during the actual process of
dropping file extent items (at __btrfs_drop_extents()). Returning the
-EINVAL error is done early on and solely based on the input parameters
(offsets and length) and destination file's size. This makes us consistent
with XFS and anyone else supporting cloning since this case is now checked
at a higher level in the VFS and is where the -EINVAL will be returned
from starting with kernel 4.20 (the VFS changed was introduced in 4.20-rc1
by commit 07d19dc9fbe9 ("vfs: avoid problematic remapping requests into
partial EOF block"). So this change is more geared towards stable kernels,
as it's unlikely the new VFS checks get removed intentionally.

A test case for fstests follows soon, as well as an update to filter
existing tests that expect -EOPNOTSUPP to accept -EINVAL as well.

CC: <[email protected]> # 4.4+
Signed-off-by: Filipe Manana <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/ioctl.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3911,9 +3911,17 @@ static noinline int btrfs_clone_files(st
goto out_unlock;
if (len == 0)
olen = len = src->i_size - off;
- /* if we extend to eof, continue to block boundary */
- if (off + len == src->i_size)
+ /*
+ * If we extend to eof, continue to block boundary if and only if the
+ * destination end offset matches the destination file's size, otherwise
+ * we would be corrupting data by placing the eof block into the middle
+ * of a file.
+ */
+ if (off + len == src->i_size) {
+ if (!IS_ALIGNED(len, bs) && destoff + len < inode->i_size)
+ goto out_unlock;
len = ALIGN(src->i_size, bs) - off;
+ }

if (len == 0) {
ret = 0;



2018-11-19 16:55:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 17/83] scsi: qla2xxx: shutdown chip if reset fail

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Quinn Tran <[email protected]>

commit 1e4ac5d6fe0a4af17e4b6251b884485832bf75a3 upstream.

If chip unable to fully initialize, use full shutdown sequence to clear out
any stale FW state.

Fixes: e315cd28b9ef ("[SCSI] qla2xxx: Code changes for qla data structure refactoring")
Cc: [email protected] #4.10
Signed-off-by: Quinn Tran <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/scsi/qla2xxx/qla_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -4894,7 +4894,7 @@ qla2x00_abort_isp(scsi_qla_host_t *vha)
* The next call disables the board
* completely.
*/
- ha->isp_ops->reset_adapter(vha);
+ qla2x00_abort_isp_cleanup(vha);
vha->flags.online = 0;
clear_bit(ISP_ABORT_RETRY,
&vha->dpc_flags);



2018-11-19 16:55:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 20/83] fuse: fix blocked_waitq wakeup

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <[email protected]>

commit 908a572b80f6e9577b45e81b3dfe2e22111286b8 upstream.

Using waitqueue_active() is racy. Make sure we issue a wake_up()
unconditionally after storing into fc->blocked. After that it's okay to
optimize with waitqueue_active() since the first wake up provides the
necessary barrier for all waiters, not the just the woken one.

Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 3c18ef8117f0 ("fuse: optimize wake_up")
Cc: <[email protected]> # v3.10
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/fuse/dev.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -383,12 +383,19 @@ static void request_end(struct fuse_conn
if (test_bit(FR_BACKGROUND, &req->flags)) {
spin_lock(&fc->lock);
clear_bit(FR_BACKGROUND, &req->flags);
- if (fc->num_background == fc->max_background)
+ if (fc->num_background == fc->max_background) {
fc->blocked = 0;
-
- /* Wake up next waiter, if any */
- if (!fc->blocked && waitqueue_active(&fc->blocked_waitq))
wake_up(&fc->blocked_waitq);
+ } else if (!fc->blocked) {
+ /*
+ * Wake up next waiter, if any. It's okay to use
+ * waitqueue_active(), as we've already synced up
+ * fc->blocked with waiters with the wake_up() call
+ * above.
+ */
+ if (waitqueue_active(&fc->blocked_waitq))
+ wake_up(&fc->blocked_waitq);
+ }

if (fc->num_background == fc->congestion_threshold &&
fc->connected && fc->bdi_initialized) {



2018-11-19 16:55:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 57/83] ext4: fix possible inode leak in the retry loop of ext4_resize_fs()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit db6aee62406d9fbb53315fcddd81f1dc271d49fa upstream.

Fixes: 1c6bd7173d66 ("ext4: convert file system to meta_bg if needed ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.7
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -2026,6 +2026,10 @@ retry:
n_blocks_count_retry = 0;
free_flex_gd(flex_gd);
flex_gd = NULL;
+ if (resize_inode) {
+ iput(resize_inode);
+ resize_inode = NULL;
+ }
goto retry;
}




2018-11-19 16:55:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 19/83] fuse: Fix use-after-free in fuse_dev_do_write()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kirill Tkhai <[email protected]>

commit d2d2d4fb1f54eff0f3faa9762d84f6446a4bc5d0 upstream.

After we found req in request_find() and released the lock,
everything may happen with the req in parallel:

cpu0 cpu1
fuse_dev_do_write() fuse_dev_do_write()
req = request_find(fpq, ...) ...
spin_unlock(&fpq->lock) ...
... req = request_find(fpq, oh.unique)
... spin_unlock(&fpq->lock)
queue_interrupt(&fc->iq, req); ...
... ...
... ...
request_end(fc, req);
fuse_put_request(fc, req);
... queue_interrupt(&fc->iq, req);


Signed-off-by: Kirill Tkhai <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 46c34a348b0a ("fuse: no fc->lock for pqueue parts")
Cc: <[email protected]> # v4.2
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/fuse/dev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1868,16 +1868,20 @@ static ssize_t fuse_dev_do_write(struct

/* Is it an interrupt reply? */
if (req->intr_unique == oh.unique) {
+ __fuse_get_request(req);
spin_unlock(&fpq->lock);

err = -EINVAL;
- if (nbytes != sizeof(struct fuse_out_header))
+ if (nbytes != sizeof(struct fuse_out_header)) {
+ fuse_put_request(fc, req);
goto err_finish;
+ }

if (oh.error == -ENOSYS)
fc->no_interrupt = 1;
else if (oh.error == -EAGAIN)
queue_interrupt(&fc->iq, req);
+ fuse_put_request(fc, req);

fuse_copy_finish(cs);
return nbytes;



2018-11-19 16:55:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 27/83] parisc: Fix HPMC handler by increasing size to multiple of 16 bytes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit d5654e156bc4d68a87bbaa6d7e020baceddf6e68 ]

Make sure that the HPMC (High Priority Machine Check) handler is 16-byte
aligned and that it's length in the IVT is a multiple of 16 bytes.
Otherwise PDC may decide not to call the HPMC crash handler.

Signed-off-by: Helge Deller <[email protected]>
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/parisc/kernel/hpmc.S | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/parisc/kernel/hpmc.S b/arch/parisc/kernel/hpmc.S
index 38d461aec46d..407b3aa5aa07 100644
--- a/arch/parisc/kernel/hpmc.S
+++ b/arch/parisc/kernel/hpmc.S
@@ -83,6 +83,7 @@ END(hpmc_pim_data)
.text

.import intr_save, code
+ .align 16
ENTRY_CFI(os_hpmc)
.os_hpmc:

@@ -299,12 +300,15 @@ os_hpmc_6:

b .
nop
+ .align 16 /* make function length multiple of 16 bytes */
ENDPROC_CFI(os_hpmc)
.os_hpmc_end:


__INITRODATA
+.globl os_hpmc_size
.align 4
- .export os_hpmc_size
+ .type os_hpmc_size, @object
+ .size os_hpmc_size, 4
os_hpmc_size:
.word .os_hpmc_end-.os_hpmc
--
2.17.1




2018-11-19 16:55:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 64/83] ext4: fix buffer leak in ext4_xattr_move_to_block() on error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit 6bdc9977fcdedf47118d2caf7270a19f4b6d8a8f upstream.

Fixes: 3f2571c1f91f ("ext4: factor out xattr moving")
Fixes: 6dd4ee7cab7e ("ext4: Expand extra_inodes space per ...")
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.23
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/xattr.c | 2 ++
1 file changed, 2 insertions(+)

--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1393,6 +1393,8 @@ out:
kfree(buffer);
if (is)
brelse(is->iloc.bh);
+ if (bs)
+ brelse(bs->bh);
kfree(is);
kfree(bs);




2018-11-19 16:56:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 67/83] mount: Dont allow copying MNT_UNBINDABLE|MNT_LOCKED mounts

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric W. Biederman <[email protected]>

commit df7342b240185d58d3d9665c0bbf0a0f5570ec29 upstream.

Jonathan Calmels from NVIDIA reported that he's able to bypass the
mount visibility security check in place in the Linux kernel by using
a combination of the unbindable property along with the private mount
propagation option to allow a unprivileged user to see a path which
was purposefully hidden by the root user.

Reproducer:
# Hide a path to all users using a tmpfs
root@castiana:~# mount -t tmpfs tmpfs /sys/devices/
root@castiana:~#

# As an unprivileged user, unshare user namespace and mount namespace
stgraber@castiana:~$ unshare -U -m -r

# Confirm the path is still not accessible
root@castiana:~# ls /sys/devices/

# Make /sys recursively unbindable and private
root@castiana:~# mount --make-runbindable /sys
root@castiana:~# mount --make-private /sys

# Recursively bind-mount the rest of /sys over to /mnnt
root@castiana:~# mount --rbind /sys/ /mnt

# Access our hidden /sys/device as an unprivileged user
root@castiana:~# ls /mnt/devices/
breakpoint cpu cstate_core cstate_pkg i915 intel_pt isa kprobe
LNXSYSTM:00 msr pci0000:00 platform pnp0 power software system
tracepoint uncore_arb uncore_cbox_0 uncore_cbox_1 uprobe virtual

Solve this by teaching copy_tree to fail if a mount turns out to be
both unbindable and locked.

Cc: [email protected]
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Reported-by: Jonathan Calmels <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namespace.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1788,8 +1788,14 @@ struct mount *copy_tree(struct mount *mn
for (s = r; s; s = next_mnt(s, r)) {
if (!(flag & CL_COPY_UNBINDABLE) &&
IS_MNT_UNBINDABLE(s)) {
- s = skip_mnt_tree(s);
- continue;
+ if (s->mnt.mnt_flags & MNT_LOCKED) {
+ /* Both unbindable and locked. */
+ q = ERR_PTR(-EPERM);
+ goto out;
+ } else {
+ s = skip_mnt_tree(s);
+ continue;
+ }
}
if (!(flag & CL_COPY_MNT_NS_FILE) &&
is_mnt_ns_file(s->mnt.mnt_root)) {



2018-11-19 16:56:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 38/83] Revert "ceph: fix dentry leak in splice_dentry()"

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yan, Zheng <[email protected]>

commit efe328230dc01aa0b1269aad0b5fae73eea4677a upstream.

This reverts commit 8b8f53af1ed9df88a4c0fbfdf3db58f62060edf3.

splice_dentry() is used by three places. For two places, req->r_dentry
is passed to splice_dentry(). In the case of error, req->r_dentry does
not get updated. So splice_dentry() should not drop reference.

Cc: [email protected] # 4.18+
Signed-off-by: "Yan, Zheng" <[email protected]>
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ceph/inode.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -1077,8 +1077,12 @@ static struct dentry *splice_dentry(stru
if (IS_ERR(realdn)) {
pr_err("splice_dentry error %ld %p inode %p ino %llx.%llx\n",
PTR_ERR(realdn), dn, in, ceph_vinop(in));
- dput(dn);
- dn = realdn; /* note realdn contains the error */
+ dn = realdn;
+ /*
+ * Caller should release 'dn' in the case of error.
+ * If 'req->r_dentry' is passed to this function,
+ * caller should leave 'req->r_dentry' untouched.
+ */
goto out;
} else if (realdn) {
dout("dn %p (%d) spliced with %p (%d) "



2018-11-19 16:56:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 34/83] clk: s2mps11: Fix matching when built as module and DT node contains compatible

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Krzysztof Kozlowski <[email protected]>

commit 8985167ecf57f97061599a155bb9652c84ea4913 upstream.

When driver is built as module and DT node contains clocks compatible
(e.g. "samsung,s2mps11-clk"), the module will not be autoloaded because
module aliases won't match.

The modalias from uevent: of:NclocksT<NULL>Csamsung,s2mps11-clk
The modalias from driver: platform:s2mps11-clk

The devices are instantiated by parent's MFD. However both Device Tree
bindings and parent define the compatible for clocks devices. In case
of module matching this DT compatible will be used.

The issue will not happen if this is a built-in (no need for module
matching) or when clocks DT node does not contain compatible (not
correct from bindings perspective but working for driver).

Note when backporting to stable kernels: adjust the list of device ID
entries.

Cc: <[email protected]>
Fixes: 53c31b3437a6 ("mfd: sec-core: Add of_compatible strings for clock MFD cells")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Acked-by: Stephen Boyd <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

diff --git a/drivers/clk/clk-s2mps11.c b/drivers/clk/clk-s2mps11.c
index d44e0eea31ec..0934d3724495 100644
--- a/drivers/clk/clk-s2mps11.c
+++ b/drivers/clk/clk-s2mps11.c
@@ -245,6 +245,36 @@ static const struct platform_device_id s2mps11_clk_id[] = {
};
MODULE_DEVICE_TABLE(platform, s2mps11_clk_id);

+#ifdef CONFIG_OF
+/*
+ * Device is instantiated through parent MFD device and device matching is done
+ * through platform_device_id.
+ *
+ * However if device's DT node contains proper clock compatible and driver is
+ * built as a module, then the *module* matching will be done trough DT aliases.
+ * This requires of_device_id table. In the same time this will not change the
+ * actual *device* matching so do not add .of_match_table.
+ */
+static const struct of_device_id s2mps11_dt_match[] = {
+ {
+ .compatible = "samsung,s2mps11-clk",
+ .data = (void *)S2MPS11X,
+ }, {
+ .compatible = "samsung,s2mps13-clk",
+ .data = (void *)S2MPS13X,
+ }, {
+ .compatible = "samsung,s2mps14-clk",
+ .data = (void *)S2MPS14X,
+ }, {
+ .compatible = "samsung,s5m8767-clk",
+ .data = (void *)S5M8767X,
+ }, {
+ /* Sentinel */
+ },
+};
+MODULE_DEVICE_TABLE(of, s2mps11_dt_match);
+#endif
+
static struct platform_driver s2mps11_clk_driver = {
.driver = {
.name = "s2mps11-clk",



2018-11-19 16:56:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 41/83] reset: hisilicon: fix potential NULL pointer dereference

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gustavo A. R. Silva <[email protected]>

commit e9a2310fb689151166df7fd9971093362d34bd79 upstream.

There is a potential execution path in which function
platform_get_resource() returns NULL. If this happens,
we will end up having a NULL pointer dereference.

Fix this by replacing devm_ioremap with devm_ioremap_resource,
which has the NULL check and the memory region request.

This code was detected with the help of Coccinelle.

Cc: [email protected]
Fixes: 97b7129cd2af ("reset: hisilicon: change the definition of hisi_reset_init")
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clk/hisilicon/reset.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/drivers/clk/hisilicon/reset.c
+++ b/drivers/clk/hisilicon/reset.c
@@ -109,9 +109,8 @@ struct hisi_reset_controller *hisi_reset
return NULL;

res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- rstc->membase = devm_ioremap(&pdev->dev,
- res->start, resource_size(res));
- if (!rstc->membase)
+ rstc->membase = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(rstc->membase))
return NULL;

spin_lock_init(&rstc->lock);



2018-11-19 16:56:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 37/83] libceph: bump CEPH_MSG_MAX_DATA_LEN

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ilya Dryomov <[email protected]>

commit 94e6992bb560be8bffb47f287194adf070b57695 upstream.

If the read is large enough, we end up spinning in the messenger:

libceph: osd0 192.168.122.1:6801 io error
libceph: osd0 192.168.122.1:6801 io error
libceph: osd0 192.168.122.1:6801 io error

This is a receive side limit, so only reads were affected.

Cc: [email protected]
Signed-off-by: Ilya Dryomov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
include/linux/ceph/libceph.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/include/linux/ceph/libceph.h
+++ b/include/linux/ceph/libceph.h
@@ -77,7 +77,13 @@ struct ceph_options {

#define CEPH_MSG_MAX_FRONT_LEN (16*1024*1024)
#define CEPH_MSG_MAX_MIDDLE_LEN (16*1024*1024)
-#define CEPH_MSG_MAX_DATA_LEN (16*1024*1024)
+
+/*
+ * Handle the largest possible rbd object in one message.
+ * There is no limit on the size of cephfs objects, but it has to obey
+ * rsize and wsize mount options anyway.
+ */
+#define CEPH_MSG_MAX_DATA_LEN (32*1024*1024)

#define CEPH_AUTH_NAME_DEFAULT "guest"




2018-11-19 16:56:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 47/83] of, numa: Validate some distance map rules

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: John Garry <[email protected]>

commit 89c38422e072bb453e3045b8f1b962a344c3edea upstream.

Currently the NUMA distance map parsing does not validate the distance
table for the distance-matrix rules 1-2 in [1].

However the arch NUMA code may enforce some of these rules, but not all.
Such is the case for the arm64 port, which does not enforce the rule that
the distance between separates nodes cannot equal LOCAL_DISTANCE.

The patch adds the following rules validation:
- distance of node to self equals LOCAL_DISTANCE
- distance of separate nodes > LOCAL_DISTANCE

This change avoids a yet-unresolved crash reported in [2].

A note on dealing with symmetrical distances between nodes:

Validating symmetrical distances between nodes is difficult. If it were
mandated in the bindings that every distance must be recorded in the
table, then it would be easy. However, it isn't.

In addition to this, it is also possible to record [b, a] distance only
(and not [a, b]). So, when processing the table for [b, a], we cannot
assert that current distance of [a, b] != [b, a] as invalid, as [a, b]
distance may not be present in the table and current distance would be
default at REMOTE_DISTANCE.

As such, we maintain the policy that we overwrite distance [a, b] = [b, a]
for b > a. This policy is different to kernel ACPI SLIT validation, which
allows non-symmetrical distances (ACPI spec SLIT rules allow it). However,
the distance debug message is dropped as it may be misleading (for a distance
which is later overwritten).

Some final notes on semantics:

- It is implied that it is the responsibility of the arch NUMA code to
reset the NUMA distance map for an error in distance map parsing.

- It is the responsibility of the FW NUMA topology parsing (whether OF or
ACPI) to enforce NUMA distance rules, and not arch NUMA code.

[1] Documents/devicetree/bindings/numa.txt
[2] https://www.spinics.net/lists/arm-kernel/msg683304.html

Cc: [email protected] # 4.7
Signed-off-by: John Garry <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/of/of_numa.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

--- a/drivers/of/of_numa.c
+++ b/drivers/of/of_numa.c
@@ -126,9 +126,14 @@ static int __init of_numa_parse_distance
distance = of_read_number(matrix, 1);
matrix++;

+ if ((nodea == nodeb && distance != LOCAL_DISTANCE) ||
+ (nodea != nodeb && distance <= LOCAL_DISTANCE)) {
+ pr_err("Invalid distance[node%d -> node%d] = %d\n",
+ nodea, nodeb, distance);
+ return -EINVAL;
+ }
+
numa_set_distance(nodea, nodeb, distance);
- pr_debug("distance[node%d -> node%d] = %d\n",
- nodea, nodeb, distance);

/* Set default distance of node B->A same as A->B */
if (nodeb > nodea)



2018-11-19 16:56:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 46/83] mtd: docg3: dont set conflicting BCH_CONST_PARAMS option

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <[email protected]>

commit be2e1c9dcf76886a83fb1c433a316e26d4ca2550 upstream.

I noticed during the creation of another bugfix that the BCH_CONST_PARAMS
option that is set by DOCG3 breaks setting variable parameters for any
other users of the BCH library code.

The only other user we have today is the MTD_NAND software BCH
implementation (most flash controllers use hardware BCH these days
and are not affected). I considered removing BCH_CONST_PARAMS entirely
because of the inherent conflict, but according to the description in
lib/bch.c there is a significant performance benefit in keeping it.

To avoid the immediate problem of the conflict between MTD_NAND_BCH
and DOCG3, this only sets the constant parameters if MTD_NAND_BCH
is disabled, which should fix the problem for all cases that
are affected. This should also work for all stable kernels.

Note that there is only one machine that actually seems to use the
DOCG3 driver (arch/arm/mach-pxa/mioa701.c), so most users should have
the driver disabled, but it almost certainly shows up if we wanted
to test random kernels on machines that use software BCH in MTD.

Fixes: d13d19ece39f ("mtd: docg3: add ECC correction code")
Cc: [email protected]
Cc: Robert Jarzmik <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/mtd/devices/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/mtd/devices/Kconfig
+++ b/drivers/mtd/devices/Kconfig
@@ -196,7 +196,7 @@ comment "Disk-On-Chip Device Drivers"
config MTD_DOCG3
tristate "M-Systems Disk-On-Chip G3"
select BCH
- select BCH_CONST_PARAMS
+ select BCH_CONST_PARAMS if !MTD_NAND_BCH
select BITREVERSE
---help---
This provides an MTD device driver for the M-Systems DiskOnChip



2018-11-19 16:56:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 06/83] media: pci: cx23885: handle adding to list failure

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicholas Mc Guire <[email protected]>

[ Upstream commit c5d59528e24ad22500347b199d52b9368e686a42 ]

altera_hw_filt_init() which calls append_internal() assumes
that the node was successfully linked in while in fact it can
silently fail. So the call-site needs to set return to -ENOMEM
on append_internal() returning NULL and exit through the err path.

Fixes: 349bcf02e361 ("[media] Altera FPGA based CI driver module")

Signed-off-by: Nicholas Mc Guire <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/media/pci/cx23885/altera-ci.c | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/drivers/media/pci/cx23885/altera-ci.c
+++ b/drivers/media/pci/cx23885/altera-ci.c
@@ -660,6 +660,10 @@ static int altera_hw_filt_init(struct al
}

temp_int = append_internal(inter);
+ if (!temp_int) {
+ ret = -ENOMEM;
+ goto err;
+ }
inter->filts_used = 1;
inter->dev = config->dev;
inter->fpga_rw = config->fpga_rw;
@@ -694,6 +698,7 @@ err:
__func__, ret);

kfree(pid_filt);
+ kfree(inter);

return ret;
}
@@ -728,6 +733,10 @@ int altera_ci_init(struct altera_ci_conf
}

temp_int = append_internal(inter);
+ if (!temp_int) {
+ ret = -ENOMEM;
+ goto err;
+ }
inter->cis_used = 1;
inter->dev = config->dev;
inter->fpga_rw = config->fpga_rw;
@@ -796,6 +805,7 @@ err:
ci_dbg_print("%s: Cannot initialize CI: Error %d.\n", __func__, ret);

kfree(state);
+ kfree(inter);

return ret;
}



2018-11-19 16:56:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 70/83] nfsd: COPY and CLONE operations require the saved filehandle to be set

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Scott Mayhew <[email protected]>

commit 01310bb7c9c98752cc763b36532fab028e0f8f81 upstream.

Make sure we have a saved filehandle, otherwise we'll oops with a null
pointer dereference in nfs4_preprocess_stateid_op().

Signed-off-by: Scott Mayhew <[email protected]>
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/nfsd/nfs4proc.c | 3 +++
1 file changed, 3 insertions(+)

--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1016,6 +1016,9 @@ nfsd4_verify_copy(struct svc_rqst *rqstp
{
__be32 status;

+ if (!cstate->save_fh.fh_dentry)
+ return nfserr_nofilehandle;
+
status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->save_fh,
src_stateid, RD_STATE, src, NULL);
if (status) {



2018-11-19 16:56:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 79/83] drm/dp_mst: Check if primary mstb is null

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Stanislav Lisovskiy <[email protected]>

commit 23d8003907d094f77cf959228e2248d6db819fa7 upstream.

Unfortunately drm_dp_get_mst_branch_device which is called from both
drm_dp_mst_handle_down_rep and drm_dp_mst_handle_up_rep seem to rely
on that mgr->mst_primary is not NULL, which seem to be wrong as it can be
cleared with simultaneous mode set, if probing fails or in other case.
mgr->lock mutex doesn't protect against that as it might just get
assigned to NULL right before, not simultaneously.

There are currently bugs 107738, 108616 bugs which crash in
drm_dp_get_mst_branch_device, caused by this issue.

v2: Refactored the code, as it was nicely noticed.
Fixed Bugzilla bug numbers(second was 108616, but not 108816)
and added links.

[changed title and added stable cc]
Signed-off-by: Lyude Paul <[email protected]>
Signed-off-by: Stanislav Lisovskiy <[email protected]>
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108616
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107738
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/drm_dp_mst_topology.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -1230,6 +1230,9 @@ static struct drm_dp_mst_branch *drm_dp_
mutex_lock(&mgr->lock);
mstb = mgr->mst_primary;

+ if (!mstb)
+ goto out;
+
for (i = 0; i < lct - 1; i++) {
int shift = (i % 2) ? 0 : 4;
int port_num = (rad[i / 2] >> shift) & 0xf;



2018-11-19 16:56:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 50/83] Btrfs: fix cur_offset in the error case for nocow

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Robbie Ko <[email protected]>

commit 506481b20e818db40b6198815904ecd2d6daee64 upstream.

When the cow_file_range fails, the related resources are unlocked
according to the range [start..end), so the unlock cannot be repeated in
run_delalloc_nocow.

In some cases (e.g. cur_offset <= end && cow_start != -1), cur_offset is
not updated correctly, so move the cur_offset update before
cow_file_range.

kernel BUG at mm/page-writeback.c:2663!
Internal error: Oops - BUG: 0 [#1] SMP
CPU: 3 PID: 31525 Comm: kworker/u8:7 Tainted: P O
Hardware name: Realtek_RTD1296 (DT)
Workqueue: writeback wb_workfn (flush-btrfs-1)
task: ffffffc076db3380 ti: ffffffc02e9ac000 task.ti: ffffffc02e9ac000
PC is at clear_page_dirty_for_io+0x1bc/0x1e8
LR is at clear_page_dirty_for_io+0x14/0x1e8
pc : [<ffffffc00033c91c>] lr : [<ffffffc00033c774>] pstate: 40000145
sp : ffffffc02e9af4f0
Process kworker/u8:7 (pid: 31525, stack limit = 0xffffffc02e9ac020)
Call trace:
[<ffffffc00033c91c>] clear_page_dirty_for_io+0x1bc/0x1e8
[<ffffffbffc514674>] extent_clear_unlock_delalloc+0x1e4/0x210 [btrfs]
[<ffffffbffc4fb168>] run_delalloc_nocow+0x3b8/0x948 [btrfs]
[<ffffffbffc4fb948>] run_delalloc_range+0x250/0x3a8 [btrfs]
[<ffffffbffc514c0c>] writepage_delalloc.isra.21+0xbc/0x1d8 [btrfs]
[<ffffffbffc516048>] __extent_writepage+0xe8/0x248 [btrfs]
[<ffffffbffc51630c>] extent_write_cache_pages.isra.17+0x164/0x378 [btrfs]
[<ffffffbffc5185a8>] extent_writepages+0x48/0x68 [btrfs]
[<ffffffbffc4f5828>] btrfs_writepages+0x20/0x30 [btrfs]
[<ffffffc00033d758>] do_writepages+0x30/0x88
[<ffffffc0003ba0f4>] __writeback_single_inode+0x34/0x198
[<ffffffc0003ba6c4>] writeback_sb_inodes+0x184/0x3c0
[<ffffffc0003ba96c>] __writeback_inodes_wb+0x6c/0xc0
[<ffffffc0003bac20>] wb_writeback+0x1b8/0x1c0
[<ffffffc0003bb0f0>] wb_workfn+0x150/0x250
[<ffffffc0002b0014>] process_one_work+0x1dc/0x388
[<ffffffc0002b02f0>] worker_thread+0x130/0x500
[<ffffffc0002b6344>] kthread+0x10c/0x110
[<ffffffc000284590>] ret_from_fork+0x10/0x40
Code: d503201f a9025bb5 a90363b7 f90023b9 (d4210000)

CC: [email protected] # 4.4+
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Robbie Ko <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/btrfs/inode.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -1548,12 +1548,11 @@ out_check:
}
btrfs_release_path(path);

- if (cur_offset <= end && cow_start == (u64)-1) {
+ if (cur_offset <= end && cow_start == (u64)-1)
cow_start = cur_offset;
- cur_offset = end;
- }

if (cow_start != (u64)-1) {
+ cur_offset = end;
ret = cow_file_range(inode, locked_page, cow_start, end, end,
page_started, nr_written, 1, NULL);
if (ret)



2018-11-19 16:56:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 54/83] ext4: add missing brelse() in set_flexbg_block_bitmap()s error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit cea5794122125bf67559906a0762186cf417099c upstream.

Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...")
Cc: [email protected] # 3.3
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -442,16 +442,18 @@ static int set_flexbg_block_bitmap(struc

BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, bh);
- if (err)
+ if (err) {
+ brelse(bh);
return err;
+ }
ext4_debug("mark block bitmap %#04llx (+%llu/%u)\n", block,
block - start, count2);
ext4_set_bits(bh->b_data, block - start, count2);

err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ brelse(bh);
if (unlikely(err))
return err;
- brelse(bh);
}

return 0;



2018-11-19 16:56:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 71/83] rtc: hctosys: Add missing range error reporting

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Maciej W. Rozycki <[email protected]>

commit 7ce9a992ffde8ce93d5ae5767362a5c7389ae895 upstream.

Fix an issue with the 32-bit range error path in `rtc_hctosys' where no
error code is set and consequently the successful preceding call result
from `rtc_read_time' is propagated to `rtc_hctosys_ret'. This in turn
makes any subsequent call to `hctosys_show' incorrectly report in sysfs
that the system time has been set from this RTC while it has not.

Set the error to ERANGE then if we can't express the result due to an
overflow.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Fixes: b3a5ac42ab18 ("rtc: hctosys: Ensure system time doesn't overflow time_t")
Cc: [email protected] # 4.17+
Signed-off-by: Alexandre Belloni <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/rtc/hctosys.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/rtc/hctosys.c
+++ b/drivers/rtc/hctosys.c
@@ -50,8 +50,10 @@ static int __init rtc_hctosys(void)
tv64.tv_sec = rtc_tm_to_time64(&tm);

#if BITS_PER_LONG == 32
- if (tv64.tv_sec > INT_MAX)
+ if (tv64.tv_sec > INT_MAX) {
+ err = -ERANGE;
goto err_read;
+ }
#endif

err = do_settimeofday64(&tv64);



2018-11-19 16:56:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 75/83] lib/ubsan.c: dont mark __ubsan_handle_builtin_unreachable as noreturn

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <[email protected]>

commit 1c23b4108d716cc848b38532063a8aca4f86add8 upstream.

gcc-8 complains about the prototype for this function:

lib/ubsan.c:432:1: error: ignoring attribute 'noreturn' in declaration of a built-in function '__ubsan_handle_builtin_unreachable' because it conflicts with attribute 'const' [-Werror=attributes]

This is actually a GCC's bug. In GCC internals
__ubsan_handle_builtin_unreachable() declared with both 'noreturn' and
'const' attributes instead of only 'noreturn':

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84210

Workaround this by removing the noreturn attribute.

[aryabinin: add information about GCC bug in changelog]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Andrey Ryabinin <[email protected]>
Acked-by: Olof Johansson <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
lib/ubsan.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/lib/ubsan.c
+++ b/lib/ubsan.c
@@ -451,8 +451,7 @@ void __ubsan_handle_shift_out_of_bounds(
EXPORT_SYMBOL(__ubsan_handle_shift_out_of_bounds);


-void __noreturn
-__ubsan_handle_builtin_unreachable(struct unreachable_data *data)
+void __ubsan_handle_builtin_unreachable(struct unreachable_data *data)
{
unsigned long flags;




2018-11-19 16:56:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 58/83] ext4: avoid buffer leak in ext4_orphan_add() after prior errors

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit feaf264ce7f8d54582e2f66eb82dd9dd124c94f3 upstream.

Fixes: d745a8c20c1f ("ext4: reduce contention on s_orphan_lock")
Fixes: 6e3617e579e0 ("ext4: Handle non empty on-disk orphan link")
Cc: Dmitry Monakhov <[email protected]>
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.34
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/namei.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2842,7 +2842,9 @@ int ext4_orphan_add(handle_t *handle, st
list_del_init(&EXT4_I(inode)->i_orphan);
mutex_unlock(&sbi->s_orphan_lock);
}
- }
+ } else
+ brelse(iloc.bh);
+
jbd_debug(4, "superblock will point to %lu\n", inode->i_ino);
jbd_debug(4, "orphan inode %lu will point to %d\n",
inode->i_ino, NEXT_ORPHAN(inode));



2018-11-19 16:56:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 59/83] ext4: fix missing cleanup if ext4_alloc_flex_bg_array() fails while resizing

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit f348e2241fb73515d65b5d77dd9c174128a7fbf2 upstream.

Fixes: 117fff10d7f1 ("ext4: grow the s_flex_groups array as needed ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.7
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -1990,7 +1990,7 @@ retry:

err = ext4_alloc_flex_bg_array(sb, n_group + 1);
if (err)
- return err;
+ goto out;

err = ext4_mb_alloc_groupinfo(sb, n_group + 1);
if (err)



2018-11-19 16:57:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 69/83] sunrpc: correct the computation for page_ptr when truncating

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Frank Sorenson <[email protected]>

commit 5d7a5bcb67c70cbc904057ef52d3fcfeb24420bb upstream.

When truncating the encode buffer, the page_ptr is getting
advanced, causing the next page to be skipped while encoding.
The page is still included in the response, so the response
contains a page of bogus data.

We need to adjust the page_ptr backwards to ensure we encode
the next page into the correct place.

We saw this triggered when concurrent directory modifications caused
nfsd4_encode_direct_fattr() to return nfserr_noent, and the resulting
call to xdr_truncate_encode() corrupted the READDIR reply.

Signed-off-by: Frank Sorenson <[email protected]>
Cc: [email protected]
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/sunrpc/xdr.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -639,11 +639,10 @@ void xdr_truncate_encode(struct xdr_stre
WARN_ON_ONCE(xdr->iov);
return;
}
- if (fraglen) {
+ if (fraglen)
xdr->end = head->iov_base + head->iov_len;
- xdr->page_ptr--;
- }
/* (otherwise assume xdr->end is already set) */
+ xdr->page_ptr--;
head->iov_len = len;
buf->len = len;
xdr->p = head->iov_base + head->iov_len;



2018-11-19 16:57:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 63/83] ext4: release bs.bh before re-using in ext4_xattr_block_find()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit 45ae932d246f721e6584430017176cbcadfde610 upstream.

bs.bh was taken in previous ext4_xattr_block_find() call,
it should be released before re-using

Fixes: 7e01c8e5420b ("ext3/4: fix uninitialized bs in ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.26
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/xattr.c | 2 ++
1 file changed, 2 insertions(+)

--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1221,6 +1221,8 @@ ext4_xattr_set_handle(handle_t *handle,
error = ext4_xattr_block_set(handle, inode, &i, &bs);
} else if (error == -ENOSPC) {
if (EXT4_I(inode)->i_file_acl && !bs.s.base) {
+ brelse(bs.bh);
+ bs.bh = NULL;
error = ext4_xattr_block_find(inode, &i, &bs);
if (error)
goto cleanup;



2018-11-19 16:57:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 32/83] xtensa: make sure bFLT stack is 16 byte aligned

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Max Filippov <[email protected]>

commit 0773495b1f5f1c5e23551843f87b5ff37e7af8f7 upstream.

Xtensa ABI requires stack alignment to be at least 16. In noMMU
configuration ARCH_SLAB_MINALIGN is used to align stack. Make it at
least 16.

This fixes the following runtime error in noMMU configuration, caused by
interaction between insufficiently aligned stack and alloca function,
that results in corruption of on-stack variable in the libc function
glob:

Caught unhandled exception in 'sh' (pid = 47, pc = 0x02d05d65)
- should not happen
EXCCAUSE is 15

Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/xtensa/include/asm/processor.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

--- a/arch/xtensa/include/asm/processor.h
+++ b/arch/xtensa/include/asm/processor.h
@@ -24,7 +24,11 @@
# error Linux requires the Xtensa Windowed Registers Option.
#endif

-#define ARCH_SLAB_MINALIGN XCHAL_DATA_WIDTH
+/* Xtensa ABI requires stack alignment to be at least 16 */
+
+#define STACK_ALIGN (XCHAL_DATA_WIDTH > 16 ? XCHAL_DATA_WIDTH : 16)
+
+#define ARCH_SLAB_MINALIGN STACK_ALIGN

/*
* User space process size: 1 GB.



2018-11-19 16:57:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 35/83] clk: at91: Fix division by zero in PLL recalc_rate()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ronald Wahl <[email protected]>

commit 0f5cb0e6225cae2f029944cb8c74617aab6ddd49 upstream.

Commit a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached MUL
and DIV values") removed a check that prevents a division by zero. This
now causes a stacktrace when booting the kernel on a at91 platform if
the PLL DIV register contains zero. This commit reintroduces this check.

Fixes: a982e45dc150 ("clk: at91: PLL recalc_rate() now using cached...")
Cc: <[email protected]>
Signed-off-by: Ronald Wahl <[email protected]>
Acked-by: Ludovic Desroches <[email protected]>
Signed-off-by: Stephen Boyd <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clk/at91/clk-pll.c | 3 +++
1 file changed, 3 insertions(+)

--- a/drivers/clk/at91/clk-pll.c
+++ b/drivers/clk/at91/clk-pll.c
@@ -133,6 +133,9 @@ static unsigned long clk_pll_recalc_rate
{
struct clk_pll *pll = to_clk_pll(hw);

+ if (!pll->div || !pll->mul)
+ return 0;
+
return (parent_rate / pll->div) * (pll->mul + 1);
}




2018-11-19 16:57:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 28/83] parisc: Fix exported address of os_hpmc handler

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 99a3ae51d557d8e38a7aece65678a31f9db215ee ]

In the C-code we need to put the physical address of the hpmc handler in
the interrupt vector table (IVA) in order to get HPMCs working. Since
on parisc64 function pointers are indirect (in fact they are function
descriptors) we instead export the address as variable and not as
function.

This reverts a small part of commit f39cce654f9a ("parisc: Add
cfi_startproc and cfi_endproc to assembly code").

Signed-off-by: Helge Deller <[email protected]>
Cc: <[email protected]> [4.9+]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/parisc/kernel/hpmc.S | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/parisc/kernel/hpmc.S b/arch/parisc/kernel/hpmc.S
index 407b3aa5aa07..e88f4e7f39f3 100644
--- a/arch/parisc/kernel/hpmc.S
+++ b/arch/parisc/kernel/hpmc.S
@@ -84,7 +84,7 @@ END(hpmc_pim_data)

.import intr_save, code
.align 16
-ENTRY_CFI(os_hpmc)
+ENTRY(os_hpmc)
.os_hpmc:

/*
@@ -301,7 +301,6 @@ os_hpmc_6:
b .
nop
.align 16 /* make function length multiple of 16 bytes */
-ENDPROC_CFI(os_hpmc)
.os_hpmc_end:


--
2.17.1




2018-11-19 16:57:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 80/83] drm/i915/hdmi: Add HDMI 2.0 audio clock recovery N values

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Clint Taylor <[email protected]>

commit 6503493145cba4413ecd3d4d153faeef4a1e9b85 upstream.

HDMI 2.0 594Mhz modes were incorrectly selecting 25.200Mhz Automatic N value
mode instead of HDMI specification values.

V2: Fix 88.2 Hz N value

Cc: Jani Nikula <[email protected]>
Cc: [email protected]
Signed-off-by: Clint Taylor <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 5a400aa3c562c4a726b4da286e63c96db905ade1)
Signed-off-by: Joonas Lahtinen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/i915/intel_audio.c | 17 +++++++++++++++++
1 file changed, 17 insertions(+)

--- a/drivers/gpu/drm/i915/intel_audio.c
+++ b/drivers/gpu/drm/i915/intel_audio.c
@@ -76,6 +76,9 @@ static const struct {
/* HDMI N/CTS table */
#define TMDS_297M 297000
#define TMDS_296M 296703
+#define TMDS_594M 594000
+#define TMDS_593M 593407
+
static const struct {
int sample_rate;
int clock;
@@ -96,6 +99,20 @@ static const struct {
{ 176400, TMDS_297M, 18816, 247500 },
{ 192000, TMDS_296M, 23296, 281250 },
{ 192000, TMDS_297M, 20480, 247500 },
+ { 44100, TMDS_593M, 8918, 937500 },
+ { 44100, TMDS_594M, 9408, 990000 },
+ { 48000, TMDS_593M, 5824, 562500 },
+ { 48000, TMDS_594M, 6144, 594000 },
+ { 32000, TMDS_593M, 5824, 843750 },
+ { 32000, TMDS_594M, 3072, 445500 },
+ { 88200, TMDS_593M, 17836, 937500 },
+ { 88200, TMDS_594M, 18816, 990000 },
+ { 96000, TMDS_593M, 11648, 562500 },
+ { 96000, TMDS_594M, 12288, 594000 },
+ { 176400, TMDS_593M, 35672, 937500 },
+ { 176400, TMDS_594M, 37632, 990000 },
+ { 192000, TMDS_593M, 23296, 562500 },
+ { 192000, TMDS_594M, 24576, 594000 },
};

/* get AUD_CONFIG_PIXEL_CLOCK_HDMI_* value for mode */



2018-11-19 16:57:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 83/83] ovl: check whiteout in ovl_create_over_whiteout()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <[email protected]>

commit 5e1275808630ea3b2c97c776f40e475017535f72 upstream.

Kaixuxia repors that it's possible to crash overlayfs by removing the
whiteout on the upper layer before creating a directory over it. This is a
reproducer:

mkdir lower upper work merge
touch lower/file
mount -t overlay overlay -olowerdir=lower,upperdir=upper,workdir=work merge
rm merge/file
ls -al merge/file
rm upper/file
ls -al merge/
mkdir merge/file

Before commencing with a vfs_rename(..., RENAME_EXCHANGE) verify that the
lookup of "upper" is positive and is a whiteout, and return ESTALE
otherwise.

Reported by: kaixuxia <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: e9be9d5e76e3 ("overlay filesystem")
Cc: <[email protected]> # v3.18
Signed-off-by: Greg Kroah-Hartman <[email protected]>

--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -392,6 +392,10 @@ static int ovl_create_over_whiteout(struct dentry *dentry, struct inode *inode,
if (IS_ERR(upper))
goto out_dput;

+ err = -ESTALE;
+ if (d_is_negative(upper) || !IS_WHITEOUT(d_inode(upper)))
+ goto out_dput2;
+
err = ovl_create_real(wdir, newdentry, cattr, hardlink, true);
if (err)
goto out_dput2;
--
2.14.5




2018-11-19 16:58:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 73/83] fuse: fix leaked notify reply

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miklos Szeredi <[email protected]>

commit 7fabaf303458fcabb694999d6fa772cc13d4e217 upstream.

fuse_request_send_notify_reply() may fail if the connection was reset for
some reason (e.g. fs was unmounted). Don't leak request reference in this
case. Besides leaking memory, this resulted in fc->num_waiting not being
decremented and hence fuse_wait_aborted() left in a hanging and unkillable
state.

Fixes: 2d45ba381a74 ("fuse: add retrieve request")
Fixes: b8f95e5d13f5 ("fuse: umount should wait for all requests")
Reported-and-tested-by: [email protected]
Signed-off-by: Miklos Szeredi <[email protected]>
Cc: <[email protected]> #v2.6.36
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/fuse/dev.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1715,8 +1715,10 @@ static int fuse_retrieve(struct fuse_con
req->in.args[1].size = total_len;

err = fuse_request_send_notify_reply(fc, req, outarg->notify_unique);
- if (err)
+ if (err) {
fuse_retrieve_end(fc, req);
+ fuse_put_request(fc, req);
+ }

return err;
}



2018-11-19 16:58:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 76/83] hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Kravetz <[email protected]>

commit 5e41540c8a0f0e98c337dda8b391e5dda0cde7cf upstream.

This bug has been experienced several times by the Oracle DB team. The
BUG is in remove_inode_hugepages() as follows:

/*
* If page is mapped, it was faulted in after being
* unmapped in caller. Unmap (again) now after taking
* the fault mutex. The mutex will prevent faults
* until we finish removing the page.
*
* This race can only happen in the hole punch case.
* Getting here in a truncate operation is a bug.
*/
if (unlikely(page_mapped(page))) {
BUG_ON(truncate_op);

In this case, the elevated map count is not the result of a race.
Rather it was incorrectly incremented as the result of a bug in the huge
pmd sharing code. Consider the following:

- Process A maps a hugetlbfs file of sufficient size and alignment
(PUD_SIZE) that a pmd page could be shared.

- Process B maps the same hugetlbfs file with the same size and
alignment such that a pmd page is shared.

- Process B then calls mprotect() to change protections for the mapping
with the shared pmd. As a result, the pmd is 'unshared'.

- Process B then calls mprotect() again to chage protections for the
mapping back to their original value. pmd remains unshared.

- Process B then forks and process C is created. During the fork
process, we do dup_mm -> dup_mmap -> copy_page_range to copy page
tables. Copying page tables for hugetlb mappings is done in the
routine copy_hugetlb_page_range.

In copy_hugetlb_page_range(), the destination pte is obtained by:

dst_pte = huge_pte_alloc(dst, addr, sz);

If pmd sharing is possible, the returned pointer will be to a pte in an
existing page table. In the situation above, process C could share with
either process A or process B. Since process A is first in the list,
the returned pte is a pointer to a pte in process A's page table.

However, the check for pmd sharing in copy_hugetlb_page_range is:

/* If the pagetables are shared don't copy or take references */
if (dst_pte == src_pte)
continue;

Since process C is sharing with process A instead of process B, the
above test fails. The code in copy_hugetlb_page_range which follows
assumes dst_pte points to a huge_pte_none pte. It copies the pte entry
from src_pte to dst_pte and increments this map count of the associated
page. This is how we end up with an elevated map count.

To solve, check the dst_pte entry for huge_pte_none. If !none, this
implies PMD sharing so do not copy.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: c5c99429fa57 ("fix hugepages leak due to pagetable page sharing")
Signed-off-by: Mike Kravetz <[email protected]>
Reviewed-by: Naoya Horiguchi <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Prakash Sangappa <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/hugetlb.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)

--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3220,7 +3220,7 @@ static int is_hugetlb_entry_hwpoisoned(p
int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
struct vm_area_struct *vma)
{
- pte_t *src_pte, *dst_pte, entry;
+ pte_t *src_pte, *dst_pte, entry, dst_entry;
struct page *ptepage;
unsigned long addr;
int cow;
@@ -3248,15 +3248,30 @@ int copy_hugetlb_page_range(struct mm_st
break;
}

- /* If the pagetables are shared don't copy or take references */
- if (dst_pte == src_pte)
+ /*
+ * If the pagetables are shared don't copy or take references.
+ * dst_pte == src_pte is the common case of src/dest sharing.
+ *
+ * However, src could have 'unshared' and dst shares with
+ * another vma. If dst_pte !none, this implies sharing.
+ * Check here before taking page table lock, and once again
+ * after taking the lock below.
+ */
+ dst_entry = huge_ptep_get(dst_pte);
+ if ((dst_pte == src_pte) || !huge_pte_none(dst_entry))
continue;

dst_ptl = huge_pte_lock(h, dst, dst_pte);
src_ptl = huge_pte_lockptr(h, src, src_pte);
spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING);
entry = huge_ptep_get(src_pte);
- if (huge_pte_none(entry)) { /* skip none entry */
+ dst_entry = huge_ptep_get(dst_pte);
+ if (huge_pte_none(entry) || !huge_pte_none(dst_entry)) {
+ /*
+ * Skip if src entry none. Also, skip in the
+ * unlikely case dst entry !none as this implies
+ * sharing with another vma.
+ */
;
} else if (unlikely(is_hugetlb_entry_migration(entry) ||
is_hugetlb_entry_hwpoisoned(entry))) {



2018-11-19 16:58:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 78/83] drm/rockchip: Allow driver to be shutdown on reboot/kexec

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Marc Zyngier <[email protected]>

commit 7f3ef5dedb146e3d5063b6845781ad1bb59b92b5 upstream.

Leaving the DRM driver enabled on reboot or kexec has the annoying
effect of leaving the display generating transactions whilst the
IOMMU has been shut down.

In turn, the IOMMU driver (which shares its interrupt line with
the VOP) starts warning either on shutdown or when entering the
secondary kernel in the kexec case (nothing is expected on that
front).

A cheap way of ensuring that things are nicely shut down is to
register a shutdown callback in the platform driver.

Signed-off-by: Marc Zyngier <[email protected]>
Tested-by: Vicente Bergas <[email protected]>
Signed-off-by: Heiko Stuebner <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -478,6 +478,11 @@ static int rockchip_drm_platform_remove(
return 0;
}

+static void rockchip_drm_platform_shutdown(struct platform_device *pdev)
+{
+ rockchip_drm_platform_remove(pdev);
+}
+
static const struct of_device_id rockchip_drm_dt_ids[] = {
{ .compatible = "rockchip,display-subsystem", },
{ /* sentinel */ },
@@ -487,6 +492,7 @@ MODULE_DEVICE_TABLE(of, rockchip_drm_dt_
static struct platform_driver rockchip_drm_platform_driver = {
.probe = rockchip_drm_platform_probe,
.remove = rockchip_drm_platform_remove,
+ .shutdown = rockchip_drm_platform_shutdown,
.driver = {
.name = "rockchip-drm",
.of_match_table = rockchip_drm_dt_ids,



2018-11-19 17:27:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 77/83] mm: migration: fix migration of huge PMD shared pages

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mike Kravetz <[email protected]>

commit 017b1660df89f5fb4bfe66c34e35f7d2031100c7 upstream.

The page migration code employs try_to_unmap() to try and unmap the source
page. This is accomplished by using rmap_walk to find all vmas where the
page is mapped. This search stops when page mapcount is zero. For shared
PMD huge pages, the page map count is always 1 no matter the number of
mappings. Shared mappings are tracked via the reference count of the PMD
page. Therefore, try_to_unmap stops prematurely and does not completely
unmap all mappings of the source page.

This problem can result is data corruption as writes to the original
source page can happen after contents of the page are copied to the target
page. Hence, data is lost.

This problem was originally seen as DB corruption of shared global areas
after a huge page was soft offlined due to ECC memory errors. DB
developers noticed they could reproduce the issue by (hotplug) offlining
memory used to back huge pages. A simple testcase can reproduce the
problem by creating a shared PMD mapping (note that this must be at least
PUD_SIZE in size and PUD_SIZE aligned (1GB on x86)), and using
migrate_pages() to migrate process pages between nodes while continually
writing to the huge pages being migrated.

To fix, have the try_to_unmap_one routine check for huge PMD sharing by
calling huge_pmd_unshare for hugetlbfs huge pages. If it is a shared
mapping it will be 'unshared' which removes the page table entry and drops
the reference on the PMD page. After this, flush caches and TLB.

mmu notifiers are called before locking page tables, but we can not be
sure of PMD sharing until page tables are locked. Therefore, check for
the possibility of PMD sharing before locking so that notifiers can
prepare for the worst possible case.

Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: make _range_in_vma() a static inline]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 39dde65c9940 ("shared page table for hugetlb page")
Signed-off-by: Mike Kravetz <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Reviewed-by: Naoya Horiguchi <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: Mike Kravetz <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Mike Kravetz <[email protected]>
Reviewed-by: Jérôme Glisse <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/hugetlb.h | 14 ++++++++++++
include/linux/mm.h | 6 +++++
mm/hugetlb.c | 37 ++++++++++++++++++++++++++++++-
mm/rmap.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 111 insertions(+), 2 deletions(-)

--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -109,6 +109,8 @@ pte_t *huge_pte_alloc(struct mm_struct *
unsigned long addr, unsigned long sz);
pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr);
int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep);
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end);
struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
int write);
struct page *follow_huge_pmd(struct mm_struct *mm, unsigned long address,
@@ -131,6 +133,18 @@ static inline unsigned long hugetlb_tota
return 0;
}

+static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr,
+ pte_t *ptep)
+{
+ return 0;
+}
+
+static inline void adjust_range_if_pmd_sharing_possible(
+ struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+}
+
#define follow_hugetlb_page(m,v,p,vs,a,b,i,w) ({ BUG(); 0; })
#define follow_huge_addr(mm, addr, write) ERR_PTR(-EINVAL)
#define copy_hugetlb_page_range(src, dst, vma) ({ BUG(); 0; })
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2187,6 +2187,12 @@ static inline struct vm_area_struct *fin
return vma;
}

+static inline bool range_in_vma(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end)
+{
+ return (vma && vma->vm_start <= start && end <= vma->vm_end);
+}
+
#ifdef CONFIG_MMU
pgprot_t vm_get_page_prot(unsigned long vm_flags);
void vma_set_page_prot(struct vm_area_struct *vma);
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4333,13 +4333,41 @@ static bool vma_shareable(struct vm_area
/*
* check on proper vm_flags and page table alignment
*/
- if (vma->vm_flags & VM_MAYSHARE &&
- vma->vm_start <= base && end <= vma->vm_end)
+ if (vma->vm_flags & VM_MAYSHARE && range_in_vma(vma, base, end))
return true;
return false;
}

/*
+ * Determine if start,end range within vma could be mapped by shared pmd.
+ * If yes, adjust start and end to cover range associated with possible
+ * shared pmd mappings.
+ */
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+ unsigned long check_addr = *start;
+
+ if (!(vma->vm_flags & VM_MAYSHARE))
+ return;
+
+ for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
+ unsigned long a_start = check_addr & PUD_MASK;
+ unsigned long a_end = a_start + PUD_SIZE;
+
+ /*
+ * If sharing is possible, adjust start/end if necessary.
+ */
+ if (range_in_vma(vma, a_start, a_end)) {
+ if (a_start < *start)
+ *start = a_start;
+ if (a_end > *end)
+ *end = a_end;
+ }
+ }
+}
+
+/*
* Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
* and returns the corresponding pte. While this is not necessary for the
* !shared pmd case because we can allocate the pmd later as well, it makes the
@@ -4435,6 +4463,11 @@ int huge_pmd_unshare(struct mm_struct *m
{
return 0;
}
+
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+}
#define want_pmd_share() (0)
#endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */

--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1476,6 +1476,9 @@ static int try_to_unmap_one(struct page
pte_t pteval;
spinlock_t *ptl;
int ret = SWAP_AGAIN;
+ unsigned long sh_address;
+ bool pmd_sharing_possible = false;
+ unsigned long spmd_start, spmd_end;
struct rmap_private *rp = arg;
enum ttu_flags flags = rp->flags;

@@ -1491,6 +1494,32 @@ static int try_to_unmap_one(struct page
goto out;
}

+ /*
+ * Only use the range_start/end mmu notifiers if huge pmd sharing
+ * is possible. In the normal case, mmu_notifier_invalidate_page
+ * is sufficient as we only unmap a page. However, if we unshare
+ * a pmd, we will unmap a PUD_SIZE range.
+ */
+ if (PageHuge(page)) {
+ spmd_start = address;
+ spmd_end = spmd_start + vma_mmu_pagesize(vma);
+
+ /*
+ * Check if pmd sharing is possible. If possible, we could
+ * unmap a PUD_SIZE range. spmd_start/spmd_end will be
+ * modified if sharing is possible.
+ */
+ adjust_range_if_pmd_sharing_possible(vma, &spmd_start,
+ &spmd_end);
+ if (spmd_end - spmd_start != vma_mmu_pagesize(vma)) {
+ sh_address = address;
+
+ pmd_sharing_possible = true;
+ mmu_notifier_invalidate_range_start(vma->vm_mm,
+ spmd_start, spmd_end);
+ }
+ }
+
pte = page_check_address(page, mm, address, &ptl,
PageTransCompound(page));
if (!pte)
@@ -1524,6 +1553,30 @@ static int try_to_unmap_one(struct page
}
}

+ /*
+ * Call huge_pmd_unshare to potentially unshare a huge pmd. Pass
+ * sh_address as it will be modified if unsharing is successful.
+ */
+ if (PageHuge(page) && huge_pmd_unshare(mm, &sh_address, pte)) {
+ /*
+ * huge_pmd_unshare unmapped an entire PMD page. There is
+ * no way of knowing exactly which PMDs may be cached for
+ * this mm, so flush them all. spmd_start/spmd_end cover
+ * this PUD_SIZE range.
+ */
+ flush_cache_range(vma, spmd_start, spmd_end);
+ flush_tlb_range(vma, spmd_start, spmd_end);
+
+ /*
+ * The ref count of the PMD page was dropped which is part
+ * of the way map counting is done for shared PMDs. When
+ * there is no other sharing, huge_pmd_unshare returns false
+ * and we will unmap the actual page and drop map count
+ * to zero.
+ */
+ goto out_unmap;
+ }
+
/* Nuke the page table entry. */
flush_cache_page(vma, address, page_to_pfn(page));
if (should_defer_flush(mm, flags)) {
@@ -1621,6 +1674,9 @@ out_unmap:
if (ret != SWAP_FAIL && ret != SWAP_MLOCK && !(flags & TTU_MUNLOCK))
mmu_notifier_invalidate_page(mm, address);
out:
+ if (pmd_sharing_possible)
+ mmu_notifier_invalidate_range_end(vma->vm_mm,
+ spmd_start, spmd_end);
return ret;
}




2018-11-19 17:27:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 72/83] fuse: fix use-after-free in fuse_direct_IO()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Lukas Czerner <[email protected]>

commit ebacb81273599555a7a19f7754a1451206a5fc4f upstream.

In async IO blocking case the additional reference to the io is taken for
it to survive fuse_aio_complete(). In non blocking case this additional
reference is not needed, however we still reference io to figure out
whether to wait for completion or not. This is wrong and will lead to
use-after-free. Fix it by storing blocking information in separate
variable.

This was spotted by KASAN when running generic/208 fstest.

Signed-off-by: Lukas Czerner <[email protected]>
Reported-by: Zorro Lang <[email protected]>
Signed-off-by: Miklos Szeredi <[email protected]>
Fixes: 744742d692e3 ("fuse: Add reference counting for fuse_io_priv")
Cc: <[email protected]> # v4.6
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/fuse/file.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2900,10 +2900,12 @@ fuse_direct_IO(struct kiocb *iocb, struc
}

if (io->async) {
+ bool blocking = io->blocking;
+
fuse_aio_complete(io, ret < 0 ? ret : 0, -1);

/* we have a non-extending, async request, so return */
- if (!io->blocking)
+ if (!blocking)
return -EIOCBQUEUED;

wait_for_completion(&wait);



2018-11-19 17:27:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 82/83] KVM: arm64: Fix caching of host MDCR_EL2 value

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mark Rutland <[email protected]>

commit da5a3ce66b8bb51b0ea8a89f42aac153903f90fb upstream.

At boot time, KVM stashes the host MDCR_EL2 value, but only does this
when the kernel is not running in hyp mode (i.e. is non-VHE). In these
cases, the stashed value of MDCR_EL2.HPMN happens to be zero, which can
lead to CONSTRAINED UNPREDICTABLE behaviour.

Since we use this value to derive the MDCR_EL2 value when switching
to/from a guest, after a guest have been run, the performance counters
do not behave as expected. This has been observed to result in accesses
via PMXEVTYPER_EL0 and PMXEVCNTR_EL0 not affecting the relevant
counters, resulting in events not being counted. In these cases, only
the fixed-purpose cycle counter appears to work as expected.

Fix this by always stashing the host MDCR_EL2 value, regardless of VHE.

Cc: Christopher Dall <[email protected]>
Cc: James Morse <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Fixes: 1e947bad0b63b351 ("arm64: KVM: Skip HYP setup when already running in HYP")
Tested-by: Robin Murphy <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/arm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -1092,8 +1092,6 @@ static void cpu_init_hyp_mode(void *dumm

__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
__cpu_init_stage2();
-
- kvm_arm_init_debug();
}

static void cpu_hyp_reinit(void)
@@ -1108,6 +1106,8 @@ static void cpu_hyp_reinit(void)
if (__hyp_get_vectors() == hyp_default_vectors)
cpu_init_hyp_mode(NULL);
}
+
+ kvm_arm_init_debug();
}

static void cpu_hyp_reset(void)



2018-11-19 17:27:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 74/83] configfs: replace strncpy with memcpy

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Guenter Roeck <[email protected]>

commit 1823342a1f2b47a4e6f5667f67cd28ab6bc4d6cd upstream.

gcc 8.1.0 complains:

fs/configfs/symlink.c:67:3: warning:
'strncpy' output truncated before terminating nul copying as many
bytes from a string as its length
fs/configfs/symlink.c: In function 'configfs_get_link':
fs/configfs/symlink.c:63:13: note: length computed here

Using strncpy() is indeed less than perfect since the length of data to
be copied has already been determined with strlen(). Replace strncpy()
with memcpy() to address the warning and optimize the code a little.

Signed-off-by: Guenter Roeck <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Nobuhiro Iwamatsu <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/configfs/symlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/configfs/symlink.c
+++ b/fs/configfs/symlink.c
@@ -64,7 +64,7 @@ static void fill_item_path(struct config

/* back up enough to print this bus id with '/' */
length -= cur;
- strncpy(buffer + length,config_item_name(p),cur);
+ memcpy(buffer + length, config_item_name(p), cur);
*(buffer + --length) = '/';
}
}



2018-11-19 17:28:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 81/83] drm/i915/execlists: Force write serialisation into context image vs execution

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Chris Wilson <[email protected]>

commit 0a823e8fd4fd67726697854578f3584ee3a49b1d upstream.

Ensure that the writes into the context image are completed prior to the
register mmio to trigger execution. Although previously we were assured
by the SDM that all writes are flushed before an uncached memory
transaction (our mmio write to submit the context to HW for execution),
we have empirical evidence to believe that this is not actually the
case.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108315
References: https://bugs.freedesktop.org/show_bug.cgi?id=106887
Signed-off-by: Chris Wilson <[email protected]>
Cc: Mika Kuoppala <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Acked-by: Mika Kuoppala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Cc: [email protected]
(cherry picked from commit 987abd5c62f92ee4970b45aa077f47949974e615)
Signed-off-by: Joonas Lahtinen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/gpu/drm/i915/intel_lrc.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -368,7 +368,8 @@ static u64 execlists_update_context(stru

reg_state[CTX_RING_TAIL+1] = intel_ring_offset(rq->ring, rq->tail);

- /* True 32b PPGTT with dynamic page allocation: update PDP
+ /*
+ * True 32b PPGTT with dynamic page allocation: update PDP
* registers and point the unallocated PDPs to scratch page.
* PML4 is allocated during ppgtt init, so this is not needed
* in 48-bit mode.
@@ -376,6 +377,17 @@ static u64 execlists_update_context(stru
if (ppgtt && !USES_FULL_48BIT_PPGTT(ppgtt->base.dev))
execlists_update_context_pdps(ppgtt, reg_state);

+ /*
+ * Make sure the context image is complete before we submit it to HW.
+ *
+ * Ostensibly, writes (including the WCB) should be flushed prior to
+ * an uncached write such as our mmio register access, the empirical
+ * evidence (esp. on Braswell) suggests that the WC write into memory
+ * may not be visible to the HW prior to the completion of the UC
+ * register write and that we may begin execution from the context
+ * before its image is complete leading to invalid PD chasing.
+ */
+ wmb();
return ce->lrc_desc;
}




2018-11-19 17:29:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 33/83] xtensa: fix boot parameters address translation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Max Filippov <[email protected]>

commit 40dc948f234b73497c3278875eb08a01d5854d3f upstream.

The bootloader may pass physical address of the boot parameters structure
to the MMUv3 kernel in the register a2. Code in the _SetupMMU block in
the arch/xtensa/kernel/head.S is supposed to map that physical address to
the virtual address in the configured virtual memory layout.

This code haven't been updated when additional 256+256 and 512+512
memory layouts were introduced and it may produce wrong addresses when
used with these layouts.

Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/xtensa/kernel/head.S | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/arch/xtensa/kernel/head.S
+++ b/arch/xtensa/kernel/head.S
@@ -88,9 +88,12 @@ _SetupMMU:
initialize_mmu
#if defined(CONFIG_MMU) && XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY
rsr a2, excsave1
- movi a3, 0x08000000
+ movi a3, XCHAL_KSEG_PADDR
+ bltu a2, a3, 1f
+ sub a2, a2, a3
+ movi a3, XCHAL_KSEG_SIZE
bgeu a2, a3, 1f
- movi a3, 0xd0000000
+ movi a3, XCHAL_KSEG_CACHED_VADDR
add a2, a2, a3
wsr a2, excsave1
1:



2018-11-19 17:29:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 36/83] clk: rockchip: Fix static checker warning in rockchip_ddrclk_get_parent call

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Enric Balletbo i Serra <[email protected]>

commit 665636b2940d0897c4130253467f5e8c42eea392 upstream.

Fixes the signedness bug returning '(-22)' on the return type by removing the
sanity checker in rockchip_ddrclk_get_parent(). The function should return
and unsigned value only and it's safe to remove the sanity checker as the
core functions that call get_parent like clk_core_get_parent_by_index already
ensures the validity of the clk index returned (index >= core->num_parents).

Fixes: a4f182bf81f18 ("clk: rockchip: add new clock-type for the ddrclk")
Cc: [email protected]
Signed-off-by: Enric Balletbo i Serra <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Signed-off-by: Heiko Stuebner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clk/rockchip/clk-ddr.c | 4 ----
1 file changed, 4 deletions(-)

--- a/drivers/clk/rockchip/clk-ddr.c
+++ b/drivers/clk/rockchip/clk-ddr.c
@@ -80,16 +80,12 @@ static long rockchip_ddrclk_sip_round_ra
static u8 rockchip_ddrclk_get_parent(struct clk_hw *hw)
{
struct rockchip_ddrclk *ddrclk = to_rockchip_ddrclk_hw(hw);
- int num_parents = clk_hw_get_num_parents(hw);
u32 val;

val = clk_readl(ddrclk->reg_base +
ddrclk->mux_offset) >> ddrclk->mux_shift;
val &= GENMASK(ddrclk->mux_width - 1, 0);

- if (val >= num_parents)
- return -EINVAL;
-
return val;
}




2018-11-19 17:31:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 68/83] mount: Prevent MNT_DETACH from disconnecting locked mounts

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric W. Biederman <[email protected]>

commit 9c8e0a1b683525464a2abe9fb4b54404a50ed2b4 upstream.

Timothy Baldwin <[email protected]> wrote:
> As per mount_namespaces(7) unprivileged users should not be able to look under mount points:
>
> Mounts that come as a single unit from more privileged mount are locked
> together and may not be separated in a less privileged mount namespace.
>
> However they can:
>
> 1. Create a mount namespace.
> 2. In the mount namespace open a file descriptor to the parent of a mount point.
> 3. Destroy the mount namespace.
> 4. Use the file descriptor to look under the mount point.
>
> I have reproduced this with Linux 4.16.18 and Linux 4.18-rc8.
>
> The setup:
>
> $ sudo sysctl kernel.unprivileged_userns_clone=1
> kernel.unprivileged_userns_clone = 1
> $ mkdir -p A/B/Secret
> $ sudo mount -t tmpfs hide A/B
>
>
> "Secret" is indeed hidden as expected:
>
> $ ls -lR A
> A:
> total 0
> drwxrwxrwt 2 root root 40 Feb 12 21:08 B
>
> A/B:
> total 0
>
>
> The attack revealing "Secret":
>
> $ unshare -Umr sh -c "exec unshare -m ls -lR /proc/self/fd/4/ 4<A"
> /proc/self/fd/4/:
> total 0
> drwxr-xr-x 3 root root 60 Feb 12 21:08 B
>
> /proc/self/fd/4/B:
> total 0
> drwxr-xr-x 2 root root 40 Feb 12 21:08 Secret
>
> /proc/self/fd/4/B/Secret:
> total 0

I tracked this down to put_mnt_ns running passing UMOUNT_SYNC and
disconnecting all of the mounts in a mount namespace. Fix this by
factoring drop_mounts out of drop_collected_mounts and passing
0 instead of UMOUNT_SYNC.

There are two possible behavior differences that result from this.
- No longer setting UMOUNT_SYNC will no longer set MNT_SYNC_UMOUNT on
the vfsmounts being unmounted. This effects the lazy rcu walk by
kicking the walk out of rcu mode and forcing it to be a non-lazy
walk.
- No longer disconnecting locked mounts will keep some mounts around
longer as they stay because the are locked to other mounts.

There are only two users of drop_collected mounts: audit_tree.c and
put_mnt_ns.

In audit_tree.c the mounts are private and there are no rcu lazy walks
only calls to iterate_mounts. So the changes should have no effect
except for a small timing effect as the connected mounts are disconnected.

In put_mnt_ns there may be references from process outside the mount
namespace to the mounts. So the mounts remaining connected will
be the bug fix that is needed. That rcu walks are allowed to continue
appears not to be a problem especially as the rcu walk change was about
an implementation detail not about semantics.

Cc: [email protected]
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Reported-by: Timothy Baldwin <[email protected]>
Tested-by: Timothy Baldwin <[email protected]>
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namespace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1848,7 +1848,7 @@ void drop_collected_mounts(struct vfsmou
{
namespace_lock();
lock_mount_hash();
- umount_tree(real_mount(mnt), UMOUNT_SYNC);
+ umount_tree(real_mount(mnt), 0);
unlock_mount_hash();
namespace_unlock();
}



2018-11-19 17:31:53

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 61/83] ext4: fix possible leak of sbi->s_group_desc_leak in error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <[email protected]>

commit 9e463084cdb22e0b56b2dfbc50461020409a5fd3 upstream.

Fixes: bfe0a5f47ada ("ext4: add more mount time checks of the superblock")
Reported-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.18
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/super.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3897,6 +3897,14 @@ static int ext4_fill_super(struct super_
sbi->s_groups_count = blocks_count;
sbi->s_blockfile_groups = min_t(ext4_group_t, sbi->s_groups_count,
(EXT4_MAX_BLOCK_FILE_PHYS / EXT4_BLOCKS_PER_GROUP(sb)));
+ if (((u64)sbi->s_groups_count * sbi->s_inodes_per_group) !=
+ le32_to_cpu(es->s_inodes_count)) {
+ ext4_msg(sb, KERN_ERR, "inodes count not valid: %u vs %llu",
+ le32_to_cpu(es->s_inodes_count),
+ ((u64)sbi->s_groups_count * sbi->s_inodes_per_group));
+ ret = -EINVAL;
+ goto failed_mount;
+ }
db_count = (sbi->s_groups_count + EXT4_DESC_PER_BLOCK(sb) - 1) /
EXT4_DESC_PER_BLOCK(sb);
if (ext4_has_feature_meta_bg(sb)) {
@@ -3916,14 +3924,6 @@ static int ext4_fill_super(struct super_
ret = -ENOMEM;
goto failed_mount;
}
- if (((u64)sbi->s_groups_count * sbi->s_inodes_per_group) !=
- le32_to_cpu(es->s_inodes_count)) {
- ext4_msg(sb, KERN_ERR, "inodes count not valid: %u vs %llu",
- le32_to_cpu(es->s_inodes_count),
- ((u64)sbi->s_groups_count * sbi->s_inodes_per_group));
- ret = -EINVAL;
- goto failed_mount;
- }

bgl_lock_init(sbi->s_blockgroup_lock);




2018-11-19 17:32:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 31/83] xtensa: add NOTES section to the linker script

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Max Filippov <[email protected]>

commit 4119ba211bc4f1bf638f41e50b7a0f329f58aa16 upstream.

This section collects all source .note.* sections together in the
vmlinux image. Without it .note.Linux section may be placed at address
0, while the rest of the kernel is at its normal address, resulting in a
huge vmlinux.bin image that may not be linked into the xtensa Image.elf.

Cc: [email protected]
Signed-off-by: Max Filippov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/xtensa/boot/Makefile | 2 +-
arch/xtensa/kernel/vmlinux.lds.S | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

--- a/arch/xtensa/boot/Makefile
+++ b/arch/xtensa/boot/Makefile
@@ -31,7 +31,7 @@ $(bootdir-y): $(addprefix $(obj)/,$(subd
$(addprefix $(obj)/,$(host-progs))
$(Q)$(MAKE) $(build)=$(obj)/$@ $(MAKECMDGOALS)

-OBJCOPYFLAGS = --strip-all -R .comment -R .note.gnu.build-id -O binary
+OBJCOPYFLAGS = --strip-all -R .comment -R .notes -O binary

vmlinux.bin: vmlinux FORCE
$(call if_changed,objcopy)
--- a/arch/xtensa/kernel/vmlinux.lds.S
+++ b/arch/xtensa/kernel/vmlinux.lds.S
@@ -109,6 +109,7 @@ SECTIONS
.fixup : { *(.fixup) }

EXCEPTION_TABLE(16)
+ NOTES
/* Data section */

_sdata = .;



2018-11-19 17:32:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 66/83] mount: Retest MNT_LOCKED in do_umount

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric W. Biederman <[email protected]>

commit 25d202ed820ee347edec0bf3bf553544556bf64b upstream.

It was recently pointed out that the one instance of testing MNT_LOCKED
outside of the namespace_sem is in ksys_umount.

Fix that by adding a test inside of do_umount with namespace_sem and
the mount_lock held. As it helps to fail fails the existing test is
maintained with an additional comment pointing out that it may be racy
because the locks are not held.

Cc: [email protected]
Reported-by: Al Viro <[email protected]>
Fixes: 5ff9d8a65ce8 ("vfs: Lock in place mounts from more privileged users")
Signed-off-by: "Eric W. Biederman" <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/namespace.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1599,8 +1599,13 @@ static int do_umount(struct mount *mnt,

namespace_lock();
lock_mount_hash();
- event++;

+ /* Recheck MNT_LOCKED with the locks held */
+ retval = -EINVAL;
+ if (mnt->mnt.mnt_flags & MNT_LOCKED)
+ goto out;
+
+ event++;
if (flags & MNT_DETACH) {
if (!list_empty(&mnt->mnt_list))
umount_tree(mnt, UMOUNT_PROPAGATE);
@@ -1614,6 +1619,7 @@ static int do_umount(struct mount *mnt,
retval = 0;
}
}
+out:
unlock_mount_hash();
namespace_unlock();
return retval;
@@ -1704,7 +1710,7 @@ SYSCALL_DEFINE2(umount, char __user *, n
goto dput_and_out;
if (!check_mnt(mnt))
goto dput_and_out;
- if (mnt->mnt.mnt_flags & MNT_LOCKED)
+ if (mnt->mnt.mnt_flags & MNT_LOCKED) /* Check optimistically */
goto dput_and_out;
retval = -EPERM;
if (flags & MNT_FORCE && !capable(CAP_SYS_ADMIN))



2018-11-19 17:32:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 62/83] ext4: fix possible leak of s_journal_flag_rwsem in error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit af18e35bfd01e6d65a5e3ef84ffe8b252d1628c5 upstream.

Fixes: c8585c6fcaf2 ("ext4: fix races between changing inode journal ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 4.7
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/super.c | 1 +
1 file changed, 1 insertion(+)

--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4305,6 +4305,7 @@ failed_mount6:
percpu_counter_destroy(&sbi->s_freeinodes_counter);
percpu_counter_destroy(&sbi->s_dirs_counter);
percpu_counter_destroy(&sbi->s_dirtyclusters_counter);
+ percpu_free_rwsem(&sbi->s_journal_flag_rwsem);
failed_mount5:
ext4_ext_release(sb);
ext4_release_system_zone(sb);



2018-11-19 17:32:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 52/83] clockevents/drivers/i8253: Add support for PIT shutdown quirk

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Kelley <[email protected]>

commit 35b69a420bfb56b7b74cb635ea903db05e357bec upstream.

Add support for platforms where pit_shutdown() doesn't work because of a
quirk in the PIT emulation. On these platforms setting the counter register
to zero causes the PIT to start running again, negating the shutdown.

Provide a global variable that controls whether the counter register is
zero'ed, which platform specific code can override.

Signed-off-by: Michael Kelley <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: vkuznets <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: "[email protected]" <[email protected]>
Cc: KY Srinivasan <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/clocksource/i8253.c | 14 ++++++++++++--
include/linux/i8253.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)

--- a/drivers/clocksource/i8253.c
+++ b/drivers/clocksource/i8253.c
@@ -19,6 +19,13 @@
DEFINE_RAW_SPINLOCK(i8253_lock);
EXPORT_SYMBOL(i8253_lock);

+/*
+ * Handle PIT quirk in pit_shutdown() where zeroing the counter register
+ * restarts the PIT, negating the shutdown. On platforms with the quirk,
+ * platform specific code can set this to false.
+ */
+bool i8253_clear_counter_on_shutdown __ro_after_init = true;
+
#ifdef CONFIG_CLKSRC_I8253
/*
* Since the PIT overflows every tick, its not very useful
@@ -108,8 +115,11 @@ static int pit_shutdown(struct clock_eve
raw_spin_lock(&i8253_lock);

outb_p(0x30, PIT_MODE);
- outb_p(0, PIT_CH0);
- outb_p(0, PIT_CH0);
+
+ if (i8253_clear_counter_on_shutdown) {
+ outb_p(0, PIT_CH0);
+ outb_p(0, PIT_CH0);
+ }

raw_spin_unlock(&i8253_lock);
return 0;
--- a/include/linux/i8253.h
+++ b/include/linux/i8253.h
@@ -21,6 +21,7 @@
#define PIT_LATCH ((PIT_TICK_RATE + HZ/2) / HZ)

extern raw_spinlock_t i8253_lock;
+extern bool i8253_clear_counter_on_shutdown;
extern struct clock_event_device i8253_clockevent;
extern void clockevent_i8253_init(bool oneshot);




2018-11-19 17:32:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 60/83] ext4: avoid possible double brelse() in add_new_gdb() on error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Theodore Ts'o <[email protected]>

commit 4f32c38b4662312dd3c5f113d8bdd459887fb773 upstream.

Fixes: b40971426a83 ("ext4: add error checking to calls to ...")
Reported-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 2.6.38
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 1 +
1 file changed, 1 insertion(+)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -844,6 +844,7 @@ static int add_new_gdb(handle_t *handle,
err = ext4_handle_dirty_metadata(handle, NULL, gdb_bh);
if (unlikely(err)) {
ext4_std_error(sb, err);
+ iloc.bh = NULL;
goto exit_inode;
}
brelse(dind);



2018-11-19 17:32:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 48/83] termios, tty/tty_baudrate.c: fix buffer overrun

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: H. Peter Anvin <[email protected]>

commit 991a25194097006ec1e0d2e0814ff920e59e3465 upstream.

On architectures with CBAUDEX == 0 (Alpha and PowerPC), the code in tty_baudrate.c does
not do any limit checking on the tty_baudrate[] array, and in fact a
buffer overrun is possible on both architectures. Add a limit check to
prevent that situation.

This will be followed by a much bigger cleanup/simplification patch.

Signed-off-by: H. Peter Anvin (Intel) <[email protected]>
Requested-by: Cc: Johan Hovold <[email protected]>
Cc: Jiri Slaby <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Richard Henderson <[email protected]>
Cc: Ivan Kokshaysky <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Kate Stewart <[email protected]>
Cc: Philippe Ombredanne <[email protected]>
Cc: Eugene Syromiatnikov <[email protected]>
Cc: Alan Cox <[email protected]>
Cc: stable <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/tty_ioctl.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/tty/tty_ioctl.c
+++ b/drivers/tty/tty_ioctl.c
@@ -325,7 +325,7 @@ speed_t tty_termios_baud_rate(struct kte
else
cbaud += 15;
}
- return baud_table[cbaud];
+ return cbaud >= n_baud_table ? 0 : baud_table[cbaud];
}
EXPORT_SYMBOL(tty_termios_baud_rate);

@@ -361,7 +361,7 @@ speed_t tty_termios_input_baud_rate(stru
else
cbaud += 15;
}
- return baud_table[cbaud];
+ return cbaud >= n_baud_table ? 0 : baud_table[cbaud];
#else
return tty_termios_baud_rate(termios);
#endif



2018-11-19 17:33:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 55/83] ext4: add missing brelse() add_new_gdb_meta_bg()s error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit 61a9c11e5e7a0dab5381afa5d9d4dd5ebf18f7a0 upstream.

Fixes: 01f795f9e0d6 ("ext4: add online resizing support for meta_bg ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.7
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -899,6 +899,7 @@ static int add_new_gdb_meta_bg(struct su
sizeof(struct buffer_head *),
GFP_NOFS);
if (!n_group_desc) {
+ brelse(gdb_bh);
err = -ENOMEM;
ext4_warning(sb, "not enough memory for %lu groups",
gdb_num + 1);
@@ -914,8 +915,6 @@ static int add_new_gdb_meta_bg(struct su
kvfree(o_group_desc);
BUFFER_TRACE(gdb_bh, "get_write_access");
err = ext4_journal_get_write_access(handle, gdb_bh);
- if (unlikely(err))
- brelse(gdb_bh);
return err;
}




2018-11-19 17:33:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 56/83] ext4: avoid potential extra brelse in setup_new_flex_group_blocks()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit 9e4028935cca3f9ef9b6a90df9da6f1f94853536 upstream.

Currently bh is set to NULL only during first iteration of for cycle,
then this pointer is not cleared after end of using.
Therefore rollback after errors can lead to extra brelse(bh) call,
decrements bh counter and later trigger an unexpected warning in __brelse()

Patch moves brelse() calls in body of cycle to exclude requirement of
brelse() call in rollback.

Fixes: 33afdcc5402d ("ext4: add a function which sets up group blocks ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.3+
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/resize.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)

--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -590,7 +590,6 @@ handle_bb:
bh = bclean(handle, sb, block);
if (IS_ERR(bh)) {
err = PTR_ERR(bh);
- bh = NULL;
goto out;
}
overhead = ext4_group_overhead_blocks(sb, group);
@@ -602,9 +601,9 @@ handle_bb:
ext4_mark_bitmap_end(group_data[i].blocks_count,
sb->s_blocksize * 8, bh->b_data);
err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ brelse(bh);
if (err)
goto out;
- brelse(bh);

handle_ib:
if (bg_flags[i] & EXT4_BG_INODE_UNINIT)
@@ -619,18 +618,16 @@ handle_ib:
bh = bclean(handle, sb, block);
if (IS_ERR(bh)) {
err = PTR_ERR(bh);
- bh = NULL;
goto out;
}

ext4_mark_bitmap_end(EXT4_INODES_PER_GROUP(sb),
sb->s_blocksize * 8, bh->b_data);
err = ext4_handle_dirty_metadata(handle, NULL, bh);
+ brelse(bh);
if (err)
goto out;
- brelse(bh);
}
- bh = NULL;

/* Mark group tables in block bitmap */
for (j = 0; j < GROUP_TABLE_COUNT; j++) {
@@ -661,7 +658,6 @@ handle_ib:
}

out:
- brelse(bh);
err2 = ext4_journal_stop(handle);
if (err2 && !err)
err = err2;



2018-11-19 17:33:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 45/83] netfilter: conntrack: fix calculation of next bucket number in early_drop

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Khoruzhick <[email protected]>

commit f393808dc64149ccd0e5a8427505ba2974a59854 upstream.

If there's no entry to drop in bucket that corresponds to the hash,
early_drop() should look for it in other buckets. But since it increments
hash instead of bucket number, it actually looks in the same bucket 8
times: hsize is 16k by default (14 bits) and hash is 32-bit value, so
reciprocal_scale(hash, hsize) returns the same value for hash..hash+7 in
most cases.

Fix it by increasing bucket number instead of hash and rename _hash
to bucket to avoid future confusion.

Fixes: 3e86638e9a0b ("netfilter: conntrack: consider ct netns in early_drop logic")
Cc: <[email protected]> # v4.7+
Signed-off-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/netfilter/nf_conntrack_core.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)

--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -918,19 +918,22 @@ static unsigned int early_drop_list(stru
return drops;
}

-static noinline int early_drop(struct net *net, unsigned int _hash)
+static noinline int early_drop(struct net *net, unsigned int hash)
{
- unsigned int i;
+ unsigned int i, bucket;

for (i = 0; i < NF_CT_EVICTION_RANGE; i++) {
struct hlist_nulls_head *ct_hash;
- unsigned int hash, hsize, drops;
+ unsigned int hsize, drops;

rcu_read_lock();
nf_conntrack_get_ht(&ct_hash, &hsize);
- hash = reciprocal_scale(_hash++, hsize);
+ if (!i)
+ bucket = reciprocal_scale(hash, hsize);
+ else
+ bucket = (bucket + 1) % hsize;

- drops = early_drop_list(net, &ct_hash[hash]);
+ drops = early_drop_list(net, &ct_hash[bucket]);
rcu_read_unlock();

if (drops) {



2018-11-19 17:33:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 44/83] mm: thp: relax __GFP_THISNODE for MADV_HUGEPAGE mappings

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrea Arcangeli <[email protected]>

commit ac5b2c18911ffe95c08d69273917f90212cf5659 upstream.

THP allocation might be really disruptive when allocated on NUMA system
with the local node full or hard to reclaim. Stefan has posted an
allocation stall report on 4.12 based SLES kernel which suggests the
same issue:

kvm: page allocation stalls for 194572ms, order:9, mode:0x4740ca(__GFP_HIGHMEM|__GFP_IO|__GFP_FS|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL|__GFP_THISNODE|__GFP_MOVABLE|__GFP_DIRECT_RECLAIM), nodemask=(null)
kvm cpuset=/ mems_allowed=0-1
CPU: 10 PID: 84752 Comm: kvm Tainted: G W 4.12.0+98-ph <a href="/view.php?id=1" title="[geschlossen] Integration Ramdisk" class="resolved">0000001</a> SLE15 (unreleased)
Hardware name: Supermicro SYS-1029P-WTRT/X11DDW-NT, BIOS 2.0 12/05/2017
Call Trace:
dump_stack+0x5c/0x84
warn_alloc+0xe0/0x180
__alloc_pages_slowpath+0x820/0xc90
__alloc_pages_nodemask+0x1cc/0x210
alloc_pages_vma+0x1e5/0x280
do_huge_pmd_wp_page+0x83f/0xf00
__handle_mm_fault+0x93d/0x1060
handle_mm_fault+0xc6/0x1b0
__do_page_fault+0x230/0x430
do_page_fault+0x2a/0x70
page_fault+0x7b/0x80
[...]
Mem-Info:
active_anon:126315487 inactive_anon:1612476 isolated_anon:5
active_file:60183 inactive_file:245285 isolated_file:0
unevictable:15657 dirty:286 writeback:1 unstable:0
slab_reclaimable:75543 slab_unreclaimable:2509111
mapped:81814 shmem:31764 pagetables:370616 bounce:0
free:32294031 free_pcp:6233 free_cma:0
Node 0 active_anon:254680388kB inactive_anon:1112760kB active_file:240648kB inactive_file:981168kB unevictable:13368kB isolated(anon):0kB isolated(file):0kB mapped:280240kB dirty:1144kB writeback:0kB shmem:95832kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 81225728kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no
Node 1 active_anon:250583072kB inactive_anon:5337144kB active_file:84kB inactive_file:0kB unevictable:49260kB isolated(anon):20kB isolated(file):0kB mapped:47016kB dirty:0kB writeback:4kB shmem:31224kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 31897600kB writeback_tmp:0kB unstable:0kB all_unreclaimable? no

The defrag mode is "madvise" and from the above report it is clear that
the THP has been allocated for MADV_HUGEPAGA vma.

Andrea has identified that the main source of the problem is
__GFP_THISNODE usage:

: The problem is that direct compaction combined with the NUMA
: __GFP_THISNODE logic in mempolicy.c is telling reclaim to swap very
: hard the local node, instead of failing the allocation if there's no
: THP available in the local node.
:
: Such logic was ok until __GFP_THISNODE was added to the THP allocation
: path even with MPOL_DEFAULT.
:
: The idea behind the __GFP_THISNODE addition, is that it is better to
: provide local memory in PAGE_SIZE units than to use remote NUMA THP
: backed memory. That largely depends on the remote latency though, on
: threadrippers for example the overhead is relatively low in my
: experience.
:
: The combination of __GFP_THISNODE and __GFP_DIRECT_RECLAIM results in
: extremely slow qemu startup with vfio, if the VM is larger than the
: size of one host NUMA node. This is because it will try very hard to
: unsuccessfully swapout get_user_pages pinned pages as result of the
: __GFP_THISNODE being set, instead of falling back to PAGE_SIZE
: allocations and instead of trying to allocate THP on other nodes (it
: would be even worse without vfio type1 GUP pins of course, except it'd
: be swapping heavily instead).

Fix this by removing __GFP_THISNODE for THP requests which are
requesting the direct reclaim. This effectivelly reverts 5265047ac301
on the grounds that the zone/node reclaim was known to be disruptive due
to premature reclaim when there was memory free. While it made sense at
the time for HPC workloads without NUMA awareness on rare machines, it
was ultimately harmful in the majority of cases. The existing behaviour
is similar, if not as widespare as it applies to a corner case but
crucially, it cannot be tuned around like zone_reclaim_mode can. The
default behaviour should always be to cause the least harm for the
common case.

If there are specialised use cases out there that want zone_reclaim_mode
in specific cases, then it can be built on top. Longterm we should
consider a memory policy which allows for the node reclaim like behavior
for the specific memory ranges which would allow a

[1] http://lkml.kernel.org/r/[email protected]

Mel said:

: Both patches look correct to me but I'm responding to this one because
: it's the fix. The change makes sense and moves further away from the
: severe stalling behaviour we used to see with both THP and zone reclaim
: mode.
:
: I put together a basic experiment with usemem configured to reference a
: buffer multiple times that is 80% the size of main memory on a 2-socket
: box with symmetric node sizes and defrag set to "always". The defrag
: setting is not the default but it would be functionally similar to
: accessing a buffer with madvise(MADV_HUGEPAGE). Usemem is configured to
: reference the buffer multiple times and while it's not an interesting
: workload, it would be expected to complete reasonably quickly as it fits
: within memory. The results were;
:
: usemem
: vanilla noreclaim-v1
: Amean Elapsd-1 42.78 ( 0.00%) 26.87 ( 37.18%)
: Amean Elapsd-3 27.55 ( 0.00%) 7.44 ( 73.00%)
: Amean Elapsd-4 5.72 ( 0.00%) 5.69 ( 0.45%)
:
: This shows the elapsed time in seconds for 1 thread, 3 threads and 4
: threads referencing buffers 80% the size of memory. With the patches
: applied, it's 37.18% faster for the single thread and 73% faster with two
: threads. Note that 4 threads showing little difference does not indicate
: the problem is related to thread counts. It's simply the case that 4
: threads gets spread so their workload mostly fits in one node.
:
: The overall view from /proc/vmstats is more startling
:
: 4.19.0-rc1 4.19.0-rc1
: vanillanoreclaim-v1r1
: Minor Faults 35593425 708164
: Major Faults 484088 36
: Swap Ins 3772837 0
: Swap Outs 3932295 0
:
: Massive amounts of swap in/out without the patch
:
: Direct pages scanned 6013214 0
: Kswapd pages scanned 0 0
: Kswapd pages reclaimed 0 0
: Direct pages reclaimed 4033009 0
:
: Lots of reclaim activity without the patch
:
: Kswapd efficiency 100% 100%
: Kswapd velocity 0.000 0.000
: Direct efficiency 67% 100%
: Direct velocity 11191.956 0.000
:
: Mostly from direct reclaim context as you'd expect without the patch.
:
: Page writes by reclaim 3932314.000 0.000
: Page writes file 19 0
: Page writes anon 3932295 0
: Page reclaim immediate 42336 0
:
: Writes from reclaim context is never good but the patch eliminates it.
:
: We should never have default behaviour to thrash the system for such a
: basic workload. If zone reclaim mode behaviour is ever desired but on a
: single task instead of a global basis then the sensible option is to build
: a mempolicy that enforces that behaviour.

This was a severe regression compared to previous kernels that made
important workloads unusable and it starts when __GFP_THISNODE was
added to THP allocations under MADV_HUGEPAGE. It is not a significant
risk to go to the previous behavior before __GFP_THISNODE was added, it
worked like that for years.

This was simply an optimization to some lucky workloads that can fit in
a single node, but it ended up breaking the VM for others that can't
possibly fit in a single node, so going back is safe.

[[email protected]: rewrote the changelog based on the one from Andrea]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 5265047ac301 ("mm, thp: really limit transparent hugepage allocation to local node")
Signed-off-by: Andrea Arcangeli <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
Reported-by: Stefan Priebe <[email protected]>
Debugged-by: Andrea Arcangeli <[email protected]>
Reported-by: Alex Williamson <[email protected]>
Reviewed-by: Mel Gorman <[email protected]>
Tested-by: Mel Gorman <[email protected]>
Cc: Zi Yan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: <[email protected]> [4.1+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/mempolicy.c | 32 ++++++++++++++++++++++++++++++--
1 file changed, 30 insertions(+), 2 deletions(-)

--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2027,8 +2027,36 @@ retry_cpuset:
nmask = policy_nodemask(gfp, pol);
if (!nmask || node_isset(hpage_node, *nmask)) {
mpol_cond_put(pol);
- page = __alloc_pages_node(hpage_node,
- gfp | __GFP_THISNODE, order);
+ /*
+ * We cannot invoke reclaim if __GFP_THISNODE
+ * is set. Invoking reclaim with
+ * __GFP_THISNODE set, would cause THP
+ * allocations to trigger heavy swapping
+ * despite there may be tons of free memory
+ * (including potentially plenty of THP
+ * already available in the buddy) on all the
+ * other NUMA nodes.
+ *
+ * At most we could invoke compaction when
+ * __GFP_THISNODE is set (but we would need to
+ * refrain from invoking reclaim even if
+ * compaction returned COMPACT_SKIPPED because
+ * there wasn't not enough memory to succeed
+ * compaction). For now just avoid
+ * __GFP_THISNODE instead of limiting the
+ * allocation path to a strict and single
+ * compaction invocation.
+ *
+ * Supposedly if direct reclaim was enabled by
+ * the caller, the app prefers THP regardless
+ * of the node it comes from so this would be
+ * more desiderable behavior than only
+ * providing THP originated from the local
+ * node in such case.
+ */
+ if (!(gfp & __GFP_DIRECT_RECLAIM))
+ gfp |= __GFP_THISNODE;
+ page = __alloc_pages_node(hpage_node, gfp, order);
goto out;
}
}



2018-11-19 17:33:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 65/83] ext4: fix buffer leak in __ext4_read_dirblock() on error path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vasily Averin <[email protected]>

commit de59fae0043f07de5d25e02ca360f7d57bfa5866 upstream.

Fixes: dc6982ff4db1 ("ext4: refactor code to read directory blocks ...")
Signed-off-by: Vasily Averin <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Cc: [email protected] # 3.9
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/ext4/namei.c | 1 +
1 file changed, 1 insertion(+)

--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -124,6 +124,7 @@ static struct buffer_head *__ext4_read_d
if (!is_dx_block && type == INDEX) {
ext4_error_inode(inode, func, line, block,
"directory leaf block found instead of index block");
+ brelse(bh);
return ERR_PTR(-EFSCORRUPTED);
}
if (!ext4_has_metadata_csum(inode->i_sb) ||



2018-11-19 17:34:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 30/83] MIPS: Loongson-3: Fix BRIDGE irq delivery problem

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 360fe725f8849aaddc53475fef5d4a0c439b05ae ]

After commit e509bd7da149dc349160 ("genirq: Allow migration of chained
interrupts by installing default action") Loongson-3 fails at here:

setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction);

This is because both chained_action and cascade_irqaction don't have
IRQF_SHARED flag. This will cause Loongson-3 resume fails because HPET
timer interrupt can't be delivered during S3. So we set the irqchip of
the chained irq to loongson_irq_chip which doesn't disable the chained
irq in CP0.Status.

Cc: [email protected]
Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Patchwork: https://patchwork.linux-mips.org/patch/20434/
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Cc: Huacai Chen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/include/asm/mach-loongson64/irq.h | 2 +-
arch/mips/loongson64/loongson-3/irq.c | 13 +++----------
2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/mips/include/asm/mach-loongson64/irq.h b/arch/mips/include/asm/mach-loongson64/irq.h
index d18c45c7c394..19ff9ce46c02 100644
--- a/arch/mips/include/asm/mach-loongson64/irq.h
+++ b/arch/mips/include/asm/mach-loongson64/irq.h
@@ -9,7 +9,7 @@
#define MIPS_CPU_IRQ_BASE 56

#define LOONGSON_UART_IRQ (MIPS_CPU_IRQ_BASE + 2) /* UART */
-#define LOONGSON_HT1_IRQ (MIPS_CPU_IRQ_BASE + 3) /* HT1 */
+#define LOONGSON_BRIDGE_IRQ (MIPS_CPU_IRQ_BASE + 3) /* CASCADE */
#define LOONGSON_TIMER_IRQ (MIPS_CPU_IRQ_BASE + 7) /* CPU Timer */

#define LOONGSON_HT1_CFG_BASE loongson_sysconf.ht_control_base
diff --git a/arch/mips/loongson64/loongson-3/irq.c b/arch/mips/loongson64/loongson-3/irq.c
index ec5f2b30646c..027f53e3bc81 100644
--- a/arch/mips/loongson64/loongson-3/irq.c
+++ b/arch/mips/loongson64/loongson-3/irq.c
@@ -44,12 +44,6 @@ void mach_irq_dispatch(unsigned int pending)
}
}

-static struct irqaction cascade_irqaction = {
- .handler = no_action,
- .flags = IRQF_NO_SUSPEND,
- .name = "cascade",
-};
-
static inline void mask_loongson_irq(struct irq_data *d) { }
static inline void unmask_loongson_irq(struct irq_data *d) { }

@@ -90,11 +84,10 @@ void __init mach_init_irq(void)
init_i8259_irqs();
irq_set_chip_and_handler(LOONGSON_UART_IRQ,
&loongson_irq_chip, handle_percpu_irq);
+ irq_set_chip_and_handler(LOONGSON_BRIDGE_IRQ,
+ &loongson_irq_chip, handle_percpu_irq);

- /* setup HT1 irq */
- setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction);
-
- set_c0_status(STATUSF_IP2 | STATUSF_IP6);
+ set_c0_status(STATUSF_IP2 | STATUSF_IP3 | STATUSF_IP6);
}

#ifdef CONFIG_HOTPLUG_CPU
--
2.17.1




2018-11-19 17:34:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 42/83] vhost/scsi: truncate T10 PI iov_iter to prot_bytes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Greg Edwards <[email protected]>

commit 4542d623c7134bc1738f8a68ccb6dd546f1c264f upstream.

Commands with protection information included were not truncating the
protection iov_iter to the number of protection bytes in the command.
This resulted in vhost_scsi mis-calculating the size of the protection
SGL in vhost_scsi_calc_sgls(), and including both the protection and
data SG entries in the protection SGL.

Fixes: 09b13fa8c1a1 ("vhost/scsi: Add ANY_LAYOUT support in vhost_scsi_handle_vq")
Signed-off-by: Greg Edwards <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
Fixes: 09b13fa8c1a1093e9458549ac8bb203a7c65c62a
Cc: [email protected]
Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/vhost/scsi.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -999,7 +999,8 @@ vhost_scsi_handle_vq(struct vhost_scsi *
prot_bytes = vhost32_to_cpu(vq, v_req_pi.pi_bytesin);
}
/*
- * Set prot_iter to data_iter, and advance past any
+ * Set prot_iter to data_iter and truncate it to
+ * prot_bytes, and advance data_iter past any
* preceeding prot_bytes that may be present.
*
* Also fix up the exp_data_len to reflect only the
@@ -1008,6 +1009,7 @@ vhost_scsi_handle_vq(struct vhost_scsi *
if (prot_bytes) {
exp_data_len -= prot_bytes;
prot_iter = data_iter;
+ iov_iter_truncate(&prot_iter, prot_bytes);
iov_iter_advance(&data_iter, prot_bytes);
}
tag = vhost64_to_cpu(vq, v_req_pi.tag);



2018-11-19 17:34:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 08/83] powerpc/boot: Ensure _zimage_start is a weak symbol

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Joel Stanley <[email protected]>

[ Upstream commit ee9d21b3b3583712029a0db65a4b7c081d08d3b3 ]

When building with clang crt0's _zimage_start is not marked weak, which
breaks the build when linking the kernel image:

$ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$
0000000000000058 g .text 0000000000000000 _zimage_start

ld: arch/powerpc/boot/wrapper.a(crt0.o): in function '_zimage_start':
(.text+0x58): multiple definition of '_zimage_start';
arch/powerpc/boot/pseries-head.o:(.text+0x0): first defined here

Clang requires the .weak directive to appear after the symbol is
declared. The binutils manual says:

This directive sets the weak attribute on the comma separated list of
symbol names. If the symbols do not already exist, they will be
created.

So it appears this is different with clang. The only reference I could
see for this was an OpenBSD mailing list post[1].

Changing it to be after the declaration fixes building with Clang, and
still works with GCC.

$ objdump -t arch/powerpc/boot/crt0.o |grep _zimage_start$
0000000000000058 w .text 0000000000000000 _zimage_start

Reported to clang as https://bugs.llvm.org/show_bug.cgi?id=38921

[1] https://groups.google.com/forum/#!topic/fa.openbsd.tech/PAgKKen2YCY

Signed-off-by: Joel Stanley <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/boot/crt0.S | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/powerpc/boot/crt0.S
+++ b/arch/powerpc/boot/crt0.S
@@ -47,8 +47,10 @@ p_end: .long _end
p_pstack: .long _platform_stack_top
#endif

- .weak _zimage_start
.globl _zimage_start
+ /* Clang appears to require the .weak directive to be after the symbol
+ * is defined. See https://bugs.llvm.org/show_bug.cgi?id=38921 */
+ .weak _zimage_start
_zimage_start:
.globl _zimage_start_lib
_zimage_start_lib:



2018-11-19 17:34:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 07/83] MIPS: kexec: Mark CPU offline before disabling local IRQ

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dengcheng Zhu <[email protected]>

[ Upstream commit dc57aaf95a516f70e2d527d8287a0332c481a226 ]

After changing CPU online status, it will not be sent any IPIs such as in
__flush_cache_all() on software coherency systems. Do this before disabling
local IRQ.

Signed-off-by: Dengcheng Zhu <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Patchwork: https://patchwork.linux-mips.org/patch/20571/
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/kernel/crash.c | 3 +++
arch/mips/kernel/machine_kexec.c | 3 +++
2 files changed, 6 insertions(+)

--- a/arch/mips/kernel/crash.c
+++ b/arch/mips/kernel/crash.c
@@ -34,6 +34,9 @@ static void crash_shutdown_secondary(voi
if (!cpu_online(cpu))
return;

+ /* We won't be sent IPIs any more. */
+ set_cpu_online(cpu, false);
+
local_irq_disable();
if (!cpumask_test_cpu(cpu, &cpus_in_crash))
crash_save_cpu(regs, cpu);
--- a/arch/mips/kernel/machine_kexec.c
+++ b/arch/mips/kernel/machine_kexec.c
@@ -96,6 +96,9 @@ machine_kexec(struct kimage *image)
*ptr = (unsigned long) phys_to_virt(*ptr);
}

+ /* Mark offline BEFORE disabling local irq. */
+ set_cpu_online(smp_processor_id(), false);
+
/*
* we do not want to be bothered.
*/



2018-11-19 17:35:06

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 04/83] powerpc/nohash: fix undefined behaviour when testing page size support

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Axtens <[email protected]>

[ Upstream commit f5e284803a7206d43e26f9ffcae5de9626d95e37 ]

When enumerating page size definitions to check hardware support,
we construct a constant which is (1U << (def->shift - 10)).

However, the array of page size definitions is only initalised for
various MMU_PAGE_* constants, so it contains a number of 0-initialised
elements with def->shift == 0. This means we end up shifting by a
very large number, which gives the following UBSan splat:

================================================================================
UBSAN: Undefined behaviour in /home/dja/dev/linux/linux/arch/powerpc/mm/tlb_nohash.c:506:21
shift exponent 4294967286 is too large for 32-bit type 'unsigned int'
CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc3-00045-ga604f927b012-dirty #6
Call Trace:
[c00000000101bc20] [c000000000a13d54] .dump_stack+0xa8/0xec (unreliable)
[c00000000101bcb0] [c0000000004f20a8] .ubsan_epilogue+0x18/0x64
[c00000000101bd30] [c0000000004f2b10] .__ubsan_handle_shift_out_of_bounds+0x110/0x1a4
[c00000000101be20] [c000000000d21760] .early_init_mmu+0x1b4/0x5a0
[c00000000101bf10] [c000000000d1ba28] .early_setup+0x100/0x130
[c00000000101bf90] [c000000000000528] start_here_multiplatform+0x68/0x80
================================================================================

Fix this by first checking if the element exists (shift != 0) before
constructing the constant.

Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/mm/tlb_nohash.c | 3 +++
1 file changed, 3 insertions(+)

--- a/arch/powerpc/mm/tlb_nohash.c
+++ b/arch/powerpc/mm/tlb_nohash.c
@@ -481,6 +481,9 @@ static void setup_page_sizes(void)
for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) {
struct mmu_psize_def *def = &mmu_psize_defs[psize];

+ if (!def->shift)
+ continue;
+
if (tlb1ps & (1U << (def->shift - 10))) {
def->flags |= MMU_PAGE_SIZE_DIRECT;




2018-11-19 17:35:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 26/83] parisc: Align os_hpmc_size on word boundary

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 0ed9d3de5f8f97e6efd5ca0e3377cab5f0451ead ]

The os_hpmc_size variable sometimes wasn't aligned at word boundary and thus
triggered the unaligned fault handler at startup.
Fix it by aligning it properly.

Signed-off-by: Helge Deller <[email protected]>
Cc: <[email protected]> # v4.14+
Signed-off-by: Sasha Levin <[email protected]>
---
arch/parisc/kernel/hpmc.S | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/parisc/kernel/hpmc.S b/arch/parisc/kernel/hpmc.S
index 0fbd0a0e1cda..38d461aec46d 100644
--- a/arch/parisc/kernel/hpmc.S
+++ b/arch/parisc/kernel/hpmc.S
@@ -304,6 +304,7 @@ ENDPROC_CFI(os_hpmc)


__INITRODATA
+ .align 4
.export os_hpmc_size
os_hpmc_size:
.word .os_hpmc_end-.os_hpmc
--
2.17.1




2018-11-19 17:35:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 05/83] drm/omap: fix memory barrier bug in DMM driver

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Tomi Valkeinen <[email protected]>

[ Upstream commit 538f66ba204944470a653a4cccc5f8befdf97c22 ]

A DMM timeout "timed out waiting for done" has been observed on DRA7
devices. The timeout happens rarely, and only when the system is under
heavy load.

Debugging showed that the timeout can be made to happen much more
frequently by optimizing the DMM driver, so that there's almost no code
between writing the last DMM descriptors to RAM, and writing to DMM
register which starts the DMM transaction.

The current theory is that a wmb() does not properly ensure that the
data written to RAM is observable by all the components in the system.

This DMM timeout has caused interesting (and rare) bugs as the error
handling was not functioning properly (the error handling has been fixed
in previous commits):

* If a DMM timeout happened when a GEM buffer was being pinned for
display on the screen, a timeout error would be shown, but the driver
would continue programming DSS HW with broken buffer, leading to
SYNCLOST floods and possible crashes.

* If a DMM timeout happened when other user (say, video decoder) was
pinning a GEM buffer, a timeout would be shown but if the user
handled the error properly, no other issues followed.

* If a DMM timeout happened when a GEM buffer was being released, the
driver does not even notice the error, leading to crashes or hang
later.

This patch adds wmb() and readl() calls after the last bit is written to
RAM, which should ensure that the execution proceeds only after the data
is actually in RAM, and thus observable by DMM.

The read-back should not be needed. Further study is required to understand
if DMM is somehow special case and read-back is ok, or if DRA7's memory
barriers do not work correctly.

Signed-off-by: Tomi Valkeinen <[email protected]>
Signed-off-by: Peter Ujfalusi <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 11 +++++++++++
1 file changed, 11 insertions(+)

--- a/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
+++ b/drivers/gpu/drm/omapdrm/omap_dmm_tiler.c
@@ -273,6 +273,17 @@ static int dmm_txn_commit(struct dmm_txn
}

txn->last_pat->next_pa = 0;
+ /* ensure that the written descriptors are visible to DMM */
+ wmb();
+
+ /*
+ * NOTE: the wmb() above should be enough, but there seems to be a bug
+ * in OMAP's memory barrier implementation, which in some rare cases may
+ * cause the writes not to be observable after wmb().
+ */
+
+ /* read back to ensure the data is in RAM */
+ readl(&txn->last_pat->next_pa);

/* write to PAT_DESCR to clear out any pending transaction */
dmm_write(dmm, 0x0, reg[PAT_DESCR][engine->id]);



2018-11-19 17:35:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 39/83] mach64: fix display corruption on big endian machines

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mikulas Patocka <[email protected]>

commit 3c6c6a7878d00a3ac997a779c5b9861ff25dfcc8 upstream.

The code for manual bit triple is not endian-clean. It builds the variable
"hostdword" using byte accesses, therefore we must read the variable with
"le32_to_cpu".

The patch also enables (hardware or software) bit triple only if the image
is monochrome (image->depth). If we want to blit full-color image, we
shouldn't use the triple code.

Signed-off-by: Mikulas Patocka <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Cc: [email protected]
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/video/fbdev/aty/mach64_accel.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/drivers/video/fbdev/aty/mach64_accel.c
+++ b/drivers/video/fbdev/aty/mach64_accel.c
@@ -344,7 +344,7 @@ void atyfb_imageblit(struct fb_info *inf
* since Rage 3D IIc we have DP_HOST_TRIPLE_EN bit
* this hwaccelerated triple has an issue with not aligned data
*/
- if (M64_HAS(HW_TRIPLE) && image->width % 8 == 0)
+ if (image->depth == 1 && M64_HAS(HW_TRIPLE) && image->width % 8 == 0)
pix_width |= DP_HOST_TRIPLE_EN;
}

@@ -381,7 +381,7 @@ void atyfb_imageblit(struct fb_info *inf
src_bytes = (((image->width * image->depth) + 7) / 8) * image->height;

/* manual triple each pixel */
- if (info->var.bits_per_pixel == 24 && !(pix_width & DP_HOST_TRIPLE_EN)) {
+ if (image->depth == 1 && info->var.bits_per_pixel == 24 && !(pix_width & DP_HOST_TRIPLE_EN)) {
int inbit, outbit, mult24, byte_id_in_dword, width;
u8 *pbitmapin = (u8*)image->data, *pbitmapout;
u32 hostdword;
@@ -414,7 +414,7 @@ void atyfb_imageblit(struct fb_info *inf
}
}
wait_for_fifo(1, par);
- aty_st_le32(HOST_DATA0, hostdword, par);
+ aty_st_le32(HOST_DATA0, le32_to_cpu(hostdword), par);
}
} else {
u32 *pbitmap, dwords = (src_bytes + 3) / 4;



2018-11-19 17:35:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 24/83] e1000: fix race condition between e1000_down() and e1000_watchdog

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 44c445c3d1b4eacff23141fa7977c3b2ec3a45c9 ]

This patch fixes a race condition that can result into the interface being
up and carrier on, but with transmits disabled in the hardware.
The bug may show up by repeatedly IFF_DOWN+IFF_UP the interface, which
allows e1000_watchdog() interleave with e1000_down().

CPU x CPU y
--------------------------------------------------------------------
e1000_down():
netif_carrier_off()
e1000_watchdog():
if (carrier == off) {
netif_carrier_on();
enable_hw_transmit();
}
disable_hw_transmit();
e1000_watchdog():
/* carrier on, do nothing */

Signed-off-by: Vincenzo Maffione <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/e1000/e1000_main.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index dd112aa5cebb..39a09e18c1b7 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -521,8 +521,6 @@ void e1000_down(struct e1000_adapter *adapter)
struct net_device *netdev = adapter->netdev;
u32 rctl, tctl;

- netif_carrier_off(netdev);
-
/* disable receives in the hardware */
rctl = er32(RCTL);
ew32(RCTL, rctl & ~E1000_RCTL_EN);
@@ -538,6 +536,15 @@ void e1000_down(struct e1000_adapter *adapter)
E1000_WRITE_FLUSH();
msleep(10);

+ /* Set the carrier off after transmits have been disabled in the
+ * hardware, to avoid race conditions with e1000_watchdog() (which
+ * may be running concurrently to us, checking for the carrier
+ * bit to decide whether it should enable transmits again). Such
+ * a race condition would result into transmission being disabled
+ * in the hardware until the next IFF_DOWN+IFF_UP cycle.
+ */
+ netif_carrier_off(netdev);
+
napi_disable(&adapter->napi);

e1000_irq_disable(adapter);
--
2.17.1




2018-11-19 17:35:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 02/83] tty: check name length in tty_find_polling_driver()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Miles Chen <[email protected]>

[ Upstream commit 33a1a7be198657c8ca26ad406c4d2a89b7162bcc ]

The issue is found by a fuzzing test.
If tty_find_polling_driver() recevies an incorrect input such as
',,' or '0b', the len becomes 0 and strncmp() always return 0.
In this case, a null p->ops->poll_init() is called and it causes a kernel
panic.

Fix this by checking name length against zero in tty_find_polling_driver().

$echo ,, > /sys/module/kgdboc/parameters/kgdboc
[ 20.804451] WARNING: CPU: 1 PID: 104 at drivers/tty/serial/serial_core.c:457
uart_get_baud_rate+0xe8/0x190
[ 20.804917] Modules linked in:
[ 20.805317] CPU: 1 PID: 104 Comm: sh Not tainted 4.19.0-rc7ajb #8
[ 20.805469] Hardware name: linux,dummy-virt (DT)
[ 20.805732] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 20.805895] pc : uart_get_baud_rate+0xe8/0x190
[ 20.806042] lr : uart_get_baud_rate+0xc0/0x190
[ 20.806476] sp : ffffffc06acff940
[ 20.806676] x29: ffffffc06acff940 x28: 0000000000002580
[ 20.806977] x27: 0000000000009600 x26: 0000000000009600
[ 20.807231] x25: ffffffc06acffad0 x24: 00000000ffffeff0
[ 20.807576] x23: 0000000000000001 x22: 0000000000000000
[ 20.807807] x21: 0000000000000001 x20: 0000000000000000
[ 20.808049] x19: ffffffc06acffac8 x18: 0000000000000000
[ 20.808277] x17: 0000000000000000 x16: 0000000000000000
[ 20.808520] x15: ffffffffffffffff x14: ffffffff00000000
[ 20.808757] x13: ffffffffffffffff x12: 0000000000000001
[ 20.809011] x11: 0101010101010101 x10: ffffff880d59ff5f
[ 20.809292] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
[ 20.809549] x7 : 0000000000000000 x6 : ffffff880d59ff5f
[ 20.809803] x5 : 0000000080008001 x4 : 0000000000000003
[ 20.810056] x3 : ffffff900853e6b4 x2 : dfffff9000000000
[ 20.810693] x1 : ffffffc06acffad0 x0 : 0000000000000cb0
[ 20.811005] Call trace:
[ 20.811214] uart_get_baud_rate+0xe8/0x190
[ 20.811479] serial8250_do_set_termios+0xe0/0x6f4
[ 20.811719] serial8250_set_termios+0x48/0x54
[ 20.811928] uart_set_options+0x138/0x1bc
[ 20.812129] uart_poll_init+0x114/0x16c
[ 20.812330] tty_find_polling_driver+0x158/0x200
[ 20.812545] configure_kgdboc+0xbc/0x1bc
[ 20.812745] param_set_kgdboc_var+0xb8/0x150
[ 20.812960] param_attr_store+0xbc/0x150
[ 20.813160] module_attr_store+0x40/0x58
[ 20.813364] sysfs_kf_write+0x8c/0xa8
[ 20.813563] kernfs_fop_write+0x154/0x290
[ 20.813764] vfs_write+0xf0/0x278
[ 20.813951] __arm64_sys_write+0x84/0xf4
[ 20.814400] el0_svc_common+0xf4/0x1dc
[ 20.814616] el0_svc_handler+0x98/0xbc
[ 20.814804] el0_svc+0x8/0xc
[ 20.822005] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 20.826913] Mem abort info:
[ 20.827103] ESR = 0x84000006
[ 20.827352] Exception class = IABT (current EL), IL = 16 bits
[ 20.827655] SET = 0, FnV = 0
[ 20.827855] EA = 0, S1PTW = 0
[ 20.828135] user pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____)
[ 20.828484] [0000000000000000] pgd=00000000aadee003, pud=00000000aadee003, pmd=0000000000000000
[ 20.829195] Internal error: Oops: 84000006 [#1] SMP
[ 20.829564] Modules linked in:
[ 20.829890] CPU: 1 PID: 104 Comm: sh Tainted: G W 4.19.0-rc7ajb #8
[ 20.830545] Hardware name: linux,dummy-virt (DT)
[ 20.830829] pstate: 60000085 (nZCv daIf -PAN -UAO)
[ 20.831174] pc : (null)
[ 20.831457] lr : serial8250_do_set_termios+0x358/0x6f4
[ 20.831727] sp : ffffffc06acff9b0
[ 20.831936] x29: ffffffc06acff9b0 x28: ffffff9008d7c000
[ 20.832267] x27: ffffff900969e16f x26: 0000000000000000
[ 20.832589] x25: ffffff900969dfb0 x24: 0000000000000000
[ 20.832906] x23: ffffffc06acffad0 x22: ffffff900969e160
[ 20.833232] x21: 0000000000000000 x20: ffffffc06acffac8
[ 20.833559] x19: ffffff900969df90 x18: 0000000000000000
[ 20.833878] x17: 0000000000000000 x16: 0000000000000000
[ 20.834491] x15: ffffffffffffffff x14: ffffffff00000000
[ 20.834821] x13: ffffffffffffffff x12: 0000000000000001
[ 20.835143] x11: 0101010101010101 x10: ffffff880d59ff5f
[ 20.835467] x9 : ffffff880d59ff5e x8 : ffffffc06acffaf3
[ 20.835790] x7 : 0000000000000000 x6 : ffffff880d59ff5f
[ 20.836111] x5 : c06419717c314100 x4 : 0000000000000007
[ 20.836419] x3 : 0000000000000000 x2 : 0000000000000000
[ 20.836732] x1 : 0000000000000001 x0 : ffffff900969df90
[ 20.837100] Process sh (pid: 104, stack limit = 0x(____ptrval____))
[ 20.837396] Call trace:
[ 20.837566] (null)
[ 20.837816] serial8250_set_termios+0x48/0x54
[ 20.838089] uart_set_options+0x138/0x1bc
[ 20.838570] uart_poll_init+0x114/0x16c
[ 20.838834] tty_find_polling_driver+0x158/0x200
[ 20.839119] configure_kgdboc+0xbc/0x1bc
[ 20.839380] param_set_kgdboc_var+0xb8/0x150
[ 20.839658] param_attr_store+0xbc/0x150
[ 20.839920] module_attr_store+0x40/0x58
[ 20.840183] sysfs_kf_write+0x8c/0xa8
[ 20.840183] sysfs_kf_write+0x8c/0xa8
[ 20.840440] kernfs_fop_write+0x154/0x290
[ 20.840702] vfs_write+0xf0/0x278
[ 20.840942] __arm64_sys_write+0x84/0xf4
[ 20.841209] el0_svc_common+0xf4/0x1dc
[ 20.841471] el0_svc_handler+0x98/0xbc
[ 20.841713] el0_svc+0x8/0xc
[ 20.842057] Code: bad PC value
[ 20.842764] ---[ end trace a8835d7de79aaadf ]---
[ 20.843134] Kernel panic - not syncing: Fatal exception
[ 20.843515] SMP: stopping secondary CPUs
[ 20.844289] Kernel Offset: disabled
[ 20.844634] CPU features: 0x0,21806002
[ 20.844857] Memory Limit: none
[ 20.845172] ---[ end Kernel panic - not syncing: Fatal exception ]---

Signed-off-by: Miles Chen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/tty/tty_io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -354,7 +354,7 @@ struct tty_driver *tty_find_polling_driv
mutex_lock(&tty_mutex);
/* Search through the tty devices to look for a match */
list_for_each_entry(p, &tty_drivers, tty_drivers) {
- if (strncmp(name, p->name, len) != 0)
+ if (!len || strncmp(name, p->name, len) != 0)
continue;
stp = str;
if (*stp == ',')



2018-11-19 17:36:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 29/83] MIPS: Loongson-3: Fix CPU UART irq delivery problem

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit d06f8a2f1befb5a3d0aa660ab1c05e9b744456ea ]

Masking/unmasking the CPU UART irq in CP0_Status (and redirecting it to
other CPUs) may cause interrupts be lost, especially in multi-package
machines (Package-0's UART irq cannot be delivered to others). So make
mask_loongson_irq() and unmask_loongson_irq() be no-ops.

The original problem (UART IRQ may deliver to any core) is also because
of masking/unmasking the CPU UART irq in CP0_Status. So it is safe to
remove all of the stuff.

Signed-off-by: Huacai Chen <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Patchwork: https://patchwork.linux-mips.org/patch/20433/
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: [email protected]
Cc: Fuxin Zhang <[email protected]>
Cc: Zhangjin Wu <[email protected]>
Cc: Huacai Chen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/loongson64/loongson-3/irq.c | 43 ++-------------------------
1 file changed, 3 insertions(+), 40 deletions(-)

diff --git a/arch/mips/loongson64/loongson-3/irq.c b/arch/mips/loongson64/loongson-3/irq.c
index 8e7649088353..ec5f2b30646c 100644
--- a/arch/mips/loongson64/loongson-3/irq.c
+++ b/arch/mips/loongson64/loongson-3/irq.c
@@ -50,45 +50,8 @@ static struct irqaction cascade_irqaction = {
.name = "cascade",
};

-static inline void mask_loongson_irq(struct irq_data *d)
-{
- clear_c0_status(0x100 << (d->irq - MIPS_CPU_IRQ_BASE));
- irq_disable_hazard();
-
- /* Workaround: UART IRQ may deliver to any core */
- if (d->irq == LOONGSON_UART_IRQ) {
- int cpu = smp_processor_id();
- int node_id = cpu_logical_map(cpu) / loongson_sysconf.cores_per_node;
- int core_id = cpu_logical_map(cpu) % loongson_sysconf.cores_per_node;
- u64 intenclr_addr = smp_group[node_id] |
- (u64)(&LOONGSON_INT_ROUTER_INTENCLR);
- u64 introuter_lpc_addr = smp_group[node_id] |
- (u64)(&LOONGSON_INT_ROUTER_LPC);
-
- *(volatile u32 *)intenclr_addr = 1 << 10;
- *(volatile u8 *)introuter_lpc_addr = 0x10 + (1<<core_id);
- }
-}
-
-static inline void unmask_loongson_irq(struct irq_data *d)
-{
- /* Workaround: UART IRQ may deliver to any core */
- if (d->irq == LOONGSON_UART_IRQ) {
- int cpu = smp_processor_id();
- int node_id = cpu_logical_map(cpu) / loongson_sysconf.cores_per_node;
- int core_id = cpu_logical_map(cpu) % loongson_sysconf.cores_per_node;
- u64 intenset_addr = smp_group[node_id] |
- (u64)(&LOONGSON_INT_ROUTER_INTENSET);
- u64 introuter_lpc_addr = smp_group[node_id] |
- (u64)(&LOONGSON_INT_ROUTER_LPC);
-
- *(volatile u32 *)intenset_addr = 1 << 10;
- *(volatile u8 *)introuter_lpc_addr = 0x10 + (1<<core_id);
- }
-
- set_c0_status(0x100 << (d->irq - MIPS_CPU_IRQ_BASE));
- irq_enable_hazard();
-}
+static inline void mask_loongson_irq(struct irq_data *d) { }
+static inline void unmask_loongson_irq(struct irq_data *d) { }

/* For MIPS IRQs which shared by all cores */
static struct irq_chip loongson_irq_chip = {
@@ -126,7 +89,7 @@ void __init mach_init_irq(void)
mips_cpu_irq_init();
init_i8259_irqs();
irq_set_chip_and_handler(LOONGSON_UART_IRQ,
- &loongson_irq_chip, handle_level_irq);
+ &loongson_irq_chip, handle_percpu_irq);

/* setup HT1 irq */
setup_irq(LOONGSON_HT1_IRQ, &cascade_irqaction);
--
2.17.1




2018-11-19 17:36:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 23/83] e1000: avoid null pointer dereference on invalid stat type

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 5983587c8c5ef00d6886477544ad67d495bc5479 ]

Currently if the stat type is invalid then data[i] is being set
either by dereferencing a null pointer p, or it is reading from
an incorrect previous location if we had a valid stat type
previously. Fix this by skipping over the read of p on an invalid
stat type.

Detected by CoverityScan, CID#113385 ("Explicit null dereferenced")

Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Alexander Duyck <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Jeff Kirsher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/intel/e1000/e1000_ethtool.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
index e84574b1eae7..2a81f6d72140 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_ethtool.c
@@ -1826,11 +1826,12 @@ static void e1000_get_ethtool_stats(struct net_device *netdev,
{
struct e1000_adapter *adapter = netdev_priv(netdev);
int i;
- char *p = NULL;
const struct e1000_stats *stat = e1000_gstrings_stats;

e1000_update_stats(adapter);
- for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++) {
+ for (i = 0; i < E1000_GLOBAL_STATS_LEN; i++, stat++) {
+ char *p;
+
switch (stat->type) {
case NETDEV_STATS:
p = (char *)netdev + stat->stat_offset;
@@ -1841,15 +1842,13 @@ static void e1000_get_ethtool_stats(struct net_device *netdev,
default:
WARN_ONCE(1, "Invalid E1000 stat type: %u index %d\n",
stat->type, i);
- break;
+ continue;
}

if (stat->sizeof_stat == sizeof(u64))
data[i] = *(u64 *)p;
else
data[i] = *(u32 *)p;
-
- stat++;
}
/* BUG_ON(i != E1000_STATS_LEN); */
}
--
2.17.1




2018-11-19 17:36:25

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 25/83] bna: ethtool: Avoid reading past end of buffer

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

[ Upstream commit 4dc69c1c1fff2f587f8e737e70b4a4e7565a5c94 ]

Using memcpy() from a string that is shorter than the length copied means
the destination buffer is being filled with arbitrary data from the kernel
rodata segment. Instead, use strncpy() which will fill the trailing bytes
with zeros.

This was found with the future CONFIG_FORTIFY_SOURCE feature.

Cc: Daniel Micay <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/brocade/bna/bnad_ethtool.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/brocade/bna/bnad_ethtool.c b/drivers/net/ethernet/brocade/bna/bnad_ethtool.c
index 31f61a744d66..9473d12ce239 100644
--- a/drivers/net/ethernet/brocade/bna/bnad_ethtool.c
+++ b/drivers/net/ethernet/brocade/bna/bnad_ethtool.c
@@ -541,8 +541,8 @@ bnad_get_strings(struct net_device *netdev, u32 stringset, u8 *string)
for (i = 0; i < BNAD_ETHTOOL_STATS_NUM; i++) {
BUG_ON(!(strlen(bnad_net_stats_strings[i]) <
ETH_GSTRING_LEN));
- memcpy(string, bnad_net_stats_strings[i],
- ETH_GSTRING_LEN);
+ strncpy(string, bnad_net_stats_strings[i],
+ ETH_GSTRING_LEN);
string += ETH_GSTRING_LEN;
}
bmap = bna_tx_rid_mask(&bnad->bna);
--
2.17.1




2018-11-19 17:37:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 10/83] sc16is7xx: Fix for multi-channel stall

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Phil Elwell <[email protected]>

[ Upstream commit 8344498721059754e09d30fe255a12dab8fb03ef ]

The SC16IS752 is a dual-channel device. The two channels are largely
independent, but the IRQ signals are wired together as an open-drain,
active low signal which will be driven low while either of the
channels requires attention, which can be for significant periods of
time until operations complete and the interrupt can be acknowledged.
In that respect it is should be treated as a true level-sensitive IRQ.

The kernel, however, needs to be able to exit interrupt context in
order to use I2C or SPI to access the device registers (which may
involve sleeping). Therefore the interrupt needs to be masked out or
paused in some way.

The usual way to manage sleeping from within an interrupt handler
is to use a threaded interrupt handler - a regular interrupt routine
does the minimum amount of work needed to triage the interrupt before
waking the interrupt service thread. If the threaded IRQ is marked as
IRQF_ONESHOT the kernel will automatically mask out the interrupt
until the thread runs to completion. The sc16is7xx driver used to
use a threaded IRQ, but a patch switched to using a kthread_worker
in order to set realtime priorities on the handler thread and for
other optimisations. The end result is non-threaded IRQ that
schedules some work then returns IRQ_HANDLED, making the kernel
think that all IRQ processing has completed.

The work-around to prevent a constant stream of interrupts is to
mark the interrupt as edge-sensitive rather than level-sensitive,
but interpreting an active-low source as a falling-edge source
requires care to prevent a total cessation of interrupts. Whereas
an edge-triggering source will generate a new edge for every interrupt
condition a level-triggering source will keep the signal at the
interrupting level until it no longer requires attention; in other
words, the host won't see another edge until all interrupt conditions
are cleared. It is therefore vital that the interrupt handler does not
exit with an outstanding interrupt condition, otherwise the kernel
will not receive another interrupt unless some other operation causes
the interrupt state on the device to be cleared.

The existing sc16is7xx driver has a very simple interrupt "thread"
(kthread_work job) that processes interrupts on each channel in turn
until there are no more. If both channels are active and the first
channel starts interrupting while the handler for the second channel
is running then it will not be detected and an IRQ stall ensues. This
could be handled easily if there was a shared IRQ status register, or
a convenient way to determine if the IRQ had been deasserted for any
length of time, but both appear to be lacking.

Avoid this problem (or at least make it much less likely to happen)
by reducing the granularity of per-channel interrupt processing
to one condition per iteration, only exiting the overall loop when
both channels are no longer interrupting.

Signed-off-by: Phil Elwell <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/tty/serial/sc16is7xx.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)

--- a/drivers/tty/serial/sc16is7xx.c
+++ b/drivers/tty/serial/sc16is7xx.c
@@ -661,7 +661,7 @@ static void sc16is7xx_handle_tx(struct u
uart_write_wakeup(port);
}

-static void sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
+static bool sc16is7xx_port_irq(struct sc16is7xx_port *s, int portno)
{
struct uart_port *port = &s->p[portno].port;

@@ -670,7 +670,7 @@ static void sc16is7xx_port_irq(struct sc

iir = sc16is7xx_port_read(port, SC16IS7XX_IIR_REG);
if (iir & SC16IS7XX_IIR_NO_INT_BIT)
- break;
+ return false;

iir &= SC16IS7XX_IIR_ID_MASK;

@@ -692,16 +692,23 @@ static void sc16is7xx_port_irq(struct sc
port->line, iir);
break;
}
- } while (1);
+ } while (0);
+ return true;
}

static void sc16is7xx_ist(struct kthread_work *ws)
{
struct sc16is7xx_port *s = to_sc16is7xx_port(ws, irq_work);
- int i;

- for (i = 0; i < s->devtype->nr_uart; ++i)
- sc16is7xx_port_irq(s, i);
+ while (1) {
+ bool keep_polling = false;
+ int i;
+
+ for (i = 0; i < s->devtype->nr_uart; ++i)
+ keep_polling |= sc16is7xx_port_irq(s, i);
+ if (!keep_polling)
+ break;
+ }
}

static irqreturn_t sc16is7xx_irq(int irq, void *dev_id)



2018-11-19 17:37:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 14/83] 9p: clear dangling pointers in p9stat_free

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dominique Martinet <[email protected]>

[ Upstream commit 62e3941776fea8678bb8120607039410b1b61a65 ]

p9stat_free is more of a cleanup function than a 'free' function as it
only frees the content of the struct; there are chances of use-after-free
if it is improperly used (e.g. p9stat_free called twice as it used to be
possible to)

Clearing dangling pointers makes the function idempotent and safer to use.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dominique Martinet <[email protected]>
Reported-by: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/9p/protocol.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/net/9p/protocol.c
+++ b/net/9p/protocol.c
@@ -46,10 +46,15 @@ p9pdu_writef(struct p9_fcall *pdu, int p
void p9stat_free(struct p9_wstat *stbuf)
{
kfree(stbuf->name);
+ stbuf->name = NULL;
kfree(stbuf->uid);
+ stbuf->uid = NULL;
kfree(stbuf->gid);
+ stbuf->gid = NULL;
kfree(stbuf->muid);
+ stbuf->muid = NULL;
kfree(stbuf->extension);
+ stbuf->extension = NULL;
}
EXPORT_SYMBOL(p9stat_free);




2018-11-19 23:32:35

by kernelci.org bot

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/83] 4.9.138-stable review

stable-rc/linux-4.9.y boot: 59 boots: 0 failed, 43 passed with 16 offline (v4.9.137-83-g4ca3f6d71162)

Full Boot Summary: https://kernelci.org/boot/all/job/stable-rc/branch/linux-4.9.y/kernel/v4.9.137-83-g4ca3f6d71162/
Full Build Summary: https://kernelci.org/build/stable-rc/branch/linux-4.9.y/kernel/v4.9.137-83-g4ca3f6d71162/

Tree: stable-rc
Branch: linux-4.9.y
Git Describe: v4.9.137-83-g4ca3f6d71162
Git Commit: 4ca3f6d71162e660e35f62700d1dd2486feb5e4c
Git URL: http://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Tested: 36 unique boards, 17 SoC families, 13 builds out of 182

Offline Platforms:

arm:

omap2plus_defconfig:
am335x-boneblack: 1 offline lab

sunxi_defconfig:
sun5i-r8-chip: 1 offline lab

tegra_defconfig:
tegra124-jetson-tk1: 1 offline lab

bcm2835_defconfig:
bcm2835-rpi-b: 1 offline lab

exynos_defconfig:
exynos5800-peach-pi: 1 offline lab

sama5_defconfig:
at91-sama5d4_xplained: 1 offline lab

multi_v7_defconfig:
alpine-db: 1 offline lab
am335x-boneblack: 1 offline lab
at91-sama5d4_xplained: 1 offline lab
exynos5800-peach-pi: 1 offline lab
socfpga_cyclone5_de0_sockit: 1 offline lab
sun5i-r8-chip: 1 offline lab
tegra124-jetson-tk1: 1 offline lab

socfpga_defconfig:
socfpga_cyclone5_de0_sockit: 1 offline lab

arm64:

defconfig:
apq8016-sbc: 1 offline lab
juno-r2: 1 offline lab

---
For more info write to <[email protected]>

2018-11-20 02:05:04

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/83] 4.9.138-stable review

On 11/19/18 9:28 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.138 release.
> There are 83 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Nov 21 16:25:13 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.138-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


2018-11-20 08:13:18

by Naresh Kamboju

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/83] 4.9.138-stable review

On Mon, 19 Nov 2018 at 22:23, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 4.9.138 release.
> There are 83 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Nov 21 16:25:13 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.138-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Summary
------------------------------------------------------------------------

kernel: 4.9.138-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: 4ca3f6d71162e660e35f62700d1dd2486feb5e4c
git describe: v4.9.137-83-g4ca3f6d71162
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.137-83-g4ca3f6d71162


No regressions (compared to build v4.9.137)


No fixes (compared to build v4.9.137)

Ran 21057 total tests in the following environments and test suites.

Environments
--------------
- dragonboard-410c - arm64
- hi6220-hikey - arm64
- i386
- juno-r2 - arm64
- qemu_arm
- qemu_arm64
- qemu_i386
- qemu_x86_64
- x15 - arm
- x86_64

Test Suites
-----------
* boot
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* ltp-cap_bounds-tests
* ltp-containers-tests
* ltp-cve-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-hugetlb-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-timers-tests
* ltp-open-posix-tests
* kselftest-vsyscall-mode-native
* kselftest-vsyscall-mode-none

--
Linaro LKFT
https://lkft.linaro.org

2018-11-20 14:50:12

by Jon Hunter

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/83] 4.9.138-stable review


On 19/11/2018 16:28, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.138 release.
> There are 83 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Nov 21 16:25:13 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.138-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

All tests are passing for Tegra ...

Test results for stable-v4.9:
8 builds: 8 pass, 0 fail
16 boots: 16 pass, 0 fail
14 tests: 14 pass, 0 fail

Linux version: 4.9.138-rc1-g4ca3f6d
Boards tested: tegra124-jetson-tk1, tegra20-ventana,
tegra210-p2371-2180, tegra30-cardhu-a04

Cheers
Jon

--
nvpublic

2018-11-20 21:09:32

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/83] 4.9.138-stable review

On Mon, Nov 19, 2018 at 05:28:26PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.138 release.
> There are 83 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Wed Nov 21 16:25:13 UTC 2018.
> Anything received after that time might be too late.
>

Build results:
total: 150 pass: 150 fail: 0
Qemu test results:
total: 283 pass: 283 fail: 0

Details are available at https://kerneltests.org/builders/.

Guenter