2017-04-25 15:09:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 00/28] 4.4.64-stable review

This is the start of the stable review cycle for the 4.4.64 release.
There are 28 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.64-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.4.64-rc1

Jon Paul Maloy <[email protected]>
tipc: fix crash during node removal

Dan Williams <[email protected]>
block: fix del_gendisk() vs blkdev_ioctl crash

Dan Williams <[email protected]>
x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions

Vitaly Kuznetsov <[email protected]>
hv: don't reset hv_context.tsc_page on crash

Vitaly Kuznetsov <[email protected]>
Drivers: hv: balloon: account for gaps in hot add regions

Vitaly Kuznetsov <[email protected]>
Drivers: hv: balloon: keep track of where ha_region starts

Vitaly Kuznetsov <[email protected]>
Tools: hv: kvp: ensure kvp device fd is closed on exec

Oliver O'Halloran <[email protected]>
powerpc/64: Fix flush_(d|i)cache_range() called from modules

Suzuki K Poulose <[email protected]>
kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd

Yazen Ghannam <[email protected]>
x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs

Ravi Bangoria <[email protected]>
powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction

Sebastian Siewior <[email protected]>
ubi/upd: Always flush after prepared for an update

Johannes Berg <[email protected]>
mac80211: reject ToDS broadcast data frames

Haibo Chen <[email protected]>
mmc: sdhci-esdhc-imx: increase the pad I/O drive strength for DDR50 card

Arnd Bergmann <[email protected]>
ACPI / power: Avoid maybe-uninitialized warning

Thorsten Leemhuis <[email protected]>
Input: elantech - add Fujitsu Lifebook E547 to force crc_enabled

Jorgen Hansen <[email protected]>
VSOCK: Detach QP check should filter out non matching QPs.

K. Y. Srinivasan <[email protected]>
Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg()

Vitaly Kuznetsov <[email protected]>
Drivers: hv: get rid of timeout in vmbus_open()

Vitaly Kuznetsov <[email protected]>
Drivers: hv: don't leak memory in vmbus_establish_gpadl()

Christian Borntraeger <[email protected]>
s390/mm: fix CMMA vs KSM vs others

Germano Percossi <[email protected]>
CIFS: remove bad_network_name flag

Sachin Prabhu <[email protected]>
cifs: Do not send echoes before Negotiate is complete

Steven Rostedt (VMware) <[email protected]>
ring-buffer: Have ring_buffer_iter_empty() return true when empty

Steven Rostedt (VMware) <[email protected]>
tracing: Allocate the snapshot buffer before enabling probe

Eric Biggers <[email protected]>
KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings

David Howells <[email protected]>
KEYS: Change the name of the dead type to ".dead" to prevent user access

David Howells <[email protected]>
KEYS: Disallow keyrings beginning with '.' to be joined as session keyrings


-------------

Diffstat:

Makefile | 4 +-
arch/arm/kvm/mmu.c | 12 ++++
arch/powerpc/kernel/entry_64.S | 6 +-
arch/powerpc/kernel/misc_64.S | 5 +-
arch/s390/include/asm/pgtable.h | 2 +
arch/x86/include/asm/pmem.h | 45 ++++++++----
arch/x86/kernel/cpu/mcheck/mce_amd.c | 2 +-
block/genhd.c | 1 -
drivers/acpi/power.c | 1 +
drivers/hv/channel.c | 16 +++--
drivers/hv/connection.c | 8 +--
drivers/hv/hv.c | 5 +-
drivers/hv/hv_balloon.c | 136 +++++++++++++++++++++++++----------
drivers/input/mouse/elantech.c | 8 +++
drivers/mmc/host/sdhci-esdhc-imx.c | 1 +
drivers/mtd/ubi/upd.c | 8 +--
fs/cifs/cifsglob.h | 1 -
fs/cifs/smb1ops.c | 10 +++
fs/cifs/smb2pdu.c | 5 --
kernel/trace/ring_buffer.c | 16 ++++-
kernel/trace/trace.c | 8 ++-
net/mac80211/rx.c | 21 ++++++
net/tipc/node.c | 24 +++----
net/vmw_vsock/vmci_transport.c | 4 +-
security/keys/gc.c | 2 +-
security/keys/keyctl.c | 20 +++---
security/keys/process_keys.c | 44 +++++++-----
tools/hv/hv_kvp_daemon.c | 2 +-
28 files changed, 287 insertions(+), 130 deletions(-)



2017-04-25 15:09:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 01/28] KEYS: Disallow keyrings beginning with . to be joined as session keyrings

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Howells <[email protected]>

commit ee8f844e3c5a73b999edf733df1c529d6503ec2f upstream.

This fixes CVE-2016-9604.

Keyrings whose name begin with a '.' are special internal keyrings and so
userspace isn't allowed to create keyrings by this name to prevent
shadowing. However, the patch that added the guard didn't fix
KEYCTL_JOIN_SESSION_KEYRING. Not only can that create dot-named keyrings,
it can also subscribe to them as a session keyring if they grant SEARCH
permission to the user.

This, for example, allows a root process to set .builtin_trusted_keys as
its session keyring, at which point it has full access because now the
possessor permissions are added. This permits root to add extra public
keys, thereby bypassing module verification.

This also affects kexec and IMA.

This can be tested by (as root):

keyctl session .builtin_trusted_keys
keyctl add user a a @s
keyctl list @s

which on my test box gives me:

2 keys in keyring:
180010936: ---lswrv 0 0 asymmetric: Build time autogenerated kernel key: ae3d4a31b82daa8e1a75b49dc2bba949fd992a05
801382539: --alswrv 0 0 user: a


Fix this by rejecting names beginning with a '.' in the keyctl.

Signed-off-by: David Howells <[email protected]>
Acked-by: Mimi Zohar <[email protected]>
cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/keys/keyctl.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -271,7 +271,8 @@ error:
* Create and join an anonymous session keyring or join a named session
* keyring, creating it if necessary. A named session keyring must have Search
* permission for it to be joined. Session keyrings without this permit will
- * be skipped over.
+ * be skipped over. It is not permitted for userspace to create or join
+ * keyrings whose name begin with a dot.
*
* If successful, the ID of the joined session keyring will be returned.
*/
@@ -288,12 +289,16 @@ long keyctl_join_session_keyring(const c
ret = PTR_ERR(name);
goto error;
}
+
+ ret = -EPERM;
+ if (name[0] == '.')
+ goto error_name;
}

/* join the session */
ret = join_session_keyring(name);
+error_name:
kfree(name);
-
error:
return ret;
}


2017-04-25 15:09:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 10/28] Drivers: hv: get rid of timeout in vmbus_open()

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

commit 396e287fa2ff46e83ae016cdcb300c3faa3b02f6 upstream.

vmbus_teardown_gpadl() can result in infinite wait when it is called on 5
second timeout in vmbus_open(). The issue is caused by the fact that gpadl
teardown operation won't ever succeed for an opened channel and the timeout
isn't always enough. As a guest, we can always trust the host to respond to
our request (and there is nothing we can do if it doesn't).

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/channel.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -73,7 +73,6 @@ int vmbus_open(struct vmbus_channel *new
void *in, *out;
unsigned long flags;
int ret, err = 0;
- unsigned long t;
struct page *page;

spin_lock_irqsave(&newchannel->lock, flags);
@@ -183,11 +182,7 @@ int vmbus_open(struct vmbus_channel *new
goto error1;
}

- t = wait_for_completion_timeout(&open_info->waitevent, 5*HZ);
- if (t == 0) {
- err = -ETIMEDOUT;
- goto error1;
- }
+ wait_for_completion(&open_info->waitevent);

spin_lock_irqsave(&vmbus_connection.channelmsg_lock, flags);
list_del(&open_info->msglistentry);


2017-04-25 15:09:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 12/28] VSOCK: Detach QP check should filter out non matching QPs.

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jorgen Hansen <[email protected]>

commit 8ab18d71de8b07d2c4d6f984b718418c09ea45c5 upstream.

The check in vmci_transport_peer_detach_cb should only allow a
detach when the qp handle of the transport matches the one in
the detach message.

Testing: Before this change, a detach from a peer on a different
socket would cause an active stream socket to register a detach.

Reviewed-by: George Zhang <[email protected]>
Signed-off-by: Jorgen Hansen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/vmw_vsock/vmci_transport.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -842,7 +842,7 @@ static void vmci_transport_peer_detach_c
* qp_handle.
*/
if (vmci_handle_is_invalid(e_payload->handle) ||
- vmci_handle_is_equal(trans->qp_handle, e_payload->handle))
+ !vmci_handle_is_equal(trans->qp_handle, e_payload->handle))
return;

/* We don't ask for delayed CBs when we subscribe to this event (we
@@ -2154,7 +2154,7 @@ module_exit(vmci_transport_exit);

MODULE_AUTHOR("VMware, Inc.");
MODULE_DESCRIPTION("VMCI transport for Virtual Sockets");
-MODULE_VERSION("1.0.2.0-k");
+MODULE_VERSION("1.0.3.0-k");
MODULE_LICENSE("GPL v2");
MODULE_ALIAS("vmware_vsock");
MODULE_ALIAS_NETPROTO(PF_VSOCK);


2017-04-25 15:10:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 14/28] ACPI / power: Avoid maybe-uninitialized warning

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <[email protected]>

commit fe8c470ab87d90e4b5115902dd94eced7e3305c3 upstream.

gcc -O2 cannot always prove that the loop in acpi_power_get_inferred_state()
is enterered at least once, so it assumes that cur_state might not get
initialized:

drivers/acpi/power.c: In function 'acpi_power_get_inferred_state':
drivers/acpi/power.c:222:9: error: 'cur_state' may be used uninitialized in this function [-Werror=maybe-uninitialized]

This sets the variable to zero at the start of the loop, to ensure that
there is well-defined behavior even for an empty list. This gets rid of
the warning.

The warning first showed up when the -Os flag got removed in a bug fix
patch in linux-4.11-rc5.

I would suggest merging this addon patch on top of that bug fix to avoid
introducing a new warning in the stable kernels.

Fixes: 61b79e16c68d (ACPI: Fix incompatibility with mcount-based function graph tracing)
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/acpi/power.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/acpi/power.c
+++ b/drivers/acpi/power.c
@@ -200,6 +200,7 @@ static int acpi_power_get_list_state(str
return -EINVAL;

/* The state of the list is 'on' IFF all resources are 'on'. */
+ cur_state = 0;
list_for_each_entry(entry, list, node) {
struct acpi_power_resource *resource = entry->resource;
acpi_handle handle = resource->device.handle;


2017-04-25 15:10:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 13/28] Input: elantech - add Fujitsu Lifebook E547 to force crc_enabled

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Thorsten Leemhuis <[email protected]>

commit 704de489e0e3640a2ee2d0daf173e9f7375582ba upstream.

Temporary got a Lifebook E547 into my hands and noticed the touchpad
only works after running:

echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled

Add it to the list of machines that need this workaround.

Signed-off-by: Thorsten Leemhuis <[email protected]>
Reviewed-by: Ulrik De Bie <[email protected]>
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/input/mouse/elantech.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/drivers/input/mouse/elantech.c
+++ b/drivers/input/mouse/elantech.c
@@ -1122,6 +1122,7 @@ static int elantech_get_resolution_v4(st
* Asus UX32VD 0x361f02 00, 15, 0e clickpad
* Avatar AVIU-145A2 0x361f00 ? clickpad
* Fujitsu LIFEBOOK E544 0x470f00 d0, 12, 09 2 hw buttons
+ * Fujitsu LIFEBOOK E547 0x470f00 50, 12, 09 2 hw buttons
* Fujitsu LIFEBOOK E554 0x570f01 40, 14, 0c 2 hw buttons
* Fujitsu T725 0x470f01 05, 12, 09 2 hw buttons
* Fujitsu H730 0x570f00 c0, 14, 0c 3 hw buttons (**)
@@ -1528,6 +1529,13 @@ static const struct dmi_system_id elante
},
},
{
+ /* Fujitsu LIFEBOOK E547 does not work with crc_enabled == 0 */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "LIFEBOOK E547"),
+ },
+ },
+ {
/* Fujitsu LIFEBOOK E554 does not work with crc_enabled == 0 */
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),


2017-04-25 15:10:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 16/28] mac80211: reject ToDS broadcast data frames

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Johannes Berg <[email protected]>

commit 3018e947d7fd536d57e2b550c33e456d921fff8c upstream.

AP/AP_VLAN modes don't accept any real 802.11 multicast data
frames, but since they do need to accept broadcast management
frames the same is currently permitted for data frames. This
opens a security problem because such frames would be decrypted
with the GTK, and could even contain unicast L3 frames.

Since the spec says that ToDS frames must always have the BSSID
as the RA (addr1), reject any other data frames.

The problem was originally reported in "Predicting, Decrypting,
and Abusing WPA2/802.11 Group Keys" at usenix
https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef
and brought to my attention by Jouni.

Reported-by: Jouni Malinen <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
--

---
net/mac80211/rx.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)

--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -3396,6 +3396,27 @@ static bool ieee80211_accept_frame(struc
!ether_addr_equal(bssid, hdr->addr1))
return false;
}
+
+ /*
+ * 802.11-2016 Table 9-26 says that for data frames, A1 must be
+ * the BSSID - we've checked that already but may have accepted
+ * the wildcard (ff:ff:ff:ff:ff:ff).
+ *
+ * It also says:
+ * The BSSID of the Data frame is determined as follows:
+ * a) If the STA is contained within an AP or is associated
+ * with an AP, the BSSID is the address currently in use
+ * by the STA contained in the AP.
+ *
+ * So we should not accept data frames with an address that's
+ * multicast.
+ *
+ * Accepting it also opens a security problem because stations
+ * could encrypt it with the GTK and inject traffic that way.
+ */
+ if (ieee80211_is_data(hdr->frame_control) && multicast)
+ return false;
+
return true;
case NL80211_IFTYPE_WDS:
if (bssid || !ieee80211_is_data(hdr->frame_control))


2017-04-25 15:10:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 19/28] x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yazen Ghannam <[email protected]>

commit 29f72ce3e4d18066ec75c79c857bee0618a3504b upstream.

MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
However, MCA bank 3 is defined on Fam17h systems and can be accessed
using legacy MSRs. Without a name we get a stack trace on Fam17h systems
when trying to register sysfs files for bank 3 on kernels that don't
recognize Scalable MCA.

Call MCA bank 3 "decode_unit" since this is what it represents on
Fam17h. This will allow kernels without SMCA support to see this bank on
Fam17h+ and prevent the stack trace. This will not affect older systems
since this bank is reserved on them, i.e. it'll be ignored.

Tested on AMD Fam15h and Fam17h systems.

WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
kobject: (ffff88085bb256c0): attempted to be registered with empty name!
...
Call Trace:
kobject_add_internal
kobject_add
kobject_create_and_add
threshold_create_device
threshold_init_device

Signed-off-by: Yazen Ghannam <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/mcheck/mce_amd.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -53,7 +53,7 @@ static const char * const th_names[] = {
"load_store",
"insn_fetch",
"combined_unit",
- "",
+ "decode_unit",
"northbridge",
"execution_unit",
};


2017-04-25 15:10:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 25/28] hv: dont reset hv_context.tsc_page on crash

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

commit 56ef6718a1d8d77745033c5291e025ce18504159 upstream.

It may happen that secondary CPUs are still alive and resetting
hv_context.tsc_page will cause a consequent crash in read_hv_clock_tsc()
as we don't check for it being not NULL there. It is safe as we're not
freeing this page anyways.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/hv.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -305,9 +305,10 @@ void hv_cleanup(bool crash)

hypercall_msr.as_uint64 = 0;
wrmsrl(HV_X64_MSR_REFERENCE_TSC, hypercall_msr.as_uint64);
- if (!crash)
+ if (!crash) {
vfree(hv_context.tsc_page);
- hv_context.tsc_page = NULL;
+ hv_context.tsc_page = NULL;
+ }
}
#endif
}


2017-04-25 15:10:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 21/28] powerpc/64: Fix flush_(d|i)cache_range() called from modules

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Oliver O'Halloran <[email protected]>

commit 8f5f525d5b83f7d76a6baf9c4e94d4bf312ea7f6 upstream.

When the kernel is compiled to use 64bit ABIv2 the _GLOBAL() macro does
not include a global entry point. A function's global entry point is
used when the function is called from a different TOC context and in the
kernel this typically means a call from a module into the vmlinux (or
vice-versa).

There are a few exported asm functions declared with _GLOBAL() and
calling them from a module will likely crash the kernel since any TOC
relative load will yield garbage.

flush_icache_range() and flush_dcache_range() are both exported to
modules, and use the TOC, so must use _GLOBAL_TOC().

Fixes: 721aeaa9fdf3 ("powerpc: Build little endian ppc64 kernel with ABIv2")
Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
arch/powerpc/kernel/misc_64.S | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/misc_64.S
+++ b/arch/powerpc/kernel/misc_64.S
@@ -67,6 +67,9 @@ PPC64_CACHES:
*/

_KPROBE(flush_icache_range)
+0: addis r2,r12,(.TOC. - 0b)@ha
+ addi r2, r2,(.TOC. - 0b)@l
+ .localentry flush_icache_range, . - flush_icache_range
BEGIN_FTR_SECTION
PURGE_PREFETCHED_INS
blr
@@ -117,7 +120,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_COHERENT_I
*
* flush all bytes from start to stop-1 inclusive
*/
-_GLOBAL(flush_dcache_range)
+_GLOBAL_TOC(flush_dcache_range)

/*
* Flush the data cache to memory


2017-04-25 15:11:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 24/28] Drivers: hv: balloon: account for gaps in hot add regions

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

commit cb7a5724c7e1bfb5766ad1c3beba14cc715991cf upstream.

I'm observing the following hot add requests from the WS2012 host:

hot_add_req: start_pfn = 0x108200 count = 330752
hot_add_req: start_pfn = 0x158e00 count = 193536
hot_add_req: start_pfn = 0x188400 count = 239616

As the host doesn't specify hot add regions we're trying to create
128Mb-aligned region covering the first request, we create the 0x108000 -
0x160000 region and we add 0x108000 - 0x158e00 memory. The second request
passes the pfn_covered() check, we enlarge the region to 0x108000 -
0x190000 and add 0x158e00 - 0x188200 memory. The problem emerges with the
third request as it starts at 0x188400 so there is a 0x200 gap which is
not covered. As the end of our region is 0x190000 now it again passes the
pfn_covered() check were we just adjust the covered_end_pfn and make it
0x188400 instead of 0x188200 which means that we'll try to online
0x188200-0x188400 pages but these pages were never assigned to us and we
crash.

We can't react to such requests by creating new hot add regions as it may
happen that the whole suggested range falls into the previously identified
128Mb-aligned area so we'll end up adding nothing or create intersecting
regions and our current logic doesn't allow that. Instead, create a list of
such 'gaps' and check for them in the page online callback.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/hv_balloon.c | 131 ++++++++++++++++++++++++++++++++++--------------
1 file changed, 94 insertions(+), 37 deletions(-)

--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -441,6 +441,16 @@ struct hv_hotadd_state {
unsigned long covered_end_pfn;
unsigned long ha_end_pfn;
unsigned long end_pfn;
+ /*
+ * A list of gaps.
+ */
+ struct list_head gap_list;
+};
+
+struct hv_hotadd_gap {
+ struct list_head list;
+ unsigned long start_pfn;
+ unsigned long end_pfn;
};

struct balloon_state {
@@ -596,18 +606,46 @@ static struct notifier_block hv_memory_n
.priority = 0
};

+/* Check if the particular page is backed and can be onlined and online it. */
+static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
+{
+ unsigned long cur_start_pgp;
+ unsigned long cur_end_pgp;
+ struct hv_hotadd_gap *gap;
+
+ cur_start_pgp = (unsigned long)pfn_to_page(has->covered_start_pfn);
+ cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);
+
+ /* The page is not backed. */
+ if (((unsigned long)pg < cur_start_pgp) ||
+ ((unsigned long)pg >= cur_end_pgp))
+ return;
+
+ /* Check for gaps. */
+ list_for_each_entry(gap, &has->gap_list, list) {
+ cur_start_pgp = (unsigned long)
+ pfn_to_page(gap->start_pfn);
+ cur_end_pgp = (unsigned long)
+ pfn_to_page(gap->end_pfn);
+ if (((unsigned long)pg >= cur_start_pgp) &&
+ ((unsigned long)pg < cur_end_pgp)) {
+ return;
+ }
+ }

-static void hv_bring_pgs_online(unsigned long start_pfn, unsigned long size)
+ /* This frame is currently backed; online the page. */
+ __online_page_set_limits(pg);
+ __online_page_increment_counters(pg);
+ __online_page_free(pg);
+}
+
+static void hv_bring_pgs_online(struct hv_hotadd_state *has,
+ unsigned long start_pfn, unsigned long size)
{
int i;

- for (i = 0; i < size; i++) {
- struct page *pg;
- pg = pfn_to_page(start_pfn + i);
- __online_page_set_limits(pg);
- __online_page_increment_counters(pg);
- __online_page_free(pg);
- }
+ for (i = 0; i < size; i++)
+ hv_page_online_one(has, pfn_to_page(start_pfn + i));
}

static void hv_mem_hot_add(unsigned long start, unsigned long size,
@@ -684,26 +722,24 @@ static void hv_online_page(struct page *
list_for_each(cur, &dm_device.ha_region_list) {
has = list_entry(cur, struct hv_hotadd_state, list);
cur_start_pgp = (unsigned long)
- pfn_to_page(has->covered_start_pfn);
- cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);
+ pfn_to_page(has->start_pfn);
+ cur_end_pgp = (unsigned long)pfn_to_page(has->end_pfn);

- if (((unsigned long)pg >= cur_start_pgp) &&
- ((unsigned long)pg < cur_end_pgp)) {
- /*
- * This frame is currently backed; online the
- * page.
- */
- __online_page_set_limits(pg);
- __online_page_increment_counters(pg);
- __online_page_free(pg);
- }
+ /* The page belongs to a different HAS. */
+ if (((unsigned long)pg < cur_start_pgp) ||
+ ((unsigned long)pg >= cur_end_pgp))
+ continue;
+
+ hv_page_online_one(has, pg);
+ break;
}
}

-static bool pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
+static int pfn_covered(unsigned long start_pfn, unsigned long pfn_cnt)
{
struct list_head *cur;
struct hv_hotadd_state *has;
+ struct hv_hotadd_gap *gap;
unsigned long residual, new_inc;

if (list_empty(&dm_device.ha_region_list))
@@ -718,6 +754,24 @@ static bool pfn_covered(unsigned long st
*/
if (start_pfn < has->start_pfn || start_pfn >= has->end_pfn)
continue;
+
+ /*
+ * If the current start pfn is not where the covered_end
+ * is, create a gap and update covered_end_pfn.
+ */
+ if (has->covered_end_pfn != start_pfn) {
+ gap = kzalloc(sizeof(struct hv_hotadd_gap), GFP_ATOMIC);
+ if (!gap)
+ return -ENOMEM;
+
+ INIT_LIST_HEAD(&gap->list);
+ gap->start_pfn = has->covered_end_pfn;
+ gap->end_pfn = start_pfn;
+ list_add_tail(&gap->list, &has->gap_list);
+
+ has->covered_end_pfn = start_pfn;
+ }
+
/*
* If the current hot add-request extends beyond
* our current limit; extend it.
@@ -734,19 +788,10 @@ static bool pfn_covered(unsigned long st
has->end_pfn += new_inc;
}

- /*
- * If the current start pfn is not where the covered_end
- * is, update it.
- */
-
- if (has->covered_end_pfn != start_pfn)
- has->covered_end_pfn = start_pfn;
-
- return true;
-
+ return 1;
}

- return false;
+ return 0;
}

static unsigned long handle_pg_range(unsigned long pg_start,
@@ -785,6 +830,8 @@ static unsigned long handle_pg_range(uns
if (pgs_ol > pfn_cnt)
pgs_ol = pfn_cnt;

+ has->covered_end_pfn += pgs_ol;
+ pfn_cnt -= pgs_ol;
/*
* Check if the corresponding memory block is already
* online by checking its last previously backed page.
@@ -793,10 +840,8 @@ static unsigned long handle_pg_range(uns
*/
if (start_pfn > has->start_pfn &&
!PageReserved(pfn_to_page(start_pfn - 1)))
- hv_bring_pgs_online(start_pfn, pgs_ol);
+ hv_bring_pgs_online(has, start_pfn, pgs_ol);

- has->covered_end_pfn += pgs_ol;
- pfn_cnt -= pgs_ol;
}

if ((has->ha_end_pfn < has->end_pfn) && (pfn_cnt > 0)) {
@@ -834,13 +879,19 @@ static unsigned long process_hot_add(uns
unsigned long rg_size)
{
struct hv_hotadd_state *ha_region = NULL;
+ int covered;

if (pfn_cnt == 0)
return 0;

- if (!dm_device.host_specified_ha_region)
- if (pfn_covered(pg_start, pfn_cnt))
+ if (!dm_device.host_specified_ha_region) {
+ covered = pfn_covered(pg_start, pfn_cnt);
+ if (covered < 0)
+ return 0;
+
+ if (covered)
goto do_pg_range;
+ }

/*
* If the host has specified a hot-add range; deal with it first.
@@ -852,6 +903,7 @@ static unsigned long process_hot_add(uns
return 0;

INIT_LIST_HEAD(&ha_region->list);
+ INIT_LIST_HEAD(&ha_region->gap_list);

list_add_tail(&ha_region->list, &dm_device.ha_region_list);
ha_region->start_pfn = rg_start;
@@ -1584,6 +1636,7 @@ static int balloon_remove(struct hv_devi
struct hv_dynmem_device *dm = hv_get_drvdata(dev);
struct list_head *cur, *tmp;
struct hv_hotadd_state *has;
+ struct hv_hotadd_gap *gap, *tmp_gap;

if (dm->num_pages_ballooned != 0)
pr_warn("Ballooned pages: %d\n", dm->num_pages_ballooned);
@@ -1600,6 +1653,10 @@ static int balloon_remove(struct hv_devi
#endif
list_for_each_safe(cur, tmp, &dm->ha_region_list) {
has = list_entry(cur, struct hv_hotadd_state, list);
+ list_for_each_entry_safe(gap, tmp_gap, &has->gap_list, list) {
+ list_del(&gap->list);
+ kfree(gap);
+ }
list_del(&has->list);
kfree(has);
}


2017-04-25 15:11:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 08/28] s390/mm: fix CMMA vs KSM vs others

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Christian Borntraeger <[email protected]>

commit a8f60d1fadf7b8b54449fcc9d6b15248917478ba upstream.

On heavy paging with KSM I see guest data corruption. Turns out that
KSM will add pages to its tree, where the mapping return true for
pte_unused (or might become as such later). KSM will unmap such pages
and reinstantiate with different attributes (e.g. write protected or
special, e.g. in replace_page or write_protect_page)). This uncovered
a bug in our pagetable handling: We must remove the unused flag as
soon as an entry becomes present again.

Signed-of-by: Christian Borntraeger <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
arch/s390/include/asm/pgtable.h | 2 ++
1 file changed, 2 insertions(+)

--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -829,6 +829,8 @@ static inline void set_pte_at(struct mm_
{
pgste_t pgste;

+ if (pte_present(entry))
+ pte_val(entry) &= ~_PAGE_UNUSED;
if (mm_has_pgste(mm)) {
pgste = pgste_get_lock(ptep);
pgste_val(pgste) &= ~_PGSTE_GPS_ZERO;


2017-04-25 15:11:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 06/28] cifs: Do not send echoes before Negotiate is complete

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sachin Prabhu <[email protected]>

commit 62a6cfddcc0a5313e7da3e8311ba16226fe0ac10 upstream.

commit 4fcd1813e640 ("Fix reconnect to not defer smb3 session reconnect
long after socket reconnect") added support for Negotiate requests to
be initiated by echo calls.

To avoid delays in calling echo after a reconnect, I added the patch
introduced by the commit b8c600120fc8 ("Call echo service immediately
after socket reconnect").

This has however caused a regression with cifs shares which do not have
support for echo calls to trigger Negotiate requests. On connections
which need to call Negotiation, the echo calls trigger an error which
triggers a reconnect which in turn triggers another echo call. This
results in a loop which is only broken when an operation is performed on
the cifs share. For an idle share, it can DOS a server.

The patch uses the smb_operation can_echo() for cifs so that it is
called only if connection has been already been setup.

kernel bz: 194531

Signed-off-by: Sachin Prabhu <[email protected]>
Tested-by: Jonathan Liu <[email protected]>
Acked-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
fs/cifs/smb1ops.c | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -1015,6 +1015,15 @@ cifs_dir_needs_close(struct cifsFileInfo
return !cfile->srch_inf.endOfSearch && !cfile->invalidHandle;
}

+static bool
+cifs_can_echo(struct TCP_Server_Info *server)
+{
+ if (server->tcpStatus == CifsGood)
+ return true;
+
+ return false;
+}
+
struct smb_version_operations smb1_operations = {
.send_cancel = send_nt_cancel,
.compare_fids = cifs_compare_fids,
@@ -1049,6 +1058,7 @@ struct smb_version_operations smb1_opera
.get_dfs_refer = CIFSGetDFSRefer,
.qfs_tcon = cifs_qfs_tcon,
.is_path_accessible = cifs_is_path_accessible,
+ .can_echo = cifs_can_echo,
.query_path_info = cifs_query_path_info,
.query_file_info = cifs_query_file_info,
.get_srv_inum = cifs_get_srv_inum,


2017-04-25 15:30:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 09/28] Drivers: hv: dont leak memory in vmbus_establish_gpadl()

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

commit 7cc80c98070ccc7940fc28811c92cca0a681015d upstream.

In some cases create_gpadl_header() allocates submessages but we never
free them.

[sumits] Note for stable:
Upstream commit 4d63763296ab7865a98bc29cc7d77145815ef89f:
(Drivers: hv: get rid of redundant messagecount in create_gpadl_header())
changes the list usage to initialize list header in all cases; that patch
isn't added to stable, so the current patch is modified a little bit from
the upstream commit to check if the list is valid or not.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/hv/channel.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -375,7 +375,7 @@ int vmbus_establish_gpadl(struct vmbus_c
struct vmbus_channel_gpadl_header *gpadlmsg;
struct vmbus_channel_gpadl_body *gpadl_body;
struct vmbus_channel_msginfo *msginfo = NULL;
- struct vmbus_channel_msginfo *submsginfo;
+ struct vmbus_channel_msginfo *submsginfo, *tmp;
u32 msgcount;
struct list_head *curr;
u32 next_gpadl_handle;
@@ -437,6 +437,13 @@ cleanup:
list_del(&msginfo->msglistentry);
spin_unlock_irqrestore(&vmbus_connection.channelmsg_lock, flags);

+ if (msgcount > 1) {
+ list_for_each_entry_safe(submsginfo, tmp, &msginfo->submsglist,
+ msglistentry) {
+ kfree(submsginfo);
+ }
+ }
+
kfree(msginfo);
return ret;
}


2017-04-25 15:31:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 28/28] tipc: fix crash during node removal

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jon Paul Maloy <[email protected]>

commit d25a01257e422a4bdeb426f69529d57c73b235fe upstream.

When the TIPC module is unloaded, we have identified a race condition
that allows a node reference counter to go to zero and the node instance
being freed before the node timer is finished with accessing it. This
leads to occasional crashes, especially in multi-namespace environments.

The scenario goes as follows:

CPU0:(node_stop) CPU1:(node_timeout) // ref == 2

1: if(!mod_timer())
2: if (del_timer())
3: tipc_node_put() // ref -> 1
4: tipc_node_put() // ref -> 0
5: kfree_rcu(node);
6: tipc_node_get(node)
7: // BOOM!

We now clean up this functionality as follows:

1) We remove the node pointer from the node lookup table before we
attempt deactivating the timer. This way, we reduce the risk that
tipc_node_find() may obtain a valid pointer to an instance marked
for deletion; a harmless but undesirable situation.

2) We use del_timer_sync() instead of del_timer() to safely deactivate
the node timer without any risk that it might be reactivated by the
timeout handler. There is no risk of deadlock here, since the two
functions never touch the same spinlocks.

3: We remove a pointless tipc_node_get() + tipc_node_put() from the
timeout handler.

Reported-by: Zhijiang Hu <[email protected]>
Acked-by: Ying Xue <[email protected]>
Signed-off-by: Jon Maloy <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/tipc/node.c | 24 +++++++++++-------------
1 file changed, 11 insertions(+), 13 deletions(-)

--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -102,9 +102,10 @@ static unsigned int tipc_hashfn(u32 addr

static void tipc_node_kref_release(struct kref *kref)
{
- struct tipc_node *node = container_of(kref, struct tipc_node, kref);
+ struct tipc_node *n = container_of(kref, struct tipc_node, kref);

- tipc_node_delete(node);
+ kfree(n->bc_entry.link);
+ kfree_rcu(n, rcu);
}

void tipc_node_put(struct tipc_node *node)
@@ -216,21 +217,20 @@ static void tipc_node_delete(struct tipc
{
list_del_rcu(&node->list);
hlist_del_rcu(&node->hash);
- kfree(node->bc_entry.link);
- kfree_rcu(node, rcu);
+ tipc_node_put(node);
+
+ del_timer_sync(&node->timer);
+ tipc_node_put(node);
}

void tipc_node_stop(struct net *net)
{
- struct tipc_net *tn = net_generic(net, tipc_net_id);
+ struct tipc_net *tn = tipc_net(net);
struct tipc_node *node, *t_node;

spin_lock_bh(&tn->node_list_lock);
- list_for_each_entry_safe(node, t_node, &tn->node_list, list) {
- if (del_timer(&node->timer))
- tipc_node_put(node);
- tipc_node_put(node);
- }
+ list_for_each_entry_safe(node, t_node, &tn->node_list, list)
+ tipc_node_delete(node);
spin_unlock_bh(&tn->node_list_lock);
}

@@ -313,9 +313,7 @@ static void tipc_node_timeout(unsigned l
if (rc & TIPC_LINK_DOWN_EVT)
tipc_node_link_down(n, bearer_id, false);
}
- if (!mod_timer(&n->timer, jiffies + n->keepalive_intv))
- tipc_node_get(n);
- tipc_node_put(n);
+ mod_timer(&n->timer, jiffies + n->keepalive_intv);
}

/**


2017-04-25 15:31:22

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 07/28] CIFS: remove bad_network_name flag

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Germano Percossi <[email protected]>

commit a0918f1ce6a43ac980b42b300ec443c154970979 upstream.

STATUS_BAD_NETWORK_NAME can be received during node failover,
causing the flag to be set and making the reconnect thread
always unsuccessful, thereafter.

Once the only place where it is set is removed, the remaining
bits are rendered moot.

Removing it does not prevent "mount" from failing when a non
existent share is passed.

What happens when the share really ceases to exist while the
share is mounted is undefined now as much as it was before.

Signed-off-by: Germano Percossi <[email protected]>
Reviewed-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
fs/cifs/cifsglob.h | 1 -
fs/cifs/smb2pdu.c | 5 -----
2 files changed, 6 deletions(-)

--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -906,7 +906,6 @@ struct cifs_tcon {
bool use_persistent:1; /* use persistent instead of durable handles */
#ifdef CONFIG_CIFS_SMB2
bool print:1; /* set if connection to printer share */
- bool bad_network_name:1; /* set if ret status STATUS_BAD_NETWORK_NAME */
__le32 capabilities;
__u32 share_flags;
__u32 maximal_access;
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -932,9 +932,6 @@ SMB2_tcon(const unsigned int xid, struct
else
return -EIO;

- if (tcon && tcon->bad_network_name)
- return -ENOENT;
-
if ((tcon && tcon->seal) &&
((ses->server->capabilities & SMB2_GLOBAL_CAP_ENCRYPTION) == 0)) {
cifs_dbg(VFS, "encryption requested but no server support");
@@ -1036,8 +1033,6 @@ tcon_exit:
tcon_error_exit:
if (rsp->hdr.Status == STATUS_BAD_NETWORK_NAME) {
cifs_dbg(VFS, "BAD_NETWORK_NAME: %s\n", tree);
- if (tcon)
- tcon->bad_network_name = true;
}
goto tcon_exit;
}


2017-04-25 15:31:32

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 05/28] ring-buffer: Have ring_buffer_iter_empty() return true when empty

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (VMware) <[email protected]>

commit 78f7a45dac2a2d2002f98a3a95f7979867868d73 upstream.

I noticed that reading the snapshot file when it is empty no longer gives a
status. It suppose to show the status of the snapshot buffer as well as how
to allocate and use it. For example:

># cat snapshot
# tracer: nop
#
#
# * Snapshot is allocated *
#
# Snapshot commands:
# echo 0 > snapshot : Clears and frees snapshot buffer
# echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
# Takes a snapshot of the main buffer.
# echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
# (Doesn't have to be '2' works with any number that
# is not a '0' or '1')

But instead it just showed an empty buffer:

># cat snapshot
# tracer: nop
#
# entries-in-buffer/entries-written: 0/0 #P:4
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |

What happened was that it was using the ring_buffer_iter_empty() function to
see if it was empty, and if it was, it showed the status. But that function
was returning false when it was empty. The reason was that the iter header
page was on the reader page, and the reader page was empty, but so was the
buffer itself. The check only tested to see if the iter was on the commit
page, but the commit page was no longer pointing to the reader page, but as
all pages were empty, the buffer is also.

Fixes: 651e22f2701b ("ring-buffer: Always reset iterator to reader page")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/ring_buffer.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3440,11 +3440,23 @@ EXPORT_SYMBOL_GPL(ring_buffer_iter_reset
int ring_buffer_iter_empty(struct ring_buffer_iter *iter)
{
struct ring_buffer_per_cpu *cpu_buffer;
+ struct buffer_page *reader;
+ struct buffer_page *head_page;
+ struct buffer_page *commit_page;
+ unsigned commit;

cpu_buffer = iter->cpu_buffer;

- return iter->head_page == cpu_buffer->commit_page &&
- iter->head == rb_commit_index(cpu_buffer);
+ /* Remember, trace recording is off when iterator is in use */
+ reader = cpu_buffer->reader_page;
+ head_page = cpu_buffer->head_page;
+ commit_page = cpu_buffer->commit_page;
+ commit = rb_page_commit(commit_page);
+
+ return ((iter->head_page == commit_page && iter->head == commit) ||
+ (iter->head_page == reader && commit_page == head_page &&
+ head_page->read == commit &&
+ iter->head == rb_page_commit(cpu_buffer->reader_page)));
}
EXPORT_SYMBOL_GPL(ring_buffer_iter_empty);



2017-04-25 15:31:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 04/28] tracing: Allocate the snapshot buffer before enabling probe

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Steven Rostedt (VMware) <[email protected]>

commit df62db5be2e5f070ecd1a5ece5945b590ee112e0 upstream.

Currently the snapshot trigger enables the probe and then allocates the
snapshot. If the probe triggers before the allocation, it could cause the
snapshot to fail and turn tracing off. It's best to allocate the snapshot
buffer first, and then enable the trigger. If something goes wrong in the
enabling of the trigger, the snapshot buffer is still allocated, but it can
also be freed by the user by writting zero into the snapshot buffer file.

Also add a check of the return status of alloc_snapshot().

Fixes: 77fd5c15e3 ("tracing: Add snapshot trigger to function probes")
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/trace/trace.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -6060,11 +6060,13 @@ ftrace_trace_snapshot_callback(struct ft
return ret;

out_reg:
- ret = register_ftrace_function_probe(glob, ops, count);
+ ret = alloc_snapshot(&global_trace);
+ if (ret < 0)
+ goto out;

- if (ret >= 0)
- alloc_snapshot(&global_trace);
+ ret = register_ftrace_function_probe(glob, ops, count);

+ out:
return ret < 0 ? ret : 0;
}



2017-04-25 15:31:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 03/28] KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Biggers <[email protected]>

commit c9f838d104fed6f2f61d68164712e3204bf5271b upstream.

This fixes CVE-2017-7472.

Running the following program as an unprivileged user exhausts kernel
memory by leaking thread keyrings:

#include <keyutils.h>

int main()
{
for (;;)
keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
}

Fix it by only creating a new thread keyring if there wasn't one before.
To make things more consistent, make install_thread_keyring_to_cred()
and install_process_keyring_to_cred() both return 0 if the corresponding
keyring is already present.

Fixes: d84f4f992cbd ("CRED: Inaugurate COW credentials")
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/keys/keyctl.c | 11 +++-------
security/keys/process_keys.c | 44 ++++++++++++++++++++++++++-----------------
2 files changed, 31 insertions(+), 24 deletions(-)

--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1228,8 +1228,8 @@ error:
* Read or set the default keyring in which request_key() will cache keys and
* return the old setting.
*
- * If a process keyring is specified then this will be created if it doesn't
- * yet exist. The old setting will be returned if successful.
+ * If a thread or process keyring is specified then it will be created if it
+ * doesn't yet exist. The old setting will be returned if successful.
*/
long keyctl_set_reqkey_keyring(int reqkey_defl)
{
@@ -1254,11 +1254,8 @@ long keyctl_set_reqkey_keyring(int reqke

case KEY_REQKEY_DEFL_PROCESS_KEYRING:
ret = install_process_keyring_to_cred(new);
- if (ret < 0) {
- if (ret != -EEXIST)
- goto error;
- ret = 0;
- }
+ if (ret < 0)
+ goto error;
goto set;

case KEY_REQKEY_DEFL_DEFAULT:
--- a/security/keys/process_keys.c
+++ b/security/keys/process_keys.c
@@ -125,13 +125,18 @@ error:
}

/*
- * Install a fresh thread keyring directly to new credentials. This keyring is
- * allowed to overrun the quota.
+ * Install a thread keyring to the given credentials struct if it didn't have
+ * one already. This is allowed to overrun the quota.
+ *
+ * Return: 0 if a thread keyring is now present; -errno on failure.
*/
int install_thread_keyring_to_cred(struct cred *new)
{
struct key *keyring;

+ if (new->thread_keyring)
+ return 0;
+
keyring = keyring_alloc("_tid", new->uid, new->gid, new,
KEY_POS_ALL | KEY_USR_VIEW,
KEY_ALLOC_QUOTA_OVERRUN, NULL);
@@ -143,7 +148,9 @@ int install_thread_keyring_to_cred(struc
}

/*
- * Install a fresh thread keyring, discarding the old one.
+ * Install a thread keyring to the current task if it didn't have one already.
+ *
+ * Return: 0 if a thread keyring is now present; -errno on failure.
*/
static int install_thread_keyring(void)
{
@@ -154,8 +161,6 @@ static int install_thread_keyring(void)
if (!new)
return -ENOMEM;

- BUG_ON(new->thread_keyring);
-
ret = install_thread_keyring_to_cred(new);
if (ret < 0) {
abort_creds(new);
@@ -166,17 +171,17 @@ static int install_thread_keyring(void)
}

/*
- * Install a process keyring directly to a credentials struct.
+ * Install a process keyring to the given credentials struct if it didn't have
+ * one already. This is allowed to overrun the quota.
*
- * Returns -EEXIST if there was already a process keyring, 0 if one installed,
- * and other value on any other error
+ * Return: 0 if a process keyring is now present; -errno on failure.
*/
int install_process_keyring_to_cred(struct cred *new)
{
struct key *keyring;

if (new->process_keyring)
- return -EEXIST;
+ return 0;

keyring = keyring_alloc("_pid", new->uid, new->gid, new,
KEY_POS_ALL | KEY_USR_VIEW,
@@ -189,11 +194,9 @@ int install_process_keyring_to_cred(stru
}

/*
- * Make sure a process keyring is installed for the current process. The
- * existing process keyring is not replaced.
+ * Install a process keyring to the current task if it didn't have one already.
*
- * Returns 0 if there is a process keyring by the end of this function, some
- * error otherwise.
+ * Return: 0 if a process keyring is now present; -errno on failure.
*/
static int install_process_keyring(void)
{
@@ -207,14 +210,18 @@ static int install_process_keyring(void)
ret = install_process_keyring_to_cred(new);
if (ret < 0) {
abort_creds(new);
- return ret != -EEXIST ? ret : 0;
+ return ret;
}

return commit_creds(new);
}

/*
- * Install a session keyring directly to a credentials struct.
+ * Install the given keyring as the session keyring of the given credentials
+ * struct, replacing the existing one if any. If the given keyring is NULL,
+ * then install a new anonymous session keyring.
+ *
+ * Return: 0 on success; -errno on failure.
*/
int install_session_keyring_to_cred(struct cred *cred, struct key *keyring)
{
@@ -249,8 +256,11 @@ int install_session_keyring_to_cred(stru
}

/*
- * Install a session keyring, discarding the old one. If a keyring is not
- * supplied, an empty one is invented.
+ * Install the given keyring as the session keyring of the current task,
+ * replacing the existing one if any. If the given keyring is NULL, then
+ * install a new anonymous session keyring.
+ *
+ * Return: 0 on success; -errno on failure.
*/
static int install_session_keyring(struct key *keyring)
{


2017-04-25 15:32:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 27/28] block: fix del_gendisk() vs blkdev_ioctl crash

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>

commit ac34f15e0c6d2fd58480052b6985f6991fb53bcc upstream.

When tearing down a block device early in its lifetime, userspace may
still be performing discovery actions like blkdev_ioctl() to re-read
partitions.

The nvdimm_revalidate_disk() implementation depends on
disk->driverfs_dev to be valid at entry. However, it is set to NULL in
del_gendisk() and fatally this is happening *before* the disk device is
deleted from userspace view.

There's no reason for del_gendisk() to clear ->driverfs_dev. That
device is the parent of the disk. It is guaranteed to not be freed
until the disk, as a child, drops its ->parent reference.

We could also fix this issue locally in nvdimm_revalidate_disk() by
using disk_to_dev(disk)->parent, but lets fix it globally since
->driverfs_dev follows the lifetime of the parent. Longer term we
should probably just add a @parent parameter to add_disk(), and stop
carrying this pointer in the gendisk.

BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffffa00340a8>] nvdimm_revalidate_disk+0x18/0x90 [libnvdimm]
CPU: 2 PID: 538 Comm: systemd-udevd Tainted: G O 4.4.0-rc5 #2257
[..]
Call Trace:
[<ffffffff8143e5c7>] rescan_partitions+0x87/0x2c0
[<ffffffff810f37f9>] ? __lock_is_held+0x49/0x70
[<ffffffff81438c62>] __blkdev_reread_part+0x72/0xb0
[<ffffffff81438cc5>] blkdev_reread_part+0x25/0x40
[<ffffffff8143982d>] blkdev_ioctl+0x4fd/0x9c0
[<ffffffff811246c9>] ? current_kernel_time64+0x69/0xd0
[<ffffffff812916dd>] block_ioctl+0x3d/0x50
[<ffffffff81264c38>] do_vfs_ioctl+0x308/0x560
[<ffffffff8115dbd1>] ? __audit_syscall_entry+0xb1/0x100
[<ffffffff810031d6>] ? do_audit_syscall_entry+0x66/0x70
[<ffffffff81264f09>] SyS_ioctl+0x79/0x90
[<ffffffff81902672>] entry_SYSCALL_64_fastpath+0x12/0x76

Cc: Jan Kara <[email protected]>
Cc: Jens Axboe <[email protected]>
Reported-by: Robert Hu <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
block/genhd.c | 1 -
1 file changed, 1 deletion(-)

--- a/block/genhd.c
+++ b/block/genhd.c
@@ -664,7 +664,6 @@ void del_gendisk(struct gendisk *disk)

kobject_put(disk->part0.holder_dir);
kobject_put(disk->slave_dir);
- disk->driverfs_dev = NULL;
if (!sysfs_deprecated)
sysfs_remove_link(block_depr, dev_name(disk_to_dev(disk)));
pm_runtime_set_memalloc_noio(disk_to_dev(disk), false);


2017-04-25 15:32:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 22/28] Tools: hv: kvp: ensure kvp device fd is closed on exec

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

commit 26840437cbd6d3625ea6ab34e17cd34bb810c861 upstream.

KVP daemon does fork()/exec() (with popen()) so we need to close our fds
to avoid sharing them with child processes. The immediate implication of
not doing so I see is SELinux complaining about 'ip' trying to access
'/dev/vmbus/hv_kvp'.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
tools/hv/hv_kvp_daemon.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -1433,7 +1433,7 @@ int main(int argc, char *argv[])
openlog("KVP", 0, LOG_USER);
syslog(LOG_INFO, "KVP starting; pid is:%d", getpid());

- kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR);
+ kvp_fd = open("/dev/vmbus/hv_kvp", O_RDWR | O_CLOEXEC);

if (kvp_fd < 0) {
syslog(LOG_ERR, "open /dev/vmbus/hv_kvp failed; error: %d %s",


2017-04-25 15:32:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 23/28] Drivers: hv: balloon: keep track of where ha_region starts

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Vitaly Kuznetsov <[email protected]>

commit 7cf3b79ec85ee1a5bbaaf936bb1d050dc652983b upstream.

Windows 2012 (non-R2) does not specify hot add region in hot add requests
and the logic in hot_add_req() is trying to find a 128Mb-aligned region
covering the request. It may also happen that host's requests are not 128Mb
aligned and the created ha_region will start before the first specified
PFN. We can't online these non-present pages but we don't remember the real
start of the region.

This is a regression introduced by the commit 5abbbb75d733 ("Drivers: hv:
hv_balloon: don't lose memory when onlining order is not natural"). While
the idea of keeping the 'moving window' was wrong (as there is no guarantee
that hot add requests come ordered) we should still keep track of
covered_start_pfn. This is not a revert, the logic is different.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/hv_balloon.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -430,13 +430,14 @@ struct dm_info_msg {
* currently hot added. We hot add in multiples of 128M
* chunks; it is possible that we may not be able to bring
* online all the pages in the region. The range
- * covered_end_pfn defines the pages that can
+ * covered_start_pfn:covered_end_pfn defines the pages that can
* be brough online.
*/

struct hv_hotadd_state {
struct list_head list;
unsigned long start_pfn;
+ unsigned long covered_start_pfn;
unsigned long covered_end_pfn;
unsigned long ha_end_pfn;
unsigned long end_pfn;
@@ -682,7 +683,8 @@ static void hv_online_page(struct page *

list_for_each(cur, &dm_device.ha_region_list) {
has = list_entry(cur, struct hv_hotadd_state, list);
- cur_start_pgp = (unsigned long)pfn_to_page(has->start_pfn);
+ cur_start_pgp = (unsigned long)
+ pfn_to_page(has->covered_start_pfn);
cur_end_pgp = (unsigned long)pfn_to_page(has->covered_end_pfn);

if (((unsigned long)pg >= cur_start_pgp) &&
@@ -854,6 +856,7 @@ static unsigned long process_hot_add(uns
list_add_tail(&ha_region->list, &dm_device.ha_region_list);
ha_region->start_pfn = rg_start;
ha_region->ha_end_pfn = rg_start;
+ ha_region->covered_start_pfn = pg_start;
ha_region->covered_end_pfn = pg_start;
ha_region->end_pfn = rg_start + rg_size;
}


2017-04-25 15:32:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 26/28] x86, pmem: fix broken __copy_user_nocache cache-bypass assumptions

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>

commit 11e63f6d920d6f2dfd3cd421e939a4aec9a58dcd upstream.

Before we rework the "pmem api" to stop abusing __copy_user_nocache()
for memcpy_to_pmem() we need to fix cases where we may strand dirty data
in the cpu cache. The problem occurs when copy_from_iter_pmem() is used
for arbitrary data transfers from userspace. There is no guarantee that
these transfers, performed by dax_iomap_actor(), will have aligned
destinations or aligned transfer lengths. Backstop the usage
__copy_user_nocache() with explicit cache management in these unaligned
cases.

Yes, copy_from_iter_pmem() is now too big for an inline, but addressing
that is saved for a later patch that moves the entirety of the "pmem
api" into the pmem driver directly.

Fixes: 5de490daec8b ("pmem: add copy_from_iter_pmem() and clear_pmem()")
Cc: <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Reviewed-by: Ross Zwisler <[email protected]>
Signed-off-by: Toshi Kani <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/pmem.h | 45 +++++++++++++++++++++++++++++++-------------
1 file changed, 32 insertions(+), 13 deletions(-)

--- a/arch/x86/include/asm/pmem.h
+++ b/arch/x86/include/asm/pmem.h
@@ -72,8 +72,8 @@ static inline void arch_wmb_pmem(void)
* @size: number of bytes to write back
*
* Write back a cache range using the CLWB (cache line write back)
- * instruction. This function requires explicit ordering with an
- * arch_wmb_pmem() call. This API is internal to the x86 PMEM implementation.
+ * instruction. Note that @size is internally rounded up to be cache
+ * line size aligned.
*/
static inline void __arch_wb_cache_pmem(void *vaddr, size_t size)
{
@@ -87,15 +87,6 @@ static inline void __arch_wb_cache_pmem(
clwb(p);
}

-/*
- * copy_from_iter_nocache() on x86 only uses non-temporal stores for iovec
- * iterators, so for other types (bvec & kvec) we must do a cache write-back.
- */
-static inline bool __iter_needs_pmem_wb(struct iov_iter *i)
-{
- return iter_is_iovec(i) == false;
-}
-
/**
* arch_copy_from_iter_pmem - copy data from an iterator to PMEM
* @addr: PMEM destination address
@@ -114,8 +105,36 @@ static inline size_t arch_copy_from_iter
/* TODO: skip the write-back by always using non-temporal stores */
len = copy_from_iter_nocache(vaddr, bytes, i);

- if (__iter_needs_pmem_wb(i))
- __arch_wb_cache_pmem(vaddr, bytes);
+ /*
+ * In the iovec case on x86_64 copy_from_iter_nocache() uses
+ * non-temporal stores for the bulk of the transfer, but we need
+ * to manually flush if the transfer is unaligned. A cached
+ * memory copy is used when destination or size is not naturally
+ * aligned. That is:
+ * - Require 8-byte alignment when size is 8 bytes or larger.
+ * - Require 4-byte alignment when size is 4 bytes.
+ *
+ * In the non-iovec case the entire destination needs to be
+ * flushed.
+ */
+ if (iter_is_iovec(i)) {
+ unsigned long flushed, dest = (unsigned long) addr;
+
+ if (bytes < 8) {
+ if (!IS_ALIGNED(dest, 4) || (bytes != 4))
+ __arch_wb_cache_pmem(addr, 1);
+ } else {
+ if (!IS_ALIGNED(dest, 8)) {
+ dest = ALIGN(dest, boot_cpu_data.x86_clflush_size);
+ __arch_wb_cache_pmem(addr, 1);
+ }
+
+ flushed = dest - (unsigned long) addr;
+ if (bytes > flushed && !IS_ALIGNED(bytes - flushed, 8))
+ __arch_wb_cache_pmem(addr + bytes - 1, 1);
+ }
+ } else
+ __arch_wb_cache_pmem(addr, bytes);

return len;
}


2017-04-25 15:33:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 02/28] KEYS: Change the name of the dead type to ".dead" to prevent user access

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Howells <[email protected]>

commit c1644fe041ebaf6519f6809146a77c3ead9193af upstream.

This fixes CVE-2017-6951.

Userspace should not be able to do things with the "dead" key type as it
doesn't have some of the helper functions set upon it that the kernel
needs. Attempting to use it may cause the kernel to crash.

Fix this by changing the name of the type to ".dead" so that it's rejected
up front on userspace syscalls by key_get_type_from_user().

Though this doesn't seem to affect recent kernels, it does affect older
ones, certainly those prior to:

commit c06cfb08b88dfbe13be44a69ae2fdc3a7c902d81
Author: David Howells <[email protected]>
Date: Tue Sep 16 17:36:06 2014 +0100
KEYS: Remove key_type::match in favour of overriding default by match_preparse

which went in before 3.18-rc1.

Signed-off-by: David Howells <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/keys/gc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -46,7 +46,7 @@ static unsigned long key_gc_flags;
* immediately unlinked.
*/
struct key_type key_type_dead = {
- .name = "dead",
+ .name = ".dead",
};

/*


2017-04-25 15:34:24

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 20/28] kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Suzuki K Poulose <[email protected]>

commit 8b3405e345b5a098101b0c31b264c812bba045d9 upstream.

In kvm_free_stage2_pgd() we don't hold the kvm->mmu_lock while calling
unmap_stage2_range() on the entire memory range for the guest. This could
cause problems with other callers (e.g, munmap on a memslot) trying to
unmap a range. And since we have to unmap the entire Guest memory range
holding a spinlock, make sure we yield the lock if necessary, after we
unmap each PUD range.

Fixes: commit d5d8184d35c9 ("KVM: ARM: Memory virtualization setup")
Cc: Paolo Bonzini <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: Mark Rutland <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
[ Avoid vCPU starvation and lockup detector warnings ]
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/arm/kvm/mmu.c | 12 ++++++++++++
1 file changed, 12 insertions(+)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -300,6 +300,14 @@ static void unmap_range(struct kvm *kvm,
next = kvm_pgd_addr_end(addr, end);
if (!pgd_none(*pgd))
unmap_puds(kvm, pgd, addr, next);
+ /*
+ * If we are dealing with a large range in
+ * stage2 table, release the kvm->mmu_lock
+ * to prevent starvation and lockup detector
+ * warnings.
+ */
+ if (kvm && (next != end))
+ cond_resched_lock(&kvm->mmu_lock);
} while (pgd++, addr = next, addr != end);
}

@@ -738,6 +746,7 @@ int kvm_alloc_stage2_pgd(struct kvm *kvm
*/
static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
{
+ assert_spin_locked(&kvm->mmu_lock);
unmap_range(kvm, kvm->arch.pgd, start, size);
}

@@ -824,7 +833,10 @@ void kvm_free_stage2_pgd(struct kvm *kvm
if (kvm->arch.pgd == NULL)
return;

+ spin_lock(&kvm->mmu_lock);
unmap_stage2_range(kvm, 0, KVM_PHYS_SIZE);
+ spin_unlock(&kvm->mmu_lock);
+
kvm_free_hwpgd(kvm_get_hwpgd(kvm));
if (KVM_PREALLOC_LEVEL > 0)
kfree(kvm->arch.pgd);


2017-04-25 15:34:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 17/28] ubi/upd: Always flush after prepared for an update

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sebastian Siewior <[email protected]>

commit 9cd9a21ce070be8a918ffd3381468315a7a76ba6 upstream.

In commit 6afaf8a484cb ("UBI: flush wl before clearing update marker") I
managed to trigger and fix a similar bug. Now here is another version of
which I assumed it wouldn't matter back then but it turns out UBI has a
check for it and will error out like this:

|ubi0 warning: validate_vid_hdr: inconsistent used_ebs
|ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592

All you need to trigger this is? "ubiupdatevol /dev/ubi0_0 file" + a
powercut in the middle of the operation.
ubi_start_update() sets the update-marker and puts all EBs on the erase
list. After that userland can proceed to write new data while the old EB
aren't erased completely. A powercut at this point is usually not that
much of a tragedy. UBI won't give read access to the static volume
because it has the update marker. It will most likely set the corrupted
flag because it misses some EBs.
So we are all good. Unless the size of the image that has been written
differs from the old image in the magnitude of at least one EB. In that
case UBI will find two different values for `used_ebs' and refuse to
attach the image with the error message mentioned above.

So in order not to get in the situation, the patch will ensure that we
wait until everything is removed before it tries to write any data.
The alternative would be to detect such a case and remove all EBs at the
attached time after we processed the volume-table and see the
update-marker set. The patch looks bigger and I doubt it is worth it
since usually the write() will wait from time to time for a new EB since
usually there not that many spare EB that can be used.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Richard Weinberger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/mtd/ubi/upd.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/mtd/ubi/upd.c
+++ b/drivers/mtd/ubi/upd.c
@@ -148,11 +148,11 @@ int ubi_start_update(struct ubi_device *
return err;
}

- if (bytes == 0) {
- err = ubi_wl_flush(ubi, UBI_ALL, UBI_ALL);
- if (err)
- return err;
+ err = ubi_wl_flush(ubi, UBI_ALL, UBI_ALL);
+ if (err)
+ return err;

+ if (bytes == 0) {
err = clear_update_marker(ubi, vol, 0);
if (err)
return err;


2017-04-25 15:34:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 18/28] powerpc/kprobe: Fix oops when kprobed on stdu instruction

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ravi Bangoria <[email protected]>

commit 9e1ba4f27f018742a1aa95d11e35106feba08ec1 upstream.

If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
OOPS:

Bad kernel stack pointer cd93c840 at c000000000009868
Oops: Bad kernel stack pointer, sig: 6 [#1]
...
GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
...
NIP [c000000000009868] resume_kernel+0x2c/0x58
LR [c000000000006208] program_check_common+0x108/0x180

On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
not emulate actual store in emulate_step() because it may corrupt the exception
frame. So the kernel does the actual store operation in exception return code
i.e. resume_kernel().

resume_kernel() loads the saved stack pointer from memory using lwz, which only
loads the low 32-bits of the address, causing the kernel crash.

Fix this by loading the 64-bit value instead.

Fixes: be96f63375a1 ("powerpc: Split out instruction analysis part of emulate_step()")
Signed-off-by: Ravi Bangoria <[email protected]>
Reviewed-by: Naveen N. Rao <[email protected]>
Reviewed-by: Ananth N Mavinakayanahalli <[email protected]>
[mpe: Change log massage, add stable tag]
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/entry_64.S | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -716,7 +716,7 @@ resume_kernel:

addi r8,r1,INT_FRAME_SIZE /* Get the kprobed function entry */

- lwz r3,GPR1(r1)
+ ld r3,GPR1(r1)
subi r3,r3,INT_FRAME_SIZE /* dst: Allocate a trampoline exception frame */
mr r4,r1 /* src: current exception frame */
mr r1,r3 /* Reroute the trampoline frame to r1 */
@@ -730,8 +730,8 @@ resume_kernel:
addi r6,r6,8
bdnz 2b

- /* Do real store operation to complete stwu */
- lwz r5,GPR1(r1)
+ /* Do real store operation to complete stdu */
+ ld r5,GPR1(r1)
std r8,0(r5)

/* Clear _TIF_EMULATE_STACK_STORE flag */


2017-04-25 15:35:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 15/28] mmc: sdhci-esdhc-imx: increase the pad I/O drive strength for DDR50 card

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Haibo Chen <[email protected]>

commit 9f327845358d3dd0d8a5a7a5436b0aa5c432e757 upstream.

Currently for DDR50 card, it need tuning in default. We meet tuning fail
issue for DDR50 card and some data CRC error when DDR50 sd card works.

This is because the default pad I/O drive strength can't make sure DDR50
card work stable. So increase the pad I/O drive strength for DDR50 card,
and use pins_100mhz.

This fixes DDR50 card support for IMX since DDR50 tuning was enabled from
commit 9faac7b95ea4 ("mmc: sdhci: enable tuning for DDR50")

Tested-and-reported-by: Tim Harvey <[email protected]>
Signed-off-by: Haibo Chen <[email protected]>
Acked-by: Dong Aisheng <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Signed-off-by: Ulf Hansson <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/mmc/host/sdhci-esdhc-imx.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/mmc/host/sdhci-esdhc-imx.c
+++ b/drivers/mmc/host/sdhci-esdhc-imx.c
@@ -804,6 +804,7 @@ static int esdhc_change_pinstate(struct

switch (uhs) {
case MMC_TIMING_UHS_SDR50:
+ case MMC_TIMING_UHS_DDR50:
pinctrl = imx_data->pins_100mhz;
break;
case MMC_TIMING_UHS_SDR104:


2017-04-25 15:35:28

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 11/28] Drivers: hv: vmbus: Reduce the delay between retries in vmbus_post_msg()

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: K. Y. Srinivasan <[email protected]>

commit 8de0d7e951826d7592e0ba1da655b175c4aa0923 upstream.

The current delay between retries is unnecessarily high and is negatively
affecting the time it takes to boot the system.

Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Sumit Semwal <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/hv/connection.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -429,7 +429,7 @@ int vmbus_post_msg(void *buffer, size_t
union hv_connection_id conn_id;
int ret = 0;
int retries = 0;
- u32 msec = 1;
+ u32 usec = 1;

conn_id.asu32 = 0;
conn_id.u.id = VMBUS_MESSAGE_CONNECTION_ID;
@@ -462,9 +462,9 @@ int vmbus_post_msg(void *buffer, size_t
}

retries++;
- msleep(msec);
- if (msec < 2048)
- msec *= 2;
+ udelay(usec);
+ if (usec < 2048)
+ usec *= 2;
}
return ret;
}


2017-04-25 18:18:36

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On 04/25/2017 09:08 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.64 release.
> There are 28 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.64-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

2017-04-25 21:26:32

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On Tue, Apr 25, 2017 at 04:08:31PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.64 release.
> There are 28 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> Anything received after that time might be too late.
>
Early feedback: Various powerpc builds (defconfig, allmodconfig,
ppc64e_defconfig, cell_defconfig, maple_defconfig) fail with

arch/powerpc/kernel/misc_64.S: Assembler messages:
arch/powerpc/kernel/misc_64.S:72: Error:
.localentry expression for `flush_icache_range' does not evaluate to a constant

This appears to be due to 'powerpc/64: Fix flush_(d|i)cache_range() called from
modules'. No idea what is wrong with it, though; maybe some context patch is
missing. Copying the author and Michael.

Guenter

2017-04-26 02:27:29

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On 04/25/2017 08:08 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.64 release.
> There are 28 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> Anything received after that time might be too late.
>

Build results:
total: 149 pass: 144 fail: 5
Failed builds:
powerpc:defconfig
powerpc:allmodconfig
powerpc:ppc64e_defconfig
powerpc:cell_defconfig
powerpc:maple_defconfig

Qemu test results:
total: 115 pass: 110 fail: 5
Failed tests:
powerpc:mac99:ppc64_book3s_defconfig:nosmp
powerpc:mac99:ppc64_book3s_defconfig:smp4
powerpc:pseries:pseries_defconfig
powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
powerpc:mpc8544ds:ppc64_e5500_defconfig:smp

As mentioned earlier, the failures are

arch/powerpc/kernel/misc_64.S: Assembler messages:
arch/powerpc/kernel/misc_64.S:72: Error: unknown pseudo-op: `.localentry'

or:

arch/powerpc/kernel/misc_64.S: Assembler messages:
arch/powerpc/kernel/misc_64.S:72: Error: .localentry expression for `flush_icache_range' does not evaluate to a constant

The error message depends on the compiler / binutils version.

Details are available at http://kerneltests.org/builders.

Guenter

2017-04-26 08:32:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On Tue, Apr 25, 2017 at 12:18:24PM -0600, Shuah Khan wrote:
> On 04/25/2017 09:08 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.4.64 release.
> > There are 28 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.64-rc1.gz
> > or in the git tree and branch at:
> > git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> > and the diffstat can be found below.
> >
> > thanks,
> >
> > greg k-h
>
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all of these and letting me know.

greg k-h

2017-04-26 08:32:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On Tue, Apr 25, 2017 at 07:27:18PM -0700, Guenter Roeck wrote:
> On 04/25/2017 08:08 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.4.64 release.
> > There are 28 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 149 pass: 144 fail: 5
> Failed builds:
> powerpc:defconfig
> powerpc:allmodconfig
> powerpc:ppc64e_defconfig
> powerpc:cell_defconfig
> powerpc:maple_defconfig
>
> Qemu test results:
> total: 115 pass: 110 fail: 5
> Failed tests:
> powerpc:mac99:ppc64_book3s_defconfig:nosmp
> powerpc:mac99:ppc64_book3s_defconfig:smp4
> powerpc:pseries:pseries_defconfig
> powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
> powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
>
> As mentioned earlier, the failures are
>
> arch/powerpc/kernel/misc_64.S: Assembler messages:
> arch/powerpc/kernel/misc_64.S:72: Error: unknown pseudo-op: `.localentry'
>
> or:
>
> arch/powerpc/kernel/misc_64.S: Assembler messages:
> arch/powerpc/kernel/misc_64.S:72: Error: .localentry expression for `flush_icache_range' does not evaluate to a constant
>
> The error message depends on the compiler / binutils version.

This patch is now dropped, so the ppc builds should now work.

thanks for testing all of these.

greg k-h

2017-04-26 13:11:49

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On 04/26/2017 06:10 AM, Guenter Roeck wrote:
> On 04/26/2017 01:31 AM, Greg Kroah-Hartman wrote:
>> On Tue, Apr 25, 2017 at 07:27:18PM -0700, Guenter Roeck wrote:
>>> On 04/25/2017 08:08 AM, Greg Kroah-Hartman wrote:
>>>> This is the start of the stable review cycle for the 4.4.64 release.
>>>> There are 28 patches in this series, all will be posted as a response
>>>> to this one. If anyone has any issues with these being applied, please
>>>> let me know.
>>>>
>>>> Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
>>>> Anything received after that time might be too late.
>>>>
>>>
>>> Build results:
>>> total: 149 pass: 144 fail: 5
>>> Failed builds:
>>> powerpc:defconfig
>>> powerpc:allmodconfig
>>> powerpc:ppc64e_defconfig
>>> powerpc:cell_defconfig
>>> powerpc:maple_defconfig
>>>
>>> Qemu test results:
>>> total: 115 pass: 110 fail: 5
>>> Failed tests:
>>> powerpc:mac99:ppc64_book3s_defconfig:nosmp
>>> powerpc:mac99:ppc64_book3s_defconfig:smp4
>>> powerpc:pseries:pseries_defconfig
>>> powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
>>> powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
>>>
>>> As mentioned earlier, the failures are
>>>
>>> arch/powerpc/kernel/misc_64.S: Assembler messages:
>>> arch/powerpc/kernel/misc_64.S:72: Error: unknown pseudo-op: `.localentry'
>>>
>>> or:
>>>
>>> arch/powerpc/kernel/misc_64.S: Assembler messages:
>>> arch/powerpc/kernel/misc_64.S:72: Error: .localentry expression for `flush_icache_range' does not evaluate to a constant
>>>
>>> The error message depends on the compiler / binutils version.
>>
>> This patch is now dropped, so the ppc builds should now work.
>>
>
> Did you push the change ? My builder didn't pick it up.
>

Please ignore. It did.

Guenter


2017-04-26 13:11:09

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On 04/26/2017 01:31 AM, Greg Kroah-Hartman wrote:
> On Tue, Apr 25, 2017 at 07:27:18PM -0700, Guenter Roeck wrote:
>> On 04/25/2017 08:08 AM, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 4.4.64 release.
>>> There are 28 patches in this series, all will be posted as a response
>>> to this one. If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
>>> Anything received after that time might be too late.
>>>
>>
>> Build results:
>> total: 149 pass: 144 fail: 5
>> Failed builds:
>> powerpc:defconfig
>> powerpc:allmodconfig
>> powerpc:ppc64e_defconfig
>> powerpc:cell_defconfig
>> powerpc:maple_defconfig
>>
>> Qemu test results:
>> total: 115 pass: 110 fail: 5
>> Failed tests:
>> powerpc:mac99:ppc64_book3s_defconfig:nosmp
>> powerpc:mac99:ppc64_book3s_defconfig:smp4
>> powerpc:pseries:pseries_defconfig
>> powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
>> powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
>>
>> As mentioned earlier, the failures are
>>
>> arch/powerpc/kernel/misc_64.S: Assembler messages:
>> arch/powerpc/kernel/misc_64.S:72: Error: unknown pseudo-op: `.localentry'
>>
>> or:
>>
>> arch/powerpc/kernel/misc_64.S: Assembler messages:
>> arch/powerpc/kernel/misc_64.S:72: Error: .localentry expression for `flush_icache_range' does not evaluate to a constant
>>
>> The error message depends on the compiler / binutils version.
>
> This patch is now dropped, so the ppc builds should now work.
>

Did you push the change ? My builder didn't pick it up.

Guenter

2017-04-26 14:39:38

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On Wed, Apr 26, 2017 at 10:31:56AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Apr 25, 2017 at 07:27:18PM -0700, Guenter Roeck wrote:
> > On 04/25/2017 08:08 AM, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 4.4.64 release.
> > > There are 28 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> > >
> > > Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> > > Anything received after that time might be too late.
> > >
> >
> > Build results:
> > total: 149 pass: 144 fail: 5
> > Failed builds:
> > powerpc:defconfig
> > powerpc:allmodconfig
> > powerpc:ppc64e_defconfig
> > powerpc:cell_defconfig
> > powerpc:maple_defconfig
> >
> > Qemu test results:
> > total: 115 pass: 110 fail: 5
> > Failed tests:
> > powerpc:mac99:ppc64_book3s_defconfig:nosmp
> > powerpc:mac99:ppc64_book3s_defconfig:smp4
> > powerpc:pseries:pseries_defconfig
> > powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
> > powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
> >
> > As mentioned earlier, the failures are
> >
> > arch/powerpc/kernel/misc_64.S: Assembler messages:
> > arch/powerpc/kernel/misc_64.S:72: Error: unknown pseudo-op: `.localentry'
> >
> > or:
> >
> > arch/powerpc/kernel/misc_64.S: Assembler messages:
> > arch/powerpc/kernel/misc_64.S:72: Error: .localentry expression for `flush_icache_range' does not evaluate to a constant
> >
> > The error message depends on the compiler / binutils version.
>
> This patch is now dropped, so the ppc builds should now work.
>
Confirmed; all is good now.

Thanks,
Guenter

2017-04-26 15:49:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/28] 4.4.64-stable review

On Wed, Apr 26, 2017 at 07:39:21AM -0700, Guenter Roeck wrote:
> On Wed, Apr 26, 2017 at 10:31:56AM +0200, Greg Kroah-Hartman wrote:
> > On Tue, Apr 25, 2017 at 07:27:18PM -0700, Guenter Roeck wrote:
> > > On 04/25/2017 08:08 AM, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 4.4.64 release.
> > > > There are 28 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Thu Apr 27 15:08:00 UTC 2017.
> > > > Anything received after that time might be too late.
> > > >
> > >
> > > Build results:
> > > total: 149 pass: 144 fail: 5
> > > Failed builds:
> > > powerpc:defconfig
> > > powerpc:allmodconfig
> > > powerpc:ppc64e_defconfig
> > > powerpc:cell_defconfig
> > > powerpc:maple_defconfig
> > >
> > > Qemu test results:
> > > total: 115 pass: 110 fail: 5
> > > Failed tests:
> > > powerpc:mac99:ppc64_book3s_defconfig:nosmp
> > > powerpc:mac99:ppc64_book3s_defconfig:smp4
> > > powerpc:pseries:pseries_defconfig
> > > powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
> > > powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
> > >
> > > As mentioned earlier, the failures are
> > >
> > > arch/powerpc/kernel/misc_64.S: Assembler messages:
> > > arch/powerpc/kernel/misc_64.S:72: Error: unknown pseudo-op: `.localentry'
> > >
> > > or:
> > >
> > > arch/powerpc/kernel/misc_64.S: Assembler messages:
> > > arch/powerpc/kernel/misc_64.S:72: Error: .localentry expression for `flush_icache_range' does not evaluate to a constant
> > >
> > > The error message depends on the compiler / binutils version.
> >
> > This patch is now dropped, so the ppc builds should now work.
> >
> Confirmed; all is good now.

Wonderful, thanks for letting me know.

greg k-h