2018-02-09 14:17:53

by Greg Kroah-Hartman

Subject: [PATCH 4.9 00/92] 4.9.81-stable review

This is the start of the stable review cycle for the 4.9.81 release.
There are 92 patches in this series, all of which will be posted as responses
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.81-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.81-rc1

Borislav Petkov <[email protected]>
x86/microcode: Do the family check first

Laurent Pinchart <[email protected]>
drm: rcar-du: Fix race condition when disabling planes at CRTC stop

Laurent Pinchart <[email protected]>
drm: rcar-du: Use the VBK interrupt for vblank events

Kuninori Morimoto <[email protected]>
ASoC: rsnd: avoid duplicate free_irq()

Kuninori Morimoto <[email protected]>
ASoC: rsnd: don't call free_irq() on Parent SSI

Julian Scheel <[email protected]>
ASoC: simple-card: Fix misleading error message

Robert Baronescu <[email protected]>
crypto: tcrypt - fix S/G table for test_aead_speed()

KarimAllah Ahmed <[email protected]>
KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL

KarimAllah Ahmed <[email protected]>
KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL

KarimAllah Ahmed <[email protected]>
KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES

Ashok Raj <[email protected]>
KVM/x86: Add IBPB support

Paolo Bonzini <[email protected]>
KVM: VMX: make MSR bitmaps per-VCPU

Paolo Bonzini <[email protected]>
KVM: VMX: introduce alloc_loaded_vmcs

Jim Mattson <[email protected]>
KVM: nVMX: Eliminate vmcs02 pool

David Matlack <[email protected]>
KVM: nVMX: mark vmcs12 pages dirty on L2 exit

David Hildenbrand <[email protected]>
KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail

David Hildenbrand <[email protected]>
KVM: nVMX: kmap() can't fail

Darren Kenny <[email protected]>
x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL

Arnd Bergmann <[email protected]>
x86/pti: Mark constant arrays as __initconst

KarimAllah Ahmed <[email protected]>
x86/spectre: Simplify spectre_v2 command line parsing

David Woodhouse <[email protected]>
x86/retpoline: Avoid retpolines for built-in __init functions

Dan Williams <[email protected]>
x86/kvm: Update spectre-v1 mitigation

Josh Poimboeuf <[email protected]>
x86/paravirt: Remove 'noreplace-paravirt' cmdline option

David Woodhouse <[email protected]>
x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel

Colin Ian King <[email protected]>
x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"

Dan Williams <[email protected]>
x86/spectre: Report get_user mitigation for spectre_v1

Dan Williams <[email protected]>
nl80211: Sanitize array index in parse_txq_params

Dan Williams <[email protected]>
vfs, fdtable: Prevent bounds-check bypass via speculative execution

Dan Williams <[email protected]>
x86/syscall: Sanitize syscall table de-references under speculation

Dan Williams <[email protected]>
x86/get_user: Use pointer masking to limit speculation

Dan Williams <[email protected]>
x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec

Dan Williams <[email protected]>
x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end}

Dan Williams <[email protected]>
x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec

Dan Williams <[email protected]>
x86: Introduce barrier_nospec

Dan Williams <[email protected]>
x86: Implement array_index_mask_nospec

Dan Williams <[email protected]>
array_index_nospec: Sanitize speculative array de-references

Mark Rutland <[email protected]>
Documentation: Document array_index_nospec

Andy Lutomirski <[email protected]>
x86/asm: Move 'status' from thread_struct to thread_info

Andy Lutomirski <[email protected]>
x86/entry/64: Push extra regs right away

Andy Lutomirski <[email protected]>
x86/entry/64: Remove the SYSCALL64 fast path

Dou Liyang <[email protected]>
x86/spectre: Check CONFIG_RETPOLINE in command line parser

Borislav Petkov <[email protected]>
x86/retpoline: Simplify vmexit_fill_RSB()

David Woodhouse <[email protected]>
x86/cpufeatures: Clean up Spectre v2 related CPUID flags

Thomas Gleixner <[email protected]>
x86/cpu/bugs: Make retpoline module warning conditional

Borislav Petkov <[email protected]>
x86/bugs: Drop one "mitigation" from dmesg

Borislav Petkov <[email protected]>
x86/nospec: Fix header guards names

Borislav Petkov <[email protected]>
x86/alternative: Print unadorned pointers

David Woodhouse <[email protected]>
x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support

David Woodhouse <[email protected]>
x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes

David Woodhouse <[email protected]>
x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

David Woodhouse <[email protected]>
x86/msr: Add definitions for new speculation control MSRs

David Woodhouse <[email protected]>
x86/cpufeatures: Add AMD feature bits for Speculation Control

David Woodhouse <[email protected]>
x86/cpufeatures: Add Intel feature bits for Speculation Control

David Woodhouse <[email protected]>
x86/cpufeatures: Add CPUID_7_EDX CPUID leaf

Andi Kleen <[email protected]>
module/retpoline: Warn about missing retpoline in module

Peter Zijlstra <[email protected]>
KVM: VMX: Make indirect call speculation safe

Peter Zijlstra <[email protected]>
KVM: x86: Make indirect calls in emulator speculation safe

Waiman Long <[email protected]>
x86/retpoline: Remove the esp/rsp thunk

Eric Biggers <[email protected]>
KEYS: encrypted: fix buffer overread in valid_master_desc()

Takashi Iwai <[email protected]>
b43: Add missing MODULE_FIRMWARE()

Jesse Chan <[email protected]>
media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

Borislav Petkov <[email protected]>
x86/microcode/AMD: Do not load when running on a hypervisor

Josh Poimboeuf <[email protected]>
x86/asm: Fix inline asm call constraints for GCC 4.4

Eric Dumazet <[email protected]>
soreuseport: fix mem leak in reuseport_add_sock()

Martin KaFai Lau <[email protected]>
ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only

Paolo Abeni <[email protected]>
cls_u32: add missing RCU annotation.

Neal Cardwell <[email protected]>
tcp_bbr: fix pacing_gain to always be unity when using lt_bw

Jason Wang <[email protected]>
vhost_net: stop device during reset owner

Li RongQing <[email protected]>
tcp: release sk_frag.page in tcp_disconnect

Chunhao Lin <[email protected]>
r8169: fix RTL8168EP take too long to complete driver initialization.

Kristian Evensen <[email protected]>
qmi_wwan: Add support for Quectel EP06

Junxiao Bi <[email protected]>
qlcnic: fix deadlock bug

Eric Dumazet <[email protected]>
net: igmp: add a missing rcu locking section

Nikolay Aleksandrov <[email protected]>
ip6mr: fix stale iterator

Sebastian Andrzej Siewior <[email protected]>
serial: core: mark port as initialized after successful IRQ change

Hugh Dickins <[email protected]>
kaiser: allocate pgd with order 0 when pti=off

Dave Hansen <[email protected]>
x86/pti: Make unpoison of pgd for trusted boot work for real

Hugh Dickins <[email protected]>
kaiser: fix intel_bts perf crashes

Jesse Chan <[email protected]>
ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

Jesse Chan <[email protected]>
pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

Jesse Chan <[email protected]>
auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

Michael Ellerman <[email protected]>
powerpc/64s: Allow control of RFI flush via debugfs

Michael Ellerman <[email protected]>
powerpc/64s: Wire up cpu_show_meltdown()

Oliver O'Halloran <[email protected]>
powerpc/powernv: Check device-tree for RFI flush settings

Michael Neuling <[email protected]>
powerpc/pseries: Query hypervisor for RFI flush settings

Michael Ellerman <[email protected]>
powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti

Michael Ellerman <[email protected]>
powerpc/64s: Add support for RFI flush of L1-D cache

Nicholas Piggin <[email protected]>
powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL

Nicholas Piggin <[email protected]>
powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL

Nicholas Piggin <[email protected]>
powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL

Nicholas Piggin <[email protected]>
powerpc/64: Add macros for annotating the destination of rfid/hrfid

Michael Neuling <[email protected]>
powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper


-------------

Diffstat:

Documentation/kernel-parameters.txt | 2 -
Documentation/speculation.txt | 90 +++
Makefile | 4 +-
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/exception-64e.h | 6 +
arch/powerpc/include/asm/exception-64s.h | 53 ++
arch/powerpc/include/asm/feature-fixups.h | 15 +
arch/powerpc/include/asm/hvcall.h | 17 +
arch/powerpc/include/asm/paca.h | 10 +
arch/powerpc/include/asm/plpar_wrappers.h | 14 +
arch/powerpc/include/asm/setup.h | 13 +
arch/powerpc/kernel/asm-offsets.c | 4 +
arch/powerpc/kernel/entry_64.S | 30 +-
arch/powerpc/kernel/exceptions-64s.S | 108 ++-
arch/powerpc/kernel/setup_64.c | 139 ++++
arch/powerpc/kernel/vmlinux.lds.S | 9 +
arch/powerpc/lib/feature-fixups.c | 42 ++
arch/powerpc/platforms/powernv/setup.c | 50 ++
arch/powerpc/platforms/pseries/setup.c | 35 +
arch/x86/entry/common.c | 9 +-
arch/x86/entry/entry_32.S | 3 +-
arch/x86/entry/entry_64.S | 134 +---
arch/x86/entry/syscall_64.c | 7 +-
arch/x86/events/intel/bts.c | 44 +-
arch/x86/include/asm/asm-prototypes.h | 4 +-
arch/x86/include/asm/asm.h | 4 +-
arch/x86/include/asm/barrier.h | 28 +
arch/x86/include/asm/cpufeature.h | 7 +-
arch/x86/include/asm/cpufeatures.h | 22 +-
arch/x86/include/asm/disabled-features.h | 3 +-
arch/x86/include/asm/intel-family.h | 7 +-
arch/x86/include/asm/msr-index.h | 12 +
arch/x86/include/asm/msr.h | 3 +-
arch/x86/include/asm/nospec-branch.h | 91 +--
arch/x86/include/asm/pgalloc.h | 11 -
arch/x86/include/asm/pgtable.h | 6 +
arch/x86/include/asm/processor.h | 2 -
arch/x86/include/asm/required-features.h | 3 +-
arch/x86/include/asm/syscall.h | 6 +-
arch/x86/include/asm/thread_info.h | 3 +-
arch/x86/include/asm/uaccess.h | 15 +-
arch/x86/include/asm/uaccess_32.h | 12 +-
arch/x86/include/asm/uaccess_64.h | 12 +-
arch/x86/kernel/alternative.c | 28 +-
arch/x86/kernel/cpu/bugs.c | 128 +++-
arch/x86/kernel/cpu/common.c | 70 +-
arch/x86/kernel/cpu/intel.c | 66 ++
arch/x86/kernel/cpu/microcode/core.c | 47 +-
arch/x86/kernel/cpu/scattered.c | 2 -
arch/x86/kernel/process_64.c | 4 +-
arch/x86/kernel/ptrace.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
arch/x86/kernel/tboot.c | 10 +
arch/x86/kvm/cpuid.c | 21 +-
arch/x86/kvm/cpuid.h | 31 +
arch/x86/kvm/emulate.c | 10 +-
arch/x86/kvm/svm.c | 116 ++++
arch/x86/kvm/vmx.c | 763 +++++++++++----------
arch/x86/kvm/x86.c | 1 +
arch/x86/lib/Makefile | 1 +
arch/x86/lib/getuser.S | 10 +
arch/x86/lib/retpoline.S | 57 +-
arch/x86/lib/usercopy_32.c | 8 +-
crypto/tcrypt.c | 6 +-
drivers/auxdisplay/img-ascii-lcd.c | 4 +
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 55 +-
drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 8 +
drivers/media/platform/soc_camera/soc_scale_crop.c | 4 +
.../net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 18 +-
drivers/net/ethernet/realtek/r8169.c | 4 +-
drivers/net/usb/qmi_wwan.c | 1 +
drivers/net/wireless/broadcom/b43/main.c | 10 +
drivers/pinctrl/pxa/pinctrl-pxa2xx.c | 4 +
drivers/tty/serial/serial_core.c | 2 +
drivers/vhost/net.c | 1 +
include/linux/fdtable.h | 5 +-
include/linux/init.h | 9 +-
include/linux/module.h | 9 +
include/linux/nospec.h | 72 ++
kernel/module.c | 11 +
net/core/sock_reuseport.c | 35 +-
net/ipv4/igmp.c | 4 +
net/ipv4/tcp.c | 6 +
net/ipv4/tcp_bbr.c | 6 +-
net/ipv6/af_inet6.c | 11 +-
net/ipv6/ip6mr.c | 1 +
net/sched/cls_u32.c | 12 +-
net/wireless/nl80211.c | 9 +-
scripts/mod/modpost.c | 9 +
security/keys/encrypted-keys/encrypted.c | 31 +-
sound/soc/codecs/pcm512x-spi.c | 4 +
sound/soc/generic/simple-card.c | 8 +-
sound/soc/sh/rcar/ssi.c | 5 +
93 files changed, 2034 insertions(+), 797 deletions(-)




2018-02-09 13:42:29

by Greg Kroah-Hartman

Subject: [PATCH 4.9 10/92] powerpc/64s: Wire up cpu_show_meltdown()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Ellerman <[email protected]>

commit fd6e440f20b1a4304553775fc55938848ff617c9 upstream.

The recent commit 87590ce6e373 ("sysfs/cpu: Add vulnerability folder")
added a generic folder and set of files for reporting information on
CPU vulnerabilities. One of those was for meltdown:

/sys/devices/system/cpu/vulnerabilities/meltdown

This commit wires up that file for 64-bit Book3S powerpc.

For now we default to "Vulnerable" unless the RFI flush is enabled.
That may not actually be true on all hardware; further patches will
refine the reporting based on the CPU/platform etc. But for now we
default to being pessimists.
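
For context, the generic commit referenced above provides a weak fallback
that this override replaces. A sketch of that fallback (an assumption about
drivers/base/cpu.c, not part of this powerpc patch) looks roughly like:

ssize_t __weak cpu_show_meltdown(struct device *dev,
				 struct device_attribute *attr, char *buf)
{
	/* default report when an architecture does not wire up its own */
	return sprintf(buf, "Not affected\n");
}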

Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/Kconfig | 1 +
arch/powerpc/kernel/setup_64.c | 8 ++++++++
2 files changed, 9 insertions(+)

--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -128,6 +128,7 @@ config PPC
select ARCH_HAS_GCOV_PROFILE_ALL
select GENERIC_SMP_IDLE_THREAD
select GENERIC_CMOS_UPDATE
+ select GENERIC_CPU_VULNERABILITIES if PPC_BOOK3S_64
select GENERIC_TIME_VSYSCALL_OLD
select GENERIC_CLOCKEVENTS
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -778,5 +778,13 @@ void __init setup_rfi_flush(enum l1d_flu
if (!no_rfi_flush)
rfi_flush_enable(enable);
}
+
+ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
+{
+ if (rfi_flush)
+ return sprintf(buf, "Mitigation: RFI Flush\n");
+
+ return sprintf(buf, "Vulnerable\n");
+}
#endif /* CONFIG_PPC_BOOK3S_64 */
#endif



2018-02-09 13:42:52

by Greg Kroah-Hartman

Subject: [PATCH 4.9 28/92] ipv6: Fix SO_REUSEPORT UDP socket with implicit sk_ipv6only

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Martin KaFai Lau <[email protected]>


[ Upstream commit 7ece54a60ee2ba7a386308cae73c790bd580589c ]

If a sk_v6_rcv_saddr is !IPV6_ADDR_ANY and !IPV6_ADDR_MAPPED, it
implicitly implies it is an ipv6only socket. However, in inet6_bind(),
this addr_type checking and setting sk->sk_ipv6only to 1 are only done
after sk->sk_prot->get_port(sk, snum) has been completed successfully.

This inconsistency between sk_v6_rcv_saddr and sk_ipv6only confuses
the 'get_port()'.

In particular, when binding SO_REUSEPORT UDP sockets,
udp_reuseport_add_sock(sk,...) is called. udp_reuseport_add_sock()
checks "ipv6_only_sock(sk2) == ipv6_only_sock(sk)" before adding sk to
sk2->sk_reuseport_cb. In this case, ipv6_only_sock(sk2) could be
1 while ipv6_only_sock(sk) is still 0 here. The end result is,
reuseport_alloc(sk) is called instead of adding sk to the existing
sk2->sk_reuseport_cb.

It can be reproduced by binding two SO_REUSEPORT UDP sockets on an
IPv6 address (!ANY and !MAPPED). Only one of the sockets will
receive packets.
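
For illustration only (this reproducer sketch is not part of the upstream
commit), a minimal user-space test could bind two such sockets; the address
and port below are placeholders and must match an IPv6 address actually
configured on the test host:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int bound_reuseport_socket(const struct sockaddr_in6 *addr)
{
	int one = 1;
	int fd = socket(AF_INET6, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0 ||
	    bind(fd, (const struct sockaddr *)addr, sizeof(*addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

int main(void)
{
	struct sockaddr_in6 addr;

	memset(&addr, 0, sizeof(addr));
	addr.sin6_family = AF_INET6;
	addr.sin6_port = htons(5555);			/* placeholder port */
	inet_pton(AF_INET6, "2001:db8::1", &addr.sin6_addr); /* placeholder address */

	int a = bound_reuseport_socket(&addr);
	int b = bound_reuseport_socket(&addr);

	/* Both binds succeed; before the fix, datagrams sent to this
	 * address/port are delivered to only one of the two sockets. */
	printf("bound fds: %d %d\n", a, b);
	return 0;
}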

The fix is to set the implicit sk_ipv6only before calling get_port().
The original sk_ipv6only has to be saved such that it can be restored
in case get_port() failed. The situation is similar to the
inet_reset_saddr(sk) after get_port() has failed.

Thanks to Calvin Owens <[email protected]> who created an easy
reproduction which leads to a fix.

Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection")
Signed-off-by: Martin KaFai Lau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/af_inet6.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -274,6 +274,7 @@ int inet6_bind(struct socket *sock, stru
struct net *net = sock_net(sk);
__be32 v4addr = 0;
unsigned short snum;
+ bool saved_ipv6only;
int addr_type = 0;
int err = 0;

@@ -378,19 +379,21 @@ int inet6_bind(struct socket *sock, stru
if (!(addr_type & IPV6_ADDR_MULTICAST))
np->saddr = addr->sin6_addr;

+ saved_ipv6only = sk->sk_ipv6only;
+ if (addr_type != IPV6_ADDR_ANY && addr_type != IPV6_ADDR_MAPPED)
+ sk->sk_ipv6only = 1;
+
/* Make sure we are allowed to bind here. */
if ((snum || !inet->bind_address_no_port) &&
sk->sk_prot->get_port(sk, snum)) {
+ sk->sk_ipv6only = saved_ipv6only;
inet_reset_saddr(sk);
err = -EADDRINUSE;
goto out;
}

- if (addr_type != IPV6_ADDR_ANY) {
+ if (addr_type != IPV6_ADDR_ANY)
sk->sk_userlocks |= SOCK_BINDADDR_LOCK;
- if (addr_type != IPV6_ADDR_MAPPED)
- sk->sk_ipv6only = 1;
- }
if (snum)
sk->sk_userlocks |= SOCK_BINDPORT_LOCK;
inet->inet_sport = htons(inet->inet_num);



2018-02-09 13:43:01

by Greg Kroah-Hartman

Subject: [PATCH 4.9 11/92] powerpc/64s: Allow control of RFI flush via debugfs

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Ellerman <[email protected]>

commit 236003e6b5443c45c18e613d2b0d776a9f87540e upstream.

Expose the state of the RFI flush (enabled/disabled) via debugfs, and
allow it to be enabled/disabled at runtime.

eg: $ cat /sys/kernel/debug/powerpc/rfi_flush
1
$ echo 0 > /sys/kernel/debug/powerpc/rfi_flush
$ cat /sys/kernel/debug/powerpc/rfi_flush
0

Signed-off-by: Michael Ellerman <[email protected]>
Reviewed-by: Nicholas Piggin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/setup_64.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -38,6 +38,7 @@
#include <linux/memory.h>
#include <linux/nmi.h>

+#include <asm/debugfs.h>
#include <asm/io.h>
#include <asm/kdump.h>
#include <asm/prom.h>
@@ -779,6 +780,35 @@ void __init setup_rfi_flush(enum l1d_flu
rfi_flush_enable(enable);
}

+#ifdef CONFIG_DEBUG_FS
+static int rfi_flush_set(void *data, u64 val)
+{
+ if (val == 1)
+ rfi_flush_enable(true);
+ else if (val == 0)
+ rfi_flush_enable(false);
+ else
+ return -EINVAL;
+
+ return 0;
+}
+
+static int rfi_flush_get(void *data, u64 *val)
+{
+ *val = rfi_flush ? 1 : 0;
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_rfi_flush, rfi_flush_get, rfi_flush_set, "%llu\n");
+
+static __init int rfi_flush_debugfs_init(void)
+{
+ debugfs_create_file("rfi_flush", 0600, powerpc_debugfs_root, NULL, &fops_rfi_flush);
+ return 0;
+}
+device_initcall(rfi_flush_debugfs_init);
+#endif
+
ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
{
if (rfi_flush)



2018-02-09 13:43:59

by Greg Kroah-Hartman

Subject: [PATCH 4.9 13/92] pinctrl: pxa: pxa2xx: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jesse Chan <[email protected]>

commit 0b9335cbd38e3bd2025bcc23b5758df4ac035f75 upstream.

This change resolves a new compile-time warning
when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in drivers/pinctrl/pxa/pinctrl-pxa2xx.o
see include/linux/module.h for more information

This adds the license as "GPL v2", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.

Signed-off-by: Jesse Chan <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/pinctrl/pxa/pinctrl-pxa2xx.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/pinctrl/pxa/pinctrl-pxa2xx.c
+++ b/drivers/pinctrl/pxa/pinctrl-pxa2xx.c
@@ -436,3 +436,7 @@ int pxa2xx_pinctrl_exit(struct platform_
return 0;
}
EXPORT_SYMBOL_GPL(pxa2xx_pinctrl_exit);
+
+MODULE_AUTHOR("Robert Jarzmik <[email protected]>");
+MODULE_DESCRIPTION("Marvell PXA2xx pinctrl driver");
+MODULE_LICENSE("GPL v2");



2018-02-09 13:44:06

by Greg Kroah-Hartman

Subject: [PATCH 4.9 44/92] x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit a5b2966364538a0e68c9fa29bc0a3a1651799035)

This doesn't refuse to load the affected microcodes; it just refuses to
use the Spectre v2 mitigation features if they're detected, by clearing
the appropriate feature bits.

The AMD CPUID bits are handled here too, because hypervisors *may* have
been exposing those bits even on Intel chips, for fine-grained control
of what's available.

It is non-trivial to use x86_match_cpu() for this table because that
doesn't handle steppings. And the approach taken in commit bd9240a18
almost made me lose my lunch.

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/intel-family.h | 7 ++-
arch/x86/kernel/cpu/intel.c | 66 ++++++++++++++++++++++++++++++++++++
2 files changed, 71 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/intel-family.h
+++ b/arch/x86/include/asm/intel-family.h
@@ -12,6 +12,7 @@
*/

#define INTEL_FAM6_CORE_YONAH 0x0E
+
#define INTEL_FAM6_CORE2_MEROM 0x0F
#define INTEL_FAM6_CORE2_MEROM_L 0x16
#define INTEL_FAM6_CORE2_PENRYN 0x17
@@ -21,6 +22,7 @@
#define INTEL_FAM6_NEHALEM_G 0x1F /* Auburndale / Havendale */
#define INTEL_FAM6_NEHALEM_EP 0x1A
#define INTEL_FAM6_NEHALEM_EX 0x2E
+
#define INTEL_FAM6_WESTMERE 0x25
#define INTEL_FAM6_WESTMERE_EP 0x2C
#define INTEL_FAM6_WESTMERE_EX 0x2F
@@ -36,9 +38,9 @@
#define INTEL_FAM6_HASWELL_GT3E 0x46

#define INTEL_FAM6_BROADWELL_CORE 0x3D
-#define INTEL_FAM6_BROADWELL_XEON_D 0x56
#define INTEL_FAM6_BROADWELL_GT3E 0x47
#define INTEL_FAM6_BROADWELL_X 0x4F
+#define INTEL_FAM6_BROADWELL_XEON_D 0x56

#define INTEL_FAM6_SKYLAKE_MOBILE 0x4E
#define INTEL_FAM6_SKYLAKE_DESKTOP 0x5E
@@ -57,9 +59,10 @@
#define INTEL_FAM6_ATOM_SILVERMONT2 0x4D /* Avaton/Rangely */
#define INTEL_FAM6_ATOM_AIRMONT 0x4C /* CherryTrail / Braswell */
#define INTEL_FAM6_ATOM_MERRIFIELD 0x4A /* Tangier */
-#define INTEL_FAM6_ATOM_MOOREFIELD 0x5A /* Annidale */
+#define INTEL_FAM6_ATOM_MOOREFIELD 0x5A /* Anniedale */
#define INTEL_FAM6_ATOM_GOLDMONT 0x5C
#define INTEL_FAM6_ATOM_DENVERTON 0x5F /* Goldmont Microserver */
+#define INTEL_FAM6_ATOM_GEMINI_LAKE 0x7A

/* Xeon Phi */

--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -61,6 +61,59 @@ void check_mpx_erratum(struct cpuinfo_x8
}
}

+/*
+ * Early microcode releases for the Spectre v2 mitigation were broken.
+ * Information taken from;
+ * - https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/microcode-update-guidance.pdf
+ * - https://kb.vmware.com/s/article/52345
+ * - Microcode revisions observed in the wild
+ * - Release note from 20180108 microcode release
+ */
+struct sku_microcode {
+ u8 model;
+ u8 stepping;
+ u32 microcode;
+};
+static const struct sku_microcode spectre_bad_microcodes[] = {
+ { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0B, 0x84 },
+ { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0A, 0x84 },
+ { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x84 },
+ { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x84 },
+ { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x84 },
+ { INTEL_FAM6_SKYLAKE_X, 0x03, 0x0100013e },
+ { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003c },
+ { INTEL_FAM6_SKYLAKE_MOBILE, 0x03, 0xc2 },
+ { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0xc2 },
+ { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 },
+ { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x1b },
+ { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 },
+ { INTEL_FAM6_BROADWELL_XEON_D, 0x03, 0x07000011 },
+ { INTEL_FAM6_BROADWELL_X, 0x01, 0x0b000025 },
+ { INTEL_FAM6_HASWELL_ULT, 0x01, 0x21 },
+ { INTEL_FAM6_HASWELL_GT3E, 0x01, 0x18 },
+ { INTEL_FAM6_HASWELL_CORE, 0x03, 0x23 },
+ { INTEL_FAM6_HASWELL_X, 0x02, 0x3b },
+ { INTEL_FAM6_HASWELL_X, 0x04, 0x10 },
+ { INTEL_FAM6_IVYBRIDGE_X, 0x04, 0x42a },
+ /* Updated in the 20180108 release; blacklist until we know otherwise */
+ { INTEL_FAM6_ATOM_GEMINI_LAKE, 0x01, 0x22 },
+ /* Observed in the wild */
+ { INTEL_FAM6_SANDYBRIDGE_X, 0x06, 0x61b },
+ { INTEL_FAM6_SANDYBRIDGE_X, 0x07, 0x712 },
+};
+
+static bool bad_spectre_microcode(struct cpuinfo_x86 *c)
+{
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) {
+ if (c->x86_model == spectre_bad_microcodes[i].model &&
+ c->x86_mask == spectre_bad_microcodes[i].stepping)
+ return (c->microcode <= spectre_bad_microcodes[i].microcode);
+ }
+ return false;
+}
+
static void early_init_intel(struct cpuinfo_x86 *c)
{
u64 misc_enable;
@@ -87,6 +140,19 @@ static void early_init_intel(struct cpui
rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode);
}

+ if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) ||
+ cpu_has(c, X86_FEATURE_STIBP) ||
+ cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) ||
+ cpu_has(c, X86_FEATURE_AMD_PRED_CMD) ||
+ cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) {
+ pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n");
+ clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL);
+ clear_cpu_cap(c, X86_FEATURE_STIBP);
+ clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL);
+ clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD);
+ clear_cpu_cap(c, X86_FEATURE_AMD_STIBP);
+ }
+
/*
* Atom erratum AAE44/AAF40/AAG38/AAH41:
*



2018-02-09 13:44:14

by Greg Kroah-Hartman

Subject: [PATCH 4.9 30/92] x86/asm: Fix inline asm call constraints for GCC 4.4

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <[email protected]>

commit 520a13c530aeb5f63e011d668c42db1af19ed349 upstream.

The kernel test bot (run by Xiaolong Ye) reported that the following commit:

f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")

is causing double faults in a kernel compiled with GCC 4.4.

Linus subsequently diagnosed the crash pattern and the buggy commit and found that
the issue is with this code:

register unsigned int __asm_call_sp asm("esp");
#define ASM_CALL_CONSTRAINT "+r" (__asm_call_sp)

Even on a 64-bit kernel, it's using ESP instead of RSP. That causes GCC
to produce the following bogus code:

ffffffff8147461d: 89 e0 mov %esp,%eax
ffffffff8147461f: 4c 89 f7 mov %r14,%rdi
ffffffff81474622: 4c 89 fe mov %r15,%rsi
ffffffff81474625: ba 20 00 00 00 mov $0x20,%edx
ffffffff8147462a: 89 c4 mov %eax,%esp
ffffffff8147462c: e8 bf 52 05 00 callq ffffffff814c98f0 <copy_user_generic_unrolled>

Despite the absurdity of it backing up and restoring the stack pointer
for no reason, the bug is actually the fact that it's only backing up
and restoring the lower 32 bits of the stack pointer. The upper 32 bits
are getting cleared out, corrupting the stack pointer.

So change the '__asm_call_sp' register variable to be associated with
the actual full-size stack pointer.

This also requires changing the __ASM_SEL() macro to be based on the
actual compiled arch size, rather than the CONFIG value, because
CONFIG_X86_64 compiles some files with '-m32' (e.g., realmode and vdso).
Otherwise Clang fails to build the kernel because it complains about the
use of a 64-bit register (RSP) in a 32-bit file.

Reported-and-Bisected-and-Tested-by: kernel test robot <[email protected]>
Diagnosed-by: Linus Torvalds <[email protected]>
Signed-off-by: Josh Poimboeuf <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: LKP <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Cc: Miguel Bernal Marin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: f5caf621ee35 ("x86/asm: Fix inline asm call constraints for Clang")
Link: http://lkml.kernel.org/r/20170928215826.6sdpmwtkiydiytim@treble
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Matthias Kaehlcke <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/asm.h | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -11,10 +11,12 @@
# define __ASM_FORM_COMMA(x) " " #x ","
#endif

-#ifdef CONFIG_X86_32
+#ifndef __x86_64__
+/* 32 bit */
# define __ASM_SEL(a,b) __ASM_FORM(a)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(a)
#else
+/* 64 bit */
# define __ASM_SEL(a,b) __ASM_FORM(b)
# define __ASM_SEL_RAW(a,b) __ASM_FORM_RAW(b)
#endif



2018-02-09 13:44:24

by Greg Kroah-Hartman

Subject: [PATCH 4.9 62/92] x86/uaccess: Use __uaccess_begin_nospec() and uaccess_try_nospec

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit 304ec1b050310548db33063e567123fae8fd0301)

Quoting Linus:

I do think that it would be a good idea to very expressly document
the fact that it's not that the user access itself is unsafe. I do
agree that things like "get_user()" want to be protected, but not
because of any direct bugs or problems with get_user() and friends,
but simply because get_user() is an excellent source of a pointer
that is obviously controlled from a potentially attacking user
space. So it's a prime candidate for then finding _subsequent_
accesses that can then be used to perturb the cache.

__uaccess_begin_nospec() covers __get_user() and copy_from_iter() where the
limit check is far away from the user pointer de-reference. In those cases
a barrier_nospec() prevents speculation with a potential pointer to
privileged memory. uaccess_try_nospec covers get_user_try.
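
As a rough user-space analogue of that barrier pattern (an illustration,
not the kernel macro bodies changed below): the serializing instruction
sits between the bounds/limit check and the dependent load, so a
mispredicted check cannot speculatively dereference an attacker-steered
pointer.

#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>		/* _mm_lfence() */

uint8_t read_within_limit(const uint8_t *base, size_t idx, size_t limit)
{
	if (idx >= limit)
		return 0;
	_mm_lfence();		/* speculation barrier, akin to barrier_nospec() */
	return base[idx];	/* no speculative out-of-bounds load past the check */
}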

Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Andi Kleen <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Al Viro <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727416953.33451.10508284228526170604.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/uaccess.h | 6 +++---
arch/x86/include/asm/uaccess_32.h | 12 ++++++------
arch/x86/include/asm/uaccess_64.h | 12 ++++++------
arch/x86/lib/usercopy_32.c | 4 ++--
4 files changed, 17 insertions(+), 17 deletions(-)

--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -437,7 +437,7 @@ do { \
({ \
int __gu_err; \
__inttype(*(ptr)) __gu_val; \
- __uaccess_begin(); \
+ __uaccess_begin_nospec(); \
__get_user_size(__gu_val, (ptr), (size), __gu_err, -EFAULT); \
__uaccess_end(); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
@@ -547,7 +547,7 @@ struct __large_struct { unsigned long bu
* get_user_ex(...);
* } get_user_catch(err)
*/
-#define get_user_try uaccess_try
+#define get_user_try uaccess_try_nospec
#define get_user_catch(err) uaccess_catch(err)

#define get_user_ex(x, ptr) do { \
@@ -582,7 +582,7 @@ extern void __cmpxchg_wrong_size(void)
__typeof__(ptr) __uval = (uval); \
__typeof__(*(ptr)) __old = (old); \
__typeof__(*(ptr)) __new = (new); \
- __uaccess_begin(); \
+ __uaccess_begin_nospec(); \
switch (size) { \
case 1: \
{ \
--- a/arch/x86/include/asm/uaccess_32.h
+++ b/arch/x86/include/asm/uaccess_32.h
@@ -102,17 +102,17 @@ __copy_from_user(void *to, const void __

switch (n) {
case 1:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_size(*(u8 *)to, from, 1, ret, 1);
__uaccess_end();
return ret;
case 2:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_size(*(u16 *)to, from, 2, ret, 2);
__uaccess_end();
return ret;
case 4:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_size(*(u32 *)to, from, 4, ret, 4);
__uaccess_end();
return ret;
@@ -130,17 +130,17 @@ static __always_inline unsigned long __c

switch (n) {
case 1:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_size(*(u8 *)to, from, 1, ret, 1);
__uaccess_end();
return ret;
case 2:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_size(*(u16 *)to, from, 2, ret, 2);
__uaccess_end();
return ret;
case 4:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_size(*(u32 *)to, from, 4, ret, 4);
__uaccess_end();
return ret;
--- a/arch/x86/include/asm/uaccess_64.h
+++ b/arch/x86/include/asm/uaccess_64.h
@@ -59,31 +59,31 @@ int __copy_from_user_nocheck(void *dst,
return copy_user_generic(dst, (__force void *)src, size);
switch (size) {
case 1:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_asm(*(u8 *)dst, (u8 __user *)src,
ret, "b", "b", "=q", 1);
__uaccess_end();
return ret;
case 2:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_asm(*(u16 *)dst, (u16 __user *)src,
ret, "w", "w", "=r", 2);
__uaccess_end();
return ret;
case 4:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_asm(*(u32 *)dst, (u32 __user *)src,
ret, "l", "k", "=r", 4);
__uaccess_end();
return ret;
case 8:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_asm(*(u64 *)dst, (u64 __user *)src,
ret, "q", "", "=r", 8);
__uaccess_end();
return ret;
case 10:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_asm(*(u64 *)dst, (u64 __user *)src,
ret, "q", "", "=r", 10);
if (likely(!ret))
@@ -93,7 +93,7 @@ int __copy_from_user_nocheck(void *dst,
__uaccess_end();
return ret;
case 16:
- __uaccess_begin();
+ __uaccess_begin_nospec();
__get_user_asm(*(u64 *)dst, (u64 __user *)src,
ret, "q", "", "=r", 16);
if (likely(!ret))
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -570,7 +570,7 @@ do { \
unsigned long __copy_to_user_ll(void __user *to, const void *from,
unsigned long n)
{
- __uaccess_begin();
+ __uaccess_begin_nospec();
if (movsl_is_ok(to, from, n))
__copy_user(to, from, n);
else
@@ -627,7 +627,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocach
unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *from,
unsigned long n)
{
- __uaccess_begin();
+ __uaccess_begin_nospec();
#ifdef CONFIG_X86_INTEL_USERCOPY
if (n > 64 && static_cpu_has(X86_FEATURE_XMM2))
n = __copy_user_intel_nocache(to, from, n);



2018-02-09 13:45:34

by Greg Kroah-Hartman

Subject: [PATCH 4.9 75/92] x86/speculation: Fix typo IBRS_ATT, which should be IBRS_ALL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Darren Kenny <[email protected]>


(cherry picked from commit af189c95a371b59f493dbe0f50c0a09724868881)

Fixes: 117cc7a908c83 ("x86/retpoline: Fill return stack buffer on vmexit")
Signed-off-by: Darren Kenny <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: David Woodhouse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -150,7 +150,7 @@ extern char __indirect_thunk_end[];
* On VMEXIT we must ensure that no RSB predictions learned in the guest
* can be followed in the host, by overwriting the RSB completely. Both
* retpoline and IBRS mitigations for Spectre v2 need this; only on future
- * CPUs with IBRS_ATT *might* it be avoided.
+ * CPUs with IBRS_ALL *might* it be avoided.
*/
static inline void vmexit_fill_RSB(void)
{



2018-02-09 13:45:37

by Greg Kroah-Hartman

Subject: [PATCH 4.9 77/92] KVM: nVMX: vmx_complete_nested_posted_interrupt() can't fail

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Hildenbrand <[email protected]>


(cherry picked from commit 6342c50ad12e8ce0736e722184a7dbdea4a3477f)

vmx_complete_nested_posted_interrupt() can't fail, let's turn it into
a void function.

Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/vmx.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4736,7 +4736,7 @@ static bool vmx_get_enable_apicv(void)
return enable_apicv;
}

-static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
+static void vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
int max_irr;
@@ -4747,13 +4747,13 @@ static int vmx_complete_nested_posted_in
vmx->nested.pi_pending) {
vmx->nested.pi_pending = false;
if (!pi_test_and_clear_on(vmx->nested.pi_desc))
- return 0;
+ return;

max_irr = find_last_bit(
(unsigned long *)vmx->nested.pi_desc->pir, 256);

if (max_irr == 256)
- return 0;
+ return;

vapic_page = kmap(vmx->nested.virtual_apic_page);
__kvm_apic_update_irr(vmx->nested.pi_desc->pir, vapic_page);
@@ -4766,7 +4766,6 @@ static int vmx_complete_nested_posted_in
vmcs_write16(GUEST_INTR_STATUS, status);
}
}
- return 0;
}

static inline bool kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu)
@@ -10482,7 +10481,8 @@ static int vmx_check_nested_events(struc
return 0;
}

- return vmx_complete_nested_posted_interrupt(vcpu);
+ vmx_complete_nested_posted_interrupt(vcpu);
+ return 0;
}

static u32 vmx_get_preemption_timer_value(struct kvm_vcpu *vcpu)



2018-02-09 13:45:41

by Greg Kroah-Hartman

Subject: [PATCH 4.9 64/92] x86/syscall: Sanitize syscall table de-references under speculation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit 2fbd7af5af8665d18bcefae3e9700be07e22b681)

The syscall table base is a user controlled function pointer in kernel
space. Use array_index_nospec() to prevent any out of bounds speculation.

While retpoline prevents speculating into a userspace directed target it
does not stop the pointer de-reference, the concern is leaking memory
relative to the syscall table base, by observing instruction cache
behavior.

Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727417984.33451.1216731042505722161.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/entry/common.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -20,6 +20,7 @@
#include <linux/export.h>
#include <linux/context_tracking.h>
#include <linux/user-return-notifier.h>
+#include <linux/nospec.h>
#include <linux/uprobes.h>

#include <asm/desc.h>
@@ -277,7 +278,8 @@ __visible void do_syscall_64(struct pt_r
* regs->orig_ax, which changes the behavior of some syscalls.
*/
if (likely((nr & __SYSCALL_MASK) < NR_syscalls)) {
- regs->ax = sys_call_table[nr & __SYSCALL_MASK](
+ nr = array_index_nospec(nr & __SYSCALL_MASK, NR_syscalls);
+ regs->ax = sys_call_table[nr](
regs->di, regs->si, regs->dx,
regs->r10, regs->r8, regs->r9);
}
@@ -313,6 +315,7 @@ static __always_inline void do_syscall_3
}

if (likely(nr < IA32_NR_syscalls)) {
+ nr = array_index_nospec(nr, IA32_NR_syscalls);
/*
* It's possible that a 32-bit syscall implementation
* takes a 64-bit parameter but nonetheless assumes that



2018-02-09 13:45:44

by Greg Kroah-Hartman

Subject: [PATCH 4.9 51/92] x86/retpoline: Simplify vmexit_fill_RSB()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

(cherry picked from commit 1dde7415e99933bb7293d6b2843752cbdb43ec11)

Simplify it to call an asm-function instead of pasting 41 insn bytes at
every call site. Also, add alignment to the macro as suggested here:

https://support.google.com/faqs/answer/7625886

[dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler]

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/entry/entry_32.S | 3 -
arch/x86/entry/entry_64.S | 3 -
arch/x86/include/asm/asm-prototypes.h | 3 +
arch/x86/include/asm/nospec-branch.h | 70 +++-------------------------------
arch/x86/lib/Makefile | 1
arch/x86/lib/retpoline.S | 56 +++++++++++++++++++++++++++
6 files changed, 71 insertions(+), 65 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -237,7 +237,8 @@ ENTRY(__switch_to_asm)
* exist, overwrite the RSB with entries which capture
* speculative execution to prevent attack.
*/
- FILL_RETURN_BUFFER %ebx, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+ /* Clobbers %ebx */
+ FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
#endif

/* restore callee-saved registers */
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -435,7 +435,8 @@ ENTRY(__switch_to_asm)
* exist, overwrite the RSB with entries which capture
* speculative execution to prevent attack.
*/
- FILL_RETURN_BUFFER %r12, RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+ /* Clobbers %rbx */
+ FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
#endif

/* restore callee-saved registers */
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -37,4 +37,7 @@ INDIRECT_THUNK(dx)
INDIRECT_THUNK(si)
INDIRECT_THUNK(di)
INDIRECT_THUNK(bp)
+asmlinkage void __fill_rsb(void);
+asmlinkage void __clear_rsb(void);
+
#endif /* CONFIG_RETPOLINE */
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -7,50 +7,6 @@
#include <asm/alternative-asm.h>
#include <asm/cpufeatures.h>

-/*
- * Fill the CPU return stack buffer.
- *
- * Each entry in the RSB, if used for a speculative 'ret', contains an
- * infinite 'pause; lfence; jmp' loop to capture speculative execution.
- *
- * This is required in various cases for retpoline and IBRS-based
- * mitigations for the Spectre variant 2 vulnerability. Sometimes to
- * eliminate potentially bogus entries from the RSB, and sometimes
- * purely to ensure that it doesn't get empty, which on some CPUs would
- * allow predictions from other (unwanted!) sources to be used.
- *
- * We define a CPP macro such that it can be used from both .S files and
- * inline assembly. It's possible to do a .macro and then include that
- * from C via asm(".include <asm/nospec-branch.h>") but let's not go there.
- */
-
-#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */
-#define RSB_FILL_LOOPS 16 /* To avoid underflow */
-
-/*
- * Google experimented with loop-unrolling and this turned out to be
- * the optimal version — two calls, each with their own speculation
- * trap should their return address end up getting used, in a loop.
- */
-#define __FILL_RETURN_BUFFER(reg, nr, sp) \
- mov $(nr/2), reg; \
-771: \
- call 772f; \
-773: /* speculation trap */ \
- pause; \
- lfence; \
- jmp 773b; \
-772: \
- call 774f; \
-775: /* speculation trap */ \
- pause; \
- lfence; \
- jmp 775b; \
-774: \
- dec reg; \
- jnz 771b; \
- add $(BITS_PER_LONG/8) * nr, sp;
-
#ifdef __ASSEMBLY__

/*
@@ -121,17 +77,10 @@
#endif
.endm

- /*
- * A simpler FILL_RETURN_BUFFER macro. Don't make people use the CPP
- * monstrosity above, manually.
- */
-.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
+/* This clobbers the BX register */
+.macro FILL_RETURN_BUFFER nr:req ftr:req
#ifdef CONFIG_RETPOLINE
- ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE "jmp .Lskip_rsb_\@", \
- __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \
- \ftr
-.Lskip_rsb_\@:
+ ALTERNATIVE "", "call __clear_rsb", \ftr
#endif
.endm

@@ -206,15 +155,10 @@ extern char __indirect_thunk_end[];
static inline void vmexit_fill_RSB(void)
{
#ifdef CONFIG_RETPOLINE
- unsigned long loops;
-
- asm volatile (ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE("jmp 910f",
- __stringify(__FILL_RETURN_BUFFER(%0, RSB_CLEAR_LOOPS, %1)),
- X86_FEATURE_RETPOLINE)
- "910:"
- : "=r" (loops), ASM_CALL_CONSTRAINT
- : : "memory" );
+ alternative_input("",
+ "call __fill_rsb",
+ X86_FEATURE_RETPOLINE,
+ ASM_NO_INPUT_CLOBBER(_ASM_BX, "memory"));
#endif
}

--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -26,6 +26,7 @@ lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) +=
lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o
lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
lib-$(CONFIG_RETPOLINE) += retpoline.o
+OBJECT_FILES_NON_STANDARD_retpoline.o :=y

obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o

--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -7,6 +7,7 @@
#include <asm/alternative-asm.h>
#include <asm/export.h>
#include <asm/nospec-branch.h>
+#include <asm/bitsperlong.h>

.macro THUNK reg
.section .text.__x86.indirect_thunk
@@ -46,3 +47,58 @@ GENERATE_THUNK(r13)
GENERATE_THUNK(r14)
GENERATE_THUNK(r15)
#endif
+
+/*
+ * Fill the CPU return stack buffer.
+ *
+ * Each entry in the RSB, if used for a speculative 'ret', contains an
+ * infinite 'pause; lfence; jmp' loop to capture speculative execution.
+ *
+ * This is required in various cases for retpoline and IBRS-based
+ * mitigations for the Spectre variant 2 vulnerability. Sometimes to
+ * eliminate potentially bogus entries from the RSB, and sometimes
+ * purely to ensure that it doesn't get empty, which on some CPUs would
+ * allow predictions from other (unwanted!) sources to be used.
+ *
+ * Google experimented with loop-unrolling and this turned out to be
+ * the optimal version - two calls, each with their own speculation
+ * trap should their return address end up getting used, in a loop.
+ */
+.macro STUFF_RSB nr:req sp:req
+ mov $(\nr / 2), %_ASM_BX
+ .align 16
+771:
+ call 772f
+773: /* speculation trap */
+ pause
+ lfence
+ jmp 773b
+ .align 16
+772:
+ call 774f
+775: /* speculation trap */
+ pause
+ lfence
+ jmp 775b
+ .align 16
+774:
+ dec %_ASM_BX
+ jnz 771b
+ add $((BITS_PER_LONG/8) * \nr), \sp
+.endm
+
+#define RSB_FILL_LOOPS 16 /* To avoid underflow */
+
+ENTRY(__fill_rsb)
+ STUFF_RSB RSB_FILL_LOOPS, %_ASM_SP
+ ret
+END(__fill_rsb)
+EXPORT_SYMBOL_GPL(__fill_rsb)
+
+#define RSB_CLEAR_LOOPS 32 /* To forcibly overwrite all entries */
+
+ENTRY(__clear_rsb)
+ STUFF_RSB RSB_CLEAR_LOOPS, %_ASM_SP
+ ret
+END(__clear_rsb)
+EXPORT_SYMBOL_GPL(__clear_rsb)



2018-02-09 13:46:01

by Greg Kroah-Hartman

Subject: [PATCH 4.9 63/92] x86/get_user: Use pointer masking to limit speculation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit c7f631cb07e7da06ac1d231ca178452339e32a94)

Quoting Linus:

I do think that it would be a good idea to very expressly document
the fact that it's not that the user access itself is unsafe. I do
agree that things like "get_user()" want to be protected, but not
because of any direct bugs or problems with get_user() and friends,
but simply because get_user() is an excellent source of a pointer
that is obviously controlled from a potentially attacking user
space. So it's a prime candidate for then finding _subsequent_
accesses that can then be used to perturb the cache.

Unlike the __get_user() case get_user() includes the address limit check
near the pointer de-reference. With that locality the speculation can be
mitigated with pointer narrowing rather than a barrier, i.e.
array_index_nospec(). Where the narrowing is performed by:

cmp %limit, %ptr
sbb %mask, %mask
and %mask, %ptr

With respect to speculation the value of %ptr is either less than %limit
or NULL.
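
A simplified C analogue of that cmp/sbb/and sequence (illustrative only,
not the kernel's array_index_mask_nospec() implementation) would be:

#include <stdint.h>

uintptr_t mask_user_pointer(uintptr_t ptr, uintptr_t limit)
{
	/* all-ones when ptr is below the limit, zero otherwise;
	 * the asm version computes this branch-free via sbb */
	uintptr_t mask = (ptr < limit) ? ~(uintptr_t)0 : 0;

	return ptr & mask;	/* either the original pointer or NULL */
}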

Co-developed-by: Linus Torvalds <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Al Viro <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727417469.33451.11804043010080838495.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/lib/getuser.S | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@ -39,6 +39,8 @@ ENTRY(__get_user_1)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
1: movzbl (%_ASM_AX),%edx
xor %eax,%eax
@@ -53,6 +55,8 @@ ENTRY(__get_user_2)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
2: movzwl -1(%_ASM_AX),%edx
xor %eax,%eax
@@ -67,6 +71,8 @@ ENTRY(__get_user_4)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
3: movl -3(%_ASM_AX),%edx
xor %eax,%eax
@@ -82,6 +88,8 @@ ENTRY(__get_user_8)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
4: movq -7(%_ASM_AX),%rdx
xor %eax,%eax
@@ -93,6 +101,8 @@ ENTRY(__get_user_8)
mov PER_CPU_VAR(current_task), %_ASM_DX
cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
jae bad_get_user_8
+ sbb %_ASM_DX, %_ASM_DX /* array_index_mask_nospec() */
+ and %_ASM_DX, %_ASM_AX
ASM_STAC
4: movl -7(%_ASM_AX),%edx
5: movl -3(%_ASM_AX),%ecx



2018-02-09 13:47:03

by Greg Kroah-Hartman

Subject: [PATCH 4.9 55/92] x86/asm: Move 'status' from thread_struct to thread_info

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>


(cherry picked from commit 37a8f7c38339b22b69876d6f5a0ab851565284e3)

The TS_COMPAT bit is very hot and is accessed from code paths that mostly
also touch thread_info::flags. Move it into struct thread_info to improve
cache locality.

The only reason it was in thread_struct is that there was a brief period
during which arch-specific fields were not allowed in struct thread_info.

Linus suggested further changing:

ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

to:

if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED)))
ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

on the theory that frequently dirtying the cacheline even in pure 64-bit
code that never needs to modify status hurts performance. That could be a
reasonable followup patch, but I suspect it matters less on top of this
patch.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Ingo Molnar <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Kernel Hardening <[email protected]>
Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.1517164461.git.luto@kernel.org
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/entry/common.c | 4 ++--
arch/x86/include/asm/processor.h | 2 --
arch/x86/include/asm/syscall.h | 6 +++---
arch/x86/include/asm/thread_info.h | 3 ++-
arch/x86/kernel/process_64.c | 4 ++--
arch/x86/kernel/ptrace.c | 2 +-
arch/x86/kernel/signal.c | 2 +-
7 files changed, 11 insertions(+), 12 deletions(-)

--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -201,7 +201,7 @@ __visible inline void prepare_exit_to_us
* special case only applies after poking regs and before the
* very next return to user mode.
*/
- current->thread.status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
+ ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
#endif

user_enter_irqoff();
@@ -299,7 +299,7 @@ static __always_inline void do_syscall_3
unsigned int nr = (unsigned int)regs->orig_ax;

#ifdef CONFIG_IA32_EMULATION
- current->thread.status |= TS_COMPAT;
+ ti->status |= TS_COMPAT;
#endif

if (READ_ONCE(ti->flags) & _TIF_WORK_SYSCALL_ENTRY) {
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -391,8 +391,6 @@ struct thread_struct {
unsigned short gsindex;
#endif

- u32 status; /* thread synchronous flags */
-
#ifdef CONFIG_X86_64
unsigned long fsbase;
unsigned long gsbase;
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -60,7 +60,7 @@ static inline long syscall_get_error(str
* TS_COMPAT is set for 32-bit syscall entries and then
* remains set until we return to user mode.
*/
- if (task->thread.status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (task->thread_info.status & (TS_COMPAT|TS_I386_REGS_POKED))
/*
* Sign-extend the value so (int)-EFOO becomes (long)-EFOO
* and will match correctly in comparisons.
@@ -116,7 +116,7 @@ static inline void syscall_get_arguments
unsigned long *args)
{
# ifdef CONFIG_IA32_EMULATION
- if (task->thread.status & TS_COMPAT)
+ if (task->thread_info.status & TS_COMPAT)
switch (i) {
case 0:
if (!n--) break;
@@ -177,7 +177,7 @@ static inline void syscall_set_arguments
const unsigned long *args)
{
# ifdef CONFIG_IA32_EMULATION
- if (task->thread.status & TS_COMPAT)
+ if (task->thread_info.status & TS_COMPAT)
switch (i) {
case 0:
if (!n--) break;
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -54,6 +54,7 @@ struct task_struct;

struct thread_info {
unsigned long flags; /* low level flags */
+ u32 status; /* thread synchronous flags */
};

#define INIT_THREAD_INFO(tsk) \
@@ -213,7 +214,7 @@ static inline int arch_within_stack_fram
#define in_ia32_syscall() true
#else
#define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
- current->thread.status & TS_COMPAT)
+ current_thread_info()->status & TS_COMPAT)
#endif

/*
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -538,7 +538,7 @@ void set_personality_ia32(bool x32)
current->personality &= ~READ_IMPLIES_EXEC;
/* in_compat_syscall() uses the presence of the x32
syscall bit flag to determine compat status */
- current->thread.status &= ~TS_COMPAT;
+ current_thread_info()->status &= ~TS_COMPAT;
} else {
set_thread_flag(TIF_IA32);
clear_thread_flag(TIF_X32);
@@ -546,7 +546,7 @@ void set_personality_ia32(bool x32)
current->mm->context.ia32_compat = TIF_IA32;
current->personality |= force_personality32;
/* Prepare the first "return" to user space */
- current->thread.status |= TS_COMPAT;
+ current_thread_info()->status |= TS_COMPAT;
}
}
EXPORT_SYMBOL_GPL(set_personality_ia32);
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -934,7 +934,7 @@ static int putreg32(struct task_struct *
*/
regs->orig_ax = value;
if (syscall_get_nr(child, regs) >= 0)
- child->thread.status |= TS_I386_REGS_POKED;
+ child->thread_info.status |= TS_I386_REGS_POKED;
break;

case offsetof(struct user32, regs.eflags):
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -785,7 +785,7 @@ static inline unsigned long get_nr_resta
* than the tracee.
*/
#ifdef CONFIG_IA32_EMULATION
- if (current->thread.status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (current_thread_info()->status & (TS_COMPAT|TS_I386_REGS_POKED))
return __NR_ia32_restart_syscall;
#endif
#ifdef CONFIG_X86_X32_ABI



2018-02-09 13:58:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 92/92] x86/microcode: Do the family check first

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

commit 1f161f67a272cc4f29f27934dd3f74cb657eb5c4 upstream with adjustments.

On CPUs like AMD's Geode, for example, we shouldn't even try to load
microcode because they do not support the modern microcode loading
interface.

However, we do the family check *after* the other checks, i.e., whether
the loader has been disabled on the command line or whether we're
running in a guest.

So move the family checks first in order to exit early if we're being
loaded on an unsupported family.
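
In rough terms, the resulting flow looks like this (a simplified sketch of
load_ucode_bsp() after the change, using the helper names from the diff
below; it is not the literal patch):

	void __init load_ucode_bsp(void)
	{
		bool intel = true;

		if (!have_cpuid_p())
			return;

		switch (x86_cpuid_vendor()) {
		case X86_VENDOR_INTEL:
			if (x86_cpuid_family() < 6)
				return;		/* family check done first */
			break;
		case X86_VENDOR_AMD:
			if (x86_cpuid_family() < 0x10)
				return;
			intel = false;
			break;
		default:
			return;
		}

		if (check_loader_disabled_bsp())	/* cmdline/guest checks after */
			return;

		if (intel)
			load_ucode_intel_bsp();
		else
			load_ucode_amd_bsp(x86_cpuid_family());
	}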

Reported-and-tested-by: Sven Glodowski <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Cc: <[email protected]> # 4.11..
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Signed-off-by: Rolf Neugebauer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/microcode/core.c | 27 ++++++++++++++++++---------
1 file changed, 18 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -86,9 +86,6 @@ static bool __init check_loader_disabled
bool *res = &dis_ucode_ldr;
#endif

- if (!have_cpuid_p())
- return *res;
-
a = 1;
c = 0;
native_cpuid(&a, &b, &c, &d);
@@ -130,8 +127,9 @@ void __init load_ucode_bsp(void)
{
int vendor;
unsigned int family;
+ bool intel = true;

- if (check_loader_disabled_bsp())
+ if (!have_cpuid_p())
return;

vendor = x86_cpuid_vendor();
@@ -139,16 +137,27 @@ void __init load_ucode_bsp(void)

switch (vendor) {
case X86_VENDOR_INTEL:
- if (family >= 6)
- load_ucode_intel_bsp();
+ if (family < 6)
+ return;
break;
+
case X86_VENDOR_AMD:
- if (family >= 0x10)
- load_ucode_amd_bsp(family);
+ if (family < 0x10)
+ return;
+ intel = false;
break;
+
default:
- break;
+ return;
}
+
+ if (check_loader_disabled_bsp())
+ return;
+
+ if (intel)
+ load_ucode_intel_bsp();
+ else
+ load_ucode_amd_bsp(family);
}

static bool check_loader_disabled_ap(void)



2018-02-09 13:58:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 85/92] KVM/SVM: Allow direct access to MSR_IA32_SPEC_CTRL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: KarimAllah Ahmed <[email protected]>


(cherry picked from commit b2ac58f90540e39324e7a29a7ad471407ae0bf48)

[ Based on a patch from Paolo Bonzini <[email protected]> ]

... basically doing exactly what we do for VMX:

- Passthrough SPEC_CTRL to guests (if enabled in guest CPUID)
- Save and restore SPEC_CTRL around VMExit and VMEntry only if the guest
actually used it.
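
Conceptually, the vcpu_run path ends up doing something like the following
(a simplified sketch assuming the svm->spec_ctrl field and the
msr_write_intercepted() helper introduced by this patch; see the diff below
for the real code):

	/* Before VMRUN: restore the guest value only if the guest ever used it. */
	if (svm->spec_ctrl)
		wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);

	/* ... VMRUN ... */

	/*
	 * After VMEXIT: if writes to the MSR are no longer intercepted, the
	 * guest may have changed it, so read the value back and then clear
	 * it for the host.
	 */
	if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
		rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
	if (svm->spec_ctrl)
		wrmsrl(MSR_IA32_SPEC_CTRL, 0);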

Signed-off-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Ashok Raj <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/svm.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)

--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -183,6 +183,8 @@ struct vcpu_svm {
u64 gs_base;
} host;

+ u64 spec_ctrl;
+
u32 *msrpm;

ulong nmi_iret_rip;
@@ -248,6 +250,7 @@ static const struct svm_direct_access_ms
{ .index = MSR_CSTAR, .always = true },
{ .index = MSR_SYSCALL_MASK, .always = true },
#endif
+ { .index = MSR_IA32_SPEC_CTRL, .always = false },
{ .index = MSR_IA32_PRED_CMD, .always = false },
{ .index = MSR_IA32_LASTBRANCHFROMIP, .always = false },
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
@@ -863,6 +866,25 @@ static bool valid_msr_intercept(u32 inde
return false;
}

+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, unsigned msr)
+{
+ u8 bit_write;
+ unsigned long tmp;
+ u32 offset;
+ u32 *msrpm;
+
+ msrpm = is_guest_mode(vcpu) ? to_svm(vcpu)->nested.msrpm:
+ to_svm(vcpu)->msrpm;
+
+ offset = svm_msrpm_offset(msr);
+ bit_write = 2 * (msr & 0x0f) + 1;
+ tmp = msrpm[offset];
+
+ BUG_ON(offset == MSR_INVALID);
+
+ return !!test_bit(bit_write, &tmp);
+}
+
static void set_msr_interception(u32 *msrpm, unsigned msr,
int read, int write)
{
@@ -1537,6 +1559,8 @@ static void svm_vcpu_reset(struct kvm_vc
u32 dummy;
u32 eax = 1;

+ svm->spec_ctrl = 0;
+
if (!init_event) {
svm->vcpu.arch.apic_base = APIC_DEFAULT_PHYS_BASE |
MSR_IA32_APICBASE_ENABLE;
@@ -3520,6 +3544,13 @@ static int svm_get_msr(struct kvm_vcpu *
case MSR_VM_CR:
msr_info->data = svm->nested.vm_cr_msr;
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ msr_info->data = svm->spec_ctrl;
+ break;
case MSR_IA32_UCODE_REV:
msr_info->data = 0x01000065;
break;
@@ -3611,6 +3642,33 @@ static int svm_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ /* The STIBP bit doesn't fault even if it's not advertised */
+ if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+ return 1;
+
+ svm->spec_ctrl = data;
+
+ if (!data)
+ break;
+
+ /*
+ * For non-nested:
+ * When it's written (to non-zero) for the first time, pass
+ * it through.
+ *
+ * For nested:
+ * The handling of the MSR bitmap for L2 guests is done in
+ * nested_svm_vmrun_msrpm.
+ * We update the L1 MSR bit as well since it will end up
+ * touching the MSR anyway now.
+ */
+ set_msr_interception(svm->msrpm, MSR_IA32_SPEC_CTRL, 1, 1);
+ break;
case MSR_IA32_PRED_CMD:
if (!msr->host_initiated &&
!guest_cpuid_has_ibpb(vcpu))
@@ -4854,6 +4912,15 @@ static void svm_vcpu_run(struct kvm_vcpu

local_irq_enable();

+ /*
+ * If this vCPU has touched SPEC_CTRL, restore the guest's value if
+ * it's non-zero. Since vmentry is serialising on affected CPUs, there
+ * is no need to worry about the conditional branch over the wrmsr
+ * being speculatively taken.
+ */
+ if (svm->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+
asm volatile (
"push %%" _ASM_BP "; \n\t"
"mov %c[rbx](%[svm]), %%" _ASM_BX " \n\t"
@@ -4946,6 +5013,27 @@ static void svm_vcpu_run(struct kvm_vcpu
#endif
);

+ /*
+ * We do not use IBRS in the kernel. If this vCPU has used the
+ * SPEC_CTRL MSR it may have left it on; save the value and
+ * turn it off. This is much more efficient than blindly adding
+ * it to the atomic save/restore list. Especially as the former
+ * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
+ *
+ * For non-nested case:
+ * If the L01 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ *
+ * For nested case:
+ * If the L02 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ */
+ if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ rdmsrl(MSR_IA32_SPEC_CTRL, svm->spec_ctrl);
+
+ if (svm->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();




2018-02-09 13:59:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 87/92] ASoC: simple-card: Fix misleading error message

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Julian Scheel <[email protected]>

commit 7ac45d1635a4cd2e99a4b11903d4a2815ca1b27b upstream.

In case the cpu node could not be found, the error message would always
refer to /codec/ not being found in DT. Fix this by catching the missing
cpu node case explicitly.

Signed-off-by: Julian Scheel <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: thongsyho <[email protected]>
Signed-off-by: Nhan Nguyen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/soc/generic/simple-card.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/sound/soc/generic/simple-card.c
+++ b/sound/soc/generic/simple-card.c
@@ -232,13 +232,19 @@ static int asoc_simple_card_dai_link_of(
snprintf(prop, sizeof(prop), "%scpu", prefix);
cpu = of_get_child_by_name(node, prop);

+ if (!cpu) {
+ ret = -EINVAL;
+ dev_err(dev, "%s: Can't find %s DT node\n", __func__, prop);
+ goto dai_link_of_err;
+ }
+
snprintf(prop, sizeof(prop), "%splat", prefix);
plat = of_get_child_by_name(node, prop);

snprintf(prop, sizeof(prop), "%scodec", prefix);
codec = of_get_child_by_name(node, prop);

- if (!cpu || !codec) {
+ if (!codec) {
ret = -EINVAL;
dev_err(dev, "%s: Can't find %s DT node\n", __func__, prop);
goto dai_link_of_err;



2018-02-09 13:59:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 90/92] drm: rcar-du: Use the VBK interrupt for vblank events

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Laurent Pinchart <[email protected]>

commit cbbb90b0c084d7dfb2ed8e3fecf8df200fbdd2a0 upstream.

When implementing support for interlaced modes, the driver switched from
reporting vblank events on the vertical blanking (VBK) interrupt to the
frame end interrupt (FRM). This incorrectly divided the reported refresh
rate by two. Fix it by moving back to the VBK interrupt.

Fixes: 906eff7fcada ("drm: rcar-du: Implement support for interlaced modes")
Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Kieran Bingham <[email protected]>
Signed-off-by: thongsyho <[email protected]>
Signed-off-by: Nhan Nguyen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -551,7 +551,7 @@ static irqreturn_t rcar_du_crtc_irq(int
status = rcar_du_crtc_read(rcrtc, DSSR);
rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);

- if (status & DSSR_FRM) {
+ if (status & DSSR_VBK) {
drm_crtc_handle_vblank(&rcrtc->crtc);
rcar_du_crtc_finish_page_flip(rcrtc);
ret = IRQ_HANDLED;



2018-02-09 13:59:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 88/92] ASoC: rsnd: don't call free_irq() on Parent SSI

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kuninori Morimoto <[email protected]>

commit 1f8754d4daea5f257370a52a30fcb22798c54516 upstream.

If an SSI uses a shared pin, some SSI will be used as a parent SSI.
Then, the normal SSI's remove and the parent SSI's remove
(these are the same SSI) will both be called at unbind or remove time.
In this case, free_irq() will be called twice.
This patch solves the issue.

Signed-off-by: Kuninori Morimoto <[email protected]>
Tested-by: Hiroyuki Yokoyama <[email protected]>
Reported-by: Hiroyuki Yokoyama <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: thongsyho <[email protected]>
Signed-off-by: Nhan Nguyen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/soc/sh/rcar/ssi.c | 5 +++++
1 file changed, 5 insertions(+)

--- a/sound/soc/sh/rcar/ssi.c
+++ b/sound/soc/sh/rcar/ssi.c
@@ -699,9 +699,14 @@ static int rsnd_ssi_dma_remove(struct rs
struct rsnd_priv *priv)
{
struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
+ struct rsnd_mod *ssi_parent_mod = rsnd_io_to_mod_ssip(io);
struct device *dev = rsnd_priv_to_dev(priv);
int irq = ssi->irq;

+ /* Do nothing for SSI parent mod */
+ if (ssi_parent_mod == mod)
+ return 0;
+
/* PIO will request IRQ again */
devm_free_irq(dev, irq, mod);




2018-02-09 13:59:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 89/92] ASoC: rsnd: avoid duplicate free_irq()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Kuninori Morimoto <[email protected]>

commit e0936c3471a8411a5df327641fa3ffe12a2fb07b upstream.

commit 1f8754d4daea5f ("ASoC: rsnd: don't call free_irq() on
Parent SSI") fixed the Parent SSI duplicate free_irq().
But on Renesas Sound, not only the Parent SSI but also Multi SSI
have the same issue.
This patch avoids the duplicate free_irq() if the mod is not a pure SSI.

Fixes: 1f8754d4daea5f ("ASoC: rsnd: don't call free_irq() on Parent SSI")
Signed-off-by: Kuninori Morimoto <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: thongsyho <[email protected]>
Signed-off-by: Nhan Nguyen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
sound/soc/sh/rcar/ssi.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/sound/soc/sh/rcar/ssi.c
+++ b/sound/soc/sh/rcar/ssi.c
@@ -699,12 +699,12 @@ static int rsnd_ssi_dma_remove(struct rs
struct rsnd_priv *priv)
{
struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
- struct rsnd_mod *ssi_parent_mod = rsnd_io_to_mod_ssip(io);
+ struct rsnd_mod *pure_ssi_mod = rsnd_io_to_mod_ssi(io);
struct device *dev = rsnd_priv_to_dev(priv);
int irq = ssi->irq;

- /* Do nothing for SSI parent mod */
- if (ssi_parent_mod == mod)
+ /* Do nothing if non SSI (= SSI parent, multi SSI) mod */
+ if (pure_ssi_mod != mod)
return 0;

/* PIO will request IRQ again */



2018-02-09 14:00:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 46/92] x86/alternative: Print unadorned pointers

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

(cherry picked from commit 0e6c16c652cadaffd25a6bb326ec10da5bcec6b4)

After commit ad67b74d2469 ("printk: hash addresses printed with %p")
pointers are being hashed when printed. However, this makes the alternative
debug output completely useless. Switch to %px in order to see the
unadorned kernel pointers.
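
For illustration (a hypothetical example, not taken from the patch), the
difference between the two format specifiers is:

	pr_debug("alt: %p\n",  instr);	/* e.g. 00000000a1b2c3d4 (hashed) */
	pr_debug("alt: %px\n", instr);	/* e.g. ffffffff81000120 (raw)    */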

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/alternative.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -298,7 +298,7 @@ recompute_jump(struct alt_instr *a, u8 *
tgt_rip = next_rip + o_dspl;
n_dspl = tgt_rip - orig_insn;

- DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
+ DPRINTK("target RIP: %px, new_displ: 0x%x", tgt_rip, n_dspl);

if (tgt_rip - orig_insn >= 0) {
if (n_dspl - 2 <= 127)
@@ -352,7 +352,7 @@ static void __init_or_module optimize_no
sync_core();
local_irq_restore(flags);

- DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
+ DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
instr, a->instrlen - a->padlen, a->padlen);
}

@@ -370,7 +370,7 @@ void __init_or_module apply_alternatives
u8 *instr, *replacement;
u8 insnbuf[MAX_PATCH_LEN];

- DPRINTK("alt table %p -> %p", start, end);
+ DPRINTK("alt table %px, -> %px", start, end);
/*
* The scan order should be from start to end. A later scanned
* alternative code can overwrite previously scanned alternative code.
@@ -394,14 +394,14 @@ void __init_or_module apply_alternatives
continue;
}

- DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d",
+ DPRINTK("feat: %d*32+%d, old: (%px len: %d), repl: (%px, len: %d), pad: %d",
a->cpuid >> 5,
a->cpuid & 0x1f,
instr, a->instrlen,
replacement, a->replacementlen, a->padlen);

- DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
- DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
+ DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
+ DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);

memcpy(insnbuf, replacement, a->replacementlen);
insnbuf_sz = a->replacementlen;
@@ -422,7 +422,7 @@ void __init_or_module apply_alternatives
a->instrlen - a->replacementlen);
insnbuf_sz += a->instrlen - a->replacementlen;
}
- DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
+ DUMP_BYTES(insnbuf, insnbuf_sz, "%px: final_insn: ", instr);

text_poke_early(instr, insnbuf, insnbuf_sz);
}



2018-02-09 14:00:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 91/92] drm: rcar-du: Fix race condition when disabling planes at CRTC stop

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Laurent Pinchart <[email protected]>

commit 641307df71fe77d7b38a477067495ede05d47295 upstream.

When stopping the CRTC the driver must disable all planes and wait for
the change to take effect at the next vblank. Merely calling
drm_crtc_wait_one_vblank() is not enough, as the function doesn't
include any mechanism to handle the race with vblank interrupts.

Replace the drm_crtc_wait_one_vblank() call with a manual mechanism that
handles the vblank interrupt race.
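
The manual mechanism boils down to the usual counter-plus-waitqueue pattern
(a simplified sketch; "pending_vblank" stands in for the DSSR_VBK status
check, and the real code in the diff below decides between waiting for one
or two vblanks):

	/* CRTC stop path */
	spin_lock_irq(&rcrtc->vblank_lock);
	/* disable the planes, then decide how many vblanks to wait for */
	rcrtc->vblank_count = pending_vblank ? 2 : 1;
	spin_unlock_irq(&rcrtc->vblank_lock);
	wait_event_timeout(rcrtc->vblank_wait, rcrtc->vblank_count == 0,
			   msecs_to_jiffies(100));

	/* IRQ handler */
	spin_lock(&rcrtc->vblank_lock);
	if (rcrtc->vblank_count && --rcrtc->vblank_count == 0)
		wake_up(&rcrtc->vblank_wait);
	spin_unlock(&rcrtc->vblank_lock);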

Signed-off-by: Laurent Pinchart <[email protected]>
Reviewed-by: Kieran Bingham <[email protected]>
Signed-off-by: thongsyho <[email protected]>
Signed-off-by: Nhan Nguyen <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/gpu/drm/rcar-du/rcar_du_crtc.c | 53 +++++++++++++++++++++++++++++----
drivers/gpu/drm/rcar-du/rcar_du_crtc.h | 8 ++++
2 files changed, 55 insertions(+), 6 deletions(-)

--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -392,6 +392,31 @@ static void rcar_du_crtc_start(struct rc
rcrtc->started = true;
}

+static void rcar_du_crtc_disable_planes(struct rcar_du_crtc *rcrtc)
+{
+ struct rcar_du_device *rcdu = rcrtc->group->dev;
+ struct drm_crtc *crtc = &rcrtc->crtc;
+ u32 status;
+ /* Make sure vblank interrupts are enabled. */
+ drm_crtc_vblank_get(crtc);
+ /*
+ * Disable planes and calculate how many vertical blanking interrupts we
+ * have to wait for. If a vertical blanking interrupt has been triggered
+ * but not processed yet, we don't know whether it occurred before or
+ * after the planes got disabled. We thus have to wait for two vblank
+ * interrupts in that case.
+ */
+ spin_lock_irq(&rcrtc->vblank_lock);
+ rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
+ status = rcar_du_crtc_read(rcrtc, DSSR);
+ rcrtc->vblank_count = status & DSSR_VBK ? 2 : 1;
+ spin_unlock_irq(&rcrtc->vblank_lock);
+ if (!wait_event_timeout(rcrtc->vblank_wait, rcrtc->vblank_count == 0,
+ msecs_to_jiffies(100)))
+ dev_warn(rcdu->dev, "vertical blanking timeout\n");
+ drm_crtc_vblank_put(crtc);
+}
+
static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
{
struct drm_crtc *crtc = &rcrtc->crtc;
@@ -400,17 +425,16 @@ static void rcar_du_crtc_stop(struct rca
return;

/* Disable all planes and wait for the change to take effect. This is
- * required as the DSnPR registers are updated on vblank, and no vblank
- * will occur once the CRTC is stopped. Disabling planes when starting
- * the CRTC thus wouldn't be enough as it would start scanning out
- * immediately from old frame buffers until the next vblank.
+ * required as the plane enable registers are updated on vblank, and no
+ * vblank will occur once the CRTC is stopped. Disabling planes when
+ * starting the CRTC thus wouldn't be enough as it would start scanning
+ * out immediately from old frame buffers until the next vblank.
*
* This increases the CRTC stop delay, especially when multiple CRTCs
* are stopped in one operation as we now wait for one vblank per CRTC.
* Whether this can be improved needs to be researched.
*/
- rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
- drm_crtc_wait_one_vblank(crtc);
+ rcar_du_crtc_disable_planes(rcrtc);

/* Disable vertical blanking interrupt reporting. We first need to wait
* for page flip completion before stopping the CRTC as userspace
@@ -548,10 +572,25 @@ static irqreturn_t rcar_du_crtc_irq(int
irqreturn_t ret = IRQ_NONE;
u32 status;

+ spin_lock(&rcrtc->vblank_lock);
+
status = rcar_du_crtc_read(rcrtc, DSSR);
rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);

if (status & DSSR_VBK) {
+ /*
+ * Wake up the vblank wait if the counter reaches 0. This must
+ * be protected by the vblank_lock to avoid races in
+ * rcar_du_crtc_disable_planes().
+ */
+ if (rcrtc->vblank_count) {
+ if (--rcrtc->vblank_count == 0)
+ wake_up(&rcrtc->vblank_wait);
+ }
+ }
+ spin_unlock(&rcrtc->vblank_lock);
+
+ if (status & DSSR_VBK) {
drm_crtc_handle_vblank(&rcrtc->crtc);
rcar_du_crtc_finish_page_flip(rcrtc);
ret = IRQ_HANDLED;
@@ -606,6 +645,8 @@ int rcar_du_crtc_create(struct rcar_du_g
}

init_waitqueue_head(&rcrtc->flip_wait);
+ init_waitqueue_head(&rcrtc->vblank_wait);
+ spin_lock_init(&rcrtc->vblank_lock);

rcrtc->group = rgrp;
rcrtc->mmio_offset = mmio_offsets[index];
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -15,6 +15,7 @@
#define __RCAR_DU_CRTC_H__

#include <linux/mutex.h>
+#include <linux/spinlock.h>
#include <linux/wait.h>

#include <drm/drmP.h>
@@ -33,6 +34,9 @@ struct rcar_du_vsp;
* @started: whether the CRTC has been started and is running
* @event: event to post when the pending page flip completes
* @flip_wait: wait queue used to signal page flip completion
+ * @vblank_lock: protects vblank_wait and vblank_count
+ * @vblank_wait: wait queue used to signal vertical blanking
+ * @vblank_count: number of vertical blanking interrupts to wait for
* @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
* @group: CRTC group this CRTC belongs to
*/
@@ -48,6 +52,10 @@ struct rcar_du_crtc {
struct drm_pending_vblank_event *event;
wait_queue_head_t flip_wait;

+ spinlock_t vblank_lock;
+ wait_queue_head_t vblank_wait;
+ unsigned int vblank_count;
+
unsigned int outputs;

struct rcar_du_group *group;



2018-02-09 14:00:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 53/92] x86/entry/64: Remove the SYSCALL64 fast path

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

(cherry picked from commit 21d375b6b34ff511a507de27bf316b3dde6938d9)

The SYSCALL64 fast path was a nice, if small, optimization back in the good
old days when syscalls were actually reasonably fast. Now there is PTI to
slow everything down, and indirect branches are verboten, making everything
messier. The retpoline code in the fast path is particularly nasty.

Just get rid of the fast path. The slow path is barely slower.

[ tglx: Split out the 'push all extra regs' part ]

Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Kernel Hardening <[email protected]>
Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/entry/entry_64.S | 123 --------------------------------------------
arch/x86/entry/syscall_64.c | 7 --
2 files changed, 3 insertions(+), 127 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -179,94 +179,11 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r11 /* pt_regs->r11 */
sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */

- /*
- * If we need to do entry work or if we guess we'll need to do
- * exit work, go straight to the slow path.
- */
- movq PER_CPU_VAR(current_task), %r11
- testl $_TIF_WORK_SYSCALL_ENTRY|_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
- jnz entry_SYSCALL64_slow_path
-
-entry_SYSCALL_64_fastpath:
- /*
- * Easy case: enable interrupts and issue the syscall. If the syscall
- * needs pt_regs, we'll call a stub that disables interrupts again
- * and jumps to the slow path.
- */
- TRACE_IRQS_ON
- ENABLE_INTERRUPTS(CLBR_NONE)
-#if __SYSCALL_MASK == ~0
- cmpq $__NR_syscall_max, %rax
-#else
- andl $__SYSCALL_MASK, %eax
- cmpl $__NR_syscall_max, %eax
-#endif
- ja 1f /* return -ENOSYS (already in pt_regs->ax) */
- movq %r10, %rcx
-
- /*
- * This call instruction is handled specially in stub_ptregs_64.
- * It might end up jumping to the slow path. If it jumps, RAX
- * and all argument registers are clobbered.
- */
-#ifdef CONFIG_RETPOLINE
- movq sys_call_table(, %rax, 8), %rax
- call __x86_indirect_thunk_rax
-#else
- call *sys_call_table(, %rax, 8)
-#endif
-.Lentry_SYSCALL_64_after_fastpath_call:
-
- movq %rax, RAX(%rsp)
-1:
-
- /*
- * If we get here, then we know that pt_regs is clean for SYSRET64.
- * If we see that no exit work is required (which we are required
- * to check with IRQs off), then we can go straight to SYSRET64.
- */
- DISABLE_INTERRUPTS(CLBR_NONE)
- TRACE_IRQS_OFF
- movq PER_CPU_VAR(current_task), %r11
- testl $_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
- jnz 1f
-
- LOCKDEP_SYS_EXIT
- TRACE_IRQS_ON /* user mode is traced as IRQs on */
- movq RIP(%rsp), %rcx
- movq EFLAGS(%rsp), %r11
- RESTORE_C_REGS_EXCEPT_RCX_R11
- /*
- * This opens a window where we have a user CR3, but are
- * running in the kernel. This makes using the CS
- * register useless for telling whether or not we need to
- * switch CR3 in NMIs. Normal interrupts are OK because
- * they are off here.
- */
- SWITCH_USER_CR3
- movq RSP(%rsp), %rsp
- USERGS_SYSRET64
-
-1:
- /*
- * The fast path looked good when we started, but something changed
- * along the way and we need to switch to the slow path. Calling
- * raise(3) will trigger this, for example. IRQs are off.
- */
- TRACE_IRQS_ON
- ENABLE_INTERRUPTS(CLBR_NONE)
- SAVE_EXTRA_REGS
- movq %rsp, %rdi
- call syscall_return_slowpath /* returns with IRQs disabled */
- jmp return_from_SYSCALL_64
-
-entry_SYSCALL64_slow_path:
/* IRQs are off. */
SAVE_EXTRA_REGS
movq %rsp, %rdi
call do_syscall_64 /* returns with IRQs disabled */

-return_from_SYSCALL_64:
RESTORE_EXTRA_REGS
TRACE_IRQS_IRETQ /* we're about to change IF */

@@ -339,6 +256,7 @@ return_from_SYSCALL_64:
syscall_return_via_sysret:
/* rcx and r11 are already restored (see code above) */
RESTORE_C_REGS_EXCEPT_RCX_R11
+
/*
* This opens a window where we have a user CR3, but are
* running in the kernel. This makes using the CS
@@ -363,45 +281,6 @@ opportunistic_sysret_failed:
jmp restore_c_regs_and_iret
END(entry_SYSCALL_64)

-ENTRY(stub_ptregs_64)
- /*
- * Syscalls marked as needing ptregs land here.
- * If we are on the fast path, we need to save the extra regs,
- * which we achieve by trying again on the slow path. If we are on
- * the slow path, the extra regs are already saved.
- *
- * RAX stores a pointer to the C function implementing the syscall.
- * IRQs are on.
- */
- cmpq $.Lentry_SYSCALL_64_after_fastpath_call, (%rsp)
- jne 1f
-
- /*
- * Called from fast path -- disable IRQs again, pop return address
- * and jump to slow path
- */
- DISABLE_INTERRUPTS(CLBR_NONE)
- TRACE_IRQS_OFF
- popq %rax
- jmp entry_SYSCALL64_slow_path
-
-1:
- JMP_NOSPEC %rax /* Called from C */
-END(stub_ptregs_64)
-
-.macro ptregs_stub func
-ENTRY(ptregs_\func)
- leaq \func(%rip), %rax
- jmp stub_ptregs_64
-END(ptregs_\func)
-.endm
-
-/* Instantiate ptregs_stub for each ptregs-using syscall */
-#define __SYSCALL_64_QUAL_(sym)
-#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_stub sym
-#define __SYSCALL_64(nr, sym, qual) __SYSCALL_64_QUAL_##qual(sym)
-#include <asm/syscalls_64.h>
-
/*
* %rdi: prev task
* %rsi: next task
--- a/arch/x86/entry/syscall_64.c
+++ b/arch/x86/entry/syscall_64.c
@@ -6,14 +6,11 @@
#include <asm/asm-offsets.h>
#include <asm/syscall.h>

-#define __SYSCALL_64_QUAL_(sym) sym
-#define __SYSCALL_64_QUAL_ptregs(sym) ptregs_##sym
-
-#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long __SYSCALL_64_QUAL_##qual(sym)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
+#define __SYSCALL_64(nr, sym, qual) extern asmlinkage long sym(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);
#include <asm/syscalls_64.h>
#undef __SYSCALL_64

-#define __SYSCALL_64(nr, sym, qual) [nr] = __SYSCALL_64_QUAL_##qual(sym),
+#define __SYSCALL_64(nr, sym, qual) [nr] = sym,

extern long sys_ni_syscall(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long);




2018-02-09 14:00:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 54/92] x86/entry/64: Push extra regs right away

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andy Lutomirski <[email protected]>

(cherry picked from commit d1f7732009e0549eedf8ea1db948dc37be77fd46)

With the fast path removed there is no point in splitting the push of the
normal and the extra register set. Just push the extra regs right away.

[ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ]

Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Kernel Hardening <[email protected]>
Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/entry/entry_64.S | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -177,10 +177,14 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
pushq %r11 /* pt_regs->r11 */
- sub $(6*8), %rsp /* pt_regs->bp, bx, r12-15 not saved */
+ pushq %rbx /* pt_regs->rbx */
+ pushq %rbp /* pt_regs->rbp */
+ pushq %r12 /* pt_regs->r12 */
+ pushq %r13 /* pt_regs->r13 */
+ pushq %r14 /* pt_regs->r14 */
+ pushq %r15 /* pt_regs->r15 */

/* IRQs are off. */
- SAVE_EXTRA_REGS
movq %rsp, %rdi
call do_syscall_64 /* returns with IRQs disabled */




2018-02-09 14:01:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 84/92] KVM/VMX: Allow direct access to MSR_IA32_SPEC_CTRL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: KarimAllah Ahmed <[email protected]>


(cherry picked from commit d28b387fb74da95d69d2615732f50cceb38e9a4d)

[ Based on a patch from Ashok Raj <[email protected]> ]

Add direct access to MSR_IA32_SPEC_CTRL for guests. This is needed for
guests that will only mitigate Spectre V2 through IBRS+IBPB and will not
be using a retpoline+IBPB based approach.

To avoid the overhead of saving and restoring the MSR_IA32_SPEC_CTRL for
guests that do not actually use the MSR, only start saving and restoring
when a non-zero is written to it.
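
The lazy passthrough works roughly as follows in vmx_set_msr() (a simplified
sketch of the hunk below, not the literal patch):

	case MSR_IA32_SPEC_CTRL:
		if (!msr_info->host_initiated && !guest_cpuid_has_ibrs(vcpu))
			return 1;	/* MSR not exposed to this guest */

		vmx->spec_ctrl = data;
		if (!data)
			break;		/* still intercepted, nothing to save yet */

		/* First non-zero write: stop intercepting, start save/restore. */
		vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
					      MSR_IA32_SPEC_CTRL, MSR_TYPE_RW);
		break;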

No attempt is made to handle STIBP here, intentionally. Filtering STIBP
may be added in a future patch, which may require trapping all writes
if we don't want to pass it through directly to the guest.

[dwmw2: Clean up CPUID bits, save/restore manually, handle reset]

Signed-off-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Ashok Raj <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/cpuid.c | 8 ++-
arch/x86/kvm/cpuid.h | 11 +++++
arch/x86/kvm/vmx.c | 103 ++++++++++++++++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/x86.c | 2
4 files changed, 118 insertions(+), 6 deletions(-)

--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -357,7 +357,7 @@ static inline int __do_cpuid_ent(struct

/* cpuid 0x80000008.ebx */
const u32 kvm_cpuid_8000_0008_ebx_x86_features =
- F(IBPB);
+ F(IBPB) | F(IBRS);

/* cpuid 0xC0000001.edx */
const u32 kvm_cpuid_C000_0001_edx_x86_features =
@@ -382,7 +382,7 @@ static inline int __do_cpuid_ent(struct

/* cpuid 7.0.edx*/
const u32 kvm_cpuid_7_0_edx_x86_features =
- F(ARCH_CAPABILITIES);
+ F(SPEC_CTRL) | F(ARCH_CAPABILITIES);

/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();
@@ -618,9 +618,11 @@ static inline int __do_cpuid_ent(struct
g_phys_as = phys_as;
entry->eax = g_phys_as | (virt_as << 8);
entry->edx = 0;
- /* IBPB isn't necessarily present in hardware cpuid */
+ /* IBRS and IBPB aren't necessarily present in hardware cpuid */
if (boot_cpu_has(X86_FEATURE_IBPB))
entry->ebx |= F(IBPB);
+ if (boot_cpu_has(X86_FEATURE_IBRS))
+ entry->ebx |= F(IBRS);
entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
break;
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -171,6 +171,17 @@ static inline bool guest_cpuid_has_ibpb(
return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
}

+static inline bool guest_cpuid_has_ibrs(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
+ if (best && (best->ebx & bit(X86_FEATURE_IBRS)))
+ return true;
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
+}
+
static inline bool guest_cpuid_has_arch_capabilities(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -552,6 +552,7 @@ struct vcpu_vmx {
#endif

u64 arch_capabilities;
+ u64 spec_ctrl;

u32 vm_entry_controls_shadow;
u32 vm_exit_controls_shadow;
@@ -1852,6 +1853,29 @@ static void update_exception_bitmap(stru
}

/*
+ * Check if MSR is intercepted for currently loaded MSR bitmap.
+ */
+static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
+{
+ unsigned long *msr_bitmap;
+ int f = sizeof(unsigned long);
+
+ if (!cpu_has_vmx_msr_bitmap())
+ return true;
+
+ msr_bitmap = to_vmx(vcpu)->loaded_vmcs->msr_bitmap;
+
+ if (msr <= 0x1fff) {
+ return !!test_bit(msr, msr_bitmap + 0x800 / f);
+ } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
+ msr &= 0x1fff;
+ return !!test_bit(msr, msr_bitmap + 0xc00 / f);
+ }
+
+ return true;
+}
+
+/*
* Check if MSR is intercepted for L01 MSR bitmap.
*/
static bool msr_write_intercepted_l01(struct kvm_vcpu *vcpu, u32 msr)
@@ -2981,6 +3005,13 @@ static int vmx_get_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
msr_info->data = guest_read_tsc(vcpu);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ msr_info->data = to_vmx(vcpu)->spec_ctrl;
+ break;
case MSR_IA32_ARCH_CAPABILITIES:
if (!msr_info->host_initiated &&
!guest_cpuid_has_arch_capabilities(vcpu))
@@ -3091,6 +3122,36 @@ static int vmx_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr_info);
break;
+ case MSR_IA32_SPEC_CTRL:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibrs(vcpu))
+ return 1;
+
+ /* The STIBP bit doesn't fault even if it's not advertised */
+ if (data & ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP))
+ return 1;
+
+ vmx->spec_ctrl = data;
+
+ if (!data)
+ break;
+
+ /*
+ * For non-nested:
+ * When it's written (to non-zero) for the first time, pass
+ * it through.
+ *
+ * For nested:
+ * The handling of the MSR bitmap for L2 guests is done in
+ * nested_vmx_merge_msr_bitmap. We should not touch the
+ * vmcs02.msr_bitmap here since it gets completely overwritten
+ * in the merging. We update the vmcs01 here for L1 as well
+ * since it will end up touching the MSR anyway now.
+ */
+ vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap,
+ MSR_IA32_SPEC_CTRL,
+ MSR_TYPE_RW);
+ break;
case MSR_IA32_PRED_CMD:
if (!msr_info->host_initiated &&
!guest_cpuid_has_ibpb(vcpu))
@@ -5239,6 +5300,7 @@ static void vmx_vcpu_reset(struct kvm_vc
u64 cr0;

vmx->rmode.vm86_active = 0;
+ vmx->spec_ctrl = 0;

vmx->soft_vnmi_blocked = 0;

@@ -8824,6 +8886,15 @@ static void __noclone vmx_vcpu_run(struc

vmx_arm_hv_timer(vcpu);

+ /*
+ * If this vCPU has touched SPEC_CTRL, restore the guest's value if
+ * it's non-zero. Since vmentry is serialising on affected CPUs, there
+ * is no need to worry about the conditional branch over the wrmsr
+ * being speculatively taken.
+ */
+ if (vmx->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+
vmx->__launched = vmx->loaded_vmcs->launched;
asm(
/* Store host registers */
@@ -8942,6 +9013,27 @@ static void __noclone vmx_vcpu_run(struc
#endif
);

+ /*
+ * We do not use IBRS in the kernel. If this vCPU has used the
+ * SPEC_CTRL MSR it may have left it on; save the value and
+ * turn it off. This is much more efficient than blindly adding
+ * it to the atomic save/restore list. Especially as the former
+ * (Saving guest MSRs on vmexit) doesn't even exist in KVM.
+ *
+ * For non-nested case:
+ * If the L01 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ *
+ * For nested case:
+ * If the L02 MSR bitmap does not intercept the MSR, then we need to
+ * save it.
+ */
+ if (!msr_write_intercepted(vcpu, MSR_IA32_SPEC_CTRL))
+ rdmsrl(MSR_IA32_SPEC_CTRL, vmx->spec_ctrl);
+
+ if (vmx->spec_ctrl)
+ wrmsrl(MSR_IA32_SPEC_CTRL, 0);
+
/* Eliminate branch target predictions from guest mode */
vmexit_fill_RSB();

@@ -9501,7 +9593,7 @@ static inline bool nested_vmx_merge_msr_
unsigned long *msr_bitmap_l1;
unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
/*
- * pred_cmd is trying to verify two things:
+ * pred_cmd & spec_ctrl are trying to verify two things:
*
* 1. L0 gave a permission to L1 to actually passthrough the MSR. This
* ensures that we do not accidentally generate an L02 MSR bitmap
@@ -9514,9 +9606,10 @@ static inline bool nested_vmx_merge_msr_
* the MSR.
*/
bool pred_cmd = msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD);
+ bool spec_ctrl = msr_write_intercepted_l01(vcpu, MSR_IA32_SPEC_CTRL);

if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
- !pred_cmd)
+ !pred_cmd && !spec_ctrl)
return false;

page = nested_get_page(vcpu, vmcs12->msr_bitmap);
@@ -9550,6 +9643,12 @@ static inline bool nested_vmx_merge_msr_
}
}

+ if (spec_ctrl)
+ nested_vmx_disable_intercept_for_msr(
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_SPEC_CTRL,
+ MSR_TYPE_R | MSR_TYPE_W);
+
if (pred_cmd)
nested_vmx_disable_intercept_for_msr(
msr_bitmap_l1, msr_bitmap_l0,
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -975,7 +975,7 @@ static u32 msrs_to_save[] = {
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
- MSR_IA32_ARCH_CAPABILITIES
+ MSR_IA32_SPEC_CTRL, MSR_IA32_ARCH_CAPABILITIES
};

static unsigned num_msrs_to_save;



2018-02-09 14:01:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 50/92] x86/cpufeatures: Clean up Spectre v2 related CPUID flags

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit 2961298efe1ea1b6fc0d7ee8b76018fa6c0bcef2)

We want to expose the hardware features simply in /proc/cpuinfo as "ibrs",
"ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them
as the user-visible bits.

When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB
capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP
bit is set, set the AMD STIBP that's used for the generic hardware
capability.
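
In early_init_intel() that mapping ends up looking roughly like this (a
sketch of the hunk below):

	if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {	/* Intel bit implies both */
		set_cpu_cap(c, X86_FEATURE_IBRS);	/* AMD-named, user-visible */
		set_cpu_cap(c, X86_FEATURE_IBPB);
	}
	if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
		set_cpu_cap(c, X86_FEATURE_STIBP);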

Hide the rest from /proc/cpuinfo by putting "" in the comments, including
RETPOLINE and RETPOLINE_AMD, which shouldn't be visible there. There are
patches to make the sysfs vulnerabilities information non-readable by
non-root, and the same should apply to all information about which
mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo.

The feature bit for whether IBPB is actually used, which is needed for
ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB.

Originally-by: Borislav Petkov <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 18 +++++++++---------
arch/x86/include/asm/nospec-branch.h | 2 +-
arch/x86/kernel/cpu/bugs.c | 7 +++----
arch/x86/kernel/cpu/intel.c | 31 +++++++++++++++++++++----------
4 files changed, 34 insertions(+), 24 deletions(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -194,15 +194,15 @@
#define X86_FEATURE_HW_PSTATE ( 7*32+ 8) /* AMD HW-PState */
#define X86_FEATURE_PROC_FEEDBACK ( 7*32+ 9) /* AMD ProcFeedbackInterface */

-#define X86_FEATURE_RETPOLINE ( 7*32+12) /* Generic Retpoline mitigation for Spectre variant 2 */
-#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* AMD Retpoline mitigation for Spectre variant 2 */
+#define X86_FEATURE_RETPOLINE ( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
+#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */

-#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* Fill RSB on context switches */
+#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* "" Fill RSB on context switches */

/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
#define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */

-#define X86_FEATURE_IBPB ( 7*32+21) /* Indirect Branch Prediction Barrier enabled*/
+#define X86_FEATURE_USE_IBPB ( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */

/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
@@ -260,9 +260,9 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */
#define X86_FEATURE_IRPERF (13*32+1) /* Instructions Retired Count */
-#define X86_FEATURE_AMD_PRED_CMD (13*32+12) /* Prediction Command MSR (AMD) */
-#define X86_FEATURE_AMD_SPEC_CTRL (13*32+14) /* Speculation Control MSR only (AMD) */
-#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors (AMD) */
+#define X86_FEATURE_IBPB (13*32+12) /* Indirect Branch Prediction Barrier */
+#define X86_FEATURE_IBRS (13*32+14) /* Indirect Branch Restricted Speculation */
+#define X86_FEATURE_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors */

/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
@@ -301,8 +301,8 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
-#define X86_FEATURE_SPEC_CTRL (18*32+26) /* Speculation Control (IBRS + IBPB) */
-#define X86_FEATURE_STIBP (18*32+27) /* Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
+#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */

/*
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -225,7 +225,7 @@ static inline void indirect_branch_predi
"movl %[val], %%eax\n\t"
"movl $0, %%edx\n\t"
"wrmsr",
- X86_FEATURE_IBPB)
+ X86_FEATURE_USE_IBPB)
: : [msr] "i" (MSR_IA32_PRED_CMD),
[val] "i" (PRED_CMD_IBPB)
: "eax", "ecx", "edx", "memory");
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -272,9 +272,8 @@ retpoline_auto:
}

/* Initialize Indirect Branch Prediction Barrier if supported */
- if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) ||
- boot_cpu_has(X86_FEATURE_AMD_PRED_CMD)) {
- setup_force_cpu_cap(X86_FEATURE_IBPB);
+ if (boot_cpu_has(X86_FEATURE_IBPB)) {
+ setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
pr_info("Enabling Indirect Branch Prediction Barrier\n");
}
}
@@ -307,7 +306,7 @@ ssize_t cpu_show_spectre_v2(struct devic
return sprintf(buf, "Not affected\n");

return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
- boot_cpu_has(X86_FEATURE_IBPB) ? ", IBPB" : "",
+ boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
spectre_v2_module_string());
}
#endif
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -140,17 +140,28 @@ static void early_init_intel(struct cpui
rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode);
}

- if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) ||
- cpu_has(c, X86_FEATURE_STIBP) ||
- cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) ||
- cpu_has(c, X86_FEATURE_AMD_PRED_CMD) ||
- cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) {
- pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n");
- clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL);
+ /*
+ * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support,
+ * and they also have a different bit for STIBP support. Also,
+ * a hypervisor might have set the individual AMD bits even on
+ * Intel CPUs, for finer-grained selection of what's available.
+ */
+ if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
+ set_cpu_cap(c, X86_FEATURE_IBRS);
+ set_cpu_cap(c, X86_FEATURE_IBPB);
+ }
+ if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
+ set_cpu_cap(c, X86_FEATURE_STIBP);
+
+ /* Now if any of them are set, check the blacklist and clear the lot */
+ if ((cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) ||
+ cpu_has(c, X86_FEATURE_STIBP)) && bad_spectre_microcode(c)) {
+ pr_warn("Intel Spectre v2 broken microcode detected; disabling Speculation Control\n");
+ clear_cpu_cap(c, X86_FEATURE_IBRS);
+ clear_cpu_cap(c, X86_FEATURE_IBPB);
clear_cpu_cap(c, X86_FEATURE_STIBP);
- clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL);
- clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD);
- clear_cpu_cap(c, X86_FEATURE_AMD_STIBP);
+ clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL);
+ clear_cpu_cap(c, X86_FEATURE_INTEL_STIBP);
}

/*



2018-02-09 14:02:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 52/92] x86/spectre: Check CONFIG_RETPOLINE in command line parser

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dou Liyang <[email protected]>

(cherry picked from commit 9471eee9186a46893726e22ebb54cade3f9bc043)

The spectre_v2 option 'auto' does not check whether CONFIG_RETPOLINE is
enabled. As a consequence it fails to emit the appropriate warning and sets
feature flags which have no effect at all.

Add the missing IS_ENABLED() check.

Fixes: da285121560e ("x86/spectre: Add boot time option to select Spectre v2 mitigation")
Signed-off-by: Dou Liyang <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Tomohiro <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -212,10 +212,10 @@ static void __init spectre_v2_select_mit
return;

case SPECTRE_V2_CMD_FORCE:
- /* FALLTRHU */
case SPECTRE_V2_CMD_AUTO:
- goto retpoline_auto;
-
+ if (IS_ENABLED(CONFIG_RETPOLINE))
+ goto retpoline_auto;
+ break;
case SPECTRE_V2_CMD_RETPOLINE_AMD:
if (IS_ENABLED(CONFIG_RETPOLINE))
goto retpoline_amd;



2018-02-09 14:02:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 82/92] KVM/x86: Add IBPB support

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ashok Raj <[email protected]>


(cherry picked from commit 15d45071523d89b3fb7372e2135fbd72f6af9506)

The Indirect Branch Predictor Barrier (IBPB) is an indirect branch
control mechanism. It keeps earlier branches from influencing
later ones.

Unlike IBRS and STIBP, IBPB does not define a new mode of operation.
It's a command that ensures predicted branch targets aren't used after
the barrier. Although IBRS and IBPB are enumerated by the same CPUID
enumeration, IBPB is very different.
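
As a barrier command, issuing it amounts to a single MSR write (illustrative
only; the kernel wraps this in indirect_branch_prediction_barrier()):

	/* Flush indirect branch predictions. */
	wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);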

IBPB helps mitigate against three potential attacks:

* Mitigate guests from being attacked by other guests.
- This is addressed by issuing an IBPB when we do a guest switch.

* Mitigate attacks from guest/ring3->host/ring3.
These would require an IBPB during context switch in the host, or after
VMEXIT. The host process has two ways to mitigate:
- Either it can be compiled with retpoline
- If it's going through a context switch, and has set !dumpable, then
there is an IBPB in that path.
(Tim's patch: https://patchwork.kernel.org/patch/10192871)
- The case where after a VMEXIT you return back to Qemu might make
Qemu attackable from the guest when Qemu isn't compiled with retpoline.
There are issues reported when doing IBPB on every VMEXIT that resulted
in some tsc calibration woes in the guest.

* Mitigate guest/ring0->host/ring0 attacks.
When the host kernel is using retpoline it is safe against these attacks.
If the host kernel isn't using retpoline we might need to do an IBPB flush
on every VMEXIT.

Even when using retpoline for indirect calls, in certain conditions 'ret'
can use the BTB on Skylake-era CPUs. There are other mitigations
available like RSB stuffing/clearing.

* IBPB is issued only for SVM during svm_free_vcpu().
VMX has a vmclear and SVM doesn't. Follow discussion here:
https://lkml.org/lkml/2018/1/15/146

Please refer to the following spec for more details on the enumeration
and control.

Refer here to get documentation about mitigations.

https://software.intel.com/en-us/side-channel-security-support

[peterz: rebase and changelog rewrite]
[karahmed: - rebase
- vmx: expose PRED_CMD if guest has it in CPUID
- svm: only pass through IBPB if guest has it in CPUID
- vmx: support !cpu_has_vmx_msr_bitmap()]
- vmx: support nested]
[dwmw2: Expose CPUID bit too (AMD IBPB only for now as we lack IBRS)
PRED_CMD is a write-only MSR]

Signed-off-by: Ashok Raj <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: Asit Mallick <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Tim Chen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/cpuid.c | 11 ++++++-
arch/x86/kvm/cpuid.h | 12 +++++++
arch/x86/kvm/svm.c | 28 ++++++++++++++++++
arch/x86/kvm/vmx.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++--
4 files changed, 127 insertions(+), 3 deletions(-)

--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -355,6 +355,10 @@ static inline int __do_cpuid_ent(struct
F(3DNOWPREFETCH) | F(OSVW) | 0 /* IBS */ | F(XOP) |
0 /* SKINIT, WDT, LWP */ | F(FMA4) | F(TBM);

+ /* cpuid 0x80000008.ebx */
+ const u32 kvm_cpuid_8000_0008_ebx_x86_features =
+ F(IBPB);
+
/* cpuid 0xC0000001.edx */
const u32 kvm_cpuid_C000_0001_edx_x86_features =
F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
@@ -607,7 +611,12 @@ static inline int __do_cpuid_ent(struct
if (!g_phys_as)
g_phys_as = phys_as;
entry->eax = g_phys_as | (virt_as << 8);
- entry->ebx = entry->edx = 0;
+ entry->edx = 0;
+ /* IBPB isn't necessarily present in hardware cpuid */
+ if (boot_cpu_has(X86_FEATURE_IBPB))
+ entry->ebx |= F(IBPB);
+ entry->ebx &= kvm_cpuid_8000_0008_ebx_x86_features;
+ cpuid_mask(&entry->ebx, CPUID_8000_0008_EBX);
break;
}
case 0x80000019:
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -160,6 +160,18 @@ static inline bool guest_cpuid_has_rdtsc
return best && (best->edx & bit(X86_FEATURE_RDTSCP));
}

+static inline bool guest_cpuid_has_ibpb(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
+ if (best && (best->ebx & bit(X86_FEATURE_IBPB)))
+ return true;
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
+}
+
+
/*
* NRIPS is provided through cpuidfn 0x8000000a.edx bit 3
*/
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -248,6 +248,7 @@ static const struct svm_direct_access_ms
{ .index = MSR_CSTAR, .always = true },
{ .index = MSR_SYSCALL_MASK, .always = true },
#endif
+ { .index = MSR_IA32_PRED_CMD, .always = false },
{ .index = MSR_IA32_LASTBRANCHFROMIP, .always = false },
{ .index = MSR_IA32_LASTBRANCHTOIP, .always = false },
{ .index = MSR_IA32_LASTINTFROMIP, .always = false },
@@ -510,6 +511,7 @@ struct svm_cpu_data {
struct kvm_ldttss_desc *tss_desc;

struct page *save_area;
+ struct vmcb *current_vmcb;
};

static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data);
@@ -1644,11 +1646,17 @@ static void svm_free_vcpu(struct kvm_vcp
__free_pages(virt_to_page(svm->nested.msrpm), MSRPM_ALLOC_ORDER);
kvm_vcpu_uninit(vcpu);
kmem_cache_free(kvm_vcpu_cache, svm);
+ /*
+ * The vmcb page can be recycled, causing a false negative in
+ * svm_vcpu_load(). So do a full IBPB now.
+ */
+ indirect_branch_prediction_barrier();
}

static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
+ struct svm_cpu_data *sd = per_cpu(svm_data, cpu);
int i;

if (unlikely(cpu != vcpu->cpu)) {
@@ -1677,6 +1685,10 @@ static void svm_vcpu_load(struct kvm_vcp
if (static_cpu_has(X86_FEATURE_RDTSCP))
wrmsrl(MSR_TSC_AUX, svm->tsc_aux);

+ if (sd->current_vmcb != svm->vmcb) {
+ sd->current_vmcb = svm->vmcb;
+ indirect_branch_prediction_barrier();
+ }
avic_vcpu_load(vcpu, cpu);
}

@@ -3599,6 +3611,22 @@ static int svm_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr);
break;
+ case MSR_IA32_PRED_CMD:
+ if (!msr->host_initiated &&
+ !guest_cpuid_has_ibpb(vcpu))
+ return 1;
+
+ if (data & ~PRED_CMD_IBPB)
+ return 1;
+
+ if (!data)
+ break;
+
+ wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
+ if (is_guest_mode(vcpu))
+ break;
+ set_msr_interception(svm->msrpm, MSR_IA32_PRED_CMD, 0, 1);
+ break;
case MSR_STAR:
svm->vmcb->save.star = data;
break;
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -550,6 +550,7 @@ struct vcpu_vmx {
u64 msr_host_kernel_gs_base;
u64 msr_guest_kernel_gs_base;
#endif
+
u32 vm_entry_controls_shadow;
u32 vm_exit_controls_shadow;
/*
@@ -911,6 +912,8 @@ static void copy_vmcs12_to_shadow(struct
static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
static int alloc_identity_pagetable(struct kvm *kvm);
static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);
+static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+ u32 msr, int type);

static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -1846,6 +1849,29 @@ static void update_exception_bitmap(stru
vmcs_write32(EXCEPTION_BITMAP, eb);
}

+/*
+ * Check if MSR is intercepted for L01 MSR bitmap.
+ */
+static bool msr_write_intercepted_l01(struct kvm_vcpu *vcpu, u32 msr)
+{
+ unsigned long *msr_bitmap;
+ int f = sizeof(unsigned long);
+
+ if (!cpu_has_vmx_msr_bitmap())
+ return true;
+
+ msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap;
+
+ if (msr <= 0x1fff) {
+ return !!test_bit(msr, msr_bitmap + 0x800 / f);
+ } else if ((msr >= 0xc0000000) && (msr <= 0xc0001fff)) {
+ msr &= 0x1fff;
+ return !!test_bit(msr, msr_bitmap + 0xc00 / f);
+ }
+
+ return true;
+}
+
static void clear_atomic_switch_msr_special(struct vcpu_vmx *vmx,
unsigned long entry, unsigned long exit)
{
@@ -2255,6 +2281,7 @@ static void vmx_vcpu_load(struct kvm_vcp
if (per_cpu(current_vmcs, cpu) != vmx->loaded_vmcs->vmcs) {
per_cpu(current_vmcs, cpu) = vmx->loaded_vmcs->vmcs;
vmcs_load(vmx->loaded_vmcs->vmcs);
+ indirect_branch_prediction_barrier();
}

if (!already_loaded) {
@@ -3056,6 +3083,33 @@ static int vmx_set_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
kvm_write_tsc(vcpu, msr_info);
break;
+ case MSR_IA32_PRED_CMD:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_ibpb(vcpu))
+ return 1;
+
+ if (data & ~PRED_CMD_IBPB)
+ return 1;
+
+ if (!data)
+ break;
+
+ wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);
+
+ /*
+ * For non-nested:
+ * When it's written (to non-zero) for the first time, pass
+ * it through.
+ *
+ * For nested:
+ * The handling of the MSR bitmap for L2 guests is done in
+ * nested_vmx_merge_msr_bitmap. We should not touch the
+ * vmcs02.msr_bitmap here since it gets completely overwritten
+ * in the merging.
+ */
+ vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
+ MSR_TYPE_W);
+ break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -9431,9 +9485,23 @@ static inline bool nested_vmx_merge_msr_
struct page *page;
unsigned long *msr_bitmap_l1;
unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;
+ /*
+ * pred_cmd is trying to verify two things:
+ *
+ * 1. L0 gave a permission to L1 to actually passthrough the MSR. This
+ * ensures that we do not accidentally generate an L02 MSR bitmap
+ * from the L12 MSR bitmap that is too permissive.
+ * 2. That L1 or L2s have actually used the MSR. This avoids
+ * unnecessarily merging of the bitmap if the MSR is unused. This
+ * works properly because we only update the L01 MSR bitmap lazily.
+ * So even if L0 should pass L1 these MSRs, the L01 bitmap is only
+ * updated to reflect this when L1 (or its L2s) actually write to
+ * the MSR.
+ */
+ bool pred_cmd = msr_write_intercepted_l01(vcpu, MSR_IA32_PRED_CMD);

- /* This shortcut is ok because we support only x2APIC MSRs so far. */
- if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
+ if (!nested_cpu_has_virt_x2apic_mode(vmcs12) &&
+ !pred_cmd)
return false;

page = nested_get_page(vcpu, vmcs12->msr_bitmap);
@@ -9466,6 +9534,13 @@ static inline bool nested_vmx_merge_msr_
MSR_TYPE_W);
}
}
+
+ if (pred_cmd)
+ nested_vmx_disable_intercept_for_msr(
+ msr_bitmap_l1, msr_bitmap_l0,
+ MSR_IA32_PRED_CMD,
+ MSR_TYPE_W);
+
kunmap(page);
nested_release_page_clean(page);




2018-02-09 14:03:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 81/92] KVM: VMX: make MSR bitmaps per-VCPU

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <[email protected]>


(cherry picked from commit 904e14fb7cb96401a7dc803ca2863fd5ba32ffe6)

Place the MSR bitmap in struct loaded_vmcs, and update it in place
every time the x2apic or APICv state can change. This is rare and
the loop can handle 64 MSRs per iteration, in a similar fashion as
nested_vmx_prepare_msr_bitmap.

This prepares for choosing, on a per-VM basis, whether to intercept
the SPEC_CTRL and PRED_CMD MSRs.
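
For orientation before the fairly large diff below, here is the heart of the
rework, distilled from the hunks that follow (illustration only; the long-mode
KERNEL_GS_BASE handling is omitted for brevity): each loaded_vmcs now owns its
own msr_bitmap page, and the vCPU caches a small "mode" value so the bitmap is
only rewritten when the x2apic/APICv/long-mode state actually changes.

static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
{
	struct vcpu_vmx *vmx = to_vmx(vcpu);
	unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
	u8 mode = vmx_msr_bitmap_mode(vcpu);	/* x2apic / APICv / long mode */
	u8 changed = mode ^ vmx->msr_bitmap_mode;

	if (!changed)		/* common case: nothing to rewrite */
		return;

	/* x2apic MSRs 0x800-0x8ff are rewritten BITS_PER_LONG entries at a time */
	if (changed & (MSR_BITMAP_MODE_X2APIC | MSR_BITMAP_MODE_X2APIC_APICV))
		vmx_update_msr_bitmap_x2apic(msr_bitmap, mode);

	vmx->msr_bitmap_mode = mode;
}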

Cc: [email protected] # prereq for Spectre mitigation
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/vmx.c | 316 +++++++++++++++++++----------------------------------
1 file changed, 115 insertions(+), 201 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -110,6 +110,14 @@ static u64 __read_mostly host_xss;
static bool __read_mostly enable_pml = 1;
module_param_named(pml, enable_pml, bool, S_IRUGO);

+#define MSR_TYPE_R 1
+#define MSR_TYPE_W 2
+#define MSR_TYPE_RW 3
+
+#define MSR_BITMAP_MODE_X2APIC 1
+#define MSR_BITMAP_MODE_X2APIC_APICV 2
+#define MSR_BITMAP_MODE_LM 4
+
#define KVM_VMX_TSC_MULTIPLIER_MAX 0xffffffffffffffffULL

/* Guest_tsc -> host_tsc conversion requires 64-bit division. */
@@ -191,6 +199,7 @@ struct loaded_vmcs {
struct vmcs *shadow_vmcs;
int cpu;
int launched;
+ unsigned long *msr_bitmap;
struct list_head loaded_vmcss_on_cpu_link;
};

@@ -429,8 +438,6 @@ struct nested_vmx {
bool pi_pending;
u16 posted_intr_nv;

- unsigned long *msr_bitmap;
-
struct hrtimer preemption_timer;
bool preemption_timer_expired;

@@ -531,6 +538,7 @@ struct vcpu_vmx {
unsigned long host_rsp;
u8 fail;
bool nmi_known_unmasked;
+ u8 msr_bitmap_mode;
u32 exit_intr_info;
u32 idt_vectoring_info;
ulong rflags;
@@ -902,6 +910,7 @@ static u32 vmx_segment_access_rights(str
static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
static int alloc_identity_pagetable(struct kvm *kvm);
+static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu);

static DEFINE_PER_CPU(struct vmcs *, vmxarea);
static DEFINE_PER_CPU(struct vmcs *, current_vmcs);
@@ -921,12 +930,6 @@ static DEFINE_PER_CPU(spinlock_t, blocke

static unsigned long *vmx_io_bitmap_a;
static unsigned long *vmx_io_bitmap_b;
-static unsigned long *vmx_msr_bitmap_legacy;
-static unsigned long *vmx_msr_bitmap_longmode;
-static unsigned long *vmx_msr_bitmap_legacy_x2apic;
-static unsigned long *vmx_msr_bitmap_longmode_x2apic;
-static unsigned long *vmx_msr_bitmap_legacy_x2apic_apicv_inactive;
-static unsigned long *vmx_msr_bitmap_longmode_x2apic_apicv_inactive;
static unsigned long *vmx_vmread_bitmap;
static unsigned long *vmx_vmwrite_bitmap;

@@ -2520,36 +2523,6 @@ static void move_msr_up(struct vcpu_vmx
vmx->guest_msrs[from] = tmp;
}

-static void vmx_set_msr_bitmap(struct kvm_vcpu *vcpu)
-{
- unsigned long *msr_bitmap;
-
- if (is_guest_mode(vcpu))
- msr_bitmap = to_vmx(vcpu)->nested.msr_bitmap;
- else if (cpu_has_secondary_exec_ctrls() &&
- (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) &
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
- if (enable_apicv && kvm_vcpu_apicv_active(vcpu)) {
- if (is_long_mode(vcpu))
- msr_bitmap = vmx_msr_bitmap_longmode_x2apic;
- else
- msr_bitmap = vmx_msr_bitmap_legacy_x2apic;
- } else {
- if (is_long_mode(vcpu))
- msr_bitmap = vmx_msr_bitmap_longmode_x2apic_apicv_inactive;
- else
- msr_bitmap = vmx_msr_bitmap_legacy_x2apic_apicv_inactive;
- }
- } else {
- if (is_long_mode(vcpu))
- msr_bitmap = vmx_msr_bitmap_longmode;
- else
- msr_bitmap = vmx_msr_bitmap_legacy;
- }
-
- vmcs_write64(MSR_BITMAP, __pa(msr_bitmap));
-}
-
/*
* Set up the vmcs to automatically save and restore system
* msrs. Don't touch the 64-bit msrs if the guest is in legacy
@@ -2590,7 +2563,7 @@ static void setup_msrs(struct vcpu_vmx *
vmx->save_nmsrs = save_nmsrs;

if (cpu_has_vmx_msr_bitmap())
- vmx_set_msr_bitmap(&vmx->vcpu);
+ vmx_update_msr_bitmap(&vmx->vcpu);
}

/*
@@ -3537,6 +3510,8 @@ static void free_loaded_vmcs(struct load
loaded_vmcs_clear(loaded_vmcs);
free_vmcs(loaded_vmcs->vmcs);
loaded_vmcs->vmcs = NULL;
+ if (loaded_vmcs->msr_bitmap)
+ free_page((unsigned long)loaded_vmcs->msr_bitmap);
WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
}

@@ -3553,7 +3528,18 @@ static int alloc_loaded_vmcs(struct load

loaded_vmcs->shadow_vmcs = NULL;
loaded_vmcs_init(loaded_vmcs);
+
+ if (cpu_has_vmx_msr_bitmap()) {
+ loaded_vmcs->msr_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL);
+ if (!loaded_vmcs->msr_bitmap)
+ goto out_vmcs;
+ memset(loaded_vmcs->msr_bitmap, 0xff, PAGE_SIZE);
+ }
return 0;
+
+out_vmcs:
+ free_loaded_vmcs(loaded_vmcs);
+ return -ENOMEM;
}

static void free_kvm_area(void)
@@ -4562,10 +4548,8 @@ static void free_vpid(int vpid)
spin_unlock(&vmx_vpid_lock);
}

-#define MSR_TYPE_R 1
-#define MSR_TYPE_W 2
-static void __vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
- u32 msr, int type)
+static void __always_inline vmx_disable_intercept_for_msr(unsigned long *msr_bitmap,
+ u32 msr, int type)
{
int f = sizeof(unsigned long);

@@ -4599,8 +4583,8 @@ static void __vmx_disable_intercept_for_
}
}

-static void __vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
- u32 msr, int type)
+static void __always_inline vmx_enable_intercept_for_msr(unsigned long *msr_bitmap,
+ u32 msr, int type)
{
int f = sizeof(unsigned long);

@@ -4634,6 +4618,15 @@ static void __vmx_enable_intercept_for_m
}
}

+static void __always_inline vmx_set_intercept_for_msr(unsigned long *msr_bitmap,
+ u32 msr, int type, bool value)
+{
+ if (value)
+ vmx_enable_intercept_for_msr(msr_bitmap, msr, type);
+ else
+ vmx_disable_intercept_for_msr(msr_bitmap, msr, type);
+}
+
/*
* If a msr is allowed by L0, we should check whether it is allowed by L1.
* The corresponding bit will be cleared unless both of L0 and L1 allow it.
@@ -4680,58 +4673,68 @@ static void nested_vmx_disable_intercept
}
}

-static void vmx_disable_intercept_for_msr(u32 msr, bool longmode_only)
+static u8 vmx_msr_bitmap_mode(struct kvm_vcpu *vcpu)
{
- if (!longmode_only)
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy,
- msr, MSR_TYPE_R | MSR_TYPE_W);
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode,
- msr, MSR_TYPE_R | MSR_TYPE_W);
-}
-
-static void vmx_enable_intercept_msr_read_x2apic(u32 msr, bool apicv_active)
-{
- if (apicv_active) {
- __vmx_enable_intercept_for_msr(vmx_msr_bitmap_legacy_x2apic,
- msr, MSR_TYPE_R);
- __vmx_enable_intercept_for_msr(vmx_msr_bitmap_longmode_x2apic,
- msr, MSR_TYPE_R);
- } else {
- __vmx_enable_intercept_for_msr(vmx_msr_bitmap_legacy_x2apic_apicv_inactive,
- msr, MSR_TYPE_R);
- __vmx_enable_intercept_for_msr(vmx_msr_bitmap_longmode_x2apic_apicv_inactive,
- msr, MSR_TYPE_R);
+ u8 mode = 0;
+
+ if (cpu_has_secondary_exec_ctrls() &&
+ (vmcs_read32(SECONDARY_VM_EXEC_CONTROL) &
+ SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE)) {
+ mode |= MSR_BITMAP_MODE_X2APIC;
+ if (enable_apicv && kvm_vcpu_apicv_active(vcpu))
+ mode |= MSR_BITMAP_MODE_X2APIC_APICV;
}
+
+ if (is_long_mode(vcpu))
+ mode |= MSR_BITMAP_MODE_LM;
+
+ return mode;
}

-static void vmx_disable_intercept_msr_read_x2apic(u32 msr, bool apicv_active)
+#define X2APIC_MSR(r) (APIC_BASE_MSR + ((r) >> 4))
+
+static void vmx_update_msr_bitmap_x2apic(unsigned long *msr_bitmap,
+ u8 mode)
{
- if (apicv_active) {
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy_x2apic,
- msr, MSR_TYPE_R);
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode_x2apic,
- msr, MSR_TYPE_R);
- } else {
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy_x2apic_apicv_inactive,
- msr, MSR_TYPE_R);
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode_x2apic_apicv_inactive,
- msr, MSR_TYPE_R);
+ int msr;
+
+ for (msr = 0x800; msr <= 0x8ff; msr += BITS_PER_LONG) {
+ unsigned word = msr / BITS_PER_LONG;
+ msr_bitmap[word] = (mode & MSR_BITMAP_MODE_X2APIC_APICV) ? 0 : ~0;
+ msr_bitmap[word + (0x800 / sizeof(long))] = ~0;
+ }
+
+ if (mode & MSR_BITMAP_MODE_X2APIC) {
+ /*
+ * TPR reads and writes can be virtualized even if virtual interrupt
+ * delivery is not in use.
+ */
+ vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TASKPRI), MSR_TYPE_RW);
+ if (mode & MSR_BITMAP_MODE_X2APIC_APICV) {
+ vmx_enable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_TMCCT), MSR_TYPE_R);
+ vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_EOI), MSR_TYPE_W);
+ vmx_disable_intercept_for_msr(msr_bitmap, X2APIC_MSR(APIC_SELF_IPI), MSR_TYPE_W);
+ }
}
}

-static void vmx_disable_intercept_msr_write_x2apic(u32 msr, bool apicv_active)
+static void vmx_update_msr_bitmap(struct kvm_vcpu *vcpu)
{
- if (apicv_active) {
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy_x2apic,
- msr, MSR_TYPE_W);
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode_x2apic,
- msr, MSR_TYPE_W);
- } else {
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_legacy_x2apic_apicv_inactive,
- msr, MSR_TYPE_W);
- __vmx_disable_intercept_for_msr(vmx_msr_bitmap_longmode_x2apic_apicv_inactive,
- msr, MSR_TYPE_W);
- }
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+ unsigned long *msr_bitmap = vmx->vmcs01.msr_bitmap;
+ u8 mode = vmx_msr_bitmap_mode(vcpu);
+ u8 changed = mode ^ vmx->msr_bitmap_mode;
+
+ if (!changed)
+ return;
+
+ vmx_set_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW,
+ !(mode & MSR_BITMAP_MODE_LM));
+
+ if (changed & (MSR_BITMAP_MODE_X2APIC | MSR_BITMAP_MODE_X2APIC_APICV))
+ vmx_update_msr_bitmap_x2apic(msr_bitmap, mode);
+
+ vmx->msr_bitmap_mode = mode;
}

static bool vmx_get_enable_apicv(void)
@@ -4976,7 +4979,7 @@ static void vmx_refresh_apicv_exec_ctrl(
}

if (cpu_has_vmx_msr_bitmap())
- vmx_set_msr_bitmap(vcpu);
+ vmx_update_msr_bitmap(vcpu);
}

static u32 vmx_exec_control(struct vcpu_vmx *vmx)
@@ -5065,7 +5068,7 @@ static int vmx_vcpu_setup(struct vcpu_vm
vmcs_write64(VMWRITE_BITMAP, __pa(vmx_vmwrite_bitmap));
}
if (cpu_has_vmx_msr_bitmap())
- vmcs_write64(MSR_BITMAP, __pa(vmx_msr_bitmap_legacy));
+ vmcs_write64(MSR_BITMAP, __pa(vmx->vmcs01.msr_bitmap));

vmcs_write64(VMCS_LINK_POINTER, -1ull); /* 22.3.1.5 */

@@ -6396,7 +6399,7 @@ static void wakeup_handler(void)

static __init int hardware_setup(void)
{
- int r = -ENOMEM, i, msr;
+ int r = -ENOMEM, i;

rdmsrl_safe(MSR_EFER, &host_efer);

@@ -6411,41 +6414,13 @@ static __init int hardware_setup(void)
if (!vmx_io_bitmap_b)
goto out;

- vmx_msr_bitmap_legacy = (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx_msr_bitmap_legacy)
- goto out1;
-
- vmx_msr_bitmap_legacy_x2apic =
- (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx_msr_bitmap_legacy_x2apic)
- goto out2;
-
- vmx_msr_bitmap_legacy_x2apic_apicv_inactive =
- (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx_msr_bitmap_legacy_x2apic_apicv_inactive)
- goto out3;
-
- vmx_msr_bitmap_longmode = (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx_msr_bitmap_longmode)
- goto out4;
-
- vmx_msr_bitmap_longmode_x2apic =
- (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx_msr_bitmap_longmode_x2apic)
- goto out5;
-
- vmx_msr_bitmap_longmode_x2apic_apicv_inactive =
- (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx_msr_bitmap_longmode_x2apic_apicv_inactive)
- goto out6;
-
vmx_vmread_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL);
if (!vmx_vmread_bitmap)
- goto out7;
+ goto out1;

vmx_vmwrite_bitmap = (unsigned long *)__get_free_page(GFP_KERNEL);
if (!vmx_vmwrite_bitmap)
- goto out8;
+ goto out2;

memset(vmx_vmread_bitmap, 0xff, PAGE_SIZE);
memset(vmx_vmwrite_bitmap, 0xff, PAGE_SIZE);
@@ -6454,12 +6429,9 @@ static __init int hardware_setup(void)

memset(vmx_io_bitmap_b, 0xff, PAGE_SIZE);

- memset(vmx_msr_bitmap_legacy, 0xff, PAGE_SIZE);
- memset(vmx_msr_bitmap_longmode, 0xff, PAGE_SIZE);
-
if (setup_vmcs_config(&vmcs_config) < 0) {
r = -EIO;
- goto out9;
+ goto out3;
}

if (boot_cpu_has(X86_FEATURE_NX))
@@ -6516,47 +6488,8 @@ static __init int hardware_setup(void)
kvm_tsc_scaling_ratio_frac_bits = 48;
}

- vmx_disable_intercept_for_msr(MSR_FS_BASE, false);
- vmx_disable_intercept_for_msr(MSR_GS_BASE, false);
- vmx_disable_intercept_for_msr(MSR_KERNEL_GS_BASE, true);
- vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_CS, false);
- vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_ESP, false);
- vmx_disable_intercept_for_msr(MSR_IA32_SYSENTER_EIP, false);
-
- memcpy(vmx_msr_bitmap_legacy_x2apic,
- vmx_msr_bitmap_legacy, PAGE_SIZE);
- memcpy(vmx_msr_bitmap_longmode_x2apic,
- vmx_msr_bitmap_longmode, PAGE_SIZE);
- memcpy(vmx_msr_bitmap_legacy_x2apic_apicv_inactive,
- vmx_msr_bitmap_legacy, PAGE_SIZE);
- memcpy(vmx_msr_bitmap_longmode_x2apic_apicv_inactive,
- vmx_msr_bitmap_longmode, PAGE_SIZE);
-
set_bit(0, vmx_vpid_bitmap); /* 0 is reserved for host */

- /*
- * enable_apicv && kvm_vcpu_apicv_active()
- */
- for (msr = 0x800; msr <= 0x8ff; msr++)
- vmx_disable_intercept_msr_read_x2apic(msr, true);
-
- /* TMCCT */
- vmx_enable_intercept_msr_read_x2apic(0x839, true);
- /* TPR */
- vmx_disable_intercept_msr_write_x2apic(0x808, true);
- /* EOI */
- vmx_disable_intercept_msr_write_x2apic(0x80b, true);
- /* SELF-IPI */
- vmx_disable_intercept_msr_write_x2apic(0x83f, true);
-
- /*
- * (enable_apicv && !kvm_vcpu_apicv_active()) ||
- * !enable_apicv
- */
- /* TPR */
- vmx_disable_intercept_msr_read_x2apic(0x808, false);
- vmx_disable_intercept_msr_write_x2apic(0x808, false);
-
if (enable_ept) {
kvm_mmu_set_mask_ptes(VMX_EPT_READABLE_MASK,
(enable_ept_ad_bits) ? VMX_EPT_ACCESS_BIT : 0ull,
@@ -6602,22 +6535,10 @@ static __init int hardware_setup(void)

return alloc_kvm_area();

-out9:
- free_page((unsigned long)vmx_vmwrite_bitmap);
-out8:
- free_page((unsigned long)vmx_vmread_bitmap);
-out7:
- free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic_apicv_inactive);
-out6:
- free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic);
-out5:
- free_page((unsigned long)vmx_msr_bitmap_longmode);
-out4:
- free_page((unsigned long)vmx_msr_bitmap_legacy_x2apic_apicv_inactive);
out3:
- free_page((unsigned long)vmx_msr_bitmap_legacy_x2apic);
+ free_page((unsigned long)vmx_vmwrite_bitmap);
out2:
- free_page((unsigned long)vmx_msr_bitmap_legacy);
+ free_page((unsigned long)vmx_vmread_bitmap);
out1:
free_page((unsigned long)vmx_io_bitmap_b);
out:
@@ -6628,12 +6549,6 @@ out:

static __exit void hardware_unsetup(void)
{
- free_page((unsigned long)vmx_msr_bitmap_legacy_x2apic);
- free_page((unsigned long)vmx_msr_bitmap_legacy_x2apic_apicv_inactive);
- free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic);
- free_page((unsigned long)vmx_msr_bitmap_longmode_x2apic_apicv_inactive);
- free_page((unsigned long)vmx_msr_bitmap_legacy);
- free_page((unsigned long)vmx_msr_bitmap_longmode);
free_page((unsigned long)vmx_io_bitmap_b);
free_page((unsigned long)vmx_io_bitmap_a);
free_page((unsigned long)vmx_vmwrite_bitmap);
@@ -6998,13 +6913,6 @@ static int handle_vmon(struct kvm_vcpu *
if (r < 0)
goto out_vmcs02;

- if (cpu_has_vmx_msr_bitmap()) {
- vmx->nested.msr_bitmap =
- (unsigned long *)__get_free_page(GFP_KERNEL);
- if (!vmx->nested.msr_bitmap)
- goto out_msr_bitmap;
- }
-
vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL);
if (!vmx->nested.cached_vmcs12)
goto out_cached_vmcs12;
@@ -7034,9 +6942,6 @@ out_shadow_vmcs:
kfree(vmx->nested.cached_vmcs12);

out_cached_vmcs12:
- free_page((unsigned long)vmx->nested.msr_bitmap);
-
-out_msr_bitmap:
free_loaded_vmcs(&vmx->nested.vmcs02);

out_vmcs02:
@@ -7115,10 +7020,6 @@ static void free_nested(struct vcpu_vmx
vmx->nested.vmxon = false;
free_vpid(vmx->nested.vpid02);
nested_release_vmcs12(vmx);
- if (vmx->nested.msr_bitmap) {
- free_page((unsigned long)vmx->nested.msr_bitmap);
- vmx->nested.msr_bitmap = NULL;
- }
if (enable_shadow_vmcs) {
vmcs_clear(vmx->vmcs01.shadow_vmcs);
free_vmcs(vmx->vmcs01.shadow_vmcs);
@@ -8465,7 +8366,7 @@ static void vmx_set_virtual_x2apic_mode(
}
vmcs_write32(SECONDARY_VM_EXEC_CONTROL, sec_exec_control);

- vmx_set_msr_bitmap(vcpu);
+ vmx_update_msr_bitmap(vcpu);
}

static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu, hpa_t hpa)
@@ -9085,6 +8986,7 @@ static struct kvm_vcpu *vmx_create_vcpu(
{
int err;
struct vcpu_vmx *vmx = kmem_cache_zalloc(kvm_vcpu_cache, GFP_KERNEL);
+ unsigned long *msr_bitmap;
int cpu;

if (!vmx)
@@ -9125,6 +9027,15 @@ static struct kvm_vcpu *vmx_create_vcpu(
if (err < 0)
goto free_msrs;

+ msr_bitmap = vmx->vmcs01.msr_bitmap;
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_FS_BASE, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_GS_BASE, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_KERNEL_GS_BASE, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_CS, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_ESP, MSR_TYPE_RW);
+ vmx_disable_intercept_for_msr(msr_bitmap, MSR_IA32_SYSENTER_EIP, MSR_TYPE_RW);
+ vmx->msr_bitmap_mode = 0;
+
vmx->loaded_vmcs = &vmx->vmcs01;
cpu = get_cpu();
vmx_vcpu_load(&vmx->vcpu, cpu);
@@ -9519,7 +9430,7 @@ static inline bool nested_vmx_merge_msr_
int msr;
struct page *page;
unsigned long *msr_bitmap_l1;
- unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.msr_bitmap;
+ unsigned long *msr_bitmap_l0 = to_vmx(vcpu)->nested.vmcs02.msr_bitmap;

/* This shortcut is ok because we support only x2APIC MSRs so far. */
if (!nested_cpu_has_virt_x2apic_mode(vmcs12))
@@ -10034,6 +9945,9 @@ static void prepare_vmcs02(struct kvm_vc
if (kvm_has_tsc_control)
decache_tsc_multiplier(vmx);

+ if (cpu_has_vmx_msr_bitmap())
+ vmcs_write64(MSR_BITMAP, __pa(vmx->nested.vmcs02.msr_bitmap));
+
if (enable_vpid) {
/*
* There is no direct mapping between vpid02 and vpid12, the
@@ -10738,7 +10652,7 @@ static void load_vmcs12_host_state(struc
vmcs_write64(GUEST_IA32_DEBUGCTL, 0);

if (cpu_has_vmx_msr_bitmap())
- vmx_set_msr_bitmap(vcpu);
+ vmx_update_msr_bitmap(vcpu);

if (nested_vmx_load_msr(vcpu, vmcs12->vm_exit_msr_load_addr,
vmcs12->vm_exit_msr_load_count))



2018-02-09 14:03:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 83/92] KVM/VMX: Emulate MSR_IA32_ARCH_CAPABILITIES

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: KarimAllah Ahmed <[email protected]>


(cherry picked from commit 28c1c9fabf48d6ad596273a11c46e0d0da3e14cd)

Intel processors use MSR_IA32_ARCH_CAPABILITIES MSR to indicate RDCL_NO
(bit 0) and IBRS_ALL (bit 1). This is a read-only MSR. By default the
contents will come directly from the hardware, but user-space can still
override it.

[dwmw2: The bit in kvm_cpuid_7_0_edx_x86_features can be unconditional]
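
Condensed from the diff below (illustration only, not a separate change): the
host MSR value is snapshotted once at vCPU setup, guest reads return that
snapshot gated on the guest's CPUID bit, and only host-initiated (user-space)
accesses may write it.

/*
 * At vCPU setup, snapshot the host value if the host CPU has the MSR:
 *
 *	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
 *		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities);
 *
 * Guest RDMSR handling in vmx_get_msr():
 */
case MSR_IA32_ARCH_CAPABILITIES:
	if (!msr_info->host_initiated &&
	    !guest_cpuid_has_arch_capabilities(vcpu))
		return 1;	/* not advertised to this guest, so #GP */
	msr_info->data = to_vmx(vcpu)->arch_capabilities;
	break;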

Signed-off-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Paolo Bonzini <[email protected]>
Reviewed-by: Darren Kenny <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Ashok Raj <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/cpuid.c | 8 +++++++-
arch/x86/kvm/cpuid.h | 8 ++++++++
arch/x86/kvm/vmx.c | 15 +++++++++++++++
arch/x86/kvm/x86.c | 1 +
4 files changed, 31 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -380,6 +380,10 @@ static inline int __do_cpuid_ent(struct
/* cpuid 7.0.ecx*/
const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+ /* cpuid 7.0.edx*/
+ const u32 kvm_cpuid_7_0_edx_x86_features =
+ F(ARCH_CAPABILITIES);
+
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();

@@ -462,12 +466,14 @@ static inline int __do_cpuid_ent(struct
/* PKU is not yet implemented for shadow paging. */
if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
entry->ecx &= ~F(PKU);
+ entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+ cpuid_mask(&entry->edx, CPUID_7_EDX);
} else {
entry->ebx = 0;
entry->ecx = 0;
+ entry->edx = 0;
}
entry->eax = 0;
- entry->edx = 0;
break;
}
case 9:
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -171,6 +171,14 @@ static inline bool guest_cpuid_has_ibpb(
return best && (best->edx & bit(X86_FEATURE_SPEC_CTRL));
}

+static inline bool guest_cpuid_has_arch_capabilities(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 7, 0);
+ return best && (best->edx & bit(X86_FEATURE_ARCH_CAPABILITIES));
+}
+

/*
* NRIPS is provided through cpuidfn 0x8000000a.edx bit 3
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -551,6 +551,8 @@ struct vcpu_vmx {
u64 msr_guest_kernel_gs_base;
#endif

+ u64 arch_capabilities;
+
u32 vm_entry_controls_shadow;
u32 vm_exit_controls_shadow;
/*
@@ -2979,6 +2981,12 @@ static int vmx_get_msr(struct kvm_vcpu *
case MSR_IA32_TSC:
msr_info->data = guest_read_tsc(vcpu);
break;
+ case MSR_IA32_ARCH_CAPABILITIES:
+ if (!msr_info->host_initiated &&
+ !guest_cpuid_has_arch_capabilities(vcpu))
+ return 1;
+ msr_info->data = to_vmx(vcpu)->arch_capabilities;
+ break;
case MSR_IA32_SYSENTER_CS:
msr_info->data = vmcs_read32(GUEST_SYSENTER_CS);
break;
@@ -3110,6 +3118,11 @@ static int vmx_set_msr(struct kvm_vcpu *
vmx_disable_intercept_for_msr(vmx->vmcs01.msr_bitmap, MSR_IA32_PRED_CMD,
MSR_TYPE_W);
break;
+ case MSR_IA32_ARCH_CAPABILITIES:
+ if (!msr_info->host_initiated)
+ return 1;
+ vmx->arch_capabilities = data;
+ break;
case MSR_IA32_CR_PAT:
if (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_PAT) {
if (!kvm_mtrr_valid(vcpu, MSR_IA32_CR_PAT, data))
@@ -5196,6 +5209,8 @@ static int vmx_vcpu_setup(struct vcpu_vm
++vmx->nmsrs;
}

+ if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, vmx->arch_capabilities);

vm_exit_controls_init(vmx, vmcs_config.vmexit_ctrl);

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -975,6 +975,7 @@ static u32 msrs_to_save[] = {
#endif
MSR_IA32_TSC, MSR_IA32_CR_PAT, MSR_VM_HSAVE_PA,
MSR_IA32_FEATURE_CONTROL, MSR_IA32_BNDCFGS, MSR_TSC_AUX,
+ MSR_IA32_ARCH_CAPABILITIES
};

static unsigned num_msrs_to_save;



2018-02-09 14:03:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 76/92] KVM: nVMX: kmap() can't fail

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Hildenbrand <[email protected]>

commit 42cf014d38d8822cce63703a467e00f65d000952 upstream.

kmap() can't fail, therefore it will always return a valid pointer. Let's
just get rid of the unnecessary checks.

Signed-off-by: David Hildenbrand <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/vmx.c | 9 ---------
1 file changed, 9 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4756,10 +4756,6 @@ static int vmx_complete_nested_posted_in
return 0;

vapic_page = kmap(vmx->nested.virtual_apic_page);
- if (!vapic_page) {
- WARN_ON(1);
- return -ENOMEM;
- }
__kvm_apic_update_irr(vmx->nested.pi_desc->pir, vapic_page);
kunmap(vmx->nested.virtual_apic_page);

@@ -9584,11 +9580,6 @@ static inline bool nested_vmx_merge_msr_
if (!page)
return false;
msr_bitmap_l1 = (unsigned long *)kmap(page);
- if (!msr_bitmap_l1) {
- nested_release_page_clean(page);
- WARN_ON(1);
- return false;
- }

memset(msr_bitmap_l0, 0xff, PAGE_SIZE);




2018-02-09 14:04:33

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 74/92] x86/pti: Mark constant arrays as __initconst

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Arnd Bergmann <[email protected]>


(cherry picked from commit 4bf5d56d429cbc96c23d809a08f63cd29e1a702e)

I'm seeing build failures from the two newly introduced arrays that
are marked 'const' and '__initdata', which are mutually exclusive:

arch/x86/kernel/cpu/common.c:882:43: error: 'cpu_no_speculation' causes a section type conflict with 'e820_table_firmware_init'
arch/x86/kernel/cpu/common.c:895:43: error: 'cpu_no_meltdown' causes a section type conflict with 'e820_table_firmware_init'

The correct annotation is __initconst.
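
In other words, keep the const qualifier and change the section annotation so
both agree on a read-only init section; this is exactly what the two hunks
below do (one of the two arrays is shown here as an illustration):

/* __initdata places the object in the writable .init.data section, which
 * conflicts with 'const'; __initconst uses .init.rodata instead. */
static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
	{ X86_VENDOR_AMD },
	{}
};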

Fixes: fec9434a12f3 ("x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown")
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Ricardo Neri <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Thomas Garnier <[email protected]>
Cc: David Woodhouse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -861,7 +861,7 @@ static void identify_cpu_without_cpuid(s
#endif
}

-static const __initdata struct x86_cpu_id cpu_no_speculation[] = {
+static const __initconst struct x86_cpu_id cpu_no_speculation[] = {
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
{ X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
@@ -874,7 +874,7 @@ static const __initdata struct x86_cpu_i
{}
};

-static const __initdata struct x86_cpu_id cpu_no_meltdown[] = {
+static const __initconst struct x86_cpu_id cpu_no_meltdown[] = {
{ X86_VENDOR_AMD },
{}
};



2018-02-09 14:04:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 80/92] KVM: VMX: introduce alloc_loaded_vmcs

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <[email protected]>


(cherry picked from commit f21f165ef922c2146cc5bdc620f542953c41714b)

Group together the calls to alloc_vmcs and loaded_vmcs_init. Soon we'll also
allocate an MSR bitmap there.

Cc: [email protected] # prereq for Spectre mitigation
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/vmx.c | 38 +++++++++++++++++++++++---------------
1 file changed, 23 insertions(+), 15 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3522,11 +3522,6 @@ static struct vmcs *alloc_vmcs_cpu(int c
return vmcs;
}

-static struct vmcs *alloc_vmcs(void)
-{
- return alloc_vmcs_cpu(raw_smp_processor_id());
-}
-
static void free_vmcs(struct vmcs *vmcs)
{
free_pages((unsigned long)vmcs, vmcs_config.order);
@@ -3545,6 +3540,22 @@ static void free_loaded_vmcs(struct load
WARN_ON(loaded_vmcs->shadow_vmcs != NULL);
}

+static struct vmcs *alloc_vmcs(void)
+{
+ return alloc_vmcs_cpu(raw_smp_processor_id());
+}
+
+static int alloc_loaded_vmcs(struct loaded_vmcs *loaded_vmcs)
+{
+ loaded_vmcs->vmcs = alloc_vmcs();
+ if (!loaded_vmcs->vmcs)
+ return -ENOMEM;
+
+ loaded_vmcs->shadow_vmcs = NULL;
+ loaded_vmcs_init(loaded_vmcs);
+ return 0;
+}
+
static void free_kvm_area(void)
{
int cpu;
@@ -6943,6 +6954,7 @@ static int handle_vmon(struct kvm_vcpu *
struct vmcs *shadow_vmcs;
const u64 VMXON_NEEDED_FEATURES = FEATURE_CONTROL_LOCKED
| FEATURE_CONTROL_VMXON_ENABLED_OUTSIDE_SMX;
+ int r;

/* The Intel VMX Instruction Reference lists a bunch of bits that
* are prerequisite to running VMXON, most notably cr4.VMXE must be
@@ -6982,11 +6994,9 @@ static int handle_vmon(struct kvm_vcpu *
return 1;
}

- vmx->nested.vmcs02.vmcs = alloc_vmcs();
- vmx->nested.vmcs02.shadow_vmcs = NULL;
- if (!vmx->nested.vmcs02.vmcs)
+ r = alloc_loaded_vmcs(&vmx->nested.vmcs02);
+ if (r < 0)
goto out_vmcs02;
- loaded_vmcs_init(&vmx->nested.vmcs02);

if (cpu_has_vmx_msr_bitmap()) {
vmx->nested.msr_bitmap =
@@ -9107,17 +9117,15 @@ static struct kvm_vcpu *vmx_create_vcpu(
if (!vmx->guest_msrs)
goto free_pml;

- vmx->loaded_vmcs = &vmx->vmcs01;
- vmx->loaded_vmcs->vmcs = alloc_vmcs();
- vmx->loaded_vmcs->shadow_vmcs = NULL;
- if (!vmx->loaded_vmcs->vmcs)
- goto free_msrs;
if (!vmm_exclusive)
kvm_cpu_vmxon(__pa(per_cpu(vmxarea, raw_smp_processor_id())));
- loaded_vmcs_init(vmx->loaded_vmcs);
+ err = alloc_loaded_vmcs(&vmx->vmcs01);
if (!vmm_exclusive)
kvm_cpu_vmxoff();
+ if (err < 0)
+ goto free_msrs;

+ vmx->loaded_vmcs = &vmx->vmcs01;
cpu = get_cpu();
vmx_vcpu_load(&vmx->vcpu, cpu);
vmx->vcpu.cpu = cpu;



2018-02-09 14:04:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 67/92] x86/spectre: Report get_user mitigation for spectre_v1

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit edfbae53dab8348fca778531be9f4855d2ca0360)

Reflect the presence of get_user(), __get_user(), and 'syscall' protections
in sysfs. The expectation is that new and better tooling will allow the
kernel to grow more usages of array_index_nospec(); for now, only claim
mitigation for __user pointer de-references.

Reported-by: Jiri Slaby <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727420158.33451.11658324346540434635.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -296,7 +296,7 @@ ssize_t cpu_show_spectre_v1(struct devic
{
if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1))
return sprintf(buf, "Not affected\n");
- return sprintf(buf, "Vulnerable\n");
+ return sprintf(buf, "Mitigation: __user pointer sanitization\n");
}

ssize_t cpu_show_spectre_v2(struct device *dev,



2018-02-09 14:04:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 72/92] x86/retpoline: Avoid retpolines for built-in __init functions

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>


(cherry picked from commit 66f793099a636862a71c59d4a6ba91387b155e0c)

There's no point in building init code with retpolines, since it runs before
any potentially hostile userspace does. And before the retpoline is actually
ALTERNATIVEd into place, for much of it.

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/init.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -4,6 +4,13 @@
#include <linux/compiler.h>
#include <linux/types.h>

+/* Built-in __init functions needn't be compiled with retpoline */
+#if defined(RETPOLINE) && !defined(MODULE)
+#define __noretpoline __attribute__((indirect_branch("keep")))
+#else
+#define __noretpoline
+#endif
+
/* These macros are used to mark some functions or
* initialized data (doesn't apply to uninitialized data)
* as `initialization' functions. The kernel can take this
@@ -39,7 +46,7 @@

/* These are for everybody (although not all archs will actually
discard it in modules) */
-#define __init __section(.init.text) __cold notrace __latent_entropy
+#define __init __section(.init.text) __cold notrace __latent_entropy __noretpoline
#define __initdata __section(.init.data)
#define __initconst __section(.init.rodata)
#define __exitdata __section(.exit.data)



2018-02-09 14:05:16

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 73/92] x86/spectre: Simplify spectre_v2 command line parsing

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: KarimAllah Ahmed <[email protected]>


(cherry picked from commit 9005c6834c0ffdfe46afa76656bd9276cca864f6)

[dwmw2: Use ARRAY_SIZE]

Signed-off-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 84 +++++++++++++++++++++++++++++----------------
1 file changed, 55 insertions(+), 29 deletions(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -118,13 +118,13 @@ static inline const char *spectre_v2_mod
static void __init spec2_print_if_insecure(const char *reason)
{
if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
- pr_info("%s\n", reason);
+ pr_info("%s selected on command line.\n", reason);
}

static void __init spec2_print_if_secure(const char *reason)
{
if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
- pr_info("%s\n", reason);
+ pr_info("%s selected on command line.\n", reason);
}

static inline bool retp_compiler(void)
@@ -139,42 +139,68 @@ static inline bool match_option(const ch
return len == arglen && !strncmp(arg, opt, len);
}

+static const struct {
+ const char *option;
+ enum spectre_v2_mitigation_cmd cmd;
+ bool secure;
+} mitigation_options[] = {
+ { "off", SPECTRE_V2_CMD_NONE, false },
+ { "on", SPECTRE_V2_CMD_FORCE, true },
+ { "retpoline", SPECTRE_V2_CMD_RETPOLINE, false },
+ { "retpoline,amd", SPECTRE_V2_CMD_RETPOLINE_AMD, false },
+ { "retpoline,generic", SPECTRE_V2_CMD_RETPOLINE_GENERIC, false },
+ { "auto", SPECTRE_V2_CMD_AUTO, false },
+};
+
static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
{
char arg[20];
- int ret;
+ int ret, i;
+ enum spectre_v2_mitigation_cmd cmd = SPECTRE_V2_CMD_AUTO;
+
+ if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
+ return SPECTRE_V2_CMD_NONE;
+ else {
+ ret = cmdline_find_option(boot_command_line, "spectre_v2", arg,
+ sizeof(arg));
+ if (ret < 0)
+ return SPECTRE_V2_CMD_AUTO;

- ret = cmdline_find_option(boot_command_line, "spectre_v2", arg,
- sizeof(arg));
- if (ret > 0) {
- if (match_option(arg, ret, "off")) {
- goto disable;
- } else if (match_option(arg, ret, "on")) {
- spec2_print_if_secure("force enabled on command line.");
- return SPECTRE_V2_CMD_FORCE;
- } else if (match_option(arg, ret, "retpoline")) {
- spec2_print_if_insecure("retpoline selected on command line.");
- return SPECTRE_V2_CMD_RETPOLINE;
- } else if (match_option(arg, ret, "retpoline,amd")) {
- if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) {
- pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n");
- return SPECTRE_V2_CMD_AUTO;
- }
- spec2_print_if_insecure("AMD retpoline selected on command line.");
- return SPECTRE_V2_CMD_RETPOLINE_AMD;
- } else if (match_option(arg, ret, "retpoline,generic")) {
- spec2_print_if_insecure("generic retpoline selected on command line.");
- return SPECTRE_V2_CMD_RETPOLINE_GENERIC;
- } else if (match_option(arg, ret, "auto")) {
+ for (i = 0; i < ARRAY_SIZE(mitigation_options); i++) {
+ if (!match_option(arg, ret, mitigation_options[i].option))
+ continue;
+ cmd = mitigation_options[i].cmd;
+ break;
+ }
+
+ if (i >= ARRAY_SIZE(mitigation_options)) {
+ pr_err("unknown option (%s). Switching to AUTO select\n",
+ mitigation_options[i].option);
return SPECTRE_V2_CMD_AUTO;
}
}

- if (!cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
+ if ((cmd == SPECTRE_V2_CMD_RETPOLINE ||
+ cmd == SPECTRE_V2_CMD_RETPOLINE_AMD ||
+ cmd == SPECTRE_V2_CMD_RETPOLINE_GENERIC) &&
+ !IS_ENABLED(CONFIG_RETPOLINE)) {
+ pr_err("%s selected but not compiled in. Switching to AUTO select\n",
+ mitigation_options[i].option);
return SPECTRE_V2_CMD_AUTO;
-disable:
- spec2_print_if_insecure("disabled on command line.");
- return SPECTRE_V2_CMD_NONE;
+ }
+
+ if (cmd == SPECTRE_V2_CMD_RETPOLINE_AMD &&
+ boot_cpu_data.x86_vendor != X86_VENDOR_AMD) {
+ pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n");
+ return SPECTRE_V2_CMD_AUTO;
+ }
+
+ if (mitigation_options[i].secure)
+ spec2_print_if_secure(mitigation_options[i].option);
+ else
+ spec2_print_if_insecure(mitigation_options[i].option);
+
+ return cmd;
}

/* Check for Skylake-like CPUs (for RSB handling) */



2018-02-09 14:05:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 71/92] x86/kvm: Update spectre-v1 mitigation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit 085331dfc6bbe3501fb936e657331ca943827600)

Commit 75f139aaf896 "KVM: x86: Add memory barrier on vmcs field lookup"
added a raw 'asm("lfence");' to prevent a bounds check bypass of
'vmcs_field_to_offset_table'.

The lfence can be avoided in this path by using the array_index_nospec()
helper designed for these types of fixes.
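
For reference, the general shape of such a conversion looks like this (the
identifiers below are illustrative, not taken from this patch): bounds-check
first, then clamp the index so that even a mispredicted branch cannot read out
of bounds, and drop the explicit lfence.

#include <linux/kernel.h>
#include <linux/nospec.h>

static const int example_table[8] = { 0 };	/* illustrative only */

static int example_lookup(unsigned int idx)
{
	if (idx >= ARRAY_SIZE(example_table))
		return -EINVAL;
	/*
	 * array_index_nospec() clamps idx to [0, ARRAY_SIZE(example_table))
	 * even under speculative execution, so no lfence is needed here.
	 */
	idx = array_index_nospec(idx, ARRAY_SIZE(example_table));
	return example_table[idx];
}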

Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Paolo Bonzini <[email protected]>
Cc: Andrew Honig <[email protected]>
Cc: [email protected]
Cc: Jim Mattson <[email protected]>
Link: https://lkml.kernel.org/r/151744959670.6342.3001723920950249067.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/vmx.c | 20 +++++++++-----------
1 file changed, 9 insertions(+), 11 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -33,6 +33,7 @@
#include <linux/slab.h>
#include <linux/tboot.h>
#include <linux/hrtimer.h>
+#include <linux/nospec.h>
#include "kvm_cache_regs.h"
#include "x86.h"

@@ -856,21 +857,18 @@ static const unsigned short vmcs_field_t

static inline short vmcs_field_to_offset(unsigned long field)
{
- BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
+ const size_t size = ARRAY_SIZE(vmcs_field_to_offset_table);
+ unsigned short offset;

- if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
+ BUILD_BUG_ON(size > SHRT_MAX);
+ if (field >= size)
return -ENOENT;

- /*
- * FIXME: Mitigation for CVE-2017-5753. To be replaced with a
- * generic mechanism.
- */
- asm("lfence");
-
- if (vmcs_field_to_offset_table[field] == 0)
+ field = array_index_nospec(field, size);
+ offset = vmcs_field_to_offset_table[field];
+ if (offset == 0)
return -ENOENT;
-
- return vmcs_field_to_offset_table[field];
+ return offset;
}

static inline struct vmcs12 *get_vmcs12(struct kvm_vcpu *vcpu)



2018-02-09 14:05:40

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 70/92] x86/paravirt: Remove 'noreplace-paravirt' cmdline option

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Josh Poimboeuf <[email protected]>


(cherry picked from commit 12c69f1e94c89d40696e83804dd2f0965b5250cd)

The 'noreplace-paravirt' option disables paravirt patching, leaving the
original pv indirect calls in place.

That's highly incompatible with retpolines, unless we want to uglify
paravirt even further and convert the paravirt calls to retpolines.

As far as I can tell, the option doesn't seem to be useful for much
other than introducing surprising corner cases and making the kernel
vulnerable to Spectre v2. It was probably a debug option from the early
paravirt days. So just remove it.

Signed-off-by: Josh Poimboeuf <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Juergen Gross <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Rusty Russell <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Alok Kataria <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Dan Williams <[email protected]>
Link: https://lkml.kernel.org/r/20180131041333.2x6blhxirc2kclrq@treble
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Documentation/kernel-parameters.txt | 2 --
arch/x86/kernel/alternative.c | 14 --------------
2 files changed, 16 deletions(-)

--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2805,8 +2805,6 @@ bytes respectively. Such letter suffixes
norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space

- noreplace-paravirt [X86,IA-64,PV_OPS] Don't patch paravirt_ops
-
noreplace-smp [X86-32,SMP] Don't replace SMP instructions
with UP alternatives

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -46,17 +46,6 @@ static int __init setup_noreplace_smp(ch
}
__setup("noreplace-smp", setup_noreplace_smp);

-#ifdef CONFIG_PARAVIRT
-static int __initdata_or_module noreplace_paravirt = 0;
-
-static int __init setup_noreplace_paravirt(char *str)
-{
- noreplace_paravirt = 1;
- return 1;
-}
-__setup("noreplace-paravirt", setup_noreplace_paravirt);
-#endif
-
#define DPRINTK(fmt, args...) \
do { \
if (debug_alternative) \
@@ -588,9 +577,6 @@ void __init_or_module apply_paravirt(str
struct paravirt_patch_site *p;
char insnbuf[MAX_PATCH_LEN];

- if (noreplace_paravirt)
- return;
-
for (p = start; p < end; p++) {
unsigned int used;




2018-02-09 14:06:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 66/92] nl80211: Sanitize array index in parse_txq_params

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit 259d8c1e984318497c84eef547bbb6b1d9f4eb05)

Wireless drivers rely on parse_txq_params to validate that txq_params->ac
is less than NL80211_NUM_ACS by the time the low-level driver's ->conf_tx()
handler is called. Use a new helper, array_index_nospec(), to sanitize
txq_params->ac with respect to speculation. I.e. ensure that any
speculation into ->conf_tx() handlers is done with a value of
txq_params->ac that is within the bounds of [0, NL80211_NUM_ACS).

Reported-by: Christian Lamparter <[email protected]>
Reported-by: Elena Reshetova <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: Johannes Berg <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727419584.33451.7700736761686184303.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/wireless/nl80211.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -16,6 +16,7 @@
#include <linux/nl80211.h>
#include <linux/rtnetlink.h>
#include <linux/netlink.h>
+#include <linux/nospec.h>
#include <linux/etherdevice.h>
#include <net/net_namespace.h>
#include <net/genetlink.h>
@@ -2014,20 +2015,22 @@ static const struct nla_policy txq_param
static int parse_txq_params(struct nlattr *tb[],
struct ieee80211_txq_params *txq_params)
{
+ u8 ac;
+
if (!tb[NL80211_TXQ_ATTR_AC] || !tb[NL80211_TXQ_ATTR_TXOP] ||
!tb[NL80211_TXQ_ATTR_CWMIN] || !tb[NL80211_TXQ_ATTR_CWMAX] ||
!tb[NL80211_TXQ_ATTR_AIFS])
return -EINVAL;

- txq_params->ac = nla_get_u8(tb[NL80211_TXQ_ATTR_AC]);
+ ac = nla_get_u8(tb[NL80211_TXQ_ATTR_AC]);
txq_params->txop = nla_get_u16(tb[NL80211_TXQ_ATTR_TXOP]);
txq_params->cwmin = nla_get_u16(tb[NL80211_TXQ_ATTR_CWMIN]);
txq_params->cwmax = nla_get_u16(tb[NL80211_TXQ_ATTR_CWMAX]);
txq_params->aifs = nla_get_u8(tb[NL80211_TXQ_ATTR_AIFS]);

- if (txq_params->ac >= NL80211_NUM_ACS)
+ if (ac >= NL80211_NUM_ACS)
return -EINVAL;
-
+ txq_params->ac = array_index_nospec(ac, NL80211_NUM_ACS);
return 0;
}




2018-02-09 14:06:26

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 69/92] x86/cpuid: Fix up "virtual" IBRS/IBPB/STIBP feature bits on Intel

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>


(cherry picked from commit 7fcae1118f5fd44a862aa5c3525248e35ee67c3b)

Despite the fact that all the other code there seems to be doing it, just
using set_cpu_cap() in early_init_intel() doesn't actually work.

For CPUs with PKU support, setup_pku() calls get_cpu_cap() after
c->c_init() has set those feature bits. That resets those bits back to what
was queried from the hardware.

Turning the bits off for bad microcode is easy to fix. That can just use
setup_clear_cpu_cap() to force them off for all CPUs.

I was less keen on forcing the feature bits *on* that way, just in case
of inconsistencies. I appreciate that the kernel is going to get this
utterly wrong if CPU features are not consistent, because it has already
applied alternatives by the time secondary CPUs are brought up.

But at least if setup_force_cpu_cap() isn't being used, we might have a
chance of *detecting* the lack of the corresponding bit and either
panicking or refusing to bring the offending CPU online.

So ensure that the appropriate feature bits are set within get_cpu_cap()
regardless of how many extra times it's called.

Fixes: 2961298e ("x86/cpufeatures: Clean up Spectre v2 related CPUID flags")
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/common.c | 21 +++++++++++++++++++++
arch/x86/kernel/cpu/intel.c | 27 ++++++++-------------------
2 files changed, 29 insertions(+), 19 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -718,6 +718,26 @@ static void apply_forced_caps(struct cpu
}
}

+static void init_speculation_control(struct cpuinfo_x86 *c)
+{
+ /*
+ * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support,
+ * and they also have a different bit for STIBP support. Also,
+ * a hypervisor might have set the individual AMD bits even on
+ * Intel CPUs, for finer-grained selection of what's available.
+ *
+ * We use the AMD bits in 0x8000_0008 EBX as the generic hardware
+ * features, which are visible in /proc/cpuinfo and used by the
+ * kernel. So set those accordingly from the Intel bits.
+ */
+ if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
+ set_cpu_cap(c, X86_FEATURE_IBRS);
+ set_cpu_cap(c, X86_FEATURE_IBPB);
+ }
+ if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
+ set_cpu_cap(c, X86_FEATURE_STIBP);
+}
+
void get_cpu_cap(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
@@ -812,6 +832,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
c->x86_capability[CPUID_8000_000A_EDX] = cpuid_edx(0x8000000a);

init_scattered_cpuid_features(c);
+ init_speculation_control(c);
}

static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -140,28 +140,17 @@ static void early_init_intel(struct cpui
rdmsr(MSR_IA32_UCODE_REV, lower_word, c->microcode);
}

- /*
- * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support,
- * and they also have a different bit for STIBP support. Also,
- * a hypervisor might have set the individual AMD bits even on
- * Intel CPUs, for finer-grained selection of what's available.
- */
- if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
- set_cpu_cap(c, X86_FEATURE_IBRS);
- set_cpu_cap(c, X86_FEATURE_IBPB);
- }
- if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
- set_cpu_cap(c, X86_FEATURE_STIBP);
-
/* Now if any of them are set, check the blacklist and clear the lot */
- if ((cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) ||
+ if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) ||
+ cpu_has(c, X86_FEATURE_INTEL_STIBP) ||
+ cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) ||
cpu_has(c, X86_FEATURE_STIBP)) && bad_spectre_microcode(c)) {
pr_warn("Intel Spectre v2 broken microcode detected; disabling Speculation Control\n");
- clear_cpu_cap(c, X86_FEATURE_IBRS);
- clear_cpu_cap(c, X86_FEATURE_IBPB);
- clear_cpu_cap(c, X86_FEATURE_STIBP);
- clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL);
- clear_cpu_cap(c, X86_FEATURE_INTEL_STIBP);
+ setup_clear_cpu_cap(X86_FEATURE_IBRS);
+ setup_clear_cpu_cap(X86_FEATURE_IBPB);
+ setup_clear_cpu_cap(X86_FEATURE_STIBP);
+ setup_clear_cpu_cap(X86_FEATURE_SPEC_CTRL);
+ setup_clear_cpu_cap(X86_FEATURE_INTEL_STIBP);
}

/*



2018-02-09 14:06:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 68/92] x86/spectre: Fix spelling mistake: "vunerable"-> "vulnerable"

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Colin Ian King <[email protected]>


(cherry picked from commit e698dcdfcda41efd0984de539767b4cddd235f1e)

Trivial fix to spelling mistake in pr_err error message text.

Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Cc: Andy Lutomirski <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Woodhouse <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -102,7 +102,7 @@ bool retpoline_module_ok(bool has_retpol
if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline)
return true;

- pr_err("System may be vunerable to spectre v2\n");
+ pr_err("System may be vulnerable to spectre v2\n");
spectre_v2_bad_module = true;
return false;
}



2018-02-09 14:07:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 61/92] x86/usercopy: Replace open coded stac/clac with __uaccess_{begin, end}

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit b5c4ae4f35325d520b230bab6eb3310613b72ac1)

In preparation for converting some __uaccess_begin() instances to
__uaccess_begin_nospec(), make sure all 'from user' uaccess paths are
using the _begin(), _end() helpers rather than open-coded stac() and
clac().

No functional changes.
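
The change is purely mechanical because on x86 the two helpers are currently
just thin wrappers around the same instructions, roughly (a sketch of the
pre-existing definitions in arch/x86/include/asm/uaccess.h, not part of this
diff):

#define __uaccess_begin()	stac()
#define __uaccess_end()		clac()

Funneling every 'from user' path through the helpers is what lets a later
patch in this series substitute __uaccess_begin_nospec() at the call sites
that need the speculation barrier.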

Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Tom Lendacky <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Al Viro <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727416438.33451.17309465232057176966.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/lib/usercopy_32.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -570,12 +570,12 @@ do { \
unsigned long __copy_to_user_ll(void __user *to, const void *from,
unsigned long n)
{
- stac();
+ __uaccess_begin();
if (movsl_is_ok(to, from, n))
__copy_user(to, from, n);
else
n = __copy_user_intel(to, from, n);
- clac();
+ __uaccess_end();
return n;
}
EXPORT_SYMBOL(__copy_to_user_ll);
@@ -627,7 +627,7 @@ EXPORT_SYMBOL(__copy_from_user_ll_nocach
unsigned long __copy_from_user_ll_nocache_nozero(void *to, const void __user *from,
unsigned long n)
{
- stac();
+ __uaccess_begin();
#ifdef CONFIG_X86_INTEL_USERCOPY
if (n > 64 && static_cpu_has(X86_FEATURE_XMM2))
n = __copy_user_intel_nocache(to, from, n);
@@ -636,7 +636,7 @@ unsigned long __copy_from_user_ll_nocach
#else
__copy_user(to, from, n);
#endif
- clac();
+ __uaccess_end();
return n;
}
EXPORT_SYMBOL(__copy_from_user_ll_nocache_nozero);
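For illustration only (a sketch, not code from this patch), every 'from
user' helper now follows the same bracketed shape, which is what lets a
later patch switch a single call site over to the _nospec variant without
touching stac()/clac() directly:

	unsigned long __copy_from_user_ll(void *to, const void __user *from,
					  unsigned long n)
	{
		__uaccess_begin();	/* later: __uaccess_begin_nospec() */
		/* illustrative body; the real helper picks a copy routine */
		__copy_user(to, from, n);
		__uaccess_end();
		return n;
	}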



2018-02-09 14:07:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 48/92] x86/bugs: Drop one "mitigation" from dmesg

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

(cherry picked from commit 55fa19d3e51f33d9cd4056d25836d93abf9438db)

Make

[ 0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline

into

[ 0.031118] Spectre V2: Mitigation: Full generic retpoline

to drop the redundant "mitigation" from the string.

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -90,7 +90,7 @@ static const char *spectre_v2_strings[]
};

#undef pr_fmt
-#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt
+#define pr_fmt(fmt) "Spectre V2 : " fmt

static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
static bool spectre_v2_bad_module;



2018-02-09 14:07:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 60/92] x86: Introduce __uaccess_begin_nospec() and uaccess_try_nospec

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit b3bbfb3fb5d25776b8e3f361d2eedaabb0b496cd)

For __get_user() paths, do not allow the kernel to speculate on the value
of a user controlled pointer. In addition to the 'stac' instruction for
Supervisor Mode Access Protection (SMAP), a barrier_nospec() causes the
access_ok() result to resolve in the pipeline before the CPU might take any
speculative action on the pointer value. Given the cost of 'stac' the
speculation barrier is placed after 'stac' to hopefully overlap the cost of
disabling SMAP with the cost of flushing the instruction pipeline.

Since __get_user is a major kernel interface that deals with user
controlled pointers, the __uaccess_begin_nospec() mechanism will prevent
speculative execution past an access_ok() permission check. While
speculative execution past access_ok() is not enough to lead to a kernel
memory leak, it is a necessary precondition.

To be clear, __uaccess_begin_nospec() is addressing a class of potential
problems near __get_user() usages.

Note, that while the barrier_nospec() in __uaccess_begin_nospec() is used
to protect __get_user(), pointer masking similar to array_index_nospec()
will be used for get_user() since it incorporates a bounds check near the
usage.

uaccess_try_nospec provides the same mechanism for get_user_try.

No functional changes.

Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Andi Kleen <[email protected]>
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Tom Lendacky <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Al Viro <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727415922.33451.5796614273104346583.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/uaccess.h | 9 +++++++++
1 file changed, 9 insertions(+)

--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -123,6 +123,11 @@ extern int __get_user_bad(void);

#define __uaccess_begin() stac()
#define __uaccess_end() clac()
+#define __uaccess_begin_nospec() \
+({ \
+ stac(); \
+ barrier_nospec(); \
+})

/*
* This is a type: either unsigned long, if the argument fits into
@@ -474,6 +479,10 @@ struct __large_struct { unsigned long bu
__uaccess_begin(); \
barrier();

+#define uaccess_try_nospec do { \
+ current->thread.uaccess_err = 0; \
+ __uaccess_begin_nospec(); \
+
#define uaccess_catch(err) \
__uaccess_end(); \
(err) |= (current->thread.uaccess_err ? -EFAULT : 0); \
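As a usage sketch (a simplified __get_user-style sequence, not the actual
implementation, which uses exception-handling asm), the barrier sits
between the access_ok() check and the user load, so the check result is
resolved in the pipeline before any dereference can be speculated:

	if (!access_ok(VERIFY_READ, uptr, sizeof(*uptr)))
		return -EFAULT;
	__uaccess_begin_nospec();	/* stac() + barrier_nospec() */
	val = *uptr;			/* load cannot run ahead of the check */
	__uaccess_end();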



2018-02-09 14:07:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 58/92] x86: Implement array_index_mask_nospec

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit babdde2698d482b6c0de1eab4f697cf5856c5859)

array_index_nospec() uses a mask to sanitize user controllable array
indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask
otherwise. The default array_index_mask_nospec() handles the
carry-bit from the (index - size) result in software.

The x86 array_index_mask_nospec() does the same, but the carry-bit is
handled in the processor CF flag without conditional instructions in the
control flow.

Suggested-by: Linus Torvalds <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/barrier.h | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -23,6 +23,30 @@
#define wmb() asm volatile("sfence" ::: "memory")
#endif

+/**
+ * array_index_mask_nospec() - generate a mask that is ~0UL when the
+ * bounds check succeeds and 0 otherwise
+ * @index: array element index
+ * @size: number of elements in array
+ *
+ * Returns:
+ * 0 - (index < size)
+ */
+static inline unsigned long array_index_mask_nospec(unsigned long index,
+ unsigned long size)
+{
+ unsigned long mask;
+
+ asm ("cmp %1,%2; sbb %0,%0;"
+ :"=r" (mask)
+ :"r"(size),"r" (index)
+ :"cc");
+ return mask;
+}
+
+/* Override the default implementation from linux/nospec.h. */
+#define array_index_mask_nospec array_index_mask_nospec
+
#ifdef CONFIG_X86_PPRO_FENCE
#define dma_rmb() rmb()
#else
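A worked example of the mask generation (illustrative values, not from the
patch): 'cmp %1,%2' computes index - size and sets CF when index < size;
'sbb %0,%0' then yields mask - mask - CF, i.e. ~0UL when the index is in
bounds and 0 when it is not:

	/* index = 1, size = 4: CF set   -> mask = ~0UL (index kept)    */
	/* index = 9, size = 4: CF clear -> mask = 0    (index cleared) */
	mask = array_index_mask_nospec(index, size);
	index &= mask;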



2018-02-09 14:08:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 59/92] x86: Introduce barrier_nospec

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit b3d7ad85b80bbc404635dca80f5b129f6242bc7a)

Rename the open coded form of this instruction sequence from
rdtsc_ordered() into a generic barrier primitive, barrier_nospec().

One of the mitigations for Spectre variant1 vulnerabilities is to fence
speculative execution after successfully validating a bounds check. I.e.
force the result of a bounds check to resolve in the instruction pipeline
to ensure speculative execution honors that result before potentially
operating on out-of-bounds data.

No functional changes.

Suggested-by: Linus Torvalds <[email protected]>
Suggested-by: Andi Kleen <[email protected]>
Suggested-by: Ingo Molnar <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Tom Lendacky <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Al Viro <[email protected]>
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/barrier.h | 4 ++++
arch/x86/include/asm/msr.h | 3 +--
2 files changed, 5 insertions(+), 2 deletions(-)

--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -47,6 +47,10 @@ static inline unsigned long array_index_
/* Override the default implementation from linux/nospec.h. */
#define array_index_mask_nospec array_index_mask_nospec

+/* Prevent speculative execution past this barrier. */
+#define barrier_nospec() alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC, \
+ "lfence", X86_FEATURE_LFENCE_RDTSC)
+
#ifdef CONFIG_X86_PPRO_FENCE
#define dma_rmb() rmb()
#else
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -188,8 +188,7 @@ static __always_inline unsigned long lon
* that some other imaginary CPU is updating continuously with a
* time stamp.
*/
- alternative_2("", "mfence", X86_FEATURE_MFENCE_RDTSC,
- "lfence", X86_FEATURE_LFENCE_RDTSC);
+ barrier_nospec();
return rdtsc();
}
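As a caller sketch (hypothetical code, not part of this patch), the
intended placement of barrier_nospec() is directly after a successful
bounds check, so the check retires before any dependent load:

	if (index < size) {
		barrier_nospec();	/* check resolves before the load */
		val = array[index];	/* illustrative dependent load */
	}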




2018-02-09 14:08:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 65/92] vfs, fdtable: Prevent bounds-check bypass via speculative execution

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit 56c30ba7b348b90484969054d561f711ba196507)

'fd' is a user controlled value that is used as a data dependency to
read from the 'fdt->fd' array. In order to avoid potential leaks of
kernel memory values, block speculative execution of the instruction
stream that could issue reads based on an invalid 'file *' returned from
__fcheck_files.

Co-developed-by: Elena Reshetova <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Al Viro <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/fdtable.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -9,6 +9,7 @@
#include <linux/compiler.h>
#include <linux/spinlock.h>
#include <linux/rcupdate.h>
+#include <linux/nospec.h>
#include <linux/types.h>
#include <linux/init.h>
#include <linux/fs.h>
@@ -81,8 +82,10 @@ static inline struct file *__fcheck_file
{
struct fdtable *fdt = rcu_dereference_raw(files->fdt);

- if (fd < fdt->max_fds)
+ if (fd < fdt->max_fds) {
+ fd = array_index_nospec(fd, fdt->max_fds);
return rcu_dereference_raw(fdt->fd[fd]);
+ }
return NULL;
}
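The sequence being hardened is the usual two-load Spectre-v1 gadget
(sketch with hypothetical follow-on code): a user-controlled 'fd' selects
a 'struct file *', and a later dereference of that pointer leaves a cache
footprint. Clamping 'fd' keeps even the speculative load inside fdt->fd[]:

	/* sketch: fd comes straight from userspace */
	file = __fcheck_files(files, fd);	/* first, now-bounded load */
	if (file)
		mode = file->f_mode;	/* hypothetical dependent access */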




2018-02-09 14:08:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 09/92] powerpc/powernv: Check device-tree for RFI flush settings

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Oliver O'Halloran <[email protected]>

commit 6e032b350cd1fdb830f18f8320ef0e13b4e24094 upstream.

New device-tree properties are available which tell the hypervisor
settings related to the RFI flush. Use them to determine the
appropriate flush instruction to use, and whether the flush is
required.

Signed-off-by: Oliver O'Halloran <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/platforms/powernv/setup.c | 50 +++++++++++++++++++++++++++++++++
1 file changed, 50 insertions(+)

--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -35,13 +35,63 @@
#include <asm/opal.h>
#include <asm/kexec.h>
#include <asm/smp.h>
+#include <asm/tm.h>
+#include <asm/setup.h>

#include "powernv.h"

+static void pnv_setup_rfi_flush(void)
+{
+ struct device_node *np, *fw_features;
+ enum l1d_flush_type type;
+ int enable;
+
+ /* Default to fallback in case fw-features are not available */
+ type = L1D_FLUSH_FALLBACK;
+ enable = 1;
+
+ np = of_find_node_by_name(NULL, "ibm,opal");
+ fw_features = of_get_child_by_name(np, "fw-features");
+ of_node_put(np);
+
+ if (fw_features) {
+ np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");
+ if (np && of_property_read_bool(np, "enabled"))
+ type = L1D_FLUSH_MTTRIG;
+
+ of_node_put(np);
+
+ np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0");
+ if (np && of_property_read_bool(np, "enabled"))
+ type = L1D_FLUSH_ORI;
+
+ of_node_put(np);
+
+ /* Enable unless firmware says NOT to */
+ enable = 2;
+ np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0");
+ if (np && of_property_read_bool(np, "disabled"))
+ enable--;
+
+ of_node_put(np);
+
+ np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1");
+ if (np && of_property_read_bool(np, "disabled"))
+ enable--;
+
+ of_node_put(np);
+ of_node_put(fw_features);
+ }
+
+ setup_rfi_flush(type, enable > 0);
+}
+
static void __init pnv_setup_arch(void)
{
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);

+ pnv_setup_rfi_flush();
+
/* Initialize SMP */
pnv_smp_init();




2018-02-09 14:08:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 57/92] array_index_nospec: Sanitize speculative array de-references

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dan Williams <[email protected]>


(cherry picked from commit f3804203306e098dae9ca51540fcd5eb700d7f40)

array_index_nospec() is proposed as a generic mechanism to mitigate
against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary
checks via speculative execution. The array_index_nospec()
implementation is expected to be safe for current generation CPUs across
multiple architectures (ARM, x86).

Based on an original implementation by Linus Torvalds, tweaked to remove
speculative flows by Alexei Starovoitov, and tweaked again by Linus to
introduce an x86 assembly implementation for the mask generation.

Co-developed-by: Linus Torvalds <[email protected]>
Co-developed-by: Alexei Starovoitov <[email protected]>
Suggested-by: Cyril Novikov <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
include/linux/nospec.h | 72 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 72 insertions(+)
create mode 100644 include/linux/nospec.h

--- /dev/null
+++ b/include/linux/nospec.h
@@ -0,0 +1,72 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright(c) 2018 Linus Torvalds. All rights reserved.
+// Copyright(c) 2018 Alexei Starovoitov. All rights reserved.
+// Copyright(c) 2018 Intel Corporation. All rights reserved.
+
+#ifndef _LINUX_NOSPEC_H
+#define _LINUX_NOSPEC_H
+
+/**
+ * array_index_mask_nospec() - generate a ~0 mask when index < size, 0 otherwise
+ * @index: array element index
+ * @size: number of elements in array
+ *
+ * When @index is out of bounds (@index >= @size), the sign bit will be
+ * set. Extend the sign bit to all bits and invert, giving a result of
+ * zero for an out of bounds index, or ~0 if within bounds [0, @size).
+ */
+#ifndef array_index_mask_nospec
+static inline unsigned long array_index_mask_nospec(unsigned long index,
+ unsigned long size)
+{
+ /*
+ * Warn developers about inappropriate array_index_nospec() usage.
+ *
+ * Even if the CPU speculates past the WARN_ONCE branch, the
+ * sign bit of @index is taken into account when generating the
+ * mask.
+ *
+ * This warning is compiled out when the compiler can infer that
+ * @index and @size are less than LONG_MAX.
+ */
+ if (WARN_ONCE(index > LONG_MAX || size > LONG_MAX,
+ "array_index_nospec() limited to range of [0, LONG_MAX]\n"))
+ return 0;
+
+ /*
+ * Always calculate and emit the mask even if the compiler
+ * thinks the mask is not needed. The compiler does not take
+ * into account the value of @index under speculation.
+ */
+ OPTIMIZER_HIDE_VAR(index);
+ return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
+}
+#endif
+
+/*
+ * array_index_nospec - sanitize an array index after a bounds check
+ *
+ * For a code sequence like:
+ *
+ * if (index < size) {
+ * index = array_index_nospec(index, size);
+ * val = array[index];
+ * }
+ *
+ * ...if the CPU speculates past the bounds check then
+ * array_index_nospec() will clamp the index within the range of [0,
+ * size).
+ */
+#define array_index_nospec(index, size) \
+({ \
+ typeof(index) _i = (index); \
+ typeof(size) _s = (size); \
+ unsigned long _mask = array_index_mask_nospec(_i, _s); \
+ \
+ BUILD_BUG_ON(sizeof(_i) > sizeof(long)); \
+ BUILD_BUG_ON(sizeof(_s) > sizeof(long)); \
+ \
+ _i &= _mask; \
+ _i; \
+})
+#endif /* _LINUX_NOSPEC_H */
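Plugging small numbers into the generic mask (illustrative, assuming
BITS_PER_LONG == 64) shows how it clamps an out-of-bounds index:

	/* index = 1, size = 4: 1 | (4 - 1 - 1) = 3, sign bit clear,
	 *   ~3 >> 63 (arithmetic) = ~0UL  -> index passes through
	 * index = 5, size = 4: 4 - 1 - 5 underflows, sign bit set,
	 *   ~(...) >> 63 = 0              -> index is forced to 0
	 */
	index = array_index_nospec(index, size);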



2018-02-09 14:08:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 56/92] Documentation: Document array_index_nospec

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Mark Rutland <[email protected]>


(cherry picked from commit f84a56f73dddaeac1dba8045b007f742f61cd2da)

Document the rationale and usage of the new array_index_nospec() helper.

Signed-off-by: Mark Rutland <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Cc: [email protected]
Cc: Jonathan Corbet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Documentation/speculation.txt | 90 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 90 insertions(+)
create mode 100644 Documentation/speculation.txt

--- /dev/null
+++ b/Documentation/speculation.txt
@@ -0,0 +1,90 @@
+This document explains potential effects of speculation, and how undesirable
+effects can be mitigated portably using common APIs.
+
+===========
+Speculation
+===========
+
+To improve performance and minimize average latencies, many contemporary CPUs
+employ speculative execution techniques such as branch prediction, performing
+work which may be discarded at a later stage.
+
+Typically speculative execution cannot be observed from architectural state,
+such as the contents of registers. However, in some cases it is possible to
+observe its impact on microarchitectural state, such as the presence or
+absence of data in caches. Such state may form side-channels which can be
+observed to extract secret information.
+
+For example, in the presence of branch prediction, it is possible for bounds
+checks to be ignored by code which is speculatively executed. Consider the
+following code:
+
+ int load_array(int *array, unsigned int index)
+ {
+ if (index >= MAX_ARRAY_ELEMS)
+ return 0;
+ else
+ return array[index];
+ }
+
+Which, on arm64, may be compiled to an assembly sequence such as:
+
+ CMP <index>, #MAX_ARRAY_ELEMS
+ B.LT less
+ MOV <returnval>, #0
+ RET
+ less:
+ LDR <returnval>, [<array>, <index>]
+ RET
+
+It is possible that a CPU mis-predicts the conditional branch, and
+speculatively loads array[index], even if index >= MAX_ARRAY_ELEMS. This
+value will subsequently be discarded, but the speculated load may affect
+microarchitectural state which can be subsequently measured.
+
+More complex sequences involving multiple dependent memory accesses may
+result in sensitive information being leaked. Consider the following
+code, building on the prior example:
+
+ int load_dependent_arrays(int *arr1, int *arr2, int index)
+ {
+ int val1, val2,
+
+ val1 = load_array(arr1, index);
+ val2 = load_array(arr2, val1);
+
+ return val2;
+ }
+
+Under speculation, the first call to load_array() may return the value
+of an out-of-bounds address, while the second call will influence
+microarchitectural state dependent on this value. This may provide an
+arbitrary read primitive.
+
+====================================
+Mitigating speculation side-channels
+====================================
+
+The kernel provides a generic API to ensure that bounds checks are
+respected even under speculation. Architectures which are affected by
+speculation-based side-channels are expected to implement these
+primitives.
+
+The array_index_nospec() helper in <linux/nospec.h> can be used to
+prevent information from being leaked via side-channels.
+
+A call to array_index_nospec(index, size) returns a sanitized index
+value that is bounded to [0, size) even under cpu speculation
+conditions.
+
+This can be used to protect the earlier load_array() example:
+
+ int load_array(int *array, unsigned int index)
+ {
+ if (index >= MAX_ARRAY_ELEMS)
+ return 0;
+ else {
+ index = array_index_nospec(index, MAX_ARRAY_ELEMS);
+ return array[index];
+ }
+ }



2018-02-09 14:08:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 47/92] x86/nospec: Fix header guards names

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

(cherry picked from commit 7a32fc51ca938e67974cbb9db31e1a43f98345a9)

... to adhere to the _ASM_X86_ naming scheme.

No functional change.

Signed-off-by: Borislav Petkov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/nospec-branch.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -1,7 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */

-#ifndef __NOSPEC_BRANCH_H__
-#define __NOSPEC_BRANCH_H__
+#ifndef _ASM_X86_NOSPEC_BRANCH_H_
+#define _ASM_X86_NOSPEC_BRANCH_H_

#include <asm/alternative.h>
#include <asm/alternative-asm.h>
@@ -232,4 +232,4 @@ static inline void indirect_branch_predi
}

#endif /* __ASSEMBLY__ */
-#endif /* __NOSPEC_BRANCH_H__ */
+#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */



2018-02-09 14:09:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 14/92] ASoC: pcm512x: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jesse Chan <[email protected]>

commit 0cab20cec0b663b7be8e2be5998d5a4113647f86 upstream.

This change resolves a new compile-time warning
when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in sound/soc/codecs/snd-soc-pcm512x-spi.o
see include/linux/module.h for more information

This adds the license as "GPL v2", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.

Signed-off-by: Jesse Chan <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
sound/soc/codecs/pcm512x-spi.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/sound/soc/codecs/pcm512x-spi.c
+++ b/sound/soc/codecs/pcm512x-spi.c
@@ -70,3 +70,7 @@ static struct spi_driver pcm512x_spi_dri
};

module_spi_driver(pcm512x_spi_driver);
+
+MODULE_DESCRIPTION("ASoC PCM512x codec driver - SPI");
+MODULE_AUTHOR("Mark Brown <[email protected]>");
+MODULE_LICENSE("GPL v2");



2018-02-09 14:09:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 17/92] kaiser: allocate pgd with order 0 when pti=off

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <[email protected]>

The 4.9.77 version of "x86/pti/efi: broken conversion from efi to kernel
page table" looked nicer than the 4.4.112 version, but was suboptimal on
machines booted with "pti=off" (or on AMD machines): it allocated pgd
with an order 1 page whatever the setting of kaiser_enabled.

Fix that by moving the definition of PGD_ALLOCATION_ORDER from
asm/pgalloc.h to asm/pgtable.h, which already defines kaiser_enabled.

Fixes: 1b92c48a2eeb ("x86/pti/efi: broken conversion from efi to kernel page table")
Reviewed-by: Pavel Tatashin <[email protected]>
Cc: Steven Sistare <[email protected]>
Cc: Jiri Kosina <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/include/asm/pgalloc.h | 11 -----------
arch/x86/include/asm/pgtable.h | 6 ++++++
2 files changed, 6 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -27,17 +27,6 @@ static inline void paravirt_release_pud(
*/
extern gfp_t __userpte_alloc_gfp;

-#ifdef CONFIG_PAGE_TABLE_ISOLATION
-/*
- * Instead of one PGD, we acquire two PGDs. Being order-1, it is
- * both 8k in size and 8k-aligned. That lets us just flip bit 12
- * in a pointer to swap between the two 4k halves.
- */
-#define PGD_ALLOCATION_ORDER 1
-#else
-#define PGD_ALLOCATION_ORDER 0
-#endif
-
/*
* Allocate and free page tables.
*/
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -20,9 +20,15 @@

#ifdef CONFIG_PAGE_TABLE_ISOLATION
extern int kaiser_enabled;
+/*
+ * Instead of one PGD, we acquire two PGDs. Being order-1, it is
+ * both 8k in size and 8k-aligned. That lets us just flip bit 12
+ * in a pointer to swap between the two 4k halves.
+ */
#else
#define kaiser_enabled 0
#endif
+#define PGD_ALLOCATION_ORDER kaiser_enabled

void ptdump_walk_pgd_level(struct seq_file *m, pgd_t *pgd);
void ptdump_walk_pgd_level_checkwx(void);
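In effect, the pgd allocation (something like the line below, paraphrased
from pgd_alloc(); treat it as a sketch rather than the exact source) now
asks for an order-1 page only when kaiser is actually enabled:

	pgd = (pgd_t *)__get_free_pages(PGALLOC_GFP, PGD_ALLOCATION_ORDER);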



2018-02-09 14:09:13

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 16/92] x86/pti: Make unpoison of pgd for trusted boot work for real

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Dave Hansen <[email protected]>

commit 445b69e3b75e42362a5bdc13c8b8f61599e2228a upstream

The inital fix for trusted boot and PTI potentially misses the pgd clearing
if pud_alloc() sets a PGD. It probably works in *practice* because for two
adjacent calls to map_tboot_page() that share a PGD entry, the first will
clear NX, *then* allocate and set the PGD (without NX clear). The second
call will *not* allocate but will clear the NX bit.

Defer the NX clearing to a point after it is known that all top-level
allocations have occurred. Add a comment to clarify why.

[ tglx: Massaged changelog ]

[ hughd notes: I have not tested tboot, but this looks to me as necessary
and as safe in old-Kaiser backports as it is upstream; I'm not submitting
the commit-to-be-fixed 262b6b30087, since it was undone by 445b69e3b75e,
and makes conflict trouble because of 5-level's p4d versus 4-level's pgd.]

Fixes: 262b6b30087 ("x86/tboot: Unbreak tboot with PTI enabled")
Signed-off-by: Dave Hansen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Andrea Arcangeli <[email protected]>
Cc: Jon Masters <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Cc: Jiri Kosina <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/tboot.c | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -134,6 +134,16 @@ static int map_tboot_page(unsigned long
return -1;
set_pte_at(&tboot_mm, vaddr, pte, pfn_pte(pfn, prot));
pte_unmap(pte);
+
+ /*
+ * PTI poisons low addresses in the kernel page tables in the
+ * name of making them unusable for userspace. To execute
+ * code at such a low address, the poison must be cleared.
+ *
+ * Note: 'pgd' actually gets set in pud_alloc().
+ */
+ pgd->pgd &= ~_PAGE_NX;
+
return 0;
}




2018-02-09 14:09:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 18/92] serial: core: mark port as initialized after successful IRQ change

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Sebastian Andrzej Siewior <[email protected]>

commit 44117a1d1732c513875d5a163f10d9adbe866c08 upstream.

setserial changes the IRQ via uart_set_info(). It invokes
uart_shutdown() which frees the currently used IRQ and clears
TTY_PORT_INITIALIZED. It then updates the IRQ number and invokes
uart_startup() before returning to the caller, leaving
TTY_PORT_INITIALIZED cleared.

The next open will crash with
| list_add double add: new=ffffffff839fcc98, prev=ffffffff839fcc98, next=ffffffff839fcc98.
since the close from the IOCTL won't free the IRQ (and clean the list)
due to the TTY_PORT_INITIALIZED check in uart_shutdown().

There is the same pattern in uart_do_autoconfig() and I *think* it also
needs to set TTY_PORT_INITIALIZED there.
Is there a reason why uart_startup() does not set the flag by itself
after the IRQ has been acquired (since it is cleared in uart_shutdown)?

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/tty/serial/serial_core.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -965,6 +965,8 @@ static int uart_set_info(struct tty_stru
}
} else {
retval = uart_startup(tty, state, 1);
+ if (retval == 0)
+ tty_port_set_initialized(port, true);
if (retval > 0)
retval = 0;
}



2018-02-09 14:10:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 45/92] x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit 20ffa1caecca4db8f79fe665acdeaa5af815a24d)

Expose indirect_branch_prediction_barrier() for use in subsequent patches.

[ tglx: Add IBPB status to spectre_v2 sysfs file ]

Co-developed-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: KarimAllah Ahmed <[email protected]>
Signed-off-by: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 2 ++
arch/x86/include/asm/nospec-branch.h | 13 +++++++++++++
arch/x86/kernel/cpu/bugs.c | 10 +++++++++-
3 files changed, 24 insertions(+), 1 deletion(-)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -202,6 +202,8 @@
/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
#define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */

+#define X86_FEATURE_IBPB ( 7*32+21) /* Indirect Branch Prediction Barrier enabled*/
+
/* Virtualization flags: Linux defined, word 8 */
#define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */
#define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -218,5 +218,18 @@ static inline void vmexit_fill_RSB(void)
#endif
}

+static inline void indirect_branch_prediction_barrier(void)
+{
+ asm volatile(ALTERNATIVE("",
+ "movl %[msr], %%ecx\n\t"
+ "movl %[val], %%eax\n\t"
+ "movl $0, %%edx\n\t"
+ "wrmsr",
+ X86_FEATURE_IBPB)
+ : : [msr] "i" (MSR_IA32_PRED_CMD),
+ [val] "i" (PRED_CMD_IBPB)
+ : "eax", "ecx", "edx", "memory");
+}
+
#endif /* __ASSEMBLY__ */
#endif /* __NOSPEC_BRANCH_H__ */
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -262,6 +262,13 @@ retpoline_auto:
setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
pr_info("Filling RSB on context switch\n");
}
+
+ /* Initialize Indirect Branch Prediction Barrier if supported */
+ if (boot_cpu_has(X86_FEATURE_SPEC_CTRL) ||
+ boot_cpu_has(X86_FEATURE_AMD_PRED_CMD)) {
+ setup_force_cpu_cap(X86_FEATURE_IBPB);
+ pr_info("Enabling Indirect Branch Prediction Barrier\n");
+ }
}

#undef pr_fmt
@@ -291,7 +298,8 @@ ssize_t cpu_show_spectre_v2(struct devic
if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
return sprintf(buf, "Not affected\n");

- return sprintf(buf, "%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+ return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+ boot_cpu_has(X86_FEATURE_IBPB) ? ", IPBP" : "",
spectre_v2_bad_module ? " - vulnerable module loaded" : "");
}
#endif
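The ALTERNATIVE-patched asm above boils down to a feature-gated MSR write;
a rough C equivalent (sketch only, ignoring the runtime patching the
ALTERNATIVE mechanism provides) would be:

	if (boot_cpu_has(X86_FEATURE_IBPB))
		wrmsrl(MSR_IA32_PRED_CMD, PRED_CMD_IBPB);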



2018-02-09 14:11:37

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 39/92] x86/cpufeatures: Add CPUID_7_EDX CPUID leaf

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit 95ca0ee8636059ea2800dfbac9ecac6212d6b38f)

This is a pure feature bits leaf. There are two AVX512 feature bits in it
already which were handled as scattered bits, and three more from this leaf
are going to be added for speculation control features.

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/cpufeature.h | 7 +++++--
arch/x86/include/asm/cpufeatures.h | 10 ++++++----
arch/x86/include/asm/disabled-features.h | 3 ++-
arch/x86/include/asm/required-features.h | 3 ++-
arch/x86/kernel/cpu/common.c | 1 +
arch/x86/kernel/cpu/scattered.c | 2 --
6 files changed, 16 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -28,6 +28,7 @@ enum cpuid_leafs
CPUID_8000_000A_EDX,
CPUID_7_ECX,
CPUID_8000_0007_EBX,
+ CPUID_7_EDX,
};

#ifdef CONFIG_X86_FEATURE_NAMES
@@ -78,8 +79,9 @@ extern const char * const x86_bug_flags[
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) || \
+ CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) || \
REQUIRED_MASK_CHECK || \
- BUILD_BUG_ON_ZERO(NCAPINTS != 18))
+ BUILD_BUG_ON_ZERO(NCAPINTS != 19))

#define DISABLED_MASK_BIT_SET(feature_bit) \
( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 0, feature_bit) || \
@@ -100,8 +102,9 @@ extern const char * const x86_bug_flags[
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) || \
CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) || \
+ CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) || \
DISABLED_MASK_CHECK || \
- BUILD_BUG_ON_ZERO(NCAPINTS != 18))
+ BUILD_BUG_ON_ZERO(NCAPINTS != 19))

#define cpu_has(c, bit) \
(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 : \
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -12,7 +12,7 @@
/*
* Defines x86 CPU feature bits
*/
-#define NCAPINTS 18 /* N 32-bit words worth of info */
+#define NCAPINTS 19 /* N 32-bit words worth of info */
#define NBUGINTS 1 /* N 32-bit bug flags */

/*
@@ -197,9 +197,7 @@
#define X86_FEATURE_RETPOLINE ( 7*32+12) /* Generic Retpoline mitigation for Spectre variant 2 */
#define X86_FEATURE_RETPOLINE_AMD ( 7*32+13) /* AMD Retpoline mitigation for Spectre variant 2 */

-#define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */
-#define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */
-#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* Fill RSB on context switches */
+#define X86_FEATURE_RSB_CTXSW ( 7*32+19) /* Fill RSB on context switches */

/* Because the ALTERNATIVE scheme is for members of the X86_FEATURE club... */
#define X86_FEATURE_KAISER ( 7*32+31) /* CONFIG_PAGE_TABLE_ISOLATION w/o nokaiser */
@@ -295,6 +293,10 @@
#define X86_FEATURE_SUCCOR (17*32+1) /* Uncorrectable error containment and recovery */
#define X86_FEATURE_SMCA (17*32+3) /* Scalable MCA */

+/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
+#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
+#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+
/*
* BUG word(s)
*/
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -59,6 +59,7 @@
#define DISABLED_MASK15 0
#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE)
#define DISABLED_MASK17 0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
+#define DISABLED_MASK18 0
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)

#endif /* _ASM_X86_DISABLED_FEATURES_H */
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -100,6 +100,7 @@
#define REQUIRED_MASK15 0
#define REQUIRED_MASK16 0
#define REQUIRED_MASK17 0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
+#define REQUIRED_MASK18 0
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)

#endif /* _ASM_X86_REQUIRED_FEATURES_H */
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -737,6 +737,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
c->x86_capability[CPUID_7_0_EBX] = ebx;
c->x86_capability[CPUID_7_ECX] = ecx;
+ c->x86_capability[CPUID_7_EDX] = edx;
}

/* Extended state features: level 0x0000000d */
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -31,8 +31,6 @@ void init_scattered_cpuid_features(struc
const struct cpuid_bit *cb;

static const struct cpuid_bit cpuid_bits[] = {
- { X86_FEATURE_AVX512_4VNNIW, CR_EDX, 2, 0x00000007, 0 },
- { X86_FEATURE_AVX512_4FMAPS, CR_EDX, 3, 0x00000007, 0 },
{ X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006, 0 },
{ X86_FEATURE_EPB, CR_ECX, 3, 0x00000006, 0 },
{ X86_FEATURE_HW_PSTATE, CR_EDX, 7, 0x80000007, 0 },
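With word 18 populated from CPUID(7).EDX in get_cpu_cap(), a feature test
such as the sketch below becomes a plain bit test against
c->x86_capability[CPUID_7_EDX] instead of a scattered-bits lookup:

	/* sketch: AVX512_4VNNIW is now word 18, bit 2 */
	if (boot_cpu_has(X86_FEATURE_AVX512_4VNNIW))
		pr_info("AVX512_4VNNIW supported\n");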



2018-02-09 14:11:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit fec9434a12f38d3aeafeb75711b71d8a1fdef621)

Also, for CPUs which don't speculate at all, don't report that they're
vulnerable to the Spectre variants either.

Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it
for now, even though that could be done with a simple comparison, on the
assumption that we'll have more to add.

Based on suggestions from Dave Hansen and Alan Cox.

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/common.c | 48 ++++++++++++++++++++++++++++++++++++++-----
1 file changed, 43 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -44,6 +44,8 @@
#include <asm/pat.h>
#include <asm/microcode.h>
#include <asm/microcode_intel.h>
+#include <asm/intel-family.h>
+#include <asm/cpu_device_id.h>

#ifdef CONFIG_X86_LOCAL_APIC
#include <asm/uv/uv.h>
@@ -838,6 +840,41 @@ static void identify_cpu_without_cpuid(s
#endif
}

+static const __initdata struct x86_cpu_id cpu_no_speculation[] = {
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CEDARVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_CLOVERVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_LINCROFT, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PENWELL, X86_FEATURE_ANY },
+ { X86_VENDOR_INTEL, 6, INTEL_FAM6_ATOM_PINEVIEW, X86_FEATURE_ANY },
+ { X86_VENDOR_CENTAUR, 5 },
+ { X86_VENDOR_INTEL, 5 },
+ { X86_VENDOR_NSC, 5 },
+ { X86_VENDOR_ANY, 4 },
+ {}
+};
+
+static const __initdata struct x86_cpu_id cpu_no_meltdown[] = {
+ { X86_VENDOR_AMD },
+ {}
+};
+
+static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
+{
+ u64 ia32_cap = 0;
+
+ if (x86_match_cpu(cpu_no_meltdown))
+ return false;
+
+ if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
+ rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+ /* Rogue Data Cache Load? No! */
+ if (ia32_cap & ARCH_CAP_RDCL_NO)
+ return false;
+
+ return true;
+}
+
/*
* Do minimum CPU detection early.
* Fields really needed: vendor, cpuid_level, family, model, mask,
@@ -884,11 +921,12 @@ static void __init early_identify_cpu(st

setup_force_cpu_cap(X86_FEATURE_ALWAYS);

- if (c->x86_vendor != X86_VENDOR_AMD)
- setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
-
- setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
- setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+ if (!x86_match_cpu(cpu_no_speculation)) {
+ if (cpu_vulnerable_to_meltdown(c))
+ setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+ setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+ }

fpu__init_system(c);




2018-02-09 14:11:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 41/92] x86/cpufeatures: Add AMD feature bits for Speculation Control

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit 5d10cbc91d9eb5537998b65608441b592eec65e7)

AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel.
See http://lkml.kernel.org/r/[email protected]

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 3 +++
1 file changed, 3 insertions(+)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -258,6 +258,9 @@
/* AMD-defined CPU features, CPUID level 0x80000008 (ebx), word 13 */
#define X86_FEATURE_CLZERO (13*32+0) /* CLZERO instruction */
#define X86_FEATURE_IRPERF (13*32+1) /* Instructions Retired Count */
+#define X86_FEATURE_AMD_PRED_CMD (13*32+12) /* Prediction Command MSR (AMD) */
+#define X86_FEATURE_AMD_SPEC_CTRL (13*32+14) /* Speculation Control MSR only (AMD) */
+#define X86_FEATURE_AMD_STIBP (13*32+15) /* Single Thread Indirect Branch Predictors (AMD) */

/* Thermal and Power Management Leaf, CPUID level 0x00000006 (eax), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */



2018-02-09 14:12:09

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 40/92] x86/cpufeatures: Add Intel feature bits for Speculation Control

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit fc67dd70adb711a45d2ef34e12d1a8be75edde61)

Add three feature bits exposed by new microcode on Intel CPUs for
speculation control.

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Reviewed-by: Borislav Petkov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 3 +++
1 file changed, 3 insertions(+)

--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -296,6 +296,9 @@
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
#define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
#define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_SPEC_CTRL (18*32+26) /* Speculation Control (IBRS + IBPB) */
+#define X86_FEATURE_STIBP (18*32+27) /* Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */

/*
* BUG word(s)



2018-02-09 14:12:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 12/92] auxdisplay: img-ascii-lcd: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jesse Chan <[email protected]>

commit 09c479f7f1fbfaf848e5813996793966cd50be81 upstream.

This change resolves a new compile-time warning
when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in drivers/auxdisplay/img-ascii-lcd.o
see include/linux/module.h for more information

This adds the license as "GPL", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.

Signed-off-by: Jesse Chan <[email protected]>
Signed-off-by: Arnd Bergmann <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/auxdisplay/img-ascii-lcd.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/auxdisplay/img-ascii-lcd.c
+++ b/drivers/auxdisplay/img-ascii-lcd.c
@@ -442,3 +442,7 @@ static struct platform_driver img_ascii_
.remove = img_ascii_lcd_remove,
};
module_platform_driver(img_ascii_lcd_driver);
+
+MODULE_DESCRIPTION("Imagination Technologies ASCII LCD Display");
+MODULE_AUTHOR("Paul Burton <[email protected]>");
+MODULE_LICENSE("GPL");



2018-02-09 14:12:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 42/92] x86/msr: Add definitions for new speculation control MSRs

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: David Woodhouse <[email protected]>

(cherry picked from commit 1e340c60d0dd3ae07b5bedc16a0469c14b9f3410)

Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES.

See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf

Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Greg Kroah-Hartman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/msr-index.h | 12 ++++++++++++
1 file changed, 12 insertions(+)

--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -37,6 +37,13 @@
#define EFER_FFXSR (1<<_EFER_FFXSR)

/* Intel MSRs. Some also available on other CPUs */
+#define MSR_IA32_SPEC_CTRL 0x00000048 /* Speculation Control */
+#define SPEC_CTRL_IBRS (1 << 0) /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_STIBP (1 << 1) /* Single Thread Indirect Branch Predictors */
+
+#define MSR_IA32_PRED_CMD 0x00000049 /* Prediction Command */
+#define PRED_CMD_IBPB (1 << 0) /* Indirect Branch Prediction Barrier */
+
#define MSR_IA32_PERFCTR0 0x000000c1
#define MSR_IA32_PERFCTR1 0x000000c2
#define MSR_FSB_FREQ 0x000000cd
@@ -50,6 +57,11 @@
#define SNB_C3_AUTO_UNDEMOTE (1UL << 28)

#define MSR_MTRRcap 0x000000fe
+
+#define MSR_IA32_ARCH_CAPABILITIES 0x0000010a
+#define ARCH_CAP_RDCL_NO (1 << 0) /* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL (1 << 1) /* Enhanced IBRS support */
+
#define MSR_IA32_BBL_CR_CTL 0x00000119
#define MSR_IA32_BBL_CR_CTL3 0x0000011e
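A consumer sketch (mirroring how a later patch in this series uses these
definitions, but not code from this patch): read ARCH_CAPABILITIES once if
it is enumerated, then test the individual bits:

	u64 ia32_cap = 0;

	if (boot_cpu_has(X86_FEATURE_ARCH_CAPABILITIES))
		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
	if (ia32_cap & ARCH_CAP_RDCL_NO)
		pr_info("CPU reports it is not susceptible to Meltdown\n");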




2018-02-09 14:12:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 38/92] module/retpoline: Warn about missing retpoline in module

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andi Kleen <[email protected]>

(cherry picked from commit caf7501a1b4ec964190f31f9c3f163de252273b8)

There's a risk that a kernel which has full retpoline mitigations becomes
vulnerable when a module gets loaded that hasn't been compiled with the
right compiler or the right option.

To enable detection of that mismatch at module load time, add a module info
string "retpoline" at build time when the module was compiled with
retpoline support. This only covers compiled C source, but assembler source
or prebuilt object files are not checked.

If a retpoline enabled kernel detects a non retpoline protected module at
load time, print a warning and report it in the sysfs vulnerability file.

[ tglx: Massaged changelog ]

Signed-off-by: Andi Kleen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kernel/cpu/bugs.c | 17 ++++++++++++++++-
include/linux/module.h | 9 +++++++++
kernel/module.c | 11 +++++++++++
scripts/mod/modpost.c | 9 +++++++++
4 files changed, 45 insertions(+), 1 deletion(-)

--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -10,6 +10,7 @@
#include <linux/init.h>
#include <linux/utsname.h>
#include <linux/cpu.h>
+#include <linux/module.h>

#include <asm/nospec-branch.h>
#include <asm/cmdline.h>
@@ -92,6 +93,19 @@ static const char *spectre_v2_strings[]
#define pr_fmt(fmt) "Spectre V2 mitigation: " fmt

static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
+static bool spectre_v2_bad_module;
+
+#ifdef RETPOLINE
+bool retpoline_module_ok(bool has_retpoline)
+{
+ if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline)
+ return true;
+
+ pr_err("System may be vunerable to spectre v2\n");
+ spectre_v2_bad_module = true;
+ return false;
+}
+#endif

static void __init spec2_print_if_insecure(const char *reason)
{
@@ -277,6 +291,7 @@ ssize_t cpu_show_spectre_v2(struct devic
if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
return sprintf(buf, "Not affected\n");

- return sprintf(buf, "%s\n", spectre_v2_strings[spectre_v2_enabled]);
+ return sprintf(buf, "%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+ spectre_v2_bad_module ? " - vulnerable module loaded" : "");
}
#endif
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -791,6 +791,15 @@ static inline void module_bug_finalize(c
static inline void module_bug_cleanup(struct module *mod) {}
#endif /* CONFIG_GENERIC_BUG */

+#ifdef RETPOLINE
+extern bool retpoline_module_ok(bool has_retpoline);
+#else
+static inline bool retpoline_module_ok(bool has_retpoline)
+{
+ return true;
+}
+#endif
+
#ifdef CONFIG_MODULE_SIG
static inline bool module_sig_ok(struct module *module)
{
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2817,6 +2817,15 @@ static int check_modinfo_livepatch(struc
}
#endif /* CONFIG_LIVEPATCH */

+static void check_modinfo_retpoline(struct module *mod, struct load_info *info)
+{
+ if (retpoline_module_ok(get_modinfo(info, "retpoline")))
+ return;
+
+ pr_warn("%s: loading module not compiled with retpoline compiler.\n",
+ mod->name);
+}
+
/* Sets info->hdr and info->len. */
static int copy_module_from_user(const void __user *umod, unsigned long len,
struct load_info *info)
@@ -2969,6 +2978,8 @@ static int check_modinfo(struct module *
add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);
}

+ check_modinfo_retpoline(mod, info);
+
if (get_modinfo(info, "staging")) {
add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
pr_warn("%s: module is from the staging directory, the quality "
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -2130,6 +2130,14 @@ static void add_intree_flag(struct buffe
buf_printf(b, "\nMODULE_INFO(intree, \"Y\");\n");
}

+/* Cannot check for assembler */
+static void add_retpoline(struct buffer *b)
+{
+ buf_printf(b, "\n#ifdef RETPOLINE\n");
+ buf_printf(b, "MODULE_INFO(retpoline, \"Y\");\n");
+ buf_printf(b, "#endif\n");
+}
+
static void add_staging_flag(struct buffer *b, const char *name)
{
static const char *staging_dir = "drivers/staging";
@@ -2474,6 +2482,7 @@ int main(int argc, char **argv)

add_header(&buf, mod);
add_intree_flag(&buf, !external_module);
+ add_retpoline(&buf);
add_staging_flag(&buf, mod->name);
err |= add_versions(&buf, mod);
add_depends(&buf, mod, modules);
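For a module built with a retpoline-capable compiler, add_retpoline()
above causes modpost to emit the following into the generated *.mod.c,
which is what get_modinfo(info, "retpoline") later finds:

	#ifdef RETPOLINE
	MODULE_INFO(retpoline, "Y");
	#endif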



2018-02-09 14:12:56

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 34/92] KEYS: encrypted: fix buffer overread in valid_master_desc()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Biggers <[email protected]>

commit 794b4bc292f5d31739d89c0202c54e7dc9bc3add upstream.

With the 'encrypted' key type it was possible for userspace to provide a
data blob ending with a master key description shorter than expected,
e.g. 'keyctl add encrypted desc "new x" @s'. When validating such a
master key description, validate_master_desc() could read beyond the end
of the buffer. Fix this by using strncmp() instead of memcmp(). [Also
clean up the code to deduplicate some logic.]

Cc: Mimi Zohar <[email protected]>
Signed-off-by: Eric Biggers <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: James Morris <[email protected]>
Signed-off-by: Jin Qian <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
security/keys/encrypted-keys/encrypted.c | 31 +++++++++++++++----------------
1 file changed, 15 insertions(+), 16 deletions(-)

--- a/security/keys/encrypted-keys/encrypted.c
+++ b/security/keys/encrypted-keys/encrypted.c
@@ -141,23 +141,22 @@ static int valid_ecryptfs_desc(const cha
*/
static int valid_master_desc(const char *new_desc, const char *orig_desc)
{
- if (!memcmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_TRUSTED_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_TRUSTED_PREFIX_LEN))
- goto out;
- } else if (!memcmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN)) {
- if (strlen(new_desc) == KEY_USER_PREFIX_LEN)
- goto out;
- if (orig_desc)
- if (memcmp(new_desc, orig_desc, KEY_USER_PREFIX_LEN))
- goto out;
- } else
- goto out;
+ int prefix_len;
+
+ if (!strncmp(new_desc, KEY_TRUSTED_PREFIX, KEY_TRUSTED_PREFIX_LEN))
+ prefix_len = KEY_TRUSTED_PREFIX_LEN;
+ else if (!strncmp(new_desc, KEY_USER_PREFIX, KEY_USER_PREFIX_LEN))
+ prefix_len = KEY_USER_PREFIX_LEN;
+ else
+ return -EINVAL;
+
+ if (!new_desc[prefix_len])
+ return -EINVAL;
+
+ if (orig_desc && strncmp(new_desc, orig_desc, prefix_len))
+ return -EINVAL;
+
return 0;
-out:
- return -EINVAL;
}

/*



2018-02-09 14:13:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 36/92] KVM: x86: Make indirect calls in emulator speculation safe

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <[email protected]>

(cherry picked from commit 1a29b5b7f347a1a9230c1e0af5b37e3e571588ab)

Replace the indirect calls with CALL_NOSPEC.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: David Woodhouse <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Tim Chen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
[dwmw2: Use ASM_CALL_CONSTRAINT like upstream, now we have it]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/emulate.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -25,6 +25,7 @@
#include <asm/kvm_emulate.h>
#include <linux/stringify.h>
#include <asm/debugreg.h>
+#include <asm/nospec-branch.h>

#include "x86.h"
#include "tss.h"
@@ -1012,8 +1013,8 @@ static __always_inline u8 test_cc(unsign
void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);

flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
- asm("push %[flags]; popf; call *%[fastop]"
- : "=a"(rc) : [fastop]"r"(fop), [flags]"r"(flags));
+ asm("push %[flags]; popf; " CALL_NOSPEC
+ : "=a"(rc) : [thunk_target]"r"(fop), [flags]"r"(flags));
return rc;
}

@@ -5306,15 +5307,14 @@ static void fetch_possible_mmx_operand(s

static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
{
- register void *__sp asm(_ASM_SP);
ulong flags = (ctxt->eflags & EFLAGS_MASK) | X86_EFLAGS_IF;

if (!(ctxt->d & ByteOp))
fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;

- asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
+ asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
: "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
- [fastop]"+S"(fop), "+r"(__sp)
+ [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
: "c"(ctxt->src2.val));

ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);



2018-02-09 14:13:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 37/92] KVM: VMX: Make indirect call speculation safe

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <[email protected]>

(cherry picked from commit c940a3fb1e2e9b7d03228ab28f375fb5a47ff699)

Replace indirect call with CALL_NOSPEC.

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: David Woodhouse <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ashok Raj <[email protected]>
Cc: Greg KH <[email protected]>
Cc: Jun Nakajima <[email protected]>
Cc: David Woodhouse <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: [email protected]
Cc: Dave Hansen <[email protected]>
Cc: Asit Mallick <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Jason Baron <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Arjan Van De Ven <[email protected]>
Cc: Tim Chen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/kvm/vmx.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8676,14 +8676,14 @@ static void vmx_handle_external_intr(str
#endif
"pushf\n\t"
__ASM_SIZE(push) " $%c[cs]\n\t"
- "call *%[entry]\n\t"
+ CALL_NOSPEC
:
#ifdef CONFIG_X86_64
[sp]"=&r"(tmp),
#endif
"+r"(__sp)
:
- [entry]"r"(entry),
+ THUNK_TARGET(entry),
[ss]"i"(__KERNEL_DS),
[cs]"i"(__KERNEL_CS)
);



2018-02-09 14:14:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 31/92] x86/microcode/AMD: Do not load when running on a hypervisor

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Borislav Petkov <[email protected]>

commit a15a753539eca8ba243d576f02e7ca9c4b7d7042 upstream.

Doing so is completely void of sense for multiple reasons so prevent
it. Set dis_ucode_ldr to true and thus disable the microcode loader by
default to address xen pv guests which execute the AP path but not the
BSP path.

By having it turned off by default, the APs won't run into the loader
either.

Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
xen pv one. That one gets the aforementioned "fix".

Also, improve the detection method by caching the final decision whether
to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
then simply test that value.
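
As a rough userspace analogue of the in-kernel native_cpuid() check (the
bit position is the point here; the kernel obviously does not use <cpuid.h>):

#include <stdio.h>
#include <cpuid.h>

int main(void)
{
	unsigned int a, b, c, d;

	/*
	 * CPUID(1).ECX[31] is reserved for hypervisor use: bare metal
	 * leaves it clear, and most hypervisors (the xen pv case noted
	 * above being the exception) set it for their guests.
	 */
	if (!__get_cpuid(1, &a, &b, &c, &d))
		return 1;

	printf("hypervisor bit: %s\n", (c & (1u << 31)) ? "set" : "clear");
	return 0;
}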

Signed-off-by: Borislav Petkov <[email protected]>
Tested-by: Juergen Gross <[email protected]>
Tested-by: Boris Ostrovsky <[email protected]>
Acked-by: Juergen Gross <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Rolf Neugebauer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kernel/cpu/microcode/core.c | 28 +++++++++++++++++++---------
1 file changed, 19 insertions(+), 9 deletions(-)

--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -43,7 +43,7 @@
#define MICROCODE_VERSION "2.01"

static struct microcode_ops *microcode_ops;
-static bool dis_ucode_ldr;
+static bool dis_ucode_ldr = true;

/*
* Synchronization.
@@ -73,6 +73,7 @@ struct cpu_info_ctx {
static bool __init check_loader_disabled_bsp(void)
{
static const char *__dis_opt_str = "dis_ucode_ldr";
+ u32 a, b, c, d;

#ifdef CONFIG_X86_32
const char *cmdline = (const char *)__pa_nodebug(boot_command_line);
@@ -85,8 +86,23 @@ static bool __init check_loader_disabled
bool *res = &dis_ucode_ldr;
#endif

- if (cmdline_find_option_bool(cmdline, option))
- *res = true;
+ if (!have_cpuid_p())
+ return *res;
+
+ a = 1;
+ c = 0;
+ native_cpuid(&a, &b, &c, &d);
+
+ /*
+ * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
+ * completely accurate as xen pv guests don't see that CPUID bit set but
+ * that's good enough as they don't land on the BSP path anyway.
+ */
+ if (c & BIT(31))
+ return *res;
+
+ if (cmdline_find_option_bool(cmdline, option) <= 0)
+ *res = false;

return *res;
}
@@ -118,9 +134,6 @@ void __init load_ucode_bsp(void)
if (check_loader_disabled_bsp())
return;

- if (!have_cpuid_p())
- return;
-
vendor = x86_cpuid_vendor();
family = x86_cpuid_family();

@@ -154,9 +167,6 @@ void load_ucode_ap(void)
if (check_loader_disabled_ap())
return;

- if (!have_cpuid_p())
- return;
-
vendor = x86_cpuid_vendor();
family = x86_cpuid_family();




2018-02-09 14:14:10

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 32/92] media: soc_camera: soc_scale_crop: add missing MODULE_DESCRIPTION/AUTHOR/LICENSE

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jesse Chan <[email protected]>

commit 5331aec1bf9c9da557668174e0a4bfcee39f1121 upstream.

This change resolves a new compile-time warning
when built as a loadable module:

WARNING: modpost: missing MODULE_LICENSE() in drivers/media/platform/soc_camera/soc_scale_crop.o
see include/linux/module.h for more information

This adds the license as "GPL", which matches the header of the file.

MODULE_DESCRIPTION and MODULE_AUTHOR are also added.

Signed-off-by: Jesse Chan <[email protected]>
Signed-off-by: Hans Verkuil <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/media/platform/soc_camera/soc_scale_crop.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/drivers/media/platform/soc_camera/soc_scale_crop.c
+++ b/drivers/media/platform/soc_camera/soc_scale_crop.c
@@ -418,3 +418,7 @@ void soc_camera_calc_client_output(struc
mf->height = soc_camera_shift_scale(rect->height, shift, scale_v);
}
EXPORT_SYMBOL(soc_camera_calc_client_output);
+
+MODULE_DESCRIPTION("soc-camera scaling-cropping functions");
+MODULE_AUTHOR("Guennadi Liakhovetski <[email protected]>");
+MODULE_LICENSE("GPL");



2018-02-09 14:14:17

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 33/92] b43: Add missing MODULE_FIRMWARE()

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Takashi Iwai <[email protected]>

commit 3c89a72ad80c64bdbd5ff851ee9c328a191f7e01 upstream.

Some firmware entries were never added via MODULE_FIRMWARE(), which
may leave the driver non-functional when it is loaded from an initrd.

Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1037344
Fixes: 15be8e89cdd9 ("b43: add more bcma cores")
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Kalle Valo <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/wireless/broadcom/b43/main.c | 10 ++++++++++
1 file changed, 10 insertions(+)

--- a/drivers/net/wireless/broadcom/b43/main.c
+++ b/drivers/net/wireless/broadcom/b43/main.c
@@ -71,8 +71,18 @@ MODULE_FIRMWARE("b43/ucode11.fw");
MODULE_FIRMWARE("b43/ucode13.fw");
MODULE_FIRMWARE("b43/ucode14.fw");
MODULE_FIRMWARE("b43/ucode15.fw");
+MODULE_FIRMWARE("b43/ucode16_lp.fw");
MODULE_FIRMWARE("b43/ucode16_mimo.fw");
+MODULE_FIRMWARE("b43/ucode24_lcn.fw");
+MODULE_FIRMWARE("b43/ucode25_lcn.fw");
+MODULE_FIRMWARE("b43/ucode25_mimo.fw");
+MODULE_FIRMWARE("b43/ucode26_mimo.fw");
+MODULE_FIRMWARE("b43/ucode29_mimo.fw");
+MODULE_FIRMWARE("b43/ucode33_lcn40.fw");
+MODULE_FIRMWARE("b43/ucode30_mimo.fw");
MODULE_FIRMWARE("b43/ucode5.fw");
+MODULE_FIRMWARE("b43/ucode40.fw");
+MODULE_FIRMWARE("b43/ucode42.fw");
MODULE_FIRMWARE("b43/ucode9.fw");

static int modparam_bad_frames_preempt;



2018-02-09 14:14:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 35/92] x86/retpoline: Remove the esp/rsp thunk

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Waiman Long <[email protected]>

(cherry picked from commit 1df37383a8aeabb9b418698f0bcdffea01f4b1b2)

It doesn't make sense to have an indirect call thunk with esp/rsp as
retpoline code won't work correctly with the stack pointer register.
Removing it will help compiler writers to catch errors in case such
a thunk call is emitted incorrectly.

Fixes: 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
Suggested-by: Jeff Law <[email protected]>
Signed-off-by: Waiman Long <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Acked-by: David Woodhouse <[email protected]>
Cc: Tom Lendacky <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Paul Turner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: David Woodhouse <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/x86/include/asm/asm-prototypes.h | 1 -
arch/x86/lib/retpoline.S | 1 -
2 files changed, 2 deletions(-)

--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -37,5 +37,4 @@ INDIRECT_THUNK(dx)
INDIRECT_THUNK(si)
INDIRECT_THUNK(di)
INDIRECT_THUNK(bp)
-INDIRECT_THUNK(sp)
#endif /* CONFIG_RETPOLINE */
--- a/arch/x86/lib/retpoline.S
+++ b/arch/x86/lib/retpoline.S
@@ -36,7 +36,6 @@ GENERATE_THUNK(_ASM_DX)
GENERATE_THUNK(_ASM_SI)
GENERATE_THUNK(_ASM_DI)
GENERATE_THUNK(_ASM_BP)
-GENERATE_THUNK(_ASM_SP)
#ifdef CONFIG_64BIT
GENERATE_THUNK(r8)
GENERATE_THUNK(r9)



2018-02-09 14:15:07

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 26/92] tcp_bbr: fix pacing_gain to always be unity when using lt_bw

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Neal Cardwell <[email protected]>


[ Upstream commit 3aff3b4b986e51bcf4ab249e5d48d39596e0df6a ]

This commit fixes the pacing_gain to remain at BBR_UNIT (1.0) when
using lt_bw and returning from the PROBE_RTT state to PROBE_BW.

Previously, when using lt_bw, upon exiting PROBE_RTT and entering
PROBE_BW the bbr_reset_probe_bw_mode() code could sometimes randomly
end up with a cycle_idx of 0 and hence have bbr_advance_cycle_phase()
set a pacing gain above 1.0. In such cases this would result in a
pacing rate that is 1.25x higher than intended, potentially resulting
in a high loss rate for a little while until we stop using the lt_bw a
bit later.
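
A minimal standalone sketch of the arithmetic (the table values are
assumed from the upstream tcp_bbr.c of this era):

#include <stdio.h>

#define BBR_SCALE 8			/* gains are fixed point, <<8 */
#define BBR_UNIT  (1 << BBR_SCALE)	/* 256 == a gain of 1.0 */

static const int bbr_pacing_gain[] = {
	BBR_UNIT * 5 / 4,	/* 320: 1.25x, the probing phase */
	BBR_UNIT * 3 / 4,	/* 192: 0.75x, the draining phase */
	BBR_UNIT, BBR_UNIT, BBR_UNIT,
	BBR_UNIT, BBR_UNIT, BBR_UNIT,
};

int main(void)
{
	int lt_use_bw = 1;	/* long-term (policed) bandwidth mode */
	int cycle_idx = 0;	/* the unlucky index described above */

	/* After the fix the gain is pinned to BBR_UNIT whenever lt_use_bw
	 * is set; before it, index 0 silently gave 1.25x. */
	int pacing_gain = lt_use_bw ? BBR_UNIT : bbr_pacing_gain[cycle_idx];

	printf("pacing_gain = %d (%.2fx)\n",
	       pacing_gain, pacing_gain / (double)BBR_UNIT);
	return 0;
}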

This commit is a stable candidate for kernels back as far as 4.9.

Fixes: 0f8782ea1497 ("tcp_bbr: add BBR congestion control")
Signed-off-by: Neal Cardwell <[email protected]>
Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Soheil Hassas Yeganeh <[email protected]>
Reported-by: Beyers Cronje <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_bbr.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -452,7 +452,8 @@ static void bbr_advance_cycle_phase(stru

bbr->cycle_idx = (bbr->cycle_idx + 1) & (CYCLE_LEN - 1);
bbr->cycle_mstamp = tp->delivered_mstamp;
- bbr->pacing_gain = bbr_pacing_gain[bbr->cycle_idx];
+ bbr->pacing_gain = bbr->lt_use_bw ? BBR_UNIT :
+ bbr_pacing_gain[bbr->cycle_idx];
}

/* Gain cycling: cycle pacing gain to converge to fair share of available bw. */
@@ -461,8 +462,7 @@ static void bbr_update_cycle_phase(struc
{
struct bbr *bbr = inet_csk_ca(sk);

- if ((bbr->mode == BBR_PROBE_BW) && !bbr->lt_use_bw &&
- bbr_is_next_cycle_phase(sk, rs))
+ if (bbr->mode == BBR_PROBE_BW && bbr_is_next_cycle_phase(sk, rs))
bbr_advance_cycle_phase(sk);
}




2018-02-09 14:15:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 25/92] vhost_net: stop device during reset owner

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Jason Wang <[email protected]>


[ Upstream commit 4cd879515d686849eec5f718aeac62a70b067d82 ]

We don't stop the device before resetting the owner, which means we could
try to serve a virtqueue kick before dev->worker is reset. This results in
a warning since the work was still pending on the llist while the owner was
being reset. Fix this by stopping the device during owner reset.

Reported-by: [email protected]
Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/vhost/net.c | 1 +
1 file changed, 1 insertion(+)

--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1078,6 +1078,7 @@ static long vhost_net_reset_owner(struct
}
vhost_net_stop(n, &tx_sock, &rx_sock);
vhost_net_flush(n);
+ vhost_dev_stop(&n->dev);
vhost_dev_reset_owner(&n->dev, umem);
vhost_net_vq_reset(n);
done:



2018-02-09 14:15:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 27/92] cls_u32: add missing RCU annotation.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Paolo Abeni <[email protected]>


[ Upstream commit 058a6c033488494a6b1477b05fe8e1a16e344462 ]

In a couple of places in the control path, n->ht_down is currently
accessed without the required RCU annotation. The accesses are
safe, but sparse complains. Since we already hold the
rtnl lock, let's use rtnl_dereference().
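
The general pattern, reduced to a hypothetical example (the structures
below are made up; only the __rcu/rtnl_dereference() usage mirrors the
patch):

#include <linux/types.h>
#include <linux/rcupdate.h>
#include <linux/rtnetlink.h>

struct example_ht {
	u32 handle;
};

struct example_knode {
	struct example_ht __rcu *ht_down;	/* sparse: RCU-protected */
};

/* Control-path code runs with the RTNL mutex held, so no RCU read-side
 * critical section is needed; rtnl_dereference() documents that and
 * satisfies sparse, unlike a plain n->ht_down load.
 */
static u32 example_link_handle(struct example_knode *n)
{
	struct example_ht *ht = rtnl_dereference(n->ht_down);

	return ht ? ht->handle : 0;
}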

Fixes: a1b7c5fd7fe9 ("net: sched: add cls_u32 offload hooks for netdevs")
Fixes: de5df63228fc ("net: sched: cls_u32 changes to knode must appear atomic to readers")
Signed-off-by: Paolo Abeni <[email protected]>
Acked-by: Cong Wang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/cls_u32.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)

--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -496,6 +496,7 @@ static void u32_clear_hw_hnode(struct tc
static int u32_replace_hw_knode(struct tcf_proto *tp, struct tc_u_knode *n,
u32 flags)
{
+ struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
struct net_device *dev = tp->q->dev_queue->dev;
struct tc_cls_u32_offload u32_offload = {0};
struct tc_to_netdev offload;
@@ -520,7 +521,7 @@ static int u32_replace_hw_knode(struct t
offload.cls_u32->knode.sel = &n->sel;
offload.cls_u32->knode.exts = &n->exts;
if (n->ht_down)
- offload.cls_u32->knode.link_handle = n->ht_down->handle;
+ offload.cls_u32->knode.link_handle = ht->handle;

err = dev->netdev_ops->ndo_setup_tc(dev, tp->q->handle,
tp->protocol, &offload);
@@ -788,8 +789,9 @@ static void u32_replace_knode(struct tcf
static struct tc_u_knode *u32_init_knode(struct tcf_proto *tp,
struct tc_u_knode *n)
{
- struct tc_u_knode *new;
+ struct tc_u_hnode *ht = rtnl_dereference(n->ht_down);
struct tc_u32_sel *s = &n->sel;
+ struct tc_u_knode *new;

new = kzalloc(sizeof(*n) + s->nkeys*sizeof(struct tc_u32_key),
GFP_KERNEL);
@@ -807,11 +809,11 @@ static struct tc_u_knode *u32_init_knode
new->fshift = n->fshift;
new->res = n->res;
new->flags = n->flags;
- RCU_INIT_POINTER(new->ht_down, n->ht_down);
+ RCU_INIT_POINTER(new->ht_down, ht);

/* bump reference count as long as we hold pointer to structure */
- if (new->ht_down)
- new->ht_down->refcnt++;
+ if (ht)
+ ht->refcnt++;

#ifdef CONFIG_CLS_U32_PERF
/* Statistics may be incremented by readers during update



2018-02-09 14:16:11

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 23/92] r8169: fix RTL8168EP take too long to complete driver initialization.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Chunhao Lin <[email protected]>


[ Upstream commit 086ca23d03c0d2f4088f472386778d293e15c5f6 ]

The driver checks the wrong register bit in rtl_ocp_tx_cond(), which keeps
the driver waiting until it times out.

Fix this by waiting for the right register bit.

Signed-off-by: Chunhao Lin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/realtek/r8169.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1387,7 +1387,7 @@ DECLARE_RTL_COND(rtl_ocp_tx_cond)
{
void __iomem *ioaddr = tp->mmio_addr;

- return RTL_R8(IBISR0) & 0x02;
+ return RTL_R8(IBISR0) & 0x20;
}

static void rtl8168ep_stop_cmac(struct rtl8169_private *tp)
@@ -1395,7 +1395,7 @@ static void rtl8168ep_stop_cmac(struct r
void __iomem *ioaddr = tp->mmio_addr;

RTL_W8(IBCR2, RTL_R8(IBCR2) & ~0x01);
- rtl_msleep_loop_wait_low(tp, &rtl_ocp_tx_cond, 50, 2000);
+ rtl_msleep_loop_wait_high(tp, &rtl_ocp_tx_cond, 50, 2000);
RTL_W8(IBISR0, RTL_R8(IBISR0) | 0x20);
RTL_W8(IBCR0, RTL_R8(IBCR0) & ~0x01);
}



2018-02-09 14:16:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 24/92] tcp: release sk_frag.page in tcp_disconnect

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Li RongQing <[email protected]>


[ Upstream commit 9b42d55a66d388e4dd5550107df051a9637564fc ]

A socket can be disconnected and transformed back into a listening
socket. If sk_frag.page is not released, it will be cloned into a new
socket by sk_clone_lock() with its reference count increased, leading
to a use-after-free or double-free issue.

Signed-off-by: Li RongQing <[email protected]>
Cc: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp.c | 6 ++++++
1 file changed, 6 insertions(+)

--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2316,6 +2316,12 @@ int tcp_disconnect(struct sock *sk, int

WARN_ON(inet->inet_num && !icsk->icsk_bind_hash);

+ if (sk->sk_frag.page) {
+ put_page(sk->sk_frag.page);
+ sk->sk_frag.page = NULL;
+ sk->sk_frag.offset = 0;
+ }
+
sk->sk_error_report(sk);
return err;
}



2018-02-09 14:16:31

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 19/92] ip6mr: fix stale iterator

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nikolay Aleksandrov <[email protected]>


[ Upstream commit 4adfa79fc254efb7b0eb3cd58f62c2c3f805f1ba ]

When we dump the ip6mr mfc entries via proc, we initialize an iterator
with the table to dump but we don't clear the cache pointer which might
be initialized from a prior read on the same descriptor that ended. This
can result in lock imbalance (an unnecessary unlock) leading to other
crashes and hangs. Clear the cache pointer like ipmr does to fix the issue.
Thanks for the reliable reproducer.

Here's syzbot's trace:
WARNING: bad unlock balance detected!
4.15.0-rc3+ #128 Not tainted
syzkaller971460/3195 is trying to release lock (mrt_lock) at:
[<000000006898068d>] ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
but there are no more locks to release!

other info that might help us debug this:
1 lock held by syzkaller971460/3195:
#0: (&p->lock){+.+.}, at: [<00000000744a6565>] seq_read+0xd5/0x13d0
fs/seq_file.c:165

stack backtrace:
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_unlock_imbalance_bug+0x12f/0x140 kernel/locking/lockdep.c:3561
__lock_release kernel/locking/lockdep.c:3775 [inline]
lock_release+0x5f9/0xda0 kernel/locking/lockdep.c:4023
__raw_read_unlock include/linux/rwlock_api_smp.h:225 [inline]
_raw_read_unlock+0x1a/0x30 kernel/locking/spinlock.c:255
ipmr_mfc_seq_stop+0xe1/0x130 net/ipv6/ip6mr.c:553
traverse+0x3bc/0xa00 fs/seq_file.c:135
seq_read+0x96a/0x13d0 fs/seq_file.c:189
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
BUG: sleeping function called from invalid context at lib/usercopy.c:25
in_atomic(): 1, irqs_disabled(): 0, pid: 3195, name: syzkaller971460
INFO: lockdep is turned off.
CPU: 1 PID: 3195 Comm: syzkaller971460 Not tainted 4.15.0-rc3+ #128
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
___might_sleep+0x2b2/0x470 kernel/sched/core.c:6060
__might_sleep+0x95/0x190 kernel/sched/core.c:6013
__might_fault+0xab/0x1d0 mm/memory.c:4525
_copy_to_user+0x2c/0xc0 lib/usercopy.c:25
copy_to_user include/linux/uaccess.h:155 [inline]
seq_read+0xcb4/0x13d0 fs/seq_file.c:279
proc_reg_read+0xef/0x170 fs/proc/inode.c:217
do_loop_readv_writev fs/read_write.c:673 [inline]
do_iter_read+0x3db/0x5b0 fs/read_write.c:897
compat_readv+0x1bf/0x270 fs/read_write.c:1140
do_compat_preadv64+0xdc/0x100 fs/read_write.c:1189
C_SYSC_preadv fs/read_write.c:1209 [inline]
compat_SyS_preadv+0x3b/0x50 fs/read_write.c:1203
do_syscall_32_irqs_on arch/x86/entry/common.c:327 [inline]
do_fast_syscall_32+0x3ee/0xf9d arch/x86/entry/common.c:389
entry_SYSENTER_compat+0x51/0x60 arch/x86/entry/entry_64_compat.S:125
RIP: 0023:0xf7f73c79
RSP: 002b:00000000e574a15c EFLAGS: 00000292 ORIG_RAX: 000000000000014d
RAX: ffffffffffffffda RBX: 000000000000000f RCX: 0000000020a3afb0
RDX: 0000000000000001 RSI: 0000000000000067 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
WARNING: CPU: 1 PID: 3195 at lib/usercopy.c:26 _copy_to_user+0xb5/0xc0
lib/usercopy.c:26

Reported-by: syzbot <bot+eceb3204562c41a438fa1f2335e0fe4f6886d669@syzkaller.appspotmail.com>
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv6/ip6mr.c | 1 +
1 file changed, 1 insertion(+)

--- a/net/ipv6/ip6mr.c
+++ b/net/ipv6/ip6mr.c
@@ -495,6 +495,7 @@ static void *ipmr_mfc_seq_start(struct s
return ERR_PTR(-ENOENT);

it->mrt = mrt;
+ it->cache = NULL;
return *pos ? ipmr_mfc_seq_idx(net, seq->private, *pos - 1)
: SEQ_START_TOKEN;
}



2018-02-09 14:16:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 20/92] net: igmp: add a missing rcu locking section

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit e7aadb27a5415e8125834b84a74477bfbee4eff5 ]

Newly added igmpv3_get_srcaddr() needs to be called under rcu lock.

Timer callbacks do not ensure this locking.

=============================
WARNING: suspicious RCU usage
4.15.0+ #200 Not tainted
-----------------------------
./include/linux/inetdevice.h:216 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syzkaller616973/4074:
#0: (&mm->mmap_sem){++++}, at: [<00000000bfce669e>] __do_page_fault+0x32d/0xc90 arch/x86/mm/fault.c:1355
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] lockdep_copy_map include/linux/lockdep.h:178 [inline]
#1: ((&im->timer)){+.-.}, at: [<00000000619d2f71>] call_timer_fn+0x1c6/0x820 kernel/time/timer.c:1316
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] spin_lock_bh include/linux/spinlock.h:315 [inline]
#2: (&(&im->lock)->rlock){+.-.}, at: [<000000005f833c5c>] igmpv3_send_report+0x98/0x5b0 net/ipv4/igmp.c:600

stack backtrace:
CPU: 0 PID: 4074 Comm: syzkaller616973 Not tainted 4.15.0+ #200
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
lockdep_rcu_suspicious+0x123/0x170 kernel/locking/lockdep.c:4592
__in_dev_get_rcu include/linux/inetdevice.h:216 [inline]
igmpv3_get_srcaddr net/ipv4/igmp.c:329 [inline]
igmpv3_newpack+0xeef/0x12e0 net/ipv4/igmp.c:389
add_grhead.isra.27+0x235/0x300 net/ipv4/igmp.c:432
add_grec+0xbd3/0x1170 net/ipv4/igmp.c:565
igmpv3_send_report+0xd5/0x5b0 net/ipv4/igmp.c:605
igmp_send_report+0xc43/0x1050 net/ipv4/igmp.c:722
igmp_timer_expire+0x322/0x5c0 net/ipv4/igmp.c:831
call_timer_fn+0x228/0x820 kernel/time/timer.c:1326
expire_timers kernel/time/timer.c:1363 [inline]
__run_timers+0x7ee/0xb70 kernel/time/timer.c:1666
run_timer_softirq+0x4c/0x70 kernel/time/timer.c:1692
__do_softirq+0x2d7/0xb85 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1cc/0x200 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:541 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:938

Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: syzbot <[email protected]>

Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/igmp.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -386,7 +386,11 @@ static struct sk_buff *igmpv3_newpack(st
pip->frag_off = htons(IP_DF);
pip->ttl = 1;
pip->daddr = fl4.daddr;
+
+ rcu_read_lock();
pip->saddr = igmpv3_get_srcaddr(dev, &fl4);
+ rcu_read_unlock();
+
pip->protocol = IPPROTO_IGMP;
pip->tot_len = 0; /* filled in later */
ip_select_ident(net, skb, NULL);



2018-02-09 14:17:05

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 08/92] powerpc/pseries: Query hypervisor for RFI flush settings

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Neuling <[email protected]>

commit 8989d56878a7735dfdb234707a2fee6faf631085 upstream.

A new hypervisor call is available which tells the guest settings
related to the RFI flush. Use it to query the appropriate flush
instruction(s), and whether the flush is required.

Signed-off-by: Michael Neuling <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/platforms/pseries/setup.c | 35 +++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)

--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -450,6 +450,39 @@ static void __init find_and_init_phbs(vo
of_pci_check_probe_only();
}

+static void pseries_setup_rfi_flush(void)
+{
+ struct h_cpu_char_result result;
+ enum l1d_flush_type types;
+ bool enable;
+ long rc;
+
+ /* Enable by default */
+ enable = true;
+
+ rc = plpar_get_cpu_characteristics(&result);
+ if (rc == H_SUCCESS) {
+ types = L1D_FLUSH_NONE;
+
+ if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
+ types |= L1D_FLUSH_MTTRIG;
+ if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
+ types |= L1D_FLUSH_ORI;
+
+ /* Use fallback if nothing set in hcall */
+ if (types == L1D_FLUSH_NONE)
+ types = L1D_FLUSH_FALLBACK;
+
+ if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
+ enable = false;
+ } else {
+ /* Default to fallback if case hcall is not available */
+ types = L1D_FLUSH_FALLBACK;
+ }
+
+ setup_rfi_flush(types, enable);
+}
+
static void __init pSeries_setup_arch(void)
{
set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -467,6 +500,8 @@ static void __init pSeries_setup_arch(vo

fwnmi_init();

+ pseries_setup_rfi_flush();
+
/* By default, only probe PCI (can be overridden by rtas_pci) */
pci_add_flags(PCI_PROBE_ONLY);




2018-02-09 14:17:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 21/92] qlcnic: fix deadlock bug

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Junxiao Bi <[email protected]>


[ Upstream commit 233ac3891607f501f08879134d623b303838f478 ]

The following soft lockup was caught. This is a deadlock caused by
recursive locking.

Process kworker/u40:1:28016 was holding the spin lock "mbx->queue_lock" in
qlcnic_83xx_mailbox_worker() when a softirq came in and asked for the same
spin lock in qlcnic_83xx_enqueue_mbx_cmd(). This lock should be taken with
bottom halves disabled.

[161846.962125] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u40:1:28016]
[161846.962367] Modules linked in: tun ocfs2 xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn xenfs xen_privcmd autofs4 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bnx2fc fcoe libfcoe libfc sunrpc 8021q mrp garp bridge stp llc bonding dm_round_robin dm_multipath iTCO_wdt iTCO_vendor_support pcspkr sb_edac edac_core i2c_i801 shpchp lpc_ich mfd_core ioatdma ipmi_devintf ipmi_si ipmi_msghandler sg ext4 jbd2 mbcache2 sr_mod cdrom sd_mod igb i2c_algo_bit i2c_core ahci libahci megaraid_sas ixgbe dca ptp pps_core vxlan udp_tunnel ip6_udp_tunnel qla2xxx scsi_transport_fc qlcnic crc32c_intel be2iscsi bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi ipv6 cxgb3 mdio libiscsi_tcp qla4xxx iscsi_boot_sysfs libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod
[161846.962454]
[161846.962460] CPU: 1 PID: 28016 Comm: kworker/u40:1 Not tainted 4.1.12-94.5.9.el6uek.x86_64 #2
[161846.962463] Hardware name: Oracle Corporation SUN SERVER X4-2L /ASSY,MB,X4-2L , BIOS 26050100 09/19/2017
[161846.962489] Workqueue: qlcnic_mailbox qlcnic_83xx_mailbox_worker [qlcnic]
[161846.962493] task: ffff8801f2e34600 ti: ffff88004ca5c000 task.ti: ffff88004ca5c000
[161846.962496] RIP: e030:[<ffffffff810013aa>] [<ffffffff810013aa>] xen_hypercall_sched_op+0xa/0x20
[161846.962506] RSP: e02b:ffff880202e43388 EFLAGS: 00000206
[161846.962509] RAX: 0000000000000000 RBX: ffff8801f6996b70 RCX: ffffffff810013aa
[161846.962511] RDX: ffff880202e433cc RSI: ffff880202e433b0 RDI: 0000000000000003
[161846.962513] RBP: ffff880202e433d0 R08: 0000000000000000 R09: ffff8801fe893200
[161846.962516] R10: ffff8801fe400538 R11: 0000000000000206 R12: ffff880202e4b000
[161846.962518] R13: 0000000000000050 R14: 0000000000000001 R15: 000000000000020d
[161846.962528] FS: 0000000000000000(0000) GS:ffff880202e40000(0000) knlGS:ffff880202e40000
[161846.962531] CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
[161846.962533] CR2: 0000000002612640 CR3: 00000001bb796000 CR4: 0000000000042660
[161846.962536] Stack:
[161846.962538] ffff880202e43608 0000000000000000 ffffffff813f0442 ffff880202e433b0
[161846.962543] 0000000000000000 ffff880202e433cc ffffffff00000001 0000000000000000
[161846.962547] 00000009813f03d6 ffff880202e433e0 ffffffff813f0460 ffff880202e43440
[161846.962552] Call Trace:
[161846.962555] <IRQ>
[161846.962565] [<ffffffff813f0442>] ? xen_poll_irq_timeout+0x42/0x50
[161846.962570] [<ffffffff813f0460>] xen_poll_irq+0x10/0x20
[161846.962578] [<ffffffff81014222>] xen_lock_spinning+0xe2/0x110
[161846.962583] [<ffffffff81013f01>] __raw_callee_save_xen_lock_spinning+0x11/0x20
[161846.962592] [<ffffffff816e5c57>] ? _raw_spin_lock+0x57/0x80
[161846.962609] [<ffffffffa028acfc>] qlcnic_83xx_enqueue_mbx_cmd+0x7c/0xe0 [qlcnic]
[161846.962623] [<ffffffffa028e008>] qlcnic_83xx_issue_cmd+0x58/0x210 [qlcnic]
[161846.962636] [<ffffffffa028caf2>] qlcnic_83xx_sre_macaddr_change+0x162/0x1d0 [qlcnic]
[161846.962649] [<ffffffffa028cb8b>] qlcnic_83xx_change_l2_filter+0x2b/0x30 [qlcnic]
[161846.962657] [<ffffffff8160248b>] ? __skb_flow_dissect+0x18b/0x650
[161846.962670] [<ffffffffa02856e5>] qlcnic_send_filter+0x205/0x250 [qlcnic]
[161846.962682] [<ffffffffa0285c77>] qlcnic_xmit_frame+0x547/0x7b0 [qlcnic]
[161846.962691] [<ffffffff8160ac22>] xmit_one+0x82/0x1a0
[161846.962696] [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0
[161846.962701] [<ffffffff81630112>] sch_direct_xmit+0x112/0x220
[161846.962706] [<ffffffff8160b80f>] __dev_queue_xmit+0x1df/0x5e0
[161846.962710] [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20
[161846.962721] [<ffffffffa0575bd5>] bond_dev_queue_xmit+0x35/0x80 [bonding]
[161846.962729] [<ffffffffa05769fb>] __bond_start_xmit+0x1cb/0x210 [bonding]
[161846.962736] [<ffffffffa0576a71>] bond_start_xmit+0x31/0x60 [bonding]
[161846.962740] [<ffffffff8160ac22>] xmit_one+0x82/0x1a0
[161846.962745] [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0
[161846.962749] [<ffffffff8160bb1e>] __dev_queue_xmit+0x4ee/0x5e0
[161846.962754] [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20
[161846.962760] [<ffffffffa05cfa72>] vlan_dev_hard_start_xmit+0xb2/0x150 [8021q]
[161846.962764] [<ffffffff8160ac22>] xmit_one+0x82/0x1a0
[161846.962769] [<ffffffff8160ad90>] dev_hard_start_xmit+0x50/0xa0
[161846.962773] [<ffffffff8160bb1e>] __dev_queue_xmit+0x4ee/0x5e0
[161846.962777] [<ffffffff8160bc33>] dev_queue_xmit_sk+0x13/0x20
[161846.962789] [<ffffffffa05adf74>] br_dev_queue_push_xmit+0x54/0xa0 [bridge]
[161846.962797] [<ffffffffa05ae4ff>] br_forward_finish+0x2f/0x90 [bridge]
[161846.962807] [<ffffffff810b0dad>] ? ttwu_do_wakeup+0x1d/0x100
[161846.962811] [<ffffffff815f929b>] ? __alloc_skb+0x8b/0x1f0
[161846.962818] [<ffffffffa05ae04d>] __br_forward+0x8d/0x120 [bridge]
[161846.962822] [<ffffffff815f613b>] ? __kmalloc_reserve+0x3b/0xa0
[161846.962829] [<ffffffff810be55e>] ? update_rq_runnable_avg+0xee/0x230
[161846.962836] [<ffffffffa05ae176>] br_forward+0x96/0xb0 [bridge]
[161846.962845] [<ffffffffa05af85e>] br_handle_frame_finish+0x1ae/0x420 [bridge]
[161846.962853] [<ffffffffa05afc4f>] br_handle_frame+0x17f/0x260 [bridge]
[161846.962862] [<ffffffffa05afad0>] ? br_handle_frame_finish+0x420/0x420 [bridge]
[161846.962867] [<ffffffff8160d057>] __netif_receive_skb_core+0x1f7/0x870
[161846.962872] [<ffffffff8160d6f2>] __netif_receive_skb+0x22/0x70
[161846.962877] [<ffffffff8160d913>] netif_receive_skb_internal+0x23/0x90
[161846.962884] [<ffffffffa07512ea>] ? xenvif_idx_release+0xea/0x100 [xen_netback]
[161846.962889] [<ffffffff816e5a10>] ? _raw_spin_unlock_irqrestore+0x20/0x50
[161846.962893] [<ffffffff8160e624>] netif_receive_skb_sk+0x24/0x90
[161846.962899] [<ffffffffa075269a>] xenvif_tx_submit+0x2ca/0x3f0 [xen_netback]
[161846.962906] [<ffffffffa0753f0c>] xenvif_tx_action+0x9c/0xd0 [xen_netback]
[161846.962915] [<ffffffffa07567f5>] xenvif_poll+0x35/0x70 [xen_netback]
[161846.962920] [<ffffffff8160e01b>] napi_poll+0xcb/0x1e0
[161846.962925] [<ffffffff8160e1c0>] net_rx_action+0x90/0x1c0
[161846.962931] [<ffffffff8108aaba>] __do_softirq+0x10a/0x350
[161846.962938] [<ffffffff8108ae75>] irq_exit+0x125/0x130
[161846.962943] [<ffffffff813f03a9>] xen_evtchn_do_upcall+0x39/0x50
[161846.962950] [<ffffffff816e7ffe>] xen_do_hypervisor_callback+0x1e/0x40
[161846.962952] <EOI>
[161846.962959] [<ffffffff816e5c4a>] ? _raw_spin_lock+0x4a/0x80
[161846.962964] [<ffffffff816e5b1e>] ? _raw_spin_lock_irqsave+0x1e/0xa0
[161846.962978] [<ffffffffa028e279>] ? qlcnic_83xx_mailbox_worker+0xb9/0x2a0 [qlcnic]
[161846.962991] [<ffffffff810a14e1>] ? process_one_work+0x151/0x4b0
[161846.962995] [<ffffffff8100c3f2>] ? check_events+0x12/0x20
[161846.963001] [<ffffffff810a1960>] ? worker_thread+0x120/0x480
[161846.963005] [<ffffffff816e187b>] ? __schedule+0x30b/0x890
[161846.963010] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[161846.963015] [<ffffffff810a1840>] ? process_one_work+0x4b0/0x4b0
[161846.963021] [<ffffffff810a6b3e>] ? kthread+0xce/0xf0
[161846.963025] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[161846.963031] [<ffffffff816e6522>] ? ret_from_fork+0x42/0x70
[161846.963035] [<ffffffff810a6a70>] ? kthread_freezable_should_stop+0x70/0x70
[161846.963037] Code: cc 51 41 53 b8 1c 00 00 00 0f 05 41 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 51 41 53 b8 1d 00 00 00 0f 05 <41> 5b 59 c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc
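
The deadlock pattern in the trace above, reduced to a minimal sketch (the
function names are placeholders, not the driver's):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(example_queue_lock);	/* plays mbx->queue_lock */

/* Process context (the mailbox worker). Taking the lock with plain
 * spin_lock() leaves softirqs enabled, so a softirq can interrupt the
 * holder on the same CPU; spin_lock_bh() closes that window.
 */
static void example_worker_side(void)
{
	spin_lock_bh(&example_queue_lock);
	/* ... pick the next mailbox command off the queue ... */
	spin_unlock_bh(&example_queue_lock);
}

/* Softirq context (the transmit path in the trace) wants the same lock.
 * With the worker using the _bh variant it can no longer spin against a
 * holder it has just interrupted on its own CPU.
 */
static void example_softirq_side(void)
{
	spin_lock(&example_queue_lock);
	/* ... queue a new mailbox command ... */
	spin_unlock(&example_queue_lock);
}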

Signed-off-by: Junxiao Bi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

--- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
+++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
@@ -3849,7 +3849,7 @@ static void qlcnic_83xx_flush_mbx_queue(
struct list_head *head = &mbx->cmd_q;
struct qlcnic_cmd_args *cmd = NULL;

- spin_lock(&mbx->queue_lock);
+ spin_lock_bh(&mbx->queue_lock);

while (!list_empty(head)) {
cmd = list_entry(head->next, struct qlcnic_cmd_args, list);
@@ -3860,7 +3860,7 @@ static void qlcnic_83xx_flush_mbx_queue(
qlcnic_83xx_notify_cmd_completion(adapter, cmd);
}

- spin_unlock(&mbx->queue_lock);
+ spin_unlock_bh(&mbx->queue_lock);
}

static int qlcnic_83xx_check_mbx_status(struct qlcnic_adapter *adapter)
@@ -3896,12 +3896,12 @@ static void qlcnic_83xx_dequeue_mbx_cmd(
{
struct qlcnic_mailbox *mbx = adapter->ahw->mailbox;

- spin_lock(&mbx->queue_lock);
+ spin_lock_bh(&mbx->queue_lock);

list_del(&cmd->list);
mbx->num_cmds--;

- spin_unlock(&mbx->queue_lock);
+ spin_unlock_bh(&mbx->queue_lock);

qlcnic_83xx_notify_cmd_completion(adapter, cmd);
}
@@ -3966,7 +3966,7 @@ static int qlcnic_83xx_enqueue_mbx_cmd(s
init_completion(&cmd->completion);
cmd->rsp_opcode = QLC_83XX_MBX_RESPONSE_UNKNOWN;

- spin_lock(&mbx->queue_lock);
+ spin_lock_bh(&mbx->queue_lock);

list_add_tail(&cmd->list, &mbx->cmd_q);
mbx->num_cmds++;
@@ -3974,7 +3974,7 @@ static int qlcnic_83xx_enqueue_mbx_cmd(s
*timeout = cmd->total_cmds * QLC_83XX_MBX_TIMEOUT;
queue_work(mbx->work_q, &mbx->work);

- spin_unlock(&mbx->queue_lock);
+ spin_unlock_bh(&mbx->queue_lock);

return 0;
}
@@ -4070,15 +4070,15 @@ static void qlcnic_83xx_mailbox_worker(s
mbx->rsp_status = QLC_83XX_MBX_RESPONSE_WAIT;
spin_unlock_irqrestore(&mbx->aen_lock, flags);

- spin_lock(&mbx->queue_lock);
+ spin_lock_bh(&mbx->queue_lock);

if (list_empty(head)) {
- spin_unlock(&mbx->queue_lock);
+ spin_unlock_bh(&mbx->queue_lock);
return;
}
cmd = list_entry(head->next, struct qlcnic_cmd_args, list);

- spin_unlock(&mbx->queue_lock);
+ spin_unlock_bh(&mbx->queue_lock);

mbx_ops->encode_cmd(adapter, cmd);
mbx_ops->nofity_fw(adapter, QLC_83XX_MBX_REQUEST);



2018-02-09 14:17:23

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 06/92] powerpc/64s: Add support for RFI flush of L1-D cache

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Ellerman <[email protected]>

commit aa8a5e0062ac940f7659394f4817c948dc8c0667 upstream.

On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.

This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.

The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.

In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.

In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.
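
A minimal sketch of what a displacement flush amounts to (the sizes are
taken from the fallback code below; the real implementation is the
rfi_flush_fallback assembly, not C):

/* Touch a dedicated area twice the size of the L1-D so every line that
 * was resident beforehand gets evicted. 128-byte lines are mandated on
 * these CPUs, so stepping by 128 touches each line once.
 */
static void displacement_flush(const volatile char *area, unsigned long l1d_size)
{
	unsigned long i;

	for (i = 0; i < 2 * l1d_size; i += 128)
		(void)area[i];
}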

For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.

In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, back ports testing etc.
Many thanks to all of them.

Tested-by: Jon Masters <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
[Balbir - back ported to stable with changes]
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/include/asm/exception-64s.h | 40 +++++++++++--
arch/powerpc/include/asm/feature-fixups.h | 15 +++++
arch/powerpc/include/asm/paca.h | 10 +++
arch/powerpc/include/asm/setup.h | 13 ++++
arch/powerpc/kernel/asm-offsets.c | 4 +
arch/powerpc/kernel/exceptions-64s.S | 86 ++++++++++++++++++++++++++++++
arch/powerpc/kernel/setup_64.c | 79 +++++++++++++++++++++++++++
arch/powerpc/kernel/vmlinux.lds.S | 9 +++
arch/powerpc/lib/feature-fixups.c | 42 ++++++++++++++
9 files changed, 290 insertions(+), 8 deletions(-)

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -51,34 +51,58 @@
#define EX_PPR 88 /* SMT thread status register (priority) */
#define EX_CTR 96

-/* Macros for annotating the expected destination of (h)rfid */
+/*
+ * Macros for annotating the expected destination of (h)rfid
+ *
+ * The nop instructions allow us to insert one or more instructions to flush the
+ * L1-D cache when returning to userspace or a guest.
+ */
+#define RFI_FLUSH_SLOT \
+ RFI_FLUSH_FIXUP_SECTION; \
+ nop; \
+ nop; \
+ nop

#define RFI_TO_KERNEL \
rfid

#define RFI_TO_USER \
- rfid
+ RFI_FLUSH_SLOT; \
+ rfid; \
+ b rfi_flush_fallback

#define RFI_TO_USER_OR_KERNEL \
- rfid
+ RFI_FLUSH_SLOT; \
+ rfid; \
+ b rfi_flush_fallback

#define RFI_TO_GUEST \
- rfid
+ RFI_FLUSH_SLOT; \
+ rfid; \
+ b rfi_flush_fallback

#define HRFI_TO_KERNEL \
hrfid

#define HRFI_TO_USER \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback

#define HRFI_TO_USER_OR_KERNEL \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback

#define HRFI_TO_GUEST \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback

#define HRFI_TO_UNKNOWN \
- hrfid
+ RFI_FLUSH_SLOT; \
+ hrfid; \
+ b hrfi_flush_fallback

#ifdef CONFIG_RELOCATABLE
#define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h) \
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -189,4 +189,19 @@ void apply_feature_fixups(void);
void setup_feature_keys(void);
#endif

+#define RFI_FLUSH_FIXUP_SECTION \
+951: \
+ .pushsection __rfi_flush_fixup,"a"; \
+ .align 2; \
+952: \
+ FTR_ENTRY_OFFSET 951b-952b; \
+ .popsection;
+
+
+#ifndef __ASSEMBLY__
+
+extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+
+#endif
+
#endif /* __ASM_POWERPC_FEATURE_FIXUPS_H */
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -205,6 +205,16 @@ struct paca_struct {
struct sibling_subcore_state *sibling_subcore_state;
#endif
#endif
+#ifdef CONFIG_PPC_BOOK3S_64
+ /*
+ * rfi fallback flush must be in its own cacheline to prevent
+ * other paca data leaking into the L1d
+ */
+ u64 exrfi[13] __aligned(0x80);
+ void *rfi_flush_fallback_area;
+ u64 l1d_flush_congruence;
+ u64 l1d_flush_sets;
+#endif
};

#ifdef CONFIG_PPC_BOOK3S
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -38,6 +38,19 @@ static inline void pseries_big_endian_ex
static inline void pseries_little_endian_exceptions(void) {}
#endif /* CONFIG_PPC_PSERIES */

+void rfi_flush_enable(bool enable);
+
+/* These are bit flags */
+enum l1d_flush_type {
+ L1D_FLUSH_NONE = 0x1,
+ L1D_FLUSH_FALLBACK = 0x2,
+ L1D_FLUSH_ORI = 0x4,
+ L1D_FLUSH_MTTRIG = 0x8,
+};
+
+void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
+void do_rfi_flush_fixups(enum l1d_flush_type types);
+
#endif /* !__ASSEMBLY__ */

#endif /* _ASM_POWERPC_SETUP_H */
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -240,6 +240,10 @@ int main(void)
#ifdef CONFIG_PPC_BOOK3S_64
DEFINE(PACAMCEMERGSP, offsetof(struct paca_struct, mc_emergency_sp));
DEFINE(PACA_IN_MCE, offsetof(struct paca_struct, in_mce));
+ DEFINE(PACA_RFI_FLUSH_FALLBACK_AREA, offsetof(struct paca_struct, rfi_flush_fallback_area));
+ DEFINE(PACA_EXRFI, offsetof(struct paca_struct, exrfi));
+ DEFINE(PACA_L1D_FLUSH_CONGRUENCE, offsetof(struct paca_struct, l1d_flush_congruence));
+ DEFINE(PACA_L1D_FLUSH_SETS, offsetof(struct paca_struct, l1d_flush_sets));
#endif
DEFINE(PACAHWCPUID, offsetof(struct paca_struct, hw_cpu_id));
DEFINE(PACAKEXECSTATE, offsetof(struct paca_struct, kexec_state));
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1594,6 +1594,92 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
bl kernel_bad_stack
b 1b

+ .globl rfi_flush_fallback
+rfi_flush_fallback:
+ SET_SCRATCH0(r13);
+ GET_PACA(r13);
+ std r9,PACA_EXRFI+EX_R9(r13)
+ std r10,PACA_EXRFI+EX_R10(r13)
+ std r11,PACA_EXRFI+EX_R11(r13)
+ std r12,PACA_EXRFI+EX_R12(r13)
+ std r8,PACA_EXRFI+EX_R13(r13)
+ mfctr r9
+ ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+ ld r11,PACA_L1D_FLUSH_SETS(r13)
+ ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+ /*
+ * The load adresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+ addi r12,r12,8
+ mtctr r11
+ DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+ /* order ld/st prior to dcbt stop all streams with flushing */
+ sync
+1: li r8,0
+ .rept 8 /* 8-way set associative */
+ ldx r11,r10,r8
+ add r8,r8,r12
+ xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not
+ add r8,r8,r11 // Add 0, this creates a dependency on the ldx
+ .endr
+ addi r10,r10,128 /* 128 byte cache line */
+ bdnz 1b
+
+ mtctr r9
+ ld r9,PACA_EXRFI+EX_R9(r13)
+ ld r10,PACA_EXRFI+EX_R10(r13)
+ ld r11,PACA_EXRFI+EX_R11(r13)
+ ld r12,PACA_EXRFI+EX_R12(r13)
+ ld r8,PACA_EXRFI+EX_R13(r13)
+ GET_SCRATCH0(r13);
+ rfid
+
+ .globl hrfi_flush_fallback
+hrfi_flush_fallback:
+ SET_SCRATCH0(r13);
+ GET_PACA(r13);
+ std r9,PACA_EXRFI+EX_R9(r13)
+ std r10,PACA_EXRFI+EX_R10(r13)
+ std r11,PACA_EXRFI+EX_R11(r13)
+ std r12,PACA_EXRFI+EX_R12(r13)
+ std r8,PACA_EXRFI+EX_R13(r13)
+ mfctr r9
+ ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+ ld r11,PACA_L1D_FLUSH_SETS(r13)
+ ld r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+ /*
+ * The load adresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+ addi r12,r12,8
+ mtctr r11
+ DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+ /* order ld/st prior to dcbt stop all streams with flushing */
+ sync
+1: li r8,0
+ .rept 8 /* 8-way set associative */
+ ldx r11,r10,r8
+ add r8,r8,r12
+ xor r11,r11,r11 // Ensure r11 is 0 even if fallback area is not
+ add r8,r8,r11 // Add 0, this creates a dependency on the ldx
+ .endr
+ addi r10,r10,128 /* 128 byte cache line */
+ bdnz 1b
+
+ mtctr r9
+ ld r9,PACA_EXRFI+EX_R9(r13)
+ ld r10,PACA_EXRFI+EX_R10(r13)
+ ld r11,PACA_EXRFI+EX_R11(r13)
+ ld r12,PACA_EXRFI+EX_R12(r13)
+ ld r8,PACA_EXRFI+EX_R13(r13)
+ GET_SCRATCH0(r13);
+ hrfid
+
/*
* Called from arch_local_irq_enable when an interrupt needs
* to be resent. r3 contains 0x500, 0x900, 0xa00 or 0xe80 to indicate
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -678,4 +678,83 @@ static int __init disable_hardlockup_det
return 0;
}
early_initcall(disable_hardlockup_detector);
+
+#ifdef CONFIG_PPC_BOOK3S_64
+static enum l1d_flush_type enabled_flush_types;
+static void *l1d_flush_fallback_area;
+bool rfi_flush;
+
+static void do_nothing(void *unused)
+{
+ /*
+ * We don't need to do the flush explicitly, just enter+exit kernel is
+ * sufficient, the RFI exit handlers will do the right thing.
+ */
+}
+
+void rfi_flush_enable(bool enable)
+{
+ if (rfi_flush == enable)
+ return;
+
+ if (enable) {
+ do_rfi_flush_fixups(enabled_flush_types);
+ on_each_cpu(do_nothing, NULL, 1);
+ } else
+ do_rfi_flush_fixups(L1D_FLUSH_NONE);
+
+ rfi_flush = enable;
+}
+
+static void init_fallback_flush(void)
+{
+ u64 l1d_size, limit;
+ int cpu;
+
+ l1d_size = ppc64_caches.dsize;
+ limit = min(safe_stack_limit(), ppc64_rma_size);
+
+ /*
+ * Align to L1d size, and size it at 2x L1d size, to catch possible
+ * hardware prefetch runoff. We don't have a recipe for load patterns to
+ * reliably avoid the prefetcher.
+ */
+ l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit));
+ memset(l1d_flush_fallback_area, 0, l1d_size * 2);
+
+ for_each_possible_cpu(cpu) {
+ /*
+ * The fallback flush is currently coded for 8-way
+ * associativity. Different associativity is possible, but it
+ * will be treated as 8-way and may not evict the lines as
+ * effectively.
+ *
+ * 128 byte lines are mandatory.
+ */
+ u64 c = l1d_size / 8;
+
+ paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
+ paca[cpu].l1d_flush_congruence = c;
+ paca[cpu].l1d_flush_sets = c / 128;
+ }
+}
+
+void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
+{
+ if (types & L1D_FLUSH_FALLBACK) {
+ pr_info("rfi-flush: Using fallback displacement flush\n");
+ init_fallback_flush();
+ }
+
+ if (types & L1D_FLUSH_ORI)
+ pr_info("rfi-flush: Using ori type flush\n");
+
+ if (types & L1D_FLUSH_MTTRIG)
+ pr_info("rfi-flush: Using mttrig type flush\n");
+
+ enabled_flush_types = types;
+
+ rfi_flush_enable(enable);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
#endif
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -132,6 +132,15 @@ SECTIONS
/* Read-only data */
RODATA

+#ifdef CONFIG_PPC64
+ . = ALIGN(8);
+ __rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) {
+ __start___rfi_flush_fixup = .;
+ *(__rfi_flush_fixup)
+ __stop___rfi_flush_fixup = .;
+ }
+#endif
+
EXCEPTION_TABLE(0)

NOTES :kernel :notes
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -23,6 +23,7 @@
#include <asm/sections.h>
#include <asm/setup.h>
#include <asm/firmware.h>
+#include <asm/setup.h>

struct fixup_entry {
unsigned long mask;
@@ -115,6 +116,47 @@ void do_feature_fixups(unsigned long val
}
}

+#ifdef CONFIG_PPC_BOOK3S_64
+void do_rfi_flush_fixups(enum l1d_flush_type types)
+{
+ unsigned int instrs[3], *dest;
+ long *start, *end;
+ int i;
+
+ start = PTRRELOC(&__start___rfi_flush_fixup),
+ end = PTRRELOC(&__stop___rfi_flush_fixup);
+
+ instrs[0] = 0x60000000; /* nop */
+ instrs[1] = 0x60000000; /* nop */
+ instrs[2] = 0x60000000; /* nop */
+
+ if (types & L1D_FLUSH_FALLBACK)
+ /* b .+16 to fallback flush */
+ instrs[0] = 0x48000010;
+
+ i = 0;
+ if (types & L1D_FLUSH_ORI) {
+ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
+ }
+
+ if (types & L1D_FLUSH_MTTRIG)
+ instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+ patch_instruction(dest, instrs[0]);
+ patch_instruction(dest + 1, instrs[1]);
+ patch_instruction(dest + 2, instrs[2]);
+ }
+
+ printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
{
long *start, *end;



2018-02-09 14:17:34

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 05/92] powerpc/64s: Convert slb_miss_common to use RFI_TO_USER/KERNEL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <[email protected]>

commit c7305645eb0c1621351cfc104038831ae87c0053 upstream.

In the SLB miss handler we may be returning to user or kernel. We need
to add a check early on and save the result in the cr4 register, and
then we bifurcate the return path based on that.

Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Nicholas Piggin <[email protected]>
[mpe: Backport to 4.4 based on patch from Balbir]
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/exceptions-64s.S | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)

--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -655,6 +655,8 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R

andi. r10,r12,MSR_RI /* check for unrecoverable exception */
beq- 2f
+ andi. r10,r12,MSR_PR /* check for user mode (PR != 0) */
+ bne 1f

/* All done -- return from exception. */

@@ -671,7 +673,23 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
ld r11,PACA_EXSLB+EX_R11(r13)
ld r12,PACA_EXSLB+EX_R12(r13)
ld r13,PACA_EXSLB+EX_R13(r13)
- rfid
+ RFI_TO_KERNEL
+ b . /* prevent speculative execution */
+
+1:
+.machine push
+.machine "power4"
+ mtcrf 0x80,r9
+ mtcrf 0x01,r9 /* slb_allocate uses cr0 and cr7 */
+.machine pop
+
+ RESTORE_PPR_PACA(PACA_EXSLB, r9)
+ ld r9,PACA_EXSLB+EX_R9(r13)
+ ld r10,PACA_EXSLB+EX_R10(r13)
+ ld r11,PACA_EXSLB+EX_R11(r13)
+ ld r12,PACA_EXSLB+EX_R12(r13)
+ ld r13,PACA_EXSLB+EX_R13(r13)
+ RFI_TO_USER
b . /* prevent speculative execution */

2: mfspr r11,SPRN_SRR0
@@ -679,7 +697,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_R
mtspr SPRN_SRR0,r10
ld r10,PACAKMSR(r13)
mtspr SPRN_SRR1,r10
- rfid
+ RFI_TO_KERNEL
b .

8: mfspr r11,SPRN_SRR0



2018-02-09 14:17:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 07/92] powerpc/64s: Support disabling RFI flush with no_rfi_flush and nopti

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Ellerman <[email protected]>

commit bc9c9304a45480797e13a8e1df96ffcf44fb62fe upstream.

Because there may be some performance overhead of the RFI flush, add
kernel command line options to disable it.

We add a sensibly named 'no_rfi_flush' option, but we also hijack the
x86 option 'nopti'. The RFI flush is not the same as KPTI, but if we
see 'nopti' we can guess that the user is trying to avoid any overhead
of Meltdown mitigations, and it means we don't have to educate everyone
about a different command line option.

Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/setup_64.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -682,8 +682,29 @@ early_initcall(disable_hardlockup_detect
#ifdef CONFIG_PPC_BOOK3S_64
static enum l1d_flush_type enabled_flush_types;
static void *l1d_flush_fallback_area;
+static bool no_rfi_flush;
bool rfi_flush;

+static int __init handle_no_rfi_flush(char *p)
+{
+ pr_info("rfi-flush: disabled on command line.");
+ no_rfi_flush = true;
+ return 0;
+}
+early_param("no_rfi_flush", handle_no_rfi_flush);
+
+/*
+ * The RFI flush is not KPTI, but because users will see doco that says to use
+ * nopti we hijack that option here to also disable the RFI flush.
+ */
+static int __init handle_no_pti(char *p)
+{
+ pr_info("rfi-flush: disabling due to 'nopti' on command line.\n");
+ handle_no_rfi_flush(NULL);
+ return 0;
+}
+early_param("nopti", handle_no_pti);
+
static void do_nothing(void *unused)
{
/*
@@ -754,7 +775,8 @@ void __init setup_rfi_flush(enum l1d_flu

enabled_flush_types = types;

- rfi_flush_enable(enable);
+ if (!no_rfi_flush)
+ rfi_flush_enable(enable);
}
#endif /* CONFIG_PPC_BOOK3S_64 */
#endif
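
For reference, either option would then simply be appended to the kernel
command line to keep the flush disabled, e.g. (illustrative command lines,
not part of the patch):

	... no_rfi_flush
	... nopti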



2018-02-09 14:18:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 04/92] powerpc/64: Convert the syscall exit path to use RFI_TO_USER/KERNEL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <[email protected]>

commit b8e90cb7bc04a509e821e82ab6ed7a8ef11ba333 upstream.

In the syscall exit path we may be returning to user or kernel
context. We already have a test for that, because we conditionally
restore r13. So use that existing test and branch, and bifurcate the
return based on that.

Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/entry_64.S | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -251,13 +251,23 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)

ld r13,GPR13(r1) /* only restore r13 if returning to usermode */
+ ld r2,GPR2(r1)
+ ld r1,GPR1(r1)
+ mtlr r4
+ mtcr r5
+ mtspr SPRN_SRR0,r7
+ mtspr SPRN_SRR1,r8
+ RFI_TO_USER
+ b . /* prevent speculative execution */
+
+ /* exit to kernel */
1: ld r2,GPR2(r1)
ld r1,GPR1(r1)
mtlr r4
mtcr r5
mtspr SPRN_SRR0,r7
mtspr SPRN_SRR1,r8
- RFI
+ RFI_TO_KERNEL
b . /* prevent speculative execution */

syscall_error:



2018-02-09 14:18:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 02/92] powerpc/64: Add macros for annotating the destination of rfid/hrfid

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <[email protected]>

commit 50e51c13b3822d14ff6df4279423e4b7b2269bc3 upstream.

The rfid/hrfid ((Hypervisor) Return From Interrupt) instruction is
used for switching from the kernel to userspace, and from the
hypervisor to the guest kernel. However it can and is also used for
other transitions, eg. from real mode kernel code to virtual mode
kernel code, and it's not always clear from the code what the
destination context is.

To make it clearer when reading the code, add macros which encode the
expected destination context.

Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/include/asm/exception-64e.h | 6 ++++++
arch/powerpc/include/asm/exception-64s.h | 29 +++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)

--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -209,5 +209,11 @@ exc_##label##_book3e:
ori r3,r3,vector_offset@l; \
mtspr SPRN_IVOR##vector_number,r3;

+#define RFI_TO_KERNEL \
+ rfi
+
+#define RFI_TO_USER \
+ rfi
+
#endif /* _ASM_POWERPC_EXCEPTION_64E_H */

--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -51,6 +51,35 @@
#define EX_PPR 88 /* SMT thread status register (priority) */
#define EX_CTR 96

+/* Macros for annotating the expected destination of (h)rfid */
+
+#define RFI_TO_KERNEL \
+ rfid
+
+#define RFI_TO_USER \
+ rfid
+
+#define RFI_TO_USER_OR_KERNEL \
+ rfid
+
+#define RFI_TO_GUEST \
+ rfid
+
+#define HRFI_TO_KERNEL \
+ hrfid
+
+#define HRFI_TO_USER \
+ hrfid
+
+#define HRFI_TO_USER_OR_KERNEL \
+ hrfid
+
+#define HRFI_TO_GUEST \
+ hrfid
+
+#define HRFI_TO_UNKNOWN \
+ hrfid
+
#ifdef CONFIG_RELOCATABLE
#define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h) \
mfspr r11,SPRN_##h##SRR0; /* save SRR0 */ \



2018-02-09 14:19:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 03/92] powerpc/64: Convert fast_exception_return to use RFI_TO_USER/KERNEL

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nicholas Piggin <[email protected]>

commit a08f828cf47e6c605af21d2cdec68f84e799c318 upstream.

Similar to the syscall return path, in fast_exception_return we may be
returning to user or kernel context. We already have a test for that,
because we conditionally restore r13. So use that existing test and
branch, and bifurcate the return based on that.

Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/entry_64.S | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)

--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -859,7 +859,7 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ACCOUNT_CPU_USER_EXIT(r13, r2, r4)
REST_GPR(13, r1)
-1:
+
mtspr SPRN_SRR1,r3

ld r2,_CCR(r1)
@@ -872,8 +872,22 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
ld r3,GPR3(r1)
ld r4,GPR4(r1)
ld r1,GPR1(r1)
+ RFI_TO_USER
+ b . /* prevent speculative execution */

- rfid
+1: mtspr SPRN_SRR1,r3
+
+ ld r2,_CCR(r1)
+ mtcrf 0xFF,r2
+ ld r2,_NIP(r1)
+ mtspr SPRN_SRR0,r2
+
+ ld r0,GPR0(r1)
+ ld r2,GPR2(r1)
+ ld r3,GPR3(r1)
+ ld r4,GPR4(r1)
+ ld r1,GPR1(r1)
+ RFI_TO_KERNEL
b . /* prevent speculative execution */

#endif /* CONFIG_PPC_BOOK3E */



2018-02-09 14:19:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.9 01/92] powerpc/pseries: Add H_GET_CPU_CHARACTERISTICS flags & wrapper

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Michael Neuling <[email protected]>

commit 191eccb1580939fb0d47deb405b82a85b0379070 upstream.

A new hypervisor call has been defined to communicate various
characteristics of the CPU to guests. Add definitions for the hcall
number, flags and a wrapper function.

Signed-off-by: Michael Neuling <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
[Balbir fixed conflicts in backport]
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/include/asm/hvcall.h | 17 +++++++++++++++++
arch/powerpc/include/asm/plpar_wrappers.h | 14 ++++++++++++++
2 files changed, 31 insertions(+)

--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -240,6 +240,7 @@
#define H_GET_HCA_INFO 0x1B8
#define H_GET_PERF_COUNT 0x1BC
#define H_MANAGE_TRACE 0x1C0
+#define H_GET_CPU_CHARACTERISTICS 0x1C8
#define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
#define H_QUERY_INT_STATE 0x1E4
#define H_POLL_PENDING 0x1D8
@@ -306,6 +307,17 @@
#define H_SET_MODE_RESOURCE_ADDR_TRANS_MODE 3
#define H_SET_MODE_RESOURCE_LE 4

+/* H_GET_CPU_CHARACTERISTICS return values */
+#define H_CPU_CHAR_SPEC_BAR_ORI31 (1ull << 63) // IBM bit 0
+#define H_CPU_CHAR_BCCTRL_SERIALISED (1ull << 62) // IBM bit 1
+#define H_CPU_CHAR_L1D_FLUSH_ORI30 (1ull << 61) // IBM bit 2
+#define H_CPU_CHAR_L1D_FLUSH_TRIG2 (1ull << 60) // IBM bit 3
+#define H_CPU_CHAR_L1D_THREAD_PRIV (1ull << 59) // IBM bit 4
+
+#define H_CPU_BEHAV_FAVOUR_SECURITY (1ull << 63) // IBM bit 0
+#define H_CPU_BEHAV_L1D_FLUSH_PR (1ull << 62) // IBM bit 1
+#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR (1ull << 61) // IBM bit 2
+
#ifndef __ASSEMBLY__

/**
@@ -433,6 +445,11 @@ static inline unsigned long cmo_get_page
}
#endif /* CONFIG_PPC_PSERIES */

+struct h_cpu_char_result {
+ u64 character;
+ u64 behaviour;
+};
+
#endif /* __ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_HVCALL_H */
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -340,4 +340,18 @@ static inline long plapr_set_watchpoint0
return plpar_set_mode(0, H_SET_MODE_RESOURCE_SET_DAWR, dawr0, dawrx0);
}

+static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)
+{
+ unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+ long rc;
+
+ rc = plpar_hcall(H_GET_CPU_CHARACTERISTICS, retbuf);
+ if (rc == H_SUCCESS) {
+ p->character = retbuf[0];
+ p->behaviour = retbuf[1];
+ }
+
+ return rc;
+}
+
#endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */
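
For context, a caller of the new wrapper might look roughly like the sketch
below. This is only an illustration of how the returned flags are meant to be
consumed; the real wiring into pseries setup code lands later in the series,
the function name here is made up, and the enum values are assumed to match
the l1d_flush_type definitions used elsewhere in this series:

	static void example_pseries_setup_rfi_flush(void)
	{
		struct h_cpu_char_result result;
		enum l1d_flush_type types = L1D_FLUSH_NONE;
		bool enable = true;

		if (plpar_get_cpu_characteristics(&result) == H_SUCCESS) {
			if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
				types |= L1D_FLUSH_ORI;
			if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
				types |= L1D_FLUSH_MTTRIG;
			if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
				enable = false;
		}

		/* Fall back to the displacement flush if the hcall gave us nothing */
		if (types == L1D_FLUSH_NONE)
			types = L1D_FLUSH_FALLBACK;

		setup_rfi_flush(types, enable);
	}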



2018-02-09 20:20:32

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On 02/09/2018 06:38 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.81 release.
> There are 92 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.81-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah


2018-02-09 21:33:35

by Dan Rue

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Fri, Feb 09, 2018 at 02:38:29PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.81 release.
> There are 92 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.81-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.

Results from Linaro’s test farm.
No regressions on arm64, arm and x86_64.

Summary
------------------------------------------------------------------------

kernel: 4.9.81-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-4.9.y
git commit: d3dab8430a0720fba4183f6107d70edb373be852
git describe: v4.9.80-93-gd3dab8430a07
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-4.9-oe/build/v4.9.80-93-gd3dab8430a07


No regressions (compared to build v4.9.80-93-ga5ced8b8b31d)

Boards, architectures and test suites:
-------------------------------------

hi6220-hikey - arm64
* boot - pass: 20
* kselftest - skip: 24, pass: 41
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 2, pass: 61
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 1, pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 4, pass: 10
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - fail: 2, skip: 152, pass: 996
* ltp-timers-tests - skip: 1, pass: 12

juno-r2 - arm64
* boot - pass: 21
* kselftest - skip: 23, pass: 42
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 2, pass: 61
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 22
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 4, pass: 10
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - fail: 2, skip: 149, pass: 999
* ltp-timers-tests - skip: 1, pass: 12

x15 - arm
* boot - pass: 20
* kselftest - skip: 25, pass: 39
* libhugetlbfs - skip: 1, pass: 87
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 2, pass: 61
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - skip: 2, pass: 20
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 1, pass: 13
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - fail: 2, skip: 97, pass: 1051
* ltp-timers-tests - skip: 1, pass: 12

x86_64
* boot - pass: 20
* kselftest - skip: 27, pass: 54
* libhugetlbfs - skip: 1, pass: 90
* ltp-cap_bounds-tests - pass: 2
* ltp-containers-tests - skip: 17, pass: 64
* ltp-fcntl-locktests-tests - pass: 2
* ltp-filecaps-tests - pass: 2
* ltp-fs-tests - skip: 1, pass: 62
* ltp-fs_bind-tests - pass: 2
* ltp-fs_perms_simple-tests - pass: 19
* ltp-fsx-tests - pass: 2
* ltp-hugetlb-tests - pass: 21
* ltp-io-tests - pass: 3
* ltp-ipc-tests - pass: 9
* ltp-math-tests - pass: 11
* ltp-nptl-tests - pass: 2
* ltp-pty-tests - pass: 4
* ltp-sched-tests - skip: 5, pass: 9
* ltp-securebits-tests - pass: 4
* ltp-syscalls-tests - fail: 2, skip: 118, pass: 1030
* ltp-timers-tests - skip: 1, pass: 12



--
Linaro QA (beta)
https://qa-reports.linaro.org

2018-02-09 22:02:35

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 4.9 46/92] x86/alternative: Print unadorned pointers

On Fri, Feb 9, 2018 at 5:39 AM, Greg Kroah-Hartman
<[email protected]> wrote:
> 4.9-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Borislav Petkov <[email protected]>
>
> (cherry picked from commit 0e6c16c652cadaffd25a6bb326ec10da5bcec6b4)
>
> After commit ad67b74d2469 ("printk: hash addresses printed with %p")
> pointers are being hashed when printed. However, this makes the alternative
> debug output completely useless. Switch to %px in order to see the
> unadorned kernel pointers.

This missed a "Fixes:" tag so probably missed automated checking on
how far back to port this. It shouldn't go back beyond 4.15 (where
ad67b74d2469 first appeared).

-Kees

>
> Signed-off-by: Borislav Petkov <[email protected]>
> Signed-off-by: Thomas Gleixner <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: David Woodhouse <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: Josh Poimboeuf <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Link: https://lkml.kernel.org/r/[email protected]
> Signed-off-by: David Woodhouse <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
> ---
> arch/x86/kernel/alternative.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -298,7 +298,7 @@ recompute_jump(struct alt_instr *a, u8 *
> tgt_rip = next_rip + o_dspl;
> n_dspl = tgt_rip - orig_insn;
>
> - DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
> + DPRINTK("target RIP: %px, new_displ: 0x%x", tgt_rip, n_dspl);
>
> if (tgt_rip - orig_insn >= 0) {
> if (n_dspl - 2 <= 127)
> @@ -352,7 +352,7 @@ static void __init_or_module optimize_no
> sync_core();
> local_irq_restore(flags);
>
> - DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
> + DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
> instr, a->instrlen - a->padlen, a->padlen);
> }
>
> @@ -370,7 +370,7 @@ void __init_or_module apply_alternatives
> u8 *instr, *replacement;
> u8 insnbuf[MAX_PATCH_LEN];
>
> - DPRINTK("alt table %p -> %p", start, end);
> + DPRINTK("alt table %px, -> %px", start, end);
> /*
> * The scan order should be from start to end. A later scanned
> * alternative code can overwrite previously scanned alternative code.
> @@ -394,14 +394,14 @@ void __init_or_module apply_alternatives
> continue;
> }
>
> - DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d",
> + DPRINTK("feat: %d*32+%d, old: (%px len: %d), repl: (%px, len: %d), pad: %d",
> a->cpuid >> 5,
> a->cpuid & 0x1f,
> instr, a->instrlen,
> replacement, a->replacementlen, a->padlen);
>
> - DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
> - DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
> + DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
> + DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);
>
> memcpy(insnbuf, replacement, a->replacementlen);
> insnbuf_sz = a->replacementlen;
> @@ -422,7 +422,7 @@ void __init_or_module apply_alternatives
> a->instrlen - a->replacementlen);
> insnbuf_sz += a->instrlen - a->replacementlen;
> }
> - DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
> + DUMP_BYTES(insnbuf, insnbuf_sz, "%px: final_insn: ", instr);
>
> text_poke_early(instr, insnbuf, insnbuf_sz);
> }
>
>



--
Kees Cook
Pixel Security

2018-02-10 07:25:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 46/92] x86/alternative: Print unadorned pointers

On Fri, Feb 09, 2018 at 02:01:32PM -0800, Kees Cook wrote:
> On Fri, Feb 9, 2018 at 5:39 AM, Greg Kroah-Hartman
> <[email protected]> wrote:
> > 4.9-stable review patch. If anyone has any objections, please let me know.
> >
> > ------------------
> >
> > From: Borislav Petkov <[email protected]>
> >
> > (cherry picked from commit 0e6c16c652cadaffd25a6bb326ec10da5bcec6b4)
> >
> > After commit ad67b74d2469 ("printk: hash addresses printed with %p")
> > pointers are being hashed when printed. However, this makes the alternative
> > debug output completely useless. Switch to %px in order to see the
> > unadorned kernel pointers.
>
> This missed a "Fixes:" tag so probably missed automated checking on
> how far back to port this. It shouldn't go back beyond 4.15 (where
> ad67b74d2469 first appeared).

Good point. Should we instead be using %pK for this change instead? Or
should we just backport commit ad67b74d2469 to 4.14? :)

Thanks for letting me know this.

greg k-h

2018-02-10 15:48:37

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On 02/09/2018 05:38 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.81 release.
> There are 92 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
> Anything received after that time might be too late.
>

Build results:
total: 145 pass: 140 fail: 5
Failed builds:
powerpc:defconfig
powerpc:allmodconfig
powerpc:ppc64e_defconfig
powerpc:cell_defconfig
powerpc:maple_defconfig
Qemu test results:
total: 118 pass: 113 fail: 5
Failed tests:
powerpc:mac99:ppc64_book3s_defconfig:nosmp
powerpc:mac99:ppc64_book3s_defconfig:smp4
powerpc:pseries:pseries_defconfig
powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
powerpc:mpc8544ds:ppc64_e5500_defconfig:smp

All builds and tests fail with the following error.

arch/powerpc/kernel/setup_64.c:41:25: fatal error: asm/debugfs.h: No such file or directory

For older kernels, the backport of 236003e6b5443c4 ("powerpc/64s: Allow control of RFI
flush via debugfs") should include linux/debugfs.h, not asm/debugfs.h.
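
That is, roughly the following against the backport (illustrative, untested):

--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
-#include <asm/debugfs.h>
+#include <linux/debugfs.h>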

4.4.y-queue is affected as well.

Guenter

2018-02-10 19:16:11

by Kees Cook

[permalink] [raw]
Subject: Re: [PATCH 4.9 46/92] x86/alternative: Print unadorned pointers

On Fri, Feb 9, 2018 at 11:23 PM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Fri, Feb 09, 2018 at 02:01:32PM -0800, Kees Cook wrote:
>> On Fri, Feb 9, 2018 at 5:39 AM, Greg Kroah-Hartman
>> <[email protected]> wrote:
>> > 4.9-stable review patch. If anyone has any objections, please let me know.
>> >
>> > ------------------
>> >
>> > From: Borislav Petkov <[email protected]>
>> >
>> > (cherry picked from commit 0e6c16c652cadaffd25a6bb326ec10da5bcec6b4)
>> >
>> > After commit ad67b74d2469 ("printk: hash addresses printed with %p")
>> > pointers are being hashed when printed. However, this makes the alternative
>> > debug output completely useless. Switch to %px in order to see the
>> > unadorned kernel pointers.
>>
>> This missed a "Fixes:" tag so probably missed automated checking on
>> how far back to port this. It shouldn't go back beyond 4.15 (where
>> ad67b74d2469 first appeared).
>
> Good point. Should we instead be using %pK for this change instead? Or
> should we just backport commit ad67b74d2469 to 4.14? :)

ad67b74d2469 is pretty disruptive, I can't recommend putting it in
4.14. I mean, I think it'd be great, but I suspect other people would
find it very surprising. :)

I would just revert this commit from kernels earlier than 4.15; this
%p usage is in debugging, so %pK doesn't make much sense here.
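
For anyone following along, the three specifiers being discussed behave
roughly like this (illustrative snippet, not from the patch):

	static void show_ptr_formats(void *ptr)
	{
		pr_info("hashed:   %p\n",  ptr);  /* hashed since ad67b74d2469 (v4.15) */
		pr_info("raw:      %px\n", ptr);  /* always the real address           */
		pr_info("filtered: %pK\n", ptr);  /* honours kptr_restrict             */
	}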

-Kees

--
Kees Cook
Pixel Security

2018-02-10 19:22:59

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4.9 46/92] x86/alternative: Print unadorned pointers

On Sat, Feb 10, 2018 at 11:14:18AM -0800, Kees Cook wrote:
> I would just revert this commit from kernels earlier than 4.15; this
> %p usage is in debugging, so %pK doesn't make much sense here.

Yes, just drop it. That's only for debugging alternatives and not really
needed in stable.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

2018-02-13 09:17:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 46/92] x86/alternative: Print unadorned pointers

On Sat, Feb 10, 2018 at 08:21:36PM +0100, Borislav Petkov wrote:
> On Sat, Feb 10, 2018 at 11:14:18AM -0800, Kees Cook wrote:
> > I would just revert this commit from kernels earlier than 4.15; this
> > %p usage is in debugging, so %pK doesn't make much sense here.
>
> Yes, just drop it. That's only for debugging alternatives and not really
> needed in stable.

Now dropped from 4.9.y and 4.14.y, thanks!

greg k-h

2018-02-13 09:37:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Sat, Feb 10, 2018 at 07:46:50AM -0800, Guenter Roeck wrote:
> On 02/09/2018 05:38 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.81 release.
> > There are 92 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 145 pass: 140 fail: 5
> Failed builds:
> powerpc:defconfig
> powerpc:allmodconfig
> powerpc:ppc64e_defconfig
> powerpc:cell_defconfig
> powerpc:maple_defconfig
> Qemu test results:
> total: 118 pass: 113 fail: 5
> Failed tests:
> powerpc:mac99:ppc64_book3s_defconfig:nosmp
> powerpc:mac99:ppc64_book3s_defconfig:smp4
> powerpc:pseries:pseries_defconfig
> powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
> powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
>
> All builds and tests fail with the following error.
>
> arch/powerpc/kernel/setup_64.c:41:25: fatal error: asm/debugfs.h: No such file or directory
>
> For older kernels, the backport of 236003e6b5443c4 ("powerpc/64s: Allow control of RFI
> flush via debugfs") should include linux/debugfs.h, not asm/debugfs.h.

Thanks for letting me know, I've now fixed this up. Will go work on the
4.4 patch as well.

greg k-h

2018-02-13 13:36:11

by Nick Lowe

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

Hi,

This does not seem to have subsumed the AMD specific code in

x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
Commit a8799fd14d9f7f385a5a5c86cde247caf4bb0320

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.9.81&id=a8799fd14d9f7f385a5a5c86cde247caf4bb0320

x86/kaiser: Check boottime cmdline params
Commit 8018307a45a90ab2eecfd03d48b7efb31707df37

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/arch/x86/mm/kaiser.c?h=v4.9.75&id=8018307a45a90ab2eecfd03d48b7efb31707df37

Here, we can see:

+skip:
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ goto disable;

Refer to 4.9.81's kaiser.c
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/arch/x86/mm/kaiser.c?h=v4.9.81

Also 4.4.115's kaiser.c
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/arch/x86/mm/kaiser.c?h=v4.4.115

Cheers,

Nick

2018-02-13 14:31:49

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On 02/13/2018 01:36 AM, Greg Kroah-Hartman wrote:
> On Sat, Feb 10, 2018 at 07:46:50AM -0800, Guenter Roeck wrote:
>> On 02/09/2018 05:38 AM, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 4.9.81 release.
>>> There are 92 patches in this series, all will be posted as a response
>>> to this one. If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
>>> Anything received after that time might be too late.
>>>
>>
>> Build results:
>> total: 145 pass: 140 fail: 5
>> Failed builds:
>> powerpc:defconfig
>> powerpc:allmodconfig
>> powerpc:ppc64e_defconfig
>> powerpc:cell_defconfig
>> powerpc:maple_defconfig
>> Qemu test results:
>> total: 118 pass: 113 fail: 5
>> Failed tests:
>> powerpc:mac99:ppc64_book3s_defconfig:nosmp
>> powerpc:mac99:ppc64_book3s_defconfig:smp4
>> powerpc:pseries:pseries_defconfig
>> powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
>> powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
>>
>> All builds and tests fail with the following error.
>>
>> arch/powerpc/kernel/setup_64.c:41:25: fatal error: asm/debugfs.h: No such file or directory
>>
>> For older kernels, the backport of 236003e6b5443c4 ("powerpc/64s: Allow control of RFI
>> flush via debugfs") should include linux/debugfs.h, not asm/debugfs.h.
>
> Thanks for letting me know, I've now fixed this up. Will go work on the
> 4.4 patch as well.
>

Unfortunately it wasn't the only problem. Now we have:

arch/powerpc/kernel/entry_64.S: Assembler messages:
arch/powerpc/kernel/entry_64.S:260: Error: unrecognized opcode: `rfi_to_user'
arch/powerpc/kernel/entry_64.S:270: Error: unrecognized opcode: `rfi_to_kernel'
arch/powerpc/kernel/entry_64.S:885: Error: unrecognized opcode: `rfi_to_user'
arch/powerpc/kernel/entry_64.S:900: Error: unrecognized opcode: `rfi_to_kernel'

Looks like 222f20f140623 ("powerpc/64s: Simple RFI macro conversions") is missing,
or at least part of it. Unfortunately it doesn't apply cleanly.

4.4 also now has a different failure.

In file included from drivers/staging/rdma/ehca/hcp_if.c:45:0:
arch/powerpc/include/asm/hvcall.h:439:2: error: unknown type name 'u64'
u64 character;
^
arch/powerpc/include/asm/hvcall.h:440:2: error: unknown type name 'u64'
u64 behaviour;

You'll need 1b689a95ce742 to fix that problem.

Guenter

2018-02-13 15:02:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

On Tue, Feb 13, 2018 at 01:34:07PM +0000, Nick Lowe wrote:
> Hi,
>
> This does not seem to have subsumed the AMD specific code in
>
> x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown
> Commit a8799fd14d9f7f385a5a5c86cde247caf4bb0320
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.9.81&id=a8799fd14d9f7f385a5a5c86cde247caf4bb0320
>
> x86/kaiser: Check boottime cmdline params
> Commit 8018307a45a90ab2eecfd03d48b7efb31707df37
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/arch/x86/mm/kaiser.c?h=v4.9.75&id=8018307a45a90ab2eecfd03d48b7efb31707df37
>
> Here, we can see:
>
> +skip:
> + if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
> + goto disable;
>
> Refer to 4.9.81's kaiser.c
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/arch/x86/mm/kaiser.c?h=v4.9.81
>
> Also 4.4.115's kaiser.c
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/arch/x86/mm/kaiser.c?h=v4.4.115

So, any hints on what you think should be the correct fix here?

thanks,

greg k-h

2018-02-13 15:12:37

by Arjan van de Ven

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

>
> So, any hints on what you think should be the correct fix here?

the patch sure looks correct to me, it now has a nice table for CPU IDs
including all of AMD (and soon hopefully the existing Intel ones that are not exposed to meltdown)




2018-02-13 15:29:34

by Nick Lowe

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

Hi Arjan and Greg,

Sorry if I am not being clear enough.

My point is that there is a check for X86_VENDOR_AMD now in two places.

It is still hardcoded for the auto boot option which I think should
not be there. The patch on that basis looked incomplete to me.

Put another way, there is no effect to the auto option where the
contents of cpu_no_meltdown[] are changed and
cpu_vulnerable_to_meltdown returns differently.

The auto option does not make use of a determination of the
X86_BUG_CPU_MELTDOWN state.

This seems wrong to me. It does not seem correct to me for the auto
option to have this duplication with a check for just X86_VENDOR_AMD.

Regards,

Nick

2018-02-13 15:31:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Tue, Feb 13, 2018 at 06:30:36AM -0800, Guenter Roeck wrote:
> On 02/13/2018 01:36 AM, Greg Kroah-Hartman wrote:
> > On Sat, Feb 10, 2018 at 07:46:50AM -0800, Guenter Roeck wrote:
> > > On 02/09/2018 05:38 AM, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 4.9.81 release.
> > > > There are 92 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > > >
> > > > Responses should be made by Sun Feb 11 13:39:04 UTC 2018.
> > > > Anything received after that time might be too late.
> > > >
> > >
> > > Build results:
> > > total: 145 pass: 140 fail: 5
> > > Failed builds:
> > > powerpc:defconfig
> > > powerpc:allmodconfig
> > > powerpc:ppc64e_defconfig
> > > powerpc:cell_defconfig
> > > powerpc:maple_defconfig
> > > Qemu test results:
> > > total: 118 pass: 113 fail: 5
> > > Failed tests:
> > > powerpc:mac99:ppc64_book3s_defconfig:nosmp
> > > powerpc:mac99:ppc64_book3s_defconfig:smp4
> > > powerpc:pseries:pseries_defconfig
> > > powerpc:mpc8544ds:ppc64_e5500_defconfig:nosmp
> > > powerpc:mpc8544ds:ppc64_e5500_defconfig:smp
> > >
> > > All builds and tests fail with the following error.
> > >
> > > arch/powerpc/kernel/setup_64.c:41:25: fatal error: asm/debugfs.h: No such file or directory
> > >
> > > For older kernels, the backport of 236003e6b5443c4 ("powerpc/64s: Allow control of RFI
> > > flush via debugfs") should include linux/debugfs.h, not asm/debugfs.h.
> >
> > Thanks for letting me know, I've now fixed this up. Will go work on the
> > 4.4 patch as well.
> >
>
> Unfortunately it wasn't the only problem. Now we have:
>
> arch/powerpc/kernel/entry_64.S: Assembler messages:
> arch/powerpc/kernel/entry_64.S:260: Error: unrecognized opcode: `rfi_to_user'
> arch/powerpc/kernel/entry_64.S:270: Error: unrecognized opcode: `rfi_to_kernel'
> arch/powerpc/kernel/entry_64.S:885: Error: unrecognized opcode: `rfi_to_user'
> arch/powerpc/kernel/entry_64.S:900: Error: unrecognized opcode: `rfi_to_kernel'
>
> Looks like 222f20f140623 ("powerpc/64s: Simple RFI macro conversions") is missing,
> or at least part of it. Unfortunately it doesn't apply cleanly.

Ugh. Let's see if the ppc developers care about this or not :)

> 4.4 also now has a different failure.
>
> In file included from drivers/staging/rdma/ehca/hcp_if.c:45:0:
> arch/powerpc/include/asm/hvcall.h:439:2: error: unknown type name 'u64'
> u64 character;
> ^
> arch/powerpc/include/asm/hvcall.h:440:2: error: unknown type name 'u64'
> u64 behaviour;
>
> You'll need 1b689a95ce742 to fix that problem.

Now queued up, thanks.

greg k-h

2018-02-13 15:58:03

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

On Tue, Feb 13, 2018 at 07:09:44AM -0800, Arjan van de Ven wrote:
> >
> > So, any hints on what you think should be the correct fix here?
>
> the patch sure looks correct to me, it now has a nice table for CPU IDs
> including all of AMD (and soon hopefully the existing Intel ones that are not exposed to meltdown)

I don't think the table is nice, it's a white list that would need
to be maintained forever.

-Andi

2018-02-13 16:05:39

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

On Tue, 13 Feb 2018, Andi Kleen wrote:
> On Tue, Feb 13, 2018 at 07:09:44AM -0800, Arjan van de Ven wrote:
> > >
> > > So, any hints on what you think should be the correct fix here?
> >
> > the patch sure looks correct to me, it now has a nice table for CPU IDs
> > including all of AMD (and soon hopefully the existing Intel ones that are not exposed to meltdown)
>
> I don't think the table is nice, it's a white list that would need
> to be maintained forever.

No. The table is basically excluding families <=5 and a few individual
ones. Anything newer than that should tell via ARCH_CAP_RDCL_NO and not
need any entry.
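
For reference, the whitelist check in the patch looks roughly like this
(paraphrased, not a verbatim quote of the backport):

	static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
	{
		u64 ia32_cap = 0;

		if (x86_match_cpu(cpu_no_meltdown))
			return false;

		if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
			rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);

		/* CPUs reporting RDCL_NO are not vulnerable to Meltdown */
		if (ia32_cap & ARCH_CAP_RDCL_NO)
			return false;

		return true;
	}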

Thanks,

tglx



2018-02-13 16:13:26

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

On Tue, Feb 13, 2018 at 05:02:59PM +0100, Thomas Gleixner wrote:
> No. The table is basically excluding families <=5 and a few individual
> ones. Anything newer than that should tell via ARCH_CAP_RDCL_NO and not
> need any entry.

It looks to me like Nick wants 4.9 to test X86_BUG_CPU_MELTDOWN in
kaiser_check_boottime_disable() not X86_VENDOR_AMD. I.e., the auto
disable should pay attention to the CPUs in the table like upstream.

4.9 got the kaiser backports which explains the difference.

IMHO, of course.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--

2018-02-13 16:19:15

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

On 02/13/2018 07:56 AM, Andi Kleen wrote:
> On Tue, Feb 13, 2018 at 07:09:44AM -0800, Arjan van de Ven wrote:
>>> So, any hints on what you think should be the correct fix here?
>> the patch sure looks correct to me, it now has a nice table for CPU IDs
>> including all of AMD (and soon hopefully the existing Intel ones that are not exposed to meltdown)
> I don't think the table is nice, it's a white list that would need
> to be maintained forever.

On Intel, we have that RDCL_NO bit in the ARCH_CAPABILITIES MSR going
forward to show that we are not vulnerable. So, at least for Intel we
don't need to add new models forever.

2018-02-13 16:38:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

On Tue, Feb 13, 2018 at 03:27:47PM +0000, Nick Lowe wrote:
> Hi Arjan and Greg,
>
> Sorry if I am not being clear enough.
>
> My point is that there is a check for X86_VENDOR_AMD now in two places.
>
> It is still hardcoded for the auto boot option which I think should
> not be there. The patch on that basis looked incomplete to me.
>
> Put another way, there is no effect to the auto option where the
> contents of cpu_no_meltdown[] are changed and
> cpu_vulnerable_to_meltdown returns differently.
>
> The auto option does not make use of a determination of the
> X86_BUG_CPU_MELTDOWN state.
>
> This seems wrong to me. It does not seem correct to me for the auto
> option to have this duplication with a check for just X86_VENDOR_AMD.

Do you have a patch that reflects what you want to see changed here?

And can you test it? :)

I don't have any AMD hardware, sorry.

thanks,

greg k-h

2018-02-16 19:08:49

by Nick Lowe

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

Hi

I do not have a tested patch, but I expect the change would be something like:

skip:
- if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ if (!static_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
goto disable;

Cheers,

Nick


On Tue, Feb 13, 2018 at 4:32 PM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Tue, Feb 13, 2018 at 03:27:47PM +0000, Nick Lowe wrote:
>> Hi Arjan and Greg,
>>
>> Sorry if I am not being clear enough.
>>
>> My point is that there is a check for X86_VENDOR_AMD now in two places.
>>
>> It is still hardcoded for the auto boot option which I think should
>> not be there. The patch on that basis looked incomplete to me.
>>
>> Put another way, there is no effect to the auto option where the
>> contents of cpu_no_meltdown[] are changed and
>> cpu_vulnerable_to_meltdown returns differently.
>>
>> The auto option does not make use of a determination of the
>> X86_BUG_CPU_MELTDOWN state.
>>
>> This seems wrong to me. It does not seem correct to me for the auto
>> option to have this duplication with a check for just X86_VENDOR_AMD.
>
> Do you have a patch that reflects what you want to see changed here?
>
> And can you test it? :)
>
> I don't have any AMD hardware, sorry.
>
> thanks,
>
> greg k-h

2018-02-16 19:20:05

by Nick Lowe

[permalink] [raw]
Subject: Re: [PATCH 4.9 43/92] x86/pti: Do not enable PTI on CPUs which are not vulnerable to Meltdown

Hi,

This should use boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN) - I am
preparing patches.
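
Concretely, the earlier sketch would then become something like (still untested):

skip:
- if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD)
+ if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
  goto disable;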

Best,

Nick

2018-02-17 13:34:32

by Yves-Alexis Perez

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Tue, 2018-02-13 at 16:29 +0100, Greg Kroah-Hartman wrote:
> > arch/powerpc/kernel/entry_64.S: Assembler messages:
> > arch/powerpc/kernel/entry_64.S:260: Error: unrecognized opcode: `rfi_to_user'
> > arch/powerpc/kernel/entry_64.S:270: Error: unrecognized opcode: `rfi_to_kernel'
> > arch/powerpc/kernel/entry_64.S:885: Error: unrecognized opcode: `rfi_to_user'
> > arch/powerpc/kernel/entry_64.S:900: Error: unrecognized opcode: `rfi_to_kernel'
> >
> > Looks like 222f20f140623 ("powerpc/64s: Simple RFI macro conversions") is missing,
> > or at least part of it. Unfortunately it doesn't apply cleanly.
>
> Ugh. Let's see if the ppc developers care about this or not :)

Hi,

in Debian we extracted the following hunk from 222f20f140623 to fix build on
powerpc/ppc64el. Only compile tested against Debian builds though.

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 3320bcac7192..e68faa4d1b13 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -37,6 +37,11 @@
#include <asm/tm.h>
#include <asm/ppc-opcode.h>
#include <asm/export.h>
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/exception-64s.h>
+#else
+#include <asm/exception-64e.h>
+#endif

/*
* System calls.

--
Yves-Alexis


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2018-02-17 13:46:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Sat, Feb 17, 2018 at 02:31:53PM +0100, Yves-Alexis Perez wrote:
> On Tue, 2018-02-13 at 16:29 +0100, Greg Kroah-Hartman wrote:
> > > arch/powerpc/kernel/entry_64.S: Assembler messages:
> > > arch/powerpc/kernel/entry_64.S:260: Error: unrecognized opcode: `rfi_to_user'
> > > arch/powerpc/kernel/entry_64.S:270: Error: unrecognized opcode: `rfi_to_kernel'
> > > arch/powerpc/kernel/entry_64.S:885: Error: unrecognized opcode: `rfi_to_user'
> > > arch/powerpc/kernel/entry_64.S:900: Error: unrecognized opcode: `rfi_to_kernel'
> > >
> > > Looks like 222f20f140623 ("powerpc/64s: Simple RFI macro conversions") is missing,
> > > or at least part of it. Unfortunately it doesn't apply cleanly.
> >
> > Ugh. Let's see if the ppc developers care about this or not :)
>
> Hi,
>
> in Debian we extracted the following hunk from 222f20f140623 to fix build on
> powerpc/ppc64el. Only compile tested against Debian builds though.
>
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 3320bcac7192..e68faa4d1b13 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -37,6 +37,11 @@
> #include <asm/tm.h>
> #include <asm/ppc-opcode.h>
> #include <asm/export.h>
> +#ifdef CONFIG_PPC_BOOK3S
> +#include <asm/exception-64s.h>
> +#else
> +#include <asm/exception-64e.h>
> +#endif
>

Ah, thanks! I've now queued up this portion of the patch.

greg k-h

2018-02-17 17:43:45

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On 02/17/2018 05:45 AM, Greg Kroah-Hartman wrote:
> On Sat, Feb 17, 2018 at 02:31:53PM +0100, Yves-Alexis Perez wrote:
>> On Tue, 2018-02-13 at 16:29 +0100, Greg Kroah-Hartman wrote:
>>>> arch/powerpc/kernel/entry_64.S: Assembler messages:
>>>> arch/powerpc/kernel/entry_64.S:260: Error: unrecognized opcode: `rfi_to_user'
>>>> arch/powerpc/kernel/entry_64.S:270: Error: unrecognized opcode: `rfi_to_kernel'
>>>> arch/powerpc/kernel/entry_64.S:885: Error: unrecognized opcode: `rfi_to_user'
>>>> arch/powerpc/kernel/entry_64.S:900: Error: unrecognized opcode: `rfi_to_kernel'
>>>>
>>>> Looks like 222f20f140623 ("powerpc/64s: Simple RFI macro conversions") is missing,
>>>> or at least part of it. Unfortunately it doesn't apply cleanly.
>>>
>>> Ugh. Let's see if the ppc developers care about this or not :)
>>
>> Hi,
>>
>> in Debian we extracted the following hunk from 222f20f140623 to fix build on
>> powerpc/ppc64el. Only compile tested against Debian builds though.
>>
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 3320bcac7192..e68faa4d1b13 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -37,6 +37,11 @@
>> #include <asm/tm.h>
>> #include <asm/ppc-opcode.h>
>> #include <asm/export.h>
>> +#ifdef CONFIG_PPC_BOOK3S
>> +#include <asm/exception-64s.h>
>> +#else
>> +#include <asm/exception-64e.h>
>> +#endif
>>
>
> Ah, thanks! I've now queued up this portion of the patch.
>

Hmm, that chunk really doesn't do what the original patch is supposed to do,
meaning it won't provide the vulnerability protection it is supposed to provide
(AFAICS that is Meltdown). Just a note in case anyone is concerned about
actually providing that protection.

Guenter

2018-02-18 17:26:36

by Yves-Alexis Perez

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Sat, 2018-02-17 at 09:35 -0800, Guenter Roeck wrote:
> Hmm, that chunk really doesn't do what the original patch is supposed to do,
> meaning it won't provide the vulnerability protection it is supposed to provide
> (AFAICS that is Meltdown). Just a note in case anyone is concerned about
> actually providing that protection.

Right now my concern was really about fixing 4.9.81/4.9.82 build failure on
powerpc. Whether the patches backported in those two releases are enough to
fix Meltdown or not is a different thing indeed (and an information I'd be
glad to have too).

Regards,
--
Yves-Alexis


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2018-02-20 10:41:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH 4.9 00/92] 4.9.81-stable review

On Sat, Feb 17, 2018 at 09:35:59AM -0800, Guenter Roeck wrote:
> On 02/17/2018 05:45 AM, Greg Kroah-Hartman wrote:
> > On Sat, Feb 17, 2018 at 02:31:53PM +0100, Yves-Alexis Perez wrote:
> > > On Tue, 2018-02-13 at 16:29 +0100, Greg Kroah-Hartman wrote:
> > > > > arch/powerpc/kernel/entry_64.S: Assembler messages:
> > > > > arch/powerpc/kernel/entry_64.S:260: Error: unrecognized opcode: `rfi_to_user'
> > > > > arch/powerpc/kernel/entry_64.S:270: Error: unrecognized opcode: `rfi_to_kernel'
> > > > > arch/powerpc/kernel/entry_64.S:885: Error: unrecognized opcode: `rfi_to_user'
> > > > > arch/powerpc/kernel/entry_64.S:900: Error: unrecognized opcode: `rfi_to_kernel'
> > > > >
> > > > > Looks like 222f20f140623 ("powerpc/64s: Simple RFI macro conversions") is missing,
> > > > > or at least part of it. Unfortunately it doesn't apply cleanly.
> > > >
> > > > Ugh. Let's see if the ppc developers care about this or not :)
> > >
> > > Hi,
> > >
> > > in Debian we extracted the following hunk from 222f20f140623 to fix build on
> > > powerpc/ppc64el. Only compile tested against Debian builds though.
> > >
> > > diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> > > index 3320bcac7192..e68faa4d1b13 100644
> > > --- a/arch/powerpc/kernel/entry_64.S
> > > +++ b/arch/powerpc/kernel/entry_64.S
> > > @@ -37,6 +37,11 @@
> > > #include <asm/tm.h>
> > > #include <asm/ppc-opcode.h>
> > > #include <asm/export.h>
> > > +#ifdef CONFIG_PPC_BOOK3S
> > > +#include <asm/exception-64s.h>
> > > +#else
> > > +#include <asm/exception-64e.h>
> > > +#endif
> >
> > Ah, thanks! I've now queued up this portion of the patch.
> >
>
> Hmm, that chunk really doesn't do what the original patch is supposed to do,
> meaning it won't provide the vulnerability protection it is supposed to provide
> (AFAICS that is Meltdown). Just a note in case anyone is concerned about
> actually providing that protection.

Good point, I've renamed this patch now to make it more obvious what is
going on.

thanks,

greg k-h