LinuxLists.cc - [PATCH 5.4 00/17] 5.4.79-rc1 review

2020-11-20 11:09:48

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 00/17] 5.4.79-rc1 review

This is the start of the stable review cycle for the 5.4.79 release.
There are 17 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun, 22 Nov 2020 10:45:32 +0000.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.79-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 5.4.79-rc1

Nick Desaulniers <[email protected]>
ACPI: GED: fix -Wformat

David Edmondson <[email protected]>
KVM: x86: clflushopt should be treated as a no-op by emulation

Zhang Changzhong <[email protected]>
can: proc: can_remove_proc(): silence remove_proc_entry warning

Johannes Berg <[email protected]>
mac80211: always wind down STA state

Dmitry Torokhov <[email protected]>
Input: sunkbd - avoid use-after-free in teardown paths

Hauke Mehrtens <[email protected]>
net: lantiq: Add locking for TX DMA channel

Christophe Leroy <[email protected]>
powerpc/8xx: Always fault when _PAGE_ACCESSED is not set

Eran Ben Elisha <[email protected]>
net/mlx5: Add retry mechanism to the command entry index allocation

Eran Ben Elisha <[email protected]>
net/mlx5: Fix a race when moving command interface to events mode

Eran Ben Elisha <[email protected]>
net/mlx5: poll cmd EQ in case of command timeout

Parav Pandit <[email protected]>
net/mlx5: Use async EQ setup cleanup helpers for multiple EQs

Sudip Mukherjee <[email protected]>
MIPS: PCI: Fix MIPS build

Daniel Axtens <[email protected]>
selftests/powerpc: entry flush test

Michael Ellerman <[email protected]>
powerpc: Only include kup-radix.h for 64-bit Book3S

Nicholas Piggin <[email protected]>
powerpc/64s: flush L1D after user accesses

Nicholas Piggin <[email protected]>
powerpc/64s: flush L1D on kernel entry

Russell Currey <[email protected]>
selftests/powerpc: rfi_flush: disable entry flush if present

-------------

Diffstat:

Documentation/admin-guide/kernel-parameters.txt | 7 +
Makefile | 4 +-
arch/mips/pci/pci-xtalk-bridge.c | 2 +-
arch/powerpc/include/asm/book3s/64/kup-radix.h | 29 ++--
arch/powerpc/include/asm/exception-64s.h | 12 +-
arch/powerpc/include/asm/feature-fixups.h | 19 +++
arch/powerpc/include/asm/kup.h | 27 +++-
arch/powerpc/include/asm/security_features.h | 7 +
arch/powerpc/include/asm/setup.h | 4 +
arch/powerpc/kernel/exceptions-64s.S | 88 +++++------
arch/powerpc/kernel/head_8xx.S | 14 +-
arch/powerpc/kernel/setup_64.c | 122 ++++++++++++++-
arch/powerpc/kernel/vmlinux.lds.S | 14 ++
arch/powerpc/lib/feature-fixups.c | 104 +++++++++++++
arch/powerpc/platforms/powernv/setup.c | 17 +++
arch/powerpc/platforms/pseries/setup.c | 8 +
arch/x86/kvm/emulate.c | 8 +-
drivers/acpi/evged.c | 2 +-
drivers/input/keyboard/sunkbd.c | 41 +++++-
drivers/net/ethernet/lantiq_xrx200.c | 2 +
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 109 ++++++++++++--
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 157 +++++++++++---------
drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h | 2 +
include/linux/mlx5/driver.h | 6 +
net/can/proc.c | 6 +-
net/mac80211/sta_info.c | 18 +++
.../testing/selftests/powerpc/security/.gitignore | 1 +
tools/testing/selftests/powerpc/security/Makefile | 2 +-
.../selftests/powerpc/security/entry_flush.c | 163 +++++++++++++++++++++
.../testing/selftests/powerpc/security/rfi_flush.c | 35 ++++-
30 files changed, 857 insertions(+), 173 deletions(-)

2020-11-20 11:09:50

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 14/17] mac80211: always wind down STA state

From: Johannes Berg <[email protected]>

commit dcd479e10a0510522a5d88b29b8f79ea3467d501 upstream.

When (for example) an IBSS station is pre-moved to AUTHORIZED
before it's inserted, and then the insertion fails, we don't
clean up the fast RX/TX states that might already have been
created, since we don't go through all the state transitions
again on the way down.

Do that, if it hasn't been done already, when the station is
freed. I considered only freeing the fast TX/RX state there,
but we might add more state so it's more robust to wind down
the state properly.

Note that we warn if the station was ever inserted, it should
have been properly cleaned up in that case, and the driver
will probably not like things happening out of order.

Reported-by: [email protected]
Link: https://lore.kernel.org/r/20201009141710.7223b322a955.I95bd08b9ad0e039c034927cce0b75beea38e059b@changeid
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/mac80211/sta_info.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

--- a/net/mac80211/sta_info.c
+++ b/net/mac80211/sta_info.c
@@ -244,6 +244,24 @@ struct sta_info *sta_info_get_by_idx(str
*/
void sta_info_free(struct ieee80211_local *local, struct sta_info *sta)
{
+ /*
+ * If we had used sta_info_pre_move_state() then we might not
+ * have gone through the state transitions down again, so do
+ * it here now (and warn if it's inserted).
+ *
+ * This will clear state such as fast TX/RX that may have been
+ * allocated during state transitions.
+ */
+ while (sta->sta_state > IEEE80211_STA_NONE) {
+ int ret;
+
+ WARN_ON_ONCE(test_sta_flag(sta, WLAN_STA_INSERTED));
+
+ ret = sta_info_move_state(sta, sta->sta_state - 1);
+ if (WARN_ONCE(ret, "sta_info_move_state() returned %d\n", ret))
+ break;
+ }
+
if (sta->rate_ctrl)
rate_control_free_sta(sta);

2020-11-20 11:11:51

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 05/17] selftests/powerpc: entry flush test

From: Daniel Axtens <[email protected]>

commit 89a83a0c69c81a25ce91002b90ca27ed86132a0a upstream.

Add a test modelled on the RFI flush test which counts the number
of L1D misses doing a simple syscall with the entry flush on and off.

For simplicity of backporting, this test duplicates a lot of code from
the upstream rfi_flush. This is cleaned up upstream, but we don't clean
it up here because it would involve bringing in even more commits.

Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/kernel/setup_64.c | 2
arch/powerpc/platforms/powernv/setup.c | 6
tools/testing/selftests/powerpc/security/.gitignore | 1
tools/testing/selftests/powerpc/security/Makefile | 2
tools/testing/selftests/powerpc/security/entry_flush.c | 163 +++++++++++++++++
5 files changed, 170 insertions(+), 4 deletions(-)
create mode 100644 tools/testing/selftests/powerpc/security/entry_flush.c

--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -1025,7 +1025,7 @@ void setup_uaccess_flush(bool enable)
return;

if (!no_uaccess_flush)
- uaccess_flush_enable(true);
+ uaccess_flush_enable(enable);
}

#ifdef CONFIG_DEBUG_FS
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -124,10 +124,12 @@ static void pnv_setup_rfi_flush(void)

/*
* If we are non-Power9 bare metal, we don't need to flush on kernel
- * entry: it fixes a P9 specific vulnerability.
+ * entry or after user access: they fix a P9 specific vulnerability.
*/
- if (!pvr_version_is(PVR_POWER9))
+ if (!pvr_version_is(PVR_POWER9)) {
security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
+ security_ftr_clear(SEC_FTR_L1D_FLUSH_UACCESS);
+ }

enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
(security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || \
--- a/tools/testing/selftests/powerpc/security/.gitignore
+++ b/tools/testing/selftests/powerpc/security/.gitignore
@@ -1 +1,2 @@
rfi_flush
+entry_flush
--- a/tools/testing/selftests/powerpc/security/Makefile
+++ b/tools/testing/selftests/powerpc/security/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0+

-TEST_GEN_PROGS := rfi_flush
+TEST_GEN_PROGS := rfi_flush entry_flush
top_srcdir = ../../../../..

CFLAGS += -I../../../../../usr/include
--- /dev/null
+++ b/tools/testing/selftests/powerpc/security/entry_flush.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+/*
+ * Copyright 2018 IBM Corporation.
+ */
+
+#define __SANE_USERSPACE_TYPES__
+
+#include <sys/types.h>
+#include <stdint.h>
+#include <malloc.h>
+#include <unistd.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <string.h>
+#include <stdio.h>
+#include "utils.h"
+
+#define CACHELINE_SIZE 128
+
+struct perf_event_read {
+ __u64 nr;
+ __u64 l1d_misses;
+};
+
+static inline __u64 load(void *addr)
+{
+ __u64 tmp;
+
+ asm volatile("ld %0,0(%1)" : "=r"(tmp) : "b"(addr));
+
+ return tmp;
+}
+
+static void syscall_loop(char *p, unsigned long iterations,
+ unsigned long zero_size)
+{
+ for (unsigned long i = 0; i < iterations; i++) {
+ for (unsigned long j = 0; j < zero_size; j += CACHELINE_SIZE)
+ load(p + j);
+ getppid();
+ }
+}
+
+int entry_flush_test(void)
+{
+ char *p;
+ int repetitions = 10;
+ int fd, passes = 0, iter, rc = 0;
+ struct perf_event_read v;
+ __u64 l1d_misses_total = 0;
+ unsigned long iterations = 100000, zero_size = 24 * 1024;
+ unsigned long l1d_misses_expected;
+ int rfi_flush_orig;
+ int entry_flush, entry_flush_orig;
+
+ SKIP_IF(geteuid() != 0);
+
+ // The PMU event we use only works on Power7 or later
+ SKIP_IF(!have_hwcap(PPC_FEATURE_ARCH_2_06));
+
+ if (read_debugfs_file("powerpc/rfi_flush", &rfi_flush_orig) < 0) {
+ perror("Unable to read powerpc/rfi_flush debugfs file");
+ SKIP_IF(1);
+ }
+
+ if (read_debugfs_file("powerpc/entry_flush", &entry_flush_orig) < 0) {
+ perror("Unable to read powerpc/entry_flush debugfs file");
+ SKIP_IF(1);
+ }
+
+ if (rfi_flush_orig != 0) {
+ if (write_debugfs_file("powerpc/rfi_flush", 0) < 0) {
+ perror("error writing to powerpc/rfi_flush debugfs file");
+ FAIL_IF(1);
+ }
+ }
+
+ entry_flush = entry_flush_orig;
+
+ fd = perf_event_open_counter(PERF_TYPE_RAW, /* L1d miss */ 0x400f0, -1);
+ FAIL_IF(fd < 0);
+
+ p = (char *)memalign(zero_size, CACHELINE_SIZE);
+
+ FAIL_IF(perf_event_enable(fd));
+
+ // disable L1 prefetching
+ set_dscr(1);
+
+ iter = repetitions;
+
+ /*
+ * We expect to see l1d miss for each cacheline access when entry_flush
+ * is set. Allow a small variation on this.
+ */
+ l1d_misses_expected = iterations * (zero_size / CACHELINE_SIZE - 2);
+
+again:
+ FAIL_IF(perf_event_reset(fd));
+
+ syscall_loop(p, iterations, zero_size);
+
+ FAIL_IF(read(fd, &v, sizeof(v)) != sizeof(v));
+
+ if (entry_flush && v.l1d_misses >= l1d_misses_expected)
+ passes++;
+ else if (!entry_flush && v.l1d_misses < (l1d_misses_expected / 2))
+ passes++;
+
+ l1d_misses_total += v.l1d_misses;
+
+ while (--iter)
+ goto again;
+
+ if (passes < repetitions) {
+ printf("FAIL (L1D misses with entry_flush=%d: %llu %c %lu) [%d/%d failures]\n",
+ entry_flush, l1d_misses_total, entry_flush ? '<' : '>',
+ entry_flush ? repetitions * l1d_misses_expected :
+ repetitions * l1d_misses_expected / 2,
+ repetitions - passes, repetitions);
+ rc = 1;
+ } else
+ printf("PASS (L1D misses with entry_flush=%d: %llu %c %lu) [%d/%d pass]\n",
+ entry_flush, l1d_misses_total, entry_flush ? '>' : '<',
+ entry_flush ? repetitions * l1d_misses_expected :
+ repetitions * l1d_misses_expected / 2,
+ passes, repetitions);
+
+ if (entry_flush == entry_flush_orig) {
+ entry_flush = !entry_flush_orig;
+ if (write_debugfs_file("powerpc/entry_flush", entry_flush) < 0) {
+ perror("error writing to powerpc/entry_flush debugfs file");
+ return 1;
+ }
+ iter = repetitions;
+ l1d_misses_total = 0;
+ passes = 0;
+ goto again;
+ }
+
+ perf_event_disable(fd);
+ close(fd);
+
+ set_dscr(0);
+
+ if (write_debugfs_file("powerpc/rfi_flush", rfi_flush_orig) < 0) {
+ perror("unable to restore original value of powerpc/rfi_flush debugfs file");
+ return 1;
+ }
+
+ if (write_debugfs_file("powerpc/entry_flush", entry_flush_orig) < 0) {
+ perror("unable to restore original value of powerpc/entry_flush debugfs file");
+ return 1;
+ }
+
+ return rc;
+}
+
+int main(int argc, char *argv[])
+{
+ return test_harness(entry_flush_test, "entry_flush_test");
+}

2020-11-20 11:12:21

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 09/17] net/mlx5: Fix a race when moving command interface to events mode

From: Eran Ben Elisha <[email protected]>

commit d43b7007dbd1195a5b6b83213e49b1516aaf6f5e upstream.

After driver creates (via FW command) an EQ for commands, the driver will
be informed on new commands completion by EQE. However, due to a race in
driver's internal command mode metadata update, some new commands will
still be miss-handled by driver as if we are in polling mode. Such commands
can get two non forced completion, leading to already freed command entry
access.

CREATE_EQ command, that maps EQ to the command queue must be posted to the
command queue while it is empty and no other command should be posted.

Add SW mechanism that once the CREATE_EQ command is about to be executed,
all other commands will return error without being sent to the FW. Allow
sending other commands only after successfully changing the driver's
internal command mode metadata.
We can safely return error to all other commands while creating the command
EQ, as all other commands might be sent from the user/application during
driver load. Application can rerun them later after driver's load was
finished.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Cc: Timo Rothenpieler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 35 +++++++++++++++++++++++---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 3 ++
include/linux/mlx5/driver.h | 6 ++++
3 files changed, 40 insertions(+), 4 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -875,6 +875,14 @@ static void free_msg(struct mlx5_core_de
static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
struct mlx5_cmd_msg *msg);

+static bool opcode_allowed(struct mlx5_cmd *cmd, u16 opcode)
+{
+ if (cmd->allowed_opcode == CMD_ALLOWED_OPCODE_ALL)
+ return true;
+
+ return cmd->allowed_opcode == opcode;
+}
+
static void cmd_work_handler(struct work_struct *work)
{
struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
@@ -941,7 +949,8 @@ static void cmd_work_handler(struct work

/* Skip sending command to fw if internal error */
if (pci_channel_offline(dev->pdev) ||
- dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
+ dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
+ !opcode_allowed(&dev->cmd, ent->op)) {
u8 status = 0;
u32 drv_synd;

@@ -1459,6 +1468,22 @@ static void create_debugfs_files(struct
mlx5_cmdif_debugfs_init(dev);
}

+void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ int i;
+
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ down(&cmd->sem);
+ down(&cmd->pages_sem);
+
+ cmd->allowed_opcode = opcode;
+
+ up(&cmd->pages_sem);
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ up(&cmd->sem);
+}
+
static void mlx5_cmd_change_mod(struct mlx5_core_dev *dev, int mode)
{
struct mlx5_cmd *cmd = &dev->cmd;
@@ -1751,12 +1776,13 @@ static int cmd_exec(struct mlx5_core_dev
int err;
u8 status = 0;
u32 drv_synd;
+ u16 opcode;
u8 token;

+ opcode = MLX5_GET(mbox_in, in, opcode);
if (pci_channel_offline(dev->pdev) ||
- dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
- u16 opcode = MLX5_GET(mbox_in, in, opcode);
-
+ dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
+ !opcode_allowed(&dev->cmd, opcode)) {
err = mlx5_internal_err_ret_value(dev, opcode, &drv_synd, &status);
MLX5_SET(mbox_out, out, status, status);
MLX5_SET(mbox_out, out, syndrome, drv_synd);
@@ -2058,6 +2084,7 @@ int mlx5_cmd_init(struct mlx5_core_dev *
mlx5_core_dbg(dev, "descriptor at dma 0x%llx\n", (unsigned long long)(cmd->dma));

cmd->mode = CMD_MODE_POLLING;
+ cmd->allowed_opcode = CMD_ALLOWED_OPCODE_ALL;

create_msg_cache(dev);

--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -648,11 +648,13 @@ static int create_async_eqs(struct mlx5_
.nent = MLX5_NUM_CMD_EQE,
.mask[0] = 1ull << MLX5_EVENT_TYPE_CMD,
};
+ mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_CREATE_EQ);
err = setup_async_eq(dev, &table->cmd_eq, &param, "cmd");
if (err)
goto err1;

mlx5_cmd_use_events(dev);
+ mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);

param = (struct mlx5_eq_param) {
.irq_index = 0,
@@ -682,6 +684,7 @@ err2:
mlx5_cmd_use_polling(dev);
cleanup_async_eq(dev, &table->cmd_eq, "cmd");
err1:
+ mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
return err;
}
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -299,6 +299,7 @@ struct mlx5_cmd {
struct semaphore sem;
struct semaphore pages_sem;
int mode;
+ u16 allowed_opcode;
struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
struct dma_pool *pool;
struct mlx5_cmd_debug dbg;
@@ -890,10 +891,15 @@ mlx5_frag_buf_get_idx_last_contig_stride
return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1);
}

+enum {
+ CMD_ALLOWED_OPCODE_ALL,
+};
+
int mlx5_cmd_init(struct mlx5_core_dev *dev);
void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
void mlx5_cmd_use_polling(struct mlx5_core_dev *dev);
+void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode);

struct mlx5_async_ctx {
struct mlx5_core_dev *dev;

2020-11-20 11:13:01

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 17/17] ACPI: GED: fix -Wformat

From: Nick Desaulniers <[email protected]>

commit 9debfb81e7654fe7388a49f45bc4d789b94c1103 upstream.

Clang is more aggressive about -Wformat warnings when the format flag
specifies a type smaller than the parameter. It turns out that gsi is an
int. Fixes:

drivers/acpi/evged.c:105:48: warning: format specifies type 'unsigned
char' but the argument has type 'unsigned int' [-Wformat]
trigger == ACPI_EDGE_SENSITIVE ? 'E' : 'L', gsi);
^~~

Link: https://github.com/ClangBuiltLinux/linux/issues/378
Fixes: ea6f3af4c5e6 ("ACPI: GED: add support for _Exx / _Lxx handler methods")
Acked-by: Ard Biesheuvel <[email protected]>
Signed-off-by: Nick Desaulniers <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/acpi/evged.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/drivers/acpi/evged.c
+++ b/drivers/acpi/evged.c
@@ -101,7 +101,7 @@ static acpi_status acpi_ged_request_inte

switch (gsi) {
case 0 ... 255:
- sprintf(ev_name, "_%c%02hhX",
+ sprintf(ev_name, "_%c%02X",
trigger == ACPI_EDGE_SENSITIVE ? 'E' : 'L', gsi);

if (ACPI_SUCCESS(acpi_get_handle(handle, ev_name, &evt_handle)))

2020-11-20 11:13:08

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 15/17] can: proc: can_remove_proc(): silence remove_proc_entry warning

From: Zhang Changzhong <[email protected]>

commit 3accbfdc36130282f5ae9e6eecfdf820169fedce upstream.

If can_init_proc() fail to create /proc/net/can directory, can_remove_proc()
will trigger a warning:

WARNING: CPU: 6 PID: 7133 at fs/proc/generic.c:672 remove_proc_entry+0x17b0
Kernel panic - not syncing: panic_on_warn set ...

Fix to return early from can_remove_proc() if can proc_dir does not exists.

Signed-off-by: Zhang Changzhong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Fixes: 8e8cda6d737d ("can: initial support for network namespaces")
Acked-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/can/proc.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/net/can/proc.c
+++ b/net/can/proc.c
@@ -471,6 +471,9 @@ void can_init_proc(struct net *net)
*/
void can_remove_proc(struct net *net)
{
+ if (!net->can.proc_dir)
+ return;
+
if (net->can.pde_version)
remove_proc_entry(CAN_PROC_VERSION, net->can.proc_dir);

@@ -498,6 +501,5 @@ void can_remove_proc(struct net *net)
if (net->can.pde_rcvlist_sff)
remove_proc_entry(CAN_PROC_RCVLIST_SFF, net->can.proc_dir);

- if (net->can.proc_dir)
- remove_proc_entry("can", net->proc_net);
+ remove_proc_entry("can", net->proc_net);
}

2020-11-20 11:13:09

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 10/17] net/mlx5: Add retry mechanism to the command entry index allocation

From: Eran Ben Elisha <[email protected]>

commit 410bd754cd73c4a2ac3856d9a03d7b08f9c906bf upstream.

It is possible that new command entry index allocation will temporarily
fail. The new command holds the semaphore, so it means that a free entry
should be ready soon. Add one second retry mechanism before returning an
error.

Patch "net/mlx5: Avoid possible free of command entry while timeout comp
handler" increase the possibility to bump into this temporarily failure
as it delays the entry index release for non-callback commands.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Cc: Timo Rothenpieler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -883,6 +883,25 @@ static bool opcode_allowed(struct mlx5_c
return cmd->allowed_opcode == opcode;
}

+static int cmd_alloc_index_retry(struct mlx5_cmd *cmd)
+{
+ unsigned long alloc_end = jiffies + msecs_to_jiffies(1000);
+ int idx;
+
+retry:
+ idx = cmd_alloc_index(cmd);
+ if (idx < 0 && time_before(jiffies, alloc_end)) {
+ /* Index allocation can fail on heavy load of commands. This is a temporary
+ * situation as the current command already holds the semaphore, meaning that
+ * another command completion is being handled and it is expected to release
+ * the entry index soon.
+ */
+ cpu_relax();
+ goto retry;
+ }
+ return idx;
+}
+
static void cmd_work_handler(struct work_struct *work)
{
struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
@@ -900,7 +919,7 @@ static void cmd_work_handler(struct work
sem = ent->page_queue ? &cmd->pages_sem : &cmd->sem;
down(sem);
if (!ent->page_queue) {
- alloc_ret = cmd_alloc_index(cmd);
+ alloc_ret = cmd_alloc_index_retry(cmd);
if (alloc_ret < 0) {
mlx5_core_err(dev, "failed to allocate command entry\n");
if (ent->callback) {

2020-11-20 11:13:09

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 12/17] net: lantiq: Add locking for TX DMA channel

From: Hauke Mehrtens <[email protected]>

commit f9317ae5523f99999fb54c513ebabbb2bc887ddf upstream.

The TX DMA channel data is accessed by the xrx200_start_xmit() and the
xrx200_tx_housekeeping() function from different threads. Make sure the
accesses are synchronized by acquiring the netif_tx_lock() in the
xrx200_tx_housekeeping() function too. This lock is acquired by the
kernel before calling xrx200_start_xmit().

Signed-off-by: Hauke Mehrtens <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/lantiq_xrx200.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/net/ethernet/lantiq_xrx200.c
+++ b/drivers/net/ethernet/lantiq_xrx200.c
@@ -245,6 +245,7 @@ static int xrx200_tx_housekeeping(struct
int pkts = 0;
int bytes = 0;

+ netif_tx_lock(net_dev);
while (pkts < budget) {
struct ltq_dma_desc *desc = &ch->dma.desc_base[ch->tx_free];

@@ -268,6 +269,7 @@ static int xrx200_tx_housekeeping(struct
net_dev->stats.tx_bytes += bytes;
netdev_completed_queue(ch->priv->net_dev, pkts, bytes);

+ netif_tx_unlock(net_dev);
if (netif_queue_stopped(net_dev))
netif_wake_queue(net_dev);

2020-11-20 11:13:11

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 16/17] KVM: x86: clflushopt should be treated as a no-op by emulation

From: David Edmondson <[email protected]>

commit 51b958e5aeb1e18c00332e0b37c5d4e95a3eff84 upstream.

The instruction emulator ignores clflush instructions, yet fails to
support clflushopt. Treat both similarly.

Fixes: 13e457e0eebf ("KVM: x86: Emulator does not decode clflush well")
Signed-off-by: David Edmondson <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Joao Martins <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/x86/kvm/emulate.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -4050,6 +4050,12 @@ static int em_clflush(struct x86_emulate
return X86EMUL_CONTINUE;
}

+static int em_clflushopt(struct x86_emulate_ctxt *ctxt)
+{
+ /* emulating clflushopt regardless of cpuid */
+ return X86EMUL_CONTINUE;
+}
+
static int em_movsxd(struct x86_emulate_ctxt *ctxt)
{
ctxt->dst.val = (s32) ctxt->src.val;
@@ -4592,7 +4598,7 @@ static const struct opcode group11[] = {
};

static const struct gprefix pfx_0f_ae_7 = {
- I(SrcMem | ByteOp, em_clflush), N, N, N,
+ I(SrcMem | ByteOp, em_clflush), I(SrcMem | ByteOp, em_clflushopt), N, N,
};

static const struct group_dual group15 = { {

2020-11-20 11:13:27

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 06/17] MIPS: PCI: Fix MIPS build

From: Sudip Mukherjee <[email protected]>

While backporting 37640adbefd6 ("MIPS: PCI: remember nasid changed by
set interrupt affinity") something went wrong and an extra 'n' was added.
So 'data->nasid' became 'data->nnasid' and the MIPS builds started failing.

This is only needed for 5.4-stable tree.

Fixes: 957978aa56f1 ("MIPS: PCI: remember nasid changed by set interrupt affinity")
Signed-off-by: Sudip Mukherjee <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/mips/pci/pci-xtalk-bridge.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/arch/mips/pci/pci-xtalk-bridge.c
+++ b/arch/mips/pci/pci-xtalk-bridge.c
@@ -284,7 +284,7 @@ static int bridge_set_affinity(struct ir
ret = irq_chip_set_affinity_parent(d, mask, force);
if (ret >= 0) {
cpu = cpumask_first_and(mask, cpu_online_mask);
- data->nnasid = COMPACT_TO_NASID_NODEID(cpu_to_node(cpu));
+ data->nasid = COMPACT_TO_NASID_NODEID(cpu_to_node(cpu));
bridge_write(data->bc, b_int_addr[pin].addr,
(((data->bc->intr_addr >> 30) & 0x30000) |
bit | (data->nasid << 8)));

2020-11-20 11:13:27

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 07/17] net/mlx5: Use async EQ setup cleanup helpers for multiple EQs

From: Parav Pandit <[email protected]>

commit 3ed879965cc4ea13fe0908468b653c4ff2cb1309 upstream.

Use helper routines to setup and teardown multiple EQs and reuse the
code in setup, cleanup and error unwinding flows.

Signed-off-by: Parav Pandit <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Cc: Timo Rothenpieler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 114 +++++++++++----------------
1 file changed, 49 insertions(+), 65 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -563,6 +563,39 @@ static void gather_async_events_mask(str
gather_user_async_events(dev, mask);
}

+static int
+setup_async_eq(struct mlx5_core_dev *dev, struct mlx5_eq_async *eq,
+ struct mlx5_eq_param *param, const char *name)
+{
+ int err;
+
+ eq->irq_nb.notifier_call = mlx5_eq_async_int;
+
+ err = create_async_eq(dev, &eq->core, param);
+ if (err) {
+ mlx5_core_warn(dev, "failed to create %s EQ %d\n", name, err);
+ return err;
+ }
+ err = mlx5_eq_enable(dev, &eq->core, &eq->irq_nb);
+ if (err) {
+ mlx5_core_warn(dev, "failed to enable %s EQ %d\n", name, err);
+ destroy_async_eq(dev, &eq->core);
+ }
+ return err;
+}
+
+static void cleanup_async_eq(struct mlx5_core_dev *dev,
+ struct mlx5_eq_async *eq, const char *name)
+{
+ int err;
+
+ mlx5_eq_disable(dev, &eq->core, &eq->irq_nb);
+ err = destroy_async_eq(dev, &eq->core);
+ if (err)
+ mlx5_core_err(dev, "failed to destroy %s eq, err(%d)\n",
+ name, err);
+}
+
static int create_async_eqs(struct mlx5_core_dev *dev)
{
struct mlx5_eq_table *table = dev->priv.eq_table;
@@ -572,77 +605,45 @@ static int create_async_eqs(struct mlx5_
MLX5_NB_INIT(&table->cq_err_nb, cq_err_event_notifier, CQ_ERROR);
mlx5_eq_notifier_register(dev, &table->cq_err_nb);

- table->cmd_eq.irq_nb.notifier_call = mlx5_eq_async_int;
param = (struct mlx5_eq_param) {
.irq_index = 0,
.nent = MLX5_NUM_CMD_EQE,
+ .mask[0] = 1ull << MLX5_EVENT_TYPE_CMD,
};
-
- param.mask[0] = 1ull << MLX5_EVENT_TYPE_CMD;
- err = create_async_eq(dev, &table->cmd_eq.core, &param);
- if (err) {
- mlx5_core_warn(dev, "failed to create cmd EQ %d\n", err);
- goto err0;
- }
- err = mlx5_eq_enable(dev, &table->cmd_eq.core, &table->cmd_eq.irq_nb);
- if (err) {
- mlx5_core_warn(dev, "failed to enable cmd EQ %d\n", err);
+ err = setup_async_eq(dev, &table->cmd_eq, &param, "cmd");
+ if (err)
goto err1;
- }
+
mlx5_cmd_use_events(dev);

- table->async_eq.irq_nb.notifier_call = mlx5_eq_async_int;
param = (struct mlx5_eq_param) {
.irq_index = 0,
.nent = MLX5_NUM_ASYNC_EQE,
};

gather_async_events_mask(dev, param.mask);
- err = create_async_eq(dev, &table->async_eq.core, &param);
- if (err) {
- mlx5_core_warn(dev, "failed to create async EQ %d\n", err);
+ err = setup_async_eq(dev, &table->async_eq, &param, "async");
+ if (err)
goto err2;
- }
- err = mlx5_eq_enable(dev, &table->async_eq.core,
- &table->async_eq.irq_nb);
- if (err) {
- mlx5_core_warn(dev, "failed to enable async EQ %d\n", err);
- goto err3;
- }

- table->pages_eq.irq_nb.notifier_call = mlx5_eq_async_int;
param = (struct mlx5_eq_param) {
.irq_index = 0,
.nent = /* TODO: sriov max_vf + */ 1,
+ .mask[0] = 1ull << MLX5_EVENT_TYPE_PAGE_REQUEST,
};

- param.mask[0] = 1ull << MLX5_EVENT_TYPE_PAGE_REQUEST;
- err = create_async_eq(dev, &table->pages_eq.core, &param);
- if (err) {
- mlx5_core_warn(dev, "failed to create pages EQ %d\n", err);
- goto err4;
- }
- err = mlx5_eq_enable(dev, &table->pages_eq.core,
- &table->pages_eq.irq_nb);
- if (err) {
- mlx5_core_warn(dev, "failed to enable pages EQ %d\n", err);
- goto err5;
- }
+ err = setup_async_eq(dev, &table->pages_eq, &param, "pages");
+ if (err)
+ goto err3;

- return err;
+ return 0;

-err5:
- destroy_async_eq(dev, &table->pages_eq.core);
-err4:
- mlx5_eq_disable(dev, &table->async_eq.core, &table->async_eq.irq_nb);
err3:
- destroy_async_eq(dev, &table->async_eq.core);
+ cleanup_async_eq(dev, &table->async_eq, "async");
err2:
mlx5_cmd_use_polling(dev);
- mlx5_eq_disable(dev, &table->cmd_eq.core, &table->cmd_eq.irq_nb);
+ cleanup_async_eq(dev, &table->cmd_eq, "cmd");
err1:
- destroy_async_eq(dev, &table->cmd_eq.core);
-err0:
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
return err;
}
@@ -650,28 +651,11 @@ err0:
static void destroy_async_eqs(struct mlx5_core_dev *dev)
{
struct mlx5_eq_table *table = dev->priv.eq_table;
- int err;
-
- mlx5_eq_disable(dev, &table->pages_eq.core, &table->pages_eq.irq_nb);
- err = destroy_async_eq(dev, &table->pages_eq.core);
- if (err)
- mlx5_core_err(dev, "failed to destroy pages eq, err(%d)\n",
- err);
-
- mlx5_eq_disable(dev, &table->async_eq.core, &table->async_eq.irq_nb);
- err = destroy_async_eq(dev, &table->async_eq.core);
- if (err)
- mlx5_core_err(dev, "failed to destroy async eq, err(%d)\n",
- err);

+ cleanup_async_eq(dev, &table->pages_eq, "pages");
+ cleanup_async_eq(dev, &table->async_eq, "async");
mlx5_cmd_use_polling(dev);
-
- mlx5_eq_disable(dev, &table->cmd_eq.core, &table->cmd_eq.irq_nb);
- err = destroy_async_eq(dev, &table->cmd_eq.core);
- if (err)
- mlx5_core_err(dev, "failed to destroy command eq, err(%d)\n",
- err);
-
+ cleanup_async_eq(dev, &table->cmd_eq, "cmd");
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
}

2020-11-20 11:38:44

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 04/17] powerpc: Only include kup-radix.h for 64-bit Book3S

From: Michael Ellerman <[email protected]>

commit 178d52c6e89c38d0553b0ac8b99927b11eb995b0 upstream.

In kup.h we currently include kup-radix.h for all 64-bit builds, which
includes Book3S and Book3E. The latter doesn't make sense, Book3E
never uses the Radix MMU.

This has worked up until now, but almost by accident, and the recent
uaccess flush changes introduced a build breakage on Book3E because of
the bad structure of the code.

So disentangle things so that we only use kup-radix.h for Book3S. This
requires some more stubs in kup.h.

Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/powerpc/include/asm/book3s/64/kup-radix.h | 5 +++--
arch/powerpc/include/asm/kup.h | 14 +++++++++++---
2 files changed, 14 insertions(+), 5 deletions(-)

--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -11,13 +11,12 @@

#ifdef __ASSEMBLY__

-.macro kuap_restore_amr gpr
#ifdef CONFIG_PPC_KUAP
+.macro kuap_restore_amr gpr
BEGIN_MMU_FTR_SECTION_NESTED(67)
ld \gpr, STACK_REGS_KUAP(r1)
mtspr SPRN_AMR, \gpr
END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_RADIX_KUAP, 67)
-#endif
.endm

.macro kuap_check_amr gpr1, gpr2
@@ -31,6 +30,7 @@
END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_RADIX_KUAP, 67)
#endif
.endm
+#endif

.macro kuap_save_amr_and_lock gpr1, gpr2, use_cr, msr_pr_cr
#ifdef CONFIG_PPC_KUAP
@@ -87,6 +87,7 @@ bad_kuap_fault(struct pt_regs *regs, uns
"Bug: %s fault blocked by AMR!", is_write ? "Write" : "Read");
}
#else /* CONFIG_PPC_KUAP */
+static inline void kuap_restore_amr(struct pt_regs *regs, unsigned long amr) { }
static inline void set_kuap(unsigned long value) { }
#endif /* !CONFIG_PPC_KUAP */

--- a/arch/powerpc/include/asm/kup.h
+++ b/arch/powerpc/include/asm/kup.h
@@ -6,7 +6,7 @@
#define KUAP_WRITE 2
#define KUAP_READ_WRITE (KUAP_READ | KUAP_WRITE)

-#ifdef CONFIG_PPC64
+#ifdef CONFIG_PPC_BOOK3S_64
#include <asm/book3s/64/kup-radix.h>
#endif
#ifdef CONFIG_PPC_8xx
@@ -24,9 +24,15 @@
.macro kuap_restore sp, current, gpr1, gpr2, gpr3
.endm

+.macro kuap_restore_amr gpr
+.endm
+
.macro kuap_check current, gpr
.endm

+.macro kuap_check_amr gpr1, gpr2
+.endm
+
#endif

#else /* !__ASSEMBLY__ */
@@ -52,17 +58,19 @@ bad_kuap_fault(struct pt_regs *regs, uns
return false;
}

+static inline void kuap_check_amr(void) { }
+
/*
* book3s/64/kup-radix.h defines these functions for the !KUAP case to flush
* the L1D cache after user accesses. Only include the empty stubs for other
* platforms.
*/
-#ifndef CONFIG_PPC64
+#ifndef CONFIG_PPC_BOOK3S_64
static inline void allow_user_access(void __user *to, const void __user *from,
unsigned long size, unsigned long dir) { }
static inline void prevent_user_access(void __user *to, const void __user *from,
unsigned long size, unsigned long dir) { }
-#endif /* CONFIG_PPC64 */
+#endif /* CONFIG_PPC_BOOK3S_64 */
#endif /* CONFIG_PPC_KUAP */

static inline void allow_read_from_user(const void __user *from, unsigned long size)

2020-11-20 11:39:46

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 02/17] powerpc/64s: flush L1D on kernel entry

From: Nicholas Piggin <[email protected]>

commit f79643787e0a0762d2409b7b8334e83f22d85695 upstream.

[backporting note: we need to mark some exception handlers as out-of-line
because the flushing makes them take too much space -- dja]

IBM Power9 processors can speculatively operate on data in the L1 cache
before it has been completely validated, via a way-prediction mechanism. It
is not possible for an attacker to determine the contents of impermissible
memory using this method, since these systems implement a combination of
hardware and software security measures to prevent scenarios where
protected data could be leaked.

However these measures don't address the scenario where an attacker induces
the operating system to speculatively execute instructions using data that
the attacker controls. This can be used for example to speculatively bypass
"kernel user access prevention" techniques, as discovered by Anthony
Steinhauser of Google's Safeside Project. This is not an attack by itself,
but there is a possibility it could be used in conjunction with
side-channels or other weaknesses in the privileged code to construct an
attack.

This issue can be mitigated by flushing the L1 cache between privilege
boundaries of concern. This patch flushes the L1 cache on kernel entry.

This is part of the fix for CVE-2020-4788.

Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Daniel Axtens <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 3 +
arch/powerpc/include/asm/exception-64s.h | 9 +++
arch/powerpc/include/asm/feature-fixups.h | 10 ++++
arch/powerpc/include/asm/security_features.h | 4 +
arch/powerpc/include/asm/setup.h | 3 +
arch/powerpc/kernel/exceptions-64s.S | 49 +++++++++++++++++--
arch/powerpc/kernel/setup_64.c | 60 +++++++++++++++++++++++-
arch/powerpc/kernel/vmlinux.lds.S | 7 ++
arch/powerpc/lib/feature-fixups.c | 54 +++++++++++++++++++++
arch/powerpc/platforms/powernv/setup.c | 11 ++++
arch/powerpc/platforms/pseries/setup.c | 4 +
11 files changed, 206 insertions(+), 8 deletions(-)

--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2667,6 +2667,7 @@
mds=off [X86]
tsx_async_abort=off [X86]
kvm.nx_huge_pages=off [X86]
+ no_entry_flush [PPC]

Exceptions:
This does not have any effect on
@@ -2989,6 +2990,8 @@

noefi Disable EFI runtime services support.

+ no_entry_flush [PPC] Don't flush the L1-D cache when entering the kernel.
+
noexec [IA-64]

noexec [X86]
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -61,11 +61,18 @@
nop; \
nop

+#define ENTRY_FLUSH_SLOT \
+ ENTRY_FLUSH_FIXUP_SECTION; \
+ nop; \
+ nop; \
+ nop;
+
/*
* r10 must be free to use, r13 must be paca
*/
#define INTERRUPT_TO_KERNEL \
- STF_ENTRY_BARRIER_SLOT
+ STF_ENTRY_BARRIER_SLOT; \
+ ENTRY_FLUSH_SLOT

/*
* Macros for annotating the expected destination of (h)rfid
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -205,6 +205,14 @@ label##3: \
FTR_ENTRY_OFFSET 955b-956b; \
.popsection;

+#define ENTRY_FLUSH_FIXUP_SECTION \
+957: \
+ .pushsection __entry_flush_fixup,"a"; \
+ .align 2; \
+958: \
+ FTR_ENTRY_OFFSET 957b-958b; \
+ .popsection;
+
#define RFI_FLUSH_FIXUP_SECTION \
951: \
.pushsection __rfi_flush_fixup,"a"; \
@@ -237,8 +245,10 @@ label##3: \
#include <linux/types.h>

extern long stf_barrier_fallback;
+extern long entry_flush_fallback;
extern long __start___stf_entry_barrier_fixup, __stop___stf_entry_barrier_fixup;
extern long __start___stf_exit_barrier_fixup, __stop___stf_exit_barrier_fixup;
+extern long __start___entry_flush_fixup, __stop___entry_flush_fixup;
extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
extern long __start___barrier_nospec_fixup, __stop___barrier_nospec_fixup;
extern long __start__btb_flush_fixup, __stop__btb_flush_fixup;
--- a/arch/powerpc/include/asm/security_features.h
+++ b/arch/powerpc/include/asm/security_features.h
@@ -84,12 +84,16 @@ static inline bool security_ftr_enabled(
// Software required to flush link stack on context switch
#define SEC_FTR_FLUSH_LINK_STACK 0x0000000000001000ull

+// The L1-D cache should be flushed when entering the kernel
+#define SEC_FTR_L1D_FLUSH_ENTRY 0x0000000000004000ull
+

// Features enabled by default
#define SEC_FTR_DEFAULT \
(SEC_FTR_L1D_FLUSH_HV | \
SEC_FTR_L1D_FLUSH_PR | \
SEC_FTR_BNDS_CHK_SPEC_BAR | \
+ SEC_FTR_L1D_FLUSH_ENTRY | \
SEC_FTR_FAVOUR_SECURITY)

#endif /* _ASM_POWERPC_SECURITY_FEATURES_H */
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -52,12 +52,15 @@ enum l1d_flush_type {
};

void setup_rfi_flush(enum l1d_flush_type, bool enable);
+void setup_entry_flush(bool enable);
+void setup_uaccess_flush(bool enable);
void do_rfi_flush_fixups(enum l1d_flush_type types);
#ifdef CONFIG_PPC_BARRIER_NOSPEC
void setup_barrier_nospec(void);
#else
static inline void setup_barrier_nospec(void) { };
#endif
+void do_entry_flush_fixups(enum l1d_flush_type types);
void do_barrier_nospec_fixups(bool enable);
extern bool barrier_nospec_enabled;

--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1150,7 +1150,7 @@ EXC_REAL_BEGIN(data_access, 0x300, 0x80)
INT_HANDLER data_access, 0x300, ool=1, dar=1, dsisr=1, kvm=1
EXC_REAL_END(data_access, 0x300, 0x80)
EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
- INT_HANDLER data_access, 0x300, virt=1, dar=1, dsisr=1
+ INT_HANDLER data_access, 0x300, ool=1, virt=1, dar=1, dsisr=1
EXC_VIRT_END(data_access, 0x4300, 0x80)
INT_KVM_HANDLER data_access, 0x300, EXC_STD, PACA_EXGEN, 1
EXC_COMMON_BEGIN(data_access_common)
@@ -1205,7 +1205,7 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TY

EXC_REAL_BEGIN(instruction_access, 0x400, 0x80)
- INT_HANDLER instruction_access, 0x400, kvm=1
+ INT_HANDLER instruction_access, 0x400, ool=1, kvm=1
EXC_REAL_END(instruction_access, 0x400, 0x80)
EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
INT_HANDLER instruction_access, 0x400, virt=1
@@ -1225,7 +1225,7 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TY

EXC_REAL_BEGIN(instruction_access_slb, 0x480, 0x80)
- INT_HANDLER instruction_access_slb, 0x480, area=PACA_EXSLB, kvm=1
+ INT_HANDLER instruction_access_slb, 0x480, ool=1, area=PACA_EXSLB, kvm=1
EXC_REAL_END(instruction_access_slb, 0x480, 0x80)
EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
INT_HANDLER instruction_access_slb, 0x480, virt=1, area=PACA_EXSLB
@@ -1365,17 +1365,17 @@ EXC_REAL_BEGIN(decrementer, 0x900, 0x80)
INT_HANDLER decrementer, 0x900, ool=1, bitmask=IRQS_DISABLED, kvm=1
EXC_REAL_END(decrementer, 0x900, 0x80)
EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80)
- INT_HANDLER decrementer, 0x900, virt=1, bitmask=IRQS_DISABLED
+ INT_HANDLER decrementer, 0x900, ool=1, virt=1, bitmask=IRQS_DISABLED
EXC_VIRT_END(decrementer, 0x4900, 0x80)
INT_KVM_HANDLER decrementer, 0x900, EXC_STD, PACA_EXGEN, 0
EXC_COMMON_ASYNC(decrementer_common, 0x900, timer_interrupt)

EXC_REAL_BEGIN(hdecrementer, 0x980, 0x80)
- INT_HANDLER hdecrementer, 0x980, hsrr=EXC_HV, kvm=1
+ INT_HANDLER hdecrementer, 0x980, ool=1, hsrr=EXC_HV, kvm=1
EXC_REAL_END(hdecrementer, 0x980, 0x80)
EXC_VIRT_BEGIN(hdecrementer, 0x4980, 0x80)
- INT_HANDLER hdecrementer, 0x980, virt=1, hsrr=EXC_HV, kvm=1
+ INT_HANDLER hdecrementer, 0x980, ool=1, virt=1, hsrr=EXC_HV, kvm=1
EXC_VIRT_END(hdecrementer, 0x4980, 0x80)
INT_KVM_HANDLER hdecrementer, 0x980, EXC_HV, PACA_EXGEN, 0
EXC_COMMON(hdecrementer_common, 0x980, hdec_interrupt)
@@ -2046,6 +2046,43 @@ TRAMP_REAL_BEGIN(stf_barrier_fallback)
.endr
blr

+TRAMP_REAL_BEGIN(entry_flush_fallback)
+ std r9,PACA_EXRFI+EX_R9(r13)
+ std r10,PACA_EXRFI+EX_R10(r13)
+ std r11,PACA_EXRFI+EX_R11(r13)
+ mfctr r9
+ ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+ ld r11,PACA_L1D_FLUSH_SIZE(r13)
+ srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
+ mtctr r11
+ DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+ /* order ld/st prior to dcbt stop all streams with flushing */
+ sync
+
+ /*
+ * The load addresses are at staggered offsets within cachelines,
+ * which suits some pipelines better (on others it should not
+ * hurt).
+ */
+1:
+ ld r11,(0x80 + 8)*0(r10)
+ ld r11,(0x80 + 8)*1(r10)
+ ld r11,(0x80 + 8)*2(r10)
+ ld r11,(0x80 + 8)*3(r10)
+ ld r11,(0x80 + 8)*4(r10)
+ ld r11,(0x80 + 8)*5(r10)
+ ld r11,(0x80 + 8)*6(r10)
+ ld r11,(0x80 + 8)*7(r10)
+ addi r10,r10,0x80*8
+ bdnz 1b
+
+ mtctr r9
+ ld r9,PACA_EXRFI+EX_R9(r13)
+ ld r10,PACA_EXRFI+EX_R10(r13)
+ ld r11,PACA_EXRFI+EX_R11(r13)
+ blr
+
TRAMP_REAL_BEGIN(rfi_flush_fallback)
SET_SCRATCH0(r13);
GET_PACA(r13);
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -859,7 +859,9 @@ early_initcall(disable_hardlockup_detect
static enum l1d_flush_type enabled_flush_types;
static void *l1d_flush_fallback_area;
static bool no_rfi_flush;
+static bool no_entry_flush;
bool rfi_flush;
+bool entry_flush;

static int __init handle_no_rfi_flush(char *p)
{
@@ -869,6 +871,14 @@ static int __init handle_no_rfi_flush(ch
}
early_param("no_rfi_flush", handle_no_rfi_flush);

+static int __init handle_no_entry_flush(char *p)
+{
+ pr_info("entry-flush: disabled on command line.");
+ no_entry_flush = true;
+ return 0;
+}
+early_param("no_entry_flush", handle_no_entry_flush);
+
/*
* The RFI flush is not KPTI, but because users will see doco that says to use
* nopti we hijack that option here to also disable the RFI flush.
@@ -900,6 +910,18 @@ void rfi_flush_enable(bool enable)
rfi_flush = enable;
}

+void entry_flush_enable(bool enable)
+{
+ if (enable) {
+ do_entry_flush_fixups(enabled_flush_types);
+ on_each_cpu(do_nothing, NULL, 1);
+ } else {
+ do_entry_flush_fixups(L1D_FLUSH_NONE);
+ }
+
+ entry_flush = enable;
+}
+
static void __ref init_fallback_flush(void)
{
u64 l1d_size, limit;
@@ -958,10 +980,19 @@ void setup_rfi_flush(enum l1d_flush_type

enabled_flush_types = types;

- if (!no_rfi_flush && !cpu_mitigations_off())
+ if (!cpu_mitigations_off() && !no_rfi_flush)
rfi_flush_enable(enable);
}

+void setup_entry_flush(bool enable)
+{
+ if (cpu_mitigations_off())
+ return;
+
+ if (!no_entry_flush)
+ entry_flush_enable(enable);
+}
+
#ifdef CONFIG_DEBUG_FS
static int rfi_flush_set(void *data, u64 val)
{
@@ -989,9 +1020,36 @@ static int rfi_flush_get(void *data, u64

DEFINE_SIMPLE_ATTRIBUTE(fops_rfi_flush, rfi_flush_get, rfi_flush_set, "%llu\n");

+static int entry_flush_set(void *data, u64 val)
+{
+ bool enable;
+
+ if (val == 1)
+ enable = true;
+ else if (val == 0)
+ enable = false;
+ else
+ return -EINVAL;
+
+ /* Only do anything if we're changing state */
+ if (enable != entry_flush)
+ entry_flush_enable(enable);
+
+ return 0;
+}
+
+static int entry_flush_get(void *data, u64 *val)
+{
+ *val = entry_flush ? 1 : 0;
+ return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_entry_flush, entry_flush_get, entry_flush_set, "%llu\n");
+
static __init int rfi_flush_debugfs_init(void)
{
debugfs_create_file("rfi_flush", 0600, powerpc_debugfs_root, NULL, &fops_rfi_flush);
+ debugfs_create_file("entry_flush", 0600, powerpc_debugfs_root, NULL, &fops_entry_flush);
return 0;
}
device_initcall(rfi_flush_debugfs_init);
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -144,6 +144,13 @@ SECTIONS
}

. = ALIGN(8);
+ __entry_flush_fixup : AT(ADDR(__entry_flush_fixup) - LOAD_OFFSET) {
+ __start___entry_flush_fixup = .;
+ *(__entry_flush_fixup)
+ __stop___entry_flush_fixup = .;
+ }
+
+ . = ALIGN(8);
__stf_exit_barrier_fixup : AT(ADDR(__stf_exit_barrier_fixup) - LOAD_OFFSET) {
__start___stf_exit_barrier_fixup = .;
*(__stf_exit_barrier_fixup)
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -228,6 +228,60 @@ void do_stf_barrier_fixups(enum stf_barr
do_stf_exit_barrier_fixups(types);
}

+void do_entry_flush_fixups(enum l1d_flush_type types)
+{
+ unsigned int instrs[3], *dest;
+ long *start, *end;
+ int i;
+
+ start = PTRRELOC(&__start___entry_flush_fixup);
+ end = PTRRELOC(&__stop___entry_flush_fixup);
+
+ instrs[0] = 0x60000000; /* nop */
+ instrs[1] = 0x60000000; /* nop */
+ instrs[2] = 0x60000000; /* nop */
+
+ i = 0;
+ if (types == L1D_FLUSH_FALLBACK) {
+ instrs[i++] = 0x7d4802a6; /* mflr r10 */
+ instrs[i++] = 0x60000000; /* branch patched below */
+ instrs[i++] = 0x7d4803a6; /* mtlr r10 */
+ }
+
+ if (types & L1D_FLUSH_ORI) {
+ instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+ instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
+ }
+
+ if (types & L1D_FLUSH_MTTRIG)
+ instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+
+ for (i = 0; start < end; start++, i++) {
+ dest = (void *)start + *start;
+
+ pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+ patch_instruction(dest, instrs[0]);
+
+ if (types == L1D_FLUSH_FALLBACK)
+ patch_branch((dest + 1), (unsigned long)&entry_flush_fallback,
+ BRANCH_SET_LINK);
+ else
+ patch_instruction((dest + 1), instrs[1]);
+
+ patch_instruction((dest + 2), instrs[2]);
+ }
+
+ printk(KERN_DEBUG "entry-flush: patched %d locations (%s flush)\n", i,
+ (types == L1D_FLUSH_NONE) ? "no" :
+ (types == L1D_FLUSH_FALLBACK) ? "fallback displacement" :
+ (types & L1D_FLUSH_ORI) ? (types & L1D_FLUSH_MTTRIG)
+ ? "ori+mttrig type"
+ : "ori type" :
+ (types & L1D_FLUSH_MTTRIG) ? "mttrig type"
+ : "unknown");
+}
+
void do_rfi_flush_fixups(enum l1d_flush_type types)
{
unsigned int instrs[3], *dest;
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -122,12 +122,23 @@ static void pnv_setup_rfi_flush(void)
type = L1D_FLUSH_ORI;
}

+ /*
+ * If we are non-Power9 bare metal, we don't need to flush on kernel
+ * entry: it fixes a P9 specific vulnerability.
+ */
+ if (!pvr_version_is(PVR_POWER9))
+ security_ftr_clear(SEC_FTR_L1D_FLUSH_ENTRY);
+
enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) && \
(security_ftr_enabled(SEC_FTR_L1D_FLUSH_PR) || \
security_ftr_enabled(SEC_FTR_L1D_FLUSH_HV));

setup_rfi_flush(type, enable);
setup_count_cache_flush();
+
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+ security_ftr_enabled(SEC_FTR_L1D_FLUSH_ENTRY);
+ setup_entry_flush(enable);
}

static void __init pnv_setup_arch(void)
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -561,6 +561,10 @@ void pseries_setup_rfi_flush(void)

setup_rfi_flush(types, enable);
setup_count_cache_flush();
+
+ enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+ security_ftr_enabled(SEC_FTR_L1D_FLUSH_ENTRY);
+ setup_entry_flush(enable);
}

#ifdef CONFIG_PCI_IOV

2020-11-20 11:40:27

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 13/17] Input: sunkbd - avoid use-after-free in teardown paths

From: Dmitry Torokhov <[email protected]>

commit 77e70d351db7de07a46ac49b87a6c3c7a60fca7e upstream.

We need to make sure we cancel the reinit work before we tear down the
driver structures.

Reported-by: Bodong Zhao <[email protected]>
Tested-by: Bodong Zhao <[email protected]>
Cc: [email protected]
Signed-off-by: Dmitry Torokhov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/input/keyboard/sunkbd.c | 41 ++++++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 8 deletions(-)

--- a/drivers/input/keyboard/sunkbd.c
+++ b/drivers/input/keyboard/sunkbd.c
@@ -99,7 +99,8 @@ static irqreturn_t sunkbd_interrupt(stru
switch (data) {

case SUNKBD_RET_RESET:
- schedule_work(&sunkbd->tq);
+ if (sunkbd->enabled)
+ schedule_work(&sunkbd->tq);
sunkbd->reset = -1;
break;

@@ -200,16 +201,12 @@ static int sunkbd_initialize(struct sunk
}

/*
- * sunkbd_reinit() sets leds and beeps to a state the computer remembers they
- * were in.
+ * sunkbd_set_leds_beeps() sets leds and beeps to a state the computer remembers
+ * they were in.
*/

-static void sunkbd_reinit(struct work_struct *work)
+static void sunkbd_set_leds_beeps(struct sunkbd *sunkbd)
{
- struct sunkbd *sunkbd = container_of(work, struct sunkbd, tq);
-
- wait_event_interruptible_timeout(sunkbd->wait, sunkbd->reset >= 0, HZ);
-
serio_write(sunkbd->serio, SUNKBD_CMD_SETLED);
serio_write(sunkbd->serio,
(!!test_bit(LED_CAPSL, sunkbd->dev->led) << 3) |
@@ -222,11 +219,39 @@ static void sunkbd_reinit(struct work_st
SUNKBD_CMD_BELLOFF - !!test_bit(SND_BELL, sunkbd->dev->snd));
}

+
+/*
+ * sunkbd_reinit() wait for the keyboard reset to complete and restores state
+ * of leds and beeps.
+ */
+
+static void sunkbd_reinit(struct work_struct *work)
+{
+ struct sunkbd *sunkbd = container_of(work, struct sunkbd, tq);
+
+ /*
+ * It is OK that we check sunkbd->enabled without pausing serio,
+ * as we only want to catch true->false transition that will
+ * happen once and we will be woken up for it.
+ */
+ wait_event_interruptible_timeout(sunkbd->wait,
+ sunkbd->reset >= 0 || !sunkbd->enabled,
+ HZ);
+
+ if (sunkbd->reset >= 0 && sunkbd->enabled)
+ sunkbd_set_leds_beeps(sunkbd);
+}
+
static void sunkbd_enable(struct sunkbd *sunkbd, bool enable)
{
serio_pause_rx(sunkbd->serio);
sunkbd->enabled = enable;
serio_continue_rx(sunkbd->serio);
+
+ if (!enable) {
+ wake_up_interruptible(&sunkbd->wait);
+ cancel_work_sync(&sunkbd->tq);
+ }
}

/*

2020-11-20 11:41:54

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 08/17] net/mlx5: poll cmd EQ in case of command timeout

From: Eran Ben Elisha <[email protected]>

commit 1d5558b1f0de81f54ddee05f3793acc5260d107f upstream.

Once driver detects a command interface command timeout, it warns the
user and returns timeout error to the caller. In such case, the entry of
the command is not evacuated (because only real event interrupt is allowed
to clear command interface entry). If the HW event interrupt
of this entry will never arrive, this entry will be left unused forever.
Command interface entries are limited and eventually we can end up without
the ability to post a new command.

In addition, if driver will not consume the EQE of the lost interrupt and
rearm the EQ, no new interrupts will arrive for other commands.

Add a resiliency mechanism for manually polling the command EQ in case of
a command timeout. In case resiliency mechanism will find non-handled EQE,
it will consume it, and the command interface will be fully functional
again. Once the resiliency flow finished, wait another 5 seconds for the
command interface to complete for this command entry.

Define mlx5_cmd_eq_recover() to manage the cmd EQ polling resiliency flow.
Add an async EQ spinlock to avoid races between resiliency flows and real
interrupts that might run simultaneously.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Cc: Timo Rothenpieler <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 53 +++++++++++++++++++----
drivers/net/ethernet/mellanox/mlx5/core/eq.c | 40 ++++++++++++++++-
drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h | 2
3 files changed, 86 insertions(+), 9 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -853,11 +853,21 @@ static void cb_timeout_handler(struct wo
struct mlx5_core_dev *dev = container_of(ent->cmd, struct mlx5_core_dev,
cmd);

+ mlx5_cmd_eq_recover(dev);
+
+ /* Maybe got handled by eq recover ? */
+ if (!test_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state)) {
+ mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) Async, recovered after timeout\n", ent->idx,
+ mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
+ goto out; /* phew, already handled */
+ }
+
ent->ret = -ETIMEDOUT;
- mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
- mlx5_command_str(msg_to_opcode(ent->in)),
- msg_to_opcode(ent->in));
+ mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) Async, timeout. Will cause a leak of a command resource\n",
+ ent->idx, mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
+
+out:
cmd_ent_put(ent); /* for the cmd_ent_get() took on schedule delayed work */
}

@@ -987,6 +997,35 @@ static const char *deliv_status_to_str(u
}
}

+enum {
+ MLX5_CMD_TIMEOUT_RECOVER_MSEC = 5 * 1000,
+};
+
+static void wait_func_handle_exec_timeout(struct mlx5_core_dev *dev,
+ struct mlx5_cmd_work_ent *ent)
+{
+ unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_RECOVER_MSEC);
+
+ mlx5_cmd_eq_recover(dev);
+
+ /* Re-wait on the ent->done after executing the recovery flow. If the
+ * recovery flow (or any other recovery flow running simultaneously)
+ * has recovered an EQE, it should cause the entry to be completed by
+ * the command interface.
+ */
+ if (wait_for_completion_timeout(&ent->done, timeout)) {
+ mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) recovered after timeout\n", ent->idx,
+ mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
+ return;
+ }
+
+ mlx5_core_warn(dev, "cmd[%d]: %s(0x%x) No done completion\n", ent->idx,
+ mlx5_command_str(msg_to_opcode(ent->in)), msg_to_opcode(ent->in));
+
+ ent->ret = -ETIMEDOUT;
+ mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
+}
+
static int wait_func(struct mlx5_core_dev *dev, struct mlx5_cmd_work_ent *ent)
{
unsigned long timeout = msecs_to_jiffies(MLX5_CMD_TIMEOUT_MSEC);
@@ -998,12 +1037,10 @@ static int wait_func(struct mlx5_core_de
ent->ret = -ECANCELED;
goto out_err;
}
- if (cmd->mode == CMD_MODE_POLLING || ent->polling) {
+ if (cmd->mode == CMD_MODE_POLLING || ent->polling)
wait_for_completion(&ent->done);
- } else if (!wait_for_completion_timeout(&ent->done, timeout)) {
- ent->ret = -ETIMEDOUT;
- mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
- }
+ else if (!wait_for_completion_timeout(&ent->done, timeout))
+ wait_func_handle_exec_timeout(dev, ent);

out_err:
err = ent->ret;
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -190,6 +190,29 @@ u32 mlx5_eq_poll_irq_disabled(struct mlx
return count_eqe;
}

+static void mlx5_eq_async_int_lock(struct mlx5_eq_async *eq, unsigned long *flags)
+ __acquires(&eq->lock)
+{
+ if (in_irq())
+ spin_lock(&eq->lock);
+ else
+ spin_lock_irqsave(&eq->lock, *flags);
+}
+
+static void mlx5_eq_async_int_unlock(struct mlx5_eq_async *eq, unsigned long *flags)
+ __releases(&eq->lock)
+{
+ if (in_irq())
+ spin_unlock(&eq->lock);
+ else
+ spin_unlock_irqrestore(&eq->lock, *flags);
+}
+
+enum async_eq_nb_action {
+ ASYNC_EQ_IRQ_HANDLER = 0,
+ ASYNC_EQ_RECOVER = 1,
+};
+
static int mlx5_eq_async_int(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -199,11 +222,14 @@ static int mlx5_eq_async_int(struct noti
struct mlx5_eq_table *eqt;
struct mlx5_core_dev *dev;
struct mlx5_eqe *eqe;
+ unsigned long flags;
int num_eqes = 0;

dev = eq->dev;
eqt = dev->priv.eq_table;

+ mlx5_eq_async_int_lock(eq_async, &flags);
+
eqe = next_eqe_sw(eq);
if (!eqe)
goto out;
@@ -224,8 +250,19 @@ static int mlx5_eq_async_int(struct noti

out:
eq_update_ci(eq, 1);
+ mlx5_eq_async_int_unlock(eq_async, &flags);

- return 0;
+ return unlikely(action == ASYNC_EQ_RECOVER) ? num_eqes : 0;
+}
+
+void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev)
+{
+ struct mlx5_eq_async *eq = &dev->priv.eq_table->cmd_eq;
+ int eqes;
+
+ eqes = mlx5_eq_async_int(&eq->irq_nb, ASYNC_EQ_RECOVER, NULL);
+ if (eqes)
+ mlx5_core_warn(dev, "Recovered %d EQEs on cmd_eq\n", eqes);
}

static void init_eq_buf(struct mlx5_eq *eq)
@@ -570,6 +607,7 @@ setup_async_eq(struct mlx5_core_dev *dev
int err;

eq->irq_nb.notifier_call = mlx5_eq_async_int;
+ spin_lock_init(&eq->lock);

err = create_async_eq(dev, &eq->core, param);
if (err) {
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h
@@ -38,6 +38,7 @@ struct mlx5_eq {
struct mlx5_eq_async {
struct mlx5_eq core;
struct notifier_block irq_nb;
+ spinlock_t lock; /* To avoid irq EQ handle races with resiliency flows */
};

struct mlx5_eq_comp {
@@ -82,6 +83,7 @@ void mlx5_cq_tasklet_cb(unsigned long da
struct cpumask *mlx5_eq_comp_cpumask(struct mlx5_core_dev *dev, int ix);

u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq);
+void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev);
void mlx5_eq_synchronize_async_irq(struct mlx5_core_dev *dev);
void mlx5_eq_synchronize_cmd_irq(struct mlx5_core_dev *dev);

2020-11-20 11:41:58

by Greg Kroah-Hartman

[permalink] [raw]

Subject: [PATCH 5.4 11/17] powerpc/8xx: Always fault when _PAGE_ACCESSED is not set

From: Christophe Leroy <[email protected]>

commit 29daf869cbab69088fe1755d9dd224e99ba78b56 upstream.

The kernel expects pte_young() to work regardless of CONFIG_SWAP.

Make sure a minor fault is taken to set _PAGE_ACCESSED when it
is not already set, regardless of the selection of CONFIG_SWAP.

This adds at least 3 instructions to the TLB miss exception
handlers fast path. Following patch will reduce this overhead.

Also update the rotation instruction to the correct number of bits
to reflect all changes done to _PAGE_ACCESSED over time.

Fixes: d069cb4373fe ("powerpc/8xx: Don't touch ACCESSED when no SWAP.")
Fixes: 5f356497c384 ("powerpc/8xx: remove unused _PAGE_WRITETHRU")
Fixes: e0a8e0d90a9f ("powerpc/8xx: Handle PAGE_USER via APG bits")
Fixes: 5b2753fc3e8a ("powerpc/8xx: Implementation of PAGE_EXEC")
Fixes: a891c43b97d3 ("powerpc/8xx: Prepare handlers for _PAGE_HUGE for 512k pages.")
Cc: [email protected]
Signed-off-by: Christophe Leroy <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/af834e8a0f1fa97bfae65664950f0984a70c4750.1602492856.git.christophe.leroy@csgroup.eu
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/powerpc/kernel/head_8xx.S | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)

--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -229,9 +229,7 @@ SystemCall:

InstructionTLBMiss:
mtspr SPRN_SPRG_SCRATCH0, r10
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mtspr SPRN_SPRG_SCRATCH1, r11
-#endif

/* If we are faulting a kernel address, we have to use the
* kernel page tables.
@@ -278,11 +276,9 @@ InstructionTLBMiss:
#ifdef ITLB_MISS_KERNEL
mtcr r11
#endif
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-7, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 20 and 23 must be clear.
* Software indicator bits 22, 24, 25, 26, and 27 must be
@@ -296,9 +292,7 @@ InstructionTLBMiss:

/* Restore registers */
0: mfspr r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mfspr r11, SPRN_SPRG_SCRATCH1
-#endif
rfi
patch_site 0b, patch__itlbmiss_exit_1

@@ -308,9 +302,7 @@ InstructionTLBMiss:
addi r10, r10, 1
stw r10, (itlb_miss_counter - PAGE_OFFSET)@l(0)
mfspr r10, SPRN_SPRG_SCRATCH0
-#if defined(ITLB_MISS_KERNEL) || defined(CONFIG_SWAP)
mfspr r11, SPRN_SPRG_SCRATCH1
-#endif
rfi
#endif

@@ -394,11 +386,9 @@ DataStoreTLBMiss:
* r11 = ((r10 & PRESENT) & ((r10 & ACCESSED) >> 5));
* r10 = (r10 & ~PRESENT) | r11;
*/
-#ifdef CONFIG_SWAP
- rlwinm r11, r10, 32-5, _PAGE_PRESENT
+ rlwinm r11, r10, 32-7, _PAGE_PRESENT
and r11, r11, r10
rlwimi r10, r11, 0, _PAGE_PRESENT
-#endif
/* The Linux PTE won't go exactly into the MMU TLB.
* Software indicator bits 24, 25, 26, and 27 must be
* set. All other Linux PTE bits control the behavior

2020-11-20 22:32:31

[permalink] [raw]

Subject: Re: [PATCH 5.4 00/17] 5.4.79-rc1 review

On 11/20/20 4:03 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.79 release.
> There are 17 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 22 Nov 2020 10:45:32 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.79-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

Tested-by: Shuah Khan <[email protected]>

thanks,
-- Shuah

2020-11-21 11:50:39

by Naresh Kamboju

[permalink] [raw]

Subject: Re: [PATCH 5.4 00/17] 5.4.79-rc1 review

On Fri, 20 Nov 2020 at 16:37, Greg Kroah-Hartman
<[email protected]> wrote:
>
> This is the start of the stable review cycle for the 5.4.79 release.
> There are 17 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun, 22 Nov 2020 10:45:32 +0000.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.79-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h

Results from Linaro’s test farm.
No regressions on arm64, arm, x86_64, and i386.

Tested-by: Linux Kernel Functional Testing <[email protected]>

Summary
------------------------------------------------------------------------

kernel: 5.4.79-rc1
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
git branch: linux-5.4.y
git commit: ea92920d046bee86876dc51ba95a34d2056cb9a0
git describe: v5.4.78-18-gea92920d046b
Test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.4.y/build/v5.4.78-18-gea92920d046b

No regressions (compared to build v5.4.78)

No fixes (compared to build v5.4.78)

Ran 49619 total tests in the following environments and test suites.

Environments
--------------
- arm64
- dragonboard-410c
- hi6220-hikey
- i386
- juno-r2
- juno-r2-compat
- juno-r2-kasan
- nxp-ls2088
- qemu-arm-clang
- qemu-arm64-clang
- qemu-arm64-kasan
- qemu-x86_64-clang
- qemu-x86_64-kasan
- qemu_arm
- qemu_arm64
- qemu_arm64-compat
- qemu_i386
- qemu_x86_64
- qemu_x86_64-compat
- x15
- x86
- x86-kasan

Test Suites
-----------
* build
* install-android-platform-tools-r2600
* kselftest
* libhugetlbfs
* linux-log-parser
* ltp-cap_bounds-tests
* ltp-commands-tests
* ltp-containers-tests
* ltp-cpuhotplug-tests
* ltp-crypto-tests
* ltp-dio-tests
* ltp-fcntl-locktests-tests
* ltp-filecaps-tests
* ltp-fs_bind-tests
* ltp-fs_perms_simple-tests
* ltp-fsx-tests
* ltp-io-tests
* ltp-ipc-tests
* ltp-math-tests
* ltp-nptl-tests
* ltp-pty-tests
* ltp-sched-tests
* ltp-securebits-tests
* ltp-syscalls-tests
* ltp-tracing-tests
* perf
* v4l2-compliance
* ltp-controllers-tests
* ltp-cve-tests
* ltp-fs-tests
* ltp-hugetlb-tests
* ltp-mm-tests
* network-basic-tests
* ltp-open-posix-tests
* kvm-unit-tests

--
Linaro LKFT
https://lkft.linaro.org

2020-11-21 18:40:36

by Guenter Roeck

[permalink] [raw]

Subject: Re: [PATCH 5.4 00/17] 5.4.79-rc1 review

On Fri, Nov 20, 2020 at 12:03:27PM +0100, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 5.4.79 release.
> There are 17 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>

Build results:
total: 157 pass: 157 fail: 0
Qemu test results:
total: 426 pass: 426 fail: 0

Tested-by: Guenter Roeck <[email protected]>

Guenter