2020-03-05 17:15:28

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 01/58] ACPI: watchdog: Allow disabling WDAT at boot

From: Jean Delvare <[email protected]>

[ Upstream commit 3f9e12e0df012c4a9a7fd7eb0d3ae69b459d6b2c ]

In case the WDAT interface is broken, give the user an option to
ignore it to let a native driver bind to the watchdog device instead.

Signed-off-by: Jean Delvare <[email protected]>
Acked-by: Mika Westerberg <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 4 ++++
drivers/acpi/acpi_watchdog.c | 12 +++++++++++-
2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5594c8bf1dcd4..b5c933fa971f3 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -136,6 +136,10 @@
dynamic table installation which will install SSDT
tables to /sys/firmware/acpi/tables/dynamic.

+ acpi_no_watchdog [HW,ACPI,WDT]
+ Ignore the ACPI-based watchdog interface (WDAT) and let
+ a native driver control the watchdog device instead.
+
acpi_rsdp= [ACPI,EFI,KEXEC]
Pass the RSDP address to the kernel, mostly used
on machines running EFI runtime service to boot the
diff --git a/drivers/acpi/acpi_watchdog.c b/drivers/acpi/acpi_watchdog.c
index b5516b04ffc07..ab6e434b4cee0 100644
--- a/drivers/acpi/acpi_watchdog.c
+++ b/drivers/acpi/acpi_watchdog.c
@@ -55,12 +55,14 @@ static bool acpi_watchdog_uses_rtc(const struct acpi_table_wdat *wdat)
}
#endif

+static bool acpi_no_watchdog;
+
static const struct acpi_table_wdat *acpi_watchdog_get_wdat(void)
{
const struct acpi_table_wdat *wdat = NULL;
acpi_status status;

- if (acpi_disabled)
+ if (acpi_disabled || acpi_no_watchdog)
return NULL;

status = acpi_get_table(ACPI_SIG_WDAT, 0,
@@ -88,6 +90,14 @@ bool acpi_has_watchdog(void)
}
EXPORT_SYMBOL_GPL(acpi_has_watchdog);

+/* ACPI watchdog can be disabled on boot command line */
+static int __init disable_acpi_watchdog(char *str)
+{
+ acpi_no_watchdog = true;
+ return 1;
+}
+__setup("acpi_no_watchdog", disable_acpi_watchdog);
+
void __init acpi_watchdog_init(void)
{
const struct acpi_wdat_entry *entries;
--
2.20.1


2020-03-05 17:15:33

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 03/58] HID: core: fix off-by-one memset in hid_report_raw_event()

From: Johan Korsnes <[email protected]>

[ Upstream commit 5ebdffd25098898aff1249ae2f7dbfddd76d8f8f ]

In case a report is greater than HID_MAX_BUFFER_SIZE, it is truncated,
but the report-number byte is not correctly handled. This results in a
off-by-one in the following memset, causing a kernel Oops and ensuing
system crash.

Note: With commit 8ec321e96e05 ("HID: Fix slab-out-of-bounds read in
hid_field_extract") I no longer hit the kernel Oops as we instead fail
"controlled" at probe if there is a report too long in the HID
report-descriptor. hid_report_raw_event() is an exported symbol, so
presumabely we cannot always rely on this being the case.

Fixes: 966922f26c7f ("HID: fix a crash in hid_report_raw_event()
function.")
Signed-off-by: Johan Korsnes <[email protected]>
Cc: Armando Visconti <[email protected]>
Cc: Jiri Kosina <[email protected]>
Cc: Alan Stern <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/hid-core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
index 851fe54ea59e7..359616e3efbbb 100644
--- a/drivers/hid/hid-core.c
+++ b/drivers/hid/hid-core.c
@@ -1741,7 +1741,9 @@ int hid_report_raw_event(struct hid_device *hid, int type, u8 *data, u32 size,

rsize = ((report->size - 1) >> 3) + 1;

- if (rsize > HID_MAX_BUFFER_SIZE)
+ if (report_enum->numbered && rsize >= HID_MAX_BUFFER_SIZE)
+ rsize = HID_MAX_BUFFER_SIZE - 1;
+ else if (rsize > HID_MAX_BUFFER_SIZE)
rsize = HID_MAX_BUFFER_SIZE;

if (csize < rsize) {
--
2.20.1

2020-03-05 17:15:47

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 08/58] ACPI: watchdog: Set default timeout in probe

From: Mika Westerberg <[email protected]>

[ Upstream commit cabe17d0173ab04bd3f87b8199ae75f43f1ea473 ]

If the BIOS default timeout for the watchdog is too small userspace may
not have enough time to configure new timeout after opening the device
before the system is already reset. For this reason program default
timeout of 30 seconds in the driver probe and allow userspace to change
this from command line or through module parameter (wdat_wdt.timeout).

Reported-by: Jean Delvare <[email protected]>
Signed-off-by: Mika Westerberg <[email protected]>
Reviewed-by: Jean Delvare <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/watchdog/wdat_wdt.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/drivers/watchdog/wdat_wdt.c b/drivers/watchdog/wdat_wdt.c
index e7cf41aa26c3b..697b04ffee97c 100644
--- a/drivers/watchdog/wdat_wdt.c
+++ b/drivers/watchdog/wdat_wdt.c
@@ -54,6 +54,13 @@ module_param(nowayout, bool, 0);
MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
__MODULE_STRING(WATCHDOG_NOWAYOUT) ")");

+#define WDAT_DEFAULT_TIMEOUT 30
+
+static int timeout = WDAT_DEFAULT_TIMEOUT;
+module_param(timeout, int, 0);
+MODULE_PARM_DESC(timeout, "Watchdog timeout in seconds (default="
+ __MODULE_STRING(WDAT_DEFAULT_TIMEOUT) ")");
+
static int wdat_wdt_read(struct wdat_wdt *wdat,
const struct wdat_instruction *instr, u32 *value)
{
@@ -438,6 +445,22 @@ static int wdat_wdt_probe(struct platform_device *pdev)

platform_set_drvdata(pdev, wdat);

+ /*
+ * Set initial timeout so that userspace has time to configure the
+ * watchdog properly after it has opened the device. In some cases
+ * the BIOS default is too short and causes immediate reboot.
+ */
+ if (timeout * 1000 < wdat->wdd.min_hw_heartbeat_ms ||
+ timeout * 1000 > wdat->wdd.max_hw_heartbeat_ms) {
+ dev_warn(dev, "Invalid timeout %d given, using %d\n",
+ timeout, WDAT_DEFAULT_TIMEOUT);
+ timeout = WDAT_DEFAULT_TIMEOUT;
+ }
+
+ ret = wdat_wdt_set_timeout(&wdat->wdd, timeout);
+ if (ret)
+ return ret;
+
watchdog_set_nowayout(&wdat->wdd, nowayout);
return devm_watchdog_register_device(dev, &wdat->wdd);
}
--
2.20.1

2020-03-05 17:16:05

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 17/58] MIPS: vdso: Wrap -mexplicit-relocs in cc-option

From: Nathan Chancellor <[email protected]>

[ Upstream commit 72cf3b3df423c1bbd8fa1056fed009d3a260f8a9 ]

Clang does not support this option and errors out:

clang-11: error: unknown argument: '-mexplicit-relocs'

Clang does not appear to need this flag like GCC does because the jalr
check that was added in commit 976c23af3ee5 ("mips: vdso: add build
time check that no 'jalr t9' calls left") passes just fine with

$ make ARCH=mips CC=clang CROSS_COMPILE=mipsel-linux-gnu- malta_defconfig arch/mips/vdso/

even before commit d3f703c4359f ("mips: vdso: fix 'jalr t9' crash in
vdso code").

-mrelax-pic-calls has been supported since clang 9, which is the
earliest version that could build a working MIPS kernel, and it is the
default for clang so just leave it be.

Fixes: d3f703c4359f ("mips: vdso: fix 'jalr t9' crash in vdso code")
Link: https://github.com/ClangBuiltLinux/linux/issues/890
Signed-off-by: Nathan Chancellor <[email protected]>
Reviewed-by: Nick Desaulniers <[email protected]>
Tested-by: Nick Desaulniers <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: Ralf Baechle <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/vdso/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 08c835d48520b..3dcfdee678fb9 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -29,7 +29,7 @@ endif
cflags-vdso := $(ccflags-vdso) \
$(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
-O3 -g -fPIC -fno-strict-aliasing -fno-common -fno-builtin -G 0 \
- -mrelax-pic-calls -mexplicit-relocs \
+ -mrelax-pic-calls $(call cc-option, -mexplicit-relocs) \
-fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
$(call cc-option, -fno-asynchronous-unwind-tables) \
$(call cc-option, -fno-stack-protector)
--
2.20.1

2020-03-05 17:16:08

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 18/58] selftests/rseq: Fix out-of-tree compilation

From: Michael Ellerman <[email protected]>

[ Upstream commit ef89d0545132d685f73da6f58b7e7fe002536f91 ]

Currently if you build with O=... the rseq tests don't build:

$ make O=$PWD/output -C tools/testing/selftests/ TARGETS=rseq
make: Entering directory '/linux/tools/testing/selftests'
...
make[1]: Entering directory '/linux/tools/testing/selftests/rseq'
gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -o /linux/output/rseq/librseq.so
gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_test.c -lpthread -lrseq -o /linux/output/rseq/basic_test
/usr/bin/ld: cannot find -lrseq
collect2: error: ld returned 1 exit status

This is because the library search path points to the source
directory, not the output.

We can fix it by changing the library search path to $(OUTPUT).

Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Shuah Khan <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
tools/testing/selftests/rseq/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index d6469535630af..708c1b3452454 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -4,7 +4,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
CLANG_FLAGS += -no-integrated-as
endif

-CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ \
+CFLAGS += -O2 -Wall -g -I./ -I../../../../usr/include/ -L$(OUTPUT) -Wl,-rpath=./ \
$(CLANG_FLAGS)
LDLIBS += -lpthread

--
2.20.1

2020-03-05 17:16:11

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 19/58] tracing: Fix number printing bug in print_synth_event()

From: Tom Zanussi <[email protected]>

[ Upstream commit 784bd0847eda032ed2f3522f87250655a18c0190 ]

Fix a varargs-related bug in print_synth_event() which resulted in
strange output and oopses on 32-bit x86 systems. The problem is that
trace_seq_printf() expects the varargs to match the format string, but
print_synth_event() was always passing u64 values regardless. This
results in unspecified behavior when unpacking with va_arg() in
trace_seq_printf().

Add a function that takes the size into account when calling
trace_seq_printf().

Before:

modprobe-1731 [003] .... 919.039758: gen_synth_test: next_pid_field=777(null)next_comm_field=hula hoops ts_ns=1000000 ts_ms=1000 cpu=3(null)my_string_field=thneed my_int_field=598(null)

After:

insmod-1136 [001] .... 36.634590: gen_synth_test: next_pid_field=777 next_comm_field=hula hoops ts_ns=1000000 ts_ms=1000 cpu=1 my_string_field=thneed my_int_field=598

Link: http://lkml.kernel.org/r/a9b59eb515dbbd7d4abe53b347dccf7a8e285657.1581720155.git.zanussi@kernel.org

Reported-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/trace/trace_events_hist.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace_events_hist.c b/kernel/trace/trace_events_hist.c
index a31be3fce3e8e..6495800fb92a1 100644
--- a/kernel/trace/trace_events_hist.c
+++ b/kernel/trace/trace_events_hist.c
@@ -811,6 +811,29 @@ static const char *synth_field_fmt(char *type)
return fmt;
}

+static void print_synth_event_num_val(struct trace_seq *s,
+ char *print_fmt, char *name,
+ int size, u64 val, char *space)
+{
+ switch (size) {
+ case 1:
+ trace_seq_printf(s, print_fmt, name, (u8)val, space);
+ break;
+
+ case 2:
+ trace_seq_printf(s, print_fmt, name, (u16)val, space);
+ break;
+
+ case 4:
+ trace_seq_printf(s, print_fmt, name, (u32)val, space);
+ break;
+
+ default:
+ trace_seq_printf(s, print_fmt, name, val, space);
+ break;
+ }
+}
+
static enum print_line_t print_synth_event(struct trace_iterator *iter,
int flags,
struct trace_event *event)
@@ -849,10 +872,13 @@ static enum print_line_t print_synth_event(struct trace_iterator *iter,
} else {
struct trace_print_flags __flags[] = {
__def_gfpflag_names, {-1, NULL} };
+ char *space = (i == se->n_fields - 1 ? "" : " ");

- trace_seq_printf(s, print_fmt, se->fields[i]->name,
- entry->fields[n_u64],
- i == se->n_fields - 1 ? "" : " ");
+ print_synth_event_num_val(s, print_fmt,
+ se->fields[i]->name,
+ se->fields[i]->size,
+ entry->fields[n_u64],
+ space);

if (strcmp(se->fields[i]->type, "gfp_t") == 0) {
trace_seq_puts(s, " (");
--
2.20.1

2020-03-05 17:16:13

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 20/58] nl80211: fix potential leak in AP start

From: Johannes Berg <[email protected]>

[ Upstream commit 9951ebfcdf2b97dbb28a5d930458424341e61aa2 ]

If nl80211_parse_he_obss_pd() fails, we leak the previously
allocated ACL memory. Free it in this case.

Fixes: 796e90f42b7e ("cfg80211: add support for parsing OBBS_PD attributes")
Signed-off-by: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/r/20200221104142.835aba4cdd14.I1923b55ba9989c57e13978f91f40bfdc45e60cbd@changeid
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/wireless/nl80211.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index c74646b7a751f..78c2d9359fc72 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -4794,8 +4794,7 @@ static int nl80211_start_ap(struct sk_buff *skb, struct genl_info *info)
err = nl80211_parse_he_obss_pd(
info->attrs[NL80211_ATTR_HE_OBSS_PD],
&params.he_obss_pd);
- if (err)
- return err;
+ goto out;
}

nl80211_calculate_ap_params(&params);
@@ -4817,6 +4816,7 @@ static int nl80211_start_ap(struct sk_buff *skb, struct genl_info *info)
}
wdev_unlock(wdev);

+out:
kfree(params.acl);

return err;
--
2.20.1

2020-03-05 17:16:19

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 31/58] net: ll_temac: Fix race condition causing TX hang

From: Esben Haabendal <[email protected]>

[ Upstream commit 84823ff80f7403752b59e00bb198724100dc611c ]

It is possible that the interrupt handler fires and frees up space in
the TX ring in between checking for sufficient TX ring space and
stopping the TX queue in temac_start_xmit. If this happens, the
queue wake from the interrupt handler will occur before the queue is
stopped, causing a lost wakeup and the adapter's transmit hanging.

To avoid this, after stopping the queue, check again whether there is
sufficient space in the TX ring. If so, wake up the queue again.

This is a port of the similar fix in axienet driver,
commit 7de44285c1f6 ("net: axienet: Fix race condition causing TX hang").

Fixes: 23ecc4bde21f ("net: ll_temac: fix checksum offload logic")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/xilinx/ll_temac_main.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 21c1b4322ea78..fd578568b3bff 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -788,6 +788,9 @@ static void temac_start_xmit_done(struct net_device *ndev)
stat = be32_to_cpu(cur_p->app0);
}

+ /* Matches barrier in temac_start_xmit */
+ smp_mb();
+
netif_wake_queue(ndev);
}

@@ -830,9 +833,19 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
cur_p = &lp->tx_bd_v[lp->tx_bd_tail];

if (temac_check_tx_bd_space(lp, num_frag + 1)) {
- if (!netif_queue_stopped(ndev))
- netif_stop_queue(ndev);
- return NETDEV_TX_BUSY;
+ if (netif_queue_stopped(ndev))
+ return NETDEV_TX_BUSY;
+
+ netif_stop_queue(ndev);
+
+ /* Matches barrier in temac_start_xmit_done */
+ smp_mb();
+
+ /* Space might have just been freed - check again */
+ if (temac_check_tx_bd_space(lp, num_frag))
+ return NETDEV_TX_BUSY;
+
+ netif_wake_queue(ndev);
}

cur_p->app0 = 0;
--
2.20.1

2020-03-05 17:16:21

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 33/58] net: ll_temac: Fix RX buffer descriptor handling on GFP_ATOMIC pressure

From: Esben Haabendal <[email protected]>

[ Upstream commit 770d9c67974c4c71af4beb786dc43162ad2a15ba ]

Failures caused by GFP_ATOMIC memory pressure have been observed, and
due to the missing error handling, results in kernel crash such as

[1876998.350133] kernel BUG at mm/slub.c:3952!
[1876998.350141] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[1876998.350147] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.3.0-scnxt #1
[1876998.350150] Hardware name: N/A N/A/COMe-bIP2, BIOS CCR2R920 03/01/2017
[1876998.350160] RIP: 0010:kfree+0x1ca/0x220
[1876998.350164] Code: 85 db 74 49 48 8b 95 68 01 00 00 48 31 c2 48 89 10 e9 d7 fe ff ff 49 8b 04 24 a9 00 00 01 00 75 0b 49 8b 44 24 08 a8 01 75 02 <0f> 0b 49 8b 04 24 31 f6 a9 00 00 01 00 74 06 41 0f b6 74 24
5b
[1876998.350172] RSP: 0018:ffffc900000f0df0 EFLAGS: 00010246
[1876998.350177] RAX: ffffea00027f0708 RBX: ffff888008d78000 RCX: 0000000000391372
[1876998.350181] RDX: 0000000000000000 RSI: ffffe8ffffd01400 RDI: ffff888008d78000
[1876998.350185] RBP: ffff8881185a5d00 R08: ffffc90000087dd8 R09: 000000000000280a
[1876998.350189] R10: 0000000000000002 R11: 0000000000000000 R12: ffffea0000235e00
[1876998.350193] R13: ffff8881185438a0 R14: 0000000000000000 R15: ffff888118543870
[1876998.350198] FS: 0000000000000000(0000) GS:ffff88811f300000(0000) knlGS:0000000000000000
[1876998.350203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
s#1 Part1
[1876998.350206] CR2: 00007f8dac7b09f0 CR3: 000000011e20a006 CR4: 00000000001606e0
[1876998.350210] Call Trace:
[1876998.350215] <IRQ>
[1876998.350224] ? __netif_receive_skb_core+0x70a/0x920
[1876998.350229] kfree_skb+0x32/0xb0
[1876998.350234] __netif_receive_skb_core+0x70a/0x920
[1876998.350240] __netif_receive_skb_one_core+0x36/0x80
[1876998.350245] process_backlog+0x8b/0x150
[1876998.350250] net_rx_action+0xf7/0x340
[1876998.350255] __do_softirq+0x10f/0x353
[1876998.350262] irq_exit+0xb2/0xc0
[1876998.350265] do_IRQ+0x77/0xd0
[1876998.350271] common_interrupt+0xf/0xf
[1876998.350274] </IRQ>

In order to handle such failures more graceful, this change splits the
receive loop into one for consuming the received buffers, and one for
allocating new buffers.

When GFP_ATOMIC allocations fail, the receive will continue with the
buffers that is still there, and with the expectation that the allocations
will succeed in a later call to receive.

Fixes: 92744989533c ("net: add Xilinx ll_temac device driver")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/xilinx/ll_temac.h | 1 +
drivers/net/ethernet/xilinx/ll_temac_main.c | 112 ++++++++++++++------
2 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/ll_temac.h b/drivers/net/ethernet/xilinx/ll_temac.h
index 276292bca334d..99fe059e5c7f3 100644
--- a/drivers/net/ethernet/xilinx/ll_temac.h
+++ b/drivers/net/ethernet/xilinx/ll_temac.h
@@ -375,6 +375,7 @@ struct temac_local {
int tx_bd_next;
int tx_bd_tail;
int rx_bd_ci;
+ int rx_bd_tail;

/* DMA channel control setup */
u32 tx_chnl_ctrl;
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index fd4231493449b..2e3f59dae586e 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -389,12 +389,13 @@ static int temac_dma_bd_init(struct net_device *ndev)
lp->tx_bd_next = 0;
lp->tx_bd_tail = 0;
lp->rx_bd_ci = 0;
+ lp->rx_bd_tail = RX_BD_NUM - 1;

/* Enable RX DMA transfers */
wmb();
lp->dma_out(lp, RX_CURDESC_PTR, lp->rx_bd_p);
lp->dma_out(lp, RX_TAILDESC_PTR,
- lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * (RX_BD_NUM - 1)));
+ lp->rx_bd_p + (sizeof(*lp->rx_bd_v) * lp->rx_bd_tail));

/* Prepare for TX DMA transfer */
lp->dma_out(lp, TX_CURDESC_PTR, lp->tx_bd_p);
@@ -923,27 +924,41 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
static void ll_temac_recv(struct net_device *ndev)
{
struct temac_local *lp = netdev_priv(ndev);
- struct sk_buff *skb, *new_skb;
- unsigned int bdstat;
- struct cdmac_bd *cur_p;
- dma_addr_t tail_p, skb_dma_addr;
- int length;
unsigned long flags;
+ int rx_bd;
+ bool update_tail = false;

spin_lock_irqsave(&lp->rx_lock, flags);

- tail_p = lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_ci;
- cur_p = &lp->rx_bd_v[lp->rx_bd_ci];
-
- bdstat = be32_to_cpu(cur_p->app0);
- while ((bdstat & STS_CTRL_APP0_CMPLT)) {
+ /* Process all received buffers, passing them on network
+ * stack. After this, the buffer descriptors will be in an
+ * un-allocated stage, where no skb is allocated for it, and
+ * they are therefore not available for TEMAC/DMA.
+ */
+ do {
+ struct cdmac_bd *bd = &lp->rx_bd_v[lp->rx_bd_ci];
+ struct sk_buff *skb = lp->rx_skb[lp->rx_bd_ci];
+ unsigned int bdstat = be32_to_cpu(bd->app0);
+ int length;
+
+ /* While this should not normally happen, we can end
+ * here when GFP_ATOMIC allocations fail, and we
+ * therefore have un-allocated buffers.
+ */
+ if (!skb)
+ break;

- skb = lp->rx_skb[lp->rx_bd_ci];
- length = be32_to_cpu(cur_p->app4) & 0x3FFF;
+ /* Loop over all completed buffer descriptors */
+ if (!(bdstat & STS_CTRL_APP0_CMPLT))
+ break;

- dma_unmap_single(ndev->dev.parent, be32_to_cpu(cur_p->phys),
+ dma_unmap_single(ndev->dev.parent, be32_to_cpu(bd->phys),
XTE_MAX_JUMBO_FRAME_SIZE, DMA_FROM_DEVICE);
+ /* The buffer is not valid for DMA anymore */
+ bd->phys = 0;
+ bd->len = 0;

+ length = be32_to_cpu(bd->app4) & 0x3FFF;
skb_put(skb, length);
skb->protocol = eth_type_trans(skb, ndev);
skb_checksum_none_assert(skb);
@@ -958,39 +973,74 @@ static void ll_temac_recv(struct net_device *ndev)
* (back) for proper IP checksum byte order
* (be16).
*/
- skb->csum = htons(be32_to_cpu(cur_p->app3) & 0xFFFF);
+ skb->csum = htons(be32_to_cpu(bd->app3) & 0xFFFF);
skb->ip_summed = CHECKSUM_COMPLETE;
}

if (!skb_defer_rx_timestamp(skb))
netif_rx(skb);
+ /* The skb buffer is now owned by network stack above */
+ lp->rx_skb[lp->rx_bd_ci] = NULL;

ndev->stats.rx_packets++;
ndev->stats.rx_bytes += length;

- new_skb = netdev_alloc_skb_ip_align(ndev,
- XTE_MAX_JUMBO_FRAME_SIZE);
- if (!new_skb) {
- spin_unlock_irqrestore(&lp->rx_lock, flags);
- return;
+ rx_bd = lp->rx_bd_ci;
+ if (++lp->rx_bd_ci >= RX_BD_NUM)
+ lp->rx_bd_ci = 0;
+ } while (rx_bd != lp->rx_bd_tail);
+
+ /* Allocate new buffers for those buffer descriptors that were
+ * passed to network stack. Note that GFP_ATOMIC allocations
+ * can fail (e.g. when a larger burst of GFP_ATOMIC
+ * allocations occurs), so while we try to allocate all
+ * buffers in the same interrupt where they were processed, we
+ * continue with what we could get in case of allocation
+ * failure. Allocation of remaining buffers will be retried
+ * in following calls.
+ */
+ while (1) {
+ struct sk_buff *skb;
+ struct cdmac_bd *bd;
+ dma_addr_t skb_dma_addr;
+
+ rx_bd = lp->rx_bd_tail + 1;
+ if (rx_bd >= RX_BD_NUM)
+ rx_bd = 0;
+ bd = &lp->rx_bd_v[rx_bd];
+
+ if (bd->phys)
+ break; /* All skb's allocated */
+
+ skb = netdev_alloc_skb_ip_align(ndev, XTE_MAX_JUMBO_FRAME_SIZE);
+ if (!skb) {
+ dev_warn(&ndev->dev, "skb alloc failed\n");
+ break;
}

- cur_p->app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND);
- skb_dma_addr = dma_map_single(ndev->dev.parent, new_skb->data,
+ skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
XTE_MAX_JUMBO_FRAME_SIZE,
DMA_FROM_DEVICE);
- cur_p->phys = cpu_to_be32(skb_dma_addr);
- cur_p->len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE);
- lp->rx_skb[lp->rx_bd_ci] = new_skb;
+ if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent,
+ skb_dma_addr))) {
+ dev_kfree_skb_any(skb);
+ break;
+ }

- lp->rx_bd_ci++;
- if (lp->rx_bd_ci >= RX_BD_NUM)
- lp->rx_bd_ci = 0;
+ bd->phys = cpu_to_be32(skb_dma_addr);
+ bd->len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE);
+ bd->app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND);
+ lp->rx_skb[rx_bd] = skb;
+
+ lp->rx_bd_tail = rx_bd;
+ update_tail = true;
+ }

- cur_p = &lp->rx_bd_v[lp->rx_bd_ci];
- bdstat = be32_to_cpu(cur_p->app0);
+ /* Move tail pointer when buffers have been allocated */
+ if (update_tail) {
+ lp->dma_out(lp, RX_TAILDESC_PTR,
+ lp->rx_bd_p + sizeof(*lp->rx_bd_v) * lp->rx_bd_tail);
}
- lp->dma_out(lp, RX_TAILDESC_PTR, tail_p);

spin_unlock_irqrestore(&lp->rx_lock, flags);
}
--
2.20.1

2020-03-05 17:16:52

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 09/58] HID: i2c-hid: add Trekstor Surfbook E11B to descriptor override

From: Kai-Heng Feng <[email protected]>

[ Upstream commit be0aba826c4a6ba5929def1962a90d6127871969 ]

The Surfbook E11B uses the SIPODEV SP1064 touchpad, which does not supply
descriptors, so it has to be added to the override list.

BugLink: https://bugs.launchpad.net/bugs/1858299
Signed-off-by: Kai-Heng Feng <[email protected]>
Reviewed-by: Hans de Goede <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
index d31ea82b84c17..a66f08041a1aa 100644
--- a/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
+++ b/drivers/hid/i2c-hid/i2c-hid-dmi-quirks.c
@@ -341,6 +341,14 @@ static const struct dmi_system_id i2c_hid_dmi_desc_override_table[] = {
},
.driver_data = (void *)&sipodev_desc
},
+ {
+ .ident = "Trekstor SURFBOOK E11B",
+ .matches = {
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "TREKSTOR"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "SURFBOOK E11B"),
+ },
+ .driver_data = (void *)&sipodev_desc
+ },
{
.ident = "Direkt-Tek DTLAPY116-2",
.matches = {
--
2.20.1

2020-03-05 17:17:00

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 05/58] HID: hiddev: Fix race in in hiddev_disconnect()

From: "[email protected]" <[email protected]>

[ Upstream commit 5c02c447eaeda29d3da121a2e17b97ccaf579b51 ]

Syzbot reports that "hiddev" is used after it's free in hiddev_disconnect().
The hiddev_disconnect() function sets "hiddev->exist = 0;" so
hiddev_release() can free it as soon as we drop the "existancelock"
lock. This patch moves the mutex_unlock(&hiddev->existancelock) until
after we have finished using it.

Reported-by: [email protected]
Fixes: 7f77897ef2b6 ("HID: hiddev: fix potential use-after-free")
Suggested-by: Alan Stern <[email protected]>
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/usbhid/hiddev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index c879b214a4797..35b1fa6d962ec 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -941,9 +941,9 @@ void hiddev_disconnect(struct hid_device *hid)
hiddev->exist = 0;

if (hiddev->open) {
- mutex_unlock(&hiddev->existancelock);
hid_hw_close(hiddev->hid);
wake_up_interruptible(&hiddev->wait);
+ mutex_unlock(&hiddev->existancelock);
} else {
mutex_unlock(&hiddev->existancelock);
kfree(hiddev);
--
2.20.1

2020-03-05 17:17:02

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 06/58] HID: alps: Fix an error handling path in 'alps_input_configured()'

From: Christophe JAILLET <[email protected]>

[ Upstream commit 8d2e77b39b8fecb794e19cd006a12f90b14dd077 ]

They are issues:
- if 'input_allocate_device()' fails and return NULL, there is no need
to free anything and 'input_free_device()' call is a no-op. It can
be axed.
- 'ret' is known to be 0 at this point, so we must set it to a
meaningful value before returning

Fixes: 2562756dde55 ("HID: add Alps I2C HID Touchpad-Stick support")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/hid-alps.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hid/hid-alps.c b/drivers/hid/hid-alps.c
index ae79a7c667372..fa704153cb00d 100644
--- a/drivers/hid/hid-alps.c
+++ b/drivers/hid/hid-alps.c
@@ -730,7 +730,7 @@ static int alps_input_configured(struct hid_device *hdev, struct hid_input *hi)
if (data->has_sp) {
input2 = input_allocate_device();
if (!input2) {
- input_free_device(input2);
+ ret = -ENOMEM;
goto exit;
}

--
2.20.1

2020-03-05 17:17:08

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 12/58] mips: vdso: add build time check that no 'jalr t9' calls left

From: Victor Kamensky <[email protected]>

[ Upstream commit 976c23af3ee5bd3447a7bfb6c356ceb4acf264a6 ]

vdso shared object cannot have GOT based PIC 'jalr t9' calls
because nobody set GOT table in vdso. Contributing into vdso
.o files are compiled in PIC mode and as result for internal
static functions calls compiler will generate 'jalr t9'
instructions. Those are supposed to be converted into PC
relative 'bal' calls by linker when relocation are processed.

Mips global GOT entries do have dynamic relocations and they
will be caught by cmd_vdso_check Makefile rule. Static PIC
calls go through mips local GOT entries that do not have
dynamic relocations. For those 'jalr t9' calls could be present
but without dynamic relocations and they need to be converted
to 'bal' calls by linker.

Add additional build time check to make sure that no 'jalr t9'
slip through because of some toolchain misconfiguration that
prevents 'jalr t9' to 'bal' conversion.

Signed-off-by: Victor Kamensky <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/vdso/Makefile | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index b6b1eb638fb14..08c835d48520b 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -92,12 +92,18 @@ CFLAGS_REMOVE_vdso.o = -pg
GCOV_PROFILE := n
UBSAN_SANITIZE := n

+# Check that we don't have PIC 'jalr t9' calls left
+quiet_cmd_vdso_mips_check = VDSOCHK $@
+ cmd_vdso_mips_check = if $(OBJDUMP) --disassemble $@ | egrep -h "jalr.*t9" > /dev/null; \
+ then (echo >&2 "$@: PIC 'jalr t9' calls are not supported"; \
+ rm -f $@; /bin/false); fi
+
#
# Shared build commands.
#

quiet_cmd_vdsold_and_vdso_check = LD $@
- cmd_vdsold_and_vdso_check = $(cmd_vdsold); $(cmd_vdso_check)
+ cmd_vdsold_and_vdso_check = $(cmd_vdsold); $(cmd_vdso_check); $(cmd_vdso_mips_check)

quiet_cmd_vdsold = VDSO $@
cmd_vdsold = $(CC) $(c_flags) $(VDSO_LDFLAGS) \
--
2.20.1

2020-03-05 17:17:32

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 38/58] kbuild: add dtbs_check to PHONY

From: Masahiro Yamada <[email protected]>

[ Upstream commit 964a596db8db8c77c9903dd05655696696e6b3ad ]

The dtbs_check should be a phony target, but currently it is not
specified so.

'make dtbs_check' works even if a file named 'dtbs_check' exists
because it depends on another phony target, scripts_dtc, but we
should not rely on it.

Add dtbs_check to PHONY.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index af5e90075514a..63094ce5e5744 100644
--- a/Makefile
+++ b/Makefile
@@ -1242,7 +1242,7 @@ ifneq ($(dtstree),)
%.dtb: include/config/kernel.release scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@

-PHONY += dtbs dtbs_install dt_binding_check
+PHONY += dtbs dtbs_install dtbs_check dt_binding_check
dtbs dtbs_check: include/config/kernel.release scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree)

--
2.20.1

2020-03-05 17:17:37

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 24/58] netfilter: ipset: Fix "INFO: rcu detected stall in hash_xxx" reports

From: Jozsef Kadlecsik <[email protected]>

[ Upstream commit f66ee0410b1c3481ee75e5db9b34547b4d582465 ]

In the case of huge hash:* types of sets, due to the single spinlock of
a set the processing of the whole set under spinlock protection could take
too long.

There were four places where the whole hash table of the set was processed
from bucket to bucket under holding the spinlock:

- During resizing a set, the original set was locked to exclude kernel side
add/del element operations (userspace add/del is excluded by the
nfnetlink mutex). The original set is actually just read during the
resize, so the spinlocking is replaced with rcu locking of regions.
However, thus there can be parallel kernel side add/del of entries.
In order not to loose those operations a backlog is added and replayed
after the successful resize.
- Garbage collection of timed out entries was also protected by the spinlock.
In order not to lock too long, region locking is introduced and a single
region is processed in one gc go. Also, the simple timer based gc running
is replaced with a workqueue based solution. The internal book-keeping
(number of elements, size of extensions) is moved to region level due to
the region locking.
- Adding elements: when the max number of the elements is reached, the gc
was called to evict the timed out entries. The new approach is that the gc
is called just for the matching region, assuming that if the region
(proportionally) seems to be full, then the whole set does. We could scan
the other regions to check every entry under rcu locking, but for huge
sets it'd mean a slowdown at adding elements.
- Listing the set header data: when the set was defined with timeout
support, the garbage collector was called to clean up timed out entries
to get the correct element numbers and set size values. Now the set is
scanned to check non-timed out entries, without actually calling the gc
for the whole set.

Thanks to Florian Westphal for helping me to solve the SOFTIRQ-safe ->
SOFTIRQ-unsafe lock order issues during working on the patch.

Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/netfilter/ipset/ip_set.h | 11 +-
net/netfilter/ipset/ip_set_core.c | 34 +-
net/netfilter/ipset/ip_set_hash_gen.h | 633 +++++++++++++++++--------
3 files changed, 472 insertions(+), 206 deletions(-)

diff --git a/include/linux/netfilter/ipset/ip_set.h b/include/linux/netfilter/ipset/ip_set.h
index 77336f4c4b1ce..32658749e9db0 100644
--- a/include/linux/netfilter/ipset/ip_set.h
+++ b/include/linux/netfilter/ipset/ip_set.h
@@ -121,6 +121,7 @@ struct ip_set_ext {
u32 timeout;
u8 packets_op;
u8 bytes_op;
+ bool target;
};

struct ip_set;
@@ -187,6 +188,14 @@ struct ip_set_type_variant {
/* Return true if "b" set is the same as "a"
* according to the create set parameters */
bool (*same_set)(const struct ip_set *a, const struct ip_set *b);
+ /* Region-locking is used */
+ bool region_lock;
+};
+
+struct ip_set_region {
+ spinlock_t lock; /* Region lock */
+ size_t ext_size; /* Size of the dynamic extensions */
+ u32 elements; /* Number of elements vs timeout */
};

/* The core set type structure */
@@ -681,7 +690,7 @@ ip_set_init_skbinfo(struct ip_set_skbinfo *skbinfo,
}

#define IP_SET_INIT_KEXT(skb, opt, set) \
- { .bytes = (skb)->len, .packets = 1, \
+ { .bytes = (skb)->len, .packets = 1, .target = true,\
.timeout = ip_set_adt_opt_timeout(opt, set) }

#define IP_SET_INIT_UEXT(set) \
diff --git a/net/netfilter/ipset/ip_set_core.c b/net/netfilter/ipset/ip_set_core.c
index a9df9dac57b2d..75da200aa5d87 100644
--- a/net/netfilter/ipset/ip_set_core.c
+++ b/net/netfilter/ipset/ip_set_core.c
@@ -557,6 +557,20 @@ ip_set_rcu_get(struct net *net, ip_set_id_t index)
return set;
}

+static inline void
+ip_set_lock(struct ip_set *set)
+{
+ if (!set->variant->region_lock)
+ spin_lock_bh(&set->lock);
+}
+
+static inline void
+ip_set_unlock(struct ip_set *set)
+{
+ if (!set->variant->region_lock)
+ spin_unlock_bh(&set->lock);
+}
+
int
ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
const struct xt_action_param *par, struct ip_set_adt_opt *opt)
@@ -578,9 +592,9 @@ ip_set_test(ip_set_id_t index, const struct sk_buff *skb,
if (ret == -EAGAIN) {
/* Type requests element to be completed */
pr_debug("element must be completed, ADD is triggered\n");
- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
set->variant->kadt(set, skb, par, IPSET_ADD, opt);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);
ret = 1;
} else {
/* --return-nomatch: invert matched element */
@@ -609,9 +623,9 @@ ip_set_add(ip_set_id_t index, const struct sk_buff *skb,
!(opt->family == set->family || set->family == NFPROTO_UNSPEC))
return -IPSET_ERR_TYPE_MISMATCH;

- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
ret = set->variant->kadt(set, skb, par, IPSET_ADD, opt);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);

return ret;
}
@@ -631,9 +645,9 @@ ip_set_del(ip_set_id_t index, const struct sk_buff *skb,
!(opt->family == set->family || set->family == NFPROTO_UNSPEC))
return -IPSET_ERR_TYPE_MISMATCH;

- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
ret = set->variant->kadt(set, skb, par, IPSET_DEL, opt);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);

return ret;
}
@@ -1098,9 +1112,9 @@ ip_set_flush_set(struct ip_set *set)
{
pr_debug("set: %s\n", set->name);

- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
set->variant->flush(set);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);
}

static int ip_set_flush(struct net *net, struct sock *ctnl, struct sk_buff *skb,
@@ -1523,9 +1537,9 @@ call_ad(struct sock *ctnl, struct sk_buff *skb, struct ip_set *set,
bool eexist = flags & IPSET_FLAG_EXIST, retried = false;

do {
- spin_lock_bh(&set->lock);
+ ip_set_lock(set);
ret = set->variant->uadt(set, tb, adt, &lineno, flags, retried);
- spin_unlock_bh(&set->lock);
+ ip_set_unlock(set);
retried = true;
} while (ret == -EAGAIN &&
set->variant->resize &&
diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index d098d87bc3317..2ac28c5c7e957 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -7,13 +7,21 @@
#include <linux/rcupdate.h>
#include <linux/jhash.h>
#include <linux/types.h>
+#include <linux/netfilter/nfnetlink.h>
#include <linux/netfilter/ipset/ip_set.h>

-#define __ipset_dereference_protected(p, c) rcu_dereference_protected(p, c)
-#define ipset_dereference_protected(p, set) \
- __ipset_dereference_protected(p, lockdep_is_held(&(set)->lock))
-
-#define rcu_dereference_bh_nfnl(p) rcu_dereference_bh_check(p, 1)
+#define __ipset_dereference(p) \
+ rcu_dereference_protected(p, 1)
+#define ipset_dereference_nfnl(p) \
+ rcu_dereference_protected(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))
+#define ipset_dereference_set(p, set) \
+ rcu_dereference_protected(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET) || \
+ lockdep_is_held(&(set)->lock))
+#define ipset_dereference_bh_nfnl(p) \
+ rcu_dereference_bh_check(p, \
+ lockdep_nfnl_is_held(NFNL_SUBSYS_IPSET))

/* Hashing which uses arrays to resolve clashing. The hash table is resized
* (doubled) when searching becomes too long.
@@ -72,11 +80,35 @@ struct hbucket {
__aligned(__alignof__(u64));
};

+/* Region size for locking == 2^HTABLE_REGION_BITS */
+#define HTABLE_REGION_BITS 10
+#define ahash_numof_locks(htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? 1 \
+ : jhash_size((htable_bits) - HTABLE_REGION_BITS))
+#define ahash_sizeof_regions(htable_bits) \
+ (ahash_numof_locks(htable_bits) * sizeof(struct ip_set_region))
+#define ahash_region(n, htable_bits) \
+ ((n) % ahash_numof_locks(htable_bits))
+#define ahash_bucket_start(h, htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? 0 \
+ : (h) * jhash_size(HTABLE_REGION_BITS))
+#define ahash_bucket_end(h, htable_bits) \
+ ((htable_bits) < HTABLE_REGION_BITS ? jhash_size(htable_bits) \
+ : ((h) + 1) * jhash_size(HTABLE_REGION_BITS))
+
+struct htable_gc {
+ struct delayed_work dwork;
+ struct ip_set *set; /* Set the gc belongs to */
+ u32 region; /* Last gc run position */
+};
+
/* The hash table: the table size stored here in order to make resizing easy */
struct htable {
atomic_t ref; /* References for resizing */
- atomic_t uref; /* References for dumping */
+ atomic_t uref; /* References for dumping and gc */
u8 htable_bits; /* size of hash table == 2^htable_bits */
+ u32 maxelem; /* Maxelem per region */
+ struct ip_set_region *hregion; /* Region locks and ext sizes */
struct hbucket __rcu *bucket[0]; /* hashtable buckets */
};

@@ -162,6 +194,10 @@ htable_bits(u32 hashsize)
#define NLEN 0
#endif /* IP_SET_HASH_WITH_NETS */

+#define SET_ELEM_EXPIRED(set, d) \
+ (SET_WITH_TIMEOUT(set) && \
+ ip_set_timeout_expired(ext_timeout(d, set)))
+
#endif /* _IP_SET_HASH_GEN_H */

#ifndef MTYPE
@@ -205,10 +241,12 @@ htable_bits(u32 hashsize)
#undef mtype_test_cidrs
#undef mtype_test
#undef mtype_uref
-#undef mtype_expire
#undef mtype_resize
+#undef mtype_ext_size
+#undef mtype_resize_ad
#undef mtype_head
#undef mtype_list
+#undef mtype_gc_do
#undef mtype_gc
#undef mtype_gc_init
#undef mtype_variant
@@ -247,10 +285,12 @@ htable_bits(u32 hashsize)
#define mtype_test_cidrs IPSET_TOKEN(MTYPE, _test_cidrs)
#define mtype_test IPSET_TOKEN(MTYPE, _test)
#define mtype_uref IPSET_TOKEN(MTYPE, _uref)
-#define mtype_expire IPSET_TOKEN(MTYPE, _expire)
#define mtype_resize IPSET_TOKEN(MTYPE, _resize)
+#define mtype_ext_size IPSET_TOKEN(MTYPE, _ext_size)
+#define mtype_resize_ad IPSET_TOKEN(MTYPE, _resize_ad)
#define mtype_head IPSET_TOKEN(MTYPE, _head)
#define mtype_list IPSET_TOKEN(MTYPE, _list)
+#define mtype_gc_do IPSET_TOKEN(MTYPE, _gc_do)
#define mtype_gc IPSET_TOKEN(MTYPE, _gc)
#define mtype_gc_init IPSET_TOKEN(MTYPE, _gc_init)
#define mtype_variant IPSET_TOKEN(MTYPE, _variant)
@@ -275,8 +315,7 @@ htable_bits(u32 hashsize)
/* The generic hash structure */
struct htype {
struct htable __rcu *table; /* the hash table */
- struct timer_list gc; /* garbage collection when timeout enabled */
- struct ip_set *set; /* attached to this ip_set */
+ struct htable_gc gc; /* gc workqueue */
u32 maxelem; /* max elements in the hash */
u32 initval; /* random jhash init value */
#ifdef IP_SET_HASH_WITH_MARKMASK
@@ -288,21 +327,33 @@ struct htype {
#ifdef IP_SET_HASH_WITH_NETMASK
u8 netmask; /* netmask value for subnets to store */
#endif
+ struct list_head ad; /* Resize add|del backlist */
struct mtype_elem next; /* temporary storage for uadd */
#ifdef IP_SET_HASH_WITH_NETS
struct net_prefixes nets[NLEN]; /* book-keeping of prefixes */
#endif
};

+/* ADD|DEL entries saved during resize */
+struct mtype_resize_ad {
+ struct list_head list;
+ enum ipset_adt ad; /* ADD|DEL element */
+ struct mtype_elem d; /* Element value */
+ struct ip_set_ext ext; /* Extensions for ADD */
+ struct ip_set_ext mext; /* Target extensions for ADD */
+ u32 flags; /* Flags for ADD */
+};
+
#ifdef IP_SET_HASH_WITH_NETS
/* Network cidr size book keeping when the hash stores different
* sized networks. cidr == real cidr + 1 to support /0.
*/
static void
-mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_add_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
{
int i, j;

+ spin_lock_bh(&set->lock);
/* Add in increasing prefix order, so larger cidr first */
for (i = 0, j = -1; i < NLEN && h->nets[i].cidr[n]; i++) {
if (j != -1) {
@@ -311,7 +362,7 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
j = i;
} else if (h->nets[i].cidr[n] == cidr) {
h->nets[CIDR_POS(cidr)].nets[n]++;
- return;
+ goto unlock;
}
}
if (j != -1) {
@@ -320,24 +371,29 @@ mtype_add_cidr(struct htype *h, u8 cidr, u8 n)
}
h->nets[i].cidr[n] = cidr;
h->nets[CIDR_POS(cidr)].nets[n] = 1;
+unlock:
+ spin_unlock_bh(&set->lock);
}

static void
-mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
+mtype_del_cidr(struct ip_set *set, struct htype *h, u8 cidr, u8 n)
{
u8 i, j, net_end = NLEN - 1;

+ spin_lock_bh(&set->lock);
for (i = 0; i < NLEN; i++) {
if (h->nets[i].cidr[n] != cidr)
continue;
h->nets[CIDR_POS(cidr)].nets[n]--;
if (h->nets[CIDR_POS(cidr)].nets[n] > 0)
- return;
+ goto unlock;
for (j = i; j < net_end && h->nets[j].cidr[n]; j++)
h->nets[j].cidr[n] = h->nets[j + 1].cidr[n];
h->nets[j].cidr[n] = 0;
- return;
+ goto unlock;
}
+unlock:
+ spin_unlock_bh(&set->lock);
}
#endif

@@ -345,7 +401,7 @@ mtype_del_cidr(struct htype *h, u8 cidr, u8 n)
static size_t
mtype_ahash_memsize(const struct htype *h, const struct htable *t)
{
- return sizeof(*h) + sizeof(*t);
+ return sizeof(*h) + sizeof(*t) + ahash_sizeof_regions(t->htable_bits);
}

/* Get the ith element from the array block n */
@@ -369,24 +425,29 @@ mtype_flush(struct ip_set *set)
struct htype *h = set->data;
struct htable *t;
struct hbucket *n;
- u32 i;
-
- t = ipset_dereference_protected(h->table, set);
- for (i = 0; i < jhash_size(t->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(t, i), 1);
- if (!n)
- continue;
- if (set->extensions & IPSET_EXT_DESTROY)
- mtype_ext_cleanup(set, n);
- /* FIXME: use slab cache */
- rcu_assign_pointer(hbucket(t, i), NULL);
- kfree_rcu(n, rcu);
+ u32 r, i;
+
+ t = ipset_dereference_nfnl(h->table);
+ for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+ spin_lock_bh(&t->hregion[r].lock);
+ for (i = ahash_bucket_start(r, t->htable_bits);
+ i < ahash_bucket_end(r, t->htable_bits); i++) {
+ n = __ipset_dereference(hbucket(t, i));
+ if (!n)
+ continue;
+ if (set->extensions & IPSET_EXT_DESTROY)
+ mtype_ext_cleanup(set, n);
+ /* FIXME: use slab cache */
+ rcu_assign_pointer(hbucket(t, i), NULL);
+ kfree_rcu(n, rcu);
+ }
+ t->hregion[r].ext_size = 0;
+ t->hregion[r].elements = 0;
+ spin_unlock_bh(&t->hregion[r].lock);
}
#ifdef IP_SET_HASH_WITH_NETS
memset(h->nets, 0, sizeof(h->nets));
#endif
- set->elements = 0;
- set->ext_size = 0;
}

/* Destroy the hashtable part of the set */
@@ -397,7 +458,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
u32 i;

for (i = 0; i < jhash_size(t->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(t, i), 1);
+ n = __ipset_dereference(hbucket(t, i));
if (!n)
continue;
if (set->extensions & IPSET_EXT_DESTROY && ext_destroy)
@@ -406,6 +467,7 @@ mtype_ahash_destroy(struct ip_set *set, struct htable *t, bool ext_destroy)
kfree(n);
}

+ ip_set_free(t->hregion);
ip_set_free(t);
}

@@ -414,28 +476,21 @@ static void
mtype_destroy(struct ip_set *set)
{
struct htype *h = set->data;
+ struct list_head *l, *lt;

if (SET_WITH_TIMEOUT(set))
- del_timer_sync(&h->gc);
+ cancel_delayed_work_sync(&h->gc.dwork);

- mtype_ahash_destroy(set,
- __ipset_dereference_protected(h->table, 1), true);
+ mtype_ahash_destroy(set, ipset_dereference_nfnl(h->table), true);
+ list_for_each_safe(l, lt, &h->ad) {
+ list_del(l);
+ kfree(l);
+ }
kfree(h);

set->data = NULL;
}

-static void
-mtype_gc_init(struct ip_set *set, void (*gc)(struct timer_list *t))
-{
- struct htype *h = set->data;
-
- timer_setup(&h->gc, gc, 0);
- mod_timer(&h->gc, jiffies + IPSET_GC_PERIOD(set->timeout) * HZ);
- pr_debug("gc initialized, run in every %u\n",
- IPSET_GC_PERIOD(set->timeout));
-}
-
static bool
mtype_same_set(const struct ip_set *a, const struct ip_set *b)
{
@@ -454,11 +509,9 @@ mtype_same_set(const struct ip_set *a, const struct ip_set *b)
a->extensions == b->extensions;
}

-/* Delete expired elements from the hashtable */
static void
-mtype_expire(struct ip_set *set, struct htype *h)
+mtype_gc_do(struct ip_set *set, struct htype *h, struct htable *t, u32 r)
{
- struct htable *t;
struct hbucket *n, *tmp;
struct mtype_elem *data;
u32 i, j, d;
@@ -466,10 +519,12 @@ mtype_expire(struct ip_set *set, struct htype *h)
#ifdef IP_SET_HASH_WITH_NETS
u8 k;
#endif
+ u8 htable_bits = t->htable_bits;

- t = ipset_dereference_protected(h->table, set);
- for (i = 0; i < jhash_size(t->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(t, i), 1);
+ spin_lock_bh(&t->hregion[r].lock);
+ for (i = ahash_bucket_start(r, htable_bits);
+ i < ahash_bucket_end(r, htable_bits); i++) {
+ n = __ipset_dereference(hbucket(t, i));
if (!n)
continue;
for (j = 0, d = 0; j < n->pos; j++) {
@@ -485,58 +540,100 @@ mtype_expire(struct ip_set *set, struct htype *h)
smp_mb__after_atomic();
#ifdef IP_SET_HASH_WITH_NETS
for (k = 0; k < IPSET_NET_COUNT; k++)
- mtype_del_cidr(h,
+ mtype_del_cidr(set, h,
NCIDR_PUT(DCIDR_GET(data->cidr, k)),
k);
#endif
+ t->hregion[r].elements--;
ip_set_ext_destroy(set, data);
- set->elements--;
d++;
}
if (d >= AHASH_INIT_SIZE) {
if (d >= n->size) {
+ t->hregion[r].ext_size -=
+ ext_size(n->size, dsize);
rcu_assign_pointer(hbucket(t, i), NULL);
kfree_rcu(n, rcu);
continue;
}
tmp = kzalloc(sizeof(*tmp) +
- (n->size - AHASH_INIT_SIZE) * dsize,
- GFP_ATOMIC);
+ (n->size - AHASH_INIT_SIZE) * dsize,
+ GFP_ATOMIC);
if (!tmp)
- /* Still try to delete expired elements */
+ /* Still try to delete expired elements. */
continue;
tmp->size = n->size - AHASH_INIT_SIZE;
for (j = 0, d = 0; j < n->pos; j++) {
if (!test_bit(j, n->used))
continue;
data = ahash_data(n, j, dsize);
- memcpy(tmp->value + d * dsize, data, dsize);
+ memcpy(tmp->value + d * dsize,
+ data, dsize);
set_bit(d, tmp->used);
d++;
}
tmp->pos = d;
- set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+ t->hregion[r].ext_size -=
+ ext_size(AHASH_INIT_SIZE, dsize);
rcu_assign_pointer(hbucket(t, i), tmp);
kfree_rcu(n, rcu);
}
}
+ spin_unlock_bh(&t->hregion[r].lock);
}

static void
-mtype_gc(struct timer_list *t)
+mtype_gc(struct work_struct *work)
{
- struct htype *h = from_timer(h, t, gc);
- struct ip_set *set = h->set;
+ struct htable_gc *gc;
+ struct ip_set *set;
+ struct htype *h;
+ struct htable *t;
+ u32 r, numof_locks;
+ unsigned int next_run;
+
+ gc = container_of(work, struct htable_gc, dwork.work);
+ set = gc->set;
+ h = set->data;

- pr_debug("called\n");
spin_lock_bh(&set->lock);
- mtype_expire(set, h);
+ t = ipset_dereference_set(h->table, set);
+ atomic_inc(&t->uref);
+ numof_locks = ahash_numof_locks(t->htable_bits);
+ r = gc->region++;
+ if (r >= numof_locks) {
+ r = gc->region = 0;
+ }
+ next_run = (IPSET_GC_PERIOD(set->timeout) * HZ) / numof_locks;
+ if (next_run < HZ/10)
+ next_run = HZ/10;
spin_unlock_bh(&set->lock);

- h->gc.expires = jiffies + IPSET_GC_PERIOD(set->timeout) * HZ;
- add_timer(&h->gc);
+ mtype_gc_do(set, h, t, r);
+
+ if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+ pr_debug("Table destroy after resize by expire: %p\n", t);
+ mtype_ahash_destroy(set, t, false);
+ }
+
+ queue_delayed_work(system_power_efficient_wq, &gc->dwork, next_run);
+
+}
+
+static void
+mtype_gc_init(struct htable_gc *gc)
+{
+ INIT_DEFERRABLE_WORK(&gc->dwork, mtype_gc);
+ queue_delayed_work(system_power_efficient_wq, &gc->dwork, HZ);
}

+static int
+mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+ struct ip_set_ext *mext, u32 flags);
+static int
+mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
+ struct ip_set_ext *mext, u32 flags);
+
/* Resize a hash: create a new hash table with doubling the hashsize
* and inserting the elements to it. Repeat until we succeed or
* fail due to memory pressures.
@@ -547,7 +644,7 @@ mtype_resize(struct ip_set *set, bool retried)
struct htype *h = set->data;
struct htable *t, *orig;
u8 htable_bits;
- size_t extsize, dsize = set->dsize;
+ size_t dsize = set->dsize;
#ifdef IP_SET_HASH_WITH_NETS
u8 flags;
struct mtype_elem *tmp;
@@ -555,7 +652,9 @@ mtype_resize(struct ip_set *set, bool retried)
struct mtype_elem *data;
struct mtype_elem *d;
struct hbucket *n, *m;
- u32 i, j, key;
+ struct list_head *l, *lt;
+ struct mtype_resize_ad *x;
+ u32 i, j, r, nr, key;
int ret;

#ifdef IP_SET_HASH_WITH_NETS
@@ -563,10 +662,8 @@ mtype_resize(struct ip_set *set, bool retried)
if (!tmp)
return -ENOMEM;
#endif
- rcu_read_lock_bh();
- orig = rcu_dereference_bh_nfnl(h->table);
+ orig = ipset_dereference_bh_nfnl(h->table);
htable_bits = orig->htable_bits;
- rcu_read_unlock_bh();

retry:
ret = 0;
@@ -583,88 +680,124 @@ mtype_resize(struct ip_set *set, bool retried)
ret = -ENOMEM;
goto out;
}
+ t->hregion = ip_set_alloc(ahash_sizeof_regions(htable_bits));
+ if (!t->hregion) {
+ kfree(t);
+ ret = -ENOMEM;
+ goto out;
+ }
t->htable_bits = htable_bits;
+ t->maxelem = h->maxelem / ahash_numof_locks(htable_bits);
+ for (i = 0; i < ahash_numof_locks(htable_bits); i++)
+ spin_lock_init(&t->hregion[i].lock);

- spin_lock_bh(&set->lock);
- orig = __ipset_dereference_protected(h->table, 1);
- /* There can't be another parallel resizing, but dumping is possible */
+ /* There can't be another parallel resizing,
+ * but dumping, gc, kernel side add/del are possible
+ */
+ orig = ipset_dereference_bh_nfnl(h->table);
atomic_set(&orig->ref, 1);
atomic_inc(&orig->uref);
- extsize = 0;
pr_debug("attempt to resize set %s from %u to %u, t %p\n",
set->name, orig->htable_bits, htable_bits, orig);
- for (i = 0; i < jhash_size(orig->htable_bits); i++) {
- n = __ipset_dereference_protected(hbucket(orig, i), 1);
- if (!n)
- continue;
- for (j = 0; j < n->pos; j++) {
- if (!test_bit(j, n->used))
+ for (r = 0; r < ahash_numof_locks(orig->htable_bits); r++) {
+ /* Expire may replace a hbucket with another one */
+ rcu_read_lock_bh();
+ for (i = ahash_bucket_start(r, orig->htable_bits);
+ i < ahash_bucket_end(r, orig->htable_bits); i++) {
+ n = __ipset_dereference(hbucket(orig, i));
+ if (!n)
continue;
- data = ahash_data(n, j, dsize);
+ for (j = 0; j < n->pos; j++) {
+ if (!test_bit(j, n->used))
+ continue;
+ data = ahash_data(n, j, dsize);
+ if (SET_ELEM_EXPIRED(set, data))
+ continue;
#ifdef IP_SET_HASH_WITH_NETS
- /* We have readers running parallel with us,
- * so the live data cannot be modified.
- */
- flags = 0;
- memcpy(tmp, data, dsize);
- data = tmp;
- mtype_data_reset_flags(data, &flags);
+ /* We have readers running parallel with us,
+ * so the live data cannot be modified.
+ */
+ flags = 0;
+ memcpy(tmp, data, dsize);
+ data = tmp;
+ mtype_data_reset_flags(data, &flags);
#endif
- key = HKEY(data, h->initval, htable_bits);
- m = __ipset_dereference_protected(hbucket(t, key), 1);
- if (!m) {
- m = kzalloc(sizeof(*m) +
+ key = HKEY(data, h->initval, htable_bits);
+ m = __ipset_dereference(hbucket(t, key));
+ nr = ahash_region(key, htable_bits);
+ if (!m) {
+ m = kzalloc(sizeof(*m) +
AHASH_INIT_SIZE * dsize,
GFP_ATOMIC);
- if (!m) {
- ret = -ENOMEM;
- goto cleanup;
- }
- m->size = AHASH_INIT_SIZE;
- extsize += ext_size(AHASH_INIT_SIZE, dsize);
- RCU_INIT_POINTER(hbucket(t, key), m);
- } else if (m->pos >= m->size) {
- struct hbucket *ht;
-
- if (m->size >= AHASH_MAX(h)) {
- ret = -EAGAIN;
- } else {
- ht = kzalloc(sizeof(*ht) +
+ if (!m) {
+ ret = -ENOMEM;
+ goto cleanup;
+ }
+ m->size = AHASH_INIT_SIZE;
+ t->hregion[nr].ext_size +=
+ ext_size(AHASH_INIT_SIZE,
+ dsize);
+ RCU_INIT_POINTER(hbucket(t, key), m);
+ } else if (m->pos >= m->size) {
+ struct hbucket *ht;
+
+ if (m->size >= AHASH_MAX(h)) {
+ ret = -EAGAIN;
+ } else {
+ ht = kzalloc(sizeof(*ht) +
(m->size + AHASH_INIT_SIZE)
* dsize,
GFP_ATOMIC);
- if (!ht)
- ret = -ENOMEM;
+ if (!ht)
+ ret = -ENOMEM;
+ }
+ if (ret < 0)
+ goto cleanup;
+ memcpy(ht, m, sizeof(struct hbucket) +
+ m->size * dsize);
+ ht->size = m->size + AHASH_INIT_SIZE;
+ t->hregion[nr].ext_size +=
+ ext_size(AHASH_INIT_SIZE,
+ dsize);
+ kfree(m);
+ m = ht;
+ RCU_INIT_POINTER(hbucket(t, key), ht);
}
- if (ret < 0)
- goto cleanup;
- memcpy(ht, m, sizeof(struct hbucket) +
- m->size * dsize);
- ht->size = m->size + AHASH_INIT_SIZE;
- extsize += ext_size(AHASH_INIT_SIZE, dsize);
- kfree(m);
- m = ht;
- RCU_INIT_POINTER(hbucket(t, key), ht);
- }
- d = ahash_data(m, m->pos, dsize);
- memcpy(d, data, dsize);
- set_bit(m->pos++, m->used);
+ d = ahash_data(m, m->pos, dsize);
+ memcpy(d, data, dsize);
+ set_bit(m->pos++, m->used);
+ t->hregion[nr].elements++;
#ifdef IP_SET_HASH_WITH_NETS
- mtype_data_reset_flags(d, &flags);
+ mtype_data_reset_flags(d, &flags);
#endif
+ }
}
+ rcu_read_unlock_bh();
}
- rcu_assign_pointer(h->table, t);
- set->ext_size = extsize;

- spin_unlock_bh(&set->lock);
+ /* There can't be any other writer. */
+ rcu_assign_pointer(h->table, t);

/* Give time to other readers of the set */
synchronize_rcu();

pr_debug("set %s resized from %u (%p) to %u (%p)\n", set->name,
orig->htable_bits, orig, t->htable_bits, t);
- /* If there's nobody else dumping the table, destroy it */
+ /* Add/delete elements processed by the SET target during resize.
+ * Kernel-side add cannot trigger a resize and userspace actions
+ * are serialized by the mutex.
+ */
+ list_for_each_safe(l, lt, &h->ad) {
+ x = list_entry(l, struct mtype_resize_ad, list);
+ if (x->ad == IPSET_ADD) {
+ mtype_add(set, &x->d, &x->ext, &x->mext, x->flags);
+ } else {
+ mtype_del(set, &x->d, NULL, NULL, 0);
+ }
+ list_del(l);
+ kfree(l);
+ }
+ /* If there's nobody else using the table, destroy it */
if (atomic_dec_and_test(&orig->uref)) {
pr_debug("Table destroy by resize %p\n", orig);
mtype_ahash_destroy(set, orig, false);
@@ -677,15 +810,44 @@ mtype_resize(struct ip_set *set, bool retried)
return ret;

cleanup:
+ rcu_read_unlock_bh();
atomic_set(&orig->ref, 0);
atomic_dec(&orig->uref);
- spin_unlock_bh(&set->lock);
mtype_ahash_destroy(set, t, false);
if (ret == -EAGAIN)
goto retry;
goto out;
}

+/* Get the current number of elements and ext_size in the set */
+static void
+mtype_ext_size(struct ip_set *set, u32 *elements, size_t *ext_size)
+{
+ struct htype *h = set->data;
+ const struct htable *t;
+ u32 i, j, r;
+ struct hbucket *n;
+ struct mtype_elem *data;
+
+ t = rcu_dereference_bh(h->table);
+ for (r = 0; r < ahash_numof_locks(t->htable_bits); r++) {
+ for (i = ahash_bucket_start(r, t->htable_bits);
+ i < ahash_bucket_end(r, t->htable_bits); i++) {
+ n = rcu_dereference_bh(hbucket(t, i));
+ if (!n)
+ continue;
+ for (j = 0; j < n->pos; j++) {
+ if (!test_bit(j, n->used))
+ continue;
+ data = ahash_data(n, j, set->dsize);
+ if (!SET_ELEM_EXPIRED(set, data))
+ (*elements)++;
+ }
+ }
+ *ext_size += t->hregion[r].ext_size;
+ }
+}
+
/* Add an element to a hash and update the internal counters when succeeded,
* otherwise report the proper error code.
*/
@@ -698,32 +860,49 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
const struct mtype_elem *d = value;
struct mtype_elem *data;
struct hbucket *n, *old = ERR_PTR(-ENOENT);
- int i, j = -1;
+ int i, j = -1, ret;
bool flag_exist = flags & IPSET_FLAG_EXIST;
bool deleted = false, forceadd = false, reuse = false;
- u32 key, multi = 0;
+ u32 r, key, multi = 0, elements, maxelem;

- if (set->elements >= h->maxelem) {
- if (SET_WITH_TIMEOUT(set))
- /* FIXME: when set is full, we slow down here */
- mtype_expire(set, h);
- if (set->elements >= h->maxelem && SET_WITH_FORCEADD(set))
+ rcu_read_lock_bh();
+ t = rcu_dereference_bh(h->table);
+ key = HKEY(value, h->initval, t->htable_bits);
+ r = ahash_region(key, t->htable_bits);
+ atomic_inc(&t->uref);
+ elements = t->hregion[r].elements;
+ maxelem = t->maxelem;
+ if (elements >= maxelem) {
+ u32 e;
+ if (SET_WITH_TIMEOUT(set)) {
+ rcu_read_unlock_bh();
+ mtype_gc_do(set, h, t, r);
+ rcu_read_lock_bh();
+ }
+ maxelem = h->maxelem;
+ elements = 0;
+ for (e = 0; e < ahash_numof_locks(t->htable_bits); e++)
+ elements += t->hregion[e].elements;
+ if (elements >= maxelem && SET_WITH_FORCEADD(set))
forceadd = true;
}
+ rcu_read_unlock_bh();

- t = ipset_dereference_protected(h->table, set);
- key = HKEY(value, h->initval, t->htable_bits);
- n = __ipset_dereference_protected(hbucket(t, key), 1);
+ spin_lock_bh(&t->hregion[r].lock);
+ n = rcu_dereference_bh(hbucket(t, key));
if (!n) {
- if (forceadd || set->elements >= h->maxelem)
+ if (forceadd || elements >= maxelem)
goto set_full;
old = NULL;
n = kzalloc(sizeof(*n) + AHASH_INIT_SIZE * set->dsize,
GFP_ATOMIC);
- if (!n)
- return -ENOMEM;
+ if (!n) {
+ ret = -ENOMEM;
+ goto unlock;
+ }
n->size = AHASH_INIT_SIZE;
- set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+ t->hregion[r].ext_size +=
+ ext_size(AHASH_INIT_SIZE, set->dsize);
goto copy_elem;
}
for (i = 0; i < n->pos; i++) {
@@ -737,19 +916,16 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
}
data = ahash_data(n, i, set->dsize);
if (mtype_data_equal(data, d, &multi)) {
- if (flag_exist ||
- (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(data, set)))) {
+ if (flag_exist || SET_ELEM_EXPIRED(set, data)) {
/* Just the extensions could be overwritten */
j = i;
goto overwrite_extensions;
}
- return -IPSET_ERR_EXIST;
+ ret = -IPSET_ERR_EXIST;
+ goto unlock;
}
/* Reuse first timed out entry */
- if (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(data, set)) &&
- j == -1) {
+ if (SET_ELEM_EXPIRED(set, data) && j == -1) {
j = i;
reuse = true;
}
@@ -759,16 +935,16 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
if (!deleted) {
#ifdef IP_SET_HASH_WITH_NETS
for (i = 0; i < IPSET_NET_COUNT; i++)
- mtype_del_cidr(h,
+ mtype_del_cidr(set, h,
NCIDR_PUT(DCIDR_GET(data->cidr, i)),
i);
#endif
ip_set_ext_destroy(set, data);
- set->elements--;
+ t->hregion[r].elements--;
}
goto copy_data;
}
- if (set->elements >= h->maxelem)
+ if (elements >= maxelem)
goto set_full;
/* Create a new slot */
if (n->pos >= n->size) {
@@ -776,28 +952,32 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
if (n->size >= AHASH_MAX(h)) {
/* Trigger rehashing */
mtype_data_next(&h->next, d);
- return -EAGAIN;
+ ret = -EAGAIN;
+ goto resize;
}
old = n;
n = kzalloc(sizeof(*n) +
(old->size + AHASH_INIT_SIZE) * set->dsize,
GFP_ATOMIC);
- if (!n)
- return -ENOMEM;
+ if (!n) {
+ ret = -ENOMEM;
+ goto unlock;
+ }
memcpy(n, old, sizeof(struct hbucket) +
old->size * set->dsize);
n->size = old->size + AHASH_INIT_SIZE;
- set->ext_size += ext_size(AHASH_INIT_SIZE, set->dsize);
+ t->hregion[r].ext_size +=
+ ext_size(AHASH_INIT_SIZE, set->dsize);
}

copy_elem:
j = n->pos++;
data = ahash_data(n, j, set->dsize);
copy_data:
- set->elements++;
+ t->hregion[r].elements++;
#ifdef IP_SET_HASH_WITH_NETS
for (i = 0; i < IPSET_NET_COUNT; i++)
- mtype_add_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
+ mtype_add_cidr(set, h, NCIDR_PUT(DCIDR_GET(d->cidr, i)), i);
#endif
memcpy(data, d, sizeof(struct mtype_elem));
overwrite_extensions:
@@ -820,13 +1000,41 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
if (old)
kfree_rcu(old, rcu);
}
+ ret = 0;
+resize:
+ spin_unlock_bh(&t->hregion[r].lock);
+ if (atomic_read(&t->ref) && ext->target) {
+ /* Resize is in process and kernel side add, save values */
+ struct mtype_resize_ad *x;
+
+ x = kzalloc(sizeof(struct mtype_resize_ad), GFP_ATOMIC);
+ if (!x)
+ /* Don't bother */
+ goto out;
+ x->ad = IPSET_ADD;
+ memcpy(&x->d, value, sizeof(struct mtype_elem));
+ memcpy(&x->ext, ext, sizeof(struct ip_set_ext));
+ memcpy(&x->mext, mext, sizeof(struct ip_set_ext));
+ x->flags = flags;
+ spin_lock_bh(&set->lock);
+ list_add_tail(&x->list, &h->ad);
+ spin_unlock_bh(&set->lock);
+ }
+ goto out;

- return 0;
set_full:
if (net_ratelimit())
pr_warn("Set %s is full, maxelem %u reached\n",
- set->name, h->maxelem);
- return -IPSET_ERR_HASH_FULL;
+ set->name, maxelem);
+ ret = -IPSET_ERR_HASH_FULL;
+unlock:
+ spin_unlock_bh(&t->hregion[r].lock);
+out:
+ if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+ pr_debug("Table destroy after resize by add: %p\n", t);
+ mtype_ahash_destroy(set, t, false);
+ }
+ return ret;
}

/* Delete an element from the hash and free up space if possible.
@@ -840,13 +1048,23 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
const struct mtype_elem *d = value;
struct mtype_elem *data;
struct hbucket *n;
- int i, j, k, ret = -IPSET_ERR_EXIST;
+ struct mtype_resize_ad *x = NULL;
+ int i, j, k, r, ret = -IPSET_ERR_EXIST;
u32 key, multi = 0;
size_t dsize = set->dsize;

- t = ipset_dereference_protected(h->table, set);
+ /* Userspace add and resize is excluded by the mutex.
+ * Kernespace add does not trigger resize.
+ */
+ rcu_read_lock_bh();
+ t = rcu_dereference_bh(h->table);
key = HKEY(value, h->initval, t->htable_bits);
- n = __ipset_dereference_protected(hbucket(t, key), 1);
+ r = ahash_region(key, t->htable_bits);
+ atomic_inc(&t->uref);
+ rcu_read_unlock_bh();
+
+ spin_lock_bh(&t->hregion[r].lock);
+ n = rcu_dereference_bh(hbucket(t, key));
if (!n)
goto out;
for (i = 0, k = 0; i < n->pos; i++) {
@@ -857,8 +1075,7 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
data = ahash_data(n, i, dsize);
if (!mtype_data_equal(data, d, &multi))
continue;
- if (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(data, set)))
+ if (SET_ELEM_EXPIRED(set, data))
goto out;

ret = 0;
@@ -866,20 +1083,33 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
smp_mb__after_atomic();
if (i + 1 == n->pos)
n->pos--;
- set->elements--;
+ t->hregion[r].elements--;
#ifdef IP_SET_HASH_WITH_NETS
for (j = 0; j < IPSET_NET_COUNT; j++)
- mtype_del_cidr(h, NCIDR_PUT(DCIDR_GET(d->cidr, j)),
- j);
+ mtype_del_cidr(set, h,
+ NCIDR_PUT(DCIDR_GET(d->cidr, j)), j);
#endif
ip_set_ext_destroy(set, data);

+ if (atomic_read(&t->ref) && ext->target) {
+ /* Resize is in process and kernel side del,
+ * save values
+ */
+ x = kzalloc(sizeof(struct mtype_resize_ad),
+ GFP_ATOMIC);
+ if (x) {
+ x->ad = IPSET_DEL;
+ memcpy(&x->d, value,
+ sizeof(struct mtype_elem));
+ x->flags = flags;
+ }
+ }
for (; i < n->pos; i++) {
if (!test_bit(i, n->used))
k++;
}
if (n->pos == 0 && k == 0) {
- set->ext_size -= ext_size(n->size, dsize);
+ t->hregion[r].ext_size -= ext_size(n->size, dsize);
rcu_assign_pointer(hbucket(t, key), NULL);
kfree_rcu(n, rcu);
} else if (k >= AHASH_INIT_SIZE) {
@@ -898,7 +1128,8 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
k++;
}
tmp->pos = k;
- set->ext_size -= ext_size(AHASH_INIT_SIZE, dsize);
+ t->hregion[r].ext_size -=
+ ext_size(AHASH_INIT_SIZE, dsize);
rcu_assign_pointer(hbucket(t, key), tmp);
kfree_rcu(n, rcu);
}
@@ -906,6 +1137,16 @@ mtype_del(struct ip_set *set, void *value, const struct ip_set_ext *ext,
}

out:
+ spin_unlock_bh(&t->hregion[r].lock);
+ if (x) {
+ spin_lock_bh(&set->lock);
+ list_add(&x->list, &h->ad);
+ spin_unlock_bh(&set->lock);
+ }
+ if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
+ pr_debug("Table destroy after resize by del: %p\n", t);
+ mtype_ahash_destroy(set, t, false);
+ }
return ret;
}

@@ -991,6 +1232,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext,
int i, ret = 0;
u32 key, multi = 0;

+ rcu_read_lock_bh();
t = rcu_dereference_bh(h->table);
#ifdef IP_SET_HASH_WITH_NETS
/* If we test an IP address and not a network address,
@@ -1022,6 +1264,7 @@ mtype_test(struct ip_set *set, void *value, const struct ip_set_ext *ext,
goto out;
}
out:
+ rcu_read_unlock_bh();
return ret;
}

@@ -1033,23 +1276,14 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
const struct htable *t;
struct nlattr *nested;
size_t memsize;
+ u32 elements = 0;
+ size_t ext_size = 0;
u8 htable_bits;

- /* If any members have expired, set->elements will be wrong
- * mytype_expire function will update it with the right count.
- * we do not hold set->lock here, so grab it first.
- * set->elements can still be incorrect in the case of a huge set,
- * because elements might time out during the listing.
- */
- if (SET_WITH_TIMEOUT(set)) {
- spin_lock_bh(&set->lock);
- mtype_expire(set, h);
- spin_unlock_bh(&set->lock);
- }
-
rcu_read_lock_bh();
- t = rcu_dereference_bh_nfnl(h->table);
- memsize = mtype_ahash_memsize(h, t) + set->ext_size;
+ t = rcu_dereference_bh(h->table);
+ mtype_ext_size(set, &elements, &ext_size);
+ memsize = mtype_ahash_memsize(h, t) + ext_size + set->ext_size;
htable_bits = t->htable_bits;
rcu_read_unlock_bh();

@@ -1071,7 +1305,7 @@ mtype_head(struct ip_set *set, struct sk_buff *skb)
#endif
if (nla_put_net32(skb, IPSET_ATTR_REFERENCES, htonl(set->ref)) ||
nla_put_net32(skb, IPSET_ATTR_MEMSIZE, htonl(memsize)) ||
- nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(set->elements)))
+ nla_put_net32(skb, IPSET_ATTR_ELEMENTS, htonl(elements)))
goto nla_put_failure;
if (unlikely(ip_set_put_flags(skb, set)))
goto nla_put_failure;
@@ -1091,15 +1325,15 @@ mtype_uref(struct ip_set *set, struct netlink_callback *cb, bool start)

if (start) {
rcu_read_lock_bh();
- t = rcu_dereference_bh_nfnl(h->table);
+ t = ipset_dereference_bh_nfnl(h->table);
atomic_inc(&t->uref);
cb->args[IPSET_CB_PRIVATE] = (unsigned long)t;
rcu_read_unlock_bh();
} else if (cb->args[IPSET_CB_PRIVATE]) {
t = (struct htable *)cb->args[IPSET_CB_PRIVATE];
if (atomic_dec_and_test(&t->uref) && atomic_read(&t->ref)) {
- /* Resizing didn't destroy the hash table */
- pr_debug("Table destroy by dump: %p\n", t);
+ pr_debug("Table destroy after resize "
+ " by dump: %p\n", t);
mtype_ahash_destroy(set, t, false);
}
cb->args[IPSET_CB_PRIVATE] = 0;
@@ -1141,8 +1375,7 @@ mtype_list(const struct ip_set *set,
if (!test_bit(i, n->used))
continue;
e = ahash_data(n, i, set->dsize);
- if (SET_WITH_TIMEOUT(set) &&
- ip_set_timeout_expired(ext_timeout(e, set)))
+ if (SET_ELEM_EXPIRED(set, e))
continue;
pr_debug("list hash %lu hbucket %p i %u, data %p\n",
cb->args[IPSET_CB_ARG0], n, i, e);
@@ -1208,6 +1441,7 @@ static const struct ip_set_type_variant mtype_variant = {
.uref = mtype_uref,
.resize = mtype_resize,
.same_set = mtype_same_set,
+ .region_lock = true,
};

#ifdef IP_SET_EMIT_CREATE
@@ -1226,6 +1460,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
size_t hsize;
struct htype *h;
struct htable *t;
+ u32 i;

pr_debug("Create set %s with family %s\n",
set->name, set->family == NFPROTO_IPV4 ? "inet" : "inet6");
@@ -1294,6 +1529,15 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
kfree(h);
return -ENOMEM;
}
+ t->hregion = ip_set_alloc(ahash_sizeof_regions(hbits));
+ if (!t->hregion) {
+ kfree(t);
+ kfree(h);
+ return -ENOMEM;
+ }
+ h->gc.set = set;
+ for (i = 0; i < ahash_numof_locks(hbits); i++)
+ spin_lock_init(&t->hregion[i].lock);
h->maxelem = maxelem;
#ifdef IP_SET_HASH_WITH_NETMASK
h->netmask = netmask;
@@ -1304,9 +1548,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
get_random_bytes(&h->initval, sizeof(h->initval));

t->htable_bits = hbits;
+ t->maxelem = h->maxelem / ahash_numof_locks(hbits);
RCU_INIT_POINTER(h->table, t);

- h->set = set;
+ INIT_LIST_HEAD(&h->ad);
set->data = h;
#ifndef IP_SET_PROTO_UNDEF
if (set->family == NFPROTO_IPV4) {
@@ -1329,12 +1574,10 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
#ifndef IP_SET_PROTO_UNDEF
if (set->family == NFPROTO_IPV4)
#endif
- IPSET_TOKEN(HTYPE, 4_gc_init)(set,
- IPSET_TOKEN(HTYPE, 4_gc));
+ IPSET_TOKEN(HTYPE, 4_gc_init)(&h->gc);
#ifndef IP_SET_PROTO_UNDEF
else
- IPSET_TOKEN(HTYPE, 6_gc_init)(set,
- IPSET_TOKEN(HTYPE, 6_gc));
+ IPSET_TOKEN(HTYPE, 6_gc_init)(&h->gc);
#endif
}
pr_debug("create %s hashsize %u (%u) maxelem %u: %p(%p)\n",
--
2.20.1

2020-03-05 17:19:24

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 39/58] kbuild: add dt_binding_check to PHONY in a correct place

From: Masahiro Yamada <[email protected]>

[ Upstream commit c473a8d03ea8e03ca9d649b0b6d72fbcf6212c05 ]

The dt_binding_check is added to PHONY, but it is invisible when
$(dtstree) is empty. So, it is not specified as phony for
ARCH=x86 etc.

Add it to PHONY outside the ifneq ... endif block.

Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 63094ce5e5744..993f87d5a0ba5 100644
--- a/Makefile
+++ b/Makefile
@@ -1242,7 +1242,7 @@ ifneq ($(dtstree),)
%.dtb: include/config/kernel.release scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree) $(dtstree)/$@

-PHONY += dtbs dtbs_install dtbs_check dt_binding_check
+PHONY += dtbs dtbs_install dtbs_check
dtbs dtbs_check: include/config/kernel.release scripts_dtc
$(Q)$(MAKE) $(build)=$(dtstree)

@@ -1262,6 +1262,7 @@ PHONY += scripts_dtc
scripts_dtc: scripts_basic
$(Q)$(MAKE) $(build)=scripts/dtc

+PHONY += dt_binding_check
dt_binding_check: scripts_dtc
$(Q)$(MAKE) $(build)=Documentation/devicetree/bindings

--
2.20.1

2020-03-05 17:19:53

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 36/58] drm/amdgpu: fix memory leak during TDR test(v2)

From: Monk Liu <[email protected]>

[ Upstream commit 4829f89855f1d3a3d8014e74cceab51b421503db ]

fix system memory leak

v2:
fix coding style

Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/gpu/drm/amd/powerplay/smu_v11_0.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
index c5257ae3188a3..0922d9cd858a0 100644
--- a/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
+++ b/drivers/gpu/drm/amd/powerplay/smu_v11_0.c
@@ -988,8 +988,12 @@ static int smu_v11_0_init_max_sustainable_clocks(struct smu_context *smu)
struct smu_11_0_max_sustainable_clocks *max_sustainable_clocks;
int ret = 0;

- max_sustainable_clocks = kzalloc(sizeof(struct smu_11_0_max_sustainable_clocks),
+ if (!smu->smu_table.max_sustainable_clocks)
+ max_sustainable_clocks = kzalloc(sizeof(struct smu_11_0_max_sustainable_clocks),
GFP_KERNEL);
+ else
+ max_sustainable_clocks = smu->smu_table.max_sustainable_clocks;
+
smu->smu_table.max_sustainable_clocks = (void *)max_sustainable_clocks;

max_sustainable_clocks->uclock = smu->smu_table.boot_values.uclk / 100;
--
2.20.1

2020-03-05 17:19:56

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 37/58] kbuild: fix DT binding schema rule to detect command line changes

From: Masahiro Yamada <[email protected]>

[ Upstream commit 7a04960560640ac5b0b89461f7757322b57d0c7a ]

This if_change_rule is not working properly; it cannot detect any
command line change.

The reason is because cmd-check in scripts/Kbuild.include compares
$(cmd_$@) and $(cmd_$1), but cmd_dtc_dt_yaml does not exist here.

For if_change_rule to work properly, the stem part of cmd_* and rule_*
must match. Because this cmd_and_fixdep invokes cmd_dtc, this rule must
be named rule_dtc.

Fixes: 4f0e3a57d6eb ("kbuild: Add support for DT binding schema checks")
Signed-off-by: Masahiro Yamada <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
scripts/Makefile.lib | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.lib b/scripts/Makefile.lib
index 179d55af58529..8a6663580b8e1 100644
--- a/scripts/Makefile.lib
+++ b/scripts/Makefile.lib
@@ -305,13 +305,13 @@ DT_TMP_SCHEMA := $(objtree)/$(DT_BINDING_DIR)/processed-schema.yaml
quiet_cmd_dtb_check = CHECK $@
cmd_dtb_check = $(DT_CHECKER) -u $(srctree)/$(DT_BINDING_DIR) -p $(DT_TMP_SCHEMA) $@ ;

-define rule_dtc_dt_yaml
+define rule_dtc
$(call cmd_and_fixdep,dtc,yaml)
$(call cmd,dtb_check)
endef

$(obj)/%.dt.yaml: $(src)/%.dts $(DTC) $(DT_TMP_SCHEMA) FORCE
- $(call if_changed_rule,dtc_dt_yaml)
+ $(call if_changed_rule,dtc)

dtc-tmp = $(subst $(comma),_,$(dot-target).dts.tmp)

--
2.20.1

2020-03-05 17:20:03

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 35/58] blk-mq: insert passthrough request into hctx->dispatch directly

From: Ming Lei <[email protected]>

[ Upstream commit 01e99aeca3979600302913cef3f89076786f32c8 ]

For some reason, device may be in one situation which can't handle
FS request, so STS_RESOURCE is always returned and the FS request
will be added to hctx->dispatch. However passthrough request may
be required at that time for fixing the problem. If passthrough
request is added to scheduler queue, there isn't any chance for
blk-mq to dispatch it given we prioritize requests in hctx->dispatch.
Then the FS IO request may never be completed, and IO hang is caused.

So passthrough request has to be added to hctx->dispatch directly
for fixing the IO hang.

Fix this issue by inserting passthrough request into hctx->dispatch
directly together withing adding FS request to the tail of
hctx->dispatch in blk_mq_dispatch_rq_list(). Actually we add FS request
to tail of hctx->dispatch at default, see blk_mq_request_bypass_insert().

Then it becomes consistent with original legacy IO request
path, in which passthrough request is always added to q->queue_head.

Cc: Dongli Zhang <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Ewan D. Milne <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
block/blk-flush.c | 2 +-
block/blk-mq-sched.c | 22 +++++++++++++++-------
block/blk-mq.c | 18 +++++++++++-------
block/blk-mq.h | 3 ++-
4 files changed, 29 insertions(+), 16 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index b1f0a1ac505c9..5aa6fada22598 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -399,7 +399,7 @@ void blk_insert_flush(struct request *rq)
*/
if ((policy & REQ_FSEQ_DATA) &&
!(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
- blk_mq_request_bypass_insert(rq, false);
+ blk_mq_request_bypass_insert(rq, false, false);
return;
}

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index ca22afd47b3dc..856356b1619e8 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -361,13 +361,19 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx,
bool has_sched,
struct request *rq)
{
- /* dispatch flush rq directly */
- if (rq->rq_flags & RQF_FLUSH_SEQ) {
- spin_lock(&hctx->lock);
- list_add(&rq->queuelist, &hctx->dispatch);
- spin_unlock(&hctx->lock);
+ /*
+ * dispatch flush and passthrough rq directly
+ *
+ * passthrough request has to be added to hctx->dispatch directly.
+ * For some reason, device may be in one situation which can't
+ * handle FS request, so STS_RESOURCE is always returned and the
+ * FS request will be added to hctx->dispatch. However passthrough
+ * request may be required at that time for fixing the problem. If
+ * passthrough request is added to scheduler queue, there isn't any
+ * chance to dispatch it given we prioritize requests in hctx->dispatch.
+ */
+ if ((rq->rq_flags & RQF_FLUSH_SEQ) || blk_rq_is_passthrough(rq))
return true;
- }

if (has_sched)
rq->rq_flags |= RQF_SORTED;
@@ -391,8 +397,10 @@ void blk_mq_sched_insert_request(struct request *rq, bool at_head,

WARN_ON(e && (rq->tag != -1));

- if (blk_mq_sched_bypass_insert(hctx, !!e, rq))
+ if (blk_mq_sched_bypass_insert(hctx, !!e, rq)) {
+ blk_mq_request_bypass_insert(rq, at_head, false);
goto run;
+ }

if (e && e->type->ops.insert_requests) {
LIST_HEAD(list);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index ec791156e9ccd..3c1abab1fdf52 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -761,7 +761,7 @@ static void blk_mq_requeue_work(struct work_struct *work)
* merge.
*/
if (rq->rq_flags & RQF_DONTPREP)
- blk_mq_request_bypass_insert(rq, false);
+ blk_mq_request_bypass_insert(rq, false, false);
else
blk_mq_sched_insert_request(rq, true, false, false);
}
@@ -1313,7 +1313,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
q->mq_ops->commit_rqs(hctx);

spin_lock(&hctx->lock);
- list_splice_init(list, &hctx->dispatch);
+ list_splice_tail_init(list, &hctx->dispatch);
spin_unlock(&hctx->lock);

/*
@@ -1668,12 +1668,16 @@ void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
* Should only be used carefully, when the caller knows we want to
* bypass a potential IO scheduler on the target device.
*/
-void blk_mq_request_bypass_insert(struct request *rq, bool run_queue)
+void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
+ bool run_queue)
{
struct blk_mq_hw_ctx *hctx = rq->mq_hctx;

spin_lock(&hctx->lock);
- list_add_tail(&rq->queuelist, &hctx->dispatch);
+ if (at_head)
+ list_add(&rq->queuelist, &hctx->dispatch);
+ else
+ list_add_tail(&rq->queuelist, &hctx->dispatch);
spin_unlock(&hctx->lock);

if (run_queue)
@@ -1863,7 +1867,7 @@ static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
if (bypass_insert)
return BLK_STS_RESOURCE;

- blk_mq_request_bypass_insert(rq, run_queue);
+ blk_mq_request_bypass_insert(rq, false, run_queue);
return BLK_STS_OK;
}

@@ -1879,7 +1883,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,

ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false, true);
if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
- blk_mq_request_bypass_insert(rq, true);
+ blk_mq_request_bypass_insert(rq, false, true);
else if (ret != BLK_STS_OK)
blk_mq_end_request(rq, ret);

@@ -1913,7 +1917,7 @@ void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
if (ret != BLK_STS_OK) {
if (ret == BLK_STS_RESOURCE ||
ret == BLK_STS_DEV_RESOURCE) {
- blk_mq_request_bypass_insert(rq,
+ blk_mq_request_bypass_insert(rq, false,
list_empty(list));
break;
}
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 32c62c64e6c2b..f2075978db500 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -66,7 +66,8 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
*/
void __blk_mq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
bool at_head);
-void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
+void blk_mq_request_bypass_insert(struct request *rq, bool at_head,
+ bool run_queue);
void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
struct list_head *list);

--
2.20.1

2020-03-05 17:20:08

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 34/58] net: ll_temac: Handle DMA halt condition caused by buffer underrun

From: Esben Haabendal <[email protected]>

[ Upstream commit 1d63b8d66d146deaaedbe16c80de105f685ea012 ]

The SDMA engine used by TEMAC halts operation when it has finished
processing of the last buffer descriptor in the buffer ring.
Unfortunately, no interrupt event is generated when this happens,
so we need to setup another mechanism to make sure DMA operation is
restarted when enough buffers have been added to the ring.

Fixes: 92744989533c ("net: add Xilinx ll_temac device driver")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/xilinx/ll_temac.h | 3 ++
drivers/net/ethernet/xilinx/ll_temac_main.c | 58 +++++++++++++++++++--
2 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/ll_temac.h b/drivers/net/ethernet/xilinx/ll_temac.h
index 99fe059e5c7f3..53fb8141f1a67 100644
--- a/drivers/net/ethernet/xilinx/ll_temac.h
+++ b/drivers/net/ethernet/xilinx/ll_temac.h
@@ -380,6 +380,9 @@ struct temac_local {
/* DMA channel control setup */
u32 tx_chnl_ctrl;
u32 rx_chnl_ctrl;
+ u8 coalesce_count_rx;
+
+ struct delayed_work restart_work;
};

/* Wrappers for temac_ior()/temac_iow() function pointers above */
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 2e3f59dae586e..eb480204cdbeb 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -51,6 +51,7 @@
#include <linux/ip.h>
#include <linux/slab.h>
#include <linux/interrupt.h>
+#include <linux/workqueue.h>
#include <linux/dma-mapping.h>
#include <linux/processor.h>
#include <linux/platform_data/xilinx-ll-temac.h>
@@ -866,8 +867,11 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
skb_headlen(skb), DMA_TO_DEVICE);
cur_p->len = cpu_to_be32(skb_headlen(skb));
- if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr)))
- return NETDEV_TX_BUSY;
+ if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr))) {
+ dev_kfree_skb_any(skb);
+ ndev->stats.tx_dropped++;
+ return NETDEV_TX_OK;
+ }
cur_p->phys = cpu_to_be32(skb_dma_addr);
ptr_to_txbd((void *)skb, cur_p);

@@ -897,7 +901,9 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
dma_unmap_single(ndev->dev.parent,
be32_to_cpu(cur_p->phys),
skb_headlen(skb), DMA_TO_DEVICE);
- return NETDEV_TX_BUSY;
+ dev_kfree_skb_any(skb);
+ ndev->stats.tx_dropped++;
+ return NETDEV_TX_OK;
}
cur_p->phys = cpu_to_be32(skb_dma_addr);
cur_p->len = cpu_to_be32(skb_frag_size(frag));
@@ -920,6 +926,17 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
return NETDEV_TX_OK;
}

+static int ll_temac_recv_buffers_available(struct temac_local *lp)
+{
+ int available;
+
+ if (!lp->rx_skb[lp->rx_bd_ci])
+ return 0;
+ available = 1 + lp->rx_bd_tail - lp->rx_bd_ci;
+ if (available <= 0)
+ available += RX_BD_NUM;
+ return available;
+}

static void ll_temac_recv(struct net_device *ndev)
{
@@ -990,6 +1007,18 @@ static void ll_temac_recv(struct net_device *ndev)
lp->rx_bd_ci = 0;
} while (rx_bd != lp->rx_bd_tail);

+ /* DMA operations will halt when the last buffer descriptor is
+ * processed (ie. the one pointed to by RX_TAILDESC_PTR).
+ * When that happens, no more interrupt events will be
+ * generated. No IRQ_COAL or IRQ_DLY, and not even an
+ * IRQ_ERR. To avoid stalling, we schedule a delayed work
+ * when there is a potential risk of that happening. The work
+ * will call this function, and thus re-schedule itself until
+ * enough buffers are available again.
+ */
+ if (ll_temac_recv_buffers_available(lp) < lp->coalesce_count_rx)
+ schedule_delayed_work(&lp->restart_work, HZ / 1000);
+
/* Allocate new buffers for those buffer descriptors that were
* passed to network stack. Note that GFP_ATOMIC allocations
* can fail (e.g. when a larger burst of GFP_ATOMIC
@@ -1045,6 +1074,18 @@ static void ll_temac_recv(struct net_device *ndev)
spin_unlock_irqrestore(&lp->rx_lock, flags);
}

+/* Function scheduled to ensure a restart in case of DMA halt
+ * condition caused by running out of buffer descriptors.
+ */
+static void ll_temac_restart_work_func(struct work_struct *work)
+{
+ struct temac_local *lp = container_of(work, struct temac_local,
+ restart_work.work);
+ struct net_device *ndev = lp->ndev;
+
+ ll_temac_recv(ndev);
+}
+
static irqreturn_t ll_temac_tx_irq(int irq, void *_ndev)
{
struct net_device *ndev = _ndev;
@@ -1137,6 +1178,8 @@ static int temac_stop(struct net_device *ndev)

dev_dbg(&ndev->dev, "temac_close()\n");

+ cancel_delayed_work_sync(&lp->restart_work);
+
free_irq(lp->tx_irq, ndev);
free_irq(lp->rx_irq, ndev);

@@ -1269,6 +1312,7 @@ static int temac_probe(struct platform_device *pdev)
lp->dev = &pdev->dev;
lp->options = XTE_OPTION_DEFAULTS;
spin_lock_init(&lp->rx_lock);
+ INIT_DELAYED_WORK(&lp->restart_work, ll_temac_restart_work_func);

/* Setup mutex for synchronization of indirect register access */
if (pdata) {
@@ -1375,6 +1419,7 @@ static int temac_probe(struct platform_device *pdev)
*/
lp->tx_chnl_ctrl = 0x10220000;
lp->rx_chnl_ctrl = 0xff070000;
+ lp->coalesce_count_rx = 0x07;

/* Finished with the DMA node; drop the reference */
of_node_put(dma_np);
@@ -1406,11 +1451,14 @@ static int temac_probe(struct platform_device *pdev)
(pdata->tx_irq_count << 16);
else
lp->tx_chnl_ctrl = 0x10220000;
- if (pdata->rx_irq_timeout || pdata->rx_irq_count)
+ if (pdata->rx_irq_timeout || pdata->rx_irq_count) {
lp->rx_chnl_ctrl = (pdata->rx_irq_timeout << 24) |
(pdata->rx_irq_count << 16);
- else
+ lp->coalesce_count_rx = pdata->rx_irq_count;
+ } else {
lp->rx_chnl_ctrl = 0xff070000;
+ lp->coalesce_count_rx = 0x07;
+ }
}

/* Error handle returned DMA RX and TX interrupts */
--
2.20.1

2020-03-05 17:20:23

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 32/58] net: ll_temac: Add more error handling of dma_map_single() calls

From: Esben Haabendal <[email protected]>

[ Upstream commit d07c849cd2b97d6809430dfb7e738ad31088037a ]

This adds error handling to the remaining dma_map_single() calls, so that
behavior is well defined if/when we run out of DMA memory.

Fixes: 92744989533c ("net: add Xilinx ll_temac device driver")
Signed-off-by: Esben Haabendal <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/xilinx/ll_temac_main.c | 26 +++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index fd578568b3bff..fd4231493449b 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -367,6 +367,8 @@ static int temac_dma_bd_init(struct net_device *ndev)
skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
XTE_MAX_JUMBO_FRAME_SIZE,
DMA_FROM_DEVICE);
+ if (dma_mapping_error(ndev->dev.parent, skb_dma_addr))
+ goto out;
lp->rx_bd_v[i].phys = cpu_to_be32(skb_dma_addr);
lp->rx_bd_v[i].len = cpu_to_be32(XTE_MAX_JUMBO_FRAME_SIZE);
lp->rx_bd_v[i].app0 = cpu_to_be32(STS_CTRL_APP0_IRQONEND);
@@ -863,12 +865,13 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
skb_dma_addr = dma_map_single(ndev->dev.parent, skb->data,
skb_headlen(skb), DMA_TO_DEVICE);
cur_p->len = cpu_to_be32(skb_headlen(skb));
+ if (WARN_ON_ONCE(dma_mapping_error(ndev->dev.parent, skb_dma_addr)))
+ return NETDEV_TX_BUSY;
cur_p->phys = cpu_to_be32(skb_dma_addr);
ptr_to_txbd((void *)skb, cur_p);

for (ii = 0; ii < num_frag; ii++) {
- lp->tx_bd_tail++;
- if (lp->tx_bd_tail >= TX_BD_NUM)
+ if (++lp->tx_bd_tail >= TX_BD_NUM)
lp->tx_bd_tail = 0;

cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
@@ -876,6 +879,25 @@ temac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
skb_frag_address(frag),
skb_frag_size(frag),
DMA_TO_DEVICE);
+ if (dma_mapping_error(ndev->dev.parent, skb_dma_addr)) {
+ if (--lp->tx_bd_tail < 0)
+ lp->tx_bd_tail = TX_BD_NUM - 1;
+ cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
+ while (--ii >= 0) {
+ --frag;
+ dma_unmap_single(ndev->dev.parent,
+ be32_to_cpu(cur_p->phys),
+ skb_frag_size(frag),
+ DMA_TO_DEVICE);
+ if (--lp->tx_bd_tail < 0)
+ lp->tx_bd_tail = TX_BD_NUM - 1;
+ cur_p = &lp->tx_bd_v[lp->tx_bd_tail];
+ }
+ dma_unmap_single(ndev->dev.parent,
+ be32_to_cpu(cur_p->phys),
+ skb_headlen(skb), DMA_TO_DEVICE);
+ return NETDEV_TX_BUSY;
+ }
cur_p->phys = cpu_to_be32(skb_dma_addr);
cur_p->len = cpu_to_be32(skb_frag_size(frag));
cur_p->app0 = 0;
--
2.20.1

2020-03-05 17:20:37

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 28/58] hv_netvsc: Fix unwanted wakeup in netvsc_attach()

From: Haiyang Zhang <[email protected]>

[ Upstream commit f6f13c125e05603f68f5bf31f045b95e6d493598 ]

When netvsc_attach() is called by operations like changing MTU, etc.,
an extra wakeup may happen while netvsc_attach() calling
rndis_filter_device_add() which sends rndis messages when queue is
stopped in netvsc_detach(). The completion message will wake up queue 0.

We can reproduce the issue by changing MTU etc., then the wake_queue
counter from "ethtool -S" will increase beyond stop_queue counter:
stop_queue: 0
wake_queue: 1
The issue causes queue wake up, and counter increment, no other ill
effects in current code. So we didn't see any network problem for now.

To fix this, initialize tx_disable to true, and set it to false when
the NIC is ready to be attached or registered.

Fixes: 7b2ee50c0cd5 ("hv_netvsc: common detach logic")
Signed-off-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/hyperv/netvsc.c | 2 +-
drivers/net/hyperv/netvsc_drv.c | 3 +++
2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index eab83e71567a9..6c0732fc8c250 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -99,7 +99,7 @@ static struct netvsc_device *alloc_net_device(void)

init_waitqueue_head(&net_device->wait_drain);
net_device->destroy = false;
- net_device->tx_disable = false;
+ net_device->tx_disable = true;

net_device->max_pkt = RNDIS_MAX_PKT_DEFAULT;
net_device->pkt_align = RNDIS_PKT_ALIGN_DEFAULT;
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 0dee358864f30..ca16ae8c83328 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -973,6 +973,7 @@ static int netvsc_attach(struct net_device *ndev,
}

/* In any case device is now ready */
+ nvdev->tx_disable = false;
netif_device_attach(ndev);

/* Note: enable and attach happen when sub-channels setup */
@@ -2350,6 +2351,8 @@ static int netvsc_probe(struct hv_device *dev,
else
net->max_mtu = ETH_DATA_LEN;

+ nvdev->tx_disable = false;
+
ret = register_netdevice(net);
if (ret != 0) {
pr_err("Unable to register netdev.\n");
--
2.20.1

2020-03-05 17:20:43

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 29/58] net: ks8851-ml: Fix IRQ handling and locking

From: Marek Vasut <[email protected]>

[ Upstream commit 44343418d0f2f623cb9da6f5000df793131cbe3b ]

The KS8851 requires that packet RX and TX are mutually exclusive.
Currently, the driver hopes to achieve this by disabling interrupt
from the card by writing the card registers and by disabling the
interrupt on the interrupt controller. This however is racy on SMP.

Replace this approach by expanding the spinlock used around the
ks_start_xmit() TX path to ks_irq() RX path to assure true mutual
exclusion and remove the interrupt enabling/disabling, which is
now not needed anymore. Furthermore, disable interrupts also in
ks_net_stop(), which was missing before.

Note that a massive improvement here would be to re-use the KS8851
driver approach, which is to move the TX path into a worker thread,
interrupt handling to threaded interrupt, and synchronize everything
with mutexes, but that would be a much bigger rework, for a separate
patch.

Signed-off-by: Marek Vasut <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Lukas Wunner <[email protected]>
Cc: Petr Stetiar <[email protected]>
Cc: YueHaibing <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/ethernet/micrel/ks8851_mll.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/micrel/ks8851_mll.c b/drivers/net/ethernet/micrel/ks8851_mll.c
index a41a90c589db2..20cb5b500661a 100644
--- a/drivers/net/ethernet/micrel/ks8851_mll.c
+++ b/drivers/net/ethernet/micrel/ks8851_mll.c
@@ -548,14 +548,17 @@ static irqreturn_t ks_irq(int irq, void *pw)
{
struct net_device *netdev = pw;
struct ks_net *ks = netdev_priv(netdev);
+ unsigned long flags;
u16 status;

+ spin_lock_irqsave(&ks->statelock, flags);
/*this should be the first in IRQ handler */
ks_save_cmd_reg(ks);

status = ks_rdreg16(ks, KS_ISR);
if (unlikely(!status)) {
ks_restore_cmd_reg(ks);
+ spin_unlock_irqrestore(&ks->statelock, flags);
return IRQ_NONE;
}

@@ -581,6 +584,7 @@ static irqreturn_t ks_irq(int irq, void *pw)
ks->netdev->stats.rx_over_errors++;
/* this should be the last in IRQ handler*/
ks_restore_cmd_reg(ks);
+ spin_unlock_irqrestore(&ks->statelock, flags);
return IRQ_HANDLED;
}

@@ -650,6 +654,7 @@ static int ks_net_stop(struct net_device *netdev)

/* shutdown RX/TX QMU */
ks_disable_qmu(ks);
+ ks_disable_int(ks);

/* set powermode to soft power down to save power */
ks_set_powermode(ks, PMECR_PM_SOFTDOWN);
@@ -706,10 +711,9 @@ static netdev_tx_t ks_start_xmit(struct sk_buff *skb, struct net_device *netdev)
{
netdev_tx_t retv = NETDEV_TX_OK;
struct ks_net *ks = netdev_priv(netdev);
+ unsigned long flags;

- disable_irq(netdev->irq);
- ks_disable_int(ks);
- spin_lock(&ks->statelock);
+ spin_lock_irqsave(&ks->statelock, flags);

/* Extra space are required:
* 4 byte for alignment, 4 for status/length, 4 for CRC
@@ -723,9 +727,7 @@ static netdev_tx_t ks_start_xmit(struct sk_buff *skb, struct net_device *netdev)
dev_kfree_skb(skb);
} else
retv = NETDEV_TX_BUSY;
- spin_unlock(&ks->statelock);
- ks_enable_int(ks);
- enable_irq(netdev->irq);
+ spin_unlock_irqrestore(&ks->statelock, flags);
return retv;
}

--
2.20.1

2020-03-05 17:20:45

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 25/58] netfilter: ipset: Fix forceadd evaluation path

From: Jozsef Kadlecsik <[email protected]>

[ Upstream commit 8af1c6fbd9239877998c7f5a591cb2c88d41fb66 ]

When the forceadd option is enabled, the hash:* types should find and replace
the first entry in the bucket with the new one if there are no reuseable
(deleted or timed out) entries. However, the position index was just not set
to zero and remained the invalid -1 if there were no reuseable entries.

Reported-by: [email protected]
Fixes: 23c42a403a9c ("netfilter: ipset: Introduction of new commands and protocol version 7")
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/netfilter/ipset/ip_set_hash_gen.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 2ac28c5c7e957..2389c9f89e481 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -931,6 +931,8 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
}
}
if (reuse || forceadd) {
+ if (j == -1)
+ j = 0;
data = ahash_data(n, j, set->dsize);
if (!deleted) {
#ifdef IP_SET_HASH_WITH_NETS
--
2.20.1

2020-03-05 17:21:09

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 15/58] HID: hid-bigbenff: call hid_hw_stop() in case of error

From: Hanno Zulla <[email protected]>

[ Upstream commit 976a54d0f4202cb412a3b1fc7f117e1d97db35f3 ]

It's required to call hid_hw_stop() once hid_hw_start() was called
previously, so error cases need to handle this. Also, hid_hw_close() is
not necessary during removal.

Signed-off-by: Hanno Zulla <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/hid-bigbenff.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
index f7e85bacb6889..f8c552b64a899 100644
--- a/drivers/hid/hid-bigbenff.c
+++ b/drivers/hid/hid-bigbenff.c
@@ -305,7 +305,6 @@ static void bigben_remove(struct hid_device *hid)
struct bigben_device *bigben = hid_get_drvdata(hid);

cancel_work_sync(&bigben->worker);
- hid_hw_close(hid);
hid_hw_stop(hid);
}

@@ -350,7 +349,7 @@ static int bigben_probe(struct hid_device *hid,
error = input_ff_create_memless(hidinput->input, NULL,
hid_bigben_play_effect);
if (error)
- return error;
+ goto error_hw_stop;

name_sz = strlen(dev_name(&hid->dev)) + strlen(":red:bigben#") + 1;

@@ -360,8 +359,10 @@ static int bigben_probe(struct hid_device *hid,
sizeof(struct led_classdev) + name_sz,
GFP_KERNEL
);
- if (!led)
- return -ENOMEM;
+ if (!led) {
+ error = -ENOMEM;
+ goto error_hw_stop;
+ }
name = (void *)(&led[1]);
snprintf(name, name_sz,
"%s:red:bigben%d",
@@ -375,7 +376,7 @@ static int bigben_probe(struct hid_device *hid,
bigben->leds[n] = led;
error = devm_led_classdev_register(&hid->dev, led);
if (error)
- return error;
+ goto error_hw_stop;
}

/* initial state: LED1 is on, no rumble effect */
@@ -389,6 +390,10 @@ static int bigben_probe(struct hid_device *hid,
hid_info(hid, "LED and force feedback support for BigBen gamepad\n");

return 0;
+
+error_hw_stop:
+ hid_hw_stop(hid);
+ return error;
}

static __u8 *bigben_report_fixup(struct hid_device *hid, __u8 *rdesc,
--
2.20.1

2020-03-05 17:21:19

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 23/58] scsi: libfc: free response frame from GPN_ID

From: Igor Druzhinin <[email protected]>

[ Upstream commit ff6993bb79b9f99bdac0b5378169052931b65432 ]

fc_disc_gpn_id_resp() should be the last function using it so free it here
to avoid memory leak.

Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Hannes Reinecke <[email protected]>
Signed-off-by: Igor Druzhinin <[email protected]>
Signed-off-by: Martin K. Petersen <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/scsi/libfc/fc_disc.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/libfc/fc_disc.c b/drivers/scsi/libfc/fc_disc.c
index 9c5f7c9178c66..2b865c6423e29 100644
--- a/drivers/scsi/libfc/fc_disc.c
+++ b/drivers/scsi/libfc/fc_disc.c
@@ -628,6 +628,8 @@ static void fc_disc_gpn_id_resp(struct fc_seq *sp, struct fc_frame *fp,
}
out:
kref_put(&rdata->kref, fc_rport_destroy);
+ if (!IS_ERR(fp))
+ fc_frame_free(fp);
}

/**
--
2.20.1

2020-03-05 17:21:28

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 21/58] cfg80211: check reg_rule for NULL in handle_channel_custom()

From: Johannes Berg <[email protected]>

[ Upstream commit a7ee7d44b57c9ae174088e53a668852b7f4f452d ]

We may end up with a NULL reg_rule after the loop in
handle_channel_custom() if the bandwidth didn't fit,
check if this is the case and bail out if so.

Signed-off-by: Johannes Berg <[email protected]>
Link: https://lore.kernel.org/r/20200221104449.3b558a50201c.I4ad3725c4dacaefd2d18d3cc65ba6d18acd5dbfe@changeid
Signed-off-by: Johannes Berg <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
net/wireless/reg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index fff9a74891fc4..1a8218f1bbe07 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -2276,7 +2276,7 @@ static void handle_channel_custom(struct wiphy *wiphy,
break;
}

- if (IS_ERR(reg_rule)) {
+ if (IS_ERR_OR_NULL(reg_rule)) {
pr_debug("Disabling freq %d MHz as custom regd has no rule that fits it\n",
chan->center_freq);
if (wiphy->regulatory_flags & REGULATORY_WIPHY_SELF_MANAGED) {
--
2.20.1

2020-03-05 17:21:29

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 16/58] HID: hid-bigbenff: fix race condition for scheduled work during removal

From: Hanno Zulla <[email protected]>

[ Upstream commit 4eb1b01de5b9d8596d6c103efcf1a15cfc1bedf7 ]

It's possible that there is scheduled work left while the device is
already being removed, which can cause a kernel crash. Adding a flag
will avoid this.

Signed-off-by: Hanno Zulla <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/hid-bigbenff.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
index f8c552b64a899..db6da21ade063 100644
--- a/drivers/hid/hid-bigbenff.c
+++ b/drivers/hid/hid-bigbenff.c
@@ -174,6 +174,7 @@ static __u8 pid0902_rdesc_fixed[] = {
struct bigben_device {
struct hid_device *hid;
struct hid_report *report;
+ bool removed;
u8 led_state; /* LED1 = 1 .. LED4 = 8 */
u8 right_motor_on; /* right motor off/on 0/1 */
u8 left_motor_force; /* left motor force 0-255 */
@@ -190,6 +191,9 @@ static void bigben_worker(struct work_struct *work)
struct bigben_device, worker);
struct hid_field *report_field = bigben->report->field[0];

+ if (bigben->removed)
+ return;
+
if (bigben->work_led) {
bigben->work_led = false;
report_field->value[0] = 0x01; /* 1 = led message */
@@ -304,6 +308,7 @@ static void bigben_remove(struct hid_device *hid)
{
struct bigben_device *bigben = hid_get_drvdata(hid);

+ bigben->removed = true;
cancel_work_sync(&bigben->worker);
hid_hw_stop(hid);
}
@@ -324,6 +329,7 @@ static int bigben_probe(struct hid_device *hid,
return -ENOMEM;
hid_set_drvdata(hid, bigben);
bigben->hid = hid;
+ bigben->removed = false;

error = hid_parse(hid);
if (error) {
--
2.20.1

2020-03-05 17:21:41

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 14/58] HID: hid-bigbenff: fix general protection fault caused by double kfree

From: Hanno Zulla <[email protected]>

[ Upstream commit 789a2c250340666220fa74bc6c8f58497e3863b3 ]

The struct *bigben was allocated via devm_kzalloc() and then used as a
parameter in input_ff_create_memless(). This caused a double kfree
during removal of the device, since both the managed resource API and
ml_ff_destroy() in drivers/input/ff-memless.c would call kfree() on it.

Signed-off-by: Hanno Zulla <[email protected]>
Signed-off-by: Benjamin Tissoires <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/hid-bigbenff.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
index 3f6abd190df43..f7e85bacb6889 100644
--- a/drivers/hid/hid-bigbenff.c
+++ b/drivers/hid/hid-bigbenff.c
@@ -220,10 +220,16 @@ static void bigben_worker(struct work_struct *work)
static int hid_bigben_play_effect(struct input_dev *dev, void *data,
struct ff_effect *effect)
{
- struct bigben_device *bigben = data;
+ struct hid_device *hid = input_get_drvdata(dev);
+ struct bigben_device *bigben = hid_get_drvdata(hid);
u8 right_motor_on;
u8 left_motor_force;

+ if (!bigben) {
+ hid_err(hid, "no device data\n");
+ return 0;
+ }
+
if (effect->type != FF_RUMBLE)
return 0;

@@ -341,7 +347,7 @@ static int bigben_probe(struct hid_device *hid,

INIT_WORK(&bigben->worker, bigben_worker);

- error = input_ff_create_memless(hidinput->input, bigben,
+ error = input_ff_create_memless(hidinput->input, NULL,
hid_bigben_play_effect);
if (error)
return error;
--
2.20.1

2020-03-05 17:21:47

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 13/58] MIPS: VPE: Fix a double free and a memory leak in 'release_vpe()'

From: Christophe JAILLET <[email protected]>

[ Upstream commit bef8e2dfceed6daeb6ca3e8d33f9c9d43b926580 ]

Pointer on the memory allocated by 'alloc_progmem()' is stored in
'v->load_addr'. So this is this memory that should be freed by
'release_progmem()'.

'release_progmem()' is only a call to 'kfree()'.

With the current code, there is both a double free and a memory leak.
Fix it by passing the correct pointer to 'release_progmem()'.

Fixes: e01402b115ccc ("More AP / SP bits for the 34K, the Malta bits and things. Still wants")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/kernel/vpe.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/mips/kernel/vpe.c b/arch/mips/kernel/vpe.c
index 6176b9acba950..d0d832ab3d3b8 100644
--- a/arch/mips/kernel/vpe.c
+++ b/arch/mips/kernel/vpe.c
@@ -134,7 +134,7 @@ void release_vpe(struct vpe *v)
{
list_del(&v->list);
if (v->load_addr)
- release_progmem(v);
+ release_progmem(v->load_addr);
kfree(v);
}

--
2.20.1

2020-03-05 17:21:48

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 27/58] net: usb: qmi_wwan: restore mtu min/max values after raw_ip switch

From: Daniele Palmas <[email protected]>

[ Upstream commit eae7172f8141eb98e64e6e81acc9e9d5b2add127 ]

usbnet creates network interfaces with min_mtu = 0 and
max_mtu = ETH_MAX_MTU.

These values are not modified by qmi_wwan when the network interface
is created initially, allowing, for example, to set mtu greater than 1500.

When a raw_ip switch is done (raw_ip set to 'Y', then set to 'N') the mtu
values for the network interface are set through ether_setup, with
min_mtu = ETH_MIN_MTU and max_mtu = ETH_DATA_LEN, not allowing anymore to
set mtu greater than 1500 (error: mtu greater than device maximum).

The patch restores the original min/max mtu values set by usbnet after a
raw_ip switch.

Signed-off-by: Daniele Palmas <[email protected]>
Acked-by: Bjørn Mork <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/net/usb/qmi_wwan.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 9485c8d1de8a3..9dcbca1f8f9aa 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -338,6 +338,9 @@ static void qmi_wwan_netdev_setup(struct net_device *net)
netdev_dbg(net, "mode: raw IP\n");
} else if (!net->header_ops) { /* don't bother if already set */
ether_setup(net);
+ /* Restoring min/max mtu values set originally by usbnet */
+ net->min_mtu = 0;
+ net->max_mtu = ETH_MAX_MTU;
clear_bit(EVENT_NO_IP_ALIGN, &dev->flags);
netdev_dbg(net, "mode: Ethernet\n");
}
--
2.20.1

2020-03-05 17:21:54

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 10/58] mips: vdso: fix 'jalr t9' crash in vdso code

From: Victor Kamensky <[email protected]>

[ Upstream commit d3f703c4359ff06619b2322b91f69710453e6b6d ]

Observed that when kernel is built with Yocto mips64-poky-linux-gcc,
and mips64-poky-linux-gnun32-gcc toolchain, resulting vdso contains
'jalr t9' instructions in its code and since in vdso case nobody
sets GOT table code crashes when instruction reached. On other hand
observed that when kernel is built mips-poky-linux-gcc toolchain, the
same 'jalr t9' instruction are replaced with PC relative function
calls using 'bal' instructions.

The difference boils down to -mrelax-pic-calls and -mexplicit-relocs
gcc options that gets different default values depending on gcc
target triplets and corresponding binutils. -mrelax-pic-calls got
enabled by default only in mips-poky-linux-gcc case. MIPS binutils
ld relies on R_MIPS_JALR relocation to convert 'jalr t9' into 'bal'
and such relocation is generated only if -mrelax-pic-calls option
is on.

Please note 'jalr t9' conversion to 'bal' can happen only to static
functions. These static PIC calls use mips local GOT entries that
are supposed to be filled with start of DSO value by run-time linker
(missing in VDSO case) and they do not have dynamic relocations.
Global mips GOT entries must have dynamic relocations and they should
be prevented by cmd_vdso_check Makefile rule.

Solution call out -mrelax-pic-calls and -mexplicit-relocs options
explicitly while compiling MIPS vdso code. That would get correct
and consistent between different toolchains behaviour.

Reported-by: Bruce Ashfield <[email protected]>
Signed-off-by: Victor Kamensky <[email protected]>
Signed-off-by: Paul Burton <[email protected]>
Cc: [email protected]
Cc: Ralf Baechle <[email protected]>
Cc: James Hogan <[email protected]>
Cc: Vincenzo Frascino <[email protected]>
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/vdso/Makefile | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 996a934ece7d6..3fa4bbe1bae53 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -29,6 +29,7 @@ endif
cflags-vdso := $(ccflags-vdso) \
$(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
-O3 -g -fPIC -fno-strict-aliasing -fno-common -fno-builtin -G 0 \
+ -mrelax-pic-calls -mexplicit-relocs \
-fno-stack-protector -fno-jump-tables -DDISABLE_BRANCH_PROFILING \
$(call cc-option, -fno-asynchronous-unwind-tables) \
$(call cc-option, -fno-stack-protector)
--
2.20.1

2020-03-05 17:21:56

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 26/58] vhost: Check docket sk_family instead of call getname

From: Eugenio Pérez <[email protected]>

[ Upstream commit 42d84c8490f9f0931786f1623191fcab397c3d64 ]

Doing so, we save one call to get data we already have in the struct.

Also, since there is no guarantee that getname use sockaddr_ll
parameter beyond its size, we add a little bit of security here.
It should do not do beyond MAX_ADDR_LEN, but syzbot found that
ax25_getname writes more (72 bytes, the size of full_sockaddr_ax25,
versus 20 + 32 bytes of sockaddr_ll + MAX_ADDR_LEN in syzbot repro).

Fixes: 3a4d5c94e9593 ("vhost_net: a kernel-level virtio server")
Reported-by: [email protected]
Signed-off-by: Eugenio Pérez <[email protected]>
Acked-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/vhost/net.c | 10 +---------
1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 1a2dd53caadea..b53b6528d6ce2 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -1414,10 +1414,6 @@ static int vhost_net_release(struct inode *inode, struct file *f)

static struct socket *get_raw_socket(int fd)
{
- struct {
- struct sockaddr_ll sa;
- char buf[MAX_ADDR_LEN];
- } uaddr;
int r;
struct socket *sock = sockfd_lookup(fd, &r);

@@ -1430,11 +1426,7 @@ static struct socket *get_raw_socket(int fd)
goto err;
}

- r = sock->ops->getname(sock, (struct sockaddr *)&uaddr.sa, 0);
- if (r < 0)
- goto err;
-
- if (uaddr.sa.sll_family != AF_PACKET) {
+ if (sock->sk->sk_family != AF_PACKET) {
r = -EPFNOSUPPORT;
goto err;
}
--
2.20.1

2020-03-05 17:22:20

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 04/58] HID: core: increase HID report buffer size to 8KiB

From: Johan Korsnes <[email protected]>

[ Upstream commit 84a4062632462c4320704fcdf8e99e89e94c0aba ]

We have a HID touch device that reports its opens and shorts test
results in HID buffers of size 8184 bytes. The maximum size of the HID
buffer is currently set to 4096 bytes, causing probe of this device to
fail. With this patch we increase the maximum size of the HID buffer to
8192 bytes, making device probe and acquisition of said buffers succeed.

Signed-off-by: Johan Korsnes <[email protected]>
Cc: Alan Stern <[email protected]>
Cc: Armando Visconti <[email protected]>
Cc: Jiri Kosina <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
include/linux/hid.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/hid.h b/include/linux/hid.h
index cd41f209043f6..875f71132b142 100644
--- a/include/linux/hid.h
+++ b/include/linux/hid.h
@@ -492,7 +492,7 @@ struct hid_report_enum {
};

#define HID_MIN_BUFFER_SIZE 64 /* make sure there is at least a packet size of space */
-#define HID_MAX_BUFFER_SIZE 4096 /* 4kb */
+#define HID_MAX_BUFFER_SIZE 8192 /* 8kb */
#define HID_CONTROL_FIFO_SIZE 256 /* to init devices with >100 reports */
#define HID_OUTPUT_FIFO_SIZE 64

--
2.20.1

2020-03-05 17:22:32

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 11/58] MIPS: Disable VDSO time functionality on microMIPS

From: Paul Burton <[email protected]>

[ Upstream commit 07015d7a103c4420b69a287b8ef4d2535c0f4106 ]

A check we're about to add to pick up on function calls that depend on
bogus use of the GOT in the VDSO picked up on instances of such function
calls in microMIPS builds. Since the code appears genuinely problematic,
and given the relatively small amount of use & testing that microMIPS
sees, go ahead & disable the VDSO for microMIPS builds.

Signed-off-by: Paul Burton <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
arch/mips/vdso/Makefile | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 3fa4bbe1bae53..b6b1eb638fb14 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -48,6 +48,8 @@ endif

CFLAGS_REMOVE_vgettimeofday.o = -pg

+DISABLE_VDSO := n
+
#
# For the pre-R6 code in arch/mips/vdso/vdso.h for locating
# the base address of VDSO, the linker will emit a R_MIPS_PC32
@@ -61,11 +63,24 @@ CFLAGS_REMOVE_vgettimeofday.o = -pg
ifndef CONFIG_CPU_MIPSR6
ifeq ($(call ld-ifversion, -lt, 225000000, y),y)
$(warning MIPS VDSO requires binutils >= 2.25)
- obj-vdso-y := $(filter-out vgettimeofday.o, $(obj-vdso-y))
- ccflags-vdso += -DDISABLE_MIPS_VDSO
+ DISABLE_VDSO := y
endif
endif

+#
+# GCC (at least up to version 9.2) appears to emit function calls that make use
+# of the GOT when targeting microMIPS, which we can't use in the VDSO due to
+# the lack of relocations. As such, we disable the VDSO for microMIPS builds.
+#
+ifdef CONFIG_CPU_MICROMIPS
+ DISABLE_VDSO := y
+endif
+
+ifeq ($(DISABLE_VDSO),y)
+ obj-vdso-y := $(filter-out vgettimeofday.o, $(obj-vdso-y))
+ ccflags-vdso += -DDISABLE_MIPS_VDSO
+endif
+
# VDSO linker flags.
VDSO_LDFLAGS := \
-Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1 \
--
2.20.1

2020-03-05 17:22:34

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 07/58] i2c: altera: Fix potential integer overflow

From: "Gustavo A. R. Silva" <[email protected]>

[ Upstream commit 54498e8070e19e74498a72c7331348143e7e1f8c ]

Factor out 100 from the equation and do 32-bit arithmetic (3 * clk_mhz / 10)
instead of 64-bit.

Notice that clk_mhz is MHz, so the multiplication will never wrap 32 bits
and there is no need for div_u64().

Addresses-Coverity: 1458369 ("Unintentional integer overflow")
Fixes: 0560ad576268 ("i2c: altera: Add Altera I2C Controller driver")
Suggested-by: David Laight <[email protected]>
Signed-off-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Thor Thayer <[email protected]>
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/i2c/busses/i2c-altera.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/i2c/busses/i2c-altera.c b/drivers/i2c/busses/i2c-altera.c
index 5255d3755411b..1de23b4f3809c 100644
--- a/drivers/i2c/busses/i2c-altera.c
+++ b/drivers/i2c/busses/i2c-altera.c
@@ -171,7 +171,7 @@ static void altr_i2c_init(struct altr_i2c_dev *idev)
/* SCL Low Time */
writel(t_low, idev->base + ALTR_I2C_SCL_LOW);
/* SDA Hold Time, 300ns */
- writel(div_u64(300 * clk_mhz, 1000), idev->base + ALTR_I2C_SDA_HOLD);
+ writel(3 * clk_mhz / 10, idev->base + ALTR_I2C_SDA_HOLD);

/* Mask all master interrupt bits */
altr_i2c_int_enable(idev, ALTR_I2C_ALL_IRQ, false);
--
2.20.1

2020-03-05 17:23:13

by Sasha Levin

[permalink] [raw]
Subject: [PATCH AUTOSEL 5.4 02/58] HID: apple: Add support for recent firmware on Magic Keyboards

From: Mansour Behabadi <[email protected]>

[ Upstream commit e433be929e63265b7412478eb7ff271467aee2d7 ]

Magic Keyboards with more recent firmware (0x0100) report Fn key differently.
Without this patch, Fn key may not behave as expected and may not be
configurable via hid_apple fnmode module parameter.

Signed-off-by: Mansour Behabadi <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
---
drivers/hid/hid-apple.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/hid/hid-apple.c b/drivers/hid/hid-apple.c
index 6ac8becc2372e..d732d1d10cafb 100644
--- a/drivers/hid/hid-apple.c
+++ b/drivers/hid/hid-apple.c
@@ -340,7 +340,8 @@ static int apple_input_mapping(struct hid_device *hdev, struct hid_input *hi,
unsigned long **bit, int *max)
{
if (usage->hid == (HID_UP_CUSTOM | 0x0003) ||
- usage->hid == (HID_UP_MSVENDOR | 0x0003)) {
+ usage->hid == (HID_UP_MSVENDOR | 0x0003) ||
+ usage->hid == (HID_UP_HPVENDOR2 | 0x0003)) {
/* The fn key on Apple USB keyboards */
set_bit(EV_REP, hi->input->evbit);
hid_map_usage_clear(hi, usage, bit, max, EV_KEY, KEY_FN);
--
2.20.1

2020-08-29 13:23:05

by Hauke Mehrtens

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.4 10/58] mips: vdso: fix 'jalr t9' crash in vdso code

On 3/5/20 6:13 PM, Sasha Levin wrote:
> From: Victor Kamensky <[email protected]>
>
> [ Upstream commit d3f703c4359ff06619b2322b91f69710453e6b6d ]
>
> Observed that when kernel is built with Yocto mips64-poky-linux-gcc,
> and mips64-poky-linux-gnun32-gcc toolchain, resulting vdso contains
> 'jalr t9' instructions in its code and since in vdso case nobody
> sets GOT table code crashes when instruction reached. On other hand
> observed that when kernel is built mips-poky-linux-gcc toolchain, the
> same 'jalr t9' instruction are replaced with PC relative function
> calls using 'bal' instructions.
>
> The difference boils down to -mrelax-pic-calls and -mexplicit-relocs
> gcc options that gets different default values depending on gcc
> target triplets and corresponding binutils. -mrelax-pic-calls got
> enabled by default only in mips-poky-linux-gcc case. MIPS binutils
> ld relies on R_MIPS_JALR relocation to convert 'jalr t9' into 'bal'
> and such relocation is generated only if -mrelax-pic-calls option
> is on.
>
> Please note 'jalr t9' conversion to 'bal' can happen only to static
> functions. These static PIC calls use mips local GOT entries that
> are supposed to be filled with start of DSO value by run-time linker
> (missing in VDSO case) and they do not have dynamic relocations.
> Global mips GOT entries must have dynamic relocations and they should
> be prevented by cmd_vdso_check Makefile rule.
>
> Solution call out -mrelax-pic-calls and -mexplicit-relocs options
> explicitly while compiling MIPS vdso code. That would get correct
> and consistent between different toolchains behaviour.
>
> Reported-by: Bruce Ashfield <[email protected]>
> Signed-off-by: Victor Kamensky <[email protected]>
> Signed-off-by: Paul Burton <[email protected]>
> Cc: [email protected]
> Cc: Ralf Baechle <[email protected]>
> Cc: James Hogan <[email protected]>
> Cc: Vincenzo Frascino <[email protected]>
> Cc: [email protected]
> Signed-off-by: Sasha Levin <[email protected]>
> ---
> arch/mips/vdso/Makefile | 1 +
> 1 file changed, 1 insertion(+)
>

Hi Sasha,

Why was this not added to the 5.4 stable branch?

Some OpenWrt users ran into this problem with kernel 5.4 on MIPS64 [0].
We backported this patch on our own in OpenWrt [1], but it should be
added to the sable branch in my opinion as it fixes a real problem.

@Sasha: Can you add it to the 5.4 stable branch or should I send some
special email?

[0]: https://bugs.openwrt.org/index.php?do=details&task_id=3277
[1]: https://git.openwrt.org/2932b4d05e1d212259c2903fd9cebcee5616680b

Hauke


Attachments:
signature.asc (499.00 B)
OpenPGP digital signature

2020-08-29 13:58:04

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.4 10/58] mips: vdso: fix 'jalr t9' crash in vdso code

On Sat, Aug 29, 2020 at 03:08:01PM +0200, Hauke Mehrtens wrote:
>On 3/5/20 6:13 PM, Sasha Levin wrote:
>> From: Victor Kamensky <[email protected]>
>>
>> [ Upstream commit d3f703c4359ff06619b2322b91f69710453e6b6d ]
>>
>> Observed that when kernel is built with Yocto mips64-poky-linux-gcc,
>> and mips64-poky-linux-gnun32-gcc toolchain, resulting vdso contains
>> 'jalr t9' instructions in its code and since in vdso case nobody
>> sets GOT table code crashes when instruction reached. On other hand
>> observed that when kernel is built mips-poky-linux-gcc toolchain, the
>> same 'jalr t9' instruction are replaced with PC relative function
>> calls using 'bal' instructions.
>>
>> The difference boils down to -mrelax-pic-calls and -mexplicit-relocs
>> gcc options that gets different default values depending on gcc
>> target triplets and corresponding binutils. -mrelax-pic-calls got
>> enabled by default only in mips-poky-linux-gcc case. MIPS binutils
>> ld relies on R_MIPS_JALR relocation to convert 'jalr t9' into 'bal'
>> and such relocation is generated only if -mrelax-pic-calls option
>> is on.
>>
>> Please note 'jalr t9' conversion to 'bal' can happen only to static
>> functions. These static PIC calls use mips local GOT entries that
>> are supposed to be filled with start of DSO value by run-time linker
>> (missing in VDSO case) and they do not have dynamic relocations.
>> Global mips GOT entries must have dynamic relocations and they should
>> be prevented by cmd_vdso_check Makefile rule.
>>
>> Solution call out -mrelax-pic-calls and -mexplicit-relocs options
>> explicitly while compiling MIPS vdso code. That would get correct
>> and consistent between different toolchains behaviour.
>>
>> Reported-by: Bruce Ashfield <[email protected]>
>> Signed-off-by: Victor Kamensky <[email protected]>
>> Signed-off-by: Paul Burton <[email protected]>
>> Cc: [email protected]
>> Cc: Ralf Baechle <[email protected]>
>> Cc: James Hogan <[email protected]>
>> Cc: Vincenzo Frascino <[email protected]>
>> Cc: [email protected]
>> Signed-off-by: Sasha Levin <[email protected]>
>> ---
>> arch/mips/vdso/Makefile | 1 +
>> 1 file changed, 1 insertion(+)
>>
>
>Hi Sasha,
>
>Why was this not added to the 5.4 stable branch?
>
>Some OpenWrt users ran into this problem with kernel 5.4 on MIPS64 [0].
>We backported this patch on our own in OpenWrt [1], but it should be
>added to the sable branch in my opinion as it fixes a real problem.
>
>@Sasha: Can you add it to the 5.4 stable branch or should I send some
>special email?

It failed building on 5.4. If you'd like it included, please send me a
tested backport for 5.4.

--
Thanks,
Sasha

2020-08-29 14:39:20

by Hauke Mehrtens

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.4 10/58] mips: vdso: fix 'jalr t9' crash in vdso code

On 8/29/20 3:56 PM, Sasha Levin wrote:
> On Sat, Aug 29, 2020 at 03:08:01PM +0200, Hauke Mehrtens wrote:
>> On 3/5/20 6:13 PM, Sasha Levin wrote:
>>> From: Victor Kamensky <[email protected]>
>>>
>>> [ Upstream commit d3f703c4359ff06619b2322b91f69710453e6b6d ]
>>>
>>> Observed that when kernel is built with Yocto mips64-poky-linux-gcc,
>>> and mips64-poky-linux-gnun32-gcc toolchain, resulting vdso contains
>>> 'jalr t9' instructions in its code and since in vdso case nobody
>>> sets GOT table code crashes when instruction reached. On other hand
>>> observed that when kernel is built mips-poky-linux-gcc toolchain, the
>>> same 'jalr t9' instruction are replaced with PC relative function
>>> calls using 'bal' instructions.
>>>
>>> The difference boils down to -mrelax-pic-calls and -mexplicit-relocs
>>> gcc options that gets different default values depending on gcc
>>> target triplets and corresponding binutils. -mrelax-pic-calls got
>>> enabled by default only in mips-poky-linux-gcc case. MIPS binutils
>>> ld relies on R_MIPS_JALR relocation to convert 'jalr t9' into 'bal'
>>> and such relocation is generated only if -mrelax-pic-calls option
>>> is on.
>>>
>>> Please note 'jalr t9' conversion to 'bal' can happen only to static
>>> functions. These static PIC calls use mips local GOT entries that
>>> are supposed to be filled with start of DSO value by run-time linker
>>> (missing in VDSO case) and they do not have dynamic relocations.
>>> Global mips GOT entries must have dynamic relocations and they should
>>> be prevented by cmd_vdso_check Makefile rule.
>>>
>>> Solution call out -mrelax-pic-calls and -mexplicit-relocs options
>>> explicitly while compiling MIPS vdso code. That would get correct
>>> and consistent between different toolchains behaviour.
>>>
>>> Reported-by: Bruce Ashfield <[email protected]>
>>> Signed-off-by: Victor Kamensky <[email protected]>
>>> Signed-off-by: Paul Burton <[email protected]>
>>> Cc: [email protected]
>>> Cc: Ralf Baechle <[email protected]>
>>> Cc: James Hogan <[email protected]>
>>> Cc: Vincenzo Frascino <[email protected]>
>>> Cc: [email protected]
>>> Signed-off-by: Sasha Levin <[email protected]>
>>> ---
>>>  arch/mips/vdso/Makefile | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>
>> Hi Sasha,
>>
>> Why was this not added to the 5.4 stable branch?
>>
>> Some OpenWrt users ran into this problem with kernel 5.4 on MIPS64 [0].
>> We backported this patch on our own in OpenWrt [1], but it should be
>> added to the sable branch in my opinion as it fixes a real problem.
>>
>> @Sasha: Can you add it to the 5.4 stable branch or should I send some
>> special email?
>
> It failed building on 5.4. If you'd like it included, please send me a
> tested backport for 5.4.
>

I successfully compiled a kernel 5.4.61 with this patch on top with GCC
8.4 for MIPS 64 big and little Endian.

What was broken in your compile test?

Someone complained that it does not compile with LLVM any more, there is
this additional fix to solve that problem:
https://git.kernel.org/linus/72cf3b3df423c1bbd8fa1056fed009d3a260f8a9
OpenWrt does not support LLVM, so we did not see this problem.

Could you please backport these two commits to kernel 5.4:
https://git.kernel.org/linus/d3f703c4359ff06619b2322b91f69710453e6b6d
https://git.kernel.org/linus/72cf3b3df423c1bbd8fa1056fed009d3a260f8a9

We haven't seen this problem with kernel 4.19 or older, but I am not
sure if this was luck.

Hauke


Attachments:
signature.asc (499.00 B)
OpenPGP digital signature

2020-08-30 02:25:47

by Sasha Levin

[permalink] [raw]
Subject: Re: [PATCH AUTOSEL 5.4 10/58] mips: vdso: fix 'jalr t9' crash in vdso code

On Sat, Aug 29, 2020 at 04:37:32PM +0200, Hauke Mehrtens wrote:
>On 8/29/20 3:56 PM, Sasha Levin wrote:
>> On Sat, Aug 29, 2020 at 03:08:01PM +0200, Hauke Mehrtens wrote:
>>> On 3/5/20 6:13 PM, Sasha Levin wrote:
>>>> From: Victor Kamensky <[email protected]>
>>>>
>>>> [ Upstream commit d3f703c4359ff06619b2322b91f69710453e6b6d ]
>>>>
>>>> Observed that when kernel is built with Yocto mips64-poky-linux-gcc,
>>>> and mips64-poky-linux-gnun32-gcc toolchain, resulting vdso contains
>>>> 'jalr t9' instructions in its code and since in vdso case nobody
>>>> sets GOT table code crashes when instruction reached. On other hand
>>>> observed that when kernel is built mips-poky-linux-gcc toolchain, the
>>>> same 'jalr t9' instruction are replaced with PC relative function
>>>> calls using 'bal' instructions.
>>>>
>>>> The difference boils down to -mrelax-pic-calls and -mexplicit-relocs
>>>> gcc options that gets different default values depending on gcc
>>>> target triplets and corresponding binutils. -mrelax-pic-calls got
>>>> enabled by default only in mips-poky-linux-gcc case. MIPS binutils
>>>> ld relies on R_MIPS_JALR relocation to convert 'jalr t9' into 'bal'
>>>> and such relocation is generated only if -mrelax-pic-calls option
>>>> is on.
>>>>
>>>> Please note 'jalr t9' conversion to 'bal' can happen only to static
>>>> functions. These static PIC calls use mips local GOT entries that
>>>> are supposed to be filled with start of DSO value by run-time linker
>>>> (missing in VDSO case) and they do not have dynamic relocations.
>>>> Global mips GOT entries must have dynamic relocations and they should
>>>> be prevented by cmd_vdso_check Makefile rule.
>>>>
>>>> Solution call out -mrelax-pic-calls and -mexplicit-relocs options
>>>> explicitly while compiling MIPS vdso code. That would get correct
>>>> and consistent between different toolchains behaviour.
>>>>
>>>> Reported-by: Bruce Ashfield <[email protected]>
>>>> Signed-off-by: Victor Kamensky <[email protected]>
>>>> Signed-off-by: Paul Burton <[email protected]>
>>>> Cc: [email protected]
>>>> Cc: Ralf Baechle <[email protected]>
>>>> Cc: James Hogan <[email protected]>
>>>> Cc: Vincenzo Frascino <[email protected]>
>>>> Cc: [email protected]
>>>> Signed-off-by: Sasha Levin <[email protected]>
>>>> ---
>>>> ?arch/mips/vdso/Makefile | 1 +
>>>> ?1 file changed, 1 insertion(+)
>>>>
>>>
>>> Hi Sasha,
>>>
>>> Why was this not added to the 5.4 stable branch?
>>>
>>> Some OpenWrt users ran into this problem with kernel 5.4 on MIPS64 [0].
>>> We backported this patch on our own in OpenWrt [1], but it should be
>>> added to the sable branch in my opinion as it fixes a real problem.
>>>
>>> @Sasha: Can you add it to the 5.4 stable branch or should I send some
>>> special email?
>>
>> It failed building on 5.4. If you'd like it included, please send me a
>> tested backport for 5.4.
>>
>
>I successfully compiled a kernel 5.4.61 with this patch on top with GCC
>8.4 for MIPS 64 big and little Endian.
>
>What was broken in your compile test?

See https://lore.kernel.org/stable/[email protected]/

--
Thanks,
Sasha