Subject: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

Started as addressing the flakiness issues in rst_ipv*, that affect
netdev dashboard.

Signed-off-by: Dmitry Safonov <[email protected]>
---
Dmitry Safonov (4):
selftests/tcp_ao: Make RST tests less flaky
selftests/tcp_ao: Zero-init tcp_ao_info_opt
selftests/tcp_ao: Fix fscanf() call for format-security
selftests/tcp_ao: Printing fixes to confirm with format-security

tools/testing/selftests/net/tcp_ao/lib/proc.c | 2 +-
tools/testing/selftests/net/tcp_ao/lib/setup.c | 12 +++++------
tools/testing/selftests/net/tcp_ao/rst.c | 23 ++++++++++++----------
.../selftests/net/tcp_ao/setsockopt-closed.c | 2 +-
4 files changed, 21 insertions(+), 18 deletions(-)
---
base-commit: 8f2c057754b25075aa3da132cd4fd4478cdab854
change-id: 20240413-tcp-ao-selftests-fixes-adacd65cb8ba

Best regards,
--
Dmitry Safonov <[email protected]>




Subject: [PATCH net 1/4] selftests/tcp_ao: Make RST tests less flaky

From: Dmitry Safonov <[email protected]>

Currently, "active reset" cases are flaky, because select() is called
for 3 sockets, while only 2 are expected to receive RST.
The idea of the third socket was to get into request_sock_queue,
but the test mistakenly attempted to connect() after the listener
socket was shut down.

Repair this test, it's important to check the different kernel
code-paths for signing RST TCP-AO segments.

Fixes: c6df7b2361d7 ("selftests/net: Add TCP-AO RST test")
Reported-by: Jakub Kicinski <[email protected]>
Signed-off-by: Dmitry Safonov <[email protected]>
---
tools/testing/selftests/net/tcp_ao/rst.c | 23 +++++++++++++----------
1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/net/tcp_ao/rst.c b/tools/testing/selftests/net/tcp_ao/rst.c
index 7df8b8700e39..a2fe88d35ac0 100644
--- a/tools/testing/selftests/net/tcp_ao/rst.c
+++ b/tools/testing/selftests/net/tcp_ao/rst.c
@@ -256,8 +256,6 @@ static int test_wait_fds(int sk[], size_t nr, bool is_writable[],

static void test_client_active_rst(unsigned int port)
{
- /* one in queue, another accept()ed */
- unsigned int wait_for = backlog + 2;
int i, sk[3], err;
bool is_writable[ARRAY_SIZE(sk)] = {false};
unsigned int last = ARRAY_SIZE(sk) - 1;
@@ -275,16 +273,20 @@ static void test_client_active_rst(unsigned int port)
for (i = 0; i < last; i++) {
err = _test_connect_socket(sk[i], this_ip_dest, port,
(i == 0) ? TEST_TIMEOUT_SEC : -1);
-
if (err < 0)
test_error("failed to connect()");
}

- synchronize_threads(); /* 2: connection accept()ed, another queued */
- err = test_wait_fds(sk, last, is_writable, wait_for, TEST_TIMEOUT_SEC);
+ synchronize_threads(); /* 2: two connections: one accept()ed, another queued */
+ err = test_wait_fds(sk, last, is_writable, last, TEST_TIMEOUT_SEC);
if (err < 0)
test_error("test_wait_fds(): %d", err);

+ /* async connect() with third sk to get into request_sock_queue */
+ err = _test_connect_socket(sk[last], this_ip_dest, port, -1);
+ if (err < 0)
+ test_error("failed to connect()");
+
synchronize_threads(); /* 3: close listen socket */
if (test_client_verify(sk[0], packet_sz, quota / packet_sz, TEST_TIMEOUT_SEC))
test_fail("Failed to send data on connected socket");
@@ -292,13 +294,14 @@ static void test_client_active_rst(unsigned int port)
test_ok("Verified established tcp connection");

synchronize_threads(); /* 4: finishing up */
- err = _test_connect_socket(sk[last], this_ip_dest, port, -1);
- if (err < 0)
- test_error("failed to connect()");

synchronize_threads(); /* 5: closed active sk */
- err = test_wait_fds(sk, ARRAY_SIZE(sk), NULL,
- wait_for, TEST_TIMEOUT_SEC);
+ /*
+ * Wait for 2 connections: one accepted, another in the accept queue,
+ * the one in request_sock_queue won't get fully established, so
+ * doesn't receive an active RST, see inet_csk_listen_stop().
+ */
+ err = test_wait_fds(sk, last, NULL, last, TEST_TIMEOUT_SEC);
if (err < 0)
test_error("select(): %d", err);


--
2.42.0



Subject: [PATCH net 2/4] selftests/tcp_ao: Zero-init tcp_ao_info_opt

From: Dmitry Safonov <[email protected]>

The structure is on the stack and has to be zero-initialized as
the kernel checks for:
> if (in.reserved != 0 || in.reserved2 != 0)
> return -EINVAL;

Fixes: b26660531cf6 ("selftests/net: Add test for TCP-AO add setsockopt() command")
Signed-off-by: Dmitry Safonov <[email protected]>
---
tools/testing/selftests/net/tcp_ao/setsockopt-closed.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c b/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c
index 452de131fa3a..517930f9721b 100644
--- a/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c
+++ b/tools/testing/selftests/net/tcp_ao/setsockopt-closed.c
@@ -21,7 +21,7 @@ static void make_listen(int sk)
static void test_vefify_ao_info(int sk, struct tcp_ao_info_opt *info,
const char *tst)
{
- struct tcp_ao_info_opt tmp;
+ struct tcp_ao_info_opt tmp = {};
socklen_t len = sizeof(tmp);

if (getsockopt(sk, IPPROTO_TCP, TCP_AO_INFO, &tmp, &len))

--
2.42.0



Subject: [PATCH net 3/4] selftests/tcp_ao: Fix fscanf() call for format-security

From: Dmitry Safonov <[email protected]>

On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces:
> lib/proc.c: In function ‘netstat_read_type’:
> lib/proc.c:89:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 89 | if (fscanf(fnetstat, type->header_name) == EOF)
> | ^~
> cc1: some warnings being treated as errors

Here the selftests lib parses header name, while expectes non-space word
ending with a column.

Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Signed-off-by: Dmitry Safonov <[email protected]>
---
tools/testing/selftests/net/tcp_ao/lib/proc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/tcp_ao/lib/proc.c b/tools/testing/selftests/net/tcp_ao/lib/proc.c
index 2fb6dd8adba6..8b984fa04286 100644
--- a/tools/testing/selftests/net/tcp_ao/lib/proc.c
+++ b/tools/testing/selftests/net/tcp_ao/lib/proc.c
@@ -86,7 +86,7 @@ static void netstat_read_type(FILE *fnetstat, struct netstat **dest, char *line)

pos = strchr(line, ' ') + 1;

- if (fscanf(fnetstat, type->header_name) == EOF)
+ if (fscanf(fnetstat, "%[^ :]", type->header_name) == EOF)
test_error("fscanf(%s)", type->header_name);
if (fread(&tmp, 1, 1, fnetstat) != 1 || tmp != ':')
test_error("Unexpected netstat format (%c)", tmp);

--
2.42.0



Subject: [PATCH net 4/4] selftests/tcp_ao: Printing fixes to confirm with format-security

From: Dmitry Safonov <[email protected]>

On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces
> lib/setup.c: In function ‘__test_msg’:
> lib/setup.c:20:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 20 | ksft_print_msg(buf);
> | ^~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_ok’:
> lib/setup.c:26:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 26 | ksft_test_result_pass(buf);
> | ^~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_fail’:
> lib/setup.c:32:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 32 | ksft_test_result_fail(buf);
> | ^~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_xfail’:
> lib/setup.c:38:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 38 | ksft_test_result_xfail(buf);
> | ^~~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_error’:
> lib/setup.c:44:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 44 | ksft_test_result_error(buf);
> | ^~~~~~~~~~~~~~~~~~~~~~
> lib/setup.c: In function ‘__test_skip’:
> lib/setup.c:50:9: error: format not a string literal and no format arguments [-Werror=format-security]
> 50 | ksft_test_result_skip(buf);
> | ^~~~~~~~~~~~~~~~~~~~~
> cc1: some warnings being treated as errors

As the buffer was already pre-printed into, print it as a string
rather than a format-string.

Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Signed-off-by: Dmitry Safonov <[email protected]>
---
tools/testing/selftests/net/tcp_ao/lib/setup.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/net/tcp_ao/lib/setup.c b/tools/testing/selftests/net/tcp_ao/lib/setup.c
index 92276f916f2f..e408b9243b2c 100644
--- a/tools/testing/selftests/net/tcp_ao/lib/setup.c
+++ b/tools/testing/selftests/net/tcp_ao/lib/setup.c
@@ -17,37 +17,37 @@ static pthread_mutex_t ksft_print_lock = PTHREAD_MUTEX_INITIALIZER;
void __test_msg(const char *buf)
{
pthread_mutex_lock(&ksft_print_lock);
- ksft_print_msg(buf);
+ ksft_print_msg("%s", buf);
pthread_mutex_unlock(&ksft_print_lock);
}
void __test_ok(const char *buf)
{
pthread_mutex_lock(&ksft_print_lock);
- ksft_test_result_pass(buf);
+ ksft_test_result_pass("%s", buf);
pthread_mutex_unlock(&ksft_print_lock);
}
void __test_fail(const char *buf)
{
pthread_mutex_lock(&ksft_print_lock);
- ksft_test_result_fail(buf);
+ ksft_test_result_fail("%s", buf);
pthread_mutex_unlock(&ksft_print_lock);
}
void __test_xfail(const char *buf)
{
pthread_mutex_lock(&ksft_print_lock);
- ksft_test_result_xfail(buf);
+ ksft_test_result_xfail("%s", buf);
pthread_mutex_unlock(&ksft_print_lock);
}
void __test_error(const char *buf)
{
pthread_mutex_lock(&ksft_print_lock);
- ksft_test_result_error(buf);
+ ksft_test_result_error("%s", buf);
pthread_mutex_unlock(&ksft_print_lock);
}
void __test_skip(const char *buf)
{
pthread_mutex_lock(&ksft_print_lock);
- ksft_test_result_skip(buf);
+ ksft_test_result_skip("%s", buf);
pthread_mutex_unlock(&ksft_print_lock);
}


--
2.42.0



2024-04-13 01:50:31

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH net 3/4] selftests/tcp_ao: Fix fscanf() call for format-security

On Sat, 13 Apr 2024 at 02:43, Dmitry Safonov via B4 Relay
<[email protected]> wrote:
>
> From: Dmitry Safonov <[email protected]>
>
> On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces:
> > lib/proc.c: In function ‘netstat_read_type’:
> > lib/proc.c:89:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 89 | if (fscanf(fnetstat, type->header_name) == EOF)
> > | ^~
> > cc1: some warnings being treated as errors
>
> Here the selftests lib parses header name, while expectes non-space word
> ending with a column.
>
> Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
> Signed-off-by: Dmitry Safonov <[email protected]>

Actually, now I see that it was also reported, adding

Reported-by: Muhammad Usama Anjum <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/

--
Dmitry

2024-04-13 01:51:38

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH net 4/4] selftests/tcp_ao: Printing fixes to confirm with format-security

On Sat, 13 Apr 2024 at 02:43, Dmitry Safonov via B4 Relay
<[email protected]> wrote:
>
> From: Dmitry Safonov <[email protected]>
>
> On my new laptop with packages from nixos-unstable, gcc 12.3.0 produces
> > lib/setup.c: In function ‘__test_msg’:
> > lib/setup.c:20:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 20 | ksft_print_msg(buf);
> > | ^~~~~~~~~~~~~~
> > lib/setup.c: In function ‘__test_ok’:
> > lib/setup.c:26:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 26 | ksft_test_result_pass(buf);
> > | ^~~~~~~~~~~~~~~~~~~~~
> > lib/setup.c: In function ‘__test_fail’:
> > lib/setup.c:32:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 32 | ksft_test_result_fail(buf);
> > | ^~~~~~~~~~~~~~~~~~~~~
> > lib/setup.c: In function ‘__test_xfail’:
> > lib/setup.c:38:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 38 | ksft_test_result_xfail(buf);
> > | ^~~~~~~~~~~~~~~~~~~~~~
> > lib/setup.c: In function ‘__test_error’:
> > lib/setup.c:44:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 44 | ksft_test_result_error(buf);
> > | ^~~~~~~~~~~~~~~~~~~~~~
> > lib/setup.c: In function ‘__test_skip’:
> > lib/setup.c:50:9: error: format not a string literal and no format arguments [-Werror=format-security]
> > 50 | ksft_test_result_skip(buf);
> > | ^~~~~~~~~~~~~~~~~~~~~
> > cc1: some warnings being treated as errors
>
> As the buffer was already pre-printed into, print it as a string
> rather than a format-string.
>
> Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
> Signed-off-by: Dmitry Safonov <[email protected]>

And this one as well,

Reported-by: Muhammad Usama Anjum <[email protected]>
Link: https://lore.kernel.org/all/[email protected]/

--
Dmitry

2024-04-16 11:40:42

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

Hello:

This series was applied to netdev/net.git (main)
by Paolo Abeni <[email protected]>:

On Sat, 13 Apr 2024 02:42:51 +0100 you wrote:
> Started as addressing the flakiness issues in rst_ipv*, that affect
> netdev dashboard.
>
> Signed-off-by: Dmitry Safonov <[email protected]>
> ---
> Dmitry Safonov (4):
> selftests/tcp_ao: Make RST tests less flaky
> selftests/tcp_ao: Zero-init tcp_ao_info_opt
> selftests/tcp_ao: Fix fscanf() call for format-security
> selftests/tcp_ao: Printing fixes to confirm with format-security
>
> [...]

Here is the summary with links:
- [net,1/4] selftests/tcp_ao: Make RST tests less flaky
https://git.kernel.org/netdev/net/c/4225dfa4535f
- [net,2/4] selftests/tcp_ao: Zero-init tcp_ao_info_opt
https://git.kernel.org/netdev/net/c/b089b3bead53
- [net,3/4] selftests/tcp_ao: Fix fscanf() call for format-security
https://git.kernel.org/netdev/net/c/beb78cd1329d
- [net,4/4] selftests/tcp_ao: Printing fixes to confirm with format-security
https://git.kernel.org/netdev/net/c/b476c93654d7

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2024-04-16 14:29:57

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

On Sat, 13 Apr 2024 02:42:51 +0100 Dmitry Safonov via B4 Relay wrote:
> Started as addressing the flakiness issues in rst_ipv*, that affect
> netdev dashboard.

Thank you! :)

2024-04-17 18:52:28

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

On Tue, 16 Apr 2024 at 15:28, Jakub Kicinski <[email protected]> wrote:
>
> On Sat, 13 Apr 2024 02:42:51 +0100 Dmitry Safonov via B4 Relay wrote:
> > Started as addressing the flakiness issues in rst_ipv*, that affect
> > netdev dashboard.
>
> Thank you! :)

Jakub, you are very welcome :)
I'll keep an eye on the dashboard, but I very much encourage you to
ping me in case of any other issues with tcp_ao selftests.

I currently have v2 for tcp-ao tracepoints, but delaying it as working
on a reproducer/selftest for an issue I think I have a patch for.

BTW, do you know if those were addressed or anyone is looking into
them? (from other tcp-ao hits, that seem not anyhow related to tcp-ao
itself):

1. [ 240.001391][ T833] Possible interrupt unsafe locking scenario:
[ 240.001391][ T833]
[ 240.001635][ T833] CPU0 CPU1
[ 240.001797][ T833] ---- ----
[ 240.001958][ T833] lock(&p->alloc_lock);
[ 240.002083][ T833] local_irq_disable();
[ 240.002284][ T833] lock(&ndev->lock);
[ 240.002490][ T833] lock(&p->alloc_lock);
[ 240.002709][ T833] <Interrupt>
[ 240.002819][ T833] lock(&ndev->lock);
[ 240.002937][ T833]
[ 240.002937][ T833] *** DEADLOCK ***

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/537021/14-self-connect-ipv6/stderr

2. [ 251.411647][ T71] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock
order detected
[ 251.411986][ T71] 6.9.0-rc1-virtme #1 Not tainted
[ 251.412214][ T71] -----------------------------------------------------
[ 251.412533][ T71] kworker/u16:1/71 [HC0[0]:SC0[2]:HE1:SE0] is
trying to acquire:
[ 251.412837][ T71] ffff888005182c28 (&p->alloc_lock){+.+.}-{2:2},
at: __get_task_comm+0x27/0x70
[ 251.413214][ T71]
[ 251.413214][ T71] and this task is already holding:
[ 251.413527][ T71] ffff88802f83efd8 (&ul->lock){+.-.}-{2:2}, at:
rt6_uncached_list_flush_dev+0x138/0x840
[ 251.413887][ T71] which would create a new lock dependency:
[ 251.414153][ T71] (&ul->lock){+.-.}-{2:2} -> (&p->alloc_lock){+.+.}-{2:2}
[ 251.414464][ T71]
[ 251.414464][ T71] but this new dependency connects a SOFTIRQ-irq-safe lock:
[ 251.414808][ T71] (&ul->lock){+.-.}-{2:2}

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/537201/17-icmps-discard-ipv4/stderr

3. [ 264.280734][ C3] Possible unsafe locking scenario:
[ 264.280734][ C3]
[ 264.280968][ C3] CPU0 CPU1
[ 264.281117][ C3] ---- ----
[ 264.281263][ C3] lock((&tw->tw_timer));
[ 264.281427][ C3]
lock(&hashinfo->ehash_locks[i]);
[ 264.281647][ C3] lock((&tw->tw_timer));
[ 264.281834][ C3] lock(&hashinfo->ehash_locks[i]);

https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/547461/19-self-connect-ipv4/stderr

I can spend some time on them after I verify that my fix for -stable
is actually fixing an issue I think it fixes.
Seems like your automation + my selftests are giving some fruits, hehe.

Thanks,
Dmitry

2024-04-17 20:47:27

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

On Wed, 17 Apr 2024 19:47:18 +0100 Dmitry Safonov wrote:
> 1. [ 240.001391][ T833] Possible interrupt unsafe locking scenario:
> [ 240.001391][ T833]
> [ 240.001635][ T833] CPU0 CPU1
> [ 240.001797][ T833] ---- ----
> [ 240.001958][ T833] lock(&p->alloc_lock);
> [ 240.002083][ T833] local_irq_disable();
> [ 240.002284][ T833] lock(&ndev->lock);
> [ 240.002490][ T833] lock(&p->alloc_lock);
> [ 240.002709][ T833] <Interrupt>
> [ 240.002819][ T833] lock(&ndev->lock);
> [ 240.002937][ T833]
> [ 240.002937][ T833] *** DEADLOCK ***
>
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/537021/14-self-connect-ipv6/stderr
>
> 2. [ 251.411647][ T71] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock
> order detected
> [ 251.411986][ T71] 6.9.0-rc1-virtme #1 Not tainted
> [ 251.412214][ T71] -----------------------------------------------------
> [ 251.412533][ T71] kworker/u16:1/71 [HC0[0]:SC0[2]:HE1:SE0] is
> trying to acquire:
> [ 251.412837][ T71] ffff888005182c28 (&p->alloc_lock){+.+.}-{2:2},
> at: __get_task_comm+0x27/0x70
> [ 251.413214][ T71]
> [ 251.413214][ T71] and this task is already holding:
> [ 251.413527][ T71] ffff88802f83efd8 (&ul->lock){+.-.}-{2:2}, at:
> rt6_uncached_list_flush_dev+0x138/0x840
> [ 251.413887][ T71] which would create a new lock dependency:
> [ 251.414153][ T71] (&ul->lock){+.-.}-{2:2} -> (&p->alloc_lock){+.+.}-{2:2}
> [ 251.414464][ T71]
> [ 251.414464][ T71] but this new dependency connects a SOFTIRQ-irq-safe lock:
> [ 251.414808][ T71] (&ul->lock){+.-.}-{2:2}
>
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/537201/17-icmps-discard-ipv4/stderr
>
> 3. [ 264.280734][ C3] Possible unsafe locking scenario:
> [ 264.280734][ C3]
> [ 264.280968][ C3] CPU0 CPU1
> [ 264.281117][ C3] ---- ----
> [ 264.281263][ C3] lock((&tw->tw_timer));
> [ 264.281427][ C3]
> lock(&hashinfo->ehash_locks[i]);
> [ 264.281647][ C3] lock((&tw->tw_timer));
> [ 264.281834][ C3] lock(&hashinfo->ehash_locks[i]);
>
> https://netdev-3.bots.linux.dev/vmksft-tcp-ao-dbg/results/547461/19-self-connect-ipv4/stderr
>
> I can spend some time on them after I verify that my fix for -stable
> is actually fixing an issue I think it fixes.
> Seems like your automation + my selftests are giving some fruits, hehe.

Oh, very interesting, I don't recall these coming up before.

We try to extract crashes but apparently we're missing lockdep splats.
I'll try to improve the extraction logic...

2024-04-17 21:28:55

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

On Wed, 17 Apr 2024 13:46:36 -0700 Jakub Kicinski wrote:
> > I can spend some time on them after I verify that my fix for -stable
> > is actually fixing an issue I think it fixes.
> > Seems like your automation + my selftests are giving some fruits, hehe.
>
> Oh, very interesting, I don't recall these coming up before.

Correction, these are old, and if I plug the branch names here:
https://netdev.bots.linux.dev/contest.html
there is a whole bunch of tests failing that day.

Keep in mind these run pre-commit so not all failures are flakes.

2024-04-17 22:31:09

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH net 0/4] selftests/net/tcp_ao: A bunch of fixes for TCP-AO selftests

On Wed, 17 Apr 2024 at 22:28, Jakub Kicinski <[email protected]> wrote:
>
> On Wed, 17 Apr 2024 13:46:36 -0700 Jakub Kicinski wrote:
> > > I can spend some time on them after I verify that my fix for -stable
> > > is actually fixing an issue I think it fixes.
> > > Seems like your automation + my selftests are giving some fruits, hehe.
> >
> > Oh, very interesting, I don't recall these coming up before.
>
> Correction, these are old, and if I plug the branch names here:
> https://netdev.bots.linux.dev/contest.html
> there is a whole bunch of tests failing that day.

Hmm, yeah, I was looking at the history of selftests to see if there
is anything else interesting:

2024-04-11--15-00 - lockdep for hashinfo->ehash_locks vs tw->tw_timer

It seems that you actually reported that already here:
https://lore.kernel.org/all/[email protected]/

2024-04-04--12-00 - lockdep for p->alloc_lock vs ul->lock
(rt6_uncached_list_flush_dev)
2024-04-04--09-00 - lockdep for p->alloc_lock vs ndev->lock
(addrconf_permanent_addr)
2024-04-04--03-00 - lockdep for p->alloc_lock vs ul->lock

Was reported as well:
https://lore.kernel.org/all/[email protected]/

2024-03-06--00-00 - kernel BUG at net/core/skbuff.c:2813

Can't really track this down to any report/fix. Probably as it's month
old and hasn't happened since on these tests - something was borken on
that particular day.

> Keep in mind these run pre-commit so not all failures are flakes.

Thanks,
Dmitry