2024-02-10 15:12:50

by Guenter Roeck

Subject: Problems with csum_partial with misaligned buffers on sh4 platform

Hi,

when running checksum unit tests on sh4 qemu emulations, I get the following
errors.

KTAP version 1
# Subtest: checksum
# module: checksum_kunit
1..5
# test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
Expected ( u64)result == ( u64)expec, but
( u64)result == 53378 (0xd082)
( u64)expec == 33488 (0x82d0)
not ok 1 test_csum_fixed_random_inputs
# test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
Expected ( u64)result == ( u64)expec, but
( u64)result == 65281 (0xff01)
( u64)expec == 65280 (0xff00)
not ok 2 test_csum_all_carry_inputs
# test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
Expected ( u64)result == ( u64)expec, but
( u64)result == 65535 (0xffff)
( u64)expec == 65534 (0xfffe)
not ok 3 test_csum_no_carry_inputs
ok 4 test_ip_fast_csum
ok 5 test_csum_ipv6_magic
# checksum: pass:2 fail:3 skip:0 total:5

The above is from a little endian system. On a big endian system,
the test result is as follows.

KTAP version 1
# Subtest: checksum
# module: checksum_kunit
1..5
# test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
Expected ( u64)result == ( u64)expec, but
( u64)result == 33488 (0x82d0)
( u64)expec == 53378 (0xd082)
not ok 1 test_csum_fixed_random_inputs
# test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
Expected ( u64)result == ( u64)expec, but
( u64)result == 65281 (0xff01)
( u64)expec == 255 (0xff)
not ok 2 test_csum_all_carry_inputs
# test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
Expected ( u64)result == ( u64)expec, but
( u64)result == 1020 (0x3fc)
( u64)expec == 0 (0x0)
not ok 3 test_csum_no_carry_inputs
# test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
Expected ( u64)expected == ( u64)csum_result, but
( u64)expected == 55939 (0xda83)
( u64)csum_result == 33754 (0x83da)
not ok 4 test_ip_fast_csum
# test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
not ok 5 test_csum_ipv6_magic
# checksum: pass:0 fail:5 skip:0 total:5

Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
systems due to a bug in the test code, unrelated to this problem.
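
As a quick cross-check of the numbers in the two logs, the short Python
sketch below tests whether each reported result is simply the byte-swapped
form of the expected value. The values are copied from the logs above; the
helper names (swab16, classify) are made up for this illustration and are
not part of the kernel test code.

```python
def swab16(value: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((value << 8) | (value >> 8)) & 0xFFFF


def classify(result: int, expec: int) -> str:
    """Say whether result is just expec with its two bytes swapped."""
    return "byte-swapped" if swab16(result) == expec else "other mismatch"


# (test name, reported result, expected value), copied from the logs.
LITTLE_ENDIAN = [
    ("test_csum_fixed_random_inputs", 0xD082, 0x82D0),
    ("test_csum_all_carry_inputs", 0xFF01, 0xFF00),
    ("test_csum_no_carry_inputs", 0xFFFF, 0xFFFE),
]
BIG_ENDIAN = [
    ("test_csum_fixed_random_inputs", 0x82D0, 0xD082),
    ("test_csum_all_carry_inputs", 0xFF01, 0x00FF),
    ("test_csum_no_carry_inputs", 0x03FC, 0x0000),
    ("test_ip_fast_csum", 0x83DA, 0xDA83),
    ("test_csum_ipv6_magic", 0xAA42, 0x18D4),
]

for label, rows in (("little endian", LITTLE_ENDIAN), ("big endian", BIG_ENDIAN)):
    print(label)
    for name, result, expec in rows:
        print(f"  {name:32s} 0x{result:04x} vs 0x{expec:04x}: {classify(result, expec)}")
```

Only the test_csum_fixed_random_inputs pairs (and, on big endian, the
test_ip_fast_csum pair) come out as pure byte swaps; the other mismatches
are of a different kind. This is an observation about the logged values
only and does not by itself identify the cause.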

Analysis shows that the errors are seen only if the buffer is misaligned.
Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
Handle calling csum_partial with misaligned data") which seemed to be
related. Reverting that commit fixes the problem.
This suggests that something may be wrong with that commit. Alternatively,
of course, it may be possible that something is wrong with the qemu
emulation, but that seems unlikely.
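
To illustrate why buffer alignment matters for this kind of checksum at
all, here is a small Python model. It is only a sketch of a common
technique, assuming a little endian word layout: a routine that loads
16-bit words only from even addresses has to treat a leading byte at an
odd address specially and rotate the folded sum by 8 bits at the end. It
is not a transcription of arch/sh/lib/checksum.S, all function names are
hypothetical, and nothing here shows that a missing rotation is what is
actually wrong with the commit.

```python
def fold16(total: int) -> int:
    """Fold a wide sum into 16 bits using end-around carry."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def swab16(value: int) -> int:
    """Swap the two bytes of a 16-bit value."""
    return ((value << 8) | (value >> 8)) & 0xFFFF


def sum_le_words(data: bytes) -> int:
    """Unfolded sum of data read as little endian 16-bit words."""
    total = 0
    for i in range(0, len(data) - 1, 2):
        total += data[i] | (data[i + 1] << 8)
    if len(data) % 2:
        total += data[-1]  # trailing byte is the low half of a final word
    return total


def csum_reference(data: bytes) -> int:
    """Checksum of the buffer contents, independent of where they live."""
    return fold16(sum_le_words(data))


def csum_aligned_words(memory: bytes, offset: int, length: int,
                       rotate: bool = True) -> int:
    """Model of a routine that only reads words from even addresses."""
    data = memory[offset:offset + length]
    odd = offset & 1
    total = 0
    if odd and data:
        total += data[0] << 8  # leading byte is the high half of an aligned word
        data = data[1:]
    total += sum_le_words(data)  # the rest now starts at an even address
    result = fold16(total)
    if odd and rotate:
        result = swab16(result)  # undo the one-byte shift of every word
    return result


memory = bytes((7 * i + 3) & 0xFF for i in range(64))
for offset in range(4):
    for length in range(40):
        expected = csum_reference(memory[offset:offset + length])
        assert csum_aligned_words(memory, offset, length) == expected
        if offset & 1:
            # Skipping the final rotation yields the byte-swapped checksum.
            unrotated = csum_aligned_words(memory, offset, length, rotate=False)
            assert unrotated == swab16(expected)
print("model agrees with the reference for offsets 0..3")
```

In this model the aligned and misaligned cases agree only when the final
rotation is applied, so an error in an odd-address path would show up for
misaligned buffers only.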

Thanks,
Guenter


Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

Hi Guenter,

On Sat, 2024-02-10 at 07:12 -0800, Guenter Roeck wrote:
> when running checksum unit tests on sh4 qemu emulations, I get the following
> errors.
>
> KTAP version 1
> # Subtest: checksum
> # module: checksum_kunit
> 1..5
> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 53378 (0xd082)
> ( u64)expec == 33488 (0x82d0)
> not ok 1 test_csum_fixed_random_inputs
> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 65281 (0xff01)
> ( u64)expec == 65280 (0xff00)
> not ok 2 test_csum_all_carry_inputs
> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 65535 (0xffff)
> ( u64)expec == 65534 (0xfffe)
> not ok 3 test_csum_no_carry_inputs
> ok 4 test_ip_fast_csum
> ok 5 test_csum_ipv6_magic
> # checksum: pass:2 fail:3 skip:0 total:5
>
> The above is from a little endian system. On a big endian system,
> the test result is as follows.
>
> KTAP version 1
> # Subtest: checksum
> # module: checksum_kunit
> 1..5
> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 33488 (0x82d0)
> ( u64)expec == 53378 (0xd082)
> not ok 1 test_csum_fixed_random_inputs
> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 65281 (0xff01)
> ( u64)expec == 255 (0xff)
> not ok 2 test_csum_all_carry_inputs
> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 1020 (0x3fc)
> ( u64)expec == 0 (0x0)
> not ok 3 test_csum_no_carry_inputs
> # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
> Expected ( u64)expected == ( u64)csum_result, but
> ( u64)expected == 55939 (0xda83)
> ( u64)csum_result == 33754 (0x83da)
> not ok 4 test_ip_fast_csum
> # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
> Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
> ( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
> ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
> not ok 5 test_csum_ipv6_magic
> # checksum: pass:0 fail:5 skip:0 total:5
>
> Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
> systems due to a bug in the test code, unrelated to this problem.
>
> Analysis shows that the errors are seen only if the buffer is misaligned.
> Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
> Handle calling csum_partial with misaligned data") which seemed to be
> related. Reverting that commit fixes the problem.
> This suggests that something may be wrong with that commit. Alternatively,
> of course, it may be possible that something is wrong with the qemu
> emulation, but that seems unlikely.

I have not run these tests before. Can you tell me how these are run,
so I can verify these reproduce on real hardware?

Adrian

--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913

2024-02-10 21:59:07

by Guenter Roeck

Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

Hi Adrian,

On 2/10/24 12:12, John Paul Adrian Glaubitz wrote:
> Hi Guenter,
>
> On Sat, 2024-02-10 at 07:12 -0800, Guenter Roeck wrote:
>> when running checksum unit tests on sh4 qemu emulations, I get the following
>> errors.
>>
>> KTAP version 1
>> # Subtest: checksum
>> # module: checksum_kunit
>> 1..5
>> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
>> Expected ( u64)result == ( u64)expec, but
>> ( u64)result == 53378 (0xd082)
>> ( u64)expec == 33488 (0x82d0)
>> not ok 1 test_csum_fixed_random_inputs
>> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
>> Expected ( u64)result == ( u64)expec, but
>> ( u64)result == 65281 (0xff01)
>> ( u64)expec == 65280 (0xff00)
>> not ok 2 test_csum_all_carry_inputs
>> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
>> Expected ( u64)result == ( u64)expec, but
>> ( u64)result == 65535 (0xffff)
>> ( u64)expec == 65534 (0xfffe)
>> not ok 3 test_csum_no_carry_inputs
>> ok 4 test_ip_fast_csum
>> ok 5 test_csum_ipv6_magic
>> # checksum: pass:2 fail:3 skip:0 total:5
>>
>> The above is from a little endian system. On a big endian system,
>> the test result is as follows.
>>
>> KTAP version 1
>> # Subtest: checksum
>> # module: checksum_kunit
>> 1..5
>> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
>> Expected ( u64)result == ( u64)expec, but
>> ( u64)result == 33488 (0x82d0)
>> ( u64)expec == 53378 (0xd082)
>> not ok 1 test_csum_fixed_random_inputs
>> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
>> Expected ( u64)result == ( u64)expec, but
>> ( u64)result == 65281 (0xff01)
>> ( u64)expec == 255 (0xff)
>> not ok 2 test_csum_all_carry_inputs
>> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
>> Expected ( u64)result == ( u64)expec, but
>> ( u64)result == 1020 (0x3fc)
>> ( u64)expec == 0 (0x0)
>> not ok 3 test_csum_no_carry_inputs
>> # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
>> Expected ( u64)expected == ( u64)csum_result, but
>> ( u64)expected == 55939 (0xda83)
>> ( u64)csum_result == 33754 (0x83da)
>> not ok 4 test_ip_fast_csum
>> # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
>> Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
>> ( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
>> ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
>> not ok 5 test_csum_ipv6_magic
>> # checksum: pass:0 fail:5 skip:0 total:5
>>
>> Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
>> systems due to a bug in the test code, unrelated to this problem.
>>
>> Analysis shows that the errors are seen only if the buffer is misaligned.
>> Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
>> Handle calling csum_partial with misaligned data") which seemed to be
>> related. Reverting that commit fixes the problem.
>> This suggests that something may be wrong with that commit. Alternatively,
>> of course, it may be possible that something is wrong with the qemu
>> emulation, but that seems unlikely.
>
> I have not run these tests before. Can you tell me how these are run,
> so I can verify these reproduce on real hardware?
>

Enabling CONFIG_KUNIT and CONFIG_CHECKSUM_KUNIT on top of a working
configuration should do the trick. Both can be built as modules,
so presumably one can build and load them separately. I have not tried
that, though - I always build them into the kernel and boot the resulting
image.
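
As a sketch, the built-in setup described above corresponds to the
following two lines in the kernel configuration. Only the two option names
come from this mail; using "=m" instead would be the modular variant that
I have not tried.

```
CONFIG_KUNIT=y
CONFIG_CHECKSUM_KUNIT=y
```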

Hope this helps,
Guenter


2024-02-11 03:41:40

by D. Jeff Dionne

Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

I remember there being problems with alignment on SH targets in the network stack. IIRC, wireguard triggered it in actual use; it seems to me it had to do with skb alignment.

Rich Felker may remember more, but I don’t think we implemented a (complete) solution.

Cheers,
J.

> On 11 Feb 2024, at 07:03, Guenter Roeck <[email protected]> wrote:
>
> Hi Adrian,
>
>> On 2/10/24 12:12, John Paul Adrian Glaubitz wrote:
>> Hi Guenter,
>>> On Sat, 2024-02-10 at 07:12 -0800, Guenter Roeck wrote:
>>> when running checksum unit tests on sh4 qemu emulations, I get the following
>>> errors.
>>>
>>> KTAP version 1
>>> # Subtest: checksum
>>> # module: checksum_kunit
>>> 1..5
>>> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
>>> Expected ( u64)result == ( u64)expec, but
>>> ( u64)result == 53378 (0xd082)
>>> ( u64)expec == 33488 (0x82d0)
>>> not ok 1 test_csum_fixed_random_inputs
>>> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
>>> Expected ( u64)result == ( u64)expec, but
>>> ( u64)result == 65281 (0xff01)
>>> ( u64)expec == 65280 (0xff00)
>>> not ok 2 test_csum_all_carry_inputs
>>> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
>>> Expected ( u64)result == ( u64)expec, but
>>> ( u64)result == 65535 (0xffff)
>>> ( u64)expec == 65534 (0xfffe)
>>> not ok 3 test_csum_no_carry_inputs
>>> ok 4 test_ip_fast_csum
>>> ok 5 test_csum_ipv6_magic
>>> # checksum: pass:2 fail:3 skip:0 total:5
>>>
>>> The above is from a little endian system. On a big endian system,
>>> the test result is as follows.
>>>
>>> KTAP version 1
>>> # Subtest: checksum
>>> # module: checksum_kunit
>>> 1..5
>>> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
>>> Expected ( u64)result == ( u64)expec, but
>>> ( u64)result == 33488 (0x82d0)
>>> ( u64)expec == 53378 (0xd082)
>>> not ok 1 test_csum_fixed_random_inputs
>>> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
>>> Expected ( u64)result == ( u64)expec, but
>>> ( u64)result == 65281 (0xff01)
>>> ( u64)expec == 255 (0xff)
>>> not ok 2 test_csum_all_carry_inputs
>>> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
>>> Expected ( u64)result == ( u64)expec, but
>>> ( u64)result == 1020 (0x3fc)
>>> ( u64)expec == 0 (0x0)
>>> not ok 3 test_csum_no_carry_inputs
>>> # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
>>> Expected ( u64)expected == ( u64)csum_result, but
>>> ( u64)expected == 55939 (0xda83)
>>> ( u64)csum_result == 33754 (0x83da)
>>> not ok 4 test_ip_fast_csum
>>> # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
>>> Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
>>> ( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
>>> ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
>>> not ok 5 test_csum_ipv6_magic
>>> # checksum: pass:0 fail:5 skip:0 total:5
>>>
>>> Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
>>> systems due to a bug in the test code, unrelated to this problem.
>>>
>>> Analysis shows that the errors are seen only if the buffer is misaligned.
>>> Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
>>> Handle calling csum_partial with misaligned data") which seemed to be
>>> related. Reverting that commit fixes the problem.
>>> This suggests that something may be wrong with that commit. Alternatively,
>>> of course, it may be possible that something is wrong with the qemu
>>> emulation, but that seems unlikely.
>> I have not run these tests before. Can you tell me how these are run,
>> so I can verify these reproduce on real hardware?
>
> Enabling CONFIG_KUNIT and CONFIG_CHECKSUM_KUNIT on top of a working
> configuration should do the trick. Both can be built as modules,
> so presumably one can build and load them separately. I have not tried
> that, though - I always build them into the kernel and boot the resulting
> image.
>
> Hope this helps,
> Guenter
>
>

2024-02-11 09:54:03

by Geert Uytterhoeven

Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

Hi Günter,

On Sat, Feb 10, 2024 at 10:59 PM Guenter Roeck <[email protected]> wrote:
> On 2/10/24 12:12, John Paul Adrian Glaubitz wrote:
> > I have not run these tests before. Can you tell me how these are run,
> > so I can verify these reproduce on real hardware?
>
> Enabling CONFIG_KUNIT and CONFIG_CHECKSUM_KUNIT on top of a working
> configuration should do the trick. Both can be built as modules,
> so presumably one can build and load them separately. I have not tried
> that, though - I always build them into the kernel and boot the resulting
> image.

Yes, you can build and load them as modules separately; that's what
I do on m68k (and yes, the checksum test fails on m68k, as it is
big endian).

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2024-03-11 17:05:40

by Guenter Roeck

Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
> Hi,
>
> when running checksum unit tests on sh4 qemu emulations, I get the following
> errors.
>

Adding to regression tracker.

#regzbot ^introduced cadc4e1a2b4d2
#regzbot title Problems with csum_partial with misaligned buffers on sh4 platform
#regzbot ignore-activity

> KTAP version 1
> # Subtest: checksum
> # module: checksum_kunit
> 1..5
> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 53378 (0xd082)
> ( u64)expec == 33488 (0x82d0)
> not ok 1 test_csum_fixed_random_inputs
> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 65281 (0xff01)
> ( u64)expec == 65280 (0xff00)
> not ok 2 test_csum_all_carry_inputs
> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:573
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 65535 (0xffff)
> ( u64)expec == 65534 (0xfffe)
> not ok 3 test_csum_no_carry_inputs
> ok 4 test_ip_fast_csum
> ok 5 test_csum_ipv6_magic
> # checksum: pass:2 fail:3 skip:0 total:5
>
> The above is from a little endian system. On a big endian system,
> the test result is as follows.
>
> KTAP version 1
> # Subtest: checksum
> # module: checksum_kunit
> 1..5
> # test_csum_fixed_random_inputs: ASSERTION FAILED at lib/checksum_kunit.c:500
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 33488 (0x82d0)
> ( u64)expec == 53378 (0xd082)
> not ok 1 test_csum_fixed_random_inputs
> # test_csum_all_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:525
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 65281 (0xff01)
> ( u64)expec == 255 (0xff)
> not ok 2 test_csum_all_carry_inputs
> # test_csum_no_carry_inputs: ASSERTION FAILED at lib/checksum_kunit.c:565
> Expected ( u64)result == ( u64)expec, but
> ( u64)result == 1020 (0x3fc)
> ( u64)expec == 0 (0x0)
> not ok 3 test_csum_no_carry_inputs
> # test_ip_fast_csum: ASSERTION FAILED at lib/checksum_kunit.c:589
> Expected ( u64)expected == ( u64)csum_result, but
> ( u64)expected == 55939 (0xda83)
> ( u64)csum_result == 33754 (0x83da)
> not ok 4 test_ip_fast_csum
> # test_csum_ipv6_magic: ASSERTION FAILED at lib/checksum_kunit.c:617
> Expected ( u64)expected_csum_ipv6_magic[i] == ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum), but
> ( u64)expected_csum_ipv6_magic[i] == 6356 (0x18d4)
> ( u64)csum_ipv6_magic(saddr, daddr, len, proto, csum) == 43586 (0xaa42)
> not ok 5 test_csum_ipv6_magic
> # checksum: pass:0 fail:5 skip:0 total:5
>
> Note that test_ip_fast_csum and test_csum_ipv6_magic fail on all big endian
> systems due to a bug in the test code, unrelated to this problem.
>
> Analysis shows that the errors are seen only if the buffer is misaligned.
> Looking into arch/sh/lib/checksum.S, I found commit cadc4e1a2b4d2 ("sh:
> Handle calling csum_partial with misaligned data") which seemed to be
> related. Reverting that commit fixes the problem.
> This suggests that something may be wrong with that commit. Alternatively,
> of course, it may be possible that something is wrong with the qemu
> emulation, but that seems unlikely.
>
> Thanks,
> Guenter

Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

On 11.03.24 18:04, Guenter Roeck wrote:
> On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
>>
>> when running checksum unit tests on sh4 qemu emulations, I get the following
>> errors.
>
> Adding to regression tracker.
>
> #regzbot ^introduced cadc4e1a2b4d2

Hmmm, thx for that, but well, I'm a bit torn here. That
commit afaics is from v3.0-rc1 and Linus iirc at least once said
something along the lines of "a regression only reported after a long
time at some point becomes just a bug". I'd say that applies here,
which is why I'm wondering if tracking this really is worth it.

Ciao, Thorsten



2024-03-18 15:32:28

by Guenter Roeck

Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

On 3/18/24 08:04, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 11.03.24 18:04, Guenter Roeck wrote:
>> On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
>>>
>>> when running checksum unit tests on sh4 qemu emulations, I get the following
>>> errors.
>>
>> Adding to regression tracker.
>>
>> #regzbot ^introduced cadc4e1a2b4d2
>
> Hmmm, thx for that, but well, I'm a bit torn here. That
> commit afaics is from v3.0-rc1 and Linus iirc at least once said
> something along the lines of "a regression only reported after a long
> time at some point becomes just a bug". I'd say that applies here,
> which is why I'm wondering if tracking this really is worth it.
>

Not my call to make. I'll keep in mind to not add "bugs" to the regression
tracker in the future. Feel free to drop.

For my understanding, what is "a long time" ?

Thanks,
Guenter


Subject: Re: Problems with csum_partial with misaligned buffers on sh4 platform

On 18.03.24 16:32, Guenter Roeck wrote:
> On 3/18/24 08:04, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 11.03.24 18:04, Guenter Roeck wrote:
>>> On Sat, Feb 10, 2024 at 07:12:39AM -0800, Guenter Roeck wrote:
>>>>
>>>> when running checksum unit tests on sh4 qemu emulations, I get the
>>>> following
>>>> errors.
>>>
>>> Adding to regression tracker.
>>>
>>> #regzbot ^introduced cadc4e1a2b4d2
>>
>> Hmmm, thx for that, but well, I'm a bit torn here. That
>> commit afaics is from v3.0-rc1 and Linus iirc at least once said
>> something along the lines of "a regression only reported after a long
>> time at some point becomes just a bug". I'd say that applies here,
>> which is why I'm wondering if tracking this really is worth it.
>
> Not my call to make. I'll keep in mind to not add "bugs" to the regression
> tracker in the future.

From my side there is no need for you to keep that in mind, as "someone
added this regression to the tracking" might be something that will
occasionally make a developer finally fix the problem -- which is why I
waited a few days with today's reply. :-D

> Feel free to drop.

Let me do that:

#regzbot inconclusive: really old regression

> For my understanding, what is "a long time" ?

That is a good question and I guess the answer, like so often in kernel
land, depends on the regression in question. :-/ Also note that the
"iirc" really was meant literally, as I might misremember. I just checked
and found two related quotes, but the situations are somewhat different:

https://lore.kernel.org/all/CAHk-=wis_qQy4oDNynNKi5b7Qhosmxtoj1jxo5wmB6SRUwQUBQ@mail.gmail.com/
"""
And yes, I do consider "regression in an earlier release" to be a
regression that needs fixing.

There's obviously a time limit: if that "regression in an earlier
release" was a year or more ago, and just took forever for people to
notice, and it had semantic changes that now mean that fixing the
regression could cause a _new_ regression, then that can cause me to
go "Oh, now the new semantics are what we have to live with".
"""

And also:
https://lore.kernel.org/all/CAHk-=wiVi7mSrsMP=fLXQrXK_UimybW=ziLOwSzFTtoXUacWVQ@mail.gmail.com/
"""
And obviously, if users take years to even notice that something
broke, or if we have sane ways to work around the breakage that
doesn't make for too much trouble for users (ie "ok, there are a
handful of users, and they can use a kernel command line to work
around it" kind of things) we've also been a bit less strict.
"""

Ciao, Thorsten