2019-11-04 21:20:11

by Stefan Seyfried

[permalink] [raw]
Subject: test-mesh-crypto segfaults

Hi,

test-mesh-crypto segfaults for me.

abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> gdb unit/test-mesh-crypto

[....]

[8.6.2 Service Data using Node Identity]
ID Resolving Key = 84396c435ac48560b5965385253e210c
84396c435ac48560b5965385253e210c => PASS
Hash Input = 00000000000034ae608fbbc1f2c61201
00000000000034ae608fbbc1f2c61201 => PASS
Hash = 00861765aefcc57b
00861765aefcc57b => PASS
Mesh ID Beacon = 0100861765aefcc57b34ae608fbbc1f2c6
0100861765aefcc57b34ae608fbbc1f2c6 => PASS


Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
#1 0x00007ffff7e8b6af in free_check () from /lib64/libc.so.6
#2 0x0000555555557c98 in l_free (ptr=<optimized out>) at ell/util.c:136
#3 l_queue_clear (queue=0x55555556d010, destroy=0x555555557c60
<l_free>) at ell/queue.c:107
#4 0x0000555555557210 in _sub_D_65535_1.7021 () at ell/cipher.c:319
#5 0x00007ffff7fe2c13 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#6 0x00007ffff7e3f877 in __run_exit_handlers () from /lib64/libc.so.6
#7 0x00007ffff7e3fa2c in exit () from /lib64/libc.so.6
#8 0x00007ffff7e27e12 in __libc_start_main () from /lib64/libc.so.6
#9 0x0000555555557b8a in _start () at ../sysdeps/x86_64/start.S:120

this is reproducible here, I'm disabling this test for now in the
openSUSE package build.

Best regards
--
Stefan Seyfried

"For a successful technology, reality must take precedence over
public relations, for nature cannot be fooled." -- Richard Feynman


2019-11-04 22:13:14

by Gix, Brian

[permalink] [raw]
Subject: Re: test-mesh-crypto segfaults

Hi Stefan,

On Mon, 2019-11-04 at 22:16 +0100, Stefan Seyfried wrote:
> Hi,
>
> test-mesh-crypto segfaults for me.
>
> abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> gdb unit/test-mesh-crypto
>
> [....]
>
> [8.6.2 Service Data using Node Identity]
> ID Resolving Key = 84396c435ac48560b5965385253e210c
> 84396c435ac48560b5965385253e210c => PASS
> Hash Input = 00000000000034ae608fbbc1f2c61201
> 00000000000034ae608fbbc1f2c61201 => PASS
> Hash = 00861765aefcc57b
> 00861765aefcc57b => PASS
> Mesh ID Beacon = 0100861765aefcc57b34ae608fbbc1f2c6
> 0100861765aefcc57b34ae608fbbc1f2c6 => PASS
>

This is very strange. The unit test looks like it has completed successfully (at least from what you copy-
pasted). That test 8.6.2 is the final test, and it looks happy. Can you verify for me that all the other
tests completed successfully?

There is a *small* chance that you could be running on an old kernel: Kernels version 4.8 and before had a
bug in the AEAD crypto code that made mesh code (including this unit test) inoperable... But this should show
up *first* in one of the earlier tests within this unit test... so look for any FAIL designations.

The other thing about this particular test is that it is the *only* bluez unit test which is based on ELL
(Embedded Linux Library) instead of GLIB.

Are you running from an installed ELL when compiling bluez, or do you have ELL in a "peer directory"... For
instance:

.../work/ell
.../work/bluez

If you have the ELL source code available, please try running:
ell % make check

paying particular attention to the output of:
unit/test-cipher

If unit/test-cipher in ELL passes, then unit/test-mesh-crypto in BLUEZ should pass.


>
> Program received signal SIGSEGV, Segmentation fault.
> 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
> (gdb) bt
> #0 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
> #1 0x00007ffff7e8b6af in free_check () from /lib64/libc.so.6
> #2 0x0000555555557c98 in l_free (ptr=<optimized out>) at ell/util.c:136
> #3 l_queue_clear (queue=0x55555556d010, destroy=0x555555557c60
> <l_free>) at ell/queue.c:107
> #4 0x0000555555557210 in _sub_D_65535_1.7021 () at ell/cipher.c:319
> #5 0x00007ffff7fe2c13 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
> #6 0x00007ffff7e3f877 in __run_exit_handlers () from /lib64/libc.so.6
> #7 0x00007ffff7e3fa2c in exit () from /lib64/libc.so.6
> #8 0x00007ffff7e27e12 in __libc_start_main () from /lib64/libc.so.6
> #9 0x0000555555557b8a in _start () at ../sysdeps/x86_64/start.S:120
>
> this is reproducible here, I'm disabling this test for now in the
> openSUSE package build.
>
> Best regards

2019-11-04 22:27:59

by Gix, Brian

[permalink] [raw]
Subject: Re: test-mesh-crypto segfaults

Also, make sure your ELL version is at least Release 0.26, from 30-Oct-2019.

On Mon, 2019-11-04 at 22:11 +0000, Gix, Brian wrote:
> Hi Stefan,
>
> On Mon, 2019-11-04 at 22:16 +0100, Stefan Seyfried wrote:
> > Hi,
> >
> > test-mesh-crypto segfaults for me.
> >
> > abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> gdb unit/test-mesh-crypto
> >
> > [....]
> >
> > [8.6.2 Service Data using Node Identity]
> > ID Resolving Key = 84396c435ac48560b5965385253e210c
> > 84396c435ac48560b5965385253e210c => PASS
> > Hash Input = 00000000000034ae608fbbc1f2c61201
> > 00000000000034ae608fbbc1f2c61201 => PASS
> > Hash = 00861765aefcc57b
> > 00861765aefcc57b => PASS
> > Mesh ID Beacon = 0100861765aefcc57b34ae608fbbc1f2c6
> > 0100861765aefcc57b34ae608fbbc1f2c6 => PASS
> >
>
> This is very strange. The unit test looks like it has completed successfully (at least from what you copy-
> pasted). That test 8.6.2 is the final test, and it looks happy. Can you verify for me that all the other
> tests completed successfully?
>
> There is a *small* chance that you could be running on an old kernel: Kernels version 4.8 and before had a
> bug in the AEAD crypto code that made mesh code (including this unit test) inoperable... But this should
> show
> up *first* in one of the earlier tests within this unit test... so look for any FAIL designations.
>
> The other thing about this particular test is that it is the *only* bluez unit test which is based on ELL
> (Embedded Linux Library) instead of GLIB.
>
> Are you running from an installed ELL when compiling bluez, or do you have ELL in a "peer directory"... For
> instance:
>
> .../work/ell
> .../work/bluez
>
> If you have the ELL source code available, please try running:
> ell % make check
>
> paying particular attention to the output of:
> unit/test-cipher
>
> If unit/test-cipher in ELL passes, then unit/test-mesh-crypto in BLUEZ should pass.
>
>
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
> > (gdb) bt
> > #0 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
> > #1 0x00007ffff7e8b6af in free_check () from /lib64/libc.so.6
> > #2 0x0000555555557c98 in l_free (ptr=<optimized out>) at ell/util.c:136
> > #3 l_queue_clear (queue=0x55555556d010, destroy=0x555555557c60
> > <l_free>) at ell/queue.c:107
> > #4 0x0000555555557210 in _sub_D_65535_1.7021 () at ell/cipher.c:319
> > #5 0x00007ffff7fe2c13 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
> > #6 0x00007ffff7e3f877 in __run_exit_handlers () from /lib64/libc.so.6
> > #7 0x00007ffff7e3fa2c in exit () from /lib64/libc.so.6
> > #8 0x00007ffff7e27e12 in __libc_start_main () from /lib64/libc.so.6
> > #9 0x0000555555557b8a in _start () at ../sysdeps/x86_64/start.S:120
> >
> > this is reproducible here, I'm disabling this test for now in the
> > openSUSE package build.
> >
> > Best regards

2019-11-05 09:56:29

by Stefan Seyfried

[permalink] [raw]
Subject: Re: test-mesh-crypto segfaults

Hi Brian,

One thing I forgot to mention explicitly: this is the recently released bluez-5.52 tarball, not git master(!).
Also note that openSUSE Factory use LTO by default now.

from config.log:

It was created by bluez configure 5.52, which was
generated by GNU Autoconf 2.69. Invocation command line was
$ ./configure --host=x86_64-suse-linux-gnu --build=x86_64-suse-linux-gnu --program-prefix=
--disable-dependency-tracking --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc
--datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/lib --localstatedir=/var
--sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --disable-dependency-tracking
--disable-silent-rules --enable-pie --enable-library --enable-tools --enable-cups --enable-mesh --enable-midi
--enable-test --enable-experimental --enable-deprecated --enable-datafiles --enable-sixaxis
--with-systemdsystemunitdir=/usr/lib/systemd/system --with-systemduserunitdir=/usr/lib/systemd/user

[...]

ac_cv_env_CFLAGS_value='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables
-fasynchronous-unwind-tables -fstack-clash-protection -Werror=return-type -flto=auto -g'


Am 04.11.19 um 23:11 schrieb Gix, Brian:
> Hi Stefan,
>
> On Mon, 2019-11-04 at 22:16 +0100, Stefan Seyfried wrote:
>> Hi,
>>
>> test-mesh-crypto segfaults for me.
>>
>> abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> gdb unit/test-mesh-crypto
>>
>> [....]
>>
>> [8.6.2 Service Data using Node Identity]
>> ID Resolving Key = 84396c435ac48560b5965385253e210c
>> 84396c435ac48560b5965385253e210c => PASS
>> Hash Input = 00000000000034ae608fbbc1f2c61201
>> 00000000000034ae608fbbc1f2c61201 => PASS
>> Hash = 00861765aefcc57b
>> 00861765aefcc57b => PASS
>> Mesh ID Beacon = 0100861765aefcc57b34ae608fbbc1f2c6
>> 0100861765aefcc57b34ae608fbbc1f2c6 => PASS
>>
>
> This is very strange. The unit test looks like it has completed successfully (at least from what you copy-
> pasted). That test 8.6.2 is the final test, and it looks happy. Can you verify for me that all the other
> tests completed successfully?

yes, all previous tests passed successfully.

> There is a *small* chance that you could be running on an old kernel: Kernels version 4.8 and before had a
> bug in the AEAD crypto code that made mesh code (including this unit test) inoperable... But this should show
> up *first* in one of the earlier tests within this unit test... so look for any FAIL designations.

This is 5.4.0rc5 right now.

> The other thing about this particular test is that it is the *only* bluez unit test which is based on ELL
> (Embedded Linux Library) instead of GLIB.
>
> Are you running from an installed ELL when compiling bluez, or do you have ELL in a "peer directory"... For
> instance:
>
> .../work/ell
> .../work/bluez

no, I am using the embedded ell library AFAICT. I had "ell-devel" installed in the build root, but it is not used
apparently. Nothing changes if I do not install ell-devel. ell-devel and libell0 are old versions (0.23), but again,
removing them from the buildroot does not change things.

> If you have the ELL source code available, please try running:
> ell % make check
>
> paying particular attention to the output of:
> unit/test-cipher
>
> If unit/test-cipher in ELL passes, then unit/test-mesh-crypto in BLUEZ should pass.

I suppose it should work with the ell embedded in bluez-5.52 ;-)

>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
>> (gdb) bt
>> #0 0x00007ffff7e874ae in mem2chunk_check () from /lib64/libc.so.6
>> #1 0x00007ffff7e8b6af in free_check () from /lib64/libc.so.6
>> #2 0x0000555555557c98 in l_free (ptr=<optimized out>) at ell/util.c:136
>> #3 l_queue_clear (queue=0x55555556d010, destroy=0x555555557c60
>> <l_free>) at ell/queue.c:107

I was intrigued by the lots of 0x55 bytes in these pointers(?) and checked if MALLOC_PERTURB_ and MALLOC_CHECK_ values
make a difference.

openSUSE Factory by default builds with

abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> echo $MALLOC_CHECK_
3
abuild@strolchi:~/rpmbuild/BUILD/bluez-5.52> echo $MALLOC_PERTURB_
69

unsetting both leads to an abort() instead of a segfault, but I think that's just a distraction:

~> unit/test-mesh-crypto
[...]
[8.6.2 Service Data using Node Identity]
ID Resolving Key = 84396c435ac48560b5965385253e210c
84396c435ac48560b5965385253e210c => PASS
Hash Input = 00000000000034ae608fbbc1f2c61201
00000000000034ae608fbbc1f2c61201 => PASS
Hash = 00861765aefcc57b
00861765aefcc57b => PASS
Mesh ID Beacon = 0100861765aefcc57b34ae608fbbc1f2c6
0100861765aefcc57b34ae608fbbc1f2c6 => PASS
[8.6.2 Service Data using Node Identity]
ID Resolving Key = 84396c435ac48560b5965385253e210c
84396c435ac48560b5965385253e210c => PASS
Hash Input = 00000000000034ae608fbbc1f2c61201
00000000000034ae608fbbc1f2c61201 => PASS
Hash = 00861765aefcc57b
00861765aefcc57b => PASS
Mesh ID Beacon = 0100861765aefcc57b34ae608fbbc1f2c6
0100861765aefcc57b34ae608fbbc1f2c6 => PASS

free(): invalid size
Aborted (core dumped)

in gdb:
Program received signal SIGABRT, Aborted.
0x00007ffff7e3d0d1 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff7e3d0d1 in raise () from /lib64/libc.so.6
#1 0x00007ffff7e2655d in abort () from /lib64/libc.so.6
#2 0x00007ffff7e8045f in __libc_message () from /lib64/libc.so.6
#3 0x00007ffff7e878fc in malloc_printerr () from /lib64/libc.so.6
#4 0x00007ffff7e892ec in _int_free () from /lib64/libc.so.6
#5 0x0000555555557ca4 in l_free (ptr=<optimized out>) at ell/util.c:136
#6 l_queue_clear (queue=0x55555556d2a0, destroy=0x555555557c60 <l_free>) at ell/queue.c:111
#7 0x0000555555557210 in _sub_D_65535_1.7021 () at ell/cipher.c:319
#8 0x00007ffff7fe2c13 in _dl_fini () from /lib64/ld-linux-x86-64.so.2
#9 0x00007ffff7e3f877 in __run_exit_handlers () from /lib64/libc.so.6
#10 0x00007ffff7e3fa2c in exit () from /lib64/libc.so.6
#11 0x00007ffff7e27e12 in __libc_start_main () from /lib64/libc.so.6
#12 0x0000555555557b8a in _start () at ../sysdeps/x86_64/start.S:120

free(): invalid size
Aborted (core dumped)


Valgrind also complains, so I guess something is wrong in libell:

==28744== Invalid read of size 8
==28744== at 0x10BC7E: l_queue_clear.constprop.0 (queue.c:101)
==28744== by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Address 0x4a15040 is 0 bytes inside a block of size 24 free'd
==28744== at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Block was alloc'd at
==28744== at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10C80D: l_malloc (util.c:62)
==28744== by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744== by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744== by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744== by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744== by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid write of size 8
==28744== at 0x10BCA9: l_queue_clear.constprop.0 (queue.c:114)
==28744== by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Address 0x4a15040 is 0 bytes inside a block of size 24 free'd
==28744== at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Block was alloc'd at
==28744== at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10C80D: l_malloc (util.c:62)
==28744== by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744== by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744== by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744== by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744== by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid write of size 8
==28744== at 0x10BCB1: l_queue_clear.constprop.0 (queue.c:115)
==28744== by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Address 0x4a15048 is 8 bytes inside a block of size 24 free'd
==28744== at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Block was alloc'd at
==28744== at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10C80D: l_malloc (util.c:62)
==28744== by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744== by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744== by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744== by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744== by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid write of size 4
==28744== at 0x10BCB9: l_queue_clear.constprop.0 (queue.c:116)
==28744== by 0x10B20F: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Address 0x4a15050 is 16 bytes inside a block of size 24 free'd
==28744== at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Block was alloc'd at
==28744== at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10C80D: l_malloc (util.c:62)
==28744== by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744== by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744== by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744== by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744== by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744== Invalid free() / delete / delete[] / realloc()
==28744== at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Address 0x4a15040 is 0 bytes inside a block of size 24 free'd
==28744== at 0x48389AB: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10B207: _sub_D_65535_1.7021 (in /home/abuild/rpmbuild/BUILD/bluez-5.52/unit/test-mesh-crypto)
==28744== by 0x400FC12: _dl_fini (in /lib64/ld-2.30.so)
==28744== by 0x4889876: __run_exit_handlers (in /lib64/libc-2.30.so)
==28744== by 0x4889A2B: exit (in /lib64/libc-2.30.so)
==28744== by 0x4871E11: (below main) (in /lib64/libc-2.30.so)
==28744== Block was alloc'd at
==28744== at 0x483777F: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==28744== by 0x10C80D: l_malloc (util.c:62)
==28744== by 0x10B275: UnknownInlinedFun (queue.c:63)
==28744== by 0x10B275: l_debug_add_section.constprop.0 (log.c:376)
==28744== by 0x10B308: _sub_I_65535_0.7019 (queue.c:179)
==28744== by 0x1119E4: __libc_csu_init (elf-init.c:88)
==28744== by 0x4871D99: (below main) (in /lib64/libc-2.30.so)
==28744==
==28744==
==28744== HEAP SUMMARY:
==28744== in use at exit: 0 bytes in 0 blocks
==28744== total heap usage: 773 allocs, 774 frees, 18,598 bytes allocated
==28744==
==28744== All heap blocks were freed -- no leaks are possible
==28744==
==28744== For lists of detected and suppressed errors, rerun with: -s
==28744== ERROR SUMMARY: 5 errors from 5 contexts (suppressed: 0 from 0)


Somehow I'm unable to get proper debuginfo and better output from valgrind and gdb, so in a last desparate attempt I
disabled LTO for my build and now the test does not fail.

I'm going to file a bug for the openSUSE toolchain team for the LTO failure, but it is well possible that they will come
back and point to issues in ell or bluez, so stay tuned ;-)

the difference LTO vs non-LTO is:

LTO:
CFLAGS='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables
-fstack-clash-protection -Werror=return-type -g'

Non-LTO:
CFLAGS='-O2 -Wall -D_FORTIFY_SOURCE=2 -fstack-protector-strong -funwind-tables -fasynchronous-unwind-tables
-fstack-clash-protection -Werror=return-type -flto=auto -g'

Best regards,
--
Stefan Seyfried

"For a successful technology, reality must take precedence over
public relations, for nature cannot be fooled." -- Richard Feynman

2019-11-05 15:26:11

by Stefan Seyfried

[permalink] [raw]
Subject: Re: test-mesh-crypto segfaults

Am 05.11.19 um 10:53 schrieb Stefan Seyfried:
> Hi Brian,
>
> One thing I forgot to mention explicitly: this is the recently released bluez-5.52 tarball, not git master(!).
> Also note that openSUSE Factory use LTO by default now.

LTO seems to trigger the problem, see
https://bugzilla.opensuse.org/show_bug.cgi?id=1155889#c6
for an explanation of the issue and a suggested patch that will fix / work around the issue.

Best regards,
--
Stefan Seyfried

"For a successful technology, reality must take precedence over
public relations, for nature cannot be fooled." -- Richard Feynman

2019-11-05 16:32:29

by Gix, Brian

[permalink] [raw]
Subject: Re: test-mesh-crypto segfaults

Hi Stefan,

On Tue, 2019-11-05 at 16:24 +0100, Stefan Seyfried wrote:
> Am 05.11.19 um 10:53 schrieb Stefan Seyfried:
> > Hi Brian,
> >
> > One thing I forgot to mention explicitly: this is the recently released bluez-5.52 tarball, not git
> > master(!).
> > Also note that openSUSE Factory use LTO by default now.
>
> LTO seems to trigger the problem, see
> https://bugzilla.opensuse.org/show_bug.cgi?id=1155889#c6
> for an explanation of the issue and a suggested patch that will fix / work around the issue.

Thank you for your very thorough analysis. We are looking at it now.

>
> Best regards,