2022-10-15 15:44:19

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

Topposting, to make this easier to access for everyone.

@Mirsad, thx for bisecting.

@Phillip: if you want to see a problem description and the whole
backstory of the problem that apparently is caused by b09a7a036d20
("squashfs: support reading fragments in readahead call"), see this
thread (Mirsad sadly started a new one with the quoted mail below):
https://lore.kernel.org/all/[email protected]/

Ciao, Thorsten

On 15.10.22 16:56, Mirsad Goran Todorovac wrote:
> On 14. 10. 2022. 23:44, Mirsad Goran Todorovac wrote:
>> On 14. 10. 2022. 14:28, Bagas Sanjaya wrote:
>>> On 10/14/22 17:32, Mirsad Todorovac wrote:
>>>> I tried the "make localmodconfig" and provided the default answers
>>>> ([ENTER]) to all questions
>>>> the script asked as advised here:
>>>> https://www.stolaf.edu/people/rab/os/linux-compile.html .
>>>>
>>>> However, though it built much faster, the stripped version did not
>>>> trigger the bug.
>>>>
>>>> I am now trying to reproduce the bug with v6.0-rc[123] with
>>>> config-{051913,060000}.
>>>> This brings a lot of combinations, and though I am a newbie, I
>>>> noticed that build scripts
>>>> start with "make clean" for both deb-pkg and rpm-pkg.
>>>>
>>>> Is there a way to rebuild only the stuff that changed between the
>>>> versions?
>>>>
>>> You can try building kernel with ccache enabled. However, you'll need
>>> to unset build timestamp, since it will make builds non-deterministic:
>>>
>>>     make CC="ccache gcc" KBUILD_BUILD_TIMESTAMP=""
>>>
>>> The first ccache build will be slower than normal build, because the
>>> object files needs to be written twice (to the output directory and
>>> to the cache), though.
>>
>> Hi, Bagas,
>>
>> Unfortunately, the ccache command did not work as expected. It still
>> calls the gcc compiler proper,
>> even in the second run ...
>>
>> PID USER      PR  NI    VIRT    RES    SHR S %CPU  %MEM     TIME+ COMMAND
>> 29213 root      20   0   67420  55100  15676 R  98.1   0.3 0:02.42
>> /opt/alt/python38/bin/python3 /usr/bin/imunify360-agent rstatus
>> 29374 mtodorov  30  10  133932 110104  18408 R  98.1   0.7 0:01.46
>> /usr/lib/gcc/x86_64-linux-gnu/8/cc1 -quiet -nostdinc -I
>> ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./arc+
>>  1079 root      20   0  589008 371700   2320 S  97.1   2.3 2343:20
>> /usr/sbin/bacula-fd -f -c /etc/bacula/bacula-fd.conf
>> 29425 mtodorov  30  10   95052  71520  16700 R  76.9   0.4 0:00.80
>> /usr/lib/gcc/x86_64-linux-gnu/8/cc1 -quiet -nostdinc -I
>> ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./arc+
>> 29459 mtodorov  30  10   79268  51432  12732 R  46.2   0.3 0:00.48
>> /usr/lib/gcc/x86_64-linux-gnu/8/cc1 -quiet -nostdinc -I
>> ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./arc+
>> 29462 mtodorov  30  10   65516  34828   9224 R  43.3   0.2 0:00.45
>> /usr/lib/gcc/x86_64-linux-gnu/8/cc1 -E -quiet -nostdinc -I
>> ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./+
>> 29505 mtodorov  30  10       0      0      0 R  32.7   0.0 0:00.34 [cc1]
>> 29506 mtodorov  30  10   65940  38560  12576 R  32.7   0.2 0:00.34
>> /usr/lib/gcc/x86_64-linux-gnu/8/cc1 -quiet -nostdinc -I
>> ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./arc+
>> 29523 mtodorov  30  10   52180  21572   9344 R  17.3   0.1 0:00.18
>> /usr/lib/gcc/x86_64-linux-gnu/8/cc1 -E -quiet -nostdinc -I
>> ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I ./+
>>  5517 root      20   0  780956  77812  10052 S  11.5   0.5 466:39.30
>> /opt/alt/python38/bin/python3 -m im360.run
>> 29384 mtodorov  30  10    7132   3964   3568 S  11.5   0.0 0:00.12
>> /usr/local/bin/ccache gcc -Wp,-MMD,sound/i2c/other/.ak4114.o.d
>> -nostdinc -I./arch/x86/include -I./arch/x86/include/generated -I.+
>> 29407 mtodorov  30  10    7116   3748   3360 S  10.6   0.0 0:00.11
>> /usr/local/bin/ccache gcc -Wp,-MMD,fs/exfat/.nls.o.d -nostdinc
>> -I./arch/x86/include -I./arch/x86/include/generated -I./include -+
>> 29451 mtodorov  30  10    7132   3968   3568 S   9.6   0.0 0:00.10
>> /usr/local/bin/ccache gcc
>> -Wp,-MMD,drivers/pinctrl/intel/.pinctrl-tigerlake.o.d -nostdinc
>> -I./arch/x86/include -I./arch/x86/incl+
>> 29522 mtodorov  30  10    8288   7520   1468 R   8.7   0.0 0:00.09
>> scripts/genksyms/genksyms -r /dev/null
>> 29524 mtodorov  30  10   17008  12480   3204 R   4.8   0.1 0:00.05 as
>> -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I
>> ./arch/x86/include/uapi -I ./arch/x86/include/generated+
>>  8896 mtodorov  20   0   12212   4616   3348 R   3.8   0.0 0:04.58 top
>>
>> 29525 mtodorov  30  10   15028  10412 3248 R   3.8   0.1   0:00.04 as
>> -I ./arch/x86/include -I ./arch/x86/include/generated -I ./include -I
>> ./arch/x86/include/uapi -I ./arch/x86/include/generated+
>>
>> I used the command line:
>> $ make CC="/usr/local/bin/ccache gcc" KBUILD_BUILD_TIMESTAMP="" -j10
>> deb-pkg
>>
>> I tried to install the latest ccache (version 4.6.3), but it silently
>> calls cc1, despite having 4GiB of cache in $HOME/.ccache ...
>>
>> I must be doing something wrong. Heavens are not aligned in favour to
>> me this day :-/
>
> Here are the results of the requested bisect on the bug involving the
> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>
> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
> git bisect start
> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
> 'mm-stable-2022-08-03' of
> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
> 'mm-nonmm-stable-2022-08-06-2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
> Add quirk for HP Spectre x360 15-eb0xxx
> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
> 'dma-mapping-5.20-2022-08-06' of
> git://git.infradead.org/users/hch/dma-mapping
> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
> kexec build error
> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
> build "file direct" version of page actor
> git bisect good db98b43086275350294f5c6f797249b714d6316d
> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
> Check the size of screen before memset_io()
> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
> 'for-5.20/fbdev-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
> useless functions
> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
> address space of proc_dohung_task_timeout_secs
> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
> reading fragments in readahead call
> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
> mtodorov@domac:~/linux/kernel/linux_stable$
>
> The git bisect stopped at the squashfs commit
> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
> according to the Code of Conduct.
>
> Have a nice day!
>
> Kind regards,
> Mirsad
>
> --
> Mirsad Goran Todorovac
> Sistem inženjer
> Grafički fakultet | Akademija likovnih umjetnosti
> Sveučilište u Zagrebu
> --
> System engineer
> Faculty of Graphic Arts | Academy of Fine Arts
> University of Zagreb, Republic of Croatia
> The European Union
>

#regzbot ^backmonitor:
https://lore.kernel.org/all/[email protected]/
#regzbot introduced: b09a7a036d2035


2022-10-15 22:09:56

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

Thorsten Leemhuis <[email protected]> wrote:
>
>Topposting, to make this easier to access for everyone.
>
>@Mirsad, thx for bisecting.
>
>@Phillip: if you want to see a problem description and the whole
>backstory of the problem that apparently is caused by b09a7a036d20
>("squashfs: support reading fragments in readahead call"), see this
>thread (Mirsad sadly started a new one with the quoted mail below):
>https://lore.kernel.org/all/[email protected]/
>

The above backstory tends to suggest data corruption which is happening
after a couple of hours especially on heavy loads, e.g. the comment

> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>The bug usually isn't showing immediately, but after a couple of hours
>of running (especially with multimedia running inside Firefox).

Which is typically caused by double freed buffers or race conditions in
freeing and reusing.

Thanks Mirsad for the following

On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>
>Here are the results of the requested bisect on the bug involving the
>Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>
>mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>git bisect start
># bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
># good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
># good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
># good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
># good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>'mm-stable-2022-08-03' of
>git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
># bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>'mm-nonmm-stable-2022-08-06-2' of
>git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
># good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>Add quirk for HP Spectre x360 15-eb0xxx
>git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
># good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>'dma-mapping-5.20-2022-08-06' of
>git://git.infradead.org/users/hch/dma-mapping
>git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
># good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>kexec build error
>git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
># good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>'s390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
># good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>build "file direct" version of page actor
>git bisect good db98b43086275350294f5c6f797249b714d6316d
># good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>Check the size of screen before memset_io()
>git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
># good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>'for-5.20/fbdev-1' of
>git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
># bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>useless functions
>git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
># bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>address space of proc_dohung_task_timeout_secs
>git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
># bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
>reading fragments in readahead call
>git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>mtodorov@domac:~/linux/kernel/linux_stable$
>
>The git bisect stopped at the squashfs commit
>b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>according to the Code of Conduct.

Which identified the "squashfs: support reading fragments in readahead call"
patch.

There is a race-condition introduced in that patch, which involves cache
releasing and reuse.

The following diff will fix that race-condition. It would be great if
someone could test and verify before sending it out as a patch.

Thanks

Phillip

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index e56510964b22..6cc23178e9ad 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
squashfs_i(inode)->fragment_size);
struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
+ int error = buffer->error;

- if (buffer->error)
+ if (error)
goto out;

expected += squashfs_i(inode)->fragment_offset;
@@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,

out:
squashfs_cache_put(buffer);
- return buffer->error;
+ return error;
}

static void squashfs_readahead(struct readahead_control *ractl)

2022-10-16 12:26:35

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/16/22 03:59, Phillip Lougher wrote:
>
> Which identified the "squashfs: support reading fragments in readahead call"
> patch.
>
> There is a race-condition introduced in that patch, which involves cache
> releasing and reuse.
>
> The following diff will fix that race-condition. It would be great if
> someone could test and verify before sending it out as a patch.
>
> Thanks
>
> Phillip
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..6cc23178e9ad 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int error = buffer->error;
>
> - if (buffer->error)
> + if (error)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return error;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
>

No Verneed warnings so far. However, I need to test for a longer time
(a day) to check if any warnings are reported.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-16 12:45:22

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/16/22 19:21, Bagas Sanjaya wrote:
> On 10/16/22 03:59, Phillip Lougher wrote:
>>
>> Which identified the "squashfs: support reading fragments in readahead call"
>> patch.
>>
>> There is a race-condition introduced in that patch, which involves cache
>> releasing and reuse.
>>
>> The following diff will fix that race-condition. It would be great if
>> someone could test and verify before sending it out as a patch.
>>
>> Thanks
>>
>> Phillip
>>
>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
>> index e56510964b22..6cc23178e9ad 100644
>> --- a/fs/squashfs/file.c
>> +++ b/fs/squashfs/file.c
>> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
>> squashfs_i(inode)->fragment_size);
>> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
>> + int error = buffer->error;
>>
>> - if (buffer->error)
>> + if (error)
>> goto out;
>>
>> expected += squashfs_i(inode)->fragment_offset;
>> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>>
>> out:
>> squashfs_cache_put(buffer);
>> - return buffer->error;
>> + return error;
>> }
>>
>> static void squashfs_readahead(struct readahead_control *ractl)
>>
>
> No Verneed warnings so far. However, I need to test for a longer time
> (a day) to check if any warnings are reported.
>
> Thanks.
>

Also, since this regression is also found on linux-6.0.y stable branch,
don't forget to Cc stable list.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-16 12:45:53

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 16.10.22 14:24, Bagas Sanjaya wrote:
> On 10/16/22 19:21, Bagas Sanjaya wrote:
>> On 10/16/22 03:59, Phillip Lougher wrote:
>>>
>>> Which identified the "squashfs: support reading fragments in readahead call"
>>> patch.

BTW, Phillip, sorry for specifying the wrong culprit in an earlier mail.

> Also, since this regression is also found on linux-6.0.y stable branch,
> don't forget to Cc stable list.

FWIW, that's afaics not needed (and actually slightly confusing I'd
say). That is only important when the problem was introduced in a stable
release to make the stable team aware of it, as then they might be
willing to revert the culprit if the problem is not present in mainline.

But this problem was introduced in the 6.0 mainline cycle and thus needs
to be fixed there first.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

P.P.S.:

#regzbot introduced: b09a7a036d2035b14636c

2022-10-16 16:43:27

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 15. 10. 2022. 22:59, Phillip Lougher wrote:

> Thorsten Leemhuis<[email protected]> wrote:
>> Topposting, to make this easier to access for everyone.
>>
>> @Mirsad, thx for bisecting.
>>
>> @Phillip: if you want to see a problem description and the whole
>> backstory of the problem that apparently is caused by b09a7a036d20
>> ("squashfs: support reading fragments in readahead call"), see this
>> thread (Mirsad sadly started a new one with the quoted mail below):
>> https://lore.kernel.org/all/[email protected]/
>>
> The above backstory tends to suggest data corruption which is happening
> after a couple of hours especially on heavy loads, e.g. the comment
>
>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>> The bug usually isn't showing immediately, but after a couple of hours
>> of running (especially with multimedia running inside Firefox).
> Which is typically caused by double freed buffers or race conditions in
> freeing and reusing.
>
> Thanks Mirsad for the following
>
> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>> Here are the results of the requested bisect on the bug involving the
>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>
>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>> git bisect start
>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>> 'mm-stable-2022-08-03' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>> 'mm-nonmm-stable-2022-08-06-2' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>> Add quirk for HP Spectre x360 15-eb0xxx
>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>> 'dma-mapping-5.20-2022-08-06' of
>> git://git.infradead.org/users/hch/dma-mapping
>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>> kexec build error
>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>> build "file direct" version of page actor
>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>> Check the size of screen before memset_io()
>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>> 'for-5.20/fbdev-1' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>> useless functions
>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>> address space of proc_dohung_task_timeout_secs
>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
>> reading fragments in readahead call
>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>> mtodorov@domac:~/linux/kernel/linux_stable$
>>
>> The git bisect stopped at the squashfs commit
>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>> according to the Code of Conduct.
> Which identified the "squashfs: support reading fragments in readahead call"
> patch.
>
> There is a race-condition introduced in that patch, which involves cache
> releasing and reuse.
>
> The following diff will fix that race-condition. It would be great if
> someone could test and verify before sending it out as a patch.
>
> Thanks
>
> Phillip
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..6cc23178e9ad 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int error = buffer->error;
>
> - if (buffer->error)
> + if (error)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return error;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)

Dear Phillip,

I am sorry to have to disappoint you.

Having applied the patch to the version 6.0.1. The first run succeeded,
but the second (after quit and restart from toolbar) run's results are
still not good:

marvin@marvin-IdeaPad-3-15ITL6:~$ sudo tail -300f /var/log/syslog | grep
firefox
Oct 16 17:23:17 marvin-IdeaPad-3-15ITL6 systemd[1795]: Started
snap.firefox.firefox.41ac7adc-efbd-4c47-bccb-06711efb7a2e.scope.
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
update.go:85: cannot change mount namespace according to change mount
(/var/lib/snapd/hostfs/usr/share/cups/doc-root /usr/share/cups/doc-root
none bind,ro 0 0): cannot create directory "/usr/share/cups/doc-root":
permission denied
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
update.go:85: cannot change mount namespace according to change mount
(/var/lib/snapd/hostfs/usr/share/gimp/2.0/help /usr/share/gimp/2.0/help
none bind,ro 0 0): cannot create directory "/usr/share/gimp/2.0":
permission denied
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
update.go:85: cannot change mount namespace according to change mount
(/var/lib/snapd/hostfs/usr/share/libreoffice/help
/usr/share/libreoffice/help none bind,ro 0 0): cannot create directory
"/usr/share/libreoffice/help": permission denied
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
update.go:85: cannot change mount namespace according to change mount
(/var/lib/snapd/hostfs/usr/share/xubuntu-docs /usr/share/xubuntu-docs
none bind,ro 0 0): cannot open directory "/var/lib": permission denied
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.654344] audit:
type=1400 audit(1665933798.028:105): apparmor="DENIED" operation="mkdir"
profile="snap-update-ns.firefox" name="/usr/share/cups/doc-root/"
pid=2556 comm="5" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655141] audit:
type=1400 audit(1665933798.028:106): apparmor="DENIED" operation="mkdir"
profile="snap-update-ns.firefox" name="/usr/share/gimp/2.0/" pid=2556
comm="5" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655528] audit:
type=1400 audit(1665933798.028:107): apparmor="DENIED" operation="mkdir"
profile="snap-update-ns.firefox" name="/usr/share/libreoffice/help/"
pid=2556 comm="5" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655671] audit:
type=1400 audit(1665933798.028:108): apparmor="DENIED" operation="open"
profile="snap-update-ns.firefox" name="/var/lib/" pid=2556 comm="5"
requested_mask="r" denied_mask="r" fsuid=0 ouid=0
Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 firefox[2527]: Failed to load
module "canberra-gtk-module"
Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 firefox[2527]: Failed to load
module "canberra-gtk-module"
Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 kernel: [   27.287722] audit:
type=1326 audit(1665933799.660:110): auid=1000 uid=1000 gid=1000 ses=3
subj=snap.firefox.firefox pid=2527 comm="firefox"
exe="/snap/firefox/1943/usr/lib/firefox/firefox" sig=0 arch=c000003e
syscall=314 compat=0 ip=0x7f54ec28673d code=0x50000
Oct 16 17:23:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
ATTENTION: default value of option mesa_glthread overridden by environment.
Oct 16 17:23:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
message repeated 2 times: [ ATTENTION: default value of option
mesa_glthread overridden by environment.]
Oct 16 17:23:23 marvin-IdeaPad-3-15ITL6 snapd[817]: storehelpers.go:748:
cannot refresh: snap has no updates available: "bare",
"canonical-livepatch", "core", "core18", "core20", "firefox",
"gnome-3-34-1804", "gnome-3-38-2004", "gtk-common-themes", "slack",
"snap-store", "snapd", "zoom-client"
Oct 16 17:23:40 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
Missing chrome or resource URL:
resource://gre/modules/UpdateListener.sys.mjs
Oct 16 17:23:55 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
[2022-10-16T15:23:55Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Oct 16 17:23:55 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
message repeated 3 times: [ [2022-10-16T15:23:55Z ERROR mp4parse] Found
2 nul bytes in "\0\0"]
Oct 16 17:23:56 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
[2022-10-16T15:23:56Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Oct 16 17:23:56 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
message repeated 3 times: [ [2022-10-16T15:23:56Z ERROR mp4parse] Found
2 nul bytes in "\0\0"]
Oct 16 17:23:58 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
[2022-10-16T15:23:58Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
Oct 16 17:23:58 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
message repeated 3 times: [ [2022-10-16T15:23:58Z ERROR mp4parse] Found
2 nul bytes in "\0\0"]
Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
XPCOMGlueLoad error for file
/snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
/snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1:
unsupported version 0 of Verneed record
Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
Couldn't load XPCOM.
Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
XPCOMGlueLoad error for file
/snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
/snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1:
unsupported version 0 of Verneed record
Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
Couldn't load XPCOM.
Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
/snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
/snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verneed record
Oct 16 17:24:27 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4480]:
Inconsistency detected by ld.so: ../elf/dl-runtime.c: 80: _dl_fixup:
Assertion `ELFW(R_TYPE)(reloc->r_info) == ELF_MACHINE_JMP_SLOT' failed!
Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
/snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
/snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verneed record
Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
/snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
/snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verneed record
Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
/snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
/snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verneed record
Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
/snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
/snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verneed record
Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
/snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
/snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verneed record
Oct 16 17:25:02 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
Missing chrome or resource URL:
resource://gre/modules/UpdateListener.sys.mjs
Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
/snap/firefox/1943/usr/lib/firefox/pingsender:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
/snap/firefox/1943/usr/lib/firefox/pingsender:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
/snap/firefox/1943/usr/lib/firefox/pingsender: error while loading
shared libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version
0 of Verneed record
Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2764]:
/snap/firefox/1943/usr/lib/firefox/firefox: symbol lookup error:
/snap/firefox/1943/usr/lib/firefox/libmozsandbox.so: undefined symbol: ,
version
Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 systemd[1795]:
snap.firefox.firefox.41ac7adc-efbd-4c47-bccb-06711efb7a2e.scope:
Consumed 6min 11.938s CPU time.
Oct 16 17:26:22 marvin-IdeaPad-3-15ITL6 systemd[1795]: Started
snap.firefox.firefox.8af6eb39-1ebc-4e81-a7c1-a6cc66e76eac.scope.
Oct 16 17:26:23 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5573]:
/bin/bash: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
Verdef record
Oct 16 17:26:23 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5573]:
/bin/bash: error while loading shared libraries:
/lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verneed record

I have double-checked that this version of the function is in my build tree:

static int squashfs_readahead_fragment(struct page **page,
        unsigned int pages, unsigned int expected)
{
        struct inode *inode = page[0]->mapping->host;
        struct squashfs_cache_entry *buffer =
squashfs_get_fragment(inode->i_sb,
                squashfs_i(inode)->fragment_block,
                squashfs_i(inode)->fragment_size);
        struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
        unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
        int error = buffer->error;

        if (error)
                goto out;

        expected += squashfs_i(inode)->fragment_offset;

        for (n = 0; n < pages; n++) {
                unsigned int base = (page[n]->index & mask) << PAGE_SHIFT;
                unsigned int offset = base +
squashfs_i(inode)->fragment_offset;

                if (expected > offset) {
                        unsigned int avail = min_t(unsigned int, expected -
                                offset, PAGE_SIZE);

                        squashfs_fill_page(page[n], buffer, offset, avail);
                }

                unlock_page(page[n]);
                put_page(page[n]);
        }

out:
        squashfs_cache_put(buffer);
        return error;
}

The same set of Firefox windows and tabs loaded correctly once switching
back to the 5.19.13 kernel image.

The same applies for running the same setup with the home-built "Linux
5.19.15-20221014-conf051913ub x86_64": No Verdef nor Verneed error messages.

Have a nice day, and good bug hunting!

Regards,
Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-10-16 20:21:26

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 16/10/2022 16:55, Mirsad Goran Todorovac wrote:
> On 15. 10. 2022. 22:59, Phillip Lougher wrote:
>
>> Thorsten Leemhuis<[email protected] wrote:
>>> Topposting, to make this easier to access for everyone.
>>>
>>> @Mirsad, thx for bisecting.
>>>
>>> @Phillip: if you want to see a problem description and the whole
>>> backstory of the problem that apparently is caused by     b09a7a036d20
>>> ("squashfs: support reading fragments in readahead call"), see this
>>> thread (Mirsad sadly started a new one with the quoted mail below):
>>> https://lore.kernel.org/all/[email protected]/
>>>
>> The above backstory tends to suggest data corruption which is happening
>> after a couple of hours especially on heavy loads, e.g. the comment
>>
>>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>>> The bug usually isn't showing immediately, but after a couple of hours
>>> of running (especially with multimedia running inside Firefox).
>> Which is typically caused by double freed buffers or race conditions in
>> freeing and reusing.
>>
>> Thanks Mirsad for the following
>>
>> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>>> Here are the results of the requested bisect on the bug involving the
>>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>>
>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>> git bisect start
>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>>> 'mm-stable-2022-08-03' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>>> 'mm-nonmm-stable-2022-08-06-2' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>>> Add quirk for HP Spectre x360 15-eb0xxx
>>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>>> 'dma-mapping-5.20-2022-08-06' of
>>> git://git.infradead.org/users/hch/dma-mapping
>>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>>> kexec build error
>>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>>> 's390-5.20-1' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>>> build "file direct" version of page actor
>>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>>> Check the size of screen before memset_io()
>>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>>> 'for-5.20/fbdev-1' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>>> useless functions
>>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>>> address space of proc_dohung_task_timeout_secs
>>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65]  squashfs: support
>>> reading fragments in readahead call
>>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>>> mtodorov@domac:~/linux/kernel/linux_stable$
>>>
>>> The git bisect stopped at the squashfs commit
>>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>>> according to the Code of Conduct.
>> Which identified the "squashfs: support reading fragments in readahead
>> call"
>> patch.
>>
>> There is a race-condition introduced in that patch, which involves cache
>> releasing and reuse.
>>
>> The following diff will fix that race-condition.  It would be great if
>> someone could test and verify before sending it out as a patch.
>>
>> Thanks
>>
>> Phillip
>>
>> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
>> index e56510964b22..6cc23178e9ad 100644
>> --- a/fs/squashfs/file.c
>> +++ b/fs/squashfs/file.c
>> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page
>> **page,
>>           squashfs_i(inode)->fragment_size);
>>       struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>>       unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
>> +    int error = buffer->error;
>> -    if (buffer->error)
>> +    if (error)
>>           goto out;
>>       expected += squashfs_i(inode)->fragment_offset;
>> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page
>> **page,
>>   out:
>>       squashfs_cache_put(buffer);
>> -    return buffer->error;
>> +    return error;
>>   }
>>   static void squashfs_readahead(struct readahead_control *ractl)
>
> Dear Phillip,
>
> I am sorry to have to disappoint you.
>
> Having applied the patch to the version 6.0.1. The first run succeeded,
> but the second (after quit and restart from toolbar) run's results are
> still not good:
>

Tracking down bugs of this sort is always a process of elimination,
and gathering information to pinpoint the exact circumstances of why
it is triggering.

Next step is to download the exact snap(s) where the problems are
occurring, as this may provide some insights.

I don't run Ubuntu, and I don't use snaps. Can you provide the
download link(s) of the snap(s) that cause problems? If there's
any firefox snaps that don't cause problems those will be useful
too.

You don't mention if there's any errors present in "dmesg" when
this happens, and so I'm assuming there aren't any?

Phillip


> marvin@marvin-IdeaPad-3-15ITL6:~$ sudo tail -300f /var/log/syslog | grep
> firefox
> Oct 16 17:23:17 marvin-IdeaPad-3-15ITL6 systemd[1795]: Started
> snap.firefox.firefox.41ac7adc-efbd-4c47-bccb-06711efb7a2e.scope.
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
> update.go:85: cannot change mount namespace according to change mount
> (/var/lib/snapd/hostfs/usr/share/cups/doc-root /usr/share/cups/doc-root
> none bind,ro 0 0): cannot create directory "/usr/share/cups/doc-root":
> permission denied
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
> update.go:85: cannot change mount namespace according to change mount
> (/var/lib/snapd/hostfs/usr/share/gimp/2.0/help /usr/share/gimp/2.0/help
> none bind,ro 0 0): cannot create directory "/usr/share/gimp/2.0":
> permission denied
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
> update.go:85: cannot change mount namespace according to change mount
> (/var/lib/snapd/hostfs/usr/share/libreoffice/help
> /usr/share/libreoffice/help none bind,ro 0 0): cannot create directory
> "/usr/share/libreoffice/help": permission denied
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
> update.go:85: cannot change mount namespace according to change mount
> (/var/lib/snapd/hostfs/usr/share/xubuntu-docs /usr/share/xubuntu-docs
> none bind,ro 0 0): cannot open directory "/var/lib": permission denied
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.654344] audit:
> type=1400 audit(1665933798.028:105): apparmor="DENIED" operation="mkdir"
> profile="snap-update-ns.firefox" name="/usr/share/cups/doc-root/"
> pid=2556 comm="5" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655141] audit:
> type=1400 audit(1665933798.028:106): apparmor="DENIED" operation="mkdir"
> profile="snap-update-ns.firefox" name="/usr/share/gimp/2.0/" pid=2556
> comm="5" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655528] audit:
> type=1400 audit(1665933798.028:107): apparmor="DENIED" operation="mkdir"
> profile="snap-update-ns.firefox" name="/usr/share/libreoffice/help/"
> pid=2556 comm="5" requested_mask="c" denied_mask="c" fsuid=0 ouid=0
> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655671] audit:
> type=1400 audit(1665933798.028:108): apparmor="DENIED" operation="open"
> profile="snap-update-ns.firefox" name="/var/lib/" pid=2556 comm="5"
> requested_mask="r" denied_mask="r" fsuid=0 ouid=0
> Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 firefox[2527]: Failed to load
> module "canberra-gtk-module"
> Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 firefox[2527]: Failed to load
> module "canberra-gtk-module"
> Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 kernel: [   27.287722] audit:
> type=1326 audit(1665933799.660:110): auid=1000 uid=1000 gid=1000 ses=3
> subj=snap.firefox.firefox pid=2527 comm="firefox"
> exe="/snap/firefox/1943/usr/lib/firefox/firefox" sig=0 arch=c000003e
> syscall=314 compat=0 ip=0x7f54ec28673d code=0x50000
> Oct 16 17:23:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
> ATTENTION: default value of option mesa_glthread overridden by environment.
> Oct 16 17:23:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
> message repeated 2 times: [ ATTENTION: default value of option
> mesa_glthread overridden by environment.]
> Oct 16 17:23:23 marvin-IdeaPad-3-15ITL6 snapd[817]: storehelpers.go:748:
> cannot refresh: snap has no updates available: "bare",
> "canonical-livepatch", "core", "core18", "core20", "firefox",
> "gnome-3-34-1804", "gnome-3-38-2004", "gtk-common-themes", "slack",
> "snap-store", "snapd", "zoom-client"
> Oct 16 17:23:40 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
> Missing chrome or resource URL:
> resource://gre/modules/UpdateListener.sys.mjs
> Oct 16 17:23:55 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
> [2022-10-16T15:23:55Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
> Oct 16 17:23:55 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
> message repeated 3 times: [ [2022-10-16T15:23:55Z ERROR mp4parse] Found
> 2 nul bytes in "\0\0"]
> Oct 16 17:23:56 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
> [2022-10-16T15:23:56Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
> Oct 16 17:23:56 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
> message repeated 3 times: [ [2022-10-16T15:23:56Z ERROR mp4parse] Found
> 2 nul bytes in "\0\0"]
> Oct 16 17:23:58 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
> [2022-10-16T15:23:58Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
> Oct 16 17:23:58 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
> message repeated 3 times: [ [2022-10-16T15:23:58Z ERROR mp4parse] Found
> 2 nul bytes in "\0\0"]
> Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
> XPCOMGlueLoad error for file
> /snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
> Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
> /snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1: unsupported version 0 of Verneed record
> Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
> Couldn't load XPCOM.
> Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
> XPCOMGlueLoad error for file
> /snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
> Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
> /snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1: unsupported version 0 of Verneed record
> Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
> Couldn't load XPCOM.
> Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
> /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
> message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
> Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verneed record
> Oct 16 17:24:27 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4480]:
> Inconsistency detected by ld.so: ../elf/dl-runtime.c: 80: _dl_fixup:
> Assertion `ELFW(R_TYPE)(reloc->r_info) == ELF_MACHINE_JMP_SLOT' failed!
> Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
> /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
> message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
> Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verneed record
> Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
> /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
> message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
> Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verneed record
> Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
> /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
> message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
> Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verneed record
> Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
> /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
> message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
> Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verneed record
> Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
> /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
> message repeated 2 times: [ /snap/firefox/1943/usr/lib/firefox/firefox:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
> Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verneed record
> Oct 16 17:25:02 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
> Missing chrome or resource URL:
> resource://gre/modules/UpdateListener.sys.mjs
> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
> /snap/firefox/1943/usr/lib/firefox/pingsender:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
> /snap/firefox/1943/usr/lib/firefox/pingsender:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
> /snap/firefox/1943/usr/lib/firefox/pingsender: error while loading
> shared libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version
> 0 of Verneed record
> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2764]:
> /snap/firefox/1943/usr/lib/firefox/firefox: symbol lookup error:
> /snap/firefox/1943/usr/lib/firefox/libmozsandbox.so: undefined symbol: ,
> version
> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 systemd[1795]:
> snap.firefox.firefox.41ac7adc-efbd-4c47-bccb-06711efb7a2e.scope:
> Consumed 6min 11.938s CPU time.
> Oct 16 17:26:22 marvin-IdeaPad-3-15ITL6 systemd[1795]: Started
> snap.firefox.firefox.8af6eb39-1ebc-4e81-a7c1-a6cc66e76eac.scope.
> Oct 16 17:26:23 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5573]:
> /bin/bash: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
> Verdef record
> Oct 16 17:26:23 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5573]:
> /bin/bash: error while loading shared libraries:
> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verneed record
>
> I have double-checked that this version of the function is in my build
> tree:
>
> static int squashfs_readahead_fragment(struct page **page,
>         unsigned int pages, unsigned int expected)
> {
>         struct inode *inode = page[0]->mapping->host;
>         struct squashfs_cache_entry *buffer =
> squashfs_get_fragment(inode->i_sb,
>                 squashfs_i(inode)->fragment_block,
>                 squashfs_i(inode)->fragment_size);
>         struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>         unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
>         int error = buffer->error;
>
>         if (error)
>                 goto out;
>
>         expected += squashfs_i(inode)->fragment_offset;
>
>         for (n = 0; n < pages; n++) {
>                 unsigned int base = (page[n]->index & mask) << PAGE_SHIFT;
>                 unsigned int offset = base +
> squashfs_i(inode)->fragment_offset;
>
>                 if (expected > offset) {
>                         unsigned int avail = min_t(unsigned int,
> expected -
>                                 offset, PAGE_SIZE);
>
>                         squashfs_fill_page(page[n], buffer, offset,
> avail);
>                 }
>
>                 unlock_page(page[n]);
>                 put_page(page[n]);
>         }
>
> out:
>         squashfs_cache_put(buffer);
>         return error;
> }
>
> The same set of Firefox windows and tabs loaded correctly once switching
> back to the 5.19.13 kernel image.
>
> The same applies for running the same setup with the home-built "Linux
> 5.19.15-20221014-conf051913ub x86_64": No Verdef nor Verneed error
> messages.
>
> Have a nice day, and good bug hunting!
>
> Regards,
> Mirsad
>
> --
> Mirsad Goran Todorovac
> Sistem inženjer
> Grafički fakultet | Akademija likovnih umjetnosti
> Sveučilište u Zagrebu

2022-10-16 20:37:40

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 16/10/2022 20:55, Phillip Lougher wrote:
>
> Tracking down bugs of this sort is always a process of elimination,
> and gathering information to pinpoint the exact circumstances of why
> it is triggering.
>
> Next step is to download the exact snap(s) where the problems are
> occurring, as this may provide some insights.
>
> I don't run Ubuntu, and I don't use snaps.  Can you provide the
> download link(s) of the snap(s) that cause problems?  If there's
> any firefox snaps that don't cause problems those will be useful
> too.
>

It appears there's a searchable "Snap Store" https://snapcraft.io,
but, it doesn't give any download links to the actual snaps which
is rather annoying.

Does anyone know how to get the download link? I have not run
Ubuntu for over 14 years and have absolutely no wish to do so
now either.

Thanks

Phillip

> You don't mention if there's any errors present in "dmesg" when
> this happens, and so I'm assuming there aren't any?
>
> Phillip
>
>
>> marvin@marvin-IdeaPad-3-15ITL6:~$ sudo tail -300f /var/log/syslog |
>> grep firefox
>> Oct 16 17:23:17 marvin-IdeaPad-3-15ITL6 systemd[1795]: Started
>> snap.firefox.firefox.41ac7adc-efbd-4c47-bccb-06711efb7a2e.scope.
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
>> update.go:85: cannot change mount namespace according to change mount
>> (/var/lib/snapd/hostfs/usr/share/cups/doc-root
>> /usr/share/cups/doc-root none bind,ro 0 0): cannot create directory
>> "/usr/share/cups/doc-root": permission denied
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
>> update.go:85: cannot change mount namespace according to change mount
>> (/var/lib/snapd/hostfs/usr/share/gimp/2.0/help
>> /usr/share/gimp/2.0/help none bind,ro 0 0): cannot create directory
>> "/usr/share/gimp/2.0": permission denied
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
>> update.go:85: cannot change mount namespace according to change mount
>> (/var/lib/snapd/hostfs/usr/share/libreoffice/help
>> /usr/share/libreoffice/help none bind,ro 0 0): cannot create directory
>> "/usr/share/libreoffice/help": permission denied
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2556]:
>> update.go:85: cannot change mount namespace according to change mount
>> (/var/lib/snapd/hostfs/usr/share/xubuntu-docs /usr/share/xubuntu-docs
>> none bind,ro 0 0): cannot open directory "/var/lib": permission denied
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.654344] audit:
>> type=1400 audit(1665933798.028:105): apparmor="DENIED"
>> operation="mkdir" profile="snap-update-ns.firefox"
>> name="/usr/share/cups/doc-root/" pid=2556 comm="5" requested_mask="c"
>> denied_mask="c" fsuid=0 ouid=0
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655141] audit:
>> type=1400 audit(1665933798.028:106): apparmor="DENIED"
>> operation="mkdir" profile="snap-update-ns.firefox"
>> name="/usr/share/gimp/2.0/" pid=2556 comm="5" requested_mask="c"
>> denied_mask="c" fsuid=0 ouid=0
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655528] audit:
>> type=1400 audit(1665933798.028:107): apparmor="DENIED"
>> operation="mkdir" profile="snap-update-ns.firefox"
>> name="/usr/share/libreoffice/help/" pid=2556 comm="5"
>> requested_mask="c" denied_mask="c" fsuid=0 ouid=0
>> Oct 16 17:23:18 marvin-IdeaPad-3-15ITL6 kernel: [   25.655671] audit:
>> type=1400 audit(1665933798.028:108): apparmor="DENIED"
>> operation="open" profile="snap-update-ns.firefox" name="/var/lib/"
>> pid=2556 comm="5" requested_mask="r" denied_mask="r" fsuid=0 ouid=0
>> Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 firefox[2527]: Failed to load
>> module "canberra-gtk-module"
>> Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 firefox[2527]: Failed to load
>> module "canberra-gtk-module"
>> Oct 16 17:23:19 marvin-IdeaPad-3-15ITL6 kernel: [   27.287722] audit:
>> type=1326 audit(1665933799.660:110): auid=1000 uid=1000 gid=1000 ses=3
>> subj=snap.firefox.firefox pid=2527 comm="firefox"
>> exe="/snap/firefox/1943/usr/lib/firefox/firefox" sig=0 arch=c000003e
>> syscall=314 compat=0 ip=0x7f54ec28673d code=0x50000
>> Oct 16 17:23:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
>> ATTENTION: default value of option mesa_glthread overridden by
>> environment.
>> Oct 16 17:23:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
>> message repeated 2 times: [ ATTENTION: default value of option
>> mesa_glthread overridden by environment.]
>> Oct 16 17:23:23 marvin-IdeaPad-3-15ITL6 snapd[817]:
>> storehelpers.go:748: cannot refresh: snap has no updates available:
>> "bare", "canonical-livepatch", "core", "core18", "core20", "firefox",
>> "gnome-3-34-1804", "gnome-3-38-2004", "gtk-common-themes", "slack",
>> "snap-store", "snapd", "zoom-client"
>> Oct 16 17:23:40 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
>> Missing chrome or resource URL:
>> resource://gre/modules/UpdateListener.sys.mjs
>> Oct 16 17:23:55 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
>> [2022-10-16T15:23:55Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
>> Oct 16 17:23:55 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
>> message repeated 3 times: [ [2022-10-16T15:23:55Z ERROR mp4parse]
>> Found 2 nul bytes in "\0\0"]
>> Oct 16 17:23:56 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
>> [2022-10-16T15:23:56Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
>> Oct 16 17:23:56 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
>> message repeated 3 times: [ [2022-10-16T15:23:56Z ERROR mp4parse]
>> Found 2 nul bytes in "\0\0"]
>> Oct 16 17:23:58 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
>> [2022-10-16T15:23:58Z ERROR mp4parse] Found 2 nul bytes in "\0\0"
>> Oct 16 17:23:58 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[3990]:
>> message repeated 3 times: [ [2022-10-16T15:23:58Z ERROR mp4parse]
>> Found 2 nul bytes in "\0\0"]
>> Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
>> XPCOMGlueLoad error for file
>> /snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
>> Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
>> /snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1: unsupported version 0 of Verneed record
>> Oct 16 17:24:14 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4942]:
>> Couldn't load XPCOM.
>> Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
>> XPCOMGlueLoad error for file
>> /snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
>> Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
>> /snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1: unsupported version 0 of Verneed record
>> Oct 16 17:24:19 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4995]:
>> Couldn't load XPCOM.
>> Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
>> message repeated 2 times: [
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
>> Oct 16 17:24:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5033]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
>> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verneed record
>> Oct 16 17:24:27 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[4480]:
>> Inconsistency detected by ld.so: ../elf/dl-runtime.c: 80: _dl_fixup:
>> Assertion `ELFW(R_TYPE)(reloc->r_info) == ELF_MACHINE_JMP_SLOT' failed!
>> Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
>> message repeated 2 times: [
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
>> Oct 16 17:24:31 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5163]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
>> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verneed record
>> Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
>> message repeated 2 times: [
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
>> Oct 16 17:24:36 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5205]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
>> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verneed record
>> Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
>> message repeated 2 times: [
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
>> Oct 16 17:24:42 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5372]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
>> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verneed record
>> Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
>> message repeated 2 times: [
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
>> Oct 16 17:24:45 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5396]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
>> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verneed record
>> Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
>> message repeated 2 times: [
>> /snap/firefox/1943/usr/lib/firefox/firefox:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record]
>> Oct 16 17:24:53 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5483]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: error while loading shared
>> libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verneed record
>> Oct 16 17:25:02 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2527]:
>> Missing chrome or resource URL:
>> resource://gre/modules/UpdateListener.sys.mjs
>> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
>> /snap/firefox/1943/usr/lib/firefox/pingsender:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
>> /snap/firefox/1943/usr/lib/firefox/pingsender:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verdef record
>> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5514]:
>> /snap/firefox/1943/usr/lib/firefox/pingsender: error while loading
>> shared libraries: /lib/x86_64-linux-gnu/libdl.so.2: unsupported
>> version 0 of Verneed record
>> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[2764]:
>> /snap/firefox/1943/usr/lib/firefox/firefox: symbol lookup error:
>> /snap/firefox/1943/usr/lib/firefox/libmozsandbox.so: undefined symbol:
>> , version
>> Oct 16 17:25:07 marvin-IdeaPad-3-15ITL6 systemd[1795]:
>> snap.firefox.firefox.41ac7adc-efbd-4c47-bccb-06711efb7a2e.scope:
>> Consumed 6min 11.938s CPU time.
>> Oct 16 17:26:22 marvin-IdeaPad-3-15ITL6 systemd[1795]: Started
>> snap.firefox.firefox.8af6eb39-1ebc-4e81-a7c1-a6cc66e76eac.scope.
>> Oct 16 17:26:23 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5573]:
>> /bin/bash: /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of
>> Verdef record
>> Oct 16 17:26:23 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5573]:
>> /bin/bash: error while loading shared libraries:
>> /lib/x86_64-linux-gnu/libdl.so.2: unsupported version 0 of Verneed record
>>
>> I have double-checked that this version of the function is in my build
>> tree:
>>
>> static int squashfs_readahead_fragment(struct page **page,
>>          unsigned int pages, unsigned int expected)
>> {
>>          struct inode *inode = page[0]->mapping->host;
>>          struct squashfs_cache_entry *buffer =
>> squashfs_get_fragment(inode->i_sb,
>>                  squashfs_i(inode)->fragment_block,
>>                  squashfs_i(inode)->fragment_size);
>>          struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>>          unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT))
>> - 1;
>>          int error = buffer->error;
>>
>>          if (error)
>>                  goto out;
>>
>>          expected += squashfs_i(inode)->fragment_offset;
>>
>>          for (n = 0; n < pages; n++) {
>>                  unsigned int base = (page[n]->index & mask) <<
>> PAGE_SHIFT;
>>                  unsigned int offset = base +
>> squashfs_i(inode)->fragment_offset;
>>
>>                  if (expected > offset) {
>>                          unsigned int avail = min_t(unsigned int,
>> expected -
>>                                  offset, PAGE_SIZE);
>>
>>                          squashfs_fill_page(page[n], buffer, offset,
>> avail);
>>                  }
>>
>>                  unlock_page(page[n]);
>>                  put_page(page[n]);
>>          }
>>
>> out:
>>          squashfs_cache_put(buffer);
>>          return error;
>> }
>>
>> The same set of Firefox windows and tabs loaded correctly once
>> switching back to the 5.19.13 kernel image.
>>
>> The same applies for running the same setup with the home-built "Linux
>> 5.19.15-20221014-conf051913ub x86_64": No Verdef nor Verneed error
>> messages.
>>
>> Have a nice day, and good bug hunting!
>>
>> Regards,
>> Mirsad
>>
>> --
>> Mirsad Goran Todorovac
>> Sistem inženjer
>> Grafički fakultet | Akademija likovnih umjetnosti
>> Sveučilište u Zagrebu
>

2022-10-17 02:16:37

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/17/22 03:19, Phillip Lougher wrote:
> On 16/10/2022 20:55, Phillip Lougher wrote:
>>
>> Tracking down bugs of this sort is always a process of elimination,
>> and gathering information to pinpoint the exact circumstances of why
>> it is triggering.
>>
>> Next step is to download the exact snap(s) where the problems are occurring, as this may provide some insights.
>>
>> I don't run Ubuntu, and I don't use snaps.  Can you provide the
>> download link(s) of the snap(s) that cause problems?  If there's
>> any firefox snaps that don't cause problems those will be useful
>> too.
>>
>
> It appears there's a searchable "Snap Store" https://snapcraft.io,
> but, it doesn't give any download links to the actual snaps which
> is rather annoying.
>
> Does anyone know how to get the download link?  I have not run
> Ubuntu for over 14 years and have absolutely no wish to do so
> now either.
>

You need to have snapd installed, unfortunately.

You can start the testing with hello-world snap, which can be installed
with `snap install hello-world`. Then run the snap with `snap run
hello-world`. After that, reboot and repeat until the regression
occurs.

You can also try other snaps. For GUI applications, try krita, gimp,
gitkraken, and libreoffice. For server applications, try spinning up
containers with lxd or store data with nextcloud.

My guess is that the problem is in snapd when trying to execute
snaps, hence all snaps are affected.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-17 02:46:43

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 17. 10. 2022. 04:03, Bagas Sanjaya wrote:

> On 10/17/22 03:19, Phillip Lougher wrote:
>> On 16/10/2022 20:55, Phillip Lougher wrote:
>>> Tracking down bugs of this sort is always a process of elimination,
>>> and gathering information to pinpoint the exact circumstances of why
>>> it is triggering.
>>>
>>> Next step is to download the exact snap(s) where the problems are occurring, as this may provide some insights.
>>>
>>> I don't run Ubuntu, and I don't use snaps.  Can you provide the
>>> download link(s) of the snap(s) that cause problems?  If there's
>>> any firefox snaps that don't cause problems those will be useful
>>> too.
>>>
>> It appears there's a searchable "Snap Store" https://snapcraft.io,
>> but, it doesn't give any download links to the actual snaps which
>> is rather annoying.
>>
>> Does anyone know how to get the download link?  I have not run
>> Ubuntu for over 14 years and have absolutely no wish to do so
>> now either.
> You need to have snapd installed, unfortunately.
>
> You can start the testing with hello-world snap, which can be installed
> with `snap install hello-world`. Then run the snap with `snap run
> hello-world`. After that, reboot and repeat until the regression
> occurs.
>
> You can also try other snaps. For GUI applications, try krita, gimp,
> gitkraken, and libreoffice. For server applications, try spinning up
> containers with lxd or store data with nextcloud.
>
> My guess is that the problem is in snapd when trying to execute
> snaps, hence all snaps are affected.

I can only report that bug persisted in 6.1-rc1 with the latest
Phillip's squashfs patch.

Have a nice day.

Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-10-17 04:19:02

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/17/22 09:41, Mirsad Goran Todorovac wrote:
> I can only report that bug persisted in 6.1-rc1 with the latest Phillip's squashfs patch.
>

So the regression isn't solved yet with the patch, right?

--
An old man doll... just what I always wanted! - Clara

2022-10-17 08:35:55

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 17.10.2022. 6:15, Bagas Sanjaya wrote:

> On 10/17/22 09:41, Mirsad Goran Todorovac wrote:
>> I can only report that bug persisted in 6.1-rc1 with the latest Phillip's squashfs patch.
> So the regression isn't solved yet with the patch, right?

Apparently not, or not yet. The patch obviously solves some problems,
though.

The pre-regression fs/squashfs/file.c has completely different version
of the squash readahead code than
this introduced with the commit b09a7a036d2035b14636cd4c4c69518d73770f65.

I could lament on this, but I am not clever enough to see what is wrong. :(

Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

2022-10-17 10:03:58

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/16/22 19:21, Bagas Sanjaya wrote:
>
> No Verneed warnings so far. However, I need to test for a longer time
> (a day) to check if any warnings are reported.
>

The regression still occurs with this patch applied. Let's see if
reverting the offending commit helps.

--
An old man doll... just what I always wanted! - Clara

2022-10-17 13:19:30

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/17/22 16:45, Bagas Sanjaya wrote:
> On 10/16/22 19:21, Bagas Sanjaya wrote:
>>
>> No Verneed warnings so far. However, I need to test for a longer time
>> (a day) to check if any warnings are reported.
>>
>
> The regression still occurs with this patch applied. Let's see if
> reverting the offending commit helps.
>
Sorry for speaking late about my triggering cause.

As the background, on my Debian 11 computer, I have lxd snap installed.
I use lxd to channel my sysadmin side by spinning up containers and
installing server applications there (LAMP, email, DNS).

Back to triggering cause. I can trigger the regression when the network
connection drops (when Firefox shows connection problem page). Then
I run `lxd list` to list container instances and I get the Verneed
regression error.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-17 13:29:55

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7


On 16.10.2022. 21:55, Phillip Lougher wrote:
> On 16/10/2022 16:55, Mirsad Goran Todorovac wrote:
>> On 15. 10. 2022. 22:59, Phillip Lougher wrote:
>
> Tracking down bugs of this sort is always a process of elimination,
> and gathering information to pinpoint the exact circumstances of why
> it is triggering.
>
> Next step is to download the exact snap(s) where the problems are
> occurring, as this may provide some insights.
>
> I don't run Ubuntu, and I don't use snaps.  Can you provide the
> download link(s) of the snap(s) that cause problems?  If there's
> any firefox snaps that don't cause problems those will be useful
> too.
>
> You don't mention if there's any errors present in "dmesg" when
> this happens, and so I'm assuming there aren't any?
>
> Phillip

Snaps are originated in Ubuntu, and they heavily rely on snapd, and
squashfs for decompressing executables.
Naturally, this also affects Linux Mint as Ubuntu's fork.

Those two comprise a large share of the Linux distros, and Firefox is
exclusively distributed in snaps.

There is no way to install Firefox without snapd and squashfs, so the
distros are lobotomised from the point of
the Firefox users.

You should basically do what is described here:
https://snapcraft.io/install/firefox/centos

and probably obvious to you: https://forums.centos.org/viewtopic.php?t=71485

then it's easy:

# snap install firefox

it will install from the latest release.

Thank you.

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

2022-10-17 14:24:45

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 17/10/2022 14:22, Mirsad Goran Todorovac wrote:
>
> On 16.10.2022. 21:55, Phillip Lougher wrote:
>> On 16/10/2022 16:55, Mirsad Goran Todorovac wrote:
>>> On 15. 10. 2022. 22:59, Phillip Lougher wrote:
>>
>> Tracking down bugs of this sort is always a process of elimination,
>> and gathering information to pinpoint the exact circumstances of why
>> it is triggering.
>>
>> Next step is to download the exact snap(s) where the problems are
>> occurring, as this may provide some insights.
>>
>> I don't run Ubuntu, and I don't use snaps.  Can you provide the
>> download link(s) of the snap(s) that cause problems?  If there's
>> any firefox snaps that don't cause problems those will be useful
>> too.
>>
>> You don't mention if there's any errors present in "dmesg" when
>> this happens, and so I'm assuming there aren't any?
>>
>> Phillip
>
> Snaps are originated in Ubuntu, and they heavily rely on snapd, and
> squashfs for decompressing executables.
> Naturally, this also affects Linux Mint as Ubuntu's fork.
>
> Those two comprise a large share of the Linux distros, and Firefox is
> exclusively distributed in snaps.

I know you meant well, but, don't talk to me as if I'm the village
idiot.

I said I don't run snaps or Ubuntu, which is nothing to do with lack of
understanding (*)

I'll contact a couple of people in Canonical to obtain what I need.

(*) I worked as a kernel maintainer for Canonical (Ubuntu) from 2007/8
and I worked as a kernel maintainer for Redhat on RHEL (which Centos
is a free version) from 2011 - 2019. This is addition to writing and
maintaining Squashfs in the upstream kernel.


2022-10-17 17:43:39

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On Mon, Oct 17, 2022 at 1:37 PM Bagas Sanjaya <[email protected]> wrote:
>
> On 10/17/22 16:45, Bagas Sanjaya wrote:
> > On 10/16/22 19:21, Bagas Sanjaya wrote:
> >>
> >> No Verneed warnings so far. However, I need to test for a longer time
> >> (a day) to check if any warnings are reported.
> >>
> >
> > The regression still occurs with this patch applied. Let's see if
> > reverting the offending commit helps.
> >
> Sorry for speaking late about my triggering cause.
>
> As the background, on my Debian 11 computer, I have lxd snap installed.
> I use lxd to channel my sysadmin side by spinning up containers and
> installing server applications there (LAMP, email, DNS).
>
> Back to triggering cause. I can trigger the regression when the network
> connection drops (when Firefox shows connection problem page). Then
> I run `lxd list` to list container instances and I get the Verneed
> regression error.
>

This appears to be both deterministic, and appears to show something
happening in the Firefox snap triggers an error in the Lxd snap.

This will then not be a Squashfs issue. Firefox running in one
Squashfs filesystem
can't trigger a failure in another Squashfs filesystem. This has to be caused
by something else.

Thanks

Phillip

2022-10-18 02:09:00

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/18/22 00:25, Phillip Lougher wrote:
>> Sorry for speaking late about my triggering cause.
>>
>> As the background, on my Debian 11 computer, I have lxd snap installed.
>> I use lxd to channel my sysadmin side by spinning up containers and
>> installing server applications there (LAMP, email, DNS).
>>
>> Back to triggering cause. I can trigger the regression when the network
>> connection drops (when Firefox shows connection problem page). Then
>> I run `lxd list` to list container instances and I get the Verneed
>> regression error.
>>
>
> This appears to be both deterministic, and appears to show something
> happening in the Firefox snap triggers an error in the Lxd snap.
>
> This will then not be a Squashfs issue. Firefox running in one
> Squashfs filesystem
> can't trigger a failure in another Squashfs filesystem. This has to be caused
> by something else.
>

Hi Phillip,

My Firefox runs from upstream tarball (not via snap), so the problem may lie
on snapd itself.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-18 02:27:43

by Jintao Yin

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
> Thorsten Leemhuis <[email protected]> wrote:
> >
> >Topposting, to make this easier to access for everyone.
> >
> >@Mirsad, thx for bisecting.
> >
> >@Phillip: if you want to see a problem description and the whole
> >backstory of the problem that apparently is caused by b09a7a036d20
> >("squashfs: support reading fragments in readahead call"), see this
> >thread (Mirsad sadly started a new one with the quoted mail below):
> >https://lore.kernel.org/all/[email protected]/
> >
>
> The above backstory tends to suggest data corruption which is happening
> after a couple of hours especially on heavy loads, e.g. the comment
>
> > On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
> >The bug usually isn't showing immediately, but after a couple of hours
> >of running (especially with multimedia running inside Firefox).
>
> Which is typically caused by double freed buffers or race conditions in
> freeing and reusing.
>
> Thanks Mirsad for the following
>
> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
> >
> >Here are the results of the requested bisect on the bug involving the
> >Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
> >both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
> >
> >mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
> >git bisect start
> ># bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
> >git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
> ># good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
> >git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
> ># good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
> >git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
> ># good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
> >'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
> >git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
> ># good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
> >'mm-stable-2022-08-03' of
> >git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> >git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
> ># bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
> >'mm-nonmm-stable-2022-08-06-2' of
> >git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> >git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
> ># good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
> >Add quirk for HP Spectre x360 15-eb0xxx
> >git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
> ># good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
> >'dma-mapping-5.20-2022-08-06' of
> >git://git.infradead.org/users/hch/dma-mapping
> >git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
> ># good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
> >kexec build error
> >git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
> ># good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
> >'s390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
> >git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
> ># good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
> >build "file direct" version of page actor
> >git bisect good db98b43086275350294f5c6f797249b714d6316d
> ># good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
> >Check the size of screen before memset_io()
> >git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
> ># good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
> >'for-5.20/fbdev-1' of
> >git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
> >git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
> ># bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
> >useless functions
> >git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
> ># bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
> >address space of proc_dohung_task_timeout_secs
> >git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
> ># bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
> >reading fragments in readahead call
> >git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
> >mtodorov@domac:~/linux/kernel/linux_stable$
> >
> >The git bisect stopped at the squashfs commit
> >b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
> >according to the Code of Conduct.
>
> Which identified the "squashfs: support reading fragments in readahead call"
> patch.
>
> There is a race-condition introduced in that patch, which involves cache
> releasing and reuse.
>
> The following diff will fix that race-condition. It would be great if
> someone could test and verify before sending it out as a patch.
>
> Thanks
>
> Phillip
Hi Phillip,
There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.

In function squashfs_readahead(...),
file_end is initialized with i_size_read(inode) >> msblk->block_log,
which means the last block index of the file.
But later in the logic to check if the page is last one or not the
code is
if (pages[nr_pages - 1]->index == file_end && bytes) {
...
}
, use file_end as the last page index of file but actually is the last
block index, so for the common setup of page and block size, the first
comparison is true only when pages[nr_pages - 1]->index is 0.
Otherwise, the trailing bytes of last page won't be zeroed.

Maybe it's the real cause of the snap bug in some way.

Thanks,

Jintao
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..6cc23178e9ad 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int error = buffer->error;
>
> - if (buffer->error)
> + if (error)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return error;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
>

2022-10-18 06:17:08

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/17/22 15:59, Phillip Lougher wrote:

> On 17/10/2022 14:22, Mirsad Goran Todorovac wrote:
>>
>> On 16.10.2022. 21:55, Phillip Lougher wrote:
>>> On 16/10/2022 16:55, Mirsad Goran Todorovac wrote:
>>>> On 15. 10. 2022. 22:59, Phillip Lougher wrote:
>>>
>>> Tracking down bugs of this sort is always a process of elimination,
>>> and gathering information to pinpoint the exact circumstances of why
>>> it is triggering.
>>>
>>> Next step is to download the exact snap(s) where the problems are
>>> occurring, as this may provide some insights.
>>>
>>> I don't run Ubuntu, and I don't use snaps.  Can you provide the
>>> download link(s) of the snap(s) that cause problems?  If there's
>>> any firefox snaps that don't cause problems those will be useful
>>> too.
>>>
>>> You don't mention if there's any errors present in "dmesg" when
>>> this happens, and so I'm assuming there aren't any?
>>>
>>> Phillip
>>
>> Snaps are originated in Ubuntu, and they heavily rely on snapd, and
>> squashfs for decompressing executables.
>> Naturally, this also affects Linux Mint as Ubuntu's fork.
>>
>> Those two comprise a large share of the Linux distros, and Firefox is
>> exclusively distributed in snaps.
>
> I know you meant well, but, don't talk to me as if I'm the village
> idiot.
>
> I said I don't run snaps or Ubuntu, which is nothing to do with lack of
> understanding (*)
>
> I'll contact a couple of people in Canonical to obtain what I need.
>
> (*) I worked as a kernel maintainer for Canonical (Ubuntu) from 2007/8
> and I worked as a kernel maintainer for Redhat on RHEL (which Centos
> is a free version) from 2011 - 2019.  This is addition to writing and
> maintaining Squashfs in the upstream kernel.

Dear Phillip,

Thousand apologies. I had no intent to offend you.
The village idiot is actually meant to be me :)

I've already read on Wikipedia about all the uses of SquashFS, and I do
not underestimate
your work by no means.

I apologise again if I sounded like that. I don't feel superior or
gloating by bisecting
the code to your commit. By no means.

Still I somehow managed to underestimate your contribution to The Cause,
and I am
new to this community, so please forgive me this error. I tried to sound
diplomatic
rather than lecturing.

Obviously, I failed :(

Mirsad

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

2022-10-18 07:09:41

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/18/22 04:15, Jintao Yin wrote:

> On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
>> Thorsten Leemhuis <[email protected]> wrote:
>>> Topposting, to make this easier to access for everyone.
>>>
>>> @Mirsad, thx for bisecting.
>>>
>>> @Phillip: if you want to see a problem description and the whole
>>> backstory of the problem that apparently is caused by b09a7a036d20
>>> ("squashfs: support reading fragments in readahead call"), see this
>>> thread (Mirsad sadly started a new one with the quoted mail below):
>>> https://lore.kernel.org/all/[email protected]/
>>>
>> The above backstory tends to suggest data corruption which is happening
>> after a couple of hours especially on heavy loads, e.g. the comment
>>
>>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>>> The bug usually isn't showing immediately, but after a couple of hours
>>> of running (especially with multimedia running inside Firefox).
>> Which is typically caused by double freed buffers or race conditions in
>> freeing and reusing.
>>
>> Thanks Mirsad for the following
>>
>> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>>> Here are the results of the requested bisect on the bug involving the
>>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>>
>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>> git bisect start
>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>>> 'mm-stable-2022-08-03' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>>> 'mm-nonmm-stable-2022-08-06-2' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>>> Add quirk for HP Spectre x360 15-eb0xxx
>>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>>> 'dma-mapping-5.20-2022-08-06' of
>>> git://git.infradead.org/users/hch/dma-mapping
>>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>>> kexec build error
>>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>>> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>>> build "file direct" version of page actor
>>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>>> Check the size of screen before memset_io()
>>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>>> 'for-5.20/fbdev-1' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>>> useless functions
>>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>>> address space of proc_dohung_task_timeout_secs
>>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
>>> reading fragments in readahead call
>>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>>> mtodorov@domac:~/linux/kernel/linux_stable$
>>>
>>> The git bisect stopped at the squashfs commit
>>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>>> according to the Code of Conduct.
>> Which identified the "squashfs: support reading fragments in readahead call"
>> patch.
>>
>> There is a race-condition introduced in that patch, which involves cache
>> releasing and reuse.
>>
>> The following diff will fix that race-condition. It would be great if
>> someone could test and verify before sending it out as a patch.
>>
>> Thanks
>>
>> Phillip
> Hi Phillip,
> There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
> which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
>
> In function squashfs_readahead(...),
> file_end is initialized with i_size_read(inode) >> msblk->block_log,
> which means the last block index of the file.
> But later in the logic to check if the page is last one or not the
> code is
> if (pages[nr_pages - 1]->index == file_end && bytes) {
> ...
> }
> , use file_end as the last page index of file but actually is the last
> block index, so for the common setup of page and block size, the first
> comparison is true only when pages[nr_pages - 1]->index is 0.
> Otherwise, the trailing bytes of last page won't be zeroed.
>
> Maybe it's the real cause of the snap bug in some way.
>
> Thanks,
>
> Jintao

Dear Jintao,

Forgive me for noticing that this is no longer Phillip's code, so I took the
freedom as the original submitter of the bug to include Hsin-Yi into the Cc:
list, so he'd be informed about the error in his code.

Phillip:
I usually stop myself at bisecting bugs, and not people, so if you think that
I was demeaning your professional contribution, I will pull a halt on this and
pull out of this thread.

I am more like weeks than decades long in Linux kernel study, so I realise I
haven't earned the right to give you lessons, and if I sounded like that, I
apologise again.

Owing to the Author of my story, it is more important for me to win hearts for
my Saviour than points in bug hunting.

Thank you.

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

2022-10-18 07:43:53

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/18/22 09:15, Jintao Yin wrote:
> Hi Phillip,
> There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
> which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
>
> In function squashfs_readahead(...),
> file_end is initialized with i_size_read(inode) >> msblk->block_log,
> which means the last block index of the file.
> But later in the logic to check if the page is last one or not the
> code is
> if (pages[nr_pages - 1]->index == file_end && bytes) {
> ...
> }
> , use file_end as the last page index of file but actually is the last
> block index, so for the common setup of page and block size, the first
> comparison is true only when pages[nr_pages - 1]->index is 0.
> Otherwise, the trailing bytes of last page won't be zeroed.
>
> Maybe it's the real cause of the snap bug in some way.
>

Hi Jintao, thanks for explaining the real culprit. However, I'd like to
see the fixup patch from you so that we can test.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-18 09:02:43

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/18/22 14:23, Bagas Sanjaya wrote:
> On 10/18/22 09:15, Jintao Yin wrote:
>> Hi Phillip,
>> There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
>> which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
>>
>> In function squashfs_readahead(...),
>> file_end is initialized with i_size_read(inode) >> msblk->block_log,
>> which means the last block index of the file.
>> But later in the logic to check if the page is last one or not the
>> code is
>> if (pages[nr_pages - 1]->index == file_end && bytes) {
>> ...
>> }
>> , use file_end as the last page index of file but actually is the last
>> block index, so for the common setup of page and block size, the first
>> comparison is true only when pages[nr_pages - 1]->index is 0.
>> Otherwise, the trailing bytes of last page won't be zeroed.
>>
>> Maybe it's the real cause of the snap bug in some way.
>>
>
> Hi Jintao, thanks for explaining the real culprit. However, I'd like to
> see the fixup patch from you so that we can test.
>

Sorry for inconvenience. Hsin-Yi Wang have already submitted the candidate
fix at [1]. Thanks anyway.

[1]: https://lore.kernel.org/lkml/CAJMQK-hgQEkhgpO9VFOCgn-cKtVsr7Hb_58pAYiGoDi5SzGZtA@mail.gmail.com/

--
An old man doll... just what I always wanted! - Clara

2022-10-18 09:04:13

by Hsin-Yi Wang

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On Tue, Oct 18, 2022 at 2:52 PM Mirsad Todorovac
<[email protected]> wrote:
>
> On 10/18/22 04:15, Jintao Yin wrote:
>
> > On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
> >> Thorsten Leemhuis <[email protected]> wrote:
> >>> Topposting, to make this easier to access for everyone.
> >>>
> >>> @Mirsad, thx for bisecting.
> >>>
> >>> @Phillip: if you want to see a problem description and the whole
> >>> backstory of the problem that apparently is caused by b09a7a036d20
> >>> ("squashfs: support reading fragments in readahead call"), see this
> >>> thread (Mirsad sadly started a new one with the quoted mail below):
> >>> https://lore.kernel.org/all/[email protected]/
> >>>
> >> The above backstory tends to suggest data corruption which is happening
> >> after a couple of hours especially on heavy loads, e.g. the comment
> >>
> >>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
> >>> The bug usually isn't showing immediately, but after a couple of hours
> >>> of running (especially with multimedia running inside Firefox).
> >> Which is typically caused by double freed buffers or race conditions in
> >> freeing and reusing.
> >>
> >> Thanks Mirsad for the following
> >>
> >> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
> >>> Here are the results of the requested bisect on the bug involving the
> >>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
> >>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
> >>>
> >>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
> >>> git bisect start
> >>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
> >>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
> >>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
> >>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
> >>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
> >>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
> >>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
> >>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
> >>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
> >>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
> >>> 'mm-stable-2022-08-03' of
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> >>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
> >>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
> >>> 'mm-nonmm-stable-2022-08-06-2' of
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
> >>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
> >>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
> >>> Add quirk for HP Spectre x360 15-eb0xxx
> >>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
> >>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
> >>> 'dma-mapping-5.20-2022-08-06' of
> >>> git://git.infradead.org/users/hch/dma-mapping
> >>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
> >>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
> >>> kexec build error
> >>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
> >>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
> >>> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
> >>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
> >>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
> >>> build "file direct" version of page actor
> >>> git bisect good db98b43086275350294f5c6f797249b714d6316d
> >>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
> >>> Check the size of screen before memset_io()
> >>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
> >>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
> >>> 'for-5.20/fbdev-1' of
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
> >>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
> >>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
> >>> useless functions
> >>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
> >>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
> >>> address space of proc_dohung_task_timeout_secs
> >>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
> >>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
> >>> reading fragments in readahead call
> >>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
> >>> mtodorov@domac:~/linux/kernel/linux_stable$
> >>>
> >>> The git bisect stopped at the squashfs commit
> >>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
> >>> according to the Code of Conduct.
> >> Which identified the "squashfs: support reading fragments in readahead call"
> >> patch.
> >>
> >> There is a race-condition introduced in that patch, which involves cache
> >> releasing and reuse.
> >>
> >> The following diff will fix that race-condition. It would be great if
> >> someone could test and verify before sending it out as a patch.
> >>
> >> Thanks
> >>
> >> Phillip
> > Hi Phillip,
> > There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
> > which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
> >
> > In function squashfs_readahead(...),
> > file_end is initialized with i_size_read(inode) >> msblk->block_log,
> > which means the last block index of the file.
> > But later in the logic to check if the page is last one or not the
> > code is
> > if (pages[nr_pages - 1]->index == file_end && bytes) {
> > ...
> > }
> > , use file_end as the last page index of file but actually is the last
> > block index, so for the common setup of page and block size, the first
> > comparison is true only when pages[nr_pages - 1]->index is 0.
> > Otherwise, the trailing bytes of last page won't be zeroed.
> >
> > Maybe it's the real cause of the snap bug in some way.
> >
Hi Jintao,

Thanks for pointing out and sorry for missing this. Does the following
diff improve the issue?

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index e56510964b229..7759bd70dfbf2 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -600,7 +600,7 @@ static void squashfs_readahead(struct
readahead_control *ractl)

/* Last page (if present) may have trailing
bytes not filled */
bytes = res % PAGE_SIZE;
- if (pages[nr_pages - 1]->index == file_end && bytes)
+ if ((pages[nr_pages - 1]->index >> shift) ==
file_end && bytes)
memzero_page(pages[nr_pages - 1], bytes,
PAGE_SIZE - bytes);


readahead only handles the case that the first page and the last page
have the same block index:
index = pages[0]->index >> shift;
if ((pages[nr_pages - 1]->index >> shift) != index)
goto skip_pages;

The diff above makes a difference to SQUASHFS_INVALID_BLK case, which
will not be handled by squashfs_readahead_fragment() if
index==file_end.
With the above diff, it will now be memzero_page().

Hi Phillip,
Does the SQUASHFS_INVALID_BLK case handling with index==file_end
sounds reasonable to you?

Thanks.

> > Thanks,
> >
> > Jintao
>
> Dear Jintao,
>
> Forgive me for noticing that this is no longer Phillip's code, so I took the
> freedom as the original submitter of the bug to include Hsin-Yi into the Cc:
> list, so he'd be informed about the error in his code.
>
> Phillip:
> I usually stop myself at bisecting bugs, and not people, so if you think that
> I was demeaning your professional contribution, I will pull a halt on this and
> pull out of this thread.
>
> I am more like weeks than decades long in Linux kernel study, so I realise I
> haven't earned the right to give you lessons, and if I sounded like that, I
> apologise again.
>
> Owing to the Author of my story, it is more important for me to win hearts for
> my Saviour than points in bug hunting.
>
> Thank you.
>
> --
> Mirsad Goran Todorovac
> Sistem inženjer
> Grafički fakultet | Akademija likovnih umjetnosti
> Sveučilište u Zagrebu
> --
> System engineer
> Faculty of Graphic Arts | Academy of Fine Arts
> University of Zagreb, Republic of Croatia
>

2022-10-18 09:56:28

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/17/22 16:45, Bagas Sanjaya wrote:
> On 10/16/22 19:21, Bagas Sanjaya wrote:
>>
>> No Verneed warnings so far. However, I need to test for a longer time
>> (a day) to check if any warnings are reported.
>>
>
> The regression still occurs with this patch applied. Let's see if
> reverting the offending commit helps.
>

No regressions with the offending commit reverted. Now I have to
try the proposed fix ([1]).

[1]: https://lore.kernel.org/lkml/CAJMQK-hgQEkhgpO9VFOCgn-cKtVsr7Hb_58pAYiGoDi5SzGZtA@mail.gmail.com/

--
An old man doll... just what I always wanted! - Clara

2022-10-18 10:10:55

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/18/22 10:24, Hsin-Yi Wang wrote:

> On Tue, Oct 18, 2022 at 2:52 PM Mirsad Todorovac
> <[email protected]> wrote:
>> On 10/18/22 04:15, Jintao Yin wrote:
>>
>>> On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
>>>> Thorsten Leemhuis <[email protected]> wrote:
>>>>> Topposting, to make this easier to access for everyone.
>>>>>
>>>>> @Mirsad, thx for bisecting.
>>>>>
>>>>> @Phillip: if you want to see a problem description and the whole
>>>>> backstory of the problem that apparently is caused by b09a7a036d20
>>>>> ("squashfs: support reading fragments in readahead call"), see this
>>>>> thread (Mirsad sadly started a new one with the quoted mail below):
>>>>> https://lore.kernel.org/all/[email protected]/
>>>>>
>>>> The above backstory tends to suggest data corruption which is happening
>>>> after a couple of hours especially on heavy loads, e.g. the comment
>>>>
>>>>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>>>>> The bug usually isn't showing immediately, but after a couple of hours
>>>>> of running (especially with multimedia running inside Firefox).
>>>> Which is typically caused by double freed buffers or race conditions in
>>>> freeing and reusing.
>>>>
>>>> Thanks Mirsad for the following
>>>>
>>>> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>>>>> Here are the results of the requested bisect on the bug involving the
>>>>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>>>>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>>>>
>>>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>>>> git bisect start
>>>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>>>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>>>>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>>>>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>>>>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>>>>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>>>>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>>>>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>>>>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>>>>> 'mm-stable-2022-08-03' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>>>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>>>>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>>>>> 'mm-nonmm-stable-2022-08-06-2' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>>>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>>>>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>>>>> Add quirk for HP Spectre x360 15-eb0xxx
>>>>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>>>>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>>>>> 'dma-mapping-5.20-2022-08-06' of
>>>>> git://git.infradead.org/users/hch/dma-mapping
>>>>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>>>>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>>>>> kexec build error
>>>>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>>>>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>>>>> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>>>>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>>>>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>>>>> build "file direct" version of page actor
>>>>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>>>>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>>>>> Check the size of screen before memset_io()
>>>>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>>>>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>>>>> 'for-5.20/fbdev-1' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>>>>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>>>>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>>>>> useless functions
>>>>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>>>>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>>>>> address space of proc_dohung_task_timeout_secs
>>>>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>>>>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
>>>>> reading fragments in readahead call
>>>>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>>>>> mtodorov@domac:~/linux/kernel/linux_stable$
>>>>>
>>>>> The git bisect stopped at the squashfs commit
>>>>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>>>>> according to the Code of Conduct.
>>>> Which identified the "squashfs: support reading fragments in readahead call"
>>>> patch.
>>>>
>>>> There is a race-condition introduced in that patch, which involves cache
>>>> releasing and reuse.
>>>>
>>>> The following diff will fix that race-condition. It would be great if
>>>> someone could test and verify before sending it out as a patch.
>>>>
>>>> Thanks
>>>>
>>>> Phillip
>>> Hi Phillip,
>>> There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
>>> which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
>>>
>>> In function squashfs_readahead(...),
>>> file_end is initialized with i_size_read(inode) >> msblk->block_log,
>>> which means the last block index of the file.
>>> But later in the logic to check if the page is last one or not the
>>> code is
>>> if (pages[nr_pages - 1]->index == file_end && bytes) {
>>> ...
>>> }
>>> , use file_end as the last page index of file but actually is the last
>>> block index, so for the common setup of page and block size, the first
>>> comparison is true only when pages[nr_pages - 1]->index is 0.
>>> Otherwise, the trailing bytes of last page won't be zeroed.
>>>
>>> Maybe it's the real cause of the snap bug in some way.
>>>
> Hi Jintao,
>
> Thanks for pointing out and sorry for missing this. Does the following
> diff improve the issue?
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b229..7759bd70dfbf2 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -600,7 +600,7 @@ static void squashfs_readahead(struct
> readahead_control *ractl)
>
> /* Last page (if present) may have trailing
> bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> + if ((pages[nr_pages - 1]->index >> shift) ==
> file_end && bytes)
> memzero_page(pages[nr_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
>
> readahead only handles the case that the first page and the last page
> have the same block index:
> index = pages[0]->index >> shift;
> if ((pages[nr_pages - 1]->index >> shift) != index)
> goto skip_pages;
>
> The diff above makes a difference to SQUASHFS_INVALID_BLK case, which
> will not be handled by squashfs_readahead_fragment() if
> index==file_end.
> With the above diff, it will now be memzero_page().
>
> Hi Phillip,
> Does the SQUASHFS_INVALID_BLK case handling with index==file_end
> sounds reasonable to you?
>
> Thanks.
Hi Hsin-Yi,

The patches were successfully applied (both yours and Phillip's) on v6.0.2 source
and the build is already in progress.

However, due to the stochastic nature of the bug, it may take time before saying
anything for sure.

I will test at around 6 PM UTC+02 on the machine with the same set of windows and
tabs in Firefox that triggered the bug so we'll know then if that case at least is
fixed.

Thanks.

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

2022-10-18 14:10:19

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On Tue, Oct 18, 2022 at 04:24:46PM +0800, Hsin-Yi Wang wrote:
> Hi Jintao,
>
> Thanks for pointing out and sorry for missing this. Does the following
> diff improve the issue?
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b229..7759bd70dfbf2 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -600,7 +600,7 @@ static void squashfs_readahead(struct
> readahead_control *ractl)
>
> /* Last page (if present) may have trailing
> bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> + if ((pages[nr_pages - 1]->index >> shift) ==
> file_end && bytes)
> memzero_page(pages[nr_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
>
> readahead only handles the case that the first page and the last page
> have the same block index:
> index = pages[0]->index >> shift;
> if ((pages[nr_pages - 1]->index >> shift) != index)
> goto skip_pages;
>
> The diff above makes a difference to SQUASHFS_INVALID_BLK case, which
> will not be handled by squashfs_readahead_fragment() if
> index==file_end.
> With the above diff, it will now be memzero_page().

Hi Hsin-Yi Wang, thanks for the proposed diff. However, it was wrapped,
so I have to manually type the changes.

I compiled 6.1-rc1 with your diff applied. For testing, I have both
hello-world and lxd snaps installed. No problems on running the former.
On the latter, I got coredump when trying to start lxd services with
`snap start lxd`. The coredump for lxd processes are attached.

From above, I think b09a7a036d2035 ("squashfs: support reading fragments in readahead call") should be reverted until we come up with proper solution.

Thanks.

--
An old man doll... just what I always wanted! - Clara


Attachments:
(No filename) (0.00 B)
signature.asc (235.00 B)
Download all attachments

2022-10-18 14:10:53

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 18/10/2022 09:24, Hsin-Yi Wang wrote:
> On Tue, Oct 18, 2022 at 2:52 PM Mirsad Todorovac
> <[email protected]> wrote:
>>
>> On 10/18/22 04:15, Jintao Yin wrote:
>>
>>> On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
>>>> Thorsten Leemhuis <[email protected]> wrote:
>>>>> Topposting, to make this easier to access for everyone.
>>>>>
>>>>> @Mirsad, thx for bisecting.
>>>>>
>>>>> @Phillip: if you want to see a problem description and the whole
>>>>> backstory of the problem that apparently is caused by b09a7a036d20
>>>>> ("squashfs: support reading fragments in readahead call"), see this
>>>>> thread (Mirsad sadly started a new one with the quoted mail below):
>>>>> https://lore.kernel.org/all/[email protected]/
>>>>>
>>>> The above backstory tends to suggest data corruption which is happening
>>>> after a couple of hours especially on heavy loads, e.g. the comment
>>>>
>>>>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>>>>> The bug usually isn't showing immediately, but after a couple of hours
>>>>> of running (especially with multimedia running inside Firefox).
>>>> Which is typically caused by double freed buffers or race conditions in
>>>> freeing and reusing.
>>>>
>>>> Thanks Mirsad for the following
>>>>
>>>> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>>>>> Here are the results of the requested bisect on the bug involving the
>>>>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>>>>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>>>>
>>>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>>>> git bisect start
>>>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>>>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>>>>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>>>>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>>>>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>>>>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>>>>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>>>>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>>>>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>>>>> 'mm-stable-2022-08-03' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>>>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>>>>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>>>>> 'mm-nonmm-stable-2022-08-06-2' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>>>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>>>>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>>>>> Add quirk for HP Spectre x360 15-eb0xxx
>>>>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>>>>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>>>>> 'dma-mapping-5.20-2022-08-06' of
>>>>> git://git.infradead.org/users/hch/dma-mapping
>>>>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>>>>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>>>>> kexec build error
>>>>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>>>>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>>>>> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>>>>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>>>>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>>>>> build "file direct" version of page actor
>>>>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>>>>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>>>>> Check the size of screen before memset_io()
>>>>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>>>>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>>>>> 'for-5.20/fbdev-1' of
>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>>>>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>>>>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>>>>> useless functions
>>>>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>>>>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>>>>> address space of proc_dohung_task_timeout_secs
>>>>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>>>>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
>>>>> reading fragments in readahead call
>>>>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>>>>> mtodorov@domac:~/linux/kernel/linux_stable$
>>>>>
>>>>> The git bisect stopped at the squashfs commit
>>>>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>>>>> according to the Code of Conduct.
>>>> Which identified the "squashfs: support reading fragments in readahead call"
>>>> patch.
>>>>
>>>> There is a race-condition introduced in that patch, which involves cache
>>>> releasing and reuse.
>>>>
>>>> The following diff will fix that race-condition. It would be great if
>>>> someone could test and verify before sending it out as a patch.
>>>>
>>>> Thanks
>>>>
>>>> Phillip
>>> Hi Phillip,
>>> There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
>>> which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
>>>
>>> In function squashfs_readahead(...),
>>> file_end is initialized with i_size_read(inode) >> msblk->block_log,
>>> which means the last block index of the file.
>>> But later in the logic to check if the page is last one or not the
>>> code is
>>> if (pages[nr_pages - 1]->index == file_end && bytes) {
>>> ...
>>> }
>>> , use file_end as the last page index of file but actually is the last
>>> block index, so for the common setup of page and block size, the first
>>> comparison is true only when pages[nr_pages - 1]->index is 0.
>>> Otherwise, the trailing bytes of last page won't be zeroed.
>>>
>>> Maybe it's the real cause of the snap bug in some way.
>>>
> Hi Jintao,
>
> Thanks for pointing out and sorry for missing this. Does the following
> diff improve the issue?
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b229..7759bd70dfbf2 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -600,7 +600,7 @@ static void squashfs_readahead(struct
> readahead_control *ractl)
>
> /* Last page (if present) may have trailing
> bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> + if ((pages[nr_pages - 1]->index >> shift) ==
> file_end && bytes)
> memzero_page(pages[nr_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
>
> readahead only handles the case that the first page and the last page
> have the same block index:
> index = pages[0]->index >> shift;
> if ((pages[nr_pages - 1]->index >> shift) != index)
> goto skip_pages;
>
> The diff above makes a difference to SQUASHFS_INVALID_BLK case, which
> will not be handled by squashfs_readahead_fragment() if
> index==file_end.
> With the above diff, it will now be memzero_page().
>
> Hi Phillip,
> Does the SQUASHFS_INVALID_BLK case handling with index==file_end
> sounds reasonable to you?
>

That is OK. When fragment_block is SQUASHFS_INVALID_BLK, it means
there is no fragment.

Phillip

> Thanks.
>
>>> Thanks,
>>>
>>> Jintao
>>
>> Dear Jintao,
>>
>> Forgive me for noticing that this is no longer Phillip's code, so I took the
>> freedom as the original submitter of the bug to include Hsin-Yi into the Cc:
>> list, so he'd be informed about the error in his code.
>>
>> Phillip:
>> I usually stop myself at bisecting bugs, and not people, so if you think that
>> I was demeaning your professional contribution, I will pull a halt on this and
>> pull out of this thread.
>>
>> I am more like weeks than decades long in Linux kernel study, so I realise I
>> haven't earned the right to give you lessons, and if I sounded like that, I
>> apologise again.
>>
>> Owing to the Author of my story, it is more important for me to win hearts for
>> my Saviour than points in bug hunting.
>>
>> Thank you.
>>
>> --
>> Mirsad Goran Todorovac
>> Sistem inženjer
>> Grafički fakultet | Akademija likovnih umjetnosti
>> Sveučilište u Zagrebu
>> --
>> System engineer
>> Faculty of Graphic Arts | Academy of Fine Arts
>> University of Zagreb, Republic of Croatia
>>

2022-10-18 17:33:32

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 18/10/2022 03:15, Jintao Yin wrote:
> On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
>> Thorsten Leemhuis <[email protected]> wrote:
>>>
>>> Topposting, to make this easier to access for everyone.
>>>
>>> @Mirsad, thx for bisecting.
>>>
>>> @Phillip: if you want to see a problem description and the whole
>>> backstory of the problem that apparently is caused by b09a7a036d20
>>> ("squashfs: support reading fragments in readahead call"), see this
>>> thread (Mirsad sadly started a new one with the quoted mail below):
>>> https://lore.kernel.org/all/[email protected]/
>>>
>>
>> The above backstory tends to suggest data corruption which is happening
>> after a couple of hours especially on heavy loads, e.g. the comment
>>
>>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>>> The bug usually isn't showing immediately, but after a couple of hours
>>> of running (especially with multimedia running inside Firefox).
>>
>> Which is typically caused by double freed buffers or race conditions in
>> freeing and reusing.
>>
>> Thanks Mirsad for the following
>>
>> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>>>
>>> Here are the results of the requested bisect on the bug involving the
>>> Firefox snap build 104.x, 105.0.x, squashfs and which was manifested on
>>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>>
>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>> git bisect start
>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>>> 'mm-stable-2022-08-03' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>>> 'mm-nonmm-stable-2022-08-06-2' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>>> Add quirk for HP Spectre x360 15-eb0xxx
>>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>>> 'dma-mapping-5.20-2022-08-06' of
>>> git://git.infradead.org/users/hch/dma-mapping
>>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>>> kexec build error
>>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>>> 's390-5.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>>> build "file direct" version of page actor
>>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>>> Check the size of screen before memset_io()
>>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>>> 'for-5.20/fbdev-1' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>>> useless functions
>>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task: fix
>>> address space of proc_dohung_task_timeout_secs
>>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65] squashfs: support
>>> reading fragments in readahead call
>>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>>> mtodorov@domac:~/linux/kernel/linux_stable$
>>>
>>> The git bisect stopped at the squashfs commit
>>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in Cc:,
>>> according to the Code of Conduct.
>>
>> Which identified the "squashfs: support reading fragments in readahead call"
>> patch.
>>
>> There is a race-condition introduced in that patch, which involves cache
>> releasing and reuse.
>>
>> The following diff will fix that race-condition. It would be great if
>> someone could test and verify before sending it out as a patch.
>>
>> Thanks
>>
>> Phillip
> Hi Phillip,
> There is a logical bug in commit 8fc78b6fe24c36b151ac98d7546591ed92083d4f
> which is parent commit of commit b09a7a036d2035b14636cd4c4c69518d73770f65.
>
> In function squashfs_readahead(...),
> file_end is initialized with i_size_read(inode) >> msblk->block_log,
> which means the last block index of the file.
> But later in the logic to check if the page is last one or not the
> code is
> if (pages[nr_pages - 1]->index == file_end && bytes) {
> ...
> }
> , use file_end as the last page index of file but actually is the last
> block index, so for the common setup of page and block size, the first
> comparison is true only when pages[nr_pages - 1]->index is 0.
> Otherwise, the trailing bytes of last page won't be zeroed.
>
> Maybe it's the real cause of the snap bug in some way.
>

That code segment is indeed the cause of the bug. But the logic to
check if it is the last page is completely wrong and more broken than
described.

I will send a diff. This has aleady been shown to fix the issue with my
reproducer.

Thanks

Phillip

> Thanks,
>
> Jintao


2022-10-18 17:58:39

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 18. 10. 2022. 19:15, Phillip Lougher wrote:

> On 18/10/2022 03:15, Jintao Yin wrote:
>> On Sat, Oct 15, 2022 at 09:59:36PM +0100, Phillip Lougher wrote:
>>> Thorsten Leemhuis <[email protected]> wrote:
>>>>
>>>> Topposting, to make this easier to access for everyone.
>>>>
>>>> @Mirsad, thx for bisecting.
>>>>
>>>> @Phillip: if you want to see a problem description and the whole
>>>> backstory of the problem that apparently is caused by b09a7a036d20
>>>> ("squashfs: support reading fragments in readahead call"), see this
>>>> thread (Mirsad sadly started a new one with the quoted mail below):
>>>> https://lore.kernel.org/all/[email protected]/
>>>>
>>>>
>>>
>>> The above backstory tends to suggest data corruption which is happening
>>> after a couple of hours especially on heavy loads, e.g. the comment
>>>
>>>> On 10/3/22 at 4:18 AM, Mirsad Goran Todorovac wrote:
>>>> The bug usually isn't showing immediately, but after a couple of hours
>>>> of running (especially with multimedia running inside Firefox).
>>>
>>> Which is typically caused by double freed buffers or race conditions in
>>> freeing and reusing.
>>>
>>> Thanks Mirsad for the following
>>>
>>> On Sat, 15 Oct 2022 16:59:44 +0200, Mirsad Goran Todorovac wrote:
>>>>
>>>> Here are the results of the requested bisect on the bug involving the
>>>> Firefox snap build 104.x, 105.0.x, squashfs and which was
>>>> manifested on
>>>> both Ubuntu snap and with snapd in AlmaLinux 8.6 (CentOS fork):
>>>>
>>>> mtodorov@domac:~/linux/kernel/linux_stable$ git bisect log
>>>> git bisect start
>>>> # bad: [568035b01cfb107af8d2e4bd2fb9aea22cf5b868] Linux 6.0-rc1
>>>> git bisect bad 568035b01cfb107af8d2e4bd2fb9aea22cf5b868
>>>> # good: [51dd976781da8c0b47e106ed59a96d7c28972ce4] Linux 5.19.15
>>>> git bisect good 51dd976781da8c0b47e106ed59a96d7c28972ce4
>>>> # good: [3d7cb6b04c3f3115719235cc6866b10326de34cd] Linux 5.19
>>>> git bisect good 3d7cb6b04c3f3115719235cc6866b10326de34cd
>>>> # good: [b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1] Merge tag
>>>> 'drm-next-2022-08-03' of git://anongit.freedesktop.org/drm/drm
>>>> git bisect good b44f2fd87919b5ae6e1756d4c7ba2cbba22238e1
>>>> # good: [6614a3c3164a5df2b54abb0b3559f51041cf705b] Merge tag
>>>> 'mm-stable-2022-08-03' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>>> git bisect good 6614a3c3164a5df2b54abb0b3559f51041cf705b
>>>> # bad: [eb5699ba31558bdb2cee6ebde3d0a68091e47dce] Merge tag
>>>> 'mm-nonmm-stable-2022-08-06-2' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
>>>> git bisect bad eb5699ba31558bdb2cee6ebde3d0a68091e47dce
>>>> # good: [24df5428ef9d1ca1edd54eca7eb667110f2dfae3] ALSA: hda/realtek:
>>>> Add quirk for HP Spectre x360 15-eb0xxx
>>>> git bisect good 24df5428ef9d1ca1edd54eca7eb667110f2dfae3
>>>> # good: [c993e07be023acdeec8e84e2e0743c52adb5fc94] Merge tag
>>>> 'dma-mapping-5.20-2022-08-06' of
>>>> git://git.infradead.org/users/hch/dma-mapping
>>>> git bisect good c993e07be023acdeec8e84e2e0743c52adb5fc94
>>>> # good: [4cfa6ff24a9744ba484521c38bea613134fbfcb3] powerpc/64e: Fix
>>>> kexec build error
>>>> git bisect good 4cfa6ff24a9744ba484521c38bea613134fbfcb3
>>>> # good: [24cb958695724ffb4488ef4f65892c0767bcd2f2] Merge tag
>>>> 's390-5.20-1' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
>>>> git bisect good 24cb958695724ffb4488ef4f65892c0767bcd2f2
>>>> # good: [db98b43086275350294f5c6f797249b714d6316d] squashfs: always
>>>> build "file direct" version of page actor
>>>> git bisect good db98b43086275350294f5c6f797249b714d6316d
>>>> # good: [6ba592fa014f21f35a8ee8da4ca7b95a018f13e8] video: fbdev: s3fb:
>>>> Check the size of screen before memset_io()
>>>> git bisect good 6ba592fa014f21f35a8ee8da4ca7b95a018f13e8
>>>> # good: [b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6] Merge tag
>>>> 'for-5.20/fbdev-1' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev
>>>> git bisect good b5a8466d37d30cfcc8015789f4a3f0c44b6c7bc6
>>>> # bad: [97d3b2676fc6bc4865eb825037f4492f0fb804eb] ocfs2: remove some
>>>> useless functions
>>>> git bisect bad 97d3b2676fc6bc4865eb825037f4492f0fb804eb
>>>> # bad: [591c32bddbe20ba0e172d9def3c7f22b9c926ad9] kernel/hung_task:
>>>> fix
>>>> address space of proc_dohung_task_timeout_secs
>>>> git bisect bad 591c32bddbe20ba0e172d9def3c7f22b9c926ad9
>>>> # bad: [b09a7a036d2035b14636cd4c4c69518d73770f65]  squashfs: support
>>>> reading fragments in readahead call
>>>> git bisect bad b09a7a036d2035b14636cd4c4c69518d73770f65
>>>> mtodorov@domac:~/linux/kernel/linux_stable$
>>>>
>>>> The git bisect stopped at the squashfs commit
>>>> b09a7a036d2035b14636cd4c4c69518d73770f65, so I included Phillip in
>>>> Cc:,
>>>> according to the Code of Conduct.
>>>
>>> Which identified the "squashfs: support reading fragments in
>>> readahead call"
>>> patch.
>>>
>>> There is a race-condition introduced in that patch, which involves
>>> cache
>>> releasing and reuse.
>>>
>>> The following diff will fix that race-condition.  It would be great if
>>> someone could test and verify before sending it out as a patch.
>>>
>>> Thanks
>>>
>>> Phillip
>> Hi Phillip,
>>    There is a logical bug in commit
>> 8fc78b6fe24c36b151ac98d7546591ed92083d4f
>>    which is parent commit of commit
>> b09a7a036d2035b14636cd4c4c69518d73770f65.
>>       In function squashfs_readahead(...),
>>    file_end is initialized with i_size_read(inode) >> msblk->block_log,
>>    which means the last block index of the file.
>>    But later in the logic to check if the page is last one or not the
>>    code is
>>      if (pages[nr_pages - 1]->index == file_end && bytes) {
>>        ...
>>      }
>>    , use file_end as the last page index of file but actually is the
>> last
>>    block index, so for the common setup of page and block size, the
>> first
>>    comparison is true only when pages[nr_pages - 1]->index is 0.
>>    Otherwise, the trailing bytes of last page won't be zeroed.
>>
>>    Maybe it's the real cause of the snap bug in some way.
>>
>
> That code segment is indeed the cause of the bug.  But the logic to
> check if it is the last page is completely wrong and more broken than
> described.
>
> I will send a diff.  This has aleady been shown to fix the issue with my
> reproducer.
>
> Thanks
Then it is no surprise that v6.0.2 build with Phillip's and Hsin-Yi's
patches also
failed, giving the same "Gah, tab crashed" and Verneed errors as before
given
the same Firefox windows and tabs test:

Oct 18 19:34:00 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7400]:
/snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1:
unsupported version 0 of Verneed record
Oct 18 19:34:00 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7400]:
Couldn't load XPCOM.
Oct 18 19:34:03 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7418]:
XPCOMGlueLoad error for file
/snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
Oct 18 19:34:03 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7418]:
/snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1:
unsupported version 0 of Verneed record
Oct 18 19:34:03 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7418]:
Couldn't load XPCOM.
Oct 18 19:34:04 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[5416]:
Missing chrome or resource URL:
resource://gre/modules/UpdateListener.sys.mjs
Oct 18 19:34:05 marvin-IdeaPad-3-15ITL6 systemd[1629]:
snap.firefox.firefox.945b7e54-eb2c-40d3-836d-96ac26e19293.scope:
Consumed 3min 19.877s CPU time.
Oct 18 19:34:19 marvin-IdeaPad-3-15ITL6 systemd[1629]: Started
Application launched by gnome-shell.
Oct 18 19:34:19 marvin-IdeaPad-3-15ITL6 systemd[1629]: Started
snap.firefox.firefox.b586e811-9aaa-4b30-afa8-54ae3b119798.scope.
Oct 18 19:34:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7483]:
cut: cut: no version information available (required by cut)
Oct 18 19:34:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7483]:
message repeated 5 times: [ cut: cut: no version information available
(required by cut)]
Oct 18 19:34:20 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7483]:
cut: symbol lookup error: cut: undefined symbol:
Oct 18 19:34:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7441]:
XPCOMGlueLoad error for file
/snap/firefox/1943/usr/lib/firefox/libmozgtk.so:
Oct 18 19:34:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7441]:
/snap/firefox/1943/gnome-platform/usr/lib/x86_64-linux-gnu/libXcomposite.so.1:
unsupported version 0 of Verneed record
Oct 18 19:34:21 marvin-IdeaPad-3-15ITL6 firefox_firefox.desktop[7441]:
Couldn't load XPCOM.
Oct 18 19:34:21 marvin-IdeaPad-3-15ITL6 systemd[1629]:
snap.firefox.firefox.b586e811-9aaa-4b30-afa8-54ae3b119798.scope:
Consumed 1.382s CPU time.
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 systemd[1]: Starting Download
data for packages that failed at package install time...
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 anacron[956]: Job `cron.daily'
started
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 systemd[1]:
update-notifier-download.service: Deactivated successfully.
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 systemd[1]: Finished Download
data for packages that failed at package install time.
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 anacron[7529]: Updated timestamp
for job `cron.daily' to 2022-10-18
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 cracklib: no dictionary update
necessary.
Oct 18 19:34:39 marvin-IdeaPad-3-15ITL6 anacron[956]: Job `cron.daily'
terminated
Oct 18 19:34:42 marvin-IdeaPad-3-15ITL6 dbus-daemon[782]: [system]
Activating via systemd: service name='org.freedesktop.timedate1'
unit='dbus-org.freedesktop.timedate1.service' requested by ':1.45'
(uid=0 pid=810 comm="/usr/lib/snapd/snapd ")
Oct 18 19:34:42 marvin-IdeaPad-3-15ITL6 systemd[1]: Starting Time & Date
Service...
Oct 18 19:34:42 marvin-IdeaPad-3-15ITL6 dbus-daemon[782]: [system]
Successfully activated service 'org.freedesktop.timedate1'
Oct 18 19:34:42 marvin-IdeaPad-3-15ITL6 systemd[1]: Started Time & Date
Service.
Oct 18 19:34:43 marvin-IdeaPad-3-15ITL6 snapd[810]: storehelpers.go:748:
cannot refresh: snap has no updates available: "bare",
"canonical-livepatch", "core", "core18", "core20", "gnome-3-34-1804",
"gnome-3-38-2004", "gtk-common-themes", "slack", "snap-store", "snapd",
"zoom-client"

I wish there is something I could do to help other than bug reports, but the
squashfs source is way too complex for me ...

Good luck then guys.

Thanks

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-10-18 18:13:02

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

This diff has fixed the problem with my repoducer.
Please test and report your results.

Phillip

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index e56510964b22..e526eb7a1658 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -557,6 +557,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
int res, bsize;
u64 block = 0;
unsigned int expected;
+ struct page *last_page;

nr_pages = __readahead_batch(ractl, pages, max_pages);
if (!nr_pages)
@@ -593,15 +594,15 @@ static void squashfs_readahead(struct readahead_control *ractl)

res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ last_page = squashfs_page_actor_free(actor);

if (res == expected) {
int bytes;

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (pages[nr_pages - 1]->index == file_end && bytes)
- memzero_page(pages[nr_pages - 1], bytes,
+ if (index == file_end && bytes && last_page)
+ memzero_page(last_page, bytes,
PAGE_SIZE - bytes);

for (i = 0; i < nr_pages; i++) {
diff --git a/fs/squashfs/page_actor.c b/fs/squashfs/page_actor.c
index 54b93bf4a25c..6aa38f88e31c 100644
--- a/fs/squashfs/page_actor.c
+++ b/fs/squashfs/page_actor.c
@@ -53,6 +53,7 @@ struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
actor->pages = pages;
actor->next_page = 0;
actor->tmp_buffer = NULL;
+ actor->last_page = NULL;
actor->squashfs_first_page = cache_first_page;
actor->squashfs_next_page = cache_next_page;
actor->squashfs_finish_page = cache_finish_page;
@@ -71,11 +72,13 @@ static void *handle_next_page(struct squashfs_page_actor *actor)
(actor->next_index != actor->page[actor->next_page]->index)) {
actor->next_index++;
actor->returned_pages++;
+ actor->last_page = NULL;
return actor->alloc_buffer ? actor->tmp_buffer : ERR_PTR(-ENOMEM);
}

actor->next_index++;
actor->returned_pages++;
+ actor->last_page = actor->page[actor->next_page];
return actor->pageaddr = kmap_local_page(actor->page[actor->next_page++]);
}

diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
index 95ffbb543d91..97d4983559b1 100644
--- a/fs/squashfs/page_actor.h
+++ b/fs/squashfs/page_actor.h
@@ -16,6 +16,7 @@ struct squashfs_page_actor {
void *(*squashfs_first_page)(struct squashfs_page_actor *);
void *(*squashfs_next_page)(struct squashfs_page_actor *);
void (*squashfs_finish_page)(struct squashfs_page_actor *);
+ struct page *last_page;
int pages;
int length;
int next_page;
@@ -29,10 +30,13 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
extern struct squashfs_page_actor *squashfs_page_actor_init_special(
struct squashfs_sb_info *msblk,
struct page **page, int pages, int length);
-static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
+static inline struct page *squashfs_page_actor_free(struct squashfs_page_actor *actor)
{
+ struct page *last_page = actor->last_page;
+
kfree(actor->tmp_buffer);
kfree(actor);
+ return last_page;
}
static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
{

2022-10-19 05:46:33

by Slade Watkins

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

Hi Phillip,

On Tue, Oct 18, 2022 at 1:42 PM Phillip Lougher <[email protected]> wrote:
>
> This diff has fixed the problem with my repoducer.
> Please test and report your results.
>
> Phillip

Tested-by me just a few moments ago and can report this fixed the
problem. (I believe Mirsad also reported that this fixed it.)

Yours,
-srw

>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..e526eb7a1658 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -557,6 +557,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + struct page *last_page;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,15 +594,15 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + last_page = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + if (index == file_end && bytes && last_page)
> + memzero_page(last_page, bytes,
> PAGE_SIZE - bytes);
>
> for (i = 0; i < nr_pages; i++) {
> diff --git a/fs/squashfs/page_actor.c b/fs/squashfs/page_actor.c
> index 54b93bf4a25c..6aa38f88e31c 100644
> --- a/fs/squashfs/page_actor.c
> +++ b/fs/squashfs/page_actor.c
> @@ -53,6 +53,7 @@ struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> actor->pages = pages;
> actor->next_page = 0;
> actor->tmp_buffer = NULL;
> + actor->last_page = NULL;
> actor->squashfs_first_page = cache_first_page;
> actor->squashfs_next_page = cache_next_page;
> actor->squashfs_finish_page = cache_finish_page;
> @@ -71,11 +72,13 @@ static void *handle_next_page(struct squashfs_page_actor *actor)
> (actor->next_index != actor->page[actor->next_page]->index)) {
> actor->next_index++;
> actor->returned_pages++;
> + actor->last_page = NULL;
> return actor->alloc_buffer ? actor->tmp_buffer : ERR_PTR(-ENOMEM);
> }
>
> actor->next_index++;
> actor->returned_pages++;
> + actor->last_page = actor->page[actor->next_page];
> return actor->pageaddr = kmap_local_page(actor->page[actor->next_page++]);
> }
>
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..97d4983559b1 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -16,6 +16,7 @@ struct squashfs_page_actor {
> void *(*squashfs_first_page)(struct squashfs_page_actor *);
> void *(*squashfs_next_page)(struct squashfs_page_actor *);
> void (*squashfs_finish_page)(struct squashfs_page_actor *);
> + struct page *last_page;
> int pages;
> int length;
> int next_page;
> @@ -29,10 +30,13 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline struct page *squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + struct page *last_page = actor->last_page;
> +
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return last_page;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {

2022-10-19 08:02:54

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/19/22 00:41, Phillip Lougher wrote:
> This diff has fixed the problem with my repoducer.
> Please test and report your results.
>
> Phillip
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..e526eb7a1658 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -557,6 +557,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + struct page *last_page;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,15 +594,15 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + last_page = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + if (index == file_end && bytes && last_page)
> + memzero_page(last_page, bytes,
> PAGE_SIZE - bytes);
>
> for (i = 0; i < nr_pages; i++) {
> diff --git a/fs/squashfs/page_actor.c b/fs/squashfs/page_actor.c
> index 54b93bf4a25c..6aa38f88e31c 100644
> --- a/fs/squashfs/page_actor.c
> +++ b/fs/squashfs/page_actor.c
> @@ -53,6 +53,7 @@ struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> actor->pages = pages;
> actor->next_page = 0;
> actor->tmp_buffer = NULL;
> + actor->last_page = NULL;
> actor->squashfs_first_page = cache_first_page;
> actor->squashfs_next_page = cache_next_page;
> actor->squashfs_finish_page = cache_finish_page;
> @@ -71,11 +72,13 @@ static void *handle_next_page(struct squashfs_page_actor *actor)
> (actor->next_index != actor->page[actor->next_page]->index)) {
> actor->next_index++;
> actor->returned_pages++;
> + actor->last_page = NULL;
> return actor->alloc_buffer ? actor->tmp_buffer : ERR_PTR(-ENOMEM);
> }
>
> actor->next_index++;
> actor->returned_pages++;
> + actor->last_page = actor->page[actor->next_page];
> return actor->pageaddr = kmap_local_page(actor->page[actor->next_page++]);
> }
>
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..97d4983559b1 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -16,6 +16,7 @@ struct squashfs_page_actor {
> void *(*squashfs_first_page)(struct squashfs_page_actor *);
> void *(*squashfs_next_page)(struct squashfs_page_actor *);
> void (*squashfs_finish_page)(struct squashfs_page_actor *);
> + struct page *last_page;
> int pages;
> int length;
> int next_page;
> @@ -29,10 +30,13 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline struct page *squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + struct page *last_page = actor->last_page;
> +
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return last_page;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {

The fixup makes regression gone, thanks.

Tested-by: Bagas Sanjaya <[email protected]>

--
An old man doll... just what I always wanted! - Clara

2022-10-19 12:15:35

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 19.10.2022. 7:17, Slade Watkins wrote:

> Hi Phillip,
>
> On Tue, Oct 18, 2022 at 1:42 PM Phillip Lougher <[email protected]> wrote:
>> This diff has fixed the problem with my repoducer.
>> Please test and report your results.
>>
>> Phillip
> Tested-by me just a few moments ago and can report this fixed the
> problem. (I believe Mirsad also reported that this fixed it.)
>
> Yours,
> -srw

Hi all,

If none of you experienced machine lockup with Phillip's fix, then it
must have
been my overload of Firefox windows and tabs, Chrome and T-bird that exposed
the bug in the first place ...

My build sequence was rather standard:

 1053  10/17/2022 05:27:50 AM  git checkout v6.0.2
 1125  10/18/2022 10:21:47 PM  cd ..
 1126  10/18/2022 10:22:09 PM  time rm -rf linux_stable_build; \
                    time cp -rp linux_stable linux_stable_build; \
                    time diff -ur linux_stable linux_stable_build; \
                    cd linux_stable_build
 1127  10/18/2022 10:25:22 PM  df -k .
 1128  10/18/2022 10:25:35 PM  git apply ../phillip-squashfs-20220215.patch
 1129  10/18/2022 10:25:42 PM  git apply ../phillip-squashfs2.patch
 1130  10/18/2022 10:26:00 PM  cp -p ../config-6.0.0-060000-generic
.config; \
                    make olddefconfig; \
                    time nice make CC="ccache gcc"
KBUILD_BUILD_TIMESTAMP="" -j10 deb-pkg; date

Thanks.

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

2022-10-20 07:59:43

by Jintao Yin

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

This diff is based on Phillip's latest two patches and fixes another wrong
check of last page in file_direct.c. At the same time, improve the logic
to update the status of pages only if page actor touches them.

Please help test and feedbacks are welcome.

Hi Phillip,

If check (bytes && used_pages > 0) is enough (since bytes is non-zero only
when handling the last block) or should I caculate the real last page index
from the file size and check the page index?

Thanks,

Jintao

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index e56510964b22..5efa2a9f9630 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
squashfs_i(inode)->fragment_size);
struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
+ int res = buffer->error;

- if (buffer->error)
+ if (res)
goto out;

expected += squashfs_i(inode)->fragment_offset;
@@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,

out:
squashfs_cache_put(buffer);
- return buffer->error;
+ return res;
}

static void squashfs_readahead(struct readahead_control *ractl)
@@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
int res, bsize;
u64 block = 0;
unsigned int expected;
+ int nr_used_pages;

nr_pages = __readahead_batch(ractl, pages, max_pages);
if (!nr_pages)
@@ -593,18 +595,18 @@ static void squashfs_readahead(struct readahead_control *ractl)

res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ nr_used_pages = squashfs_page_actor_free(actor);

if (res == expected) {
int bytes;

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (pages[nr_pages - 1]->index == file_end && bytes)
- memzero_page(pages[nr_pages - 1], bytes,
+ if (index == file_end && bytes && nr_used_pages > 0)
+ memzero_page(pages[nr_used_pages - 1], bytes,
PAGE_SIZE - bytes);

- for (i = 0; i < nr_pages; i++) {
+ for (i = 0; i < nr_used_pages; i++) {
flush_dcache_page(pages[i]);
SetPageUptodate(pages[i]);
}
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
index f1ccad519e28..8d875d8c5709 100644
--- a/fs/squashfs/file_direct.c
+++ b/fs/squashfs/file_direct.c
@@ -30,7 +30,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
int start_index = target_page->index & ~mask;
int end_index = start_index | mask;
- int i, n, pages, bytes, res = -ENOMEM;
+ int i, n, pages, used_pages, bytes, res = -ENOMEM;
struct page **page;
struct squashfs_page_actor *actor;
void *pageaddr;
@@ -74,7 +74,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
/* Decompress directly into the page cache buffers */
res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ used_pages = squashfs_page_actor_free(actor);

if (res < 0)
goto mark_errored;
@@ -86,16 +86,18 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (page[pages - 1]->index == end_index && bytes) {
- pageaddr = kmap_local_page(page[pages - 1]);
+ if (bytes && used_pages > 0) {
+ pageaddr = kmap_local_page(page[used_pages - 1]);
memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
kunmap_local(pageaddr);
}

/* Mark pages as uptodate, unlock and release */
for (i = 0; i < pages; i++) {
- flush_dcache_page(page[i]);
- SetPageUptodate(page[i]);
+ if (i < used_pages) {
+ flush_dcache_page(page[i]);
+ SetPageUptodate(page[i]);
+ }
unlock_page(page[i]);
if (page[i] != target_page)
put_page(page[i]);
@@ -112,8 +114,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
for (i = 0; i < pages; i++) {
if (page[i] == NULL || page[i] == target_page)
continue;
- flush_dcache_page(page[i]);
- SetPageError(page[i]);
+ if (i < used_pages) {
+ flush_dcache_page(page[i]);
+ SetPageError(page[i]);
+ }
unlock_page(page[i]);
put_page(page[i]);
}
diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
index 95ffbb543d91..c2c5c3937ef9 100644
--- a/fs/squashfs/page_actor.h
+++ b/fs/squashfs/page_actor.h
@@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
extern struct squashfs_page_actor *squashfs_page_actor_init_special(
struct squashfs_sb_info *msblk,
struct page **page, int pages, int length);
-static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
+static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
{
+ int res = actor->next_page;
kfree(actor->tmp_buffer);
kfree(actor);
+ return res;
}
static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
{

2022-10-20 08:02:31

by Jintao Yin

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

Ignore my before diff. I didn't notice the end_index caculation greater
than file_end part. Use the following diff instead.

This diff is based on Phillip's latest two patches and improves the logic
to update the status of pages only if page actor touches them.

Please help test and feedbacks are welcome.

Thanks,

Jintao

diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index e56510964b22..5efa2a9f9630 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
squashfs_i(inode)->fragment_size);
struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
+ int res = buffer->error;

- if (buffer->error)
+ if (res)
goto out;

expected += squashfs_i(inode)->fragment_offset;
@@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,

out:
squashfs_cache_put(buffer);
- return buffer->error;
+ return res;
}

static void squashfs_readahead(struct readahead_control *ractl)
@@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
int res, bsize;
u64 block = 0;
unsigned int expected;
+ int nr_used_pages;

nr_pages = __readahead_batch(ractl, pages, max_pages);
if (!nr_pages)
@@ -593,18 +595,18 @@ static void squashfs_readahead(struct readahead_control *ractl)

res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ nr_used_pages = squashfs_page_actor_free(actor);

if (res == expected) {
int bytes;

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (pages[nr_pages - 1]->index == file_end && bytes)
- memzero_page(pages[nr_pages - 1], bytes,
+ if (index == file_end && bytes && nr_used_pages > 0)
+ memzero_page(pages[nr_used_pages - 1], bytes,
PAGE_SIZE - bytes);

- for (i = 0; i < nr_pages; i++) {
+ for (i = 0; i < nr_used_pages; i++) {
flush_dcache_page(pages[i]);
SetPageUptodate(pages[i]);
}
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
index f1ccad519e28..1bb0347f98b0 100644
--- a/fs/squashfs/file_direct.c
+++ b/fs/squashfs/file_direct.c
@@ -30,10 +30,9 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
int start_index = target_page->index & ~mask;
int end_index = start_index | mask;
- int i, n, pages, bytes, res = -ENOMEM;
+ int i, n, pages, used_pages, bytes, res = -ENOMEM;
struct page **page;
struct squashfs_page_actor *actor;
- void *pageaddr;

if (end_index > file_end)
end_index = file_end;
@@ -74,7 +73,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
/* Decompress directly into the page cache buffers */
res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ used_pages = squashfs_page_actor_free(actor);

if (res < 0)
goto mark_errored;
@@ -86,16 +85,18 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (page[pages - 1]->index == end_index && bytes) {
- pageaddr = kmap_local_page(page[pages - 1]);
- memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
- kunmap_local(pageaddr);
+ if (used_pages > 0 && page[used_pages - 1]->index == end_index &&
+ bytes) {
+ memzero_page(page[used_pages - 1], bytes,
+ PAGE_SIZE - bytes);
}

/* Mark pages as uptodate, unlock and release */
for (i = 0; i < pages; i++) {
- flush_dcache_page(page[i]);
- SetPageUptodate(page[i]);
+ if (i < used_pages) {
+ flush_dcache_page(page[i]);
+ SetPageUptodate(page[i]);
+ }
unlock_page(page[i]);
if (page[i] != target_page)
put_page(page[i]);
@@ -112,8 +113,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
for (i = 0; i < pages; i++) {
if (page[i] == NULL || page[i] == target_page)
continue;
- flush_dcache_page(page[i]);
- SetPageError(page[i]);
+ if (i < used_pages) {
+ flush_dcache_page(page[i]);
+ SetPageError(page[i]);
+ }
unlock_page(page[i]);
put_page(page[i]);
}
diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
index 95ffbb543d91..c2c5c3937ef9 100644
--- a/fs/squashfs/page_actor.h
+++ b/fs/squashfs/page_actor.h
@@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
extern struct squashfs_page_actor *squashfs_page_actor_init_special(
struct squashfs_sb_info *msblk,
struct page **page, int pages, int length);
-static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
+static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
{
+ int res = actor->next_page;
kfree(actor->tmp_buffer);
kfree(actor);
+ return res;
}
static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
{

2022-10-20 10:21:36

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 20. 10. 2022. 09:43, Jintao Yin wrote:

> Ignore my before diff. I didn't notice the end_index caculation greater
> than file_end part. Use the following diff instead.
>
> This diff is based on Phillip's latest two patches and improves the logic
> to update the status of pages only if page actor touches them.
>
> Please help test and feedbacks are welcome.
>
> Thanks,
>
> Jintao
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..5efa2a9f9630 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int res = buffer->error;
>
> - if (buffer->error)
> + if (res)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return res;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
> @@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + int nr_used_pages;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,18 +595,18 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + nr_used_pages = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + if (index == file_end && bytes && nr_used_pages > 0)
> + memzero_page(pages[nr_used_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
> - for (i = 0; i < nr_pages; i++) {
> + for (i = 0; i < nr_used_pages; i++) {
> flush_dcache_page(pages[i]);
> SetPageUptodate(pages[i]);
> }
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> index f1ccad519e28..1bb0347f98b0 100644
> --- a/fs/squashfs/file_direct.c
> +++ b/fs/squashfs/file_direct.c
> @@ -30,10 +30,9 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> int start_index = target_page->index & ~mask;
> int end_index = start_index | mask;
> - int i, n, pages, bytes, res = -ENOMEM;
> + int i, n, pages, used_pages, bytes, res = -ENOMEM;
> struct page **page;
> struct squashfs_page_actor *actor;
> - void *pageaddr;
>
> if (end_index > file_end)
> end_index = file_end;
> @@ -74,7 +73,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> /* Decompress directly into the page cache buffers */
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + used_pages = squashfs_page_actor_free(actor);
>
> if (res < 0)
> goto mark_errored;
> @@ -86,16 +85,18 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (page[pages - 1]->index == end_index && bytes) {
> - pageaddr = kmap_local_page(page[pages - 1]);
> - memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
> - kunmap_local(pageaddr);
> + if (used_pages > 0 && page[used_pages - 1]->index == end_index &&
> + bytes) {
> + memzero_page(page[used_pages - 1], bytes,
> + PAGE_SIZE - bytes);
> }
>
> /* Mark pages as uptodate, unlock and release */
> for (i = 0; i < pages; i++) {
> - flush_dcache_page(page[i]);
> - SetPageUptodate(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageUptodate(page[i]);
> + }
> unlock_page(page[i]);
> if (page[i] != target_page)
> put_page(page[i]);
> @@ -112,8 +113,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> for (i = 0; i < pages; i++) {
> if (page[i] == NULL || page[i] == target_page)
> continue;
> - flush_dcache_page(page[i]);
> - SetPageError(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageError(page[i]);
> + }
> unlock_page(page[i]);
> put_page(page[i]);
> }
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..c2c5c3937ef9 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + int res = actor->next_page;
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return res;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {
Sorry to have disappointing news: vanilla mainline stable branch v6.0.2
kernel + your latest patch
doesn't run well with my test case of firefox windows and tabs (other
kernel versions start it OK).

What I got was:

Oct 20 11:46:50 marvin-IdeaPad-3-15ITL6 kernel: [  213.292487]
desktop-launch[4687]: segfault at 8 ip 00007f5dc6c38f29 sp
00007ffdb5429470 error 4 in ld-2.31.so[7f5dc6c27000+23000]
Oct 20 11:46:50 marvin-IdeaPad-3-15ITL6 kernel: [  213.292506] Code: 8b
44 24 08 4c 8b 40 08 41 80 38 00 75 18 48 8b 05 ec a6 01 00 4c 8b 00 48
8d 05 74 21 01 00 4d 85 c0 4c 0f 44 c0 48 8b 45 68 <48> 8b 40 08 48 89
04 24 48 8d 05 e8 a6 01 00 f6 00 10 0f 85 c7 01

I think we're getting somewhere.

Did I understand well that your patch replaced both Phillip's patches?

Thank you.

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia
The European Union

2022-10-20 13:31:49

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/20/22 14:43, Jintao Yin wrote:
> Ignore my before diff. I didn't notice the end_index caculation greater
> than file_end part. Use the following diff instead.
>
> This diff is based on Phillip's latest two patches and improves the logic
> to update the status of pages only if page actor touches them.
>
> Please help test and feedbacks are welcome.
>
> Thanks,
>
> Jintao
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..5efa2a9f9630 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int res = buffer->error;
>
> - if (buffer->error)
> + if (res)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return res;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
> @@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + int nr_used_pages;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,18 +595,18 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + nr_used_pages = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + if (index == file_end && bytes && nr_used_pages > 0)
> + memzero_page(pages[nr_used_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
> - for (i = 0; i < nr_pages; i++) {
> + for (i = 0; i < nr_used_pages; i++) {
> flush_dcache_page(pages[i]);
> SetPageUptodate(pages[i]);
> }
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> index f1ccad519e28..1bb0347f98b0 100644
> --- a/fs/squashfs/file_direct.c
> +++ b/fs/squashfs/file_direct.c
> @@ -30,10 +30,9 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> int start_index = target_page->index & ~mask;
> int end_index = start_index | mask;
> - int i, n, pages, bytes, res = -ENOMEM;
> + int i, n, pages, used_pages, bytes, res = -ENOMEM;
> struct page **page;
> struct squashfs_page_actor *actor;
> - void *pageaddr;
>
> if (end_index > file_end)
> end_index = file_end;
> @@ -74,7 +73,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> /* Decompress directly into the page cache buffers */
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + used_pages = squashfs_page_actor_free(actor);
>
> if (res < 0)
> goto mark_errored;
> @@ -86,16 +85,18 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (page[pages - 1]->index == end_index && bytes) {
> - pageaddr = kmap_local_page(page[pages - 1]);
> - memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
> - kunmap_local(pageaddr);
> + if (used_pages > 0 && page[used_pages - 1]->index == end_index &&
> + bytes) {
> + memzero_page(page[used_pages - 1], bytes,
> + PAGE_SIZE - bytes);
> }
>
> /* Mark pages as uptodate, unlock and release */
> for (i = 0; i < pages; i++) {
> - flush_dcache_page(page[i]);
> - SetPageUptodate(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageUptodate(page[i]);
> + }
> unlock_page(page[i]);
> if (page[i] != target_page)
> put_page(page[i]);
> @@ -112,8 +113,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> for (i = 0; i < pages; i++) {
> if (page[i] == NULL || page[i] == target_page)
> continue;
> - flush_dcache_page(page[i]);
> - SetPageError(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageError(page[i]);
> + }
> unlock_page(page[i]);
> put_page(page[i]);
> }
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..c2c5c3937ef9 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + int res = actor->next_page;
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return res;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {

Hi Jintao,

Booting v6.1-rc1 with your patch applied, I got mixed results. On hello-world
snap, it runs without errors. On lxd, I got the Verneed regression as
originally reported. Sometimes I also get snapd panic.

Thanks.

--
An old man doll... just what I always wanted! - Clara

2022-10-20 15:07:34

by Jintao Yin

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

Hi all,

After review the details of page actor, the tail bytes may be written to
a temp buffer instead the last used page. So before diff would wrongly
memzero a page which is not the tail bytes in.

In this diff fixes it by caculation of the real index the trailing bytes
in and check if the last used page matches this index. If the page is
the real tail bytes in, then memzero the trailing bypte of the page.

Please help test and any feedbacks are welcome.

Thanks,

Jintao


diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
index e56510964b22..e1fafd10a850 100644
--- a/fs/squashfs/file.c
+++ b/fs/squashfs/file.c
@@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
squashfs_i(inode)->fragment_size);
struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
+ int res = buffer->error;

- if (buffer->error)
+ if (res)
goto out;

expected += squashfs_i(inode)->fragment_offset;
@@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,

out:
squashfs_cache_put(buffer);
- return buffer->error;
+ return res;
}

static void squashfs_readahead(struct readahead_control *ractl)
@@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
int res, bsize;
u64 block = 0;
unsigned int expected;
+ int nr_used_pages;

nr_pages = __readahead_batch(ractl, pages, max_pages);
if (!nr_pages)
@@ -593,18 +595,21 @@ static void squashfs_readahead(struct readahead_control *ractl)

res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ nr_used_pages = squashfs_page_actor_free(actor);

if (res == expected) {
int bytes;
+ pgoff_t bytes_index;

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (pages[nr_pages - 1]->index == file_end && bytes)
- memzero_page(pages[nr_pages - 1], bytes,
+ bytes_index = (index << shift) + ((res - bytes) >> PAGE_SHIFT);
+ if (bytes && nr_used_pages > 0 &&
+ pages[nr_used_pages - 1]->index == bytes_index)
+ memzero_page(pages[nr_used_pages - 1], bytes,
PAGE_SIZE - bytes);

- for (i = 0; i < nr_pages; i++) {
+ for (i = 0; i < nr_used_pages; i++) {
flush_dcache_page(pages[i]);
SetPageUptodate(pages[i]);
}
diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
index f1ccad519e28..ee462ef380bf 100644
--- a/fs/squashfs/file_direct.c
+++ b/fs/squashfs/file_direct.c
@@ -26,14 +26,14 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
struct inode *inode = target_page->mapping->host;
struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;

- int file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
- int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
- int start_index = target_page->index & ~mask;
- int end_index = start_index | mask;
- int i, n, pages, bytes, res = -ENOMEM;
+ pgoff_t file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
+ pgoff_t mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
+ pgoff_t start_index = target_page->index & ~mask;
+ pgoff_t end_index = start_index | mask;
+ int i, pages, used_pages, bytes, res = -ENOMEM;
+ pgoff_t n, bytes_index;
struct page **page;
struct squashfs_page_actor *actor;
- void *pageaddr;

if (end_index > file_end)
end_index = file_end;
@@ -74,7 +74,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
/* Decompress directly into the page cache buffers */
res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);

- squashfs_page_actor_free(actor);
+ used_pages = squashfs_page_actor_free(actor);

if (res < 0)
goto mark_errored;
@@ -86,16 +86,19 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,

/* Last page (if present) may have trailing bytes not filled */
bytes = res % PAGE_SIZE;
- if (page[pages - 1]->index == end_index && bytes) {
- pageaddr = kmap_local_page(page[pages - 1]);
- memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
- kunmap_local(pageaddr);
+ bytes_index = start_index + ((res - bytes) >> PAGE_SHIFT);
+ if (used_pages > 0 && bytes &&
+ page[used_pages - 1]->index == bytes_index) {
+ memzero_page(page[used_pages - 1], bytes,
+ PAGE_SIZE - bytes);
}

/* Mark pages as uptodate, unlock and release */
for (i = 0; i < pages; i++) {
- flush_dcache_page(page[i]);
- SetPageUptodate(page[i]);
+ if (i < used_pages) {
+ flush_dcache_page(page[i]);
+ SetPageUptodate(page[i]);
+ }
unlock_page(page[i]);
if (page[i] != target_page)
put_page(page[i]);
@@ -112,8 +115,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
for (i = 0; i < pages; i++) {
if (page[i] == NULL || page[i] == target_page)
continue;
- flush_dcache_page(page[i]);
- SetPageError(page[i]);
+ if (i < used_pages) {
+ flush_dcache_page(page[i]);
+ SetPageError(page[i]);
+ }
unlock_page(page[i]);
put_page(page[i]);
}
diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
index 95ffbb543d91..c2c5c3937ef9 100644
--- a/fs/squashfs/page_actor.h
+++ b/fs/squashfs/page_actor.h
@@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
extern struct squashfs_page_actor *squashfs_page_actor_init_special(
struct squashfs_sb_info *msblk,
struct page **page, int pages, int length);
-static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
+static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
{
+ int res = actor->next_page;
kfree(actor->tmp_buffer);
kfree(actor);
+ return res;
}
static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
{

2022-10-20 16:01:00

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 20/10/2022 14:55, Jintao Yin wrote:
> Hi all,
>
> After review the details of page actor, the tail bytes may be written to
> a temp buffer instead the last used page. So before diff would wrongly
> memzero a page which is not the tail bytes in.
>
> In this diff fixes it by caculation of the real index the trailing bytes
> in and check if the last used page matches this index. If the page is
> the real tail bytes in, then memzero the trailing bypte of the page.
>
> Please help test and any feedbacks are welcome.
>
> Thanks,

This is a rediculously complex patch. Sorry, but, absolute NACK.

Additionally, it is poor etiquette and pointless to fix an
already fixed issue with a _more_ complex patch.

Phillip

>
> Jintao
>
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..e1fafd10a850 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int res = buffer->error;
>
> - if (buffer->error)
> + if (res)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return res;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
> @@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + int nr_used_pages;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,18 +595,21 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + nr_used_pages = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
> + pgoff_t bytes_index;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + bytes_index = (index << shift) + ((res - bytes) >> PAGE_SHIFT);
> + if (bytes && nr_used_pages > 0 &&
> + pages[nr_used_pages - 1]->index == bytes_index)
> + memzero_page(pages[nr_used_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
> - for (i = 0; i < nr_pages; i++) {
> + for (i = 0; i < nr_used_pages; i++) {
> flush_dcache_page(pages[i]);
> SetPageUptodate(pages[i]);
> }
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> index f1ccad519e28..ee462ef380bf 100644
> --- a/fs/squashfs/file_direct.c
> +++ b/fs/squashfs/file_direct.c
> @@ -26,14 +26,14 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> struct inode *inode = target_page->mapping->host;
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>
> - int file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
> - int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> - int start_index = target_page->index & ~mask;
> - int end_index = start_index | mask;
> - int i, n, pages, bytes, res = -ENOMEM;
> + pgoff_t file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
> + pgoff_t mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + pgoff_t start_index = target_page->index & ~mask;
> + pgoff_t end_index = start_index | mask;
> + int i, pages, used_pages, bytes, res = -ENOMEM;
> + pgoff_t n, bytes_index;
> struct page **page;
> struct squashfs_page_actor *actor;
> - void *pageaddr;
>
> if (end_index > file_end)
> end_index = file_end;
> @@ -74,7 +74,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> /* Decompress directly into the page cache buffers */
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + used_pages = squashfs_page_actor_free(actor);
>
> if (res < 0)
> goto mark_errored;
> @@ -86,16 +86,19 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (page[pages - 1]->index == end_index && bytes) {
> - pageaddr = kmap_local_page(page[pages - 1]);
> - memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
> - kunmap_local(pageaddr);
> + bytes_index = start_index + ((res - bytes) >> PAGE_SHIFT);
> + if (used_pages > 0 && bytes &&
> + page[used_pages - 1]->index == bytes_index) {
> + memzero_page(page[used_pages - 1], bytes,
> + PAGE_SIZE - bytes);
> }
>
> /* Mark pages as uptodate, unlock and release */
> for (i = 0; i < pages; i++) {
> - flush_dcache_page(page[i]);
> - SetPageUptodate(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageUptodate(page[i]);
> + }
> unlock_page(page[i]);
> if (page[i] != target_page)
> put_page(page[i]);
> @@ -112,8 +115,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> for (i = 0; i < pages; i++) {
> if (page[i] == NULL || page[i] == target_page)
> continue;
> - flush_dcache_page(page[i]);
> - SetPageError(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageError(page[i]);
> + }
> unlock_page(page[i]);
> put_page(page[i]);
> }
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..c2c5c3937ef9 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + int res = actor->next_page;
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return res;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {

2022-10-20 16:02:59

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 20/10/2022 14:55, Jintao Yin wrote:
> Hi all,
>
> After review the details of page actor, the tail bytes may be written to
> a temp buffer instead the last used page. So before diff would wrongly
> memzero a page which is not the tail bytes in.

If you *only* discovered that *after* writing the patch, it is clear
you don't fully understand what is going on.

Cheers

Phillip

>
> In this diff fixes it by caculation of the real index the trailing bytes
> in and check if the last used page matches this index. If the page is
> the real tail bytes in, then memzero the trailing bypte of the page.
>
> Please help test and any feedbacks are welcome.
>
> Thanks,
>
> Jintao
>
>
> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..e1fafd10a850 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int res = buffer->error;
>
> - if (buffer->error)
> + if (res)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return res;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
> @@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + int nr_used_pages;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,18 +595,21 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + nr_used_pages = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
> + pgoff_t bytes_index;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + bytes_index = (index << shift) + ((res - bytes) >> PAGE_SHIFT);
> + if (bytes && nr_used_pages > 0 &&
> + pages[nr_used_pages - 1]->index == bytes_index)
> + memzero_page(pages[nr_used_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
> - for (i = 0; i < nr_pages; i++) {
> + for (i = 0; i < nr_used_pages; i++) {
> flush_dcache_page(pages[i]);
> SetPageUptodate(pages[i]);
> }
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> index f1ccad519e28..ee462ef380bf 100644
> --- a/fs/squashfs/file_direct.c
> +++ b/fs/squashfs/file_direct.c
> @@ -26,14 +26,14 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> struct inode *inode = target_page->mapping->host;
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>
> - int file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
> - int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> - int start_index = target_page->index & ~mask;
> - int end_index = start_index | mask;
> - int i, n, pages, bytes, res = -ENOMEM;
> + pgoff_t file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
> + pgoff_t mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + pgoff_t start_index = target_page->index & ~mask;
> + pgoff_t end_index = start_index | mask;
> + int i, pages, used_pages, bytes, res = -ENOMEM;
> + pgoff_t n, bytes_index;
> struct page **page;
> struct squashfs_page_actor *actor;
> - void *pageaddr;
>
> if (end_index > file_end)
> end_index = file_end;
> @@ -74,7 +74,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> /* Decompress directly into the page cache buffers */
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + used_pages = squashfs_page_actor_free(actor);
>
> if (res < 0)
> goto mark_errored;
> @@ -86,16 +86,19 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (page[pages - 1]->index == end_index && bytes) {
> - pageaddr = kmap_local_page(page[pages - 1]);
> - memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
> - kunmap_local(pageaddr);
> + bytes_index = start_index + ((res - bytes) >> PAGE_SHIFT);
> + if (used_pages > 0 && bytes &&
> + page[used_pages - 1]->index == bytes_index) {
> + memzero_page(page[used_pages - 1], bytes,
> + PAGE_SIZE - bytes);
> }
>
> /* Mark pages as uptodate, unlock and release */
> for (i = 0; i < pages; i++) {
> - flush_dcache_page(page[i]);
> - SetPageUptodate(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageUptodate(page[i]);
> + }
> unlock_page(page[i]);
> if (page[i] != target_page)
> put_page(page[i]);
> @@ -112,8 +115,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> for (i = 0; i < pages; i++) {
> if (page[i] == NULL || page[i] == target_page)
> continue;
> - flush_dcache_page(page[i]);
> - SetPageError(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageError(page[i]);
> + }
> unlock_page(page[i]);
> put_page(page[i]);
> }
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..c2c5c3937ef9 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + int res = actor->next_page;
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return res;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {

2022-10-20 16:05:39

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/20/22 15:55, Jintao Yin wrote:

> Hi all,
>
> After review the details of page actor, the tail bytes may be written to
> a temp buffer instead the last used page. So before diff would wrongly
> memzero a page which is not the tail bytes in.
>
> In this diff fixes it by caculation of the real index the trailing bytes
> in and check if the last used page matches this index. If the page is
> the real tail bytes in, then memzero the trailing bypte of the page.
>
> Please help test and any feedbacks are welcome.
>
> Thanks,
>
> Jintao

Hi, Jintao,

I have very good news. The bug reproducing Firefox windows and tabs
setup that
crashed with core dump in your previous diff now works a OK.

The build from the previous email was v6.0.2+ and this one is 6.1-rc1+.

As a newbie I cannot say anything of importance, but I feel good about this
being done.

Have a nice day!
Mirsad

> diff --git a/fs/squashfs/file.c b/fs/squashfs/file.c
> index e56510964b22..e1fafd10a850 100644
> --- a/fs/squashfs/file.c
> +++ b/fs/squashfs/file.c
> @@ -506,8 +506,9 @@ static int squashfs_readahead_fragment(struct page **page,
> squashfs_i(inode)->fragment_size);
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> unsigned int n, mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + int res = buffer->error;
>
> - if (buffer->error)
> + if (res)
> goto out;
>
> expected += squashfs_i(inode)->fragment_offset;
> @@ -529,7 +530,7 @@ static int squashfs_readahead_fragment(struct page **page,
>
> out:
> squashfs_cache_put(buffer);
> - return buffer->error;
> + return res;
> }
>
> static void squashfs_readahead(struct readahead_control *ractl)
> @@ -557,6 +558,7 @@ static void squashfs_readahead(struct readahead_control *ractl)
> int res, bsize;
> u64 block = 0;
> unsigned int expected;
> + int nr_used_pages;
>
> nr_pages = __readahead_batch(ractl, pages, max_pages);
> if (!nr_pages)
> @@ -593,18 +595,21 @@ static void squashfs_readahead(struct readahead_control *ractl)
>
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + nr_used_pages = squashfs_page_actor_free(actor);
>
> if (res == expected) {
> int bytes;
> + pgoff_t bytes_index;
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (pages[nr_pages - 1]->index == file_end && bytes)
> - memzero_page(pages[nr_pages - 1], bytes,
> + bytes_index = (index << shift) + ((res - bytes) >> PAGE_SHIFT);
> + if (bytes && nr_used_pages > 0 &&
> + pages[nr_used_pages - 1]->index == bytes_index)
> + memzero_page(pages[nr_used_pages - 1], bytes,
> PAGE_SIZE - bytes);
>
> - for (i = 0; i < nr_pages; i++) {
> + for (i = 0; i < nr_used_pages; i++) {
> flush_dcache_page(pages[i]);
> SetPageUptodate(pages[i]);
> }
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> index f1ccad519e28..ee462ef380bf 100644
> --- a/fs/squashfs/file_direct.c
> +++ b/fs/squashfs/file_direct.c
> @@ -26,14 +26,14 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> struct inode *inode = target_page->mapping->host;
> struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
>
> - int file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
> - int mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> - int start_index = target_page->index & ~mask;
> - int end_index = start_index | mask;
> - int i, n, pages, bytes, res = -ENOMEM;
> + pgoff_t file_end = (i_size_read(inode) - 1) >> PAGE_SHIFT;
> + pgoff_t mask = (1 << (msblk->block_log - PAGE_SHIFT)) - 1;
> + pgoff_t start_index = target_page->index & ~mask;
> + pgoff_t end_index = start_index | mask;
> + int i, pages, used_pages, bytes, res = -ENOMEM;
> + pgoff_t n, bytes_index;
> struct page **page;
> struct squashfs_page_actor *actor;
> - void *pageaddr;
>
> if (end_index > file_end)
> end_index = file_end;
> @@ -74,7 +74,7 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> /* Decompress directly into the page cache buffers */
> res = squashfs_read_data(inode->i_sb, block, bsize, NULL, actor);
>
> - squashfs_page_actor_free(actor);
> + used_pages = squashfs_page_actor_free(actor);
>
> if (res < 0)
> goto mark_errored;
> @@ -86,16 +86,19 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
>
> /* Last page (if present) may have trailing bytes not filled */
> bytes = res % PAGE_SIZE;
> - if (page[pages - 1]->index == end_index && bytes) {
> - pageaddr = kmap_local_page(page[pages - 1]);
> - memset(pageaddr + bytes, 0, PAGE_SIZE - bytes);
> - kunmap_local(pageaddr);
> + bytes_index = start_index + ((res - bytes) >> PAGE_SHIFT);
> + if (used_pages > 0 && bytes &&
> + page[used_pages - 1]->index == bytes_index) {
> + memzero_page(page[used_pages - 1], bytes,
> + PAGE_SIZE - bytes);
> }
>
> /* Mark pages as uptodate, unlock and release */
> for (i = 0; i < pages; i++) {
> - flush_dcache_page(page[i]);
> - SetPageUptodate(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageUptodate(page[i]);
> + }
> unlock_page(page[i]);
> if (page[i] != target_page)
> put_page(page[i]);
> @@ -112,8 +115,10 @@ int squashfs_readpage_block(struct page *target_page, u64 block, int bsize,
> for (i = 0; i < pages; i++) {
> if (page[i] == NULL || page[i] == target_page)
> continue;
> - flush_dcache_page(page[i]);
> - SetPageError(page[i]);
> + if (i < used_pages) {
> + flush_dcache_page(page[i]);
> + SetPageError(page[i]);
> + }
> unlock_page(page[i]);
> put_page(page[i]);
> }
> diff --git a/fs/squashfs/page_actor.h b/fs/squashfs/page_actor.h
> index 95ffbb543d91..c2c5c3937ef9 100644
> --- a/fs/squashfs/page_actor.h
> +++ b/fs/squashfs/page_actor.h
> @@ -29,10 +29,12 @@ extern struct squashfs_page_actor *squashfs_page_actor_init(void **buffer,
> extern struct squashfs_page_actor *squashfs_page_actor_init_special(
> struct squashfs_sb_info *msblk,
> struct page **page, int pages, int length);
> -static inline void squashfs_page_actor_free(struct squashfs_page_actor *actor)
> +static inline int squashfs_page_actor_free(struct squashfs_page_actor *actor)
> {
> + int res = actor->next_page;
> kfree(actor->tmp_buffer);
> kfree(actor);
> + return res;
> }
> static inline void *squashfs_first_page(struct squashfs_page_actor *actor)
> {

--
Mirsad Goran Todorovac
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu
--
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb, Republic of Croatia

2022-10-20 23:25:27

by Bagas Sanjaya

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 10/20/22 22:45, Phillip Lougher wrote:
> On 20/10/2022 14:55, Jintao Yin wrote:
>> Hi all,
>>
>> After review the details of page actor, the tail bytes may be written to
>> a temp buffer instead the last used page. So before diff would wrongly
>> memzero a page which is not the tail bytes in.
>>
>> In this diff fixes it by caculation of the real index the trailing bytes
>> in and check if the last used page matches this index. If the page is
>> the real tail bytes in, then memzero the trailing bypte of the page.
>>
>> Please help test and any feedbacks are welcome.
>>
>> Thanks,
>
> This is a rediculously complex patch.  Sorry, but, absolute NACK.
>
> Additionally, it is poor etiquette and pointless to fix an
> already fixed issue with a _more_ complex patch.
>

Ah! I was about to test his third fixup patch. I prefer to go
with your fix instead (as the formal patch).

--
An old man doll... just what I always wanted! - Clara

2022-10-20 23:52:09

by Slade Watkins

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

[Resend due to formatting issue, thanks gmail.]

On Thu, Oct 20, 2022 at 7:23 PM Bagas Sanjaya <[email protected]> wrote:
>
> Ah! I was about to test his third fixup patch. I prefer to go
> with your fix instead (as the formal patch).

+1, agreed.

-srw

2022-10-21 02:03:55

by Phillip Lougher

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 21/10/2022 00:44, Slade Watkins wrote:
> [Resend due to formatting issue, thanks gmail.]
>
> On Thu, Oct 20, 2022 at 7:23 PM Bagas Sanjaya <[email protected]> wrote:
>>
>> Ah! I was about to test his third fixup patch. I prefer to go
>> with your fix instead (as the formal patch).
>
> +1, agreed.
>
> -srw

His patch is wrong, plus he's broken a number of rules of conduct, two
serious. But, I will not reprimand him as kernel maintainer for what is
probably extreme naivety, unless I have to.

The link to the full set of patches is here

https://lore.kernel.org/lkml/[email protected]/

The first patch fixes the regression reported here in the correct way.

The second patch fixes another regression (which is separate to the
regression causing the issue here). This is where the code incorrectly
extends readahead beyond the end of the file. This is the reason for
the unused trailing pages that Jintao Yin noticed. But, this patch
fixes the cause, rather than fixing the symptom.

The third patch fixes the buffer release race condition that I
posted a fix for earlier.

Spitting this into three patches is one of the rules. Each patch should
do one thing, and one thing only. Three separate regressions means
three separate patches. This is a requirement for "git bisect" to work
effectively.

Phillip

2022-10-21 04:13:29

by Jintao Yin

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On Thu, Oct 20, 2022 at 04:45:53PM +0100, Phillip Lougher wrote:
> On 20/10/2022 14:55, Jintao Yin wrote:
> > Hi all,
> >
> > After review the details of page actor, the tail bytes may be written to
> > a temp buffer instead the last used page. So before diff would wrongly
> > memzero a page which is not the tail bytes in.
> >
> > In this diff fixes it by caculation of the real index the trailing bytes
> > in and check if the last used page matches this index. If the page is
> > the real tail bytes in, then memzero the trailing bypte of the page.
> >
> > Please help test and any feedbacks are welcome.
> >
> > Thanks,
>
> This is a rediculously complex patch. Sorry, but, absolute NACK.
>
> Additionally, it is poor etiquette and pointless to fix an
> already fixed issue with a _more_ complex patch.
>
> Phillip
>

Dear Phillip,

I'd like to apologize to you for my serious breakages of code of conduct. And
I had no intend to offend you. I'm new to this community and I will do my best to
follow the code of conduct and keep with good etiquettes.

Best Regards,
Jintao

2022-10-21 07:33:54

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 21.10.2022. 3:48, Phillip Lougher wrote:

> On 21/10/2022 00:44, Slade Watkins wrote:
>> [Resend due to formatting issue, thanks gmail.]
>>
>> On Thu, Oct 20, 2022 at 7:23 PM Bagas Sanjaya <[email protected]>
>> wrote:
>>>
>>> Ah! I was about to test his third fixup patch. I prefer to go
>>> with your fix instead (as the formal patch).
>>
>> +1, agreed.
>>
>> -srw
>
> His patch is wrong, plus he's broken a number of rules of conduct, two
> serious.  But, I will not reprimand him as kernel maintainer for what is
> probably extreme naivety, unless I have to.
>
> The link to the full set of patches is here
>
> https://lore.kernel.org/lkml/[email protected]/
>
>
> The first patch fixes the regression reported here in the correct way.
>
> The second patch fixes another regression (which is separate to the
> regression causing the issue here).  This is where the code incorrectly
> extends readahead beyond the end of the file.  This is the reason for
> the unused trailing pages that Jintao Yin noticed.  But, this patch
> fixes the cause, rather than fixing the symptom.
>
> The third patch fixes the buffer release race condition that I
> posted a fix for earlier.
>
> Spitting this into three patches is one of the rules.  Each patch should
> do one thing, and one thing only.  Three separate regressions means
> three separate patches.  This is a requirement for "git bisect" to work
> effectively.

Dear Mr. Phillip,

My squashfs learning curve isn't yet at the stage to verify what your
statements,
however, separating each of the regression causes into different patches
seems
logical even to me.

I can confirm a successful build of mainline vanilla 6.1-rc1 + your
patches-[012/3].

The snapped Firefox with the previously held windows and tabs that caused
the initial regression reproduce now work seemingly OK.

Have a nice day.

Best regards,
Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

2022-10-21 09:11:32

by Mirsad Todorovac

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7

On 21.10.2022. 3:48, Phillip Lougher wrote:

> On 21/10/2022 00:44, Slade Watkins wrote:
>> [Resend due to formatting issue, thanks gmail.]
>>
>> On Thu, Oct 20, 2022 at 7:23 PM Bagas Sanjaya <[email protected]>
>> wrote:
>>>
>>> Ah! I was about to test his third fixup patch. I prefer to go
>>> with your fix instead (as the formal patch).
>>
>> +1, agreed.
>>
>> -srw
>
> His patch is wrong, plus he's broken a number of rules of conduct, two
> serious.  But, I will not reprimand him as kernel maintainer for what is
> probably extreme naivety, unless I have to.
>
> The link to the full set of patches is here
>
> https://lore.kernel.org/lkml/[email protected]/
>
>
> The first patch fixes the regression reported here in the correct way.
>
> The second patch fixes another regression (which is separate to the
> regression causing the issue here).  This is where the code incorrectly
> extends readahead beyond the end of the file.  This is the reason for
> the unused trailing pages that Jintao Yin noticed.  But, this patch
> fixes the cause, rather than fixing the symptom.
>
> The third patch fixes the buffer release race condition that I
> posted a fix for earlier.
>
> Spitting this into three patches is one of the rules.  Each patch should
> do one thing, and one thing only.  Three separate regressions means
> three separate patches.  This is a requirement for "git bisect" to work
> effectively.

Confirming successful build mainline vanilla 6.0.2 with patches [012/3].

And the combination of Firefox snap windows and tabs that reproduced the
regression
appears to work.

Best regards,
Mirsad

--
Mirsad Todorovac
System engineer
Faculty of Graphic Arts | Academy of Fine Arts
University of Zagreb
Republic of Croatia, the European Union
--
Sistem inženjer
Grafički fakultet | Akademija likovnih umjetnosti
Sveučilište u Zagrebu

2022-11-04 13:04:40

by Thorsten Leemhuis

[permalink] [raw]
Subject: Re: BISECT result: 6.0.0-RC kernels trigger Firefox snap bug with 6.0.0-rc3 through 6.0.0-rc7 #forregzbot

[Note: this mail is primarily send for documentation purposes and/or for
regzbot, my Linux kernel regression tracking bot. That's why I removed
most or all folks from the list of recipients, but left any that looked
like a mailing lists. These mails usually contain '#forregzbot' in the
subject, to make them easy to spot and filter out.]

On 16.10.22 14:43, Thorsten Leemhuis wrote:
> #regzbot introduced: b09a7a036d2035b14636c

#regzbot fixed-by: e11c4e088be