2023-07-17 21:51:05

by Theodore Ts'o

Subject: Re: next: kernel BUG at fs/ext4/mballoc.c:4369!

On Mon, Jul 17, 2023 at 08:04:54PM +0530, Ritesh Harjani wrote:
>
> These can basically trigger in extremely low memory space and only when
> such ranges exist in the PA rbtree. Hence, I guess it is a little hard
> to trigger the race.

Ritesh, thanks for looking into this!

Naresh, how easy is it for you to trigger the BUG when using LTP? I
did two xfstests runs using "gce-xfstests -c ext2/default -g auto",
one on the ext4 dev branch, and one on linux-next 20230717, and I
wasn't able to trigger the BUG.

If you can trivially trigger it using LTP (perhaps with a low memory
configuration in your test setup?), that would be useful to know.

Cheers,

- Ted


2023-07-18 01:59:49

by Ritesh Harjani

Subject: Re: next: kernel BUG at fs/ext4/mballoc.c:4369!

"Theodore Ts'o" writes:

> On Mon, Jul 17, 2023 at 08:04:54PM +0530, Ritesh Harjani wrote:
>>
>> These can basically trigger in extremely low memory space and only when
>> such ranges exist in the PA rbtree. Hence, I guess it is a little hard
>> to trigger the race.
>
> Ritesh, thanks for looking into this!
>
> Naresh, how easy is it for you to trigger the BUG when using LTP? I
> did two xfstests runs using "gce-xfstests -c ext2/default -g auto",
> one on the ext4 dev branch, and one on linux-next 20230717, and I
> wasn't able to trigger the BUG.
>
> If you can trivially trigger it using LTP (perhaps with a low memory
> configuration in your test setup?), that would be useful to know.

Hi Ted,

Sorry for the wrong choice of words. By low memory space I meant low disk
space, i.e. the ENOSPC test (fs_fill). I reproduced it like this:

root@ubuntu:/opt/ltp# while [ 1 ]; do ./runltp -s fs_fill; sleep 1; done

For me it took around 1-2 hours to reproduce when I tried again.
I expect that running generic/269 (fsstress ENOSPC) in a while loop like
this might also hit the bug, but I haven't tried it.
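
A minimal sketch of such a loop, assuming an xfstests checkout whose ./check
script and local config (TEST_DEV, SCRATCH_DEV) are already set up. The
helper name run_loop and the LOOP_SLEEP variable are made up for
illustration. The loop stops at the first failing run or after a fixed
number of iterations, so it does not spin forever once something goes wrong:

```shell
#!/bin/sh
# Sketch only: repeat a command until it fails or MAX runs have completed.
# Usage: run_loop MAX CMD [ARGS...]
run_loop() {
    max=$1
    shift
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        echo "=== iteration $i ==="
        # Stop at the first non-zero exit status and report which run failed.
        "$@" || { echo "command failed on iteration $i"; return 1; }
        # Pause between runs; LOOP_SLEEP is a hypothetical knob, default 1s.
        sleep "${LOOP_SLEEP:-1}"
    done
    return 0
}

# Example (assumed paths, not run here):
#   cd /path/to/xfstests && run_loop 1000 ./check generic/269
```

Whether generic/269 actually reaches the same BUG is untested, as noted
above. The loop only automates the retries.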

-ritesh

2023-07-18 09:20:12

by Petr Vorel

Subject: Re: [LTP] next: kernel BUG at fs/ext4/mballoc.c:4369!

> "Theodore Ts'o" writes:

> > On Mon, Jul 17, 2023 at 08:04:54PM +0530, Ritesh Harjani wrote:

> >> These can basically trigger in extremely low memory space and only when
> >> such ranges exist in the PA rbtree. Hence, I guess it is a little hard
> >> to trigger the race.

> > Ritesh, thanks for looking into this!

> > Naresh, how easy is it for you to trigger the BUG when using LTP? I
> > did two xfstests runs using "gce-xfstests -c ext2/default -g auto",
> > one on the ext4 dev branch, and one on linux-next 20230717, and I
> > wasn't able to trigger the BUG.

> > If you can trivially trigger it using LTP (perhaps with a low memory
> > configuration in your test setup?), that would be useful to know.

> Hi Ted,

Hi Ted, Ritesh, all,

> Sorry for the wrong choice of words. By low memory space I meant low disk
> space, i.e. the ENOSPC test (fs_fill). I reproduced it like this:

> root@ubuntu:/opt/ltp# while [ 1 ]; do ./runltp -s fs_fill; sleep 1; done

Better late than never: LTP C tests can be run without any wrapper.
E.g. to reproduce the bug triggered by fs_fill, you can just run:

git clone https://github.com/linux-test-project/ltp.git && cd ltp
./ci/your-distro.sh # optionally install the dependencies
make autotools
./configure
cd testcases/kernel/fs/fs_fill/
make -j`nproc`
while true; do ./fs_fill; sleep 1; done
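
Since the BUG can take an hour or more to appear, it may help to stop
looping as soon as the kernel log reports it. A minimal sketch, assuming
dmesg is readable by the current user (this may need root) and that the
machine keeps running after the BUG. The function names has_kernel_bug and
loop_until_bug are made up for illustration:

```shell
#!/bin/sh
# Sketch only: succeed if the kernel log text on stdin contains a BUG report.
has_kernel_bug() {
    grep -q 'kernel BUG at'
}

# Run ./fs_fill from the current directory until dmesg shows a kernel BUG,
# then print the matching line and the 30 lines that follow it.
loop_until_bug() {
    until dmesg | has_kernel_bug; do
        ./fs_fill
        sleep 1
    done
    dmesg | grep -A 30 'kernel BUG at'
}

# Example (not run here): cd testcases/kernel/fs/fs_fill && loop_until_bug
```

If the log already holds an older BUG report, the loop exits immediately.
Clearing the ring buffer first with "dmesg -C" (root only) avoids that.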

NOTE:
1) runltp is deprecated, replaced by runltp-ng [1]
2) again, there is no need to use this shell wrapper to run a single C binary

Kind regards,
Petr

[1] https://github.com/linux-test-project/runltp-ng

> For me it took around 1-2 hours to reproduce when I tried again.
> I expect that running generic/269 (fsstress ENOSPC) in a while loop like
> this might also hit the bug, but I haven't tried it.

> -ritesh

2023-07-18 12:02:25

by Naresh Kamboju

Subject: Re: next: kernel BUG at fs/ext4/mballoc.c:4369!

On Tue, 18 Jul 2023 at 03:04, Theodore Ts'o wrote:
>
> On Mon, Jul 17, 2023 at 08:04:54PM +0530, Ritesh Harjani wrote:
> >
> > These can basically trigger in extremely low memory space and only when
> > such ranges exist in the PA rbtree. Hence, I guess it is a little hard
> > to trigger the race.
>
> Ritesh, thanks for looking into this!
>
> Naresh, how easy is it for you to trigger the BUG when using LTP? I
> did two xfstests runs using "gce-xfstests -c ext2/default -g auto",
> one on the ext4 dev branch, and one on linux-next 20230717, and I
> wasn't able to trigger the BUG.
>
> If you can trivially trigger it using LTP (perhaps with a low memory
> configuration in your test setup?), that would be useful to know.

In our setup it is not easy to reproduce, even with the same device and
the same build, on x86_64 and on arm64 juno-r2 connected to an SSD drive
and running LTP fs testing.

LTP fs_fill triggers several ENOSPC errors before hitting the reported
kernel BUG at fs/ext4/mballoc.c:4369!

The reported issue has not been seen on the latest linux-next tags.

- Naresh

>
> Cheers,
>
> - Ted