2019-10-02 18:26:03

by A. Wilcox

[permalink] [raw]
Subject: t_ext_jnl_rm test takes 96 seconds to finish

Hello there,

While building e2fsprogs 1.45.4, I noticed the following output during
testing:

t_ext_jnl_rm: remove missing external journal device: ok
t_ext_jnl_rm: *** took 96 seconds to finish ***
t_ext_jnl_rm: consider adding t_ext_jnl_rm/is_slow_test

I didn't see any mention of this in the list archives, and I'm not
entirely sure if it should really be marked as a slow test.

System is a 64-thread POWER9 @ 3.8 GHz with DDR4-2400 RAM, so it isn't
exactly a "low end" machine.

Let me know if you need any further detail from my end; I'll be happy to
provide it. Thank you.

Best to you and yours,
--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org


Attachments:
signature.asc (849.00 B)
OpenPGP digital signature

2019-10-02 21:46:38

by Theodore Ts'o

[permalink] [raw]
Subject: Re: t_ext_jnl_rm test takes 96 seconds to finish

On Wed, Oct 02, 2019 at 01:16:51PM -0500, A. Wilcox wrote:
> Hello there,
>
> While building e2fsprogs 1.45.4, I noticed the following output during
> testing:
>
> t_ext_jnl_rm: remove missing external journal device: ok
> t_ext_jnl_rm: *** took 96 seconds to finish ***
> t_ext_jnl_rm: consider adding t_ext_jnl_rm/is_slow_test
>
> I didn't see any mention of this in the list archives, and I'm not
> entirely sure if it should really be marked as a slow test.
>
> System is a 64-thread POWER9 @ 3.8 GHz with DDR4-2400 RAM, so it isn't
> exactly a "low end" machine.

The first time I ran this test, it took 20 seconds. (And that was
only because I had a WDC external SSD attached to my laptop and it
took time to spin it up; more on that below.) The next time, it was
nearly instaneous:

% time ./test_script t_ext_jnl_rm
t_ext_jnl_rm: remove missing external journal device: ok
356 tests succeeded 0 tests failed

real 0m0.242s
user 0m0.053s
sys 0m0.114s

If you look at the test script, you'll see that we are creating a file
system, setting up an external journal which doesn't exist:

dd if=/dev/zero of=$TMPFILE bs=1k count=512 > /dev/null 2>&1

echo mke2fs -q -F -o Linux -b 1024 $TMPFILE >> $OUT
$MKE2FS -q -F -o Linux -I 128 -b 1024 $TMPFILE >> $OUT 2>&1

echo "debugfs add journal device/UUID" >> $OUT
$DEBUGFS -w -f - $TMPFILE <<- EOF >> $OUT 2>&1
feature has_journal
ssv journal_dev 0x9999
ssv journal_uuid 1db3f677-6832-4adb-bafc-8e4059c30a33
EOF

... and then we ask tune2fs to remove the journal:

echo "tune2fs -f -O ^has_journal $TMPFILE" >> $OUT
$TUNE2FS -f -O ^has_journal $TMPFILE >> $OUT 2>&1

So the time that it takes is based on how long it takes to search all
of the disks attached to the system looking for an external journal
with the uuid 1db3f677-6832-4adb-bafc-8e4059c30a33.

On most systems, this is fast. If you happen to have a slow device
attached to your system, then this can take a while --- but there's
really not much we can do about this. I suppose we could try to add a
test mock which disables the external journal search, if there isn't
any way you can speed things up on your end now that you know what's
causing the delay?

- Ted

2019-10-02 21:53:29

by A. Wilcox

[permalink] [raw]
Subject: Re: t_ext_jnl_rm test takes 96 seconds to finish

On 02/10/2019 14:59, Theodore Y. Ts'o wrote:
> On Wed, Oct 02, 2019 at 01:16:51PM -0500, A. Wilcox wrote:
>> While building e2fsprogs 1.45.4, I noticed the following output during
>> testing:
>>
>> t_ext_jnl_rm: remove missing external journal device: ok
>> t_ext_jnl_rm: *** took 96 seconds to finish ***
>> t_ext_jnl_rm: consider adding t_ext_jnl_rm/is_slow_test
>>
>> I didn't see any mention of this in the list archives, and I'm not
>> entirely sure if it should really be marked as a slow test.
>
> The first time I ran this test, it took 20 seconds. (And that was
> only because I had a WDC external SSD attached to my laptop and it
> took time to spin it up; more on that below.) The next time, it was
> nearly instaneous:
>
> % time ./test_script t_ext_jnl_rm
> t_ext_jnl_rm: remove missing external journal device: ok
> 356 tests succeeded 0 tests failed
>
> real 0m0.242s
> user 0m0.053s
> sys 0m0.114s
>
> If you look at the test script, you'll see that we are creating a file
> system, setting up an external journal which doesn't exist:
>
>
> ... and then we ask tune2fs to remove the journal:
>
>
> So the time that it takes is based on how long it takes to search all
> of the disks attached to the system looking for an external journal
> with the uuid 1db3f677-6832-4adb-bafc-8e4059c30a33.
>
> On most systems, this is fast. If you happen to have a slow device
> attached to your system, then this can take a while --- but there's
> really not much we can do about this. I suppose we could try to add a
> test mock which disables the external journal search, if there isn't
> any way you can speed things up on your end now that you know what's
> causing the delay?
>
> - Ted


Ah, okay. This machine happened to have backup devices still connected
which are external USB spinning rust drives. They could easily take 30+
seconds to spin up, each, and there were two. Additionally it has a
caching HDD internally that may have been spun down. That explains it.

Thank you for making sense of this, and doing so promptly. I really
appreciate it!

Best,
--arw


--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org


Attachments:
signature.asc (849.00 B)
OpenPGP digital signature