2010-06-27 16:56:10

by Nebojsa Trpkovic

Subject: lifetime_write_kbytes isn't preserved during unclean shutdown

I've noticed that lifetime_write_kbytes isn't preserved during unclean
shutdown.
It happens at least on my Intel X25-V, but I guess it's an ext4 issue,
not Intel's. :)

lifetime_write_kbytes grows constantly while the filesystem is in use.
If the computer is rebooted regularly with a clean unmount, the counter
picks up where it left off on the next boot and everything seems to
work right.
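
For anyone who wants to watch the counter, here is a minimal sketch that
reads it from sysfs. It assumes the attribute is exposed as
/sys/fs/ext4/<device>/lifetime_write_kbytes; the function name and the
sysfs_root parameter are made up for illustration.

```python
from pathlib import Path


def read_lifetime_write_kbytes(sysfs_root="/sys/fs/ext4"):
    """Return {device: kbytes} for each ext4 filesystem found under sysfs_root."""
    counters = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # No ext4 sysfs tree here (not Linux, or no ext4 mounted).
        return counters
    for attr in sorted(root.glob("*/lifetime_write_kbytes")):
        try:
            counters[attr.parent.name] = int(attr.read_text().strip())
        except (OSError, ValueError):
            # Skip entries that vanish or hold unexpected content.
            continue
    return counters


if __name__ == "__main__":
    for device, kbytes in read_lifetime_write_kbytes().items():
        print(f"{device}: {kbytes} kB ({kbytes / 1024**2:.1f} GiB)")
```

Running it before and after a reboot is enough to compare the values.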

Then again, if there was an unclean shutdown/reboot,
lifetime_write_kbytes is reset to its value at the last clean unmount.
Example:

lifetime_write_kbytes 18 GB
lifetime_write_kbytes 20 GB
lifetime_write_kbytes 22 GB
clean reboot
lifetime_write_kbytes 22 GB
lifetime_write_kbytes 24 GB
lifetime_write_kbytes 26 GB
unclean reboot (reboot switch/power failure/whatever)
lifetime_write_kbytes 22 GB
...

I guess that /sys takes the value stored in the filesystem itself and
increments it during system operation, but writes the new
lifetime_write_kbytes value back to the filesystem only on clean
unmount.
It seems it would not hurt to write an intermediate
lifetime_write_kbytes value to the filesystem periodically, maybe not
too often (every 10 minutes or so), and so avoid the unnecessary loss
of lifetime writes on an unclean reboot.
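
The guess above can be written down as a toy model. This is only a
sketch of the behaviour as observed, not of the actual ext4 code; the
class and method names are hypothetical, and the numbers are the GB
values from the example.

```python
class LifetimeCounterModel:
    """Toy model: counter kept in memory, persisted only on clean unmount."""

    def __init__(self, on_disk_gb):
        self.on_disk_gb = on_disk_gb      # value stored in the filesystem
        self.in_memory_gb = on_disk_gb    # value shown through /sys

    def write(self, gb):
        # Normal operation only advances the in-memory value.
        self.in_memory_gb += gb

    def clean_reboot(self):
        # Unmount writes the value back; the next mount reads it again.
        self.on_disk_gb = self.in_memory_gb
        self.in_memory_gb = self.on_disk_gb

    def unclean_reboot(self):
        # Nothing is written back; the next mount sees the stale value.
        self.in_memory_gb = self.on_disk_gb


fs = LifetimeCounterModel(18)
fs.write(2)
fs.write(2)
fs.clean_reboot()
print(fs.in_memory_gb)    # 22
fs.write(2)
fs.write(2)
print(fs.in_memory_gb)    # 26
fs.unclean_reboot()
print(fs.in_memory_gb)    # 22
```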

Nebojsa Trpkovic



2010-06-28 19:57:42

by Theodore Ts'o

Subject: Re: lifetime_write_kbytes isn't preserved during unclean shutdown

On Sun, Jun 27, 2010 at 06:56:07PM +0200, Nebojsa Trpkovic wrote:
> I've noticed that lifetime_write_kbytes isn't preserved during unclean
> shutdown.

Yes, right now we are only updating the superblock's lifetime write
kbytes at unmount time. It should be possible to do a better job; but
I don't want to increase writes to the disk just to keep the
s_lifetime value up-to-date. So what we should probably do is update
it when we are going to be updating the superblock anyway (i.e., when
we update the orphaned inode linked list) and maybe on some periodic
basis (say once an hour) otherwise.
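
To make the trade-off concrete, here is a rough sketch of when the
counter would reach disk under that policy. Nothing here is kernel
code: flush_times and its parameters are hypothetical, time is in
seconds since mount, and the one-hour period is the figure suggested
above.

```python
def flush_times(superblock_writes, end_time, period=3600):
    """Return (time, reason) pairs at which the counter would hit the disk.

    superblock_writes: times at which the superblock is written anyway
    (for example for an orphan list update); the counter rides along.
    A timer fills any gap longer than `period` with a write of its own.
    """
    flushes = []
    last = 0

    def fill_gap(until):
        nonlocal last
        while until - last > period:
            last += period
            flushes.append((last, "timer"))

    for t in sorted(superblock_writes):
        fill_gap(t)
        flushes.append((t, "piggyback"))
        last = t
    fill_gap(end_time)
    return flushes


# Two superblock updates early on, then three hours of uptime in total.
print(flush_times([600, 1200], end_time=3 * 3600))
# [(600, 'piggyback'), (1200, 'piggyback'), (4800, 'timer'), (8400, 'timer')]
```

The "piggyback" entries cost no extra I/O. The "timer" entries are
writes issued purely for the counter, and in this model they bound the
amount of history an unclean shutdown can lose to one period.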

Thanks for reporting this. I'll put it on my backlog queue, unless
someone beats me to submitting a patch....

- Ted

2010-06-30 05:46:58

by Andreas Dilger

Subject: Re: lifetime_write_kbytes isn't preserved during unclean shutdown

On 2010-06-28, at 13:57, [email protected] wrote:
> On Sun, Jun 27, 2010 at 06:56:07PM +0200, Nebojsa Trpkovic wrote:
>> I've noticed that lifetime_write_kbytes isn't preserved during unclean
>> shutdown.
>
> Yes, right now we are only updating the superblock's lifetime write
> kbytes at unmount time. It should be possible to do a better job; but
> I don't want to increase writes to the disk just to keep the
> s_lifetime value up-to-date. So what we should probably do is update
> it when we are going to be updating the superblock anyway (i.e., when
> we update the orphaned inode linked list)

Could we also update the superblock blocks/inodes free counters at that time as well?

> and maybe on some periodic basis (say once an hour) otherwise.

I don't think that is the right thing to do, unless the filesystem is still active for other reasons. We don't necessarily want to spin up the disks every hour if the filesystem is inactive. I'd rather write out the superblock for an existing transaction while it is still active than generate a new transaction for no particular reason.

One way to do this would be to have a JBD transaction pre-commit callback, and if the superblock has not been written in N seconds then it can be added as part of that transaction (if it will fit). If it is already in the transaction it can be updated via the existing bh callbacks that OCFS2 is using.
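
As a rough illustration of the pre-commit idea, here is a small model.
It is not the JBD API: Transaction, SuperblockFlusher and pre_commit
are hypothetical names, and the block limit stands in for "if it will
fit".

```python
class Transaction:
    """Stand-in for a running journal transaction with limited room."""

    def __init__(self, max_blocks):
        self.max_blocks = max_blocks
        self.blocks = set()

    def has_room(self):
        return len(self.blocks) < self.max_blocks

    def add(self, block):
        self.blocks.add(block)


class SuperblockFlusher:
    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.last_written = 0.0

    def pre_commit(self, txn, now):
        """Run just before txn commits; True if the superblock goes out with it."""
        if "superblock" in txn.blocks:
            # Already part of the transaction: it carries fresh counters.
            self.last_written = now
            return True
        if now - self.last_written < self.interval:
            return False          # written recently enough
        if not txn.has_room():
            return False          # will not fit; retry at the next commit
        txn.add("superblock")
        self.last_written = now
        return True


flusher = SuperblockFlusher(interval_seconds=300)
txn = Transaction(max_blocks=8)
txn.add("inode-table-block")
print(flusher.pre_commit(txn, now=400.0))    # True
```

Because the hook only runs when a transaction is committing anyway, an
idle filesystem never triggers a write in this model.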

Cheers, Andreas






2010-06-30 13:31:12

by Theodore Ts'o

Subject: Re: lifetime_write_kbytes isn't preserved during unclean shutdown

On Tue, Jun 29, 2010 at 11:46:57PM -0600, Andreas Dilger wrote:
> On 2010-06-28, at 13:57, [email protected] wrote:
> > On Sun, Jun 27, 2010 at 06:56:07PM +0200, Nebojsa Trpkovic wrote:
> >> I've noticed that lifetime_write_kbytes isn't preserved during unclean
> >> shutdown.
> >
> > Yes, right now we are only updating the superblock's lifetime write
> > kbytes at unmount time. It should be possible to do a better job; but
> > I don't want to increase writes to the disk just to keep the
> > s_lifetime value up-to-date. So what we should probably do is update
> > it when we are going to be updating the superblock anyway (i.e., when
> > we update the orphaned inode linked list)
>
> Could we also update the superblock blocks/inodes free counters at
> that time as well?

Remind me again why you wanted it. You had some use case where you
wanted to be able to read the file system's block device directly and
have vaguely correct free inode/block numbers in the superblock?

> I don't think that is the right thing to do, unless the filesystem
> is still active for other reasons. We don't necessarily want to
> spin up the disks every hour if the filesystem is inactive. I'd
> rather write out the superblock for an existing transaction while it
> is still active than generate a new transaction for no particular
> reason.
>
> One way to do this would be to have a JBD transaction pre-commit
> callback, and if the superblock has not been written in N seconds
> then it can be added as part of that transaction (if it will fit).
> If it is already in the transaction it can be updated via the
> existing bh callbacks that OCFS2 is using.

That seems like a reasonable way of doing things. I think N seconds should
be in the region of every 5-60 minutes, though. And yes, absolutely
we wouldn't want to spin up the disk if there was no other activity.

- Ted

2010-06-30 21:54:03

by Andreas Dilger

Subject: Re: lifetime_write_kbytes isn't preserved during unclean shutdown

On 2010-06-30, at 07:31, [email protected] wrote:
> On Tue, Jun 29, 2010 at 11:46:57PM -0600, Andreas Dilger wrote:
>> Could we also update the superblock blocks/inodes free counters at
>> that time as well?
>
> Remind me again why you wanted it. You had some use case where you
> wanted to be able to read the file system's block device directly and
> have vaguely correct free inode/block numbers in the superblock?

That is so that running "e2fsck -fn" on a quiet filesystem does not complain. I always find it annoying that it fails with the inode and block counters being incorrect, even though the filesystem has been idle for hours.

It would also be possible to fix e2fsck not to complain about this for a read-only e2fsck run (as it already does with "e2fsck -fy"). However, the last I recall from that approach was that you didn't want it to return 0 if there was anything that actually needed fixing.

>> One way to do this would be to have a JBD transaction pre-commit
>> callback, and if the superblock has not been written in N seconds
>> then it can be added as part of that transaction (if it will fit).
>> If it is already in the transaction it can be updated via the
>> existing bh callbacks that OCFS2 is using.
>
> That seems like a reasonable way of doing things. I think N seconds should
> be in the region of every 5-60 minutes, though. And yes, absolutely
> we wouldn't want to spin up the disk if there was no other activity.

I didn't say how big "N" would be :-).

It may be easier to have the _first_ handle started after N seconds where the superblock was not updated add in the superblock to the transaction, and then the superblock buffer gets updated just before transaction close time with whatever we want to stick in there.
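
Roughly, and only as a sketch with made-up names (JournalModel,
start_handle and commit are not JBD functions), that could behave like
this:

```python
class JournalModel:
    """Toy model: the first handle after N idle seconds pulls in the superblock."""

    def __init__(self, interval_seconds):
        self.interval = interval_seconds
        self.last_sb_write = 0.0
        self.sb_in_txn = False      # superblock joined the running transaction?
        self.kbytes_written = 0     # in-memory lifetime counter
        self.on_disk_kbytes = 0     # counter as stored in the superblock

    def start_handle(self, now):
        # Only the first handle after the interval has to add the superblock.
        if not self.sb_in_txn and now - self.last_sb_write >= self.interval:
            self.sb_in_txn = True

    def write(self, kbytes, now):
        self.start_handle(now)
        self.kbytes_written += kbytes

    def commit(self, now):
        if self.sb_in_txn:
            # Fill the buffer in at the last moment, so that it also covers
            # writes made after the superblock joined the transaction.
            self.on_disk_kbytes = self.kbytes_written
            self.last_sb_write = now
            self.sb_in_txn = False


j = JournalModel(interval_seconds=300)
j.write(10, now=10.0)
j.commit(now=15.0)
print(j.on_disk_kbytes)     # 0, the superblock was not due yet
j.write(20, now=400.0)      # first handle after the interval
j.write(5, now=401.0)
j.commit(now=405.0)
print(j.on_disk_kbytes)     # 35
```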

Cheers, Andreas