2012-05-07 10:43:07

by Daniel Pocock

Subject: ext4 barrier on SCSI vs SATA?



I understand that for barriers to work, the fs needs to be able to tell
the drive when to move data from hardware cache to the platter.

I notice various pages mention the SYNCHRONIZE CACHE command (SCSI) and
the FLUSH_CACHE_EXT command (ATA) as if they are equivalent.

Looking more closely, I found the SYNCHRONIZE CACHE supports a block
range, whereas it appears that FLUSH_CACHE_EXT always flushes the entire
cache (maybe 32MB or 64MB on a SATA drive)

Does ext4 always flush all of the cache contents? Or if the system is
SCSI, does it only selectively flush the blocks that must be flushed to
maintain coherency?



2012-05-09 19:51:00

by Jan Kara

Subject: Re: ext4 barrier on SCSI vs SATA?

On Mon 07-05-12 10:35:48, Daniel Pocock wrote:
>
>
> I understand that for barriers to work, the fs needs to be able to tell
> the drive when to move data from hardware cache to the platter.
>
> I notice various pages mention the SYNCHRONIZE CACHE command (SCSI) and
> the FLUSH_CACHE_EXT command (ATA) as if they are equivalent.
>
> Looking more closely, I found the SYNCHRONIZE CACHE supports a block
> range, whereas it appears that FLUSH_CACHE_EXT always flushes the entire
> cache (maybe 32MB or 64MB on a SATA drive)
>
> Does ext4 always flush all of the cache contents? Or if the system is
> SCSI, does it only selectively flush the blocks that must be flushed to
> maintain coherency?
We always flush the complete cache. Actually, AFAIK there's no interface
for the filesystem to tell lower layers that only some blocks should be
flushed. And even if we could, journaling is designed so that we need to
flush caches for most blocks anyway, because data blocks usually need to be
on stable storage when a transaction commits.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
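
To make the ordering argument above concrete, here is a minimal toy model, not ext4 or jbd2 code. The names `ToyDisk`, `destage`, and `commit` are hypothetical and exist only for illustration. It assumes a drive that may write cached blocks to the platter in any order, and a commit that flushes the whole cache before and after writing the commit record.

```python
class ToyDisk:
    """Hypothetical model of a drive with a volatile write cache."""

    def __init__(self):
        self.cache = {}    # volatile: contents are lost on power failure
        self.platter = {}  # stable storage

    def write(self, block, data):
        # A write "completes" as soon as it reaches the cache.
        self.cache[block] = data

    def flush(self):
        # Whole-cache flush: every cached block is moved to the platter.
        self.platter.update(self.cache)
        self.cache.clear()

    def destage(self, block):
        # The drive may write back any single cached block whenever it likes.
        if block in self.cache:
            self.platter[block] = self.cache.pop(block)

    def power_loss(self):
        # Anything still in the volatile cache is gone.
        self.cache.clear()


def commit(disk, blocks, use_flush=True):
    """Toy transaction commit: blocks, flush, commit record, flush."""
    for block, data in blocks.items():
        disk.write(block, data)
    if use_flush:
        disk.flush()  # blocks must be stable before the commit record
    disk.write("commit", True)
    if use_flush:
        disk.flush()  # commit record must be stable before we return
```

In this model, calling `commit(disk, blocks, use_flush=False)`, then `disk.destage("commit")`, then `disk.power_loss()` leaves a commit record on the platter without the blocks it describes. That is the kind of inconsistency the flushes are there to prevent.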

2012-05-11 07:19:34

by Asdo

Subject: Re: ext4 barrier on SCSI vs SATA?

On 05/09/12 21:50, Jan Kara wrote:
> [...]

I have some troubles understanding the barriers thing, can you help me?


In the past some blockdevices would not provide / propagate the
barriers, e.g. MD raid 5 would not. So filesystems during mount would
try the barrier operation and see that it wouldn't work, so they would
disable barrier option and mount as nobarrier.

However the flush was always available (I think), in fact databases
would not corrupt (not even above ext4 nobarrier, above a raid5 without
barriers) if fsync was called at proper times.

So first question is : why filesystems were not using the flush as a
barrier like databases did?

Second question is : was a nobarrier mount (ext4) more risky in terms of
data or metadata loss on sudden power loss?

Thank you
Asdo

2012-05-14 09:02:47

by Jan Kara

Subject: Re: ext4 barrier on SCSI vs SATA?

On Fri 11-05-12 07:08:20, Asdo wrote:
> On 05/09/12 21:50, Jan Kara wrote:
> I have some troubles understanding the barriers thing, can you help me?
>
> In the past some blockdevices would not provide / propagate the
> barriers, e.g. MD raid 5 would not. So filesystems during mount
> would try the barrier operation and see that it wouldn't work, so
> they would disable barrier option and mount as nobarrier.
Correct.

> However the flush was always available (I think), in fact databases
> would not corrupt (not even above ext4 nobarrier, above a raid5
> without barriers) if fsync was called at proper times.
This is not true. Both cache flushes and barriers were implemented by
the same mechanism in older kernels. Thus if the device did not properly
propagate the barrier capability, then fsync did not provide any guarantees
in case of power failure (if there are volatile write caches in the storage
device).

> So first question is : why filesystems were not using the flush as a
> barrier like databases did?
The above explains that I guess.

> Second question is : was a nobarrier mount (ext4) more risky in
> terms of data or metadata loss on sudden power loss?
Sure, if you have volatile write caches (the normal situation on all disk
drives when you don't have a UPS), then nobarrier can cause filesystem
corruption on power failure. It was like that before and it is still true.
Nobarrier is there for cases where you are sure you won't have an unexpected
power failure (you have a UPS, or a laptop with a working battery, and
everything is set up to shut down the system cleanly when the battery gets
low), or you have disabled write caches on the device, or the device itself
has battery-backed caches (the case with higher-grade storage cards or NAS).

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
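
As a rough way to check whether a filesystem is mounted without barriers, the sketch below parses text in the `/proc/mounts` format. The function name `barriers_disabled` is hypothetical. It assumes the option appears as either `nobarrier` or `barrier=0`; which spelling is shown can depend on the kernel version, so treat the match as an assumption.

```python
def barriers_disabled(mounts_text, mountpoint):
    """Return True/False for the mount point, or None if it is not listed.

    mounts_text is expected in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mountpoint:
            continue
        options = fields[3].split(",")
        # Assumption: barriers are off if either spelling is present.
        # A later entry for the same mount point overrides an earlier one.
        result = "nobarrier" in options or "barrier=0" in options
    return result
```

On a Linux system, this could be called as `barriers_disabled(open("/proc/mounts").read(), "/")`. A `False` result only means the option is not set. It does not show that the underlying device actually honours cache flushes.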

2012-05-14 10:33:15

by Asdo

Subject: Re: ext4 barrier on SCSI vs SATA?

On 05/14/12 11:02, Jan Kara wrote:
>> However the flush was always available (I think), in fact databases
>> would not corrupt (not even above ext4 nobarrier, above a raid5
>> without barriers) if fsync was called at proper times.
> This is not true. Both cache flushes and barriers were implemented by
> the same mechanism in older kernels. Thus if the device did not properly
> propagate the barrier capability, then fsync did not provide any guarantees
> in case of power failure (if there are volatile write caches in the storage
> device).

Oh! Thanks I had not realized this.

So, if barrier IS provided by the underlying blockdevice but filesystem
is nevertheless mounted as nobarrier (as an explicit option) would
database flushes (fsync) for files on THAT filesystem work properly or not?

Thanks for your insight

2012-05-14 10:52:00

by Jan Kara

Subject: Re: ext4 barrier on SCSI vs SATA?

On Mon 14-05-12 12:33:03, Asdo wrote:
> On 05/14/12 11:02, Jan Kara wrote:
> >>However the flush was always available (I think), in fact databases
> >>would not corrupt (not even above ext4 nobarrier, above a raid5
> >>without barriers) if fsync was called at proper times.
> > This is not true. Both cache flushes and barriers were implemented by
> >the same mechanism in older kernels. Thus if the device did not properly
> >propagate the barrier capability, then fsync did not provide any guarantees
> >in case of power failure (if there are volatile write caches in the storage
> >device).
>
> Oh! Thanks I had not realized this.
>
> So, if barrier IS provided by the underlying blockdevice but
> filesystem is nevertheless mounted as nobarrier (as an explicit
> option) would database flushes (fsync) for files on THAT filesystem
> work properly or not?
If you have volatile write caches, they would not. The nobarrier option
means: "I *know* I don't need cache flushes for data integrity and I want
maximum performance."

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
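
For reference, the application side of the "database flush" discussed in this thread might look like the sketch below. It assumes a POSIX system, and the helper name `durable_write` is hypothetical. Per the replies above, the `os.fsync` calls only give a power-failure guarantee if the filesystem actually issues cache flushes to the device, which is not the case with nobarrier on a drive with a volatile write cache.

```python
import os


def durable_write(path, data):
    """Write bytes to path, then ask the kernel to push them to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)  # flush file data and metadata
    finally:
        os.close(fd)

    # Also fsync the containing directory so a newly created name is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The application code is the same whether or not barriers are enabled. The difference lies in what the layers below do when `fsync` is called.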