2013-02-07 16:43:24

by Autif Khan

Subject: How can I flush all writes before yanking the power cable?

The standard operating procedure to power down my machine is to switch
it off. To work around this, we use mSATA SSDs (actually we recently
switched from SATA SSDs) with linux on a read only partition.

This works just fine, however, we want to be able to upgrade some
parts of the application. To do this, we have put the application on
/app partition. We mount it read only at start up. When we want to
upgrade the app, we remount read-write sync (mount -o remount,rw,sync
/app) perform the write operations and remount read only.

If we yank the power cable after this, we get file system errors on
the next reboot.

We can display a message to the user telling them that it is safe to
power down the machine.

My questions are:

1) Is this the right place to discuss this or should I have posted
this in the file systems mailing list?

2) how can we determine that all the writes are flushed? (and thus it
is safe to yank the power cable)

3) is there a better way to do this? - for example we may not have to
remount read write sync - and we can force a sync before remounting
read only or something

I have already tried "sudo sync" before remounting the filesystem as
read only. It does not help.

Please advise.

Thanks

Autif


2013-02-07 18:12:34

by Eric Sandeen

Subject: Re: How can I flush all writes before yanking the power cable?

On 2/7/13 10:43 AM, Autif Khan wrote:
> The standard operating procedure to power down my machine is to switch
> it off. To work around this, we use mSATA SSDs (actually we recently
> switched from SATA SSDs) with linux on a read only partition.

Not sure the SSD part makes any significant difference, but the RO
mount should.

> This works just fine, however, we want to be able to upgrade some
> parts of the application. To do this, we have put the application on
> /app partition. We mount it read only at start up. When we want to
> upgrade the app, we remount read-write sync (mount -o remount,rw,sync
> /app) perform the write operations and remount read only.
>
> If we yank the power cable after this, we get file system errors on
> the next reboot.

What kind of errors? (and on what kernel? Are you mounted with
barriers enabled?)

If you use barriers, remount RO, that completes, you yank the power,
and you see corruption, I would guess one of a few things is happening:

1) You're not mounting w/ barriers, and you lose data in the SSD's cache
2) You *are* mounting w/ barriers, and the SSD is lying to you
3) There's a bug in our remount,ro path which doesn't quiesce things properly

mount -o remount,ro should be >this< close to an unmount; things should
be stable on disk when it's done.
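
One way to answer the "are you mounted with barriers enabled?" question
is to look at the options the kernel reports in /proc/mounts. The
following is a minimal sketch in Python, not an official tool:
barrier_state is a hypothetical helper name, and whether ext4 lists the
barrier option at all differs between kernel versions, so a result of
None only means "not listed", not "disabled".

```python
def barrier_state(mounts_text, mountpoint):
    """Inspect /proc/mounts-style text for the barrier option of a mountpoint.

    Returns True for "barrier" / "barrier=1", False for "nobarrier" /
    "barrier=0", and None when the option is not listed (on some kernels
    the default is simply not printed). Raises LookupError if the
    mountpoint does not appear at all.
    """
    found = False
    state = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mountpoint:
            continue
        # A later entry for the same mountpoint shadows an earlier one.
        found = True
        state = None
        for opt in fields[3].split(","):
            if opt in ("barrier", "barrier=1"):
                state = True
            elif opt in ("nobarrier", "barrier=0"):
                state = False
    if not found:
        raise LookupError(mountpoint)
    return state
```

On a live system this could be called as
barrier_state(open("/proc/mounts").read(), "/app").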

-Eric



2013-02-07 19:37:04

by Autif Khan

Subject: Re: How can I flush all writes before yanking the power cable?

On Thu, Feb 7, 2013 at 1:12 PM, Eric Sandeen <[email protected]> wrote:
> On 2/7/13 10:43 AM, Autif Khan wrote:
>> The standard operating procedure to power down my machine is to switch
>> it off. To work around this, we use mSATA SSDs (actually we recently
>> switched from SATA SSDs) with linux on a read only partition.
>
> Not sure the SSD part makes any significant difference, but the RO
> mount should.
>
>> This works just fine, however, we want to be able to upgrade some
>> parts of the application. To do this, we have put the application on
>> /app partition. We mount it read only at start up. When we want to
>> upgrade the app, we remount read-write sync (mount -o remount,rw,sync
>> /app) perform the write operations and remount read only.
>>
>> If we yank the power cable after this, we get file system errors on
>> the next reboot.
>
> What kind of errors? (and on what kernel? Are you mounted with
> barriers enabled?)

Filesystem check errors that the OS throws at you on an unclean
shutdown. Where it asks you to 'F'ix, 'S'kip, 'I'gnore or 'M'anually
fix the error using fsck. The kernel is a custom kernel for our
hardware.

> If you use barriers, remount RO, that completes, you yank the power,
> and you see corruption, I would guess one of a few things is happening:
>
> 1) You're not mounting w/ barriers, and you lose data in the SSD's cache

That was precisely my ignorance. I did not know about barrier. Adding
it during mount ro and remount rw seems to have fixed these issues.
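
For reference, the sequence described here (remount read-write with
barriers, write, flush, remount read-only) could be scripted roughly as
follows. This is a sketch under assumptions, not the procedure actually
used on this system: upgrade_app and do_writes are hypothetical names,
the mount options are the ones mentioned in this thread, and the
process needs enough privilege to run mount.

```python
import os
import subprocess


def upgrade_app(mountpoint, do_writes, run=subprocess.check_call):
    """Remount read-write, run the upgrade, flush, and remount read-only.

    `do_writes` is a caller-supplied function that performs the actual
    file updates. `run` executes a command given as an argument list and
    raises on failure; it is a parameter only so the sequence can be
    exercised without root.
    """
    run(["mount", "-o", "remount,rw,barrier=1", mountpoint])
    try:
        do_writes()
        # Ask the kernel to write out dirty data before going read-only.
        os.sync()
    finally:
        # Always try to return to read-only, even if the upgrade failed.
        run(["mount", "-o", "remount,ro,barrier=1", mountpoint])
```

If the function returns without raising, the read-only remount reported
success, which is the earliest point at which a "safe to power down"
message would make sense. It still relies on the drive honouring cache
flushes.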

Thank you very much for all your help.

Autif


2013-02-07 20:58:06

by Theodore Ts'o

Subject: Re: How can I flush all writes before yanking the power cable?

On Thu, Feb 07, 2013 at 02:37:02PM -0500, Autif Khan wrote:
>
> That was precisely my ignorance. I did not know about barrier. Adding
> it during mount ro and remount rw seems to have fixed these issues.

You also didn't say what file system you were using. Was it ext4?
ext3? ext2? What kernel version? On modern kernels barrier is
enabled by default for both ext3 and ext4.

Regards,

- Ted

2013-02-07 21:28:04

by Autif Khan

Subject: Re: How can I flush all writes before yanking the power cable?

On Thu, Feb 7, 2013 at 3:58 PM, Theodore Ts'o <[email protected]> wrote:
> On Thu, Feb 07, 2013 at 02:37:02PM -0500, Autif Khan wrote:
>>
>> That was precisely my ignorance. I did not know about barrier. Adding
>> it during mount ro and remount rw seems to have fixed these issues.
>
> You also didn't say what file system you were using. Was it ext4?
> ext3? ext2? What kernel version? On modern kernels barrier is
> enabled by default for both ext3 and ext4.

The filesystem is ext4. Kernel version is 3.2.0

I could not grep -i barrier in the kernel config. How is barrier
enabled or disabled in the kernel by default?

2013-02-07 21:29:52

by Eric Sandeen

Subject: Re: How can I flush all writes before yanking the power cable?

On 2/7/13 3:28 PM, Autif Khan wrote:
> On Thu, Feb 7, 2013 at 3:58 PM, Theodore Ts'o <[email protected]> wrote:
>> On Thu, Feb 07, 2013 at 02:37:02PM -0500, Autif Khan wrote:
>>>
>>> That was precisely my ignorance. I did not know about barrier. Adding
>>> it during mount ro and remount rw seems to have fixed these issues.
>>
>> You also didn't say what file system you were using. Was it ext4?
>> ext3? ext2? What kernel version? On modern kernels barrier is
>> enabled by default for both ext3 and ext4.
>
> The filesystem is ext4. Kernel version is 3.2.0
>
> I could not grep -i barrier in the kernel config. How is barrier
> enabled or disabled in the kernel by default?


in ext4_fill_super():

    if ((def_mount_opts & EXT4_DEFM_NOBARRIER) == 0)
        set_opt(sb, BARRIER);

It's a mount option, not a kernel config option.
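
To make that logic concrete, here is a toy model in Python of how the
default and an explicit mount option combine. It is only an
illustration: barrier_enabled is a hypothetical name, the flag value is
an assumption that should be checked against fs/ext4/ext4.h, and the
real option parsing happens inside the kernel.

```python
# Assumed value of the superblock default-mount-option flag; verify it
# against fs/ext4/ext4.h for the kernel in use.
EXT4_DEFM_NOBARRIER = 0x0100


def barrier_enabled(def_mount_opts, mount_options=""):
    """Model the barrier decision: superblock default, then mount options.

    `def_mount_opts` stands for the default mount options stored in the
    superblock; `mount_options` is the comma-separated string given to
    mount -o. Explicit options override the default.
    """
    # Default: barriers are on unless the superblock asks for nobarrier.
    enabled = (def_mount_opts & EXT4_DEFM_NOBARRIER) == 0
    for opt in filter(None, mount_options.split(",")):
        if opt in ("barrier", "barrier=1"):
            enabled = True
        elif opt in ("nobarrier", "barrier=0"):
            enabled = False
    return enabled
```

With no flag in the superblock and no explicit option, the model
returns True, which matches the statement that barriers are on by
default.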

-Eric


2013-05-16 18:31:47

by Autif Khan

Subject: Re: How can I flush all writes before yanking the power cable?

On Thu, Feb 7, 2013 at 2:37 PM, Autif Khan <[email protected]> wrote:
>
> On Thu, Feb 7, 2013 at 1:12 PM, Eric Sandeen <[email protected]> wrote:
> > On 2/7/13 10:43 AM, Autif Khan wrote:
> >> The standard operating procedure to power down my machine is to switch
> >> it off. To work around this, we use mSATA SSDs (actually we recently
> >> switched from SATA SSDs) with linux on a read only partition.
> >
> > Not sure the SSD part makes any significant difference, but the RO
> > mount should.
> >
> >> This works just fine, however, we want to be able to upgrade some
> >> parts of the application. To do this, we have put the application on
> >> /app partition. We mount it read only at start up. When we want to
> >> upgrade the app, we remount read-write sync (mount -o remount,rw,sync
> >> /app) perform the write operations and remount read only.
> >>
> >> If we yank the power cable after this, we get file system errors on
> >> the next reboot.
> >
> > What kind of errors? (and on what kernel? Are you mounted with
> > barriers enabled?)
>
> Filesystem check errors that the OS throws at you on an unclean
> shutdown. Where it asks you to 'F'ix, 'S'kip, 'I'gnore or 'M'anually
> fix the error using fsck. The kernel is a custom kernel for our
> hardware.
>
> > If you use barriers, remount RO, that completes, you yank the power,
> > and you see corruption, I would guess one of a few things is happening:
> >
> > 1) You're not mounting w/ barriers, and you lose data in the SSD's cache
>
> That was precisely my ignorance. I did not know about barrier. Adding
> it during mount ro and remount rw seems to have fixed these issues.
>
> Thank you very much for all your help.
>
> Autif
>
> > 2) You *are* mounting w/ barriers, and the SSD is lying to you

Resurrecting this thread as we have run into a very peculiar problem.

We now mount our partitions either ro or rw,barrier=1 and remount
ro,barrier=1 after the write is complete.

This worked beautifully well on the one prototype that we have.

We built another prototype with a different mSATA SSD and we are now
seeing FS corruption after we mount rw,barrier=1, write, remount
ro,barrier=1 and finally yank the power cable (after a considerable
wait ~10 seconds). We tried 3-4 different SSDs but we have the one SSD
that does not exhibit this issue and several SSDs that do exhibit this
issue. The issue travels with the SSD.

I am guessing that the SSD is lying (Eric's choice of the word - above :-)

How can we tell if an SSD supports barriers or flushes etc?

(Apologies to Eric for spam - somehow I replied, instead of reply to all)


2013-05-16 19:03:47

by Theodore Ts'o

Subject: Re: How can I flush all writes before yanking the power cable?

On Thu, May 16, 2013 at 02:31:45PM -0400, Autif Khan wrote:
> > >
> > > 1) You're not mounting w/ barriers, and you lose data in the SSD's cache
> >
> > That was precisely my ignorance. I did not know about barrier. Adding
> > it during mount ro and remount rw seems to have fixed these issues.

What kernel version and file system (ext3 vs ext4) are you using?
Barriers have been enabled by default for quite a while.

> > > 2) You *are* mounting w/ barriers, and the SSD is lying to you
>
> Resurrecting this thread as we have run into a very peculiar problem.
>
> We now mount our partitions either ro or rw,barrier=1 and remount
> ro,barrier=1 after the write is complete.
>
> This worked beautifully well on the one prototype that we have.
>
> We built another prototype with a different mSATA SSD and we are now
> seeing FS corruption after we mount rw,barrier=1, write, remount
> ro,barrier=1 and finally yank the power cable (after a considerable
> wait ~10 seconds). We tried 3-4 different SSDs but we have the one SSD
> that does not exhibit this issue and several SSDs that do exhibit this
> issue. The issue travels with the SSD.

Are the SSD's from different manufacturers? If they are from the same
manufacturer and have the same model number, do they have the same
firmware version?
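
On Linux, the model and firmware strings of a SATA device can usually
be read from sysfs, which makes it easier to compare the drive that
works with the ones that do not. A minimal sketch, with assumptions
stated: describe_disk is a hypothetical helper, the attribute paths are
assumed to exist on the running kernel (queue/write_cache in particular
is only present on newer kernels, and device/rev may be a truncated
form of the firmware revision), and anything missing is reported as
None.

```python
import os


def describe_disk(dev, sysfs_root="/sys/block"):
    """Collect identifying attributes for a block device such as "sda".

    Returns a dict with the model string, the firmware revision as
    exposed by the kernel, and the write-cache mode the device reports.
    Attributes that cannot be read are returned as None.
    """
    def read(relative_path):
        try:
            with open(os.path.join(sysfs_root, dev, relative_path)) as f:
                return f.read().strip()
        except OSError:
            return None

    return {
        "model": read("device/model"),
        "firmware": read("device/rev"),
        "write_cache": read("queue/write_cache"),
    }
```

Note that the reported write-cache mode only says what the device
advertises. It does not show whether the drive actually honours a cache
flush, which is the question a power-pull test has to answer.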

Note that there are some cheap (or to put another way, crappy) SSD's
where yanking the power cable at the wrong time causes the SSD's
internal metadata for its Flash Translation Layer to get corrupted,
and you end up with a completely bricked SSD.

This was much more common in the past with Compact Flash cards. There
were stories of wedding photographers who lost all of their photos from
a wedding shoot after they accidentally ejected their flash card and
the CF card was completely toasted. If you were lucky, the compact
flash manufacturer had special recovery software that would allow you
to do the moral equivalent of running fsck on the FTL metadata (since
the FTL can be thought of as a file system which uses sector numbers
instead of file names), and then you might get to recover some of the
photos. If you were not so lucky, you got to replace the compact flash
card (which was annoying, but the cost of losing all of the wedding
photos was often far more expensive from a commercial perspective).

> I am guessing that the SSD is lying (Eric's choice of the word - above :-)
>
> How can we tell if an SSD supports barriers or flushes etc?

Well, it's not necessarily lying --- it could just be buggy. That is,
it tried to make sure all of the data was written to the flash chips,
but on a power pull, the SSD's FTL metadata got corrupted, and this
caused the wrong data to be returned when you try to read from the SSD
--- which in some ways is worse, since if the SSD is lying, it's
generally only the most recently written blocks which get lost. If
the SSD is buggy, blocks written hours or days ago could get lost when
the FTL gets corrupted.

Well, if you're a manufacturer, you write programs which test to see
whether the SSD does the right thing after a power pull (i.e. write a
test program which writes blocks with timestamps and periodic CACHE
FLUSH commands, and then execute a power drop, and then verify that
the data on the disk is as you expect). If it isn't, then you reject
the SSD vendor as providing devices which are not fit for purpose.
Since as a manufacturer, you're purchasing SSD's or eMMC devices by
the millions, you have a certain amount of leverage over the
manufacturer. :-)
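
A rough sketch of such a test, in Python and at the file level rather
than the raw block level, might look like this. Everything here is an
assumption for illustration: the record layout, the names
write_records, verify_records and on_flushed are hypothetical, and
os.fsync is used on the assumption that, with barriers enabled, it
results in a cache flush being sent to the device. In a real test
on_flushed would report to another machine (serial line or network),
since anything stored on the device under test can itself be lost.

```python
import os
import struct
import time
import zlib

# Record layout: sequence number, timestamp, CRC32 of the first 16 bytes.
RECORD = struct.Struct("<QdI")


def pack_record(seq, timestamp):
    body = struct.pack("<Qd", seq, timestamp)
    return body + struct.pack("<I", zlib.crc32(body))


def write_records(path, count, flush_every, on_flushed):
    """Append `count` records, calling fsync every `flush_every` records.

    After each successful fsync, on_flushed(seq) is called with the last
    sequence number that should now be durable.
    """
    with open(path, "ab", buffering=0) as data:
        for seq in range(count):
            data.write(pack_record(seq, time.time()))
            if (seq + 1) % flush_every == 0:
                os.fsync(data.fileno())
                on_flushed(seq)


def verify_records(path, last_acked):
    """Return the sequence numbers <= last_acked that are missing or corrupt."""
    with open(path, "rb") as data:
        blob = data.read()
    bad = []
    for seq in range(last_acked + 1):
        chunk = blob[seq * RECORD.size:(seq + 1) * RECORD.size]
        if len(chunk) < RECORD.size:
            bad.append(seq)  # record was never written or got truncated
            continue
        stored_seq, _timestamp, crc = RECORD.unpack(chunk)
        if stored_seq != seq or crc != zlib.crc32(chunk[:16]):
            bad.append(seq)  # record exists but its contents are wrong
    return bad
```

The idea is to run write_records, cut the power at a random moment,
reboot, and call verify_records with the last sequence number that was
acknowledged before the power drop. A non-empty result means data that
was supposedly flushed did not survive.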

If you're some random end user, you're basically at the mercy of the
SSD manufacturer. You can look at various review sites, but
unfortunately not all reviewers test to make sure the barriers work
correctly and that the device is robust against power drops. The
problem is all of the reviewers tend to do performance tests, and so
there is a huge temptation to optimize for performance over
robustness....

- Ted