2005-09-20 12:55:54

by Artem Bityutskiy

Subject: Re: data loss on jffs2 filesystem on dataflash

Peter Menzebach wrote:
> No, not at the moment. If I have some time, I can try to rewrite the
> chipset driver so that it reports a sector size of 1024.

I glanced at the manual. Uhh, DataFlash is a very specific beast. It
supports a page program with built-in erase command... So DataFlash
may effectively be considered a block device. Then you may use any FS
on it, provided you have written a proper driver? Why do you need JFFS2
then :-) ?

JFFS2 is oriented toward "classical" flashes. They have no "write page
with built-in erase" operation.

I didn't read the manual carefully; what do they mean by "Main memory array"?

BTW, having an 8*1056 write buffer is not a great idea; better to make it
as small as possible, i.e., 1056 bytes.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.


2005-09-20 13:32:50

by Jörn Engel

Subject: Re: data loss on jffs2 filesystem on dataflash

On Tue, 20 September 2005 16:55:52 +0400, Artem B. Bityutskiy wrote:
> Peter Menzebach wrote:
> >No, not at the moment. If I have some time, I can try to rewrite the
> >chipset driver so that it reports a sector size of 1024.

Don't. I'm actually glad about some flash with sizes not exactly
matching a power of two. It causes you some pain, but generally helps
to find bugs.

> I glanced at the manual. Uhh, DataFlash is a very specific beast. It
> supports a page program with built-in erase command... So DataFlash
> may effectively be considered a block device. Then you may use any FS
> on it, provided you have written a proper driver? Why do you need JFFS2
> then :-) ?

Still can't. Block devices have the attribute that writing AAA... to
a block containing BBB... gives you one of three possible results in
case of power failure:

1. AAA...AAA all written
2. BBB...BBB nothing written
3. AAA...BBB partially written.

Flash doesn't have 3, but two more cases:
4. FFF...FFF erased, nothing written
5. AAA...FFF erased, partially written

Plus the really obnoxious
6. FFF...FFF partially erased. Looks fine but some bits may flip
randomly, writes may not stick, etc.

Now try finding a filesystem that is robust if 4-6 happens. ;)
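
To make the cases above concrete, here is a minimal Python sketch that
names the outcome of an interrupted write by comparing what is read back
with the old and new contents. The function name classify and the use of
0xFF for erased bytes are assumptions for illustration, not anything from
a real driver. Note that case 6 cannot be detected this way: a partially
erased block reads back exactly like case 4.

```python
ERASED = 0xFF  # assumed value of an erased flash byte

def classify(old: bytes, new: bytes, seen: bytes) -> str:
    """Name the outcome of an interrupted write of `new` over `old`.

    All three buffers are assumed to have the same length.
    """
    if seen == new:
        return "all written"
    if seen == old:
        return "nothing written"
    if all(b == ERASED for b in seen):
        # A partially erased block (case 6) also reads back like this.
        return "erased, nothing written"
    # Length of the leading run that already holds new data.
    n = 0
    while n < len(seen) and seen[n] == new[n]:
        n += 1
    tail = seen[n:]
    if tail == old[n:]:
        return "partially written"
    if all(b == ERASED for b in tail):
        return "erased, partially written"
    return "unknown"

old, new = b"B" * 8, b"A" * 8
print(classify(old, new, b"AAAABBBB"))
print(classify(old, new, b"AAAA" + b"\xff" * 4))
```

A block device only ever produces the first three answers; flash can
produce the others as well.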

> JFFS2 is oriented toward "classical" flashes. They have no "write page
> with built-in erase" operation.

What does this thing do?

> BTW, having an 8*1056 write buffer is not a great idea; better to make it
> as small as possible, i.e., 1056 bytes.

Definitely.

Jörn

--
You can't tell where a program is going to spend its time. Bottlenecks
occur in surprising places, so don't try to second guess and put in a
speed hack until you've proven that's where the bottleneck is.
-- Rob Pike

2005-09-20 14:11:06

by Artem Bityutskiy

Subject: Re: data loss on jffs2 filesystem on dataflash

Jörn Engel wrote:
> Still can't. Block devices have the attribute that writing AAA... to
> a block containing BBB... gives you one of three possible results in
> case of power failure:
>
> 1. AAA...AAA all written
> 2. BBB...BBB nothing written
> 3. AAA...BBB partially written.
>
> Flash doesn't have 3, but two more cases:
> 4. FFF...FFF erased, nothing written
> 5. AAA...FFF erased, partially written
>
> Plus the really obnoxious
> 6. FFF...FFF partially erased. Looks fine but some bits may flip
> randomly, writes may not stick, etc.
>
> Now try finding a filesystem that is robust if 4-6 happens. ;)
I don't understand this. If you mean atomicity, a CRC may help here,
and then there is no problem. Or maybe you missed the fact that we have
eraseblock size = writeblock size?
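
A minimal sketch of the CRC idea, assuming a made-up page layout (2-byte
length, 4-byte CRC32, payload, 0xFF padding) on a 1056-byte page. The names
pack_page and unpack_page are hypothetical, and this is not the JFFS2 node
format. It only shows how a reader could reject an erased or torn page; it
does nothing about a partially erased page whose bits flip later.

```python
import struct
import zlib

PAGE_SIZE = 1056   # DataFlash page size mentioned in the thread
HEADER = "<HI"     # assumed layout: payload length, CRC32 of payload
HEADER_SIZE = struct.calcsize(HEADER)

def pack_page(payload: bytes) -> bytes:
    """Build one page image: header, payload, 0xFF padding."""
    if HEADER_SIZE + len(payload) > PAGE_SIZE:
        raise ValueError("payload does not fit in one page")
    header = struct.pack(HEADER, len(payload), zlib.crc32(payload))
    return (header + payload).ljust(PAGE_SIZE, b"\xff")

def unpack_page(page: bytes):
    """Return the payload, or None if the page is erased or torn."""
    length, crc = struct.unpack_from(HEADER, page)
    payload = page[HEADER_SIZE:HEADER_SIZE + length]
    if len(payload) != length or zlib.crc32(payload) != crc:
        return None
    return payload

page = pack_page(b"hello" * 50)
torn = page[:100] + b"\xff" * (PAGE_SIZE - 100)  # write cut off after 100 bytes
print(unpack_page(page) == b"hello" * 50, unpack_page(torn))
```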

>>JFFS2 is oriented toward "classical" flashes. They have no "write page
>>with built-in erase" operation.
> What does this thing do?
It erases an individual page, then writes to it. To put it differently, in
your terminology, eraseblock size = writeblock size.

P.S. I actually missed the mailing list, this should have gone to the
MTD ML. So let's move there please.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

2005-09-22 10:32:32

by Pavel Machek

Subject: Re: data loss on jffs2 filesystem on dataflash

Hi!

> > I glanced at the manual. Uhh, DataFlash is a very specific beast. It
> > supports a page program with built-in erase command... So DataFlash
> > may effectively be considered a block device. Then you may use any FS
> > on it, provided you have written a proper driver? Why do you need JFFS2
> > then :-) ?
>
> Still can't. Block devices have the attribute that writing AAA... to
> a block containing BBB... gives you one of three possible results in
> case of power failure:
>
> 1. AAA...AAA all written
> 2. BBB...BBB nothing written
> 3. AAA...BBB partially written.
>
> Flash doesn't have 3, but two more cases:
> 4. FFF...FFF erased, nothing written
> 5. AAA...FFF erased, partially written
>
> Plus the really obnoxious
> 6. FFF...FFF partially erased. Looks fine but some bits may flip
> randomly, writes may not stick, etc.
>
> Now try finding a filesystem that is robust if 4-6 happens. ;)

ext2 and anything that does not do journalling?

I do not think behaviour on powerfail is part of block device definition.

Pavel
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2005-09-22 10:48:41

by Artem Bityutskiy

Subject: Re: data loss on jffs2 filesystem on dataflash

Pavel Machek wrote:
> ext2 and anything that does not do journalling?
>
> I do not think behaviour on powerfail is part of block device definition.
>
Pavel, AFAIU,

Joern meant that if HDD starts a block write operation, it will
accomplish it even if power-fail happens (probably there are some
capacitors there). So, it is impossible, say, that HDD has written one
half of a sector and has not written the other half.

And he wanted to say that DataFlash HW does not guarantee this. But
perhaps this is implementable by adding special HW.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

2005-09-22 11:34:38

by Jörn Engel

Subject: Re: data loss on jffs2 filesystem on dataflash

On Wed, 21 September 2005 21:07:59 +0200, Pavel Machek wrote:
>
> > > I glanced at the manual. Uhh, DataFlash is a very specific beast. It
> > > supports a page program with built-in erase command... So DataFlash
> > > may effectively be considered a block device. Then you may use any FS
> > > on it, provided you have written a proper driver? Why do you need JFFS2
> > > then :-) ?
> >
> > Still can't. Block devices have the attribute that writing AAA... to
> > a block containing BBB... gives you one of three possible results in
> > case of power failure:
> >
> > 1. AAA...AAA all written
> > 2. BBB...BBB nothing written
> > 3. AAA...BBB partially written.
> >
> > Flash doesn't have 3, but two more cases:
> > 4. FFF...FFF erased, nothing written
> > 5. AAA...FFF erased, partially written
> >
> > Plus the really obnoxious
> > 6. FFF...FFF partially erased. Looks fine but some bits may flip
> > randomly, writes may not stick, etc.
> >
> > Now try finding a filesystem that is robust if 4-6 happens. ;)
>
> ext2 and anything that does not do journalling?
>
> I do not think behaviour on powerfail is part of block device definition.

No one bothered defining it, but most everyone is happy about it being
as it is. Non-journalling filesystems would have severe corruption on
unclean umounts. lost+found would fill up much faster than people are
used to, if 4-6 were common for hard disks.

Journalling filesystems actually would be robust against 4-6, as long
as their block size was large enough. The journal must contain the
complete erase block from flash - which is commonly in the area of 64k
or 128k. Ext3 blocks are much smaller, so the fs would still corrupt.

Well - DataFlash appears to have very small block sizes. So yes, it
would be possible to use ext3 on it. But then there's still the
problem of limited per-block flash lifetime and ext3 doesn't do wear
levelling.
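
A rough bit of arithmetic behind the block size point, as a Python sketch.
The helper blocks_at_risk is hypothetical, and the 4 KiB ext3 block size is
an assumption (the thread only says ext3 blocks are much smaller than an
erase block). It counts how many other filesystem blocks share an erase
block with the one being rewritten, and so would be lost with it if the
erase is interrupted.

```python
def blocks_at_risk(erase_block: int, fs_block: int) -> int:
    """Other filesystem blocks that live in the same erase block."""
    if erase_block <= fs_block:
        return 0
    return erase_block // fs_block - 1

FS_BLOCK = 4096  # assumed ext3 block size
for erase_block in (64 * 1024, 128 * 1024):
    print(erase_block, blocks_at_risk(erase_block, FS_BLOCK))
```

With these assumed sizes, rewriting one block puts 15 or 31 neighbours at
risk, which a journal holding only the rewritten block would not cover.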

Jörn

--
You cannot suppose that Moliere ever troubled himself to be original in the
matter of ideas. You cannot suppose that the stories he tells in his plays
have never been told before. They were culled, as you very well know.
-- Andre-Louis Moreau in Scaramouche

2005-09-22 11:54:45

by Jörn Engel

Subject: Re: data loss on jffs2 filesystem on dataflash

On Thu, 22 September 2005 13:34:30 +0200, Jörn Engel wrote:
>
> No one bothered defining it, but most everyone is happy about it being
> as it is. Non-journalling filesystems would have severe corruption on
> unclean umounts. lost+found would fill up much faster than people are
> used to, if 4-6 were common for hard disks.

Worse, actually. Corruption will also happen for file data, which may
pass fsck just fine. Your data is gone and no one told you about it.
;)

Jörn

--
A defeated army first battles and then seeks victory.
-- Sun Tzu

2005-09-22 16:46:57

by Valdis Klētnieks

Subject: Re: data loss on jffs2 filesystem on dataflash

On Thu, 22 Sep 2005 14:48:39 +0400, "Artem B. Bityutskiy" said:

> Joern meant that if HDD starts a block write operation, it will
> accomplish it even if power-fail happens (probably there are some
> capacitors there). So, it is impossible, say, that HDD has written one
> half of a sector and has not written the other half.

Hard drives contain capacitors to prevent writing of runt sectors on
a powerfail? Didn't we go around this a while ago and decide it's mostly
urban legend, and that plenty of people have seen runt/bad sectors?



2005-09-22 17:03:51

by Artem Bityutskiy

Subject: Re: data loss on jffs2 filesystem on dataflash

Valdis Klētnieks wrote:
> On Thu, 22 Sep 2005 14:48:39 +0400, "Artem B. Bityutskiy" said:
>
>>Joern meant that if HDD starts a block write operation, it will
>>accomplish it even if power-fail happens (probably there are some
>>capacitors there). So, it is impossible, say, that HDD has written one
>>half of a sector and has not written the other half.
>
> Hard drives contain capacitors to prevent writing of runt sectors on
> a powerfail? Didn't we go around this a while ago and decide it's mostly
> urban legend, and that plenty of people have seen runt/bad sectors?

No idea. But theoretically it should be so, at least "good" drives
should. May be a competent person will comment on this, that's quite
interesting.

--
Best Regards,
Artem B. Bityuckiy,
St.-Petersburg, Russia.

2005-09-22 17:22:26

by linux-os (Dick Johnson)

Subject: Re: data loss on jffs2 filesystem on dataflash


On Thu, 22 Sep 2005, Artem B. Bityutskiy wrote:

> Valdis Klētnieks wrote:
>> On Thu, 22 Sep 2005 14:48:39 +0400, "Artem B. Bityutskiy" said:
>>
>>> Joern meant that if HDD starts a block write operation, it will
>>> accomplish it even if power-fail happens (probably there are some
>>> capacitors there). So, it is impossible, say, that HDD has written one
>>> half of a sector and has not written the other half.
>>
>> Hard drives contain capacitors to prevent writing of runt sectors on
>> a powerfail? Didn't we go around this a while ago and decide it's mostly
>> urban legend, and that plenty of people have seen runt/bad sectors?
>
> No idea. But theoretically it should be so, at least "good" drives
> should. May be a competent person will comment on this, that's quite
> interesting.
>
> --
> Best Regards,
> Artem B. Bityuckiy,
> St.-Petersburg, Russia.

The only significant energy storage that hard disks contain
is the inertia of the rotating disk assembly. Since the platter
motor is not a generator it doesn't help. Those tiny bypass
capacitors you see can't store enough energy to do anything
useful during a power failure.

BUT... The PC/AT power supplies store a lot of energy and
they run for many milliseconds after a power fail.
Two 100 uF capacitors in series = 50 uF @ 300 V.
E = 1/2 CV^2

E = 50 uF * 300^2 / 2 = 2.25 joules (lots of energy).

If the power-fail line is properly connected and if the
power-fail line operates at the correct time, the CPU
will be halted while there is still enough energy available
to complete any write that has gotten to the disk drive's sector
buffer. This does not protect data, but it should certainly
protect the sectors, which might now contain a header, good data
or junk, and a proper CRC. IOW a good sector.
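
The arithmetic above can be checked with a short Python sketch. The helper
names are made up for illustration, and the 150 W load used for the hold-up
estimate is purely an assumption; in practice only part of the stored
energy is usable before the rails drop out of regulation, so the time
printed is an upper bound.

```python
def series_capacitance(*caps_farads: float) -> float:
    """Equivalent capacitance of capacitors connected in series."""
    return 1.0 / sum(1.0 / c for c in caps_farads)

def stored_energy(capacitance: float, voltage: float) -> float:
    """Energy in joules held by a capacitor: E = 1/2 * C * V^2."""
    return 0.5 * capacitance * voltage ** 2

c = series_capacitance(100e-6, 100e-6)  # two 100 uF capacitors in series
e = stored_energy(c, 300.0)             # charged to 300 V

ASSUMED_LOAD_WATTS = 150.0              # hypothetical load, not from the thread
holdup_ms = e / ASSUMED_LOAD_WATTS * 1000.0

print(f"C = {c * 1e6:.0f} uF, E = {e:.2f} J, hold-up <= {holdup_ms:.0f} ms")
```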

Cheers,
Dick Johnson
Penguin : Linux version 2.6.13 on an i686 machine (5589.55 BogoMips).
Warning : 98.36% of all statistics are fiction.


2005-09-22 18:43:31

by J. Scott Kasten

Subject: Re: data loss on jffs2 filesystem on dataflash

Actually, 3 years ago, I researched this in my
company's lab. There is about a 1 in 180 probability
of the power to the read/write head failing before the
sector write is complete.

The end result is a sector that fails ECC check. The
Linux kernel reports this and your I/O operation
returns with an error.

If you force a rewrite of the sector, the problem
cures itself. That is not easy to do through the
VFS, though. I was only able to make it happen by
using dd if=/dev/zero on the unmounted device node.
After that, the sector read just fine.
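
As a rough illustration of the forced rewrite, here is a Python sketch that
overwrites a single sector with zeros, which is what dd if=/dev/zero does
for that sector. The function name rewrite_sector and the 512-byte sector
size are assumptions. It destroys whatever the sector held, so the demo
below runs on a scratch file rather than a device node.

```python
import os
import tempfile

SECTOR_SIZE = 512  # assumed; the thread does not state the sector size

def rewrite_sector(path: str, sector: int, sector_size: int = SECTOR_SIZE) -> None:
    """Overwrite one sector of `path` with zeros and flush it out."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, bytes(sector_size), sector * sector_size)
        os.fsync(fd)
    finally:
        os.close(fd)

# Demo on a scratch file holding four sectors of 0xAA.
with tempfile.NamedTemporaryFile(delete=False) as scratch:
    scratch.write(b"\xaa" * (4 * SECTOR_SIZE))
rewrite_sector(scratch.name, 2)
with open(scratch.name, "rb") as f:
    image = f.read()
os.remove(scratch.name)
print(image[2 * SECTOR_SIZE:3 * SECTOR_SIZE] == bytes(SECTOR_SIZE))
```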

-Scott Kasten-


2005-09-23 08:51:51

by Jörn Engel

Subject: Re: data loss on jffs2 filesystem on dataflash

On Thu, 22 September 2005 12:46:33 -0400, Valdis Klētnieks wrote:
> On Thu, 22 Sep 2005 14:48:39 +0400, "Artem B. Bityutskiy" said:
>
> > Joern meant that if HDD starts a block write operation, it will
> > accomplish it even if power-fail happens (probably there are some
> > capacitors there). So, it is impossible, say, that HDD has written one
> > half of a sector and has not written the other half.
>
> Hard drives contain capacitors to prevent writing of runt sectors on
> a powerfail? Didn't we go around this a while ago and decide it's mostly
> urban legend, and that plenty of people have seen runt/bad sectors?

Yep. I did _not_ say anything about finishing to write a sector.
What I said was that there is only one case of a started and
unfinished sector: it contains partially old and partially new data
and nothing else.

And the difference (one of them, at least) between hard disks and
flash is the "and nothing else" part. Flash may contain other
information as well or even be in a partially erased state, randomly
flipping bits in the future or not accepting writes.

Jörn

--
When in doubt, use brute force.
-- Ken Thompson