2008-12-02 14:48:48

by Pavel Machek

Subject: SD/MMC cards: how crappy they are?


I have 32GB card here...

root@amd:/home/pavel/WWW/wear/tinylight# time cat /dev/mmc1 > /dev/null
cat: /dev/mmc1: Input/output error
1.32user 49.03system 4184.78 (69m44.789s) elapsed 1.20%CPU

...maybe it was because of powerfail? I'll try to run badblocks to
recover it...

...I did. Badblocks did not help, but cat /dev/zero > /dev/mmc1
did. And yes, those 'temporarily bad blocks' seem very much
powerfail related.

It's bad, because ext2/3 does not seem to handle this very well... not
even fsck does the selective rewrite... :-(.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


2008-12-02 16:31:11

by H. Peter Anvin

Subject: Re: SD/MMC cards: how crappy they are?

Pavel Machek wrote:
>
> ...maybe it was because of powerfail? I'll try to run badblocks to
> recover it...
>
> ...I did. Badblocks did not help, but cat /dev/zero > /dev/mmc1
> did. And yes, those 'temporarily bad blocks' seem very much
> powerfail related.
>

Power failures can, indeed, do nasty things to SD/MMC cards, especially
power rail sag in the middle of writes.

-hpa

2008-12-02 16:56:22

by Theodore Ts'o

Subject: Re: SD/MMC cards: how crappy they are?

On Tue, Dec 02, 2008 at 08:30:29AM -0800, H. Peter Anvin wrote:
> > ...maybe it was because of powerfail? I'll try to run badblocks to
> > recover it...
> >
> > ...I did. Badblocks did not help, but cat /dev/zero > /dev/mmc1
> > did. And yes, those 'temporarily bad blocks' seem very much
> > powerfail related.
> >
>
> Power failures can, indeed, do nasty things to SD/MMC cards, especially
> power rail sag in the middle of writes.

If this is your random eject out from your HP laptop problem, note
that random ejects while the card is writing can cause corruption of
the flash translation layer (FTL), which for some really crappy cards,
can permanently damage them; hopefully most of those are gone from the
market, but I wouldn't be positive about that. The better ones will
have some kind of journalling scheme for their FTL...

Fsck does have a force rewrite option, although it's not the default.
You have to answer "n" to ignore error, and then yes to "force
rewrite". I should perhaps change that; my worry at the time was a
transient read error tricking e2fsck into blowing away the contents of
what was actually a good sector. Of course, that will only help
blocks which fsck actually tried reading; it won't help data blocks.

Badblocks -n will fix the problem, since it will do a non-destructive
read/write test over the entire disk. Patches to add a
forced-rewrite mode to the standard r/o badblocks sweep (so we only
write to a sector that has a read error) would be gratefully accepted.

- Ted
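
A minimal sketch of the forced-rewrite sweep described above, written in Python rather than as a badblocks patch. It is not e2fsprogs code: the name `rewrite_unreadable_blocks` and the injectable `read_block` parameter are hypothetical and exist only so the logic can be exercised without a failing card. It assumes a read failure shows up as EIO, and it overwrites the unreadable block with zeros, so whatever was stored there is lost. On a real device the reads would also have to bypass the page cache (for example with O_DIRECT), which this sketch does not do.

```python
import errno
import os


def rewrite_unreadable_blocks(fd, size, block_size=4096, read_block=os.pread):
    """Read every block; overwrite with zeros only the blocks that fail to read.

    Returns the list of block numbers that were rewritten.
    read_block is a hypothetical hook with the same signature as os.pread.
    """
    rewritten = []
    for offset in range(0, size, block_size):
        length = min(block_size, size - offset)
        try:
            read_block(fd, length, offset)
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise  # anything other than a read error is not ours to fix
            # Forced rewrite: the old contents are unreadable anyway.
            os.pwrite(fd, bytes(length), offset)
            rewritten.append(offset // block_size)
    os.fsync(fd)
    return rewritten
```

Unlike cat /dev/zero over the whole device, this touches only the blocks that returned an error, so readable data survives.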

2008-12-02 18:00:23

by H. Peter Anvin

Subject: Re: SD/MMC cards: how crappy they are?

Theodore Tso wrote:
>
> If this is your random eject out from your HP laptop problem, note
> that random ejects while the card is writing can cause corruption of
> the flash translation layer (FTL), which for some really crappy cards,
> can permanently damage them; hopefully most of those are gone from the
> market, but I wouldn't be positive about that. The better ones will
> have some kind of journalling scheme for their FTL...
>

I have seen flash cards die permanently from having a partition table they
didn't like written to them. Yes, the microcontroller on the flash card
tried to interpret the partition table, assumed to be MS-DOS style, and
would crash.

-hpa

2008-12-04 10:30:28

by Pavel Machek

Subject: Re: SD/MMC cards: how crappy they are?

On Tue 2008-12-02 09:59:42, H. Peter Anvin wrote:
> Theodore Tso wrote:
> >
> > If this is your random eject out from your HP laptop problem, note
> > that random ejects while the card is writing can cause corruption of
> > the flash translation layer (FTL), which for some really crappy cards,
> > can permanently damage them; hopefully most of those are gone from the
> > market, but I wouldn't be positive about that. The better ones will
> > have some kind of journalling scheme for their FTL...
> >
>
> I have seen flash cards die permanently from having a partition table they
> didn't like written to them. Yes, the microcontroller on the flash card
> tried to interpret the partition table, assumed to be MS-DOS style, and
> would crash.

Aha... that explains why I killed a few flashcards by tar xzvf /dev/sdX files
... hopefully that's fixed in the better/bigger cards now.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-12-04 19:03:55

by H. Peter Anvin

Subject: Re: SD/MMC cards: how crappy they are?

Pavel Machek wrote:
>>>
>> I have seen flash cards die permanently from having a partition table they
>> didn't like written to them. Yes, the microcontroller on the flash card
>> tried to interpret the partition table, assumed to be MS-DOS style, and
>> would crash.
>
> Aha... that explains why I killed a few flashcards by tar xzvf /dev/sdX files
> ... hopefully that's fixed in the better/bigger cards now.
>

Also had a batch of cards which would silently "correct" the partition
table for you to align the partitions to its flash erase blocks.

-hpa
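
To make "aligned to its flash erase blocks" concrete, here is a hedged sketch that reads the four primary entries of an MS-DOS style partition table and lists the partitions whose start is not a multiple of an erase-block size. The 4 MiB default is purely an assumption for illustration; cards rarely document the real value, and nothing in this thread states it. The function name `misaligned_partitions` is likewise made up.

```python
import struct

SECTOR_SIZE = 512  # bytes per LBA sector in a classic MBR


def misaligned_partitions(mbr, erase_block_bytes=4 * 1024 * 1024):
    """Return (index, start_lba) for each used MBR entry that does not start
    on a multiple of erase_block_bytes (an assumed, card-specific value)."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature")
    result = []
    for index in range(4):
        entry = mbr[446 + 16 * index:446 + 16 * (index + 1)]
        part_type = entry[4]
        # Bytes 8..15 of an entry: start LBA and sector count, little-endian.
        start_lba, num_sectors = struct.unpack_from("<II", entry, 8)
        if part_type == 0 or num_sectors == 0:
            continue  # unused slot
        if (start_lba * SECTOR_SIZE) % erase_block_bytes:
            result.append((index, start_lba))
    return result
```

A partition starting at the traditional sector 63 would be reported by this check, while one starting at sector 8192 (4 MiB) would not.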

2008-12-26 21:46:36

by Pavel Machek

Subject: Re: SD/MMC cards: how crappy they are?

> Pavel Machek wrote:
> >>>
> >> I have seen flash cards die permanently from having a partition table they
> >> didn't like written to them. Yes, the microcontroller on the flash card
> >> tried to interpret the partition table, assumed to be MS-DOS style, and
> >> would crash.
> >
> > Aha... that explains why I killed a few flashcards by tar xzvf /dev/sdX files
> > ... hopefully that's fixed in the better/bigger cards now.
> >
>
> Also had a batch of cards which would silently "correct" the partition
> table for you to align the partitions to its flash erase blocks.

Can you mention the manufacturer/model? Silent data corruption is a
nasty thing....
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2008-12-26 21:50:24

by H. Peter Anvin

Subject: Re: SD/MMC cards: how crappy they are?

Pavel Machek wrote:
>> Also had a batch of cards which would silently "correct" the partition
>> table for you to align the partitions to its flash erase blocks.
>
> Can you mention the manufacturer/model? Silent data corruption is a
> nasty thing....

I would, if I remembered. It was a few years ago. All I can remember
now is that it wasn't one of the well-known brands like SanDisk or PQI.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

2008-12-26 22:39:19

by Pavel Machek

Subject: Re: SD/MMC cards: how crappy they are?

Hi!

> If this is your random eject out from your HP laptop problem, note
> that random ejects while the card is writing can cause corruption of
> the flash translation layer (FTL), which for some really crappy cards,
> can permanently damage them; hopefully most of those are gone from the
> market, but I wouldn't be positive about that. The better ones will
> have some kind of journalling scheme for their FTL...
>
> Fsck does have a force rewrite option, although it's not the default.
> You have to answer "n" to ignore error, and then yes to "force
> rewrite". I should perhaps change that; my worry at the time was a
> transient read error tricking e2fsck into blowing away the contents of
> what was actually a good sector. Of course, that will only help

Yes, I think that should be changed. Transient read errors are not
common, while bad blocks fixed by rewrite are quite common.
>
> Badblocks -n will fix the problem, since it will do a non-destructive
> read/write test over the entire disk. Patches to add a
> forced-rewrite mode to the standard r/o badblocks sweep (so we only
> write to a sector that has a read error) would be gratefully accepted.

badblocks -n took more than 8 hours on 32GB flash, so no, that's not usable. I
started digging into badblocks (please take a look at/apply the following
documentation updates; I only understood some stuff when reading the
source)... And I wish I'd known about SIGALARM before.

Question: does badblocks expect the media to be a valid ext2/3/4
filesystem? It seems so...

Pavel

Binary files e2fsprogs-1.41.3-clean/misc/badblocks and e2fsprogs-1.41.3/misc/badblocks differ
diff -ur e2fsprogs-1.41.3-clean/misc/badblocks.8 e2fsprogs-1.41.3/misc/badblocks.8
--- e2fsprogs-1.41.3-clean/misc/badblocks.8 2008-12-26 23:08:55.000000000 +0100
+++ e2fsprogs-1.41.3/misc/badblocks.8 2008-12-26 23:18:56.000000000 +0100
@@ -173,6 +173,10 @@
 read-only test is done. This option must not be combined with the
 .B \-w
 option, as they are mutually exclusive.
+
+This will read the block to be tested, then overwrite it with few different
+patterns, then write old data back. If something goes very wrong during the
+test (powerfail?) it may still damage the data.
 .TP
 .B \-s
 Show the progress of the scan by writing out the block numbers as they
@@ -211,6 +215,10 @@
 bad blocks. Therefore it is recommended to use it only when one wants
 to know if there are any bad blocks at all on the device, and not when
 the list of bad blocks is wanted.
+
+You can send SIGALARM to make badblocks report its progress. You can
+send SIGTERM to make badblocks terminate; it will catch the signal, clean
+up and exit.
 .SH AUTHOR
 .B badblocks
 was written by Remy Card <[email protected]>. Current maintainer is

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
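
The read, overwrite, restore cycle that the proposed man page text describes can be sketched roughly as follows. This is an illustration of the idea, not the badblocks implementation: the function name `check_block_nondestructive` and the four pattern bytes are assumptions, and without O_DIRECT the read-back may be served from the page cache instead of the media.

```python
import os

# Assumed test patterns; the patch text only says "few different patterns".
PATTERNS = (0xAA, 0x55, 0xFF, 0x00)


def check_block_nondestructive(fd, offset, block_size=4096):
    """Save one block, write and verify each pattern, then restore the block.

    Returns True if every pattern read back intact. If the process dies
    between the first pattern write and the restore, the original data
    is lost, which is the risk the man page addition warns about.
    """
    original = os.pread(fd, block_size, offset)
    ok = True
    try:
        for value in PATTERNS:
            pattern = bytes([value]) * len(original)
            os.pwrite(fd, pattern, offset)
            os.fsync(fd)
            if os.pread(fd, len(original), offset) != pattern:
                ok = False
                break
    finally:
        # Always try to put the old data back, even after a failure.
        os.pwrite(fd, original, offset)
        os.fsync(fd)
    return ok
```

Each block costs one extra read plus several write, sync and verify passes, which is consistent with a full sweep of a large, slow card taking hours.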

2008-12-27 01:02:54

by Ben Pfaff

Subject: Re: SD/MMC cards: how crappy they are?

Pavel Machek <[email protected]> writes:

> @@ -211,6 +215,10 @@
> bad blocks. Therefore it is recommended to use it only when one wants
> to know if there are any bad blocks at all on the device, and not when
> the list of bad blocks is wanted.
> +
> +You can send SIGALARM to make badblocks report its progress. You can
> +send SIGTERM to make badblocks terminate; it will catch the signal, clean
> +up and exit.

s/SIGALARM/SIGALRM/
--
Ben Pfaff
http://benpfaff.org
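
For readers unfamiliar with the pattern the man page addition describes, here is a small sketch of a long-running scan that reports progress on SIGALRM and stops cleanly on SIGTERM. It shows the general signal-handling technique in Python and is not taken from badblocks; the `Scan` class and its attribute names are hypothetical.

```python
import signal
import sys


class Scan:
    """Hypothetical long-running scan with signal-driven progress reports."""

    def __init__(self, total_blocks):
        self.total_blocks = total_blocks
        self.current_block = 0
        self.stop_requested = False
        self.reports = []

    def on_alarm(self, signum, frame):
        # SIGALRM: report how far the scan has got.
        message = f"{self.current_block}/{self.total_blocks} blocks checked"
        self.reports.append(message)
        print(message, file=sys.stderr)

    def on_term(self, signum, frame):
        # SIGTERM: only set a flag; the main loop exits at a safe point.
        self.stop_requested = True

    def install(self):
        signal.signal(signal.SIGALRM, self.on_alarm)
        signal.signal(signal.SIGTERM, self.on_term)

    def run(self, check_block):
        for block in range(self.total_blocks):
            if self.stop_requested:
                break  # clean up and exit instead of dying mid-write
            self.current_block = block
            check_block(block)
```

With the handlers installed, `kill -ALRM <pid>` from another shell would trigger a progress line and `kill -TERM <pid>` would end the loop after the current block.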