Andrew,
I do not know about the specific corruption Shane is talking about, but I
can summarize what we have found out in the past 2 years. I wrote the
InnoDB backend for MySQL, and have been hunting Linux corruption bugs for
2 years now.
- Corruption seems to happen on Red Hat kernels 2.4.18 under heavy file i/o
load on some computers.
- A user ran a very simple stress test of type SELECT 'abbaguu' with many
clients. On a 2-way Dell server he was able to get mysqld to crash
predictably in < 24 hours. Sometimes he also got corruption. But another,
cheaper computer worked ok. Both were running a Red Hat kernel 2.4.18. When
the user upgraded to a 'stock' kernel 2.4.20, the crashes and corruption
disappeared.
- Our 4-way Xeon SuSE-2.4.18 computer never corrupts databases, though I run
very heavy stress tests on it.
- Kernels 2.4.20 seem to be more reliable than 2.4.18. I have only one
corruption case from such a kernel.
- We know with certainty that corruption is sometimes caused by
OS/drivers/hardware and not by mysqld, because in some cases rebooting the
computer has magically fixed the corruption. Looks like Linux had corrupted
its own file cache, but the data on disk was ok. I reported this on the
Linux kernel mailing list 2 years ago, but got no definite feedback.
- In some cases InnoDB reports checksum errors in pages. In those cases it
is also very probable that the corruption was caused by OS/drivers/hardware,
and not by mysqld.
- I have not noticed any clear connection between corruption reports and the
file system in use.
- I have personally tested on 4 Linux computers. On an old 2.2 kernel
computer I was able to get read errors in 30 seconds. The three 2.4 kernel
computers have worked ok.
My hypothesis is that there are bugs in Linux drivers. That would explain
why some computers work ok. Or there are Linux kernel bugs which only
manifest on certain hardware under certain file i/o workloads.
What to do? People who write drivers should run heavy, multithreaded file
i/o tests on their computer using some SQL database which calls fsync(). For
example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
the problems are in drivers, that could help.
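As an illustration of the kind of test described above, here is a minimal
sketch in Python. It is not the innotest scripts and does not involve MySQL;
it only assumes that the interesting pattern is several threads writing
checksummed pages to one file, calling fsync(), and reading the pages back.
The page size, page layout, workload constants, and all function names are
hypothetical choices made for this sketch.

```python
import os
import struct
import tempfile
import threading
import zlib

PAGE_SIZE = 16384        # assumed page size for this sketch
PAGES_PER_THREAD = 64    # hypothetical workload size
ROUNDS = 3               # how many times each thread rewrites its pages
HEADER = struct.Struct("<III")   # thread id, page number, round number


def make_page(thread_id, page_no, round_no):
    """Build a page: header, random filler, then a CRC32 of everything before it."""
    header = HEADER.pack(thread_id, page_no, round_no)
    body = header + os.urandom(PAGE_SIZE - HEADER.size - 4)
    return body + struct.pack("<I", zlib.crc32(body))


def page_is_valid(page):
    """Return True if the page has the right length and its stored CRC32 matches."""
    if len(page) != PAGE_SIZE:
        return False
    stored = struct.unpack("<I", page[-4:])[0]
    return zlib.crc32(page[:-4]) == stored


def worker(path, thread_id, errors):
    """Rewrite this thread's region of the file, fsync, and verify each round."""
    fd = os.open(path, os.O_RDWR)
    try:
        base = thread_id * PAGES_PER_THREAD
        for round_no in range(ROUNDS):
            for i in range(PAGES_PER_THREAD):
                page = make_page(thread_id, i, round_no)
                if os.pwrite(fd, page, (base + i) * PAGE_SIZE) != PAGE_SIZE:
                    errors.append((thread_id, i, round_no, "short write"))
            os.fsync(fd)
            for i in range(PAGES_PER_THREAD):
                page = os.pread(fd, PAGE_SIZE, (base + i) * PAGE_SIZE)
                expected = (thread_id, i, round_no)
                # A stale page would still carry a valid checksum, so the
                # header is compared as well.
                if not page_is_valid(page):
                    errors.append((thread_id, i, round_no, "bad checksum"))
                elif HEADER.unpack(page[:HEADER.size]) != expected:
                    errors.append((thread_id, i, round_no, "stale page"))
    except OSError as exc:
        errors.append((thread_id, None, None, "errno %d" % exc.errno))
    finally:
        os.close(fd)


def run_stress(path, n_threads=4):
    """Run n_threads workers against one file and return the problems found."""
    with open(path, "wb") as f:
        f.truncate(n_threads * PAGES_PER_THREAD * PAGE_SIZE)
    errors = []
    threads = [threading.Thread(target=worker, args=(path, t, errors))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        problems = run_stress(os.path.join(tmp, "stress.dat"))
        print("problems found:", len(problems))
```

Note that the read-back here is normally served from the file cache, so a
test like this exercises the cache path rather than the copy on disk. To
load a real driver it would have to run much longer, with far more data than
fits in memory, on the device under test.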
Best regards,
Heikki Tuuri
Innobase Oy
.................
List: linux-kernel
Subject: Re: 2.6.0-test2-mm3 and mysql
From: Andrew Morton <akpm () osdl ! org>
Date: 2003-08-03 2:08:59
Shane Shrybman <[email protected]> wrote:
>
> The db corruption hit again on test2-mm2.
How do you know it is "db corruption"?
>
> I am still backing out the 64 bit devt bit
why?
"Heikki Tuuri" <[email protected]> wrote:
>
> What to do? People who write drivers should run heavy, multithreaded file
> i/o tests on their computer using some SQL database which calls fsync(). For
> example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
> the problems are in drivers, that could help.
Well there's a problem. We're kernel people, not database people. I, for
one, would not have a clue how to set such a thing up.
If someone could prepare a simple-enough-for-kernel-people description of
how to get such a test up and running, then we might make some progress.
On Sun, Aug 03, 2003 at 12:10:01PM +0300, Heikki Tuuri wrote:
>
> What to do? People who write drivers should run heavy, multithreaded file
> i/o tests on their computer using some SQL database which calls fsync(). For
> example, run the Perl '/sql-bench/innotest's all concurrently on MySQL. If
> the problems are in drivers, that could help.
Did you know that until test2-mm3, nothing would report errors that
occurred on non-synchronous writes? There was no infrastructure to
propagate the error back to userspace. If you wrote a page, the write
failed on an intermittent I/O error, and then read again, you'd
silently get back the old page.
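
From the application side, the point above means that a successful write()
by itself says little. The following minimal sketch in Python shows the
defensive pattern: write, call fsync() and treat any error as a failure,
then read the data back and compare. The function name is hypothetical, and
whether a given kernel actually reports a failed write-back through fsync()
depends on the kernel, as described above.

```python
import os
import tempfile


def write_and_verify(path, data):
    """Write data, force it out with fsync(), read it back and compare.

    Returns True only if every step succeeded and the bytes match.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = 0
        while written < len(data):       # write() may write fewer bytes
            written += os.write(fd, data[written:])
        os.fsync(fd)                     # raises OSError if the kernel reports a failure
        os.lseek(fd, 0, os.SEEK_SET)
        chunks = []
        remaining = len(data)
        while remaining > 0:
            chunk = os.read(fd, remaining)
            if not chunk:                # unexpected end of file
                break
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks) == data
    except OSError as exc:
        print("I/O error: errno", exc.errno)
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ok = write_and_verify(os.path.join(tmp, "page.dat"), b"x" * 16384)
        print("verified" if ok else "FAILED")
```

The comparison step is there because a stale page read back after a failed
write looks perfectly normal unless it is checked against what was written.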
--
Matt Mackall : http://www.selenic.com : of or relating to the moon