2003-08-28 19:11:17

by Heikki Tuuri

[permalink] [raw]
Subject: Re: 2.6.0-test2-mm3 and mysql

Sergey,

----- Original Message -----
From: "Sergey S. Kostyliov" <[email protected]>
To: "Heikki Tuuri" <[email protected]>; <[email protected]>
Sent: Thursday, August 28, 2003 10:01 PM
Subject: Re: 2.6.0-test2-mm3 and mysql


> Hi Heikki,
>
> On Thursday 28 August 2003 21:59, Heikki Tuuri wrote:
> > Sergey,
> >
> > does it always crash when you start mysqld?
>
> Yes It was always crashing until I deleted all InnoDB files and restored
> InnoDB tables from backup.
>
> >
> > It is page number 0 in the InnoDB tablespace. That is, the header page
of
> > the whole tablespace!
> >
> > The checksums in the page are ok. That shows the page was not corrupted
in
> > the Linux file system.
> >
> > InnoDB is trying to do an index search, but that of course crashes,
because
> > the header page is not any index page.
> >
> > The reason for the crash is probably that a page number in a pointer
record
> > in the father node of the B-tree has been reset to zero. The corruption
has
> > happened in the mysqld process memory, not in the file system of Linux.
> > Otherwise, InnoDB would have complained about page checksum errors.
> >
> > No one else has reported this error. I have now added a check to a
future
> > version of InnoDB which will catch this particular error earlier and
will
> > hex dump the father page.
>
> Yes, now it seems for me that this particular crash in not related to
linux
> kernel at all.
> The funny thing I've managed to get another InnoDB crash on the same box
> http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash-20030828
> which in turn was posted to linux-kernel over a two hours ago.
> This time the cheksums are different :(


ok, this time the corruption probably happened in the file cache, or the
file system of Linux, or in the hardware.

It is not at all surprising that you encounter memory corruption and file
corruption in the same computer. That is a common pattern if these problems
appear at all.

Do you have a swap partition? I do not know Linux well enough, but in
theory, file or disk corruption could cause also memory corruption if pages
of the process memory get swapped to disk.


> > By the way, I noticed that a website http://www.linuxtestproject.org has
> > made an extensive regression test suite for Linux. They have also
> > successfully run big MySQL and DB2 stress tests on their computers, on
> > 2.5.xx kernels. If there is something wrong with 2.5.xx or 2.6.0, it
> > apparently does not concern all computers.
> >
> > "
> > The Linux Test Project test suite, ltp-20030807, has been released. The
> > latest version of the testsuite contains 2000+ tests for the Linux OS.
> > "
> >
> > The general picture about InnoDB corruption is that reports have almost
> > stopped after I advised people on the mailing list to upgrade to
> > Linux-2.4.20 kernels.
>
> In fact I'm also a happy InnoDB user. It runs fine on 6 of my production
> servers. Thanks for a nice work btw!
>
> It has worked fine also on our development server until I upgraded it to
> 2.6.0-testX. I don't know. It might be just a broken hardware
> which is better stressed with 2.6 than with 2.4...
>
> >
> > With apologies,
> >
> > Heikki
> > Innobase Oy

Best regards,

Heikki
Innobase Oy
http://www.innodb.com



2003-08-28 19:27:28

by Sergey S. Kostyliov

[permalink] [raw]
Subject: Re: 2.6.0-test2-mm3 and mysql

On Thursday 28 August 2003 23:10, Heikki Tuuri wrote:
> Sergey,
>
> ----- Original Message -----
> From: "Sergey S. Kostyliov" <[email protected]>
> To: "Heikki Tuuri" <[email protected]>;
> <[email protected]> Sent: Thursday, August 28, 2003 10:01 PM
> Subject: Re: 2.6.0-test2-mm3 and mysql
>
> > Hi Heikki,
> >
> > On Thursday 28 August 2003 21:59, Heikki Tuuri wrote:
> > > Sergey,
> > >
> > > does it always crash when you start mysqld?
> >
> > Yes It was always crashing until I deleted all InnoDB files and restored
> > InnoDB tables from backup.
> >
> > > It is page number 0 in the InnoDB tablespace. That is, the header page
>
> of
>
> > > the whole tablespace!
> > >
> > > The checksums in the page are ok. That shows the page was not corrupted
>
> in
>
> > > the Linux file system.
> > >
> > > InnoDB is trying to do an index search, but that of course crashes,
>
> because
>
> > > the header page is not any index page.
> > >
> > > The reason for the crash is probably that a page number in a pointer
>
> record
>
> > > in the father node of the B-tree has been reset to zero. The corruption
>
> has
>
> > > happened in the mysqld process memory, not in the file system of Linux.
> > > Otherwise, InnoDB would have complained about page checksum errors.
> > >
> > > No one else has reported this error. I have now added a check to a
>
> future
>
> > > version of InnoDB which will catch this particular error earlier and
>
> will
>
> > > hex dump the father page.
> >
> > Yes, now it seems for me that this particular crash in not related to
>
> linux
>
> > kernel at all.
> > The funny thing I've managed to get another InnoDB crash on the same box
> > http://sysadminday.org.ru/linux-2.6.0-test4_InnoDB_crash-20030828
> > which in turn was posted to linux-kernel over a two hours ago.
> > This time the cheksums are different :(
>
> ok, this time the corruption probably happened in the file cache, or the
> file system of Linux, or in the hardware.
>
> It is not at all surprising that you encounter memory corruption and file
> corruption in the same computer. That is a common pattern if these problems
> appear at all.
>
> Do you have a swap partition? I do not know Linux well enough, but in
> theory, file or disk corruption could cause also memory corruption if pages
> of the process memory get swapped to disk.

Yes, the swap is used on this box.

rathamahata@dev rathamahata $ free -k
total used free shared buffers cached
Mem: 1035248 860740 174508 0 32676 581852
-/+ buffers/cache: 246212 789036
Swap: 8969484 195136 8774348