2009-10-05 01:17:06

by Norbert Preining

Subject: complete IO hang since a few kernel revision

Hi everyone,

(please Cc)

kernel 2.6.31.1, 2.6.31.2rc1, 2.6.31, 2.6.30 (at least)
Intel Corporation ICH9M/M-E SATA AHCI


I am experiencing really serious IO stalls: up to 20 seconds of waiting
for some operations.

That normally happens when I do an svn up on a big subversion repository,
but it also happens elsewhere.

Yesterday a simple sync took 30 seconds although I was not doing anything else.

I am *quite* sure that it wasn't like that in former kernel revisions, but
I don't have old ones at hand for testing.

Is there something known about a regression in this area?

Sorry for not Cc-ing the right people; I didn't know who they might be.

Thanks a lot for any help/remarks/suggestions

Norbert

-------------------------------------------------------------------------------
Dr. Norbert Preining Associate Professor
JAIST Japan Advanced Institute of Science and Technology [email protected]
Vienna University of Technology [email protected]
Debian Developer (Debian TeX Task Force) [email protected]
gpg DSA: 0x09C5B094 fp: 14DF 2E6C 0307 BE6D AD76 A9C0 D2BF 4AA3 09C5 B094
-------------------------------------------------------------------------------
QUOYNESS (n.)
The hatefulness of words like 'relionus' and 'easiephit'.
--- Douglas Adams, The Meaning of Liff


2009-10-05 02:57:40

by Ulrich Lukas

Subject: Re: complete IO hang since a few kernel revision

Hi Norbert,


I reported similar observations a short time ago on this mailing list.

Quite a bit of development has been done in this area since
2.6.31. (See the postings under "Re: IO scheduler based IO controller
V10" of 2009-09-25 et seqq.)


A number of patches went into 2.6.32-rc3 which was released today.

Can you test if your problems are solved with this latest version?


(BTW, thanks to Vivek Goyal, Jens Axboe, Mike Galbraith, Ingo Molnar,
Corrado Zoccolo and anyone I forgot!)

2009-10-05 08:33:51

by Norbert Preining

Subject: Re: complete IO hang since a few kernel revision

On Mon, 05 Oct 2009, Ulrich Lukas wrote:
> A number of patches went into 2.6.32-rc3 which was released today.

Ok, with 2.6.32-rc3 running now.

> Can you test if your problems are solved with this latest version?

Example:
dpkg-source -x of a big debian source package
which is more or less untarring a big .tar and then applying a diff
which is much smaller (tar.gz: 482M, diff.gz: 412k)

The dpkg-source has finished, I am back at the prompt.

At the same time a simple
$ <ENTER>
in a different xterm did not produce *anything*, no output at all, for about
30 seconds. Waiting 30 seconds for an empty command is a bit
strange, I would say.

I don't know what this is related to, but it seems quite buggy.
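
One way to put numbers on stalls like this is a small probe that repeatedly does a tiny write plus fsync and records how long each round takes. The following is a minimal sketch, not a tool used in this thread; the names `probe_latency` and `slow_rounds`, the probe file name, the block size and the threshold are all assumptions made up for illustration.

```python
import os
import tempfile
import time


def probe_latency(path, rounds=5, interval=0.0, payload=b"x" * 4096):
    """Write a small block and fsync it `rounds` times.

    Returns a list of per-round durations in seconds. The file name,
    block size and round count are illustrative assumptions.
    """
    durations = []
    for _ in range(rounds):
        start = time.monotonic()
        with open(path, "wb") as fh:
            fh.write(payload)
            fh.flush()
            os.fsync(fh.fileno())  # ask the kernel to push the data to the device
        durations.append(time.monotonic() - start)
        time.sleep(interval)
    return durations


def slow_rounds(durations, threshold=1.0):
    """Return (index, seconds) pairs for rounds slower than `threshold`."""
    return [(i, d) for i, d in enumerate(durations) if d > threshold]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        times = probe_latency(os.path.join(tmp, "probe.dat"), rounds=5, interval=0.2)
        for i, d in enumerate(times):
            print(f"round {i}: {d:.3f}s")
        print("stalls:", slow_rounds(times))
```

The default temporary directory may live on tmpfs, so for a real measurement the probe file should be placed on the filesystem that shows the stalls, and the probe should run while the heavy job (for example the dpkg-source run) is active.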

Best wishes

Norbert


2009-10-06 02:02:16

by Robert Hancock

Subject: Re: complete IO hang since a few kernel revision

On 10/05/2009 02:33 AM, Norbert Preining wrote:
> On Mon, 05 Oct 2009, Ulrich Lukas wrote:
>> A number of patches went into 2.6.32-rc3 which was released today.
>
> Ok, with 2.6.32-rc3 running now.
>
>> Can you test if your problems are solved with this latest version?
>
> Example:
> dpkg-source -x of a big debian source package
> which is more or less untarring a big .tar and then applying a diff
> which is much smaller (tar.gz: 482M, diff.gz: 412k)
>
> The dpkg-source has finished, I am back at the prompt.
>
> At the same time a simple
> $ <ENTER>
> in a different xterm did not produce *anything*, no output at all, for about
> 30 seconds. Waiting 30 seconds for an empty command is a bit
> strange, I would say.
>
> I don't know what this is related to, but it seems quite buggy.
>
> Best wishes
>
> Norbert

Are you getting any errors showing up in dmesg?

2009-10-06 02:05:57

by Norbert Preining

Subject: Re: complete IO hang since a few kernel revision

On Mon, 05 Oct 2009, Robert Hancock wrote:
> Are you getting any errors showing up in dmesg?

No, neither dmesg, nor syslog, nor console.

Best wishes

Norbert


2009-10-06 07:12:44

by Mike Galbraith

Subject: Re: complete IO hang since a few kernel revision

Hi,

On Mon, 2009-10-05 at 17:33 +0900, Norbert Preining wrote:
> On Mon, 05 Oct 2009, Ulrich Lukas wrote:
> > A number of patches went into 2.6.32-rc3 which was released today.
>
> Ok, with 2.6.32-rc3 running now.
>
> > Can you test if your problems are solved with this latest version?
>
> Example:
> dpkg-source -x of a big debian source package
> which is more or less untarring a big .tar and then applying a diff
> which is much smaller (tar.gz: 482M, diff.gz: 412k)
>
> The dpkg-source has finished, I am back at the prompt.
>
> At the same time a simple
> $ <ENTER>
> in a different xterm did not produce *anything*, no output at all, for about
> 30 seconds. Waiting 30 seconds for an empty command is a bit
> strange, I would say.
>
> I don't know what this is related to, but it seems quite buggy.

I've tried unsuccessfully to reproduce this behavior.

Since it doesn't appear to be something readily reproducible, more
information and/or some digging on your part may be needed.

2009-10-06 07:16:52

by Norbert Preining

Subject: Re: complete IO hang since a few kernel revision

On Tue, 06 Oct 2009, Mike Galbraith wrote:
> I've tried unsuccessfully to reproduce this behavior.
>
> Since it doesn't appear to be something readily reproducible, more
> information and/or some digging on your part may be needed.

Hm, how could I do that? I am fine with compiling some strange
kernel options, or running some debugging task, but I don't have
any netconsole or serial console available.

Best wishes

Norbert


2009-10-06 07:45:14

by Mike Galbraith

Subject: Re: complete IO hang since a few kernel revision

On Tue, 2009-10-06 at 16:16 +0900, Norbert Preining wrote:
> On Tue, 06 Oct 2009, Mike Galbraith wrote:
> > I've tried unsuccessfully to reproduce this behavior.
> >
> > Since it doesn't appear to be something readily reproducible, more
> > information and/or some digging on your part may be needed.
>
> Hm, how could I do that? I am fine with compiling some strange
> kernel options, or running some debugging task, but I don't have
> any netconsole or serial console available.

Locating when it started for you would be valuable, even if it's not
pinpointed as precisely as a git bisection would. A bisection is the best
bet to nail it down, but can be quite time-consuming.
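
The idea behind a bisection can be sketched in a few lines: given an ordered list of revisions where the first is known good and the last is known bad, test the midpoint and keep the half that still contains the change from good to bad. This only illustrates the search, not git itself; `first_bad` and `is_bad` are hypothetical names, and in practice "is_bad" means building and booting a kernel and trying to reproduce the stall.

```python
def first_bad(revisions, is_bad):
    """Return (first bad revision, number of tests performed).

    Assumes revisions[0] is good, revisions[-1] is bad, and that every
    revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    steps = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        steps += 1
        if is_bad(revisions[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return revisions[hi], steps


if __name__ == "__main__":
    # Hypothetical history: 1000 commits, regression introduced at commit 637.
    commits = list(range(1000))
    culprit, steps = first_bad(commits, lambda c: c >= 637)
    print(f"first bad commit: {culprit}, found after {steps} tests")
```

Because the range halves at every step, roughly ten builds are enough for a thousand commits, which is why bisection is feasible even though each individual step is slow.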

One thing you can try is to drop to a non-GUI shell and see if the
symptom is reproducible without the GUI. If it is, poke SysRq-W while
things are thoroughly jammed up, and post the output+config here.
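
If the keyboard combination is awkward to reach, the same dump of blocked tasks can be requested by writing the character `w` to `/proc/sysrq-trigger` as root; the result ends up in the kernel log. A minimal sketch, assuming a root shell and a `dmesg` binary in the path; the function names `trigger_sysrq` and `kernel_log_tail` are made up for illustration.

```python
import subprocess

SYSRQ_TRIGGER = "/proc/sysrq-trigger"  # procfs file; writing to it needs root


def trigger_sysrq(key="w", trigger_path=SYSRQ_TRIGGER):
    """Write a single SysRq command character to the trigger file.

    'w' asks the kernel to log tasks in uninterruptible (blocked) state.
    """
    if len(key) != 1:
        raise ValueError("SysRq key must be a single character")
    with open(trigger_path, "w") as fh:
        fh.write(key)


def kernel_log_tail(lines=200):
    """Return the last `lines` lines of the dmesg output as one string."""
    out = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return "\n".join(out.stdout.splitlines()[-lines:])
```

Calling `trigger_sysrq("w")` while the machine is stalled and then saving `kernel_log_tail()` to a file gives the stack traces of the blocked tasks in a form that can be attached to a reply.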

-Mike

2009-10-10 19:33:06

by Petr Titěra

Subject: Re: complete IO hang since a few kernel revision

Mike Galbraith wrote:
> On Tue, 2009-10-06 at 16:16 +0900, Norbert Preining wrote:
>
>> On Tue, 06 Oct 2009, Mike Galbraith wrote:
>>
>>> I've tried unsuccessfully to reproduce this behavior.
>>>
>>> Since it doesn't appear to be something readily reproducible, more
>>> information and/or some digging on your part may be needed.
>>>
>> Hm, how could I do that? I am fine with compiling some strange
>> kernel options, or running some debugging task, but I don't have
>> any netconsole or serial console available.
>>
>
> Locating when it started for you would be valuable, even if it's not
> pinpointed as precisely as a git bisection would. A bisection is the best
> bet to nail it down, but can be quite time-consuming.
>
>
I see something similar here. Processes just get stuck in
balance_dirty_pages indefinitely. It helps if, while a process is
stuck, another process generates some dirty pages. Unfortunately, the last
kernel where I'm pretty sure everything worked is 2.6.31-rc9-tip-01355-ge035e96
(yes, it's an x86 -tip kernel). I'm building a new kernel now.
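
If writers really are stuck in balance_dirty_pages, watching the amount of dirty and writeback memory during a stall may help narrow things down. The sketch below reads the `Dirty` and `Writeback` fields from `/proc/meminfo`; the function names, the sampling interval and the sample count are assumptions chosen for illustration.

```python
import time

MEMINFO = "/proc/meminfo"  # Linux only


def parse_dirty_writeback(meminfo_text):
    """Extract the Dirty and Writeback counters (in kB) from meminfo text."""
    wanted = {"Dirty", "Writeback"}
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            values[name] = int(rest.split()[0])  # value precedes the "kB" unit
    return values


def watch(interval=1.0, samples=10, path=MEMINFO):
    """Print the Dirty/Writeback counters once per interval."""
    for _ in range(samples):
        with open(path) as fh:
            print(parse_dirty_writeback(fh.read()))
        time.sleep(interval)
```

Running `watch()` in a second terminal while a process hangs shows whether the dirty counter is sitting still or slowly draining during the stall.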

Petr

> One thing you can try is to drop to a non-GUI shell and see if the
> symptom is reproducible without the GUI. If it is, poke SysRq-W while
> things are thoroughly jammed up, and post the output+config here.
>
> -Mike