2009-12-17 17:00:57

by Alain Knaff

Subject: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)

On 17/12/09 16:49, Alain Knaff wrote:
> On 17/12/09 16:43, Mark Hounschell wrote:
>> On 12/17/2009 10:35 AM, Alain Knaff wrote:
>>
>>>> Should I do more work in between?
>>>
>>> No, but make sure to look at track 0... Other tracks will still have the
>>> error, as there was nothing forcing a memory flush between track 0 and 1...
>>
>> Ok track 0
> [...]
>> 0: 0
>> 1: 0
>> 2: 0
>> 3: 4f <--
>> 4: 0
>> 5: 1
>> 6: 2
>> no disk change
>
> Yeah, that's what I meant... So the memory flusher program didn't manage to
> clear up the inconsistency...
>
> So either my theory is wrong, or the memory flusher program was not
> efficient enough.... hmmm, maybe doing some surfing in between the formats,
> or doing another kernel compilation might be a better test.
>
> Alain

Ok, so I had a look at the differences between 2.6.27.41 and 2.6.28, and
there have indeed been changes to the iommu and DMA handling code.

So I suspect that the problem may lie there.

Cc'ed Linus and kernel list on this. For Linus and the list, here's the
summary of what we are observing:

- A DMA transfer of a memory block transfers the wrong value for the first
byte of the block. All other bytes of the block are transferred correctly.
The value of the first byte turns out to be the value that this byte held
during the *previous* transfer. Just as if there was some kind of cache,
and the transfer started before that cache was refreshed with the new
values from main memory.

Example:

1. initial contents: 33 44 55 66
2. one DMA transfer is performed
3. program changes buffer to: 77 88 99 aa
4. new DMA transfer is performed => instead it transmits 33 88 99 aa
(i.e. first byte is from previous contents)

This used to work in 2.6.27.41, but broke in 2.6.28 . It doesn't happen on
all hardware though.

It does indeed seem to be related to a DMA-side cache (rather than the
processor's cache not being flushed to main memory), as doing lots of
memory intensive work (kernel compilation) between 2 and 3 doesn't fix the
problem.
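
The symptom in steps 1 to 4 above can be reproduced with a toy model. The
sketch below is only an illustration in Python, not kernel code:
`StaleFirstByteDma` is a hypothetical class which assumes that some buffer on
the DMA path latches the first byte of each transfer and replays it on the
next one. Whether any real chipset behaves this way is exactly the open
question.

```python
class StaleFirstByteDma:
    """Toy model of the observed symptom (hypothetical, not real hardware)."""

    def __init__(self):
        # First byte of the buffer as it was during the previous transfer.
        self._latched = None

    def transfer(self, buffer: bytes) -> bytes:
        """Return the bytes that would actually go out on the wire."""
        if not buffer:
            return b""
        # The very first transfer has nothing stale to replay.
        first = buffer[0] if self._latched is None else self._latched
        # Remember what memory held this time, for the next transfer.
        self._latched = buffer[0]
        return bytes([first]) + buffer[1:]


dma = StaleFirstByteDma()
first_out = dma.transfer(bytes.fromhex("33445566"))
second_out = dma.transfer(bytes.fromhex("778899aa"))
print(first_out.hex(" "), "/", second_out.hex(" "))
```

In this model only the first byte is ever wrong, which matches the report:
the second transfer sends 33 88 99 aa although memory holds 77 88 99 aa.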

In the diff between 2.6.27.41 and 2.6.28, I noticed a lot of changes in
arch/x86/kernel/amd_iommu.c and related files, could any of these have
triggered this behavior?

Any ideas, anybody?

Alain


2009-12-17 17:28:49

by Linus Torvalds

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)



On Thu, 17 Dec 2009, Alain Knaff wrote:
>
> 1. initial contents: 33 44 55 66
> 2. one DMA transfer is performed
> 3. program changes buffer to: 77 88 99 aa
> 4. new DMA transfer is performed => instead it transmits 33 88 99 aa
> (i.e. first byte is from previous contents)
>
> This used to work in 2.6.27.41, but broke in 2.6.28 . It doesn't happen on
> all hardware though.

Do you have a list of hardware it works on? Especially chipsets.

On x86, where all caches are supposed to be totally coherent (except for
I$ under very special circumstances), the above should never be able to
happen. At least not unless there is really buggy hardware involved.

> It does indeed seem to be related to a DMA-side cache (rather than the
> processor's cache not being flushed to main memory), as doing lots of
> memory intensive work (kernel compilation) between 2 and 3 doesn't fix the
> problem.

I'm not entirely surprised. Actual CPU bugs are pretty rare in the x86
world. But chipset bugs? Another thing entirely. There are buffers and
caches there, and those are sometimes software-visible. The most obvious
case of that is just the IOMMU's themselves, but from your description I
don't think you actually change the DMA _mappings_ do you? Just the
actual buffer (that was then mapped earlier)?

So I don't think it's the IOMMU code itself necessarily, although an IOMMU
may well be involved (eg I could easily see a few cachelines worth of
actual DMA data caching going on in the whole IOMMU too)

And to some degree the floppy driver might be _more_ likely to see some
kinds of bugs, because it uses that crazy legacy DMA engine. So it's not
going to go through the regular PCI DMA hardware paths, it's going to go
through its own special paths that nobody else uses any more (and thus has
probably not had as much testing).

> In the diff between 2.6.27.41 and 2.6.28, I noticed a lot of changes in
> arch/x86/kernel/amd_iommu.c and related files, could any of these have
> triggered this behavior?

Could it have triggered? Sure. Chipset caches are often flushed by certain
trivial operations (often the caches are small, and operations like "any
PIO access" will make sure they are flushed). Different IOMMU flush
patterns could easily account for it.

But I think we'd like to see a list of hardware where this can be
triggered, and quite frankly, a 'git bisect' would be absolutely wonderful
especially if the list of hardware is not showing any really obvious
patterns (and I assume they aren't all _that_ obvious, or you'd have
mentioned them).

Linus

2009-12-17 18:21:47

by Krzysztof Halasa

Subject: Re: DMA cache consistency bug introduced in 2.6.28

Linus Torvalds writes:

> On x86, where all caches are supposed to be totally coherent (except for
> I$ under very special circumstances),

BTW SWIOTLB is a non-coherent "cache" in some sense, though I'd be
surprised if it's related. Anyway mentioning $CPU and $RAM at the very
least would be a good idea in such cases.
--
Krzysztof Halasa

2009-12-17 20:47:14

by Alain Knaff

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)

Linus Torvalds wrote:
>
> On Thu, 17 Dec 2009, Alain Knaff wrote:
>> 1. initial contents: 33 44 55 66
>> 2. one DMA transfer is performed
>> 3. program changes buffer to: 77 88 99 aa
>> 4. new DMA transfer is performed => instead it transmits 33 88 99 aa
>> (i.e. first byte is from previous contents)
>>
>> This used to work in 2.6.27.41, but broke in 2.6.28 . It doesn't happen on
>> all hardware though.
>
> Do you have a list of hardware it works on? Especially chipsets.

For the moment, I have a very small sample of hardware:
1. One machine which works (my own): Athlon XP 1800+ processor
2. One which doesn't work (Mark's)

I might get access to a wider sample of boxen in a week or so, in order
to do some stats.

What's the easiest way to find out the chipset?

Here's already the output of lspci from my machine (works):

00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP]
Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 80)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 80)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1
Controller (rev 80)
00:10.3 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 82)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
00:11.1 IDE interface: VIA Technologies, Inc.
VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:11.5 Multimedia audio controller: VIA Technologies, Inc.
VT8233/A/8235/8237 AC97 Audio Controller (rev 50)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II]
(rev 74)
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX
440] (rev a3)


[...]
> I'm not entirely surprised. Actual CPU bugs are pretty rare in the x86
> world. But chipset bugs? Another thing entirely. There are buffers and
> caches there, and those are sometimes software-visible. The most obvious
> case of that is just the IOMMU's themselves, but from your description I
> don't think you actually change the DMA _mappings_ do you? Just the
> actual buffer (that was then mapped earlier)?

No, I don't change any DMA mappings. And the buffer is still the same
physical buffer, at the same physical address.

(It happens during formatting the floppy drive: here the first byte
happens to be the trackid of the first physical sector of the track, and
it always ends up being the track of the *previously* formatted track).
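
To see why a stale first byte shows up as a wrong track ID, here is a small
Python sketch. It assumes the conventional format-command layout of four
bytes per sector (cylinder, head, sector number, size code) and no sector
interleave; `format_buffer` and `with_stale_first_byte` are hypothetical
helper names, not functions from the floppy driver.

```python
def format_buffer(track: int, head: int, sectors: int = 18, size_code: int = 2) -> bytes:
    """Build the per-track sector ID list: 4 bytes (C, H, R, N) per sector."""
    buf = bytearray()
    for sect in range(1, sectors + 1):
        buf += bytes((track, head, sect, size_code))
    return bytes(buf)


def with_stale_first_byte(current: bytes, previous: bytes) -> bytes:
    """Model the bug: byte 0 comes from the previous buffer contents."""
    return previous[:1] + current[1:]


previous = format_buffer(track=79, head=1)   # last track of an earlier format run
current = format_buffer(track=0, head=0)     # what the driver wants to write now
sent = with_stale_first_byte(current, previous)

# Split what was sent back into (C, H, R, N) sector headers.
headers = [tuple(sent[i:i + 4]) for i in range(0, len(sent), 4)]
print(headers[0], headers[1])
```

Under these assumptions only the first sector header of the track carries
the old track number (79, i.e. 0x4f); every later header is correct.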

> So I don't think it's the IOMMU code itself necessarily, although an IOMMU
> may well be involved (eg I could easily see a few cachelines worth of
> actual DMA data caching going on in the whole IOMMU too)
>
> And to some degree the floppy driver might be _more_ likely to see some
> kinds of bugs, because it uses that crazy legacy DMA engine. So it's not

Indeed, most other drivers use "bus master" DMA, that doesn't use the
legacy DMA controller at all, but use DMA controllers hosted on the
device itself...

> going to go through the regular PCI DMA hardware paths, it's going to go
> through its own special paths that nobody else uses any more (and thus has
> probably not had as much testing).
>
>> In the diff between 2.6.27.41 and 2.6.28, I noticed a lot of changes in
>> arch/x86/kernel/amd_iommu.c and related files, could any of these have
>> triggered this behavior?
>
> Could it have triggered? Sure. Chipset caches are often flushed by certain
> trivial operations (often the caches are small, and operations like "any
> PIO access" will make sure they are flushed). Different IOMMU flush
> patterns could easily account for it.
>
> But I think we'd like to see a list of hardware where this can be
> triggered,

We'll get a list of 2 machines relatively quickly (unless other people
would like to chime in: the test is easy, just fdformat a floppy disk),
and more in a week or so.

> and quite frankly, a 'git bisect' would be absolutely wonderful

How exactly would I use this (command line sample)?

> especially if the list of hardware is not showing any really obvious
> patterns (and I assume they aren't all _that_ obvious, or you'd have
> mentioned them).
>
> Linus

Thanks,

Alain

2009-12-17 21:15:10

by Linus Torvalds

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)



On Thu, 17 Dec 2009, Alain Knaff wrote:
>
> For the moment, I have a very small sample of hardware:
> 1. One machine which works (my own): Athlon XP 1800+ processor
> 2. One which doesn't work (Mark's)

Ok. I don't think I even have any machines with floppy drives any more
(one external USB drive somewhere gathering dust just in case I ever
encounter a floppy again).

> I might get access to a wider sample of boxen in a week or so, in order
> to do some stats.

Ok, I was more thinking "we have a bugzilla with ten different people
reporting this". If it's just a single machine, that's not going to be
relevant.

> What's the easiest way to find out the chipset?
>
> Here's already the output of lspci from my machine (works):
>
> 00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge

Yeah, lspci (and generally only the northbridge and southbridge matters,
the "ISA bridge" might technically be relevant, but since it's universally
on the same die as the southbridge, I left it in there just for kicks).

> (It happens during formatting the floppy drive: here the first byte
> happens to be the trackid of the first physical sector of the track, and
> it always ends up being the track of the *previously* formatted track).

I guess it could simply be a floppy controller bug too, triggered by some
random timing difference or innocuous-looking change.

> > But I think we'd like to see a list of hardware where this can be
> > triggered,
>
> We'll get a list of 2 machines relatively quickly (unless other people
> would like to chime in: the test is easy, just fdformat a floppy disk),
> and more in a week or so.

Only the "it doesn't work on xyz" is likely interesting. The machines it
works on are probably uninteresting statistically.

> > and quite frankly, a 'git bisect' would be absolutely wonderful
>
> How exactly would I use this (command line sample)?

You'd need a git tree that contains both the working and non-working
versions, and then literally just do

git bisect start
git bisect good <known good version number here>
git bisect bad <known bad version here>

and it will give you a commit to try. Compile, test, see if it's good or
bad, and do

git bisect [good|bad]

depending on the result. Rinse and repeat (depending on how tight the
initial good/bad commits were, it will need 10-15 kernel tests).
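
As a rough illustration of why so few tests are needed, the procedure can be
sketched as a binary search. This Python sketch assumes a strictly linear
history, which real kernel history is not (git works on the commit graph);
`bisect_first_bad` and `is_bad` are hypothetical names, and `is_bad` stands
in for "build, boot, format a floppy, look at the result".

```python
def bisect_first_bad(commits, is_bad):
    """Find the first bad commit in a list ordered oldest to newest.

    commits[0] must be known good and commits[-1] known bad.
    Returns (first_bad_commit, number_of_tests_performed).
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid      # culprit is at mid or earlier
        else:
            good = mid     # culprit is after mid
    return commits[bad], tests


# 5000 pretend commits; commit 3210 introduced the bug.
history = list(range(5000))
culprit, tests = bisect_first_bad(history, lambda c: c >= 3210)
print(culprit, tests)
```

Each test halves the remaining range, so a few thousand commits need on the
order of a dozen kernel builds, in line with the 10-15 quoted above.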

So in this case, since apparently 2.6.27.41 is good, and 2.6.28 is not, it
would be something like this:

# clone hpa's tree that has all the stable releases in one place
git clone git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-allstable.git

cd linux-2.6-allstable
git bisect start
git bisect bad v2.6.28
git bisect good v2.6.27.41

and off you go.

NOTE! Bisection depends very much on the bug being 100% reproducible. If
you ever mark a good kernel bad (because you messed up) or a bad kernel
good (because the bug wasn't 100% reproducible, so you _thought_ it was
good even though the bug was present and just happened to hide), the end
result of the bisect will be totally unreliable and seriously screwed up.
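
A small Python sketch shows how a single wrong verdict derails the search.
It again assumes a linear history numbered 0..n-1; `bisect_first_bad`,
`honest` and `one_mistake` are hypothetical names used only for this
illustration.

```python
def bisect_first_bad(n, verdict):
    """Binary search over commits 0..n-1; verdict(i) is True if i tests bad."""
    good, bad = 0, n - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if verdict(mid):
            bad = mid
        else:
            good = mid
    return bad


FIRST_BAD = 700  # the commit that really introduced the bug


def honest(i):
    return i >= FIRST_BAD


def one_mistake(wrong_at):
    """Verdict function that gives the opposite answer for one commit only."""
    def verdict(i):
        truth = i >= FIRST_BAD
        return (not truth) if i == wrong_at else truth
    return verdict


right = bisect_first_bad(1000, honest)
wrong = bisect_first_bad(1000, one_mistake(499))
print(right, wrong)
```

Marking the good commit 499 as bad throws away the half of the range that
contains the real culprit, and the search then converges confidently on 499
itself. Nothing in the procedure flags the mistake.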

So after a successful bisect, it is usually a good idea to try to go back
to the original known-bad kernel, and then revert the commit that was
indicated as the bad one (assuming the revert works - it could be that the
bad one ends up being fundamental to other commits after it), and test
that yes, that really fixes the bug.

It gets more complicated if the bisect hits kernels that you can't test
because they have _unrelated_ issues on that machine (compile failures or
just other bugs that hide the actual floppy behavior), but generally
bisection is pretty simple. "man git-bisect" does have some extra
pointers.

So git bisect may be somewhat time-consuming and mindless, but for
reliably triggering bugs where nobody really knows what caused the bug it
is a _really_ convenient thing to do. The only thing you need is a
reliably triggering test-case, and some time.

Linus

2009-12-17 22:11:42

by Alain Knaff

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)

Linus Torvalds wrote:
>
> On Thu, 17 Dec 2009, Alain Knaff wrote:
>> For the moment, I have a very small sample of hardware:
>> 1. One machine which works (my own): Athlon XP 1800+ processor
>> 2. One which doesn't work (Mark's)
>
> Ok. I don't think I even have any machines with floppy drives any more
> (one external USB drive somewhere gathering dust just in case I ever
> encounter a floppy again).

Well, on my new box, I have no floppy drive either. The one I mentioned
is an old machine that I kept around just in case I needed to debug
floppy-related problems.

>> I might get access to a wider sample of boxen in a week or so, in order
>> to do some stats.
>
> Ok, I was more thinking "we have a bugzilla with ten different people
> reporting this". If it's just a single machine, that's not going to be
> relevant.

We do have a bugzilla
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=548434 , but
unfortunately it has only 2 people so far having seen the bug, one of
which (ael) turned out to be a false alert (dusty drive).

>
>> What's the easiest way to find out the chipset?
>>
>> Here's already the output of lspci from my machine (works):
>>
>> 00:00.0 Host bridge: VIA Technologies, Inc. VT8377 [KT400/KT600 AGP] Host Bridge
>> 00:01.0 PCI bridge: VIA Technologies, Inc. VT8235 PCI Bridge
>> 00:11.0 ISA bridge: VIA Technologies, Inc. VT8235 ISA Bridge
>
> Yeah, lspci (and generally only the northbridge and southbridge matters,
> the "ISA bridge" might technically be relevant, but since it's universally
> on the same die as the southbridge, I left it in there just for kicks).

Good. Here's some info about some machines of Mark which do have the
problem (there's more than one, fortunately):

1st one showing the problem (claimed to be AMD 790x chipset):

00:00.0 Host bridge: ATI Technologies Inc RD790 Northbridge only dual
slot PCI-e_GFX and HT3 K8 part
00:02.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge
(external gfx0 port A)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller

2nd one showing the problem (also claimed to be AMD 790x chipset):

00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge
(int gfx)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller

He also has several machines that do work:

1st one that does work:
00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)

... and a couple more that he hasn't gotten around to testing.

[...]
> Only the "it doesn't work on xyz" is likely interesting. The machines it
> works on are probably uninteresting statistically.

I understand... (working machine above just mentioned for completeness'
sake).

[...]
> You'd need a git tree that contains both the working and non-working
> versions, and then literally just do
>
> git bisect start
> git bisect good <known good version number here>
> git bisect bad <known bad version here>
>
> and it will give you a commit to try. Compile, test, see if it's good or
> bad, and do
>
> git bisect [good|bad]
>
> depending on the result. Rinse and repeat (depending on how tight the
> initial good/bad commits were, it will need 10-15 kernel tests).

... and how do I check out the most recent good / oldest bad kernel for
compilation?


> So in this case, since apparently 2.6.27.41 is good, and 2.6.28 is not, it
> would be something like this:
>
> # clone hpa's tree that has all the stable releases in one place
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-allstable.git
>
> cd linux-2.6-allstable
> git bisect start
> git bisect bad v2.6.28
> git bisect good v2.6.27.41
>
> and off you go.

ok...

> NOTE! Bisection depends very much on the bug being 100% reproducible. If
> you ever mark a good kernel bad (because you messed up) or a bad kernel
> good (because the bug wasn't 100% reproducible, so you _thought_ it was
> good even though the bug was present and just happened to hide), the end
> result of the bisect will be totally unreliable and seriously screwed up.
>
> So after a successful bisect, it is usually a good idea to try to go back
> to the original known-bad kernel, and then revert the commit that was
> indicated as the bad one (assuming the revert works - it could be that the
> bad one ends up being fundamental to other commits after it), and test
> that yes, that really fixes the bug.

What command lines would I use for that revert?

> It gets more complicated if the bisect hits kernels that you can't test
> because they have _unrelated_ issues on that machine (compile failures or
> just other bugs that hide the actual floppy behavior), but generally
> bisection is pretty simple. "man git-bisect" does have some extra
> pointers.
>
> So git bisect may be somewhat time-consuming and mindless, but for
> reliably triggering bugs where nobody really knows what caused the bug it
> is a _really_ convenient thing to do. The only thing you need is a
> reliably triggering test-case, and some time.
>
> Linus

Alain

2009-12-17 22:43:29

by Linus Torvalds

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)



On Thu, 17 Dec 2009, Alain Knaff wrote:
> [...]
> > You'd need a git tree that contains both the working and non-working
> > versions, and then literally just do
> >
> > git bisect start
> > git bisect good <known good version number here>
> > git bisect bad <known bad version here>
> >
> > and it will give you a commit to try. Compile, test, see if it's good or
> > bad, and do
> >
> > git bisect [good|bad]
> >
> > depending on the result. Rinse and repeat (depending on how tight the
> > initial good/bad commits were, it will need 10-15 kernel tests).
>
> ... and how do I check out the most recent good / oldest bad kernel for
> compilation?

'git bisect' does all that for you. You don't need to check out the
kernels you mark good or bad - git will just calculate the commit graphs,
and pick a commit that is in the "middle" between them, and check out that
commit.

> > So after a successful bisect, it is usually a good idea to try to go back
> > to the original known-bad kernel, and then revert the commit that was
> > indicated as the bad one (assuming the revert works - it could be that the
> > bad one ends up being fundamental to other commits after it), and test
> > that yes, that really fixes the bug.
>
> What command lines would I use for that revert?

git revert <sha1-that-git-bisect-reported>

but even if that revert isn't successful, just the bisection result will
be very interesting (assuming it all looks sane, of course - as mentioned,
sometimes bisect results get screwed up because the bug isn't entirely
reproducible due to timing etc).

Linus

2009-12-17 23:25:19

by Alain Knaff

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)

Linus Torvalds wrote:
>
> On Thu, 17 Dec 2009, Alain Knaff wrote:
>> [...]
>>> You'd need a git tree that contains both the working and non-working
>>> versions, and then literally just do
>>>
>>> git bisect start
>>> git bisect good <known good version number here>
>>> git bisect bad <known bad version here>
>>>
>>> and it will give you a commit to try. Compile, test, see if it's good or
>>> bad, and do
>>>
>>> git bisect [good|bad]
>>>
>>> depending on the result. Rinse and repeat (depending on how tight the
>>> initial good/bad commits were, it will need 10-15 kernel tests).
>> ... and how do I check out the most recent good / oldest bad kernel for
>> compilation?
>
> 'git bisect' does all that for you. You don't need to check out the
> kernels you mark good or bad - git will just calculate the commit graphs,
> and pick a commit that is in the "middle" between them, and check out that
> commit.
>
>>> So after a successful bisect, it is usually a good idea to try to go back
>>> to the original known-bad kernel, and then revert the commit that was
>>> indicated as the bad one (assuming the revert works - it could be that the
>>> bad one ends up being fundamental to other commits after it), and test
>>> that yes, that really fixes the bug.
>> What command lines would I use for that revert?
>
> git revert <sha1-that-git-bisect-reported>
>
> but even if that revert isn't successful, just the bisection result will
> be very interesting (assuming it all looks sane, of course - as mentioned,
> sometimes bisect results get screwed up because the bug isn't entirely
> reproducible due to timing etc).
>
> Linus

thanks for these explanations, that makes it clearer indeed.

Now, I only need to find a machine locally to test this on. Or Mark: are
you confident in doing this yourself?

Thanks,

Alain

2009-12-18 08:59:59

by Mark Hounschell

Subject: Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)

On 12/17/2009 06:24 PM, Alain Knaff wrote:

>
> Now, I only need to find a machine locally to test this on. Or Mark: are
> you confident in doing this yourself?
>

I'll give it a shot. Sounds easy enough. If I have problems, I'll yell.

Mark

2009-12-18 10:55:09

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/18/2009 03:59 AM, Mark Hounschell wrote:
> On 12/17/2009 06:24 PM, Alain Knaff wrote:
>
>>
>> Now, I only need to find a machine locally to test this on. Or Mark: are
>> you confident in doing this yourself?
>>
>
> I'll give it a shot. Sounds easy enough. If I have problems, I'll yell.
>

Ok, I ran into a build issue on the third one.

harley:/usr/src # git clone git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-allstable.git
Initialized empty Git repository in /usr/src/linux-2.6-allstable/.git/
remote: Counting objects: 1486248, done.
remote: Compressing objects: 100% (248092/248092), done.
Receiving objects: 100% (1486248/1486248), 323.35 MiB | 6753 KiB/s, done.
remote: Total 1486248 (delta 1236282), reused 1476516 (delta 1227133)
Resolving deltas: 100% (1236282/1236282), done.
Checking out files: 100% (31502/31502), done.


harley:/usr/src # cd linux-2.6-allstable
harley:/usr/src/linux-2.6-allstable # git bisect start
harley:/usr/src/linux-2.6-allstable # git bisect bad v2.6.28
harley:/usr/src/linux-2.6-allstable # git bisect good v2.6.27.41
Bisecting: a merge base must be tested
[3fa8749e584b55f1180411ab1b51117190bac1e5] Linux 2.6.27

Build and test kernel: This one worked so:

harley:/usr/src/linux-2.6-allstable # git bisect good
Bisecting: 4879 revisions left to test after this (roughly 12 steps)
[c813b4e16ead3c3df98ac84419d4df2adf33fe01] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6

Build and test kernel: This one worked so:

harley:/usr/src/linux-2.6-allstable # git bisect good
Bisecting: 2443 revisions left to test after this (roughly 11 steps)
[db563fc2e80534f98c7f9121a6f7dfe41f177a79] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6

This one doesn't build:

CC [M] fs/ext3/super.o
fs/ext3/super.c: In function ‘ext3_quota_on’:
fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
fs/ext3/super.c:2839: error: for each function it appears in.)
make[2]: *** [fs/ext3/super.o] Error 1
make[1]: *** [fs/ext3] Error 2
make: *** [fs] Error 2

I haven't yet determined that I can but, if I were to make a modification to the
tree now to fix this would that screw up the bisect process?

Regards
Mark

2009-12-18 15:01:35

by Krzysztof Halasa

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

Mark Hounschell writes:

> harley:/usr/src/linux-2.6-allstable # git bisect good
> Bisecting: 2443 revisions left to test after this (roughly 11 steps)
> [db563fc2e80534f98c7f9121a6f7dfe41f177a79] Merge
> git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
>
> This one doesn't build:
>
> CC [M] fs/ext3/super.o
> fs/ext3/super.c: In function ‘ext3_quota_on’:
> fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
> fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
> fs/ext3/super.c:2839: error: for each function it appears in.)
> make[2]: *** [fs/ext3/super.o] Error 1
> make[1]: *** [fs/ext3] Error 2
> make: *** [fs] Error 2
>
> I haven't yet determined that I can but, if I were to make a modification to the
> tree now to fix this would that screw up the bisect process?

It won't, in such cases.
But you can also git reset --hard another_commit_id (while doing git
bisect) if it fixes this problem (e.g. some next commit).

And you can skip uninteresting parts of the tree when starting git
bisect (though if the cause is in skipped parts, the results will be
meaningless).
--
Krzysztof Halasa

2009-12-18 15:22:27

by Linus Torvalds

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)



On Fri, 18 Dec 2009, Mark Hounschell wrote:
>
> This one doesn't build:
>
> CC [M] fs/ext3/super.o
> fs/ext3/super.c: In function ‘ext3_quota_on’:
> fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
> fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
> fs/ext3/super.c:2839: error: for each function it appears in.)
> make[2]: *** [fs/ext3/super.o] Error 1
> make[1]: *** [fs/ext3] Error 2
> make: *** [fs] Error 2
>
> I haven't yet determined that I can but, if I were to make a modification to the
> tree now to fix this would that screw up the bisect process?

You can safely fix unrelated problems without screwing up the bisection.
And in this case you can be pretty sure that this is unrelated, so it's
all ok.

The fix for that silly problem is

- path_put(&nd.path);
+ path_put(&path);

(it's due to a silent merge failure - it merged cleanly, but semantics had
changed in a branch and impacted code that was newly introduced in another
branch).


Linus

2009-12-18 15:28:34

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/18/2009 10:22 AM, Linus Torvalds wrote:
>
>
> On Fri, 18 Dec 2009, Mark Hounschell wrote:
>>
>> This one doesn't build:
>>
>> CC [M] fs/ext3/super.o
>> fs/ext3/super.c: In function ‘ext3_quota_on’:
>> fs/ext3/super.c:2839: error: ‘nd’ undeclared (first use in this function)
>> fs/ext3/super.c:2839: error: (Each undeclared identifier is reported only once
>> fs/ext3/super.c:2839: error: for each function it appears in.)
>> make[2]: *** [fs/ext3/super.o] Error 1
>> make[1]: *** [fs/ext3] Error 2
>> make: *** [fs] Error 2
>>
>> I haven't yet determined that I can but, if I were to make a modification to the
>> tree now to fix this would that screw up the bisect process?
>
> You can safely fix unrelated problems without screwing up the bisection.
> And in this case you can be pretty sure that this is unrelated, so it's
> all ok.
>
> The fix for that silly problem is
>
> - path_put(&nd.path);
> + path_put(&path);
>
> (it's due to a silent merge failure - it merged cleanly, but semantics had
> changed in a branch and impacted code that was newly introduced in another
> branch).

Yep, thanks. I'm past that now. But haven't done a bisect [good|bad] on the
results of that one yet. Did you see Alain's email response to my bisect
progress report to him?

I'm still at a loss as to how to proceed?

Mark

2009-12-18 15:46:07

by Linus Torvalds

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)



On Fri, 18 Dec 2009, Mark Hounschell wrote:
>
> Yep, thanks. I'm past that now. But haven't done a bisect [good|bad] on the
> results of that one yet. Did you see Alain's email response to my bisect
> progress report to him?
>
> I'm still at a loss as to how to proceed?

Ahh, the HPET issue.

That one is actually very interesting information, because we've had
problems with HPET before. But what I would suggest is to try to continue
to bisect with HPET enabled (to see the problem), and the commit that you
couldn't even boot with HPET enabled you should not count as good or bad
because you just don't know.

You can do "git bisect skip" to make git know that some particular commit
is not a commit you can test, and you can also move away from a whole
problematic region to another area by doing

git bisect visualize

to bring up a graphical gitk view of what all you have left to bisect,
pick a good point (still _reasonably_ close to the middle) there, and do

git reset --hard <the-point-you-want-to-test>

and try that kernel instead of the one git bisect suggested.
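
The effect of skipping untestable commits can be sketched in Python as
follows. This assumes a linear history and simply picks the testable commit
nearest the midpoint; git's real selection heuristic is different, and
`bisect_with_skips` is a hypothetical name used only here.

```python
def bisect_with_skips(n, is_bad, untestable):
    """Bisect commits 0..n-1 (0 known good, n-1 known bad), never testing
    commits in `untestable`. Returns the list of commits that could still be
    the first bad one when nothing testable is left."""
    good, bad = 0, n - 1
    while True:
        candidates = [i for i in range(good + 1, bad) if i not in untestable]
        if not candidates:
            # Everything between good and bad was skipped (or nothing is left).
            return list(range(good + 1, bad + 1))
        mid = (good + bad) // 2
        pick = min(candidates, key=lambda i: abs(i - mid))
        if is_bad(pick):
            bad = pick
        else:
            good = pick


def is_bad(commit):
    return commit >= 40  # pretend commit 40 introduced the bug


exact = bisect_with_skips(100, is_bad, untestable={50})
fuzzy = bisect_with_skips(100, is_bad, untestable={39})
print(exact, fuzzy)
```

Skipping a commit far from the culprit costs nothing, but skipping one right
next to it means the result can only be narrowed down to a short range of
suspects rather than a single commit.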

But this floppy DMA inconsistency being somehow HPET-related is
interesting in itself. One thing that HPET obviously does is change how
we read the time - and what that can cause (totally indirectly) is that
now we don't touch the southbridge with IO accesses nearly as much,
whereas going to the old 8253 PIT will touch the same legacy
chip support that implements the floppy controller itself.

So it's entirely possible that the reason a non-HPET setup doesn't show
this is that the accesses to the i8253 PIT part will "synchronize" the old
floppy controller too, and hide some issue.

But still, I assume you had HPET enabled in 2.6.27, so it would be
interesting to see exactly when the problem starts.

Linus
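
The skip workflow described in the message above can be illustrated with a
small simulation. This is only a sketch under stated assumptions: history is
linear, commits are numbered 0..n-1, commit 0 is known good, the last commit
is known bad, and the test callback (hypothetical, standing in for "build,
boot, try to format a floppy") answers good, bad or skip. It is not git's
actual commit-selection algorithm.

```python
from typing import Callable, List, Set

GOOD, BAD, SKIP = "good", "bad", "skip"


def bisect_with_skip(n_commits: int, test: Callable[[int], str]) -> List[int]:
    """Return the candidate indices for the first bad commit.

    Invariant: commit `lo` is known good and commit `hi` is known bad.
    Commits that cannot be tested are remembered and never picked again.
    """
    lo, hi = 0, n_commits - 1
    skipped: Set[int] = set()
    while True:
        untested = [c for c in range(lo + 1, hi) if c not in skipped]
        if not untested:
            break
        middle = (lo + hi) // 2
        # Pick the testable commit closest to the middle of the range.
        pick = min(untested, key=lambda c: abs(c - middle))
        verdict = test(pick)
        if verdict == GOOD:
            lo = pick
        elif verdict == BAD:
            hi = pick
        else:
            skipped.add(pick)
    # Everything left between lo and hi was skipped, so any of those
    # commits (or hi itself) could be the first bad one.
    return list(range(lo + 1, hi + 1))


def make_test(first_bad: int, unbootable: Set[int]) -> Callable[[int], str]:
    """Build a fake test: unbootable commits are skipped, the rest judged."""
    def test(commit: int) -> str:
        if commit in unbootable:
            return SKIP
        return BAD if commit >= first_bad else GOOD
    return test


if __name__ == "__main__":
    print(bisect_with_skip(16, make_test(9, set())))
    print(bisect_with_skip(16, make_test(9, {7, 8})))
```

When untestable commits sit right next to the real culprit, the result is a
range of candidates rather than a single commit, which is why moving to a
testable point by hand can be worth the effort.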

2009-12-18 20:05:30

by Mark Hounschell

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/18/2009 10:45 AM, Linus Torvalds wrote:
>
>
> On Fri, 18 Dec 2009, Mark Hounschell wrote:
>>
>> Yep, thanks. I'm past that now. But haven't done a bisect [good|bad] on the
>> results of that one yet. Did you see Alain's email response to my bisect
>> progress report to him?
>>
>> I'm still at a loss as to how to proceed?
>
> Ahh, the HPET issue.
>
> That one is actually very interesting information, because we've had
> problems with HPET before. But what I would suggest is to try to continue
> to bisect with HPET enabled (to see the problem), and the commit that you
> couldn't even boot with HPET enabled you should not count as good or bad
> because you just don't know.
>
> You can do "git bisect skip" to make git know that some particular commit
> is not a commit you can test, and you can also move away from a whole
> problematic region to another area by doing
>
> git bisect visualize
>
> to bring up a graphical gitk view of what all you have left to bisect,
> pick a good point (still _reasonably_ close to the middle) there, and do
>
> git reset --hard <the-point-you-want-to-test>
>
> and try that kernel instead of the one git bisect suggested.
>
> But this floppy DMA inconsistency being somehow HPET-related is
> interesting in itself. One thing that HPET does is to obviously change how
> we read the time - and what that can cause (totally indirectly) is that
> now we don't touch the southbridge with IO accesses nearly as much,
> because we read the HPET instead of going to the old 8253 PIT, and PIT
> accesses touch the same legacy chip support that implements the floppy
> controller itself.
>
> So it's entirely possible that the reason a non-HPET setup doesn't show
> this is that the accesses to the i8253 PIT part will "synchronize" the old
> floppy controller too, and hide some issue.
>
> But still, I assume you had HPET enabled in 2.6.27, so it would be
> interesting to see exactly when the problem starts.
>
> Linus
>

It looks like I may have to back up and first find the points that let me,
and stop me from, booting with the HPET enabled. Before I change direction, can
the git-bisect start sequence use the SHA1 id for the starting 'goods' and
'bads'? I don't see a reference to that in the doc.

Thanks
Mark

2009-12-18 20:16:12

by Linus Torvalds

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)



On Fri, 18 Dec 2009, Mark Hounschell wrote:
>
> It looks like I may have to back up and first find the points that, let me,
> and stop me, booting with the HPET enabled. Before I change direction, can
> the git-bisect start sequence use the SHA1 id for the starting 'goods' and
> 'bads'? I don't see reference to that in the doc.

You can always use a SHA1 id instead of a tag. So when you did

git bisect good v2.6.17.4

you could always have replaced that "v2.6.17.4" with the SHA1 of the
commit.

In git, the SHA1 ID's are the "real" names - the tags and branch names are
purely for human-readable decoration. Git always turns them into SHA1 id's
internally.

Linus
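
To make the "SHA1 IDs are the real names" point concrete, here is a minimal
sketch of how a git object name is derived from content, plus a toy name
lookup. The hashing rule (SHA-1 over a "<type> <length>" header, a NUL byte,
then the body) is git's object format; the resolve() helper, the refs
dictionary and the tag name "v-example" are hypothetical illustrations, and a
blob is used instead of a commit purely to keep the example short.

```python
import hashlib
from typing import Dict


def git_object_id(obj_type: str, body: bytes) -> str:
    """Compute the SHA-1 object name git assigns to an object."""
    header = f"{obj_type} {len(body)}\0".encode("ascii")
    return hashlib.sha1(header + body).hexdigest()


def resolve(name: str, refs: Dict[str, str]) -> str:
    """Toy resolver: a human-readable name maps to an id, an id maps to itself."""
    if name in refs:
        return refs[name]
    if len(name) == 40 and all(ch in "0123456789abcdef" for ch in name):
        return name
    raise KeyError(f"unknown revision: {name}")


if __name__ == "__main__":
    object_id = git_object_id("blob", b"hello\n")
    refs = {"v-example": object_id}  # hypothetical tag name
    # Both spellings end up naming the same object.
    print(resolve("v-example", refs) == resolve(object_id, refs))
```

Whichever spelling is typed on the command line, the lookup ends at the same
40-character id, which is why a tag and its SHA1 are interchangeable as
arguments to git bisect good or git bisect bad.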

2009-12-22 15:11:53

by Mark Hounschell

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/18/2009 03:15 PM, Linus Torvalds wrote:
>
>
> On Fri, 18 Dec 2009, Mark Hounschell wrote:
>>
>> It looks like I may have to back up and first find the points that, let me,
>> and stop me, booting with the HPET enabled. Before I change direction, can
>> the git-bisect start sequence use the SHA1 id for the starting 'goods' and
>> 'bads'? I don't see reference to that in the doc.
>
> You can always use a SHA1 id instead of a tag. So when you did
>
> git bisect good v2.6.17.4
>
> you could always have replaced that "v2.6.17.4" with the SHA1 of the
> commit.
>
> In git, the SHA1 ID's are the "real" names - the tags and branch names are
> purely for human-readable decoration. Git always turns them into SHA1 id's
> internally.
>
> Linus
>

Ok, I may have something that might help.

# git bisect bad
26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
Author: [email protected] <[email protected]>
Date: Fri Sep 5 18:02:18 2008 -0700

x86: HPET_MSI Initialise per-cpu HPET timers

Initialize a per CPU HPET MSI timer when possible. We retain the HPET
timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
setup the remaining HPET timers as per CPU MSI based timers. This per CPU
timer will eliminate the need for timer broadcasting with IRQ 0 when there
is non-functional LAPIC timer across CPU deep C-states.

If there are more CPUs than number of available timers, CPUs that do not
find any timer to use will continue using LAPIC and IRQ 0 broadcast.

Signed-off-by: Venkatesh Pallipadi <[email protected]>
Signed-off-by: Shaohua Li <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

:040000 040000 b0a11fa0abdc591427e78236a1f25f26b824140e
f2e9b13cf9e2eb7e0fc101660b1e1d499033d78f M arch


And of course this was the first commit that I could not boot if I had hpet
enabled. To get this one to boot (single user mode only) I had to add the
quiet cmdline option and the following patch to arch/x86/kernel/hpet.c, from

commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a

@@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
 {

        if (request_irq(dev->irq, hpet_interrupt_handler,
-                       IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
+                       IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
                return -1;

        disable_irq(dev->irq);


AND add the quiet cmdline option.

Also, on all the machines where it does work with hpets enabled, I don't see
the hpet2 entry in /proc/interrupts that shows up below.


cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
0: 82 0 3 0 IO-APIC-edge timer
1: 0 0 1712 6 IO-APIC-edge i8042
3: 0 0 6 0 IO-APIC-edge
4: 0 0 6 0 IO-APIC-edge
6: 0 0 4 0 IO-APIC-edge floppy
8: 0 0 60 0 IO-APIC-edge rtc0
9: 0 0 0 0 IO-APIC-fasteoi acpi
12: 0 0 37798 179 IO-APIC-edge i8042
14: 0 0 16462 71 IO-APIC-edge pata_atiixp
15: 0 0 5713 17 IO-APIC-edge pata_atiixp
16: 0 0 904 2 IO-APIC-fasteoi aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel, ni-pci-gpib
17: 0 0 2 0 IO-APIC-fasteoi ehci_hcd:usb1, parport0, ni-pci-gpib
18: 0 0 49940 90 IO-APIC-fasteoi ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, nvidia
19: 0 0 703 2 IO-APIC-fasteoi aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
22: 0 0 1303 15 IO-APIC-fasteoi ahci


24: 261763 0 0 0 HPET_MSI-edge hpet2


29: 0 0 220 5 PCI-MSI-edge sky2@pci:0000:04:00.0
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 138 271356 264446 261050 Local timer interrupts
SPU: 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 Performance monitoring interrupts
PND: 0 0 0 0 Performance pending work
RES: 4511 9275 8470 8086 Rescheduling interrupts
CAL: 3624 8666 523 4543 Function call interrupts
TLB: 981 1111 1065 1058 TLB shootdowns
ERR: 0
MIS: 0


Regards
Mark

2009-12-22 17:38:54

by Linus Torvalds

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)


[ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
details, but Mark is basically chasing down a situation where the floppy
driver seems to have trouble formatting floppies, and it happened
between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
memory block transfers the wrong value for the first byte of the block.

Which should be impossible, but whatever. Some part of the system has a
cached buffer that isn't flushed.

What gets _you_ guys involved is that Mark cannot reproduce the bug if
HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
pure luck while bisecting, because some time during his bisect, his
machine wouldn't even boot with HPET.

So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
2.6.28 (and current -git) does not. Any ideas? ]

On Tue, 22 Dec 2009, Mark Hounschell wrote:
>
> Ok, I may have something that might help.
>
> # git bisect bad
> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
> Author: [email protected] <[email protected]>
> Date: Fri Sep 5 18:02:18 2008 -0700
>
> x86: HPET_MSI Initialise per-cpu HPET timers
>
> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
> timer will eliminate the need for timer broadcasting with IRQ 0 when there
> is non-functional LAPIC timer across CPU deep C-states.
>
> If there are more CPUs than number of available timers, CPUs that do not
> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>
> Signed-off-by: Venkatesh Pallipadi <[email protected]>
> Signed-off-by: Shaohua Li <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
>
> And of course this was the first commit that I could not boot if I had hpet
> enabled. To get this one to boot (single user mode only) I had to add the
> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c, from
>
> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>
> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
> {
>
> if (request_irq(dev->irq, hpet_interrupt_handler,
> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
> return -1;
>
> disable_irq(dev->irq);
>
> AND add the quiet cmdline option.

Ok, so we know why HPET didn't boot for you, and that was fixed later (by
that 5ceb1a04). But is this also when the floppy started mis-behaving?

IOW, _if_ you boot with that fix from commit 5ceb1a04 (and the quiet
option - I wonder what that is about: do you have any ideas?), is the
per-CPU HPET timer commit also the commit that causes floppy problems, or
is this purely a "bisect when HPET became a boot-up problem"?

Linus

---
> Also, on all the machines where it does work with hpets enabled, I don't see
> the hpet2 entry in /proc/interrupts that shows up below.
>
>
> cat /proc/interrupts
> CPU0 CPU1 CPU2 CPU3
> 0: 82 0 3 0 IO-APIC-edge timer
> 1: 0 0 1712 6 IO-APIC-edge i8042
> 3: 0 0 6 0 IO-APIC-edge
> 4: 0 0 6 0 IO-APIC-edge
> 6: 0 0 4 0 IO-APIC-edge floppy
> 8: 0 0 60 0 IO-APIC-edge rtc0
> 9: 0 0 0 0 IO-APIC-fasteoi acpi
> 12: 0 0 37798 179 IO-APIC-edge i8042
> 14: 0 0 16462 71 IO-APIC-edge pata_atiixp
> 15: 0 0 5713 17 IO-APIC-edge pata_atiixp
> 16: 0 0 904 2 IO-APIC-fasteoi aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel, ni-pci-gpib
> 17: 0 0 2 0 IO-APIC-fasteoi ehci_hcd:usb1, parport0, ni-pci-gpib
> 18: 0 0 49940 90 IO-APIC-fasteoi ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, nvidia
> 19: 0 0 703 2 IO-APIC-fasteoi aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
> 22: 0 0 1303 15 IO-APIC-fasteoi ahci
>
> 24: 261763 0 0 0 HPET_MSI-edge hpet2
>
> 29: 0 0 220 5 PCI-MSI-edge sky2@pci:0000:04:00.0
> NMI: 0 0 0 0 Non-maskable interrupts
> LOC: 138 271356 264446 261050 Local timer interrupts
> SPU: 0 0 0 0 Spurious interrupts
> PMI: 0 0 0 0 Performance monitoring interrupts
> PND: 0 0 0 0 Performance pending work
> RES: 4511 9275 8470 8086 Rescheduling interrupts
> CAL: 3624 8666 523 4543 Function call interrupts
> TLB: 981 1111 1065 1058 TLB shootdowns
> ERR: 0
> MIS: 0
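
The symptom summarized at the top of this message (only the first byte of the
block is wrong, and it carries the value from the previous transfer) can be
captured in a toy model. This is purely an illustration of the observed
behaviour, not a claim about which piece of hardware or code holds the stale
byte; the class name, the one-byte latch and the flush parameter are all
hypothetical. The byte values are the ones from the example given earlier in
the thread.

```python
from typing import Optional


class ToyDmaEngine:
    """Toy transfer engine that keeps a one-byte latch for the first byte.

    Hypothetical model: if the latch is not flushed before a transfer, the
    first byte sent is whatever the buffer held during the previous transfer.
    """

    def __init__(self) -> None:
        self._latch: Optional[int] = None

    def transfer(self, buf: bytes, flush: bool) -> bytes:
        if not buf:
            return b""
        if flush or self._latch is None:
            first = buf[0]           # latch refreshed from memory
        else:
            first = self._latch      # stale value from the previous transfer
        self._latch = buf[0]         # remember this buffer's first byte
        return bytes([first]) + buf[1:]


if __name__ == "__main__":
    engine = ToyDmaEngine()
    print(engine.transfer(bytes.fromhex("33445566"), flush=False).hex())
    print(engine.transfer(bytes.fromhex("778899aa"), flush=False).hex())
```

In this model a flush before each transfer makes the problem disappear, which
matches the theory being tested in the thread but does not by itself show
where such a latch would live.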

2009-12-22 17:57:17

by Mark Hounschell

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>
> [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
> details, but Mark is basically chasing down a situation where the floppy
> driver seems to have trouble formatting floppies, and it happened
> between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
> memory block transfers the wrong value for the first byte of the block.
>
> Which should be impossible, but whatever. Some part of the system has a
> cached buffer that isn't flushed.
>
> What gets _you_ guys involved is that Mark cannot reproduce the bug if
> HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
> pure luck while bisecting, because some time during his bisect, his
> machine wouldn't even boot with HPET.
>
> So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
> 2.6.28 (and current -git) does not. Any ideas? ]
>
> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>
>> Ok, I may have something that might help.
>>
>> # git bisect bad
>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>> Author: [email protected] <[email protected]>
>> Date: Fri Sep 5 18:02:18 2008 -0700
>>
>> x86: HPET_MSI Initialise per-cpu HPET timers
>>
>> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>> timer will eliminate the need for timer broadcasting with IRQ 0 when there
>> is non-functional LAPIC timer across CPU deep C-states.
>>
>> If there are more CPUs than number of available timers, CPUs that do not
>> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>>
>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
>> Signed-off-by: Shaohua Li <[email protected]>
>> Signed-off-by: Ingo Molnar <[email protected]>
>>
>> And of course this was the first commit that I could not boot if I had hpet
>> enabled. To get this one to boot (single user mode only) I had to add the
>> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c, from
>>
>> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>
>> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>> {
>>
>> if (request_irq(dev->irq, hpet_interrupt_handler,
>> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
>> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>> return -1;
>>
>> disable_irq(dev->irq);
>>
>> AND add the quiet cmdline option.
>
> Ok, so we know why HPET didn't boot for you, and that was fixed later (by
> that 5ceb1a04). But is this also when the floppy started mis-behaving?
>

Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
working and also when I could no longer boot with hpet enabled. Commit
5ceb1a04 is where I found I could boot again with hpet enabled. It was a
simple patch, so I backported it onto the commits I was testing in order to
be able to boot with hpet on. I did 2 different bisects: the first to find
out when I could boot again with hpet on, the second to find which commit
caused the floppy problem, applying the patch from the first bisect
(5ceb1a04) while doing the second.

> IOW, _if_ you boot with that fix from commit 5ceb1a04 (and the quiet
> option - I wonder what that is about: do you have any ideas?), is the
> per-CPU HPET timer commit also the commit that causes floppy problems, or
> is this purely a "bisect when HPET became a boot-up problem"?
>

The quiet option was only needed because, with that 5ceb1a04 commit applied
to the kernels I was interested in, kernel messages of some kind went on
for hours and I could not get a login prompt. They went by too fast to read,
and I didn't have a serial console available to capture them.
They must not have been too important or critical, because the machine acted
as normally as any machine in single user mode.

But once I got to a single user login prompt it was for sure the same
floppy problem.

>
> ---
>> Also, on all the machines where it does work with hpets enabled, I don't see
>> the hpet2 entry in /proc/interrupts that shows up below.
>>
>>
>> cat /proc/interrupts
>> CPU0 CPU1 CPU2 CPU3
>> 0: 82 0 3 0 IO-APIC-edge timer
>> 1: 0 0 1712 6 IO-APIC-edge i8042
>> 3: 0 0 6 0 IO-APIC-edge
>> 4: 0 0 6 0 IO-APIC-edge
>> 6: 0 0 4 0 IO-APIC-edge floppy
>> 8: 0 0 60 0 IO-APIC-edge rtc0
>> 9: 0 0 0 0 IO-APIC-fasteoi acpi
>> 12: 0 0 37798 179 IO-APIC-edge i8042
>> 14: 0 0 16462 71 IO-APIC-edge pata_atiixp
>> 15: 0 0 5713 17 IO-APIC-edge pata_atiixp
>> 16: 0 0 904 2 IO-APIC-fasteoi aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel, ni-pci-gpib
>> 17: 0 0 2 0 IO-APIC-fasteoi ehci_hcd:usb1, parport0, ni-pci-gpib
>> 18: 0 0 49940 90 IO-APIC-fasteoi ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, nvidia
>> 19: 0 0 703 2 IO-APIC-fasteoi aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
>> 22: 0 0 1303 15 IO-APIC-fasteoi ahci
>>
>> 24: 261763 0 0 0 HPET_MSI-edge hpet2
>>
>> 29: 0 0 220 5 PCI-MSI-edge sky2@pci:0000:04:00.0
>> NMI: 0 0 0 0 Non-maskable interrupts
>> LOC: 138 271356 264446 261050 Local timer interrupts
>> SPU: 0 0 0 0 Spurious interrupts
>> PMI: 0 0 0 0 Performance monitoring interrupts
>> PND: 0 0 0 0 Performance pending work
>> RES: 4511 9275 8470 8086 Rescheduling interrupts
>> CAL: 3624 8666 523 4543 Function call interrupts
>> TLB: 981 1111 1065 1058 TLB shootdowns
>> ERR: 0
>> MIS: 0
>


Regards
Mark
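
The two-bisect procedure described in this message can be sketched as a
simulation. Everything here is a simplified, hypothetical model: history is
linear and numbered 0..n-1, boots(commit, with_fix) and floppy_ok(commit)
stand in for the manual build-and-test steps, and the commit numbers used in
the demo are made up. The first search finds where booting with hpet works
again; the second applies that fix to every commit that would not otherwise
boot and then looks for the first commit with the floppy problem.

```python
from typing import Callable, Tuple


def first_true(lo: int, hi: int, pred: Callable[[int], bool]) -> int:
    """Smallest index in (lo, hi] where pred holds.

    Assumes pred(lo) is false, pred(hi) is true, and pred flips only once.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if pred(mid):
            hi = mid
        else:
            lo = mid
    return hi


def two_stage_bisect(
    n_commits: int,
    known_unbootable: int,
    boots: Callable[[int, bool], bool],
    floppy_ok: Callable[[int], bool],
) -> Tuple[int, int]:
    # Stage 1: find the commit where the kernel boots again without help.
    fix_commit = first_true(known_unbootable, n_commits - 1,
                            lambda c: boots(c, False))

    # Stage 2: bisect the floppy problem, carrying the fix where needed.
    def floppy_bad(commit: int) -> bool:
        needs_fix = not boots(commit, False)
        if not boots(commit, needs_fix):
            raise RuntimeError(f"commit {commit} cannot be tested")
        return not floppy_ok(commit)

    first_bad = first_true(0, n_commits - 1, floppy_bad)
    return fix_commit, first_bad


if __name__ == "__main__":
    # Made-up history: boot breaks at 40, is fixed at 70, floppy breaks at 40.
    def boots(commit: int, with_fix: bool) -> bool:
        return with_fix or commit < 40 or commit >= 70

    def floppy_ok(commit: int) -> bool:
        return commit < 40

    print(two_stage_bisect(100, 50, boots, floppy_ok))
```

In the demo the boot breakage and the floppy breakage start at the same
commit, mirroring what is reported above for 26afe5f2.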

2009-12-22 23:37:59

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
> >
> > [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
> > details, but Mark is basically chasing down a situation where the floppy
> > driver seems to have trouble formatting floppies, and it happened
> > between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
> > memory block transfers the wrong value for the first byte of the block.
> >
> > Which should be impossible, but whatever. Some part of the system has a
> > cached buffer that isn't flushed.
> >
> > What gets _you_ guys involved is that Mark cannot reproduce the bug if
> > HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
> > pure luck while bisecting, because some time during his bisect, his
> > machine wouldn't even boot with HPET.
> >
> > So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
> > 2.6.28 (and current -git) does not. Any ideas? ]
> >
> > On Tue, 22 Dec 2009, Mark Hounschell wrote:
> >>
> >> Ok, I may have something that might help.
> >>
> >> # git bisect bad
> >> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
> >> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
> >> Author: [email protected] <[email protected]>
> >> Date: Fri Sep 5 18:02:18 2008 -0700
> >>
> >> x86: HPET_MSI Initialise per-cpu HPET timers
> >>
> >> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
> >> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
> >> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
> >> timer will eliminate the need for timer broadcasting with IRQ 0 when there
> >> is non-functional LAPIC timer across CPU deep C-states.
> >>
> >> If there are more CPUs than number of available timers, CPUs that do not
> >> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
> >>
> >> Signed-off-by: Venkatesh Pallipadi <[email protected]>
> >> Signed-off-by: Shaohua Li <[email protected]>
> >> Signed-off-by: Ingo Molnar <[email protected]>
> >>
> >> And of course this was the first commit that I could not boot if I had hpet
> >> enabled. To get this one to boot (single user mode only) I had to add the
> >> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c, from
> >>
> >> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
> >>
> >> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
> >> {
> >>
> >> if (request_irq(dev->irq, hpet_interrupt_handler,
> >> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
> >> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
> >> return -1;
> >>
> >> disable_irq(dev->irq);
> >>
> >> AND add the quiet cmdline option.
> >
> > Ok, so we know why HPET didn't boot for you, and that was fixed later (by
> > that 5ceb1a04). But is this also when the floppy started mis-behaving?
> >
>
> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
> working
> and also when I could no longer boot with hpet enabled.


I am missing something here. Is commit 26afe5f2 where the system does not
boot with HPET, or is it where the floppy stops working when you boot
with HPET enabled?

Can you try "idle=halt" with both .27 and .28, with /proc/interrupts
output in each case? With that option, we should be using the local APIC
timer, and PIT, HPET or HPET with MSI should not really matter. Does it
still fail with .28 with that option?

Thanks,
Venki

2009-12-23 00:22:26

by Mark Hounschell

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/22/2009 06:37 PM, Pallipadi, Venkatesh wrote:
> On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
>> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>>>
>>> [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
>>> details, but Mark is basically chasing down a situation where the floppy
>>> driver seems to have trouble formatting floppies, and it happened
>>> between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
>>> memory block transfers the wrong value for the first byte of the block.
>>>
>>> Which should be impossible, but whatever. Some part of the system has a
>>> cached buffer that isn't flushed.
>>>
>>> What gets _you_ guys involved is that Mark cannot reproduce the bug if
>>> HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
>>> pure luck while bisecting, because some time during his bisect, his
>>> machine wouldn't even boot with HPET.
>>>
>>> So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
>>> 2.6.28 (and current -git) does not. Any ideas? ]
>>>
>>> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>>>
>>>> Ok, I may have something that might help.
>>>>
>>>> # git bisect bad
>>>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>>>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>>>> Author: [email protected] <[email protected]>
>>>> Date: Fri Sep 5 18:02:18 2008 -0700
>>>>
>>>> x86: HPET_MSI Initialise per-cpu HPET timers
>>>>
>>>> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>>>> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>>>> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>>>> timer will eliminate the need for timer broadcasting with IRQ 0 when there
>>>> is non-functional LAPIC timer across CPU deep C-states.
>>>>
>>>> If there are more CPUs than number of available timers, CPUs that do not
>>>> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>>>>
>>>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
>>>> Signed-off-by: Shaohua Li <[email protected]>
>>>> Signed-off-by: Ingo Molnar <[email protected]>
>>>>
>>>> And of course this was the first commit that I could not boot if I had hpet
>>>> enabled. To get this one to boot (single user mode only) I had to add the
>>>> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c, from
>>>>
>>>> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>>>
>>>> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>>>> {
>>>>
>>>> if (request_irq(dev->irq, hpet_interrupt_handler,
>>>> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
>>>> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>>>> return -1;
>>>>
>>>> disable_irq(dev->irq);
>>>>
>>>> AND add the quiet cmdline option.
>>>
>>> Ok, so we know why HPET didn't boot for you, and that was fixed later (by
>>> that 5ceb1a04). But is this also when the floppy started mis-behaving?
>>>
>>
>> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
>> working
>> and also when I could no longer boot with hpet enabled.
>
>
> I am missing something here. Commit 26afe5f2 is where system does not
> boot with HPET or is it where the floppy stops working when you boot
> with HPET enabled.
>

As it happens, both happen there. Commit 5ceb1a04 is where it starts
booting _again_ with hpet enabled. So I took that patch (5ceb1a04) and
applied it to (26afe5f2f) to be able to boot with hpet enabled. I had to
use the quiet option to get to a login prompt, but there is where the
floppy format first fails, just as it does in 2.6.28 and up.

> Can you try "idle=halt" with both .27 and .28 with /proc/interrupts
> output in each case. With that option, we should be using local APIC
> timer and PIT, HPET or HPET with MSI should not really matter. Does it
> still fail with .28 with that option?
>

Yes, I will try that for you but will have to wait until the morning. Sorry.

Regards
Mark

2009-12-23 13:02:42

by Mark Hounschell

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/22/2009 07:22 PM, Mark Hounschell wrote:
> On 12/22/2009 06:37 PM, Pallipadi, Venkatesh wrote:
>> On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
>>> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>>>>
>>>> [ Ingo, Venki and Shaohua added to cc: see the whole thread on lkml for
>>>> details, but Mark is basically chasing down a situation where the floppy
>>>> driver seems to have trouble formatting floppies, and it happened
>>>> between 2.6.27 and .28. The trouble seems to be that a DMA transfer of a
>>>> memory block transfers the wrong value for the first byte of the block.
>>>>
>>>> Which should be impossible, but whatever. Some part of the system has a
>>>> cached buffer that isn't flushed.
>>>>
>>>> What gets _you_ guys involved is that Mark cannot reproduce the bug if
>>>> HPET is disabled in the BIOS or by using 'nohpet'. He found that out by
>>>> pure luck while bisecting, because some time during his bisect, his
>>>> machine wouldn't even boot with HPET.
>>>>
>>>> So the problem is: with HPET enabled, 2.6.27.4 _used_ to work. But
>>>> 2.6.28 (and current -git) does not. Any ideas? ]
>>>>
>>>> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>>>>
>>>>> Ok, I may have something that might help.
>>>>>
>>>>> # git bisect bad
>>>>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>>>>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>>>>> Author: [email protected] <[email protected]>
>>>>> Date: Fri Sep 5 18:02:18 2008 -0700
>>>>>
>>>>> x86: HPET_MSI Initialise per-cpu HPET timers
>>>>>
>>>>> Initialize a per CPU HPET MSI timer when possible. We retain the HPET
>>>>> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when legacy mode is being used. We
>>>>> setup the remaining HPET timers as per CPU MSI based timers. This per CPU
>>>>> timer will eliminate the need for timer broadcasting with IRQ 0 when there
>>>>> is non-functional LAPIC timer across CPU deep C-states.
>>>>>
>>>>> If there are more CPUs than number of available timers, CPUs that do not
>>>>> find any timer to use will continue using LAPIC and IRQ 0 broadcast.
>>>>>
>>>>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
>>>>> Signed-off-by: Shaohua Li <[email protected]>
>>>>> Signed-off-by: Ingo Molnar <[email protected]>
>>>>>
>>>>> And of course this was the first commit that I could not boot if I had hpet
>>>>> enabled. To get this one to boot (single user mode only) I had to add the
>>>>> quiet cmdline option and the following patch to arch/x86/kernel/hpet.c, from
>>>>>
>>>>> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>>>>
>>>>> @@ -445,7 +445,7 @@ static int hpet_setup_irq(struct hpet_dev *dev)
>>>>> {
>>>>>
>>>>> if (request_irq(dev->irq, hpet_interrupt_handler,
>>>>> - IRQF_SHARED|IRQF_NOBALANCING, dev->name, dev))
>>>>> + IRQF_DISABLED|IRQF_NOBALANCING, dev->name, dev))
>>>>> return -1;
>>>>>
>>>>> disable_irq(dev->irq);
>>>>>
>>>>> AND add the quiet cmdline option.
>>>>
>>>> Ok, so we know why HPET didn't boot for you, and that was fixed later (by
>>>> that 5ceb1a04). But is this also when the floppy started mis-behaving?
>>>>
>>>
>>> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when the floppy stops
>>> working
>>> and also when I could no longer boot with hpet enabled.
>>
>>
>> I am missing something here. Commit 26afe5f2 is where system does not
>> boot with HPET or is it where the floppy stops working when you boot
>> with HPET enabled.
>>
>
> As it happens, both happen there. Commit 5ceb1a04 is where it starts
> booting _again_ with hpet enabled. So I took that patch (5ceb1a04) and
> applied it to (26afe5f2f) to be able to boot with hpet enabled. I had to
> use the quiet option to get to a login prompt, but there is where the
> floppy format first fails, just as it does in 2.6.28 and up.
>
>> Can you try "idle=halt" with both .27 and .28 with /proc/interrupts
>> output in each case. With that option, we should be using local APIC
>> timer and PIT, HPET or HPET with MSI should not really matter. Does it
>> still fail with .28 with that option?
>>

2.6.28 still fails with that option.

2.6.27.41 /proc/interrupts with idle=halt

CPU0 CPU1 CPU2 CPU3
0: 126 0 0 1 IO-APIC-edge timer
1: 0 0 1 157 IO-APIC-edge i8042
3: 0 0 0 6 IO-APIC-edge
4: 0 0 0 6 IO-APIC-edge
6: 0 0 0 4 IO-APIC-edge floppy
8: 0 0 0 1 IO-APIC-edge rtc0
9: 0 0 0 0 IO-APIC-fasteoi acpi
12: 0 0 1 128 IO-APIC-edge i8042
14: 0 0 34 4457 IO-APIC-edge pata_atiixp
15: 0 0 4 480 IO-APIC-edge pata_atiixp
16: 0 0 0 397 IO-APIC-fasteoi aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel
17: 0 0 0 2 IO-APIC-fasteoi ehci_hcd:usb1
18: 0 0 0 0 IO-APIC-fasteoi ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
19: 0 0 0 142 IO-APIC-fasteoi aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
22: 0 0 4 1154 IO-APIC-fasteoi ahci
219: 0 0 3 63 PCI-MSI-edge eth0
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 91539 91964 92525 91181 Local timer interrupts
RES: 2888 3873 2434 2721 Rescheduling interrupts
CAL: 240 245 247 84 function call interrupts
TLB: 768 628 526 512 TLB shootdowns
SPU: 0 0 0 0 Spurious interrupts
ERR: 0
MIS: 0

2.6.28 /proc/interrupts with idle=halt

CPU0 CPU1 CPU2 CPU3
0: 126 0 2 0 IO-APIC-edge timer
1: 0 0 192 0 IO-APIC-edge i8042
3: 0 0 6 0 IO-APIC-edge
4: 0 0 6 0 IO-APIC-edge
6: 0 0 4 0 IO-APIC-edge floppy
8: 0 0 1 0 IO-APIC-edge rtc0
9: 0 0 0 0 IO-APIC-fasteoi acpi
12: 0 0 128 1 IO-APIC-edge i8042
14: 0 1 147114 396 IO-APIC-edge pata_atiixp
15: 0 0 646 2 IO-APIC-edge pata_atiixp
16: 0 0 396 0 IO-APIC-fasteoi aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel
17: 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb1
18: 0 0 0 0 IO-APIC-fasteoi ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
19: 0 0 362 1 IO-APIC-fasteoi aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
22: 0 0 874 1 IO-APIC-fasteoi ahci
1274: 0 0 193 4 PCI-MSI-edge eth0
1279: 513207 0 0 0 HPET_MSI-edge hpet2
NMI: 0 0 0 0 Non-maskable interrupts
LOC: 268 513395 513138 522088 Local timer interrupts
RES: 3262 3679 2573 3746 Rescheduling interrupts
CAL: 131 166 57 147 Function call interrupts
TLB: 680 438 450 639 TLB shootdowns
SPU: 0 0 0 0 Spurious interrupts
ERR: 0
MIS: 0


Mark
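
Comparing the two listings by eye is error-prone, so here is a small sketch
that parses /proc/interrupts text and reports interrupt sources present in
one listing but not the other. The sample strings are excerpts of the two
outputs above; the function names are hypothetical, and the parser assumes
each entry is on a single, unwrapped line.

```python
from typing import Dict, List, Set, Tuple

Table = Dict[str, Tuple[List[int], str]]

SAMPLE_27 = """\
CPU0 CPU1 CPU2 CPU3
0: 126 0 0 1 IO-APIC-edge timer
6: 0 0 0 4 IO-APIC-edge floppy
219: 0 0 3 63 PCI-MSI-edge eth0
LOC: 91539 91964 92525 91181 Local timer interrupts
ERR: 0
"""

SAMPLE_28 = """\
CPU0 CPU1 CPU2 CPU3
0: 126 0 2 0 IO-APIC-edge timer
6: 0 0 4 0 IO-APIC-edge floppy
1274: 0 0 193 4 PCI-MSI-edge eth0
1279: 513207 0 0 0 HPET_MSI-edge hpet2
LOC: 268 513395 513138 522088 Local timer interrupts
ERR: 0
"""


def parse_interrupts(text: str) -> Table:
    """Map each label (IRQ number or name) to (per-CPU counts, description)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table: Table = {}
    for ln in lines[1:]:
        if ":" not in ln:
            continue                       # not an entry line
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts: List[int] = []
        while fields and len(counts) < ncpu and fields[0].isdigit():
            counts.append(int(fields.pop(0)))
        table[label.strip()] = (counts, " ".join(fields))
    return table


def new_sources(old_text: str, new_text: str) -> Set[str]:
    """Descriptions of numbered IRQs that appear only in the newer listing."""
    def sources(text: str) -> Set[str]:
        return {desc for label, (_, desc) in parse_interrupts(text).items()
                if label.isdigit()}
    return sources(new_text) - sources(old_text)


if __name__ == "__main__":
    print(new_sources(SAMPLE_27, SAMPLE_28))
    print(parse_interrupts(SAMPLE_28)["LOC"][0])
```

Comparing by description rather than by IRQ number avoids false differences
from renumbered vectors, such as eth0 moving from 219 to 1274 between the two
kernels.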

2009-12-23 15:06:01

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: RE: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)



>-----Original Message-----
>From: Mark Hounschell [mailto:[email protected]]
>Sent: Wednesday, December 23, 2009 5:03 AM
>To: Pallipadi, Venkatesh
>Cc: [email protected]; Linus Torvalds; Alain Knaff; Linux
>Kernel Mailing List; [email protected]; Li, Shaohua; Ingo Molnar
>Subject: Re: [Fdutils] DMA cache consistency bug introduced in
>2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)
>
>On 12/22/2009 07:22 PM, Mark Hounschell wrote:
>> On 12/22/2009 06:37 PM, Pallipadi, Venkatesh wrote:
>>> On Tue, 2009-12-22 at 09:57 -0800, Mark Hounschell wrote:
>>>> On 12/22/2009 12:38 PM, Linus Torvalds wrote:
>>>>>
>>>>> [ Ingo, Venki and Shaohua added to cc: see the whole
>thread on lkml for
>>>>> details, but Mark is basically chasing down a situation
>where the floppy
>>>>> driver seems to have trouble formatting floppies, and
>it happened
>>>>> between 2.6.27 and .28. The trouble seems to be that a
>DMA transfer of a
>>>>> memory block transfers the wrong value for the first
>byte of the block.
>>>>>
>>>>> Which should be impossible, but whatever. Some part of
>the system has a
>>>>> cached buffer that isn't flushed.
>>>>>
>>>>> What gets _you_ guys involved is that Mark cannot
>reproduce the bug if
>>>>> HPET is disabled in the BIOS or by using 'nohpet'. He
>found that out by
>>>>> pure luck while bisecting, because some time during his
>bisect, his
>>>>> machine wouldn't even boot with HPET.
>>>>>
>>>>> So the problem is: with HPET enabled, 2.6.27.4 _used_
>to work. But
>>>>> 2.6.28 (and current -git) does not. Any ideas? ]
>>>>>
>>>>> On Tue, 22 Dec 2009, Mark Hounschell wrote:
>>>>>>
>>>>>> Ok, I may have something that might help.
>>>>>>
>>>>>> # git bisect bad
>>>>>> 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is the first bad commit
>>>>>> commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0
>>>>>> Author: [email protected]
><[email protected]>
>>>>>> Date: Fri Sep 5 18:02:18 2008 -0700
>>>>>>
>>>>>> x86: HPET_MSI Initialise per-cpu HPET timers
>>>>>>
>>>>>> Initialize a per CPU HPET MSI timer when possible.
>We retain the HPET
>>>>>> timer 0 (IRQ 0) and timer 1 (IRQ 8) as is when
>legacy mode is being used. We
>>>>>> setup the remaining HPET timers as per CPU MSI based
>timers. This per CPU
>>>>>> timer will eliminate the need for timer broadcasting
>with IRQ 0 when there
>>>>>> is non-functional LAPIC timer across CPU deep C-states.
>>>>>>
>>>>>> If there are more CPUs than number of available
>timers, CPUs that do not
>>>>>> find any timer to use will continue using LAPIC and
>IRQ 0 broadcast.
>>>>>>
>>>>>> Signed-off-by: Venkatesh Pallipadi
><[email protected]>
>>>>>> Signed-off-by: Shaohua Li <[email protected]>
>>>>>> Signed-off-by: Ingo Molnar <[email protected]>
>>>>>>
>>>>>> And of coarse this was the first commit that I could not
>boot if I had hpet
>>>>>> enabled. To get this one to boot (single user mode only)
>I had to add the
>>>>>> the quiet cmdline option and following patch from to
>arch/x86/kernel/hpet.c
>>>>>>
>>>>>> commit 5ceb1a04187553e08c6ab60d30cee7c454ee139a
>>>>>>
>>>>>> @ -445,7 +445,7 @@ static int hpet_setup_irq(struct
>hpet_dev *dev)
>>>>>> {
>>>>>>
>>>>>> if (request_irq(dev->irq, hpet_interrupt_handler,
>>>>>> - IRQF_SHARED|IRQF_NOBALANCING,
>dev->name, dev))
>>>>>> + IRQF_DISABLED|IRQF_NOBALANCING,
>dev->name, dev))
>>>>>> return -1;
>>>>>>
>>>>>> disable_irq(dev->irq);
>>>>>>
>>>>>> AND add the quiet cmdline option.
>>>>>
>>>>> Ok, so we know why HPET didn't boot for you, and that was
>fixed later (by
>>>>> that 5ceb1a04). But is this also when the floppy started
>mis-behaving?
>>>>>
>>>>
>>>> Commit 26afe5f2fbf06ea0765aaa316640c4dd472310c0 is when
>the floppy stops
>>>> working
>>>> and also when I could no longer boot with hpet enabled.
>>>
>>>
>>> I am missing something here. Commit 26afe5f2 is where
>system does not
>>> boot with HPET or is it where the floppy stops working when you boot
>>> with HPET enabled.
>>>
>>
>> As it happens, both happen there. Commit 5ceb1a04 is where it starts
>> booting _again_ with hpet enabled. So I took that patch
>(5ceb1a04) and
>> applied it to (26afe5f2f) to be able to boot with hpet
>enabled. I had to
>> use the quiet option to get to a login prompt, but there is where the
>> floppy format first fails, just as it does in 2.6.28 and up.
>>
>>> Can you try "idle=halt" with both .27 and .28 with /proc/interrupts
>>> output in each case. With that option, we should be using local APIC
>>> timer and PIT, HPET or HPET with MSI should not really
>matter. Does it
>>> still fail with .28 with that option?
>>>
>
>2.6.28 still fails with that option.
>
>2.6.27.41 /proc/interrupts with idle=halt
>
> CPU0 CPU1 CPU2 CPU3
> 0: 126 0 0 1
>IO-APIC-edge timer
> 1: 0 0 1 157
>IO-APIC-edge i8042
> 3: 0 0 0 6 IO-APIC-edge
> 4: 0 0 0 6 IO-APIC-edge
> 6: 0 0 0 4
>IO-APIC-edge floppy
> 8: 0 0 0 1
>IO-APIC-edge rtc0
> 9: 0 0 0 0
>IO-APIC-fasteoi acpi
> 12: 0 0 1 128
>IO-APIC-edge i8042
> 14: 0 0 34 4457 IO-APIC-edge
>pata_atiixp
> 15: 0 0 4 480 IO-APIC-edge
>pata_atiixp
> 16: 0 0 0 397 IO-APIC-fasteoi
>aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel
> 17: 0 0 0 2 IO-APIC-fasteoi
>ehci_hcd:usb1
> 18: 0 0 0 0 IO-APIC-fasteoi
>ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
> 19: 0 0 0 142 IO-APIC-fasteoi
>aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
> 22: 0 0 4 1154
>IO-APIC-fasteoi ahci
>219: 0 0 3 63
>PCI-MSI-edge eth0
>NMI: 0 0 0 0
>Non-maskable interrupts
>LOC: 91539 91964 92525 91181 Local timer
>interrupts
>RES: 2888 3873 2434 2721
>Rescheduling interrupts
>CAL: 240 245 247 84 function
>call interrupts
>TLB: 768 628 526 512 TLB shootdowns
>SPU: 0 0 0 0 Spurious interrupts
>ERR: 0
>MIS: 0
>
>2.6.28 /proc/interrupts with idle=halt
>
> CPU0 CPU1 CPU2 CPU3
> 0: 126 0 2 0
>IO-APIC-edge timer
> 1: 0 0 192 0
>IO-APIC-edge i8042
> 3: 0 0 6 0 IO-APIC-edge
> 4: 0 0 6 0 IO-APIC-edge
> 6: 0 0 4 0
>IO-APIC-edge floppy
> 8: 0 0 1 0
>IO-APIC-edge rtc0
> 9: 0 0 0 0
>IO-APIC-fasteoi acpi
> 12: 0 0 128 1
>IO-APIC-edge i8042
> 14: 0 1 147114 396 IO-APIC-edge
>pata_atiixp
> 15: 0 0 646 2 IO-APIC-edge
>pata_atiixp
> 16: 0 0 396 0 IO-APIC-fasteoi
>aic79xx, ohci_hcd:usb2, ohci_hcd:usb4, HDA Intel
> 17: 0 0 0 0 IO-APIC-fasteoi
>ehci_hcd:usb1
> 18: 0 0 0 0 IO-APIC-fasteoi
>ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7
> 19: 0 0 362 1 IO-APIC-fasteoi
>aic7xxx, ehci_hcd:usb3, ttySLG0, eth1
> 22: 0 0 874 1
>IO-APIC-fasteoi ahci
>1274: 0 0 193 4
>PCI-MSI-edge eth0
>1279: 513207 0 0 0
>HPET_MSI-edge hpet2
>NMI: 0 0 0 0
>Non-maskable interrupts
>LOC: 268 513395 513138 522088 Local timer
>interrupts
>RES: 3262 3679 2573 3746
>Rescheduling interrupts
>CAL: 131 166 57 147 Function
>call interrupts
>TLB: 680 438 450 639 TLB shootdowns
>SPU: 0 0 0 0 Spurious interrupts
>ERR: 0
>MIS: 0
>

Hmm. Looks like hpet2 is still being used instead of the local APIC timer in the .28 case.

I was expecting a low count for hpet2 and local timer counts that are about the same on all CPUs. The output above shows that CPU 0 is depending on hpet2 for some reason, even with idle=halt. Can you send the output of the two below for the .28 case?
/proc/timer_list
grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
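
[ Illustration only: the "local timer counts should be about the same on all CPUs" check can be sketched in C. The function name find_starved_cpu and the ratio threshold are assumptions made for this example; nothing like this exists in the kernel. ]

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first CPU whose local timer (LOC) count is
 * below max/ratio, where max is the largest count seen, or -1 if all
 * counts are within that band. The ratio is an arbitrary threshold
 * chosen for illustration, not a kernel rule.
 */
static int find_starved_cpu(const unsigned long *loc, size_t ncpus,
                            unsigned long ratio)
{
    unsigned long max = 0;
    size_t i;

    assert(ratio > 0);

    /* First pass: find the busiest CPU's count. */
    for (i = 0; i < ncpus; i++)
        if (loc[i] > max)
            max = loc[i];

    /* Second pass: report the first CPU far below that count. */
    for (i = 0; i < ncpus; i++)
        if (loc[i] < max / ratio)
            return (int)i;

    return -1;
}
```

With a ratio of 10, the 2.6.28 LOC row (268, 513395, 513138, 522088) flags CPU 0, while the 2.6.27.41 row (91539, 91964, 92525, 91181) flags nothing.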

Thanks,
Venki

2009-12-23 15:34:10

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/23/2009 10:10 AM, Pallipadi, Venkatesh wrote:
> Hmm. Looks like hpet2 is still getting used instead of local APIC timer in .28 case.
>
> I was expecting some low number in hpet2 and local timer on all CPU to be around the same value. Above shows CPU 0 is depending on hpet2 for some reason even with idle=halt. Can you send the output of below two in case of .28
> /proc/timer_list

Attached.

> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*

I have no /sys/devices/system/cpu/cpu0/cpuidle on this machine.
Maybe because of

#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
# CONFIG_CPU_IDLE is not set

Would it be OK if when you ask for 2.6.28 info, I use a 2.6.32.2 kernel?
That kernel also fails fdformat with hpet enabled on these machines.

Thanks
Mark


Attachments:
timer_list.txt (7.72 kB)

2009-12-23 15:57:57

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)

On 12/23/2009 10:34 AM, Mark Hounschell wrote:
> On 12/23/2009 10:10 AM, Pallipadi, Venkatesh wrote:
>> Hmm. Looks like hpet2 is still getting used instead of local APIC timer in .28 case.
>>
>> I was expecting some low number in hpet2 and local timer on all CPU to be around the same value. Above shows CPU 0 is depending on hpet2 for some reason even with idle=halt. Can you send the output of below two in case of .28
>> /proc/timer_list
>
> Attached.
>
>> grep . /sys/devices/system/cpu/cpu0/cpuidle/*/*
>
> I have no /sys/devices/system/cpu/cpu0/cpuidle on this machine.
> Maybe because of
>
> #
> # CPU Frequency scaling
> #
> # CONFIG_CPU_FREQ is not set
> # CONFIG_CPU_IDLE is not set
>
> Would it be OK if when you ask for 2.6.28 info, I use a 2.6.32.2 kernel?
> That kernel also fails fdformat with hpet enabled on these machines.
>

I do have this on 2.6.32.2 though.

# grep . /sys/devices/system/cpu/cpuidle/current_*
/sys/devices/system/cpu/cpuidle/current_driver:acpi_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:ladder

Want me to go back to 2.6.28 and show this?

Mark

2009-12-23 16:32:27

by Linus Torvalds

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format floppies under kernel 2.6.*?)



On Wed, 23 Dec 2009, Mark Hounschell wrote:
> >
> > Hmm. Looks like hpet2 is still getting used instead of local APIC
> > timer in .28 case.
> >
> > I was expecting some low number in hpet2 and local timer on all CPU to
> > be around the same value. Above shows CPU 0 is depending on hpet2 for
> > some reason even with idle=halt. Can you send the output of below two
> > in case of .28 /proc/timer_list
>
> Attached.

Oh wow.

That's crazy:

Tick Device: mode: 1
Per CPU device: 0
Clock Event Device: hpet2
max_delta_ns: 2147483647
min_delta_ns: 5000
mult: 61510047
shift: 32
mode: 3
next_event: 123991000000 nsecs
set_next_event: hpet_msi_next_event
set_mode: hpet_msi_set_mode
event_handler: hrtimer_interrupt

Tick Device: mode: 1
Per CPU device: 1
Clock Event Device: lapic
max_delta_ns: 670831998
min_delta_ns: 1199
mult: 53707624
shift: 32
mode: 3
next_event: 123991125000 nsecs
set_next_event: lapic_next_event
set_mode: lapic_timer_setup
event_handler: hrtimer_interrupt

...

It's not using the lapic for CPU0.
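
[ Illustration only: the mult/shift pairs in the dump above encode each device's tick rate. The sketch below assumes the usual clockevents convention, cycles = (ns * mult) >> shift; the helper names ns_to_cycles and implied_freq_hz are made up for this example. ]

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert a delay in nanoseconds to device cycles using the
 * clockevents-style scaling: (ns * mult) >> shift.
 * The 64-bit product can overflow for very large ns values.
 */
static uint64_t ns_to_cycles(uint64_t ns, uint32_t mult, uint32_t shift)
{
    assert(shift < 64);
    return (ns * mult) >> shift;
}

/*
 * Approximate device frequency in Hz implied by a mult/shift pair:
 * one second (1e9 ns) worth of cycles.
 */
static double implied_freq_hz(uint32_t mult, uint32_t shift)
{
    assert(shift < 64);
    return (double)mult * 1e9 / (double)(1ULL << shift);
}
```

Fed the values above, hpet2 (mult 61510047, shift 32) comes out at roughly 14.3 MHz and the lapic (mult 53707624, shift 32) at roughly 12.5 MHz, so the two CPUs really are ticking off different hardware.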

Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
expensive to reprogram (compared to the local apic). And having different
timers for different CPU's is just odd.

The fact that the timer subsystem can do this and it all (mostly) works at
all is nice and impressive, but doesn't make it any less crazy ;)

That said, none of this seems to explain why DMA/fdformat doesn't work.

Linus

2009-12-23 16:38:49

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

Linus Torvalds <[email protected]> writes:

> It's not using the lapic for CPU0.
>
> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
> expensive to reprogram (compared to the local apic). And having different
> timers for different CPU's is just odd.
>
> The fact that the timer subsystem can do this and it all (mostly) works at
> all is nice and impressive, but doesn't make it any less crazy ;)

I suspect it's a system where the APIC timer stops in deeper idle
states and it supports them. In this case CPU #0 does timer broadcasts
when needed to wake the other CPUs up from deep C, but for that it has
to run with HPET. At least the other ones can still enjoy the LAPIC
timer.

This might suggest that Mark's floppy controller doesn't like
deep C? Mark, did you try booting with processor.max_cstate=1
and HPET enabled?

-Andi
--
[email protected] -- Speaking for myself only.

2009-12-23 16:51:11

by Linus Torvalds

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28



On Wed, 23 Dec 2009, Andi Kleen wrote:
>
> I suspect it's a system where the APIC timer stops in deeper idle
> states and it supports them. In this case CPU #0 does timer broadcasts
> when needed to wake the other CPUs up from deep C, but for that it has
> to run with HPET. At least the other ones can still enjoy the LAPIC
> timer.

Ahh, ok, that makes sense. I was assuming the broadcast timer would act in
that capacity, but..

> This might suggest that Mark's floppy controller doesn't like
> deep C? Mark, did you try booting with processor.max_cstate=1
> and HPET enabled?

We have indeed had historical issues with floppy and sleep states before.

I do note another issue, though - the floppy driver itself seems totally
broken when it comes to using interleaved sectors. Alain, that "place
logical sectors" code is simply _broken_ - the "while" kicks in only if
the first sector we test is busy _and_ we were at the last sector so that
we increment past F_SECT_PER_TRACK.

So shouldn't that sector layout be something like the appended?

Linus
---
drivers/block/floppy.c | 7 ++-----
1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
index 3266b4f..9c9148c 100644
--- a/drivers/block/floppy.c
+++ b/drivers/block/floppy.c
@@ -2237,13 +2237,10 @@ static void setup_format_params(int track)
 	for (count = 1; count <= F_SECT_PER_TRACK; ++count) {
 		here[n].sect = count;
 		n = (n + il) % F_SECT_PER_TRACK;
-		if (here[n].sect) {	/* sector busy, find next free sector */
+		while (here[n].sect) {	/* sector busy, find next free sector */
 			++n;
-			if (n >= F_SECT_PER_TRACK) {
+			if (n >= F_SECT_PER_TRACK)
 				n -= F_SECT_PER_TRACK;
-				while (here[n].sect)
-					++n;
-			}
 		}
 	}
 	if (_floppy->stretch & FD_SECTBASEMASK) {
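
[ Illustration only: a standalone C sketch of placing logical sectors on a track with an interleave step. It is neither the kernel code nor the patch above. The name place_sectors, its parameters, and the choice to look for a free slot before writing each sector are assumptions made for this example. Searching before the write guarantees that a free slot exists whenever the search runs; a search that runs after the last sector has been written would find every slot occupied. ]

```c
#include <assert.h>
#include <string.h>

/*
 * Fill sect[0..nsect-1] with logical sector numbers 1..nsect.
 * Each new sector is placed il slots after the previous one
 * (wrapping around the track); if that slot is already taken,
 * advance to the next free slot. il == 1 means no interleave.
 */
static void place_sectors(int *sect, int nsect, int il, int start)
{
    int count;
    int n;

    assert(nsect > 0 && il > 0 && start >= 0);

    memset(sect, 0, (size_t)nsect * sizeof(*sect)); /* 0 == free slot */
    n = start % nsect;

    for (count = 1; count <= nsect; ++count) {
        /* Only count-1 slots are in use here, so this terminates. */
        while (sect[n])
            n = (n + 1) % nsect;
        sect[n] = count;
        n = (n + il) % nsect;
    }
}
```

With 6 sectors, il = 2 and start = 0, the slots end up as 1 4 2 5 3 6; with il = 1 the sectors are simply consecutive.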

2009-12-23 17:08:36

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Wed, Dec 23, 2009 at 08:49:38AM -0800, Linus Torvalds wrote:
>
>
> On Wed, 23 Dec 2009, Andi Kleen wrote:
> >
> > I suspect it's a system where the APIC timer stops in deeper idle
> > states and it supports them. In this case CPU #0 does timer broadcasts
> > when needed to wake the other CPUs up from deep C, but for that it has
> > to run with HPET. At least the other ones can still enjoy the LAPIC
> > timer.
>
> Ahh, ok, that makes sense. I was assuming the broadcast timer would act in
> that capacity, but..

The "broadcasts" are done using IPIs from CPU #0, and only when the target
CPU is in deep idle. That's more efficient than letting the hardware
always broadcast.

>
> > This might suggest that Mark's floppy controller doesn't like
> > deep C? Mark, did you try booting with processor.max_cstate=1
> > and HPET enabled?
>
> We have indeed had historical issues with floppy and sleep states before.

I removed that code when moving to 64bit (floppy driver disabling C1),
but perhaps we need some variant of it again (but it's the first such
report in many years). Although it would be sad to have it again on all
systems.

-Andi

2009-12-23 17:16:48

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

> This is what I was thinking yesterday and asked Mark to try idle=halt.
> This /proc/interrupts is with idle=halt when there should not be any
> C-states and broadcasts involved.

Ah ok, missed that sorry.

Actually I'm glad that the floppy-idle hack is not needed again.

-Andi
--
[email protected] -- Speaking for myself only.

2009-12-23 17:14:53

by Pallipadi, Venkatesh

Subject: RE: [Fdutils] DMA cache consistency bug introduced in 2.6.28



>-----Original Message-----
>From: Linus Torvalds [mailto:[email protected]]
>Sent: Wednesday, December 23, 2009 8:50 AM
>To: Andi Kleen
>Cc: Mark Hounschell; Pallipadi, Venkatesh; [email protected];
>Alain Knaff; Linux Kernel Mailing List;
>[email protected]; Li, Shaohua; Ingo Molnar
>Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28
>
>
>
>On Wed, 23 Dec 2009, Andi Kleen wrote:
>>
>> I suspect it's a system where the APIC timer stops in deeper idle
>> states and it supports them. In this case CPU #0 does timer
>broadcasts
>> when needed to wake the other CPUs up from deep C, but for
>that it has
>> to run with HPET. At least the other ones can still enjoy the LAPIC
>> timer.
>
>Ahh, ok, that makes sense. I was assuming the broadcast timer
>would act in
>that capacity, but..

This is what I was thinking yesterday when I asked Mark to try idle=halt.
This /proc/interrupts output is with idle=halt, where there should not be
any C-states or broadcasts involved.
>>> HPET_MSI-edge hpet2
>>> NMI: 0 0 0 0
>>> Non-maskable interrupts
>>> LOC: 268 513395 513138 522088 Local timer
>>> interrupts

Not sure how this is related to the floppy problem. But we surely
have something wrong with per-CPU HPET usage here.

Thanks,
Venki

2009-12-23 17:42:07

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 12/23/2009 11:38 AM, Andi Kleen wrote:
> Linus Torvalds <[email protected]> writes:
>
>> It's not using the lapic for CPU0.
>>
>> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
>> expensive to reprogram (compared to the local apic). And having different
>> timers for different CPU's is just odd.
>>
>> The fact that the timer subsystem can do this and it all (mostly) works at
>> all is nice and impressive, but doesn't make it any less crazy ;)
>
> I suspect it's a system where the APIC timer stops in deeper idle
> states and it supports them. In this case CPU #0 does timer broadcasts
> when needed to wake the other CPUs up from deep C, but for that it has
> to run with HPET. At least the other ones can still enjoy the LAPIC
> timer.
>
> This might suggest that Mark's floppy controller doesn't like
> deep C? Mark, did you try booting with processor.max_cstate=1
> and HPET enabled?

I just did and /proc/interrupts looks the same and the floppy still does
not format.

I'll try the patch Linus provided now.

Mark

2009-12-23 18:01:36

by Linus Torvalds

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28



On Wed, 23 Dec 2009, Mark Hounschell wrote:
>
> I'll try the patch Linus provided now.

I doubt it matters - because if it did, it would matter for everybody, and
the HPET thing shouldn't make any difference at all.

[ Or rather, it should matter for everybody trying to format a specific
format (without interleave it won't matter, and not all formats have any
interleave - I think it was mainly used on 5.25" floppies and special
formats). ]

Besides, maybe I was just mis-reading the code.

But getting some testing for the patch certainly won't hurt, so I'm not
going to argue against it any more ;)

Linus

2009-12-23 18:11:25

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 12/23/2009 01:01 PM, Linus Torvalds wrote:
>
>
> On Wed, 23 Dec 2009, Mark Hounschell wrote:
>>
>> I'll try the patch Linus provided now.
>
> I doubt it matters - because if it did, it would matter for everybody, and
> the HPET thing shouldn't make any difference at all.
>
> [ Or rather, it should matter for everybody trying to format a specific
> format (without interleave it won't matter, and not all formats have any
> interleave - I think it was mainly used on 5.25" floppies and special
> formats). ]
>
> Besides, maybe I was just mis-reading the code.
>
> But getting some testing for the patch certainly won't hurt, so I'm not
> going to argue against it any more ;)

Yea, that hosed it up pretty good. The very first track label sent out
caused some sort of timeout.

Dec 23 13:10:02 harley kernel:
Dec 23 13:10:02 harley kernel: floppy driver state
Dec 23 13:10:02 harley kernel: -------------------
Dec 23 13:10:02 harley kernel: now=9017 last interrupt=8117 diff=900 last
called handler=f73ce27d
Dec 23 13:10:02 harley kernel: timeout_message=lock fdc
Dec 23 13:10:02 harley kernel: last output bytes:
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: 1a 90 4294899106
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: 3 90 4294899106
Dec 23 13:10:02 harley kernel: c1 90 4294899106
Dec 23 13:10:02 harley kernel: 10 90 4294899106
Dec 23 13:10:02 harley kernel: 7 80 4294899106
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: 8 81 4294899106
Dec 23 13:10:02 harley kernel: 4 80 4294899106
Dec 23 13:10:02 harley kernel: 0 90 4294899106
Dec 23 13:10:02 harley kernel: e6 80 8007
Dec 23 13:10:02 harley kernel: 0 90 8007
Dec 23 13:10:02 harley syslog-ng[2651]: last message repeated 2 times
Dec 23 13:10:02 harley kernel: 1 90 8007
Dec 23 13:10:02 harley kernel: 2 90 8007
Dec 23 13:10:02 harley kernel: 12 90 8007
Dec 23 13:10:02 harley kernel: 1b 90 8007
Dec 23 13:10:02 harley kernel: ff 90 8007
Dec 23 13:10:02 harley kernel: last result at 8117
Dec 23 13:10:02 harley kernel: last redo_fd_request at 8117
Dec 23 13:10:02 harley kernel:
Dec 23 13:10:02 harley kernel: status=80
Dec 23 13:10:02 harley kernel: fdc_busy=1
Dec 23 13:10:02 harley kernel: cont=f73d58e4
Dec 23 13:10:02 harley kernel: current_req=(null)
Dec 23 13:10:02 harley kernel: command_status=-1
Dec 23 13:10:02 harley kernel:
Dec 23 13:10:02 harley kernel: floppy0: floppy timeout called
Dec 23 13:10:22 harley kernel:
Dec 23 13:10:22 harley kernel: floppy driver state
Dec 23 13:10:22 harley kernel: -------------------
Dec 23 13:10:22 harley kernel: now=15017 last interrupt=8117 diff=6900 last
called handler=f73ce27d
Dec 23 13:10:22 harley kernel: timeout_message=do wakeup
Dec 23 13:10:22 harley kernel: last output bytes:
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: 1a 90 4294899106
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: 3 90 4294899106
Dec 23 13:10:22 harley kernel: c1 90 4294899106
Dec 23 13:10:22 harley kernel: 10 90 4294899106
Dec 23 13:10:22 harley kernel: 7 80 4294899106
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: 8 81 4294899106
Dec 23 13:10:22 harley kernel: 4 80 4294899106
Dec 23 13:10:22 harley kernel: 0 90 4294899106
Dec 23 13:10:22 harley kernel: e6 80 8007
Dec 23 13:10:22 harley kernel: 0 90 8007
Dec 23 13:10:22 harley syslog-ng[2651]: last message repeated 2 times
Dec 23 13:10:22 harley kernel: 1 90 8007
Dec 23 13:10:22 harley kernel: 2 90 8007
Dec 23 13:10:22 harley kernel: 12 90 8007
Dec 23 13:10:22 harley kernel: 1b 90 8007
Dec 23 13:10:22 harley kernel: ff 90 8007
Dec 23 13:10:22 harley kernel: last result at 8117
Dec 23 13:10:22 harley kernel: last redo_fd_request at 8117
Dec 23 13:10:22 harley kernel:
Dec 23 13:10:22 harley kernel: status=80
Dec 23 13:10:22 harley kernel: fdc_busy=1
Dec 23 13:10:22 harley kernel: floppy_work.func=f73d03da
Dec 23 13:10:22 harley kernel: cont=f73d5274
Dec 23 13:10:22 harley kernel: current_req=(null)
Dec 23 13:10:22 harley kernel: command_status=-1
Dec 23 13:10:22 harley kernel:
Dec 23 13:10:22 harley kernel: floppy0: floppy timeout called
Dec 23 13:10:22 harley kernel: floppy.c: no request in request_don

Have to reboot now...

Mark

2009-12-23 19:18:12

by Pallipadi, Venkatesh

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Wed, Dec 23, 2009 at 09:41:50AM -0800, Mark Hounschell wrote:
> On 12/23/2009 11:38 AM, Andi Kleen wrote:
> > Linus Torvalds <[email protected]> writes:
> >
> >> It's not using the lapic for CPU0.
> >>
> >> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
> >> expensive to reprogram (compared to the local apic). And having different
> >> timers for different CPU's is just odd.
> >>
> >> The fact that the timer subsystem can do this and it all (mostly) works at
> >> all is nice and impressive, but doesn't make it any less crazy ;)
> >
> > I suspect it's a system where the APIC timer stops in deeper idle
> > states and it supports them. In this case CPU #0 does timer broadcasts
> > when needed to wake the other CPUs up from deep C, but for that it has
> > to run with HPET. At least the other ones can still enjoy the LAPIC
> > timer.
> >
> > This might suggest that Mark's floppy controller doesn't like
> > deep C? Mark, did you try booting with processor.max_cstate=1
> > and HPET enabled?
>
> I just did and /proc/interrupts looks the same and the floppy still does
> not format.
>

Can you try this one line patch either on .28 or .32 (with /proc/interrupts
output).
This disables hpet2 and lapic timer should then be used on CPU 0. If things
work with this test patch, we will know that the failure is somehow related
to HPET usage in MSI mode.

Thanks,
Venki

Reduce the rating of percpu hpet timer

Signed-off-by: Venkatesh Pallipadi <[email protected]>
---
arch/x86/kernel/hpet.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index cafb1c6..f89d17a 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
hpet_setup_irq(hdev);
evt->irq = hdev->irq;

- evt->rating = 110;
+ evt->rating = 40;
evt->features = CLOCK_EVT_FEAT_ONESHOT;
if (hdev->flags & HPET_DEV_PERI_CAP)
evt->features |= CLOCK_EVT_FEAT_PERIODIC;
--
1.6.0.6
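
[ Editorial note: the patch works by making hpet2 less attractive than the
local APIC timer when the per-CPU tick device is chosen. The following is a
minimal user-space model of "highest rating wins", not kernel code. The
struct, the function name and the LAPIC rating of 100 used in the usage note
are assumptions for illustration; the patch itself only shows 110 and 40. ]

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for a clock event device. */
struct evt_dev {
	const char *name;
	int rating;	/* higher means preferred */
};

/*
 * Return the candidate with the highest rating, or NULL if there are
 * no candidates. On a tie the earlier entry wins.
 */
static const struct evt_dev *pick_tick_device(const struct evt_dev *devs,
					      size_t ndevs)
{
	const struct evt_dev *best = NULL;
	size_t i;

	for (i = 0; i < ndevs; ++i) {
		if (!best || devs[i].rating > best->rating)
			best = &devs[i];
	}
	return best;
}
```

With candidates { "lapic", 100 } and { "hpet2", 110 } this model picks hpet2;
with hpet2 lowered to 40 it picks lapic, which would match the LOC counts
Mark reports for CPU0 after applying the patch.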

2009-12-23 19:35:58

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 12/23/2009 02:18 PM, Pallipadi, Venkatesh wrote:
> On Wed, Dec 23, 2009 at 09:41:50AM -0800, Mark Hounschell wrote:
>> On 12/23/2009 11:38 AM, Andi Kleen wrote:
>>> Linus Torvalds <[email protected]> writes:
>>>
>>>> It's not using the lapic for CPU0.
>>>>
>>>> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
>>>> expensive to reprogram (compared to the local apic). And having different
>>>> timers for different CPU's is just odd.
>>>>
>>>> The fact that the timer subsystem can do this and it all (mostly) works at
>>>> all is nice and impressive, but doesn't make it any less crazy ;)
>>>
>>> I suspect it's a system where the APIC timer stops in deeper idle
>>> states and it supports them. In this case CPU #0 does timer broadcasts
>>> when needed to wake the other CPUs up from deep C, but for that it has
>>> to run with HPET. At least the other ones can still enjoy the LAPIC
>>> timer.
>>>
>>> This might suggest that Mark's floppy controller doesn't like
>>> deep C? Mark, did you try booting with processor.max_cstate=1
>>> and HPET enabled?
>>
>> I just did and /proc/interrupts looks the same and the floppy still does
>> not format.
>>
>
> Can you try this one line patch either on .28 or .32 (with /proc/interrupts
> output).
> This disables hpet2 and lapic timer should then be used on CPU 0. If things
> work with this test patch, we will know that the failure is somehow related
> to HPET usage in MSI mode.
>
> Thanks,
> Venki
>
> Reduce the rating of percpu hpet timer
>
> Signed-off-by: Venkatesh Pallipadi <[email protected]>
> ---
> arch/x86/kernel/hpet.c | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
> index cafb1c6..f89d17a 100644
> --- a/arch/x86/kernel/hpet.c
> +++ b/arch/x86/kernel/hpet.c
> @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
> hpet_setup_irq(hdev);
> evt->irq = hdev->irq;
>
> - evt->rating = 110;
> + evt->rating = 40;
> evt->features = CLOCK_EVT_FEAT_ONESHOT;
> if (hdev->flags & HPET_DEV_PERI_CAP)
> evt->features |= CLOCK_EVT_FEAT_PERIODIC;

That made it work. Used 2.6.32.2

cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3
  0:         82          0          0          1   IO-APIC-edge      timer
  1:          0          0          0         67   IO-APIC-edge      i8042
  3:          0          0          0          6   IO-APIC-edge
  4:          0          0          0          4   IO-APIC-edge
  6:          0          0          0          4   IO-APIC-edge      floppy
  8:          0          0          0          8   IO-APIC-edge      rtc0
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          0          0         10       1519   IO-APIC-edge      i8042
 14:          0          0         39      10995   IO-APIC-edge      pata_atiixp
 15:          0          0          3        391   IO-APIC-edge      pata_atiixp
 16:          0          0          2        606   IO-APIC-fasteoi   aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
 17:          0          0          0          3   IO-APIC-fasteoi   ehci_hcd:usb1, parport0, ni-pci-gpib
 18:          0          0         10       2168   IO-APIC-fasteoi   ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
 19:          0          0          0        130   IO-APIC-fasteoi   aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
 22:          0          0          8       1151   IO-APIC-fasteoi   ahci
 24:          0          0          0          0   HPET_MSI-edge     hpet2
 29:          0          0          0         48   PCI-MSI-edge      sky2@pci:0000:04:00.0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:      34842      30177      29672      29632   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
PND:          0          0          0          0   Performance pending work
RES:      17501      20449      16670      11224   Rescheduling interrupts
CAL:      10554       2336       1102       1071   Function call interrupts
TLB:        364        562        753        468   TLB shootdowns
ERR:          0
MIS:          0


# fdformat /dev/fd0u1440
Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
Formatting ... done
Verifying ... done

2009-12-23 20:12:15

by Alain Knaff

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

Linus Torvalds wrote:

> diff --git a/drivers/block/floppy.c b/drivers/block/floppy.c
> index 3266b4f..9c9148c 100644
> --- a/drivers/block/floppy.c
> +++ b/drivers/block/floppy.c
> @@ -2237,13 +2237,10 @@ static void setup_format_params(int track)
> for (count = 1; count <= F_SECT_PER_TRACK; ++count) {
> here[n].sect = count;
> n = (n + il) % F_SECT_PER_TRACK;
> - if (here[n].sect) { /* sector busy, find next free sector */
> + while (here[n].sect) { /* sector busy, find next free sector */
> ++n;
> - if (n >= F_SECT_PER_TRACK) {
> + if (n >= F_SECT_PER_TRACK)
> n -= F_SECT_PER_TRACK;
> - while (here[n].sect)
> - ++n;
> - }
> }
> }
> if (_floppy->stretch & FD_SECTBASEMASK) {

The original code does indeed look a little bit strange... and might
break if there is a long run of "busy" sectors near the end of the
physical track. Or maybe there is a mathematical reason why this
situation cannot occur. I'll have to think about it a little bit more to
come up with a test case that will break either the new or old code.

But in any case, if a bug were to occur due to this code, it would only
depend on the format's parameters, and not on the hardware.

Regards,

Alain
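
[ Editorial note: a test case of the kind Alain mentions can be sketched in
user space. The code below models the "place logical sectors" loop with the
`while` form from the patch and checks that every logical sector lands in
exactly one physical slot. Function names, MAX_SECT and the flat int array
are assumptions made for the sketch; the kernel works on its own format
descriptor array. One detail the sketch has to handle: after the last
logical sector is placed every slot is occupied, so a `while (here[n].sect)`
search started at that point would never find a free slot. The sketch
therefore stops before searching on the final iteration. Whether that is
what Mark hit when testing the patch is not established in this thread. ]

```c
#include <string.h>

#define MAX_SECT 64

/*
 * Fill layout[0..nsect-1] with logical sector numbers 1..nsect, starting
 * at physical slot `start` and stepping by interleave `il`.
 * Returns 0 on success, -1 on bad arguments.
 */
static int place_sectors(int *layout, int nsect, int start, int il)
{
	int n, count;

	if (nsect <= 0 || nsect > MAX_SECT || il <= 0 || start < 0)
		return -1;

	memset(layout, 0, (size_t)nsect * sizeof(*layout));
	n = start % nsect;

	for (count = 1; count <= nsect; ++count) {
		layout[n] = count;
		if (count == nsect)
			break;		/* track is full, nothing left to find */
		n = (n + il) % nsect;
		while (layout[n]) {	/* slot busy, find next free slot */
			++n;
			if (n >= nsect)
				n -= nsect;
		}
	}
	return 0;
}

/* Return 1 if every logical sector 1..nsect appears exactly once. */
static int layout_is_permutation(const int *layout, int nsect)
{
	int seen[MAX_SECT + 1] = { 0 };
	int i;

	for (i = 0; i < nsect; ++i) {
		if (layout[i] < 1 || layout[i] > nsect || seen[layout[i]]++)
			return 0;
	}
	return 1;
}
```

Running place_sectors() over a range of sector counts, start slots and
interleave values and checking layout_is_permutation() each time is one way
to look for format parameters that break a given variant of the loop.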

2009-12-23 20:30:45

by Pallipadi, Venkatesh

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Wed, 2009-12-23 at 11:35 -0800, Mark Hounschell wrote:
> On 12/23/2009 02:18 PM, Pallipadi, Venkatesh wrote:
> > On Wed, Dec 23, 2009 at 09:41:50AM -0800, Mark Hounschell wrote:
> >> On 12/23/2009 11:38 AM, Andi Kleen wrote:
> >>> Linus Torvalds <[email protected]> writes:
> >>>
> >>>> It's not using the lapic for CPU0.
> >>>>
> >>>> Using the HPET as a per-cpu timer is some crazy sh*t, since it's pretty
> >>>> expensive to reprogram (compared to the local apic). And having different
> >>>> timers for different CPU's is just odd.
> >>>>
> >>>> The fact that the timer subsystem can do this and it all (mostly) works at
> >>>> all is nice and impressive, but doesn't make it any less crazy ;)
> >>>
> >>> I suspect it's a system where the APIC timer stops in deeper idle
> >>> states and it supports them. In this case CPU #0 does timer broadcasts
> >>> when needed to wake the other CPUs up from deep C, but for that it has
> >>> to run with HPET. At least the other ones can still enjoy the LAPIC
> >>> timer.
> >>>
> >>> This might suggest that Mark's floppy controller doesn't like
> >>> deep C? Mark, did you try booting with processor.max_cstate=1
> >>> and HPET enabled?
> >>
> >> I just did and /proc/interrupts looks the same and the floppy still does
> >> not format.
> >>
> >
> > Can you try this one line patch either on .28 or .32 (with /proc/interrupts
> > output).
> > This disables hpet2 and lapic timer should then be used on CPU 0. If things
> > work with this test patch, we will know that the failure is somehow related
> > to HPET usage in MSI mode.
> >
> > Thanks,
> > Venki
> >
> > Reduce the rating of percpu hpet timer
> >
> > Signed-off-by: Venkatesh Pallipadi <[email protected]>
> > ---
> > arch/x86/kernel/hpet.c | 2 +-
> > 1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
> > index cafb1c6..f89d17a 100644
> > --- a/arch/x86/kernel/hpet.c
> > +++ b/arch/x86/kernel/hpet.c
> > @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
> > hpet_setup_irq(hdev);
> > evt->irq = hdev->irq;
> >
> > - evt->rating = 110;
> > + evt->rating = 40;
> > evt->features = CLOCK_EVT_FEAT_ONESHOT;
> > if (hdev->flags & HPET_DEV_PERI_CAP)
> > evt->features |= CLOCK_EVT_FEAT_PERIODIC;
>
> That made it work. Used 2.6.32.2
>
> cat /proc/interrupts
> CPU0 CPU1 CPU2 CPU3
> 0: 82 0 0 1 IO-APIC-edge timer
> 1: 0 0 0 67 IO-APIC-edge i8042
> 3: 0 0 0 6 IO-APIC-edge
> 4: 0 0 0 4 IO-APIC-edge
> 6: 0 0 0 4 IO-APIC-edge floppy
> 8: 0 0 0 8 IO-APIC-edge rtc0
> 9: 0 0 0 0 IO-APIC-fasteoi acpi
> 12: 0 0 10 1519 IO-APIC-edge i8042
> 14: 0 0 39 10995 IO-APIC-edge
> pata_atiixp
> 15: 0 0 3 391 IO-APIC-edge
> pata_atiixp
> 16: 0 0 2 606 IO-APIC-fasteoi
> aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
> 17: 0 0 0 3 IO-APIC-fasteoi
> ehci_hcd:usb1, parport0, ni-pci-gpib
> 18: 0 0 10 2168 IO-APIC-fasteoi
> ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
> 19: 0 0 0 130 IO-APIC-fasteoi
> aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
> 22: 0 0 8 1151 IO-APIC-fasteoi ahci
> 24: 0 0 0 0 HPET_MSI-edge hpet2
> 29: 0 0 0 48 PCI-MSI-edge
> sky2@pci:0000:04:00.0
> NMI: 0 0 0 0 Non-maskable interrupts
> LOC: 34842 30177 29672 29632 Local timer interrupts
> SPU: 0 0 0 0 Spurious interrupts
> PMI: 0 0 0 0 Performance monitoring
> interrupts
> PND: 0 0 0 0 Performance pending work
> RES: 17501 20449 16670 11224 Rescheduling interrupts
> CAL: 10554 2336 1102 1071 Function call interrupts
> TLB: 364 562 753 468 TLB shootdowns
> ERR: 0
> MIS: 0
>
>
> # fdformat /dev/fd0u1440
> Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
> Formatting ... done
> Verifying ... done

Hmmm.. That's very interesting indeed.

That clearly says that HPET MSI interrupts are somehow causing some
caching side effect in the chipset that results in this floppy DMA
failure.

Here is what we have so far.
IRQ 0 is based on HPET legacy interrupt and HPET device is also capable
of MSI on this platform. So we also have a percpu hpet (hpet2 tied to
CPU0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
in cases where LAPIC timer will stop working in deep C-state. As we have
only one HPET channel free for percpu HPET, we only have hpet2 tied to
CPU 0 and other CPUs still have to go through IRQ0+LAPIC broadcast with
deep C-state.

One problem here is that percpu hpet should only get used when LAPIC
cannot be used (that is when CPU enters deep C-state). Using hpet2 in
place of LAPIC timer even when deep C-state is not supported is not
right in terms of performance. We need some changes here to fix that
[Problem 1].

But, that still does not explain why we are seeing this problem in the
first place. I mean, using hpet2 is not optimal, but should not have
functionality issues like this. Even fixing [Problem 1] above, we may
see this problem on some other platform that supports deep C-state and
so has hpet2 enabled for a valid reason.

Also, I am not sure whether the problem also happens if legacy HPET
interrupts are used during run time in place of the LAPIC timer (it may be
worth trying this with a simple test patch; let me think about it). In
this case, the legacy HPET interrupt rightly goes quiet after boot, giving
priority to the LAPIC timer.

With HPET MSI interrupts, we do a write followed by a read of an HPET
memory-mapped register to set an HPET channel timeout, plus a read of the
global HPET timer. This happens on every timer interrupt on CPU 0. And we
also have the MSI interrupt being delivered to CPU 0. I cannot think of any
reason why this can break DMA. We can probably try adding some dummy HPET
read after dma write, to see if that flushes things properly.

Thanks,
Venki
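
[ Editorial note: the register access pattern Venki describes (read the
global counter, write the channel comparator, read the counter again) can be
modelled roughly as follows. This is a user-space simulation under stated
assumptions, not the kernel's HPET code: struct fake_hpet and both functions
are hypothetical, and the `tick` parameter only mimics time passing between
accesses. ]

```c
#include <stdint.h>

/*
 * Hypothetical model of an HPET block: a free-running 32-bit main
 * counter and one comparator. On real hardware both are memory-mapped.
 */
struct fake_hpet {
	uint32_t counter;
	uint32_t comparator;
};

/* Model a read of the main counter; each access advances it by `tick`. */
static uint32_t fake_read_counter(struct fake_hpet *h, uint32_t tick)
{
	h->counter += tick;
	return h->counter;
}

/*
 * Program the comparator to fire `delta` ticks from now.
 * Returns 0 if the deadline is still ahead once programming is done,
 * or -1 if the counter has already reached or passed it.
 */
static int fake_next_event(struct fake_hpet *h, uint32_t delta, uint32_t tick)
{
	uint32_t cnt = fake_read_counter(h, tick) + delta;
	uint32_t now;

	h->comparator = cnt;			/* write the channel timeout */
	now = fake_read_counter(h, tick);	/* read the global counter */

	/* The signed difference stays correct across counter wraparound. */
	return (int32_t)(now - cnt) >= 0 ? -1 : 0;
}
```

The model says nothing about why such accesses would disturb floppy DMA; it
only makes concrete what happens on each timer reprogramming.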

2009-12-23 20:36:00

by Alain Knaff

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

Pallipadi, Venkatesh wrote:
> MSI interrupt being delivered to CPU 0. I cannot think of any reason why
> this can break dma. We can probably try adding some dummy HPET read
> after dma write, to see if that flushes things properly.

Shouldn't that be "... some dummy HPET read _before_ dma write...". In
order to ensure that DMA cache is consistent _before_ dma controller
reads it?

Regards,

Alain

2009-12-23 21:34:31

by Pallipadi, Venkatesh

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Wed, 2009-12-23 at 12:34 -0800, alain wrote:
> Pallipadi, Venkatesh wrote:
> > MSI interrupt being delivered to CPU 0. I cannot think of any reason why
> > this can break dma. We can probably try adding some dummy HPET read
> > after dma write, to see if that flushes things properly.
>
> Shouldn't that be "... some dummy HPET read _before_ dma write...". In
> order to ensure that DMA cache is consistent _before_ dma controller
> reads it?
>

Yes. I meant after the contents of the buffer are changed and before the
DMA transfer, that is, before the controller reads it.

Thanks,
Venki

2009-12-25 12:19:12

by Arjan van de Ven

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Wed, 23 Dec 2009 18:08:32 +0100
Andi Kleen <[email protected]> wrote:

> I removed that code when moving to 64bit (floppy driver disabling C1),
> but perhaps we need some variant of it again (but it's the first such
> report in many years). Although it would be sad to have it again on
> all systems.

at least now we have the pmqos infrastructure, driver just needs to ask
for 0 latency ;)


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2009-12-25 20:33:08

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Fri, Dec 25, 2009 at 01:21:16PM +0100, Arjan van de Ven wrote:
> On Wed, 23 Dec 2009 18:08:32 +0100
> Andi Kleen <[email protected]> wrote:
>
> > I removed that code when moving to 64bit (floppy driver disabling C1),
> > but perhaps we need some variant of it again (but it's the first such
> > report in many years). Although it would be sad to have it again on
> > all systems.
>
> at least now we have the pmqos infrastructure, driver just needs to ask
> for 0 latency ;)

Does pmqos work with acpi=off etc.? I didn't think it shut down
the classic "HLT" idle, does it? The old i386 systems apparently needed
that; they long predate any deeper idle states.

Anyways the code is still there for 32bit.

-Andi

--
[email protected] -- Speaking for myself only.

2009-12-26 09:36:42

by Arjan van de Ven

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Fri, 25 Dec 2009 21:33:04 +0100
Andi Kleen <[email protected]> wrote:

> On Fri, Dec 25, 2009 at 01:21:16PM +0100, Arjan van de Ven wrote:
> > On Wed, 23 Dec 2009 18:08:32 +0100
> > Andi Kleen <[email protected]> wrote:
> >
> > > I removed that code when moving to 64bit (floppy driver disabling
> > > C1), but perhaps we need some variant of it again (but it's the
> > > first such report in many years). Although it would be sad to
> > > have it again on all systems.
> >
> > at least now we have the pmqos infrastructure, driver just needs to
> > ask for 0 latency ;)
>
> Does pmqos work with acpi=off etc.?

yes

> I didn't think it shut down
> the classic "HLT" idle, does it?

it does if you specify a latency of 0; it will then go into the
spin-only state until you give up your latency requirement


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org
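
[ Editorial note: the idea Arjan describes, a driver stating a latency bound
and the idle code staying out of states that cannot meet it, can be sketched
like this. It is a conceptual user-space model, not the kernel's pm_qos
interface; all names and all latency figures are invented for illustration. ]

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical idle state table entry. */
struct idle_state {
	const char *name;
	int32_t exit_latency_us;
};

/* Smallest latency bound among all active requests, or `no_limit` if none. */
static int32_t effective_latency(const int32_t *req, size_t nreq,
				 int32_t no_limit)
{
	int32_t min = no_limit;
	size_t i;

	for (i = 0; i < nreq; ++i) {
		if (req[i] < min)
			min = req[i];
	}
	return min;
}

/*
 * Index of the deepest state whose exit latency fits within `limit_us`.
 * states[] must be sorted from shallowest to deepest; entry 0 (the
 * polling state in this model) is always allowed.
 */
static size_t pick_idle_state(const struct idle_state *states, size_t nstates,
			      int32_t limit_us)
{
	size_t i, best = 0;

	for (i = 1; i < nstates; ++i) {
		if (states[i].exit_latency_us <= limit_us)
			best = i;
	}
	return best;
}
```

In this model a request of 0 leaves only the polling entry eligible, which
corresponds to the "spin-only state" mentioned above; dropping the request
makes the deeper states eligible again.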

2009-12-26 16:40:34

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

> > Does pmqos work with acpi=off etc.?
>
> yes
>
> > I didn't think it shut down
> > the classic "HLT" idle, does it?
>
> it does if you specify a latency of 0; it will then go into the
> spin-only state until you give up your latency requirement

I looked at it this evening, but it seems like pm_qos is not
interrupt safe (e.g. calls blocking notifiers) and floppy currently does
enable/disable_hlt from interrupts and bottom halves.

Would need some more infrastructure work or restructuring
of the floppy driver.

-Andi
--
[email protected] -- Speaking for myself only.

2009-12-28 20:05:50

by Pavel Machek

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28


> > > This might suggest that Mark's floppy controller doesn't like
> > > deep C? Mark, did you try booting with processor.max_cstate=1
> > > and HPET enabled?
> >
> > We have indeed had historical issues with floppy and sleep states before.
>
> I removed that code when moving to 64bit (floppy driver disabling C1),
> but perhaps we need some variant of it again (but it's the first such
> report in many years). Although it would be sad to have it again on all
> systems.

C1 is hlt. Are you sure? I could see how C3 could cause problems (DMA
latency), but...

Can Mark simply try with idle=poll?

Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-12-27 12:30:54

by Alain Knaff

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

Andi Kleen wrote:
>>> Does pmqos work with acpi=off etc.?
>> yes
>>
>>> I didn't think it shut down
>>> the classic "HLT" idle, does it?
>> it does if you specify a latency of 0; it will then go into the
>> spin-only state until you give up your latency requirement
>
> I looked at it this evening, but it seems like pm_qos is not
> interrupt safe (e.g. calls blocking notifiers) and floppy currently does
> enable/disable_hlt from interrupts and bottom halves.
>
> Would need some more infrastructure work or restructuring
> of the floppy driver.
>
> -Andi

disable_hlt/enable_hlt was only needed to work around a bug on TM4000
(Texas Instruments) laptops, which were popular around 1994 / 1995.
Basically, as soon as the CPU went into hlt() state, so did the DMA
controller, either causing a really slow transfer, or (worse) a buffer
over/underrun which failed the operation.

On hardware unaffected by this particular bug (which would be most
hardware around now, 14 years after the fact...), these calls can safely
be removed.

Regards,

Alain

2009-12-28 01:54:49

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

> disable_hlt/enable_hlt was only needed to work around a bug on TM4000
> (Texas Instrument) Laptops which were popular around 1994 / 1995.

I don't think we can fully drop support for these systems.

Did they have a unique PCI ID or something else that could be tested
for?

Perhaps it could be just a white list like dmi_year > 1995 to disable.

Depending on how often floppies are still used this might save
non-trivial amounts of power on newer systems :)

Anyway, it would probably be good to convert this to the new infrastructure,
and remove the old hooks, but the interrupt-context issue would
need to be fixed first.

-Andi
--
[email protected] -- Speaking for myself only.
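
[ Editorial note: the "dmi_year > 1995" idea can be sketched in user space
as below. The function names are hypothetical, the "MM/DD/YYYY" layout of
the BIOS date string is an assumption, and treating an unknown date as "old"
is a design choice of the sketch (keep the workaround unless the machine is
known to be newer). Only the 1995 cutoff comes from the message above. ]

```c
#include <stdlib.h>
#include <string.h>

/*
 * Extract the year from a BIOS date string such as "03/14/2008".
 * Returns 0 when the string is missing or carries no usable year.
 */
static int bios_year(const char *date)
{
	const char *slash;
	char *end;
	long year;

	if (!date)
		return 0;
	slash = strrchr(date, '/');
	if (!slash)
		return 0;

	year = strtol(slash + 1, &end, 10);
	if (end == slash + 1)		/* no digits after the last slash */
		return 0;
	if (year >= 0 && year < 100)	/* two-digit year, assume 19xx */
		year += 1900;
	if (year < 1900 || year > 9999)
		return 0;
	return (int)year;
}

/* Return 1 if the hlt workaround should stay enabled on this machine. */
static int need_hlt_workaround(const char *date)
{
	int year = bios_year(date);

	return year == 0 || year <= 1995;
}
```

A machine reporting "03/14/2008" would skip the workaround in this model,
while "06/30/1994" or a missing date would keep it.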

2009-12-28 10:34:36

by Alain Knaff

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

Andi Kleen wrote:
>> disable_hlt/enable_hlt was only needed to work around a bug on TM4000
>> (Texas Instrument) Laptops which were popular around 1994 / 1995.
>
> I don't think we can fully drop support for these systems.
>
> Did they have an unique PCI ID or something else that could be tested
> for?

Floppy controllers are not PCI devices and thus have no PCI id
unfortunately... :-(

> Perhaps it could be just a white list like dmi_year > 1995 to disable.
>
> Depending on how often floppies are still used this might save
> non trivial amounts of power on newer systems :)

Removing these calls will indeed save a *tiny* amount of power by
allowing the CPU to go into halt during DMA transfer. But the main
argument should be simplification.

> Anyways it would be probably good to convert this to the new infrastructure,
> and remove the old hooks, but the interrupt-context issue would
> need to be fixed first.
>
> -Andi

Well, at least for testing whether it fixes the new problem (DMA cache
issue), it's useful to know that these calls can be safely removed on
almost all of today's machines. That way, we will know whether this
refactoring will be worth the effort.

Regards,

Alain

2009-12-28 14:54:37

by Andi Kleen

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Mon, Dec 28, 2009 at 11:27:56AM +0100, Alain Knaff wrote:
> Andi Kleen wrote:
> >> disable_hlt/enable_hlt was only needed to work around a bug on TM4000
> >> (Texas Instrument) Laptops which were popular around 1994 / 1995.
> >
> > I don't think we can fully drop support for these systems.
> >
> > Did they have an unique PCI ID or something else that could be tested
> > for?
>
> Floppy controllers are not PCI devices and thus have no PCI id
> unfortunately... :-(

Yes, but it's enough to identify any component in the system.

-Andi
--
[email protected] -- Speaking for myself only.

2009-12-28 20:54:40

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 12/27/2009 06:09 AM, Pavel Machek wrote:
>
>>>> This might suggest that Mark's floppy controller doesn't like
>>>> deep C? Mark, did you try booting with processor.max_cstate=1
>>>> and HPET enabled?
>>>
>>> We have indeed had historical issues with floppy and sleep states before.
>>
>> I removed that code when moving to 64bit (floppy driver disabling C1),
>> but perhaps we need some variant of it again (but it's the first such
>> report in many years). Although it would be sad to have it again on all
>> systems.
>
> C1 is hlt. Are you sure? I could see how C3 could cause problems (DMA
> latency), but...
>
> Can Mark simply try with idle=poll?
>
> Pavel
>

The floppy still fails with idle=poll

Mark

2010-01-08 17:42:40

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 12/23/2009 03:30 PM, Pallipadi, Venkatesh wrote:

>>> Can you try this one line patch either on .28 or .32 (with /proc/interrupts
>>> output).
>>> This disables hpet2 and lapic timer should then be used on CPU 0. If things
>>> work with this test patch, we will know that the failure is somehow related
>>> to HPET usage in MSI mode.
>>>
>>> Thanks,
>>> Venki
>>>
>>> Reduce the rating of percpu hpet timer
>>>
>>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
>>> ---
>>> arch/x86/kernel/hpet.c | 2 +-
>>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
>>> index cafb1c6..f89d17a 100644
>>> --- a/arch/x86/kernel/hpet.c
>>> +++ b/arch/x86/kernel/hpet.c
>>> @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
>>> hpet_setup_irq(hdev);
>>> evt->irq = hdev->irq;
>>>
>>> - evt->rating = 110;
>>> + evt->rating = 40;
>>> evt->features = CLOCK_EVT_FEAT_ONESHOT;
>>> if (hdev->flags & HPET_DEV_PERI_CAP)
>>> evt->features |= CLOCK_EVT_FEAT_PERIODIC;
>>
>> That made it work. Used 2.6.32.2
>>
>> cat /proc/interrupts
>> CPU0 CPU1 CPU2 CPU3
>> 0: 82 0 0 1 IO-APIC-edge timer
>> 1: 0 0 0 67 IO-APIC-edge i8042
>> 3: 0 0 0 6 IO-APIC-edge
>> 4: 0 0 0 4 IO-APIC-edge
>> 6: 0 0 0 4 IO-APIC-edge floppy
>> 8: 0 0 0 8 IO-APIC-edge rtc0
>> 9: 0 0 0 0 IO-APIC-fasteoi acpi
>> 12: 0 0 10 1519 IO-APIC-edge i8042
>> 14: 0 0 39 10995 IO-APIC-edge
>> pata_atiixp
>> 15: 0 0 3 391 IO-APIC-edge
>> pata_atiixp
>> 16: 0 0 2 606 IO-APIC-fasteoi
>> aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
>> 17: 0 0 0 3 IO-APIC-fasteoi
>> ehci_hcd:usb1, parport0, ni-pci-gpib
>> 18: 0 0 10 2168 IO-APIC-fasteoi
>> ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
>> 19: 0 0 0 130 IO-APIC-fasteoi
>> aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
>> 22: 0 0 8 1151 IO-APIC-fasteoi ahci
>> 24: 0 0 0 0 HPET_MSI-edge hpet2
>> 29: 0 0 0 48 PCI-MSI-edge
>> sky2@pci:0000:04:00.0
>> NMI: 0 0 0 0 Non-maskable interrupts
>> LOC: 34842 30177 29672 29632 Local timer interrupts
>> SPU: 0 0 0 0 Spurious interrupts
>> PMI: 0 0 0 0 Performance monitoring
>> interrupts
>> PND: 0 0 0 0 Performance pending work
>> RES: 17501 20449 16670 11224 Rescheduling interrupts
>> CAL: 10554 2336 1102 1071 Function call interrupts
>> TLB: 364 562 753 468 TLB shootdowns
>> ERR: 0
>> MIS: 0
>>
>>
>> # fdformat /dev/fd0u1440
>> Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
>> Formatting ... done
>> Verifying ... done
>
> Hmmm.. Thats very interesting indeed.
>
> That clearly says that HPET MSI interrupts somehow is causing some
> caching side effect in the chipset that results in this floppy dma
> failure.
>
> Here's is what we have until now.
> IRQ 0 is based on HPET legacy interrupt and HPET device is also capable
> of MSI on this platform. So we also have a percpu hpet (hpet2 tied to
> CPU0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
> in cases where LAPIC timer will stop working in deep C-state. As we have
> only one HPET channel free for percpu HPET, we only have hpet2 tied to
> CPU 0 and other CPUs still have to go through IRQ0+LAPIC broadcast with
> deep C-state.
>
> One problem here is that percpu hpet should only get used when LAPIC
> cannot be used (that is when CPU enters deep C-state). Using hpet2 in
> place of LAPIC timer even when deep C-state is not supported is not
> right in terms of performance. We need some changes here to fix that
> [Problem 1].
>
> But, that still does not explain why we are seeing this problem in the
> first place. I mean, using hpet2 is not optimal, but should not have
> functionality issues like this. Even fixing [Problem 1] above, we may
> see this problem on some other platform that supports deep C-state and
> so has hpet2 enabled for a valid reason.
>
> Also, I am not sure whether the problem also happens if legacy HPET
> interrupts are used during run time in place of LAPIC timer (May be
> worth to try this with a simple test patch, let me think about it). In
> this case, legacy HPET interrupt rightly goes quiet after boot, giving
> priority to LAPIC timer.
>
> With hpet MSI interrupts, we do a write followed by read of HPET
> memmapped register to set a HPET channel timeout + read of global HPET
> timer. This happens on every timer interrupt on CPU 0. And we also have
> MSI interrupt being delivered to CPU 0. I cannot think of any reason why
> this can break dma. We can probably try adding some dummy HPET read
> after dma write, to see if that flushes things properly.
>

Haven't seen any activity on this thread in a while. Just curious, are we
still working on this?
Is there anything else I can do to help?

Thanks
Mark

2010-01-12 00:19:20

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Fri, 2010-01-08 at 09:42 -0800, Mark Hounschell wrote:
> On 12/23/2009 03:30 PM, Pallipadi, Venkatesh wrote:
>
> >>> Can you try this one line patch either on .28 or .32 (with /proc/interrupts
> >>> output).
> >>> This disables hpet2 and lapic timer should then be used on CPU 0. If things
> >>> work with this test patch, we will know that the failure is somehow related
> >>> to HPET usage in MSI mode.
> >>>
> >>> Thanks,
> >>> Venki
> >>>
> >>> Reduce the rating of percpu hpet timer
> >>>
> >>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
> >>> ---
> >>> arch/x86/kernel/hpet.c | 2 +-
> >>> 1 files changed, 1 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
> >>> index cafb1c6..f89d17a 100644
> >>> --- a/arch/x86/kernel/hpet.c
> >>> +++ b/arch/x86/kernel/hpet.c
> >>> @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
> >>> hpet_setup_irq(hdev);
> >>> evt->irq = hdev->irq;
> >>>
> >>> - evt->rating = 110;
> >>> + evt->rating = 40;
> >>> evt->features = CLOCK_EVT_FEAT_ONESHOT;
> >>> if (hdev->flags & HPET_DEV_PERI_CAP)
> >>> evt->features |= CLOCK_EVT_FEAT_PERIODIC;
> >>
> >> That made it work. Used 2.6.32.2
> >>
> >> cat /proc/interrupts
> >> CPU0 CPU1 CPU2 CPU3
> >> 0: 82 0 0 1 IO-APIC-edge timer
> >> 1: 0 0 0 67 IO-APIC-edge i8042
> >> 3: 0 0 0 6 IO-APIC-edge
> >> 4: 0 0 0 4 IO-APIC-edge
> >> 6: 0 0 0 4 IO-APIC-edge floppy
> >> 8: 0 0 0 8 IO-APIC-edge rtc0
> >> 9: 0 0 0 0 IO-APIC-fasteoi acpi
> >> 12: 0 0 10 1519 IO-APIC-edge i8042
> >> 14: 0 0 39 10995 IO-APIC-edge
> >> pata_atiixp
> >> 15: 0 0 3 391 IO-APIC-edge
> >> pata_atiixp
> >> 16: 0 0 2 606 IO-APIC-fasteoi
> >> aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
> >> 17: 0 0 0 3 IO-APIC-fasteoi
> >> ehci_hcd:usb1, parport0, ni-pci-gpib
> >> 18: 0 0 10 2168 IO-APIC-fasteoi
> >> ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
> >> 19: 0 0 0 130 IO-APIC-fasteoi
> >> aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
> >> 22: 0 0 8 1151 IO-APIC-fasteoi ahci
> >> 24: 0 0 0 0 HPET_MSI-edge hpet2
> >> 29: 0 0 0 48 PCI-MSI-edge
> >> sky2@pci:0000:04:00.0
> >> NMI: 0 0 0 0 Non-maskable interrupts
> >> LOC: 34842 30177 29672 29632 Local timer interrupts
> >> SPU: 0 0 0 0 Spurious interrupts
> >> PMI: 0 0 0 0 Performance monitoring
> >> interrupts
> >> PND: 0 0 0 0 Performance pending work
> >> RES: 17501 20449 16670 11224 Rescheduling interrupts
> >> CAL: 10554 2336 1102 1071 Function call interrupts
> >> TLB: 364 562 753 468 TLB shootdowns
> >> ERR: 0
> >> MIS: 0
> >>
> >>
> >> # fdformat /dev/fd0u1440
> >> Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
> >> Formatting ... done
> >> Verifying ... done
> >
> > Hmmm.. Thats very interesting indeed.
> >
> > That clearly says that HPET MSI interrupts somehow is causing some
> > caching side effect in the chipset that results in this floppy dma
> > failure.
> >
> > Here's is what we have until now.
> > IRQ 0 is based on HPET legacy interrupt and HPET device is also capable
> > of MSI on this platform. So we also have a percpu hpet (hpet2 tied to
> > CPU0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
> > in cases where LAPIC timer will stop working in deep C-state. As we have
> > only one HPET channel free for percpu HPET, we only have hpet2 tied to
> > CPU 0 and other CPUs still have to go through IRQ0+LAPIC broadcast with
> > deep C-state.
> >
> > One problem here is that percpu hpet should only get used when LAPIC
> > cannot be used (that is when CPU enters deep C-state). Using hpet2 in
> > place of LAPIC timer even when deep C-state is not supported is not
> > right in terms of performance. We need some changes here to fix that
> > [Problem 1].
> >
> > But, that still does not explain why we are seeing this problem in the
> > first place. I mean, using hpet2 is not optimal, but should not have
> > functionality issues like this. Even fixing [Problem 1] above, we may
> > see this problem on some other platform that supports deep C-state and
> > so has hpet2 enabled for a valid reason.
> >
> > Also, I am not sure whether the problem also happens if legacy HPET
> > interrupts are used during run time in place of LAPIC timer (May be
> > worth to try this with a simple test patch, let me think about it). In
> > this case, legacy HPET interrupt rightly goes quiet after boot, giving
> > priority to LAPIC timer.
> >
> > With hpet MSI interrupts, we do a write followed by read of HPET
> > memmapped register to set a HPET channel timeout + read of global HPET
> > timer. This happens on every timer interrupt on CPU 0. And we also have
> > MSI interrupt being delivered to CPU 0. I cannot think of any reason why
> > this can break dma. We can probably try adding some dummy HPET read
> > after dma write, to see if that flushes things properly.
> >
>
> Haven't seen any activity on this thread in a while. Just curious, are we
> still working this?
> Is there anything else I can do to help?

Sorry for not following up on this. We have narrowed this down to HPET
MSI and floppy DMA. I still don't know how HPET MSI interrupts are
breaking floppy DMA.

You are seeing the problem on two different systems. Correct? Do you
have any system where this works with HPET MSI enabled?

A couple of options for how we can go about this one:
1) Change the HPET-MSI code so that it does not get activated when there
are no C-states with LAPIC stoppage involved. This will resolve the problem
on the systems you reported, as they have no deep C-states. But I fear that,
with the actual problem unresolved, we may hit it in the future on this or
some other platform that has the same issue and CPUs that support deep
C-states.
2) Try this testcase on a few other platforms that support HPET-MSI and
deep C-states, check how widespread the problem is, and then add a
whitelist/blacklist for HPET MSI usage.

I think option 1 is better for 2.6.33. I will work on that and send
patches for you to test.

Thanks,
Venki
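
The effect of the earlier one-line test patch (rating 110 lowered to 40)
can be illustrated with a minimal sketch of rating-based selection: among
the timer devices available to a CPU, the one with the highest rating wins.
The struct and function names below are hypothetical and are not the
kernel's clockevents API, and the LAPIC timer rating of 100 is an
assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in for a clock event device; only the two fields
 * needed for this illustration are modelled. */
struct clockevent_sketch {
    const char *name;
    int rating;
};

/* Return the device with the highest rating, or NULL for an empty list.
 * On a tie the earlier entry is kept. */
const struct clockevent_sketch *
pick_best(const struct clockevent_sketch *devs, size_t n)
{
    const struct clockevent_sketch *best = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!best || devs[i].rating > best->rating)
            best = &devs[i];
    }
    return best;
}
```

With the entries { "lapic", 100 } and { "hpet2", 110 } the sketch picks
hpet2; once the hpet2 rating is lowered to 40 it picks lapic, which is the
behaviour the test patch was meant to force on CPU 0.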

2010-01-12 09:04:32

by Mark Hounschell

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 01/11/2010 07:19 PM, Pallipadi, Venkatesh wrote:
> On Fri, 2010-01-08 at 09:42 -0800, Mark Hounschell wrote:
>> On 12/23/2009 03:30 PM, Pallipadi, Venkatesh wrote:
>>
>>>>> Can you try this one line patch either on .28 or .32 (with /proc/interrupts
>>>>> output).
>>>>> This disables hpet2 and lapic timer should then be used on CPU 0. If things
>>>>> work with this test patch, we will know that the failure is somehow related
>>>>> to HPET usage in MSI mode.
>>>>>
>>>>> Thanks,
>>>>> Venki
>>>>>
>>>>> Reduce the rating of percpu hpet timer
>>>>>
>>>>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
>>>>> ---
>>>>> arch/x86/kernel/hpet.c | 2 +-
>>>>> 1 files changed, 1 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
>>>>> index cafb1c6..f89d17a 100644
>>>>> --- a/arch/x86/kernel/hpet.c
>>>>> +++ b/arch/x86/kernel/hpet.c
>>>>> @@ -480,7 +480,7 @@ static void init_one_hpet_msi_clockevent(struct hpet_dev *hdev, int cpu)
>>>>> hpet_setup_irq(hdev);
>>>>> evt->irq = hdev->irq;
>>>>>
>>>>> - evt->rating = 110;
>>>>> + evt->rating = 40;
>>>>> evt->features = CLOCK_EVT_FEAT_ONESHOT;
>>>>> if (hdev->flags & HPET_DEV_PERI_CAP)
>>>>> evt->features |= CLOCK_EVT_FEAT_PERIODIC;
>>>>
>>>> That made it work. Used 2.6.32.2
>>>>
>>>> cat /proc/interrupts
>>>> CPU0 CPU1 CPU2 CPU3
>>>> 0: 82 0 0 1 IO-APIC-edge timer
>>>> 1: 0 0 0 67 IO-APIC-edge i8042
>>>> 3: 0 0 0 6 IO-APIC-edge
>>>> 4: 0 0 0 4 IO-APIC-edge
>>>> 6: 0 0 0 4 IO-APIC-edge floppy
>>>> 8: 0 0 0 8 IO-APIC-edge rtc0
>>>> 9: 0 0 0 0 IO-APIC-fasteoi acpi
>>>> 12: 0 0 10 1519 IO-APIC-edge i8042
>>>> 14: 0 0 39 10995 IO-APIC-edge
>>>> pata_atiixp
>>>> 15: 0 0 3 391 IO-APIC-edge
>>>> pata_atiixp
>>>> 16: 0 0 2 606 IO-APIC-fasteoi
>>>> aic79xx, ohci_hcd:usb3, ohci_hcd:usb4, HDA Intel, Digi DBX2, ni-pci-gpib
>>>> 17: 0 0 0 3 IO-APIC-fasteoi
>>>> ehci_hcd:usb1, parport0, ni-pci-gpib
>>>> 18: 0 0 10 2168 IO-APIC-fasteoi
>>>> ohci_hcd:usb5, ohci_hcd:usb6, ohci_hcd:usb7, Digi DBX2, nvidia
>>>> 19: 0 0 0 130 IO-APIC-fasteoi
>>>> aic7xxx, ehci_hcd:usb2, ttySLG0, eth1
>>>> 22: 0 0 8 1151 IO-APIC-fasteoi ahci
>>>> 24: 0 0 0 0 HPET_MSI-edge hpet2
>>>> 29: 0 0 0 48 PCI-MSI-edge
>>>> sky2@pci:0000:04:00.0
>>>> NMI: 0 0 0 0 Non-maskable interrupts
>>>> LOC: 34842 30177 29672 29632 Local timer interrupts
>>>> SPU: 0 0 0 0 Spurious interrupts
>>>> PMI: 0 0 0 0 Performance monitoring
>>>> interrupts
>>>> PND: 0 0 0 0 Performance pending work
>>>> RES: 17501 20449 16670 11224 Rescheduling interrupts
>>>> CAL: 10554 2336 1102 1071 Function call interrupts
>>>> TLB: 364 562 753 468 TLB shootdowns
>>>> ERR: 0
>>>> MIS: 0
>>>>
>>>>
>>>> # fdformat /dev/fd0u1440
>>>> Double-sided, 80 tracks, 18 sec/track. Total capacity 1440 kB.
>>>> Formatting ... done
>>>> Verifying ... done
>>>
>>> Hmmm.. Thats very interesting indeed.
>>>
>>> That clearly says that HPET MSI interrupts somehow is causing some
>>> caching side effect in the chipset that results in this floppy dma
>>> failure.
>>>
>>> Here's is what we have until now.
>>> IRQ 0 is based on HPET legacy interrupt and HPET device is also capable
>>> of MSI on this platform. So we also have a percpu hpet (hpet2 tied to
>>> CPU0). percpu hpet was added to avoid the usage of IRQ0+LAPIC broadcast
>>> in cases where LAPIC timer will stop working in deep C-state. As we have
>>> only one HPET channel free for percpu HPET, we only have hpet2 tied to
>>> CPU 0 and other CPUs still have to go through IRQ0+LAPIC broadcast with
>>> deep C-state.
>>>
>>> One problem here is that percpu hpet should only get used when LAPIC
>>> cannot be used (that is when CPU enters deep C-state). Using hpet2 in
>>> place of LAPIC timer even when deep C-state is not supported is not
>>> right in terms of performance. We need some changes here to fix that
>>> [Problem 1].
>>>
>>> But, that still does not explain why we are seeing this problem in the
>>> first place. I mean, using hpet2 is not optimal, but should not have
>>> functionality issues like this. Even fixing [Problem 1] above, we may
>>> see this problem on some other platform that supports deep C-state and
>>> so has hpet2 enabled for a valid reason.
>>>
>>> Also, I am not sure whether the problem also happens if legacy HPET
>>> interrupts are used during run time in place of LAPIC timer (May be
>>> worth to try this with a simple test patch, let me think about it). In
>>> this case, legacy HPET interrupt rightly goes quiet after boot, giving
>>> priority to LAPIC timer.
>>>
>>> With hpet MSI interrupts, we do a write followed by read of HPET
>>> memmapped register to set a HPET channel timeout + read of global HPET
>>> timer. This happens on every timer interrupt on CPU 0. And we also have
>>> MSI interrupt being delivered to CPU 0. I cannot think of any reason why
>>> this can break dma. We can probably try adding some dummy HPET read
>>> after dma write, to see if that flushes things properly.
>>>
>>
>> Haven't seen any activity on this thread in a while. Just curious, are we
>> still working this?
>> Is there anything else I can do to help?
>
> Sorry for not following up on this. We have narrowed this down to HPET
> MSI and floppy DMA. I still don't know how HPET MSI interrupts are
> breaking floppy DMA.
>
> You are seeing the problem on two different systems. Correct? Do you
> have any system where this works with HPET MSI enabled?
>

I see the problem on every system in which the HPET2 shows up in
/proc/interrupts. The machines that work with HPET enabled don't show HPET
at all in /proc/interrupts. I have some of each. All the boxes that fail
here use the (AMD) 790x series chip sets.

> Couple of options on how we can go about this one:
> 1) Change the HPET-MSI change to not get activated when there are no
> C-states with LAPIC stoppage involved. This will resolve the problem on
> the systems you reported as there are no deep C-states. But, I fear that
> with the actual problem unresolved, we may hit it in future with this or
> some other platform having same issue with CPUs that support deep
> C-state.
> 2) Try this testcase on few other platforms that support HPET-MSI and
> deep C-states and check how widespread the problem is and then add a
> whitelist-blacklist for HPET MSI usage.
>
> I think, for 2.6.33 option 1 is better. Will work on that and send in
> patches for you test.
>

OK, thanks
Mark
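
The observation above (failing machines list hpet2 in /proc/interrupts)
can be turned into a rough check. The following is a minimal sketch, not a
definitive test: the function names are hypothetical, and it only looks for
a line containing "HPET_MSI" and sums its per-CPU counters.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the summed per-CPU count of one /proc/interrupts line if it is
 * an HPET_MSI line, or -1 if the line describes some other interrupt.
 */
long hpet_msi_count(const char *line)
{
    const char *p;
    long total = 0;

    if (!strstr(line, "HPET_MSI"))
        return -1;
    p = strchr(line, ':');          /* counters follow the IRQ number */
    if (!p)
        return -1;
    p++;
    for (;;) {
        char *end;
        long v = strtol(p, &end, 10);

        if (end == p)               /* reached the non-numeric type column */
            break;
        total += v;
        p = end;
    }
    return total;
}

/*
 * Scan a whole file (normally /proc/interrupts). Returns -1 if the file
 * cannot be opened or has no HPET_MSI line, else the summed count.
 */
long hpet_msi_count_in_file(const char *path)
{
    char line[1024];
    long result = -1;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        long n = hpet_msi_count(line);

        if (n >= 0)
            result = (result < 0 ? 0 : result) + n;
    }
    fclose(f);
    return result;
}
```

A result of -1 means no HPET_MSI line was found. A result of 0 matches the
run with the rating test patch quoted above, where hpet2 is still listed
but all counters are 0 and formatting worked. A positive result means the
HPET MSI timer is actually firing.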

2010-01-21 19:09:56

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800


HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

Tested-by: Mark Hounschell <[email protected]>

Signed-off-by: Venkatesh Pallipadi <[email protected]>
---

This patch needs to go to stable as well. But there are some conflicts that prevent
the patch from going in as is. I can rebase/resubmit to stable once the patch goes upstream.

arch/x86/include/asm/hpet.h | 1 +
arch/x86/kernel/hpet.c | 8 ++++++++
arch/x86/kernel/quirks.c | 13 +++++++++++++
3 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 5d89fd2..1d5c08a 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -67,6 +67,7 @@ extern unsigned long hpet_address;
extern unsigned long force_hpet_address;
extern u8 hpet_blockid;
extern int hpet_force_user;
+extern u8 hpet_msi_disable;
extern int is_hpet_enabled(void);
extern int hpet_enable(void);
extern void hpet_disable(void);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ba6e658..ad80a1c 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -34,6 +34,8 @@
*/
unsigned long hpet_address;
u8 hpet_blockid; /* OS timer block num */
+u8 hpet_msi_disable;
+
#ifdef CONFIG_PCI_MSI
static unsigned long hpet_num_timers;
#endif
@@ -596,6 +598,9 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
unsigned int num_timers_used = 0;
int i;

+ if (hpet_msi_disable)
+ return;
+
if (boot_cpu_has(X86_FEATURE_ARAT))
return;
id = hpet_readl(HPET_ID);
@@ -928,6 +933,9 @@ static __init int hpet_late_init(void)
hpet_reserve_platform_timers(hpet_readl(HPET_ID));
hpet_print_config();

+ if (hpet_msi_disable)
+ return 0;
+
if (boot_cpu_has(X86_FEATURE_ARAT))
return 0;

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 18093d7..12e9fea 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -491,6 +491,19 @@ void force_hpet_resume(void)
break;
}
}
+
+/*
+ * HPET MSI on some boards (ATI SB700/SB800) has side effect on
+ * floppy DMA. Disable HPET MSI on such platforms.
+ */
+static void force_disable_hpet_msi(struct pci_dev *unused)
+{
+ hpet_msi_disable = 1;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
+ force_disable_hpet_msi);
+
#endif

#if defined(CONFIG_PCI) && defined(CONFIG_NUMA)
--
1.6.0.6
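
The control flow of the patch can be modelled outside the kernel as a
minimal sketch: a quirk sets a flag when the matching device is seen, and
the HPET MSI setup returns early when the flag is set. All names below are
hypothetical user-space stand-ins (no PCI or HPET access takes place), and
the sketch assumes the quirk has run before the HPET MSI setup is reached.

```c
#include <stdbool.h>

/* Stand-in for the flag the patch adds; set once by the quirk. */
bool hpet_msi_disable_flag;

/* Stand-in for the fixup body: it only records the decision. */
void force_disable_hpet_msi_sketch(void)
{
    hpet_msi_disable_flag = true;
}

/* Stand-in for the PCI fixup pass: the quirk runs only if the matching
 * SMBus device was enumerated. */
void run_pci_fixups(bool sbx00_smbus_present)
{
    if (sbx00_smbus_present)
        force_disable_hpet_msi_sketch();
}

/* Number of free HPET channels that would be set up as per-CPU MSI
 * timers. Both early returns mirror the checks in the patch context. */
unsigned int hpet_msi_timers_to_use(bool cpu_has_arat,
                                    unsigned int free_channels)
{
    if (hpet_msi_disable_flag)      /* quirk fired: never use HPET MSI */
        return 0;
    if (cpu_has_arat)               /* LAPIC timer keeps running anyway */
        return 0;
    return free_channels;
}
```

Keeping the decision in a single flag means the HPET code does not need to
know which chipsets are affected; further quirks could set the same flag.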

2010-01-22 22:01:08

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: [tip:x86/urgent] x86: Disable HPET MSI on ATI SB700/SB800

Commit-ID: 9f0b0ce525f19ef408e877b1c7662b60424c7cdc
Gitweb: http://git.kernel.org/tip/9f0b0ce525f19ef408e877b1c7662b60424c7cdc
Author: Pallipadi, Venkatesh <[email protected]>
AuthorDate: Thu, 21 Jan 2010 11:09:52 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Fri, 22 Jan 2010 13:47:01 -0800

x86: Disable HPET MSI on ATI SB700/SB800

HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

[ This patch needs to go to stable as well. But, there are some
conflicts that prevents the patch from going as is. I can
rebase/resubmit to stable once the patch goes upstream.
hpa: still Cc:'ing stable@ as an FYI. ]

Tested-by: Mark Hounschell <[email protected]>
Signed-off-by: Venkatesh Pallipadi <[email protected]>
Cc: <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/hpet.h | 1 +
arch/x86/kernel/hpet.c | 8 ++++++++
arch/x86/kernel/quirks.c | 13 +++++++++++++
3 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 5d89fd2..1d5c08a 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -67,6 +67,7 @@ extern unsigned long hpet_address;
extern unsigned long force_hpet_address;
extern u8 hpet_blockid;
extern int hpet_force_user;
+extern u8 hpet_msi_disable;
extern int is_hpet_enabled(void);
extern int hpet_enable(void);
extern void hpet_disable(void);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ba6e658..ad80a1c 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -34,6 +34,8 @@
*/
unsigned long hpet_address;
u8 hpet_blockid; /* OS timer block num */
+u8 hpet_msi_disable;
+
#ifdef CONFIG_PCI_MSI
static unsigned long hpet_num_timers;
#endif
@@ -596,6 +598,9 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
unsigned int num_timers_used = 0;
int i;

+ if (hpet_msi_disable)
+ return;
+
if (boot_cpu_has(X86_FEATURE_ARAT))
return;
id = hpet_readl(HPET_ID);
@@ -928,6 +933,9 @@ static __init int hpet_late_init(void)
hpet_reserve_platform_timers(hpet_readl(HPET_ID));
hpet_print_config();

+ if (hpet_msi_disable)
+ return 0;
+
if (boot_cpu_has(X86_FEATURE_ARAT))
return 0;

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 18093d7..12e9fea 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -491,6 +491,19 @@ void force_hpet_resume(void)
break;
}
}
+
+/*
+ * HPET MSI on some boards (ATI SB700/SB800) has side effect on
+ * floppy DMA. Disable HPET MSI on such platforms.
+ */
+static void force_disable_hpet_msi(struct pci_dev *unused)
+{
+ hpet_msi_disable = 1;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
+ force_disable_hpet_msi);
+
#endif

#if defined(CONFIG_PCI) && defined(CONFIG_NUMA)

2010-01-23 06:52:20

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: [tip:x86/urgent] x86: Disable HPET MSI on ATI SB700/SB800

Commit-ID: 73472a46b5b28116b145fb5fc05242c1aa8e1461
Gitweb: http://git.kernel.org/tip/73472a46b5b28116b145fb5fc05242c1aa8e1461
Author: Pallipadi, Venkatesh <[email protected]>
AuthorDate: Thu, 21 Jan 2010 11:09:52 -0800
Committer: Ingo Molnar <[email protected]>
CommitDate: Sat, 23 Jan 2010 06:21:58 +0100

x86: Disable HPET MSI on ATI SB700/SB800

HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Original problem report from Mark Hounschell
http://lkml.indiana.edu/hypermail/linux/kernel/0912.2/01118.html

[ This patch needs to go to stable as well. But, there are some
conflicts that prevents the patch from going as is. I can
rebase/resubmit to stable once the patch goes upstream.
hpa: still Cc:'ing stable@ as an FYI. ]

Tested-by: Mark Hounschell <[email protected]>
Signed-off-by: Venkatesh Pallipadi <[email protected]>
Cc: <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/include/asm/hpet.h | 1 +
arch/x86/kernel/hpet.c | 8 ++++++++
arch/x86/kernel/quirks.c | 13 +++++++++++++
3 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/hpet.h b/arch/x86/include/asm/hpet.h
index 5d89fd2..1d5c08a 100644
--- a/arch/x86/include/asm/hpet.h
+++ b/arch/x86/include/asm/hpet.h
@@ -67,6 +67,7 @@ extern unsigned long hpet_address;
extern unsigned long force_hpet_address;
extern u8 hpet_blockid;
extern int hpet_force_user;
+extern u8 hpet_msi_disable;
extern int is_hpet_enabled(void);
extern int hpet_enable(void);
extern void hpet_disable(void);
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index ba6e658..ad80a1c 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -34,6 +34,8 @@
*/
unsigned long hpet_address;
u8 hpet_blockid; /* OS timer block num */
+u8 hpet_msi_disable;
+
#ifdef CONFIG_PCI_MSI
static unsigned long hpet_num_timers;
#endif
@@ -596,6 +598,9 @@ static void hpet_msi_capability_lookup(unsigned int start_timer)
unsigned int num_timers_used = 0;
int i;

+ if (hpet_msi_disable)
+ return;
+
if (boot_cpu_has(X86_FEATURE_ARAT))
return;
id = hpet_readl(HPET_ID);
@@ -928,6 +933,9 @@ static __init int hpet_late_init(void)
hpet_reserve_platform_timers(hpet_readl(HPET_ID));
hpet_print_config();

+ if (hpet_msi_disable)
+ return 0;
+
if (boot_cpu_has(X86_FEATURE_ARAT))
return 0;

diff --git a/arch/x86/kernel/quirks.c b/arch/x86/kernel/quirks.c
index 18093d7..12e9fea 100644
--- a/arch/x86/kernel/quirks.c
+++ b/arch/x86/kernel/quirks.c
@@ -491,6 +491,19 @@ void force_hpet_resume(void)
break;
}
}
+
+/*
+ * HPET MSI on some boards (ATI SB700/SB800) has side effect on
+ * floppy DMA. Disable HPET MSI on such platforms.
+ */
+static void force_disable_hpet_msi(struct pci_dev *unused)
+{
+ hpet_msi_disable = 1;
+}
+
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
+ force_disable_hpet_msi);
+
#endif

#if defined(CONFIG_PCI) && defined(CONFIG_NUMA)

2010-01-23 07:21:14

by Yuhong Bao

[permalink] [raw]
Subject: RE: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800


> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
I think somebody from AMD should review the situation. Clearly something is
happening inside their southbridge. CCing Andreas Herrmann from AMD.
Yuhong Bao

Subject: Re: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800

On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>
> > HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> > side-effects on floppy DMA. Do not use HPET MSI on such platforms.

Argh, will see what information I can find about this problem ...

> I think somebody from AMD should review the situation. Clearly
> something is happening inside their southbridge. CCing Andreas
> Herrmann from AMD.

I have the feeling that this problem should rather be fixed with a DMI
quirk instead of disabling HPET MSI for the entire chipset.

Was the latest available BIOS installed on the affected system?


Thanks,
Andreas

--
Operating | Advanced Micro Devices GmbH
System | Karl-Hammerschmidt-Str. 34, 85609 Dornach b. München, Germany
Research | Geschäftsführer: Andrew Bowd, Thomas M. McCoy, Giuliano Meroni
Center | Sitz: Dornach, Gemeinde Aschheim, Landkreis München
(OSRC) | Registergericht München, HRB Nr. 43632

2010-01-28 09:17:39

by Mark Hounschell

[permalink] [raw]
Subject: Re: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800

On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>
>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>
> Argh, will see what information I can find about this problem ...
>
>> I think somebody from AMD should review the situation. Clearly
>> something is happening inside their southbridge. CCing Andreas
>> Herrmann from AMD.
>
> I have the feeling that this problem should rather be fixed with a DMI
> quirk instead of disabling HPET MSI for the entire chipset.
>
> Was the latest available BIOS installed on the affected system?
>

You mean "systems" of different manufacturers? I will check today. Due to
misconfigured filters I didn't see this until today. Sorry.

Mark

2010-01-28 13:25:29

by Mark Hounschell

[permalink] [raw]
Subject: Re: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800

On 01/28/2010 04:17 AM, Mark Hounschell wrote:
> On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
>> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>>
>>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>>
>> Argh, will see what information I can find about this problem ...
>>
>>> I think somebody from AMD should review the situation. Clearly
>>> something is happening inside their southbridge. CCing Andreas
>>> Herrmann from AMD.
>>
>> I have the feeling that this problem should rather be fixed with a DMI
>> quirk instead of disabling HPET MSI for the entire chipset.
>>
>> Was the latest available BIOS installed on the affected system?
>>
>
> You mean "systems" of different manufactures? I will check today. Due to
> mis configured filters I didn't see this until today. Sorry.
>
> Mark
>

The BIOS was out of date on all my affected boards, but updating did not
help with the problem.

Andreas, while I have your ear, I am also having another issue with this
chip set doing peer to peer bus transfers between pci buses and pci-e buses
and from pci-e to pci-e buses. I've read the chip set specs and they _seem_
to imply that it may not be allowed due to "Trusted Computing" something or
other. I've posted the issue to the AMD forums with no luck, and
I can't figure out why this doesn't work using these chip sets.

Sorry to change the subject. I just figured I'd ask someone from AMD while
I had the chance.

Thanks and Regards
Mark

Subject: Re: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800

On Thu, Jan 28, 2010 at 08:25:23AM -0500, Mark Hounschell wrote:
> On 01/28/2010 04:17 AM, Mark Hounschell wrote:
> > On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
> >> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
> >>>
> >>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
> >>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
> >>
> >> Argh, will see what information I can find about this problem ...
> >>
> >>> I think somebody from AMD should review the situation. Clearly
> >>> something is happening inside their southbridge. CCing Andreas
> >>> Herrmann from AMD.
> >>
> >> I have the feeling that this problem should rather be fixed with a DMI
> >> quirk instead of disabling HPET MSI for the entire chipset.
> >>
> >> Was the latest available BIOS installed on the affected system?
> >>
> >
> > You mean "systems" of different manufactures? I will check today. Due to
> > mis configured filters I didn't see this until today. Sorry.
> >
> > Mark
> >
>
> My BIOS were below rev on all my affected boards but updating did not help
> with the problem.

Hi,

can you post the BIOS vendors of the boards along with the respective
BIOS versions?

Thanks.

--
Regards/Gruss,
Boris.

--
Advanced Micro Devices, Inc.
Operating Systems Research Center

2010-01-28 14:45:10

by Mark Hounschell

[permalink] [raw]
Subject: Re: [PATCH] x86: Disable HPET MSI on ATI SB700/SB800

On 01/28/2010 08:41 AM, Borislav Petkov wrote:
> On Thu, Jan 28, 2010 at 08:25:23AM -0500, Mark Hounschell wrote:
>> On 01/28/2010 04:17 AM, Mark Hounschell wrote:
>>> On 01/25/2010 12:10 PM, Andreas Herrmann wrote:
>>>> On Fri, Jan 22, 2010 at 11:21:06PM -0800, Yuhong Bao wrote:
>>>>>
>>>>>> HPET MSI on platforms with ATI SB700/SB800 as they seem to have some
>>>>>> side-effects on floppy DMA. Do not use HPET MSI on such platforms.
>>>>
>>>> Argh, will see what information I can find about this problem ...
>>>>
>>>>> I think somebody from AMD should review the situation. Clearly
>>>>> something is happening inside their southbridge. CCing Andreas
>>>>> Herrmann from AMD.
>>>>
>>>> I have the feeling that this problem should rather be fixed with a DMI
>>>> quirk instead of disabling HPET MSI for the entire chipset.
>>>>
>>>> Was the latest available BIOS installed on the affected system?
>>>>
>>>
>>> You mean "systems" of different manufactures? I will check today. Due to
>>> mis configured filters I didn't see this until today. Sorry.
>>>
>>> Mark
>>>
>>
>> My BIOS were below rev on all my affected boards but updating did not help
>> with the problem.
>
> Hi,
>
> can you post the BIOS vendors of the boards along with the respective
> BIOS versions?
>
> Thanks.
>

DFI DK-790FXB-M3H5 MB using AWARD bios D7SDA09.BIN (10/09/2009)
BIOSTAR TA790GXB A2+ using AMI bios 78DDA928.BST (09/28/09)

Regards
Mark
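
The DMI quirk suggested earlier in the thread, as an alternative to
disabling HPET MSI for the whole chipset, could be sketched as a table
lookup on board identification strings. This is only an illustration under
stated assumptions: the table entries are copied from the board names
listed above and are placeholders, not the strings these boards actually
report via DMI, and the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One blacklist entry: both substrings must match. */
struct board_id {
    const char *vendor;
    const char *board;
};

/* Placeholder entries taken from the board names in this thread. */
const struct board_id hpet_msi_blacklist[] = {
    { "DFI",     "DK-790FXB-M3H5" },
    { "BIOSTAR", "TA790GXB A2+" },
};

/* Substring match of the reported vendor and board names against the
 * table; returns true if HPET MSI should be left disabled. */
bool board_needs_hpet_msi_off(const char *vendor, const char *board)
{
    size_t n = sizeof hpet_msi_blacklist / sizeof hpet_msi_blacklist[0];
    size_t i;

    for (i = 0; i < n; i++) {
        if (strstr(vendor, hpet_msi_blacklist[i].vendor) &&
            strstr(board, hpet_msi_blacklist[i].board))
            return true;
    }
    return false;
}
```

The trade-off is the one discussed in the thread: a per-board table leaves
unaffected boards alone, but every newly reported board needs a new entry,
whereas the chipset-wide quirk covers unknown boards at the cost of also
disabling HPET MSI where it might work.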

2010-01-15 02:02:00

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On Tue, 2010-01-12 at 01:04 -0800, Mark Hounschell wrote:
> On 01/11/2010 07:19 PM, Pallipadi, Venkatesh wrote:
> >
> > Sorry for not following up on this. We have narrowed this down to HPET
> > MSI and floppy DMA. I still don't know how HPET MSI interrupts are
> > breaking floppy DMA.
> >
> > You are seeing the problem on two different systems. Correct? Do you
> > have any system where this works with HPET MSI enabled?
> >
>
> I see the problem on every system in which the HPET2 shows up in
> /proc/interrupts. The machines that work with HPET enabled don't show HPET
> at all in /proc/interrupts. I have some of each. All the boxes that fail
> here use the (AMD) 790x series chip sets.
>
> > Couple of options on how we can go about this one:
> > 1) Change the HPET-MSI change to not get activated when there are no
> > C-states with LAPIC stoppage involved. This will resolve the problem on
> > the systems you reported as there are no deep C-states. But, I fear that
> > with the actual problem unresolved, we may hit it in future with this or
> > some other platform having same issue with CPUs that support deep
> > C-state.
> > 2) Try this testcase on few other platforms that support HPET-MSI and
> > deep C-states and check how widespread the problem is and then add a
> > whitelist-blacklist for HPET MSI usage.
> >
> > I think, for 2.6.33 option 1 is better. Will work on that and send in
> > patches for you test.
> >
>

Mark,

I just sent out a patchset that should work around the problem here. Can
you check and let me know whether that's the case?

We will still need a simpler/smaller workaround for .33. I will send a
patch for that soon.

Also, are you testing this with a USB floppy controller? I tried to test
it on my end, but fdformat doesn't seem to like my USB floppy drive. I
tried 'ufiformat -f 1440 <dev>', with which I am not able to reproduce
the failure on any of my boxes. I am not sure whether that really means I
don't hit this bug or that it is going through a totally different code path.

Thanks,
Venki
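
The two options quoted above can be read as a small decision rule. What
follows is only a sketch of that rule, not kernel code: the function name,
its parameters and the platform string are hypothetical and chosen for
illustration.

```python
def should_use_hpet_msi(lapic_stops_in_deep_cstate, platform_id,
                        blacklist=frozenset()):
    """Decide whether to enable HPET MSI (hypothetical policy sketch)."""
    # Option 1: if no C-state stops the LAPIC timer, HPET MSI is not
    # needed, so leave it off.
    if not lapic_stops_in_deep_cstate:
        return False
    # Option 2: even when it would be useful, skip platforms that are
    # known to misbehave with HPET MSI.
    return platform_id not in blacklist


# "amd-790x" is a made-up identifier standing in for an affected platform.
print(should_use_hpet_msi(False, "amd-790x"))
print(should_use_hpet_msi(True, "amd-790x", frozenset({"amd-790x"})))
print(should_use_hpet_msi(True, "other-platform", frozenset({"amd-790x"})))
```

Option 1 alone corresponds to calling the function with an empty
blacklist; as noted in the quoted mail, that leaves the underlying problem
unresolved on platforms that do have deep C-states.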

2010-01-15 09:39:47

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 01/14/2010 09:01 PM, Pallipadi, Venkatesh wrote:
> On Tue, 2010-01-12 at 01:04 -0800, Mark Hounschell wrote:
>> On 01/11/2010 07:19 PM, Pallipadi, Venkatesh wrote:
>>>
>>> Sorry for not following up on this. We have narrowed this down to HPET
>>> MSI and floppy DMA. I still don't know how HPET MSI interrupts are
>>> breaking floppy DMA.
>>>
>>> You are seeing the problem on two different systems. Correct? Do you
>>> have any system where this works with HPET MSI enabled?
>>>
>>
>> I see the problem on every system in which the HPET2 shows up in
>> /proc/interrupts. The machines that work with HPET enabled don't show HPET
>> at all in /proc/interrupts. I have some of each. All the boxes that fail
>> here use the (AMD) 790x series chip sets.
>>
>>> Couple of options on how we can go about this one:
>>> 1) Change the HPET-MSI change to not get activated when there are no
>>> C-states with LAPIC stoppage involved. This will resolve the problem on
>>> the systems you reported as there are no deep C-states. But, I fear that
>>> with the actual problem unresolved, we may hit it in future with this or
>>> some other platform having same issue with CPUs that support deep
>>> C-state.
>>> 2) Try this testcase on a few other platforms that support HPET-MSI and
>>> deep C-states and check how widespread the problem is and then add a
>>> whitelist-blacklist for HPET MSI usage.
>>>
>>> I think, for 2.6.33 option 1 is better. Will work on that and send in
>>> patches for you to test.
>>>
>>
>
> Mark,
>
> I just sent out a patchset that should work around the problem here. Can
> you check and let me know whether that's the case?
>

Yes, I'll try that today. I assume I'll find them on LKML.

> We will still need a simpler/smaller workaround for .33. Will send a
> patch for that soon.
>
> Also, are you testing this with a USB floppy controller? I tried to test
> it on my end, but fdformat doesn't seem to like my USB floppy drive. I
> tried 'ufiformat -f 1440 <dev>', with which I am not able to reproduce
> the failure on any of my boxes. Not sure whether that really means I
> don't hit this bug or that it is going through a totally different code
> path.
>

No, I've never even seen a USB floppy controller.

Mark

2010-01-15 18:02:59

by Mark Hounschell

Subject: Re: [Fdutils] DMA cache consistency bug introduced in 2.6.28

On 01/14/2010 09:01 PM, Pallipadi, Venkatesh wrote:
> On Tue, 2010-01-12 at 01:04 -0800, Mark Hounschell wrote:
>> On 01/11/2010 07:19 PM, Pallipadi, Venkatesh wrote:
>>>
>>> Sorry for not following up on this. We have narrowed this down to HPET
>>> MSI and floppy DMA. I still don't know how HPET MSI interrupts are
>>> breaking floppy DMA.
>>>
>>> You are seeing the problem on two different systems. Correct? Do you
>>> have any system where this works with HPET MSI enabled?
>>>
>>
>> I see the problem on every system in which the HPET2 shows up in
>> /proc/interrupts. The machines that work with HPET enabled don't show HPET
>> at all in /proc/interrupts. I have some of each. All the boxes that fail
>> here use the (AMD) 790x series chip sets.
>>
>>> Couple of options on how we can go about this one:
>>> 1) Change the HPET-MSI change to not get activated when there are no
>>> C-states with LAPIC stoppage involved. This will resolve the problem on
>>> the systems you reported as there are no deep C-states. But, I fear that
>>> with the actual problem unresolved, we may hit it in future with this or
>>> some other platform having same issue with CPUs that support deep
>>> C-state.
>>> 2) Try this testcase on a few other platforms that support HPET-MSI and
>>> deep C-states and check how widespread the problem is and then add a
>>> whitelist-blacklist for HPET MSI usage.
>>>
>>> I think, for 2.6.33 option 1 is better. Will work on that and send in
>>> patches for you to test.
>>>
>>
>
> Mark,
>
> I just sent out a patchset that should work around the problem here. Can
> you check and let me know whether that's the case?
>

Yes, it does seem to fix the issue. The floppy formats, and /proc/interrupts
looks normal, with nothing going on with the hpet2 MSI.

Regards
Mark
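
The check Mark describes, whether an hpet entry shows up in
/proc/interrupts, can be scripted. A minimal sketch, assuming the device
name is the last whitespace-separated field of each row and starts with
"hpet". The function name is made up, and the SAMPLE text is illustrative
only; it is not output from any machine in this thread.

```python
def hpet_irq_lines(interrupts_text):
    """Return rows of /proc/interrupts-style text whose trailing
    device name starts with "hpet" (case-insensitive)."""
    matches = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Data rows begin with "<irq>:"; this skips the CPU header row
        # and blank lines.
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        if fields[-1].lower().startswith("hpet"):
            matches.append(line.strip())
    return matches


# Illustrative sample only (assumed layout, invented counts).
SAMPLE = """\
           CPU0       CPU1
  0:        127          0   IO-APIC-edge      timer
  6:          4          0   IO-APIC-edge      floppy
 24:     210564          0   HPET_MSI-edge      hpet2
NMI:          0          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    try:
        with open("/proc/interrupts") as f:
            text = f.read()
    except OSError:
        # Fall back to the sample when not running on Linux.
        text = SAMPLE
    for row in hpet_irq_lines(text):
        print(row)
```

Running this before and after a format shows whether the interrupt counts
on an hpet row are changing; on the machines reported as working here, no
hpet row would be listed at all.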