2006-11-05 15:18:35

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Adrian Bunk <[email protected]> writes:

> This email lists some known regressions in 2.6.19-rc4 compared to 2.6.18
> that are not yet fixed in Linus' tree.
>
> If you find your name in the Cc header, you are either submitter of one
> of the bugs, maintainer of an affectected subsystem or driver, a patch
> of you caused a breakage or I'm considering you in any other way possibly
> involved with one or more of these issues.
>
> Due to the huge amount of recipients, please trim the Cc when answering.
>
>
> Subject : ipath driver MCEs system on load when HT chip present
> References : http://bugzilla.kernel.org/show_bug.cgi?id=7455
> Submitter : Bryan O'Sullivan <[email protected]>
> Caused-By : Eric W. Biederman <[email protected]>
> Status : unknown

Status in problem is being debugged. I have posted some infrastructure
patches that should allow Bryan to fix his driver cleanly.

I did not cause this. The ipath HTX card driver's irq handling has
never been anything but a hack. It has never worked correctly even in
the instances it worked. It only worked on i386 or x86_64 when
CONFIG_PCI_MSI was enabled but did not use MSI. It was relying on the
implementation detail that the architecture specific vector number was
placed in the dev->irq. dev->irq is actually meaningless on this card
as it doesn't have any ordinary pci interrupts.

So while I am happy to take credit for flushing this bug out I did not
introduce it.

Eric


2006-11-07 04:22:14

by Adrian Bunk

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

On Sun, Nov 05, 2006 at 08:17:53AM -0700, Eric W. Biederman wrote:
> Adrian Bunk <[email protected]> writes:
>
> > This email lists some known regressions in 2.6.19-rc4 compared to 2.6.18
> > that are not yet fixed in Linus' tree.
> >
> > If you find your name in the Cc header, you are either submitter of one
> > of the bugs, maintainer of an affectected subsystem or driver, a patch
> > of you caused a breakage or I'm considering you in any other way possibly
> > involved with one or more of these issues.
> >
> > Due to the huge amount of recipients, please trim the Cc when answering.
> >
> >
> > Subject : ipath driver MCEs system on load when HT chip present
> > References : http://bugzilla.kernel.org/show_bug.cgi?id=7455
> > Submitter : Bryan O'Sullivan <[email protected]>
> > Caused-By : Eric W. Biederman <[email protected]>
> > Status : unknown
>
> Status in problem is being debugged. I have posted some infrastructure
> patches that should allow Bryan to fix his driver cleanly.
>
> I did not cause this. The ipath HTX card driver's irq handling has
> never been anything but a hack. It has never worked correctly even in
> the instances it worked. It only worked on i386 or x86_64 when
> CONFIG_PCI_MSI was enabled but did not use MSI. It was relying on the
> implementation detail that the architecture specific vector number was
> placed in the dev->irq. dev->irq is actually meaningless on this card
> as it doesn't have any ordinary pci interrupts.
>
> So while I am happy to take credit for flushing this bug out I did not
> introduce it.

My notion of "regression" is from a user's perspective.

Therefore, if a hack that worked at least for some users no longer
works, that's a regression. That's independent of the technical question
whose fault it actually was.

We should either get this fixed before 2.6.19 or at least make it clear
for users that support for this hardware won't be back before 2.6.20.

> Eric

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2006-11-07 05:18:41

by Bryan O'Sullivan

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Adrian Bunk wrote:

> We should either get this fixed before 2.6.19 or at least make it clear
> for users that support for this hardware won't be back before 2.6.20.

Eric and I are working on patches for this.

<b

2006-11-07 08:51:26

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

"Bryan O'Sullivan" <[email protected]> writes:

> Adrian Bunk wrote:
>
>> We should either get this fixed before 2.6.19 or at least make it clear for
>> users that support for this hardware won't be back before 2.6.20.
>
> Eric and I are working on patches for this.

How comes the driver fixes in the context of my patches?

Eric

2006-11-07 16:19:57

by Bryan O'Sullivan

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Eric W. Biederman wrote:

> How comes the driver fixes in the context of my patches?

The fix is simple enough, it's just not as clean as I'd like. I have to
pull apart the contents of the ht_irq_msg that the new update hook is
getting passed, in order to get the vector out, so that I can reprogram
the offending chip register after I've done the config space writes.
It's a pretty roundabout way to do the job.

<b

2006-11-07 17:34:31

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

"Bryan O'Sullivan" <[email protected]> writes:

> Eric W. Biederman wrote:
>
>> How comes the driver fixes in the context of my patches?
>
> The fix is simple enough, it's just not as clean as I'd like. I have to pull
> apart the contents of the ht_irq_msg that the new update hook is getting passed,
> in order to get the vector out, so that I can reprogram the offending chip
> register after I've done the config space writes. It's a pretty roundabout way
> to do the job.

Huh? As I read the ipath code I am passing you the value that needs to go
into ipath->int_config and thus into dd->ipath_kregs->kr_interrupt_config.

Sure it is coming as 2 32bit words instead of a one big 64 bit one, but
that is simple to fix.

If your card doesn't pay attention to configuration space access cycles then
there should be no reason to write the value there. If your card does pay
attention to the configuration space access cycles it should be trivial to
make this work.

Please stop getting stuck on your existing hack that only put in the vector
number and didn't put in any of the other interrupt routing information.

If you really need to write to both the config space registers and your
magic shadow copy of the register I can certainly do the config space
writes for you. I just figured it would be more efficient not to.

Eric

2006-11-07 17:37:10

by Dave Olson

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

On Tue, 7 Nov 2006, Eric W. Biederman wrote:
| Huh? As I read the ipath code I am passing you the value that needs to go
| into ipath->int_config and thus into dd->ipath_kregs->kr_interrupt_config.

Yes.

| Sure it is coming as 2 32bit words instead of a one big 64 bit one, but
| that is simple to fix.

It would be cleaner, but not absolutely necessary.

| If your card doesn't pay attention to configuration space access cycles then
| there should be no reason to write the value there. If your card does pay
| attention to the configuration space access cycles it should be trivial to
| make this work.

The card does pay attention, and other programs such as lspci and the
like also look at the config space. They should definitely be kept
in sync, and config writes are fairly cheap, anyway.

| If you really need to write to both the config space registers and your
| magic shadow copy of the register I can certainly do the config space
| writes for you. I just figured it would be more efficient not to.

The HT layer should always do the config updates, since you are trying
to clean up that layer. Only the "extra" stuff (if any) should be done by
the callback.

Dave Olson
[email protected]

2006-11-07 18:01:19

by Bryan O'Sullivan

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Eric W. Biederman wrote:

> If you really need to write to both the config space registers and your
> magic shadow copy of the register I can certainly do the config space
> writes for you. I just figured it would be more efficient not to.

Yes, we need to do both.

I've got code that refactors your patch a little as I try to get the
driver happy, but so far it's not seeing any interrupts. I'll keep you
posted.

<b

2006-11-07 18:21:17

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Dave Olson <[email protected]> writes:

> On Tue, 7 Nov 2006, Eric W. Biederman wrote:
> | Huh? As I read the ipath code I am passing you the value that needs to go
> | into ipath->int_config and thus into dd->ipath_kregs->kr_interrupt_config.
>
> Yes.
>
> | Sure it is coming as 2 32bit words instead of a one big 64 bit one, but
> | that is simple to fix.
>
> It would be cleaner, but not absolutely necessary.
>
> | If your card doesn't pay attention to configuration space access cycles then
> | there should be no reason to write the value there. If your card does pay
> | attention to the configuration space access cycles it should be trivial to
> | make this work.
>
> The card does pay attention, and other programs such as lspci and the
> like also look at the config space. They should definitely be kept
> in sync, and config writes are fairly cheap, anyway.

Well this is a rathole so it really isn't safe for lspci to play with
(races with the kernel accessing it)

This hole concept of you having the register but not connecting it up on
the card is rather bizarre.

> | If you really need to write to both the config space registers and your
> | magic shadow copy of the register I can certainly do the config space
> | writes for you. I just figured it would be more efficient not to.
>
> The HT layer should always do the config updates, since you are trying
> to clean up that layer. Only the "extra" stuff (if any) should be done by
> the callback.

Fine by me. That's why the patch was up for review. That is just moving
the if statement I currently have. So it should be trivial. If that
won't break your card that is good enough for me.

Eric

2006-11-07 18:30:35

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

"Bryan O'Sullivan" <[email protected]> writes:

> Eric W. Biederman wrote:
>
>> If you really need to write to both the config space registers and your
>> magic shadow copy of the register I can certainly do the config space
>> writes for you. I just figured it would be more efficient not to.
>
> Yes, we need to do both.
>
> I've got code that refactors your patch a little as I try to get the driver
> happy, but so far it's not seeing any interrupts. I'll keep you posted.

Ok. Thanks.

Eric

2006-11-07 20:30:28

by Dave Olson

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

On Tue, 7 Nov 2006, Eric W. Biederman wrote:
| > | If your card doesn't pay attention to configuration space access cycles then
| > | there should be no reason to write the value there. If your card does pay
| > | attention to the configuration space access cycles it should be trivial to
| > | make this work.
| >
| > The card does pay attention, and other programs such as lspci and the
| > like also look at the config space. They should definitely be kept
| > in sync, and config writes are fairly cheap, anyway.
|
| Well this is a rathole so it really isn't safe for lspci to play with
| (races with the kernel accessing it)

Displaying something that might change is a fact of life, and no
different than the PCI world. It's still best to keep things as
correct as possible.

| This hole concept of you having the register but not connecting it up on
| the card is rather bizarre.

The HT core we use made it extremely difficult, unfortunately. One of
those things in hardware you sometimes just have to live with.

| > The HT layer should always do the config updates, since you are trying
| > to clean up that layer. Only the "extra" stuff (if any) should be done by
| > the callback.
|
| Fine by me. That's why the patch was up for review. That is just moving
| the if statement I currently have. So it should be trivial. If that
| won't break your card that is good enough for me.

Thanks,

Dave Olson
[email protected]

2006-11-07 20:51:57

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Dave Olson <[email protected]> writes:

> On Tue, 7 Nov 2006, Eric W. Biederman wrote:
> | > | If your card doesn't pay attention to configuration space access cycles
> then
> | > | there should be no reason to write the value there. If your card does pay
> | > | attention to the configuration space access cycles it should be trivial to
> | > | make this work.
> | >
> | > The card does pay attention, and other programs such as lspci and the
> | > like also look at the config space. They should definitely be kept
> | > in sync, and config writes are fairly cheap, anyway.
> |
> | Well this is a rathole so it really isn't safe for lspci to play with
> | (races with the kernel accessing it)
>
> Displaying something that might change is a fact of life, and no
> different than the PCI world. It's still best to keep things as
> correct as possible.

No. I was thinking of the rat hole in pci config space you have to
access to read these registers. You have to actively write a pci
config value to select which register you are going to read.

So by default it is not safe to touch this value from user space,
because you could mess up the kernel, if the kernel is updating the
value.

Eric

2006-11-07 21:01:45

by Dave Olson

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

On Tue, 7 Nov 2006, Eric W. Biederman wrote:
| > Displaying something that might change is a fact of life, and no
| > different than the PCI world. It's still best to keep things as
| > correct as possible.
|
| No. I was thinking of the rat hole in pci config space you have to
| access to read these registers. You have to actively write a pci
| config value to select which register you are going to read.
|
| So by default it is not safe to touch this value from user space,
| because you could mess up the kernel, if the kernel is updating the
| value.

Nonetheless, as root, lspci already does that (it displays the MSI
interrupt info). I wasn't talking about fixing that, just saying
that having the data being as correct as possible, is highly
desirable. We can't know everything that everybody is doing with
the data.

Improvements in the pciutils library and locking with respect to the
kernel may well be desirable, but are an independent issue from
correctness.

Dave Olson
[email protected]

2006-11-07 21:32:15

by Bryan O'Sullivan

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Eric W. Biederman wrote:

> If you really need to write to both the config space registers and your
> magic shadow copy of the register I can certainly do the config space
> writes for you.

Here's an updated copy of your second patch that does just that.

I've also included a preview of the ipath patch that depends on this.
With your original patch, my rework of your second patch, and the new
ipath patch, the driver is back to working happily for me on top of
current -git.

I need to test the ipath patch on powerpc before I consider it cooked,
but I won't have time to do that today.

<b


Attachments:
htirq-allow-buggy-drivers-of-buggy-hardware-to-write-the-registers.patch (3.53 kB)
ipath-htirq.patch (6.52 kB)
Download all attachments

2006-11-07 21:36:10

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Dave Olson <[email protected]> writes:

> On Tue, 7 Nov 2006, Eric W. Biederman wrote:
> | > Displaying something that might change is a fact of life, and no
> | > different than the PCI world. It's still best to keep things as
> | > correct as possible.
> |
> | No. I was thinking of the rat hole in pci config space you have to
> | access to read these registers. You have to actively write a pci
> | config value to select which register you are going to read.
> |
> | So by default it is not safe to touch this value from user space,
> | because you could mess up the kernel, if the kernel is updating the
> | value.
>
> Nonetheless, as root, lspci already does that (it displays the MSI
> interrupt info). I wasn't talking about fixing that, just saying
> that having the data being as correct as possible, is highly
> desirable. We can't know everything that everybody is doing with
> the data.

I think we are talking past each other. I think it is fine but silly
to set a standard register that isn't actually used. It probably makes
debugging a little easier but it might also make things a little more
confusing because we are doing something totally unnecessary.

The pci capability is fine. The issue with the hyptertrasnport interrupt
capability is that it is 8 bytes long and controls up to 1024 bytes of data.
It is not nor can I ever it image it being safe for lspci to write the
window address register to read back the interrupt routing register.
Someone poking at this with setpci and lspci is fine.

In general reads of random registers are racy but harmless. Writes of
registers that the kernel needs to have a specific value should never
ever be done by default, because bad nasty things may happen. It is
a very good way to shoot yourself in the foot.

> Improvements in the pciutils library and locking with respect to the
> kernel may well be desirable, but are an independent issue from
> correctness.

This is not a race issue this is a true correctness issue. There
is an address register and a data register. It will never be correct
for any user space program to write to the address register without
first proving that the kernel does not care what value that register
takes on, or the user has sufficient privileges and says do it anyway
I know what I am doing.

These are not ordinary pci config space registers, although they are standard
registers for hypertransport devices.


Eric

2006-11-07 21:41:10

by Dave Olson

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

On Tue, 7 Nov 2006, Eric W. Biederman wrote:
| I think we are talking past each other. I think it is fine but silly
| to set a standard register that isn't actually used. It probably makes
| debugging a little easier but it might also make things a little more
| confusing because we are doing something totally unnecessary.

I think we are saying exactly the same thing, so I'll leave it at that.

Dave Olson
[email protected]

2006-11-07 22:00:58

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

"Bryan O'Sullivan" <[email protected]> writes:

> Eric W. Biederman wrote:
>
>> If you really need to write to both the config space registers and your
>> magic shadow copy of the register I can certainly do the config space
>> writes for you.
>
> Here's an updated copy of your second patch that does just that.
>
> I've also included a preview of the ipath patch that depends on this. With your
> original patch, my rework of your second patch, and the new ipath patch, the
> driver is back to working happily for me on top of current -git.
>
> I need to test the ipath patch on powerpc before I consider it cooked, but I
> won't have time to do that today.

Ok. It looks good except you aren't calling ht_destroy_irq on driver unload.
Which is a small resource leak.

Cool looks like we have got this one.

Eric

2006-11-07 22:26:31

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Dave Olson <[email protected]> writes:

> On Tue, 7 Nov 2006, Eric W. Biederman wrote:
> | I think we are talking past each other. I think it is fine but silly
> | to set a standard register that isn't actually used. It probably makes
> | debugging a little easier but it might also make things a little more
> | confusing because we are doing something totally unnecessary.
>
> I think we are saying exactly the same thing, so I'll leave it at that.

Ok. I wish it was clear to me that you understood my concerns but you aren't
writing a patch for lspci so it really doesn't matter.

Eric

2006-11-08 05:14:09

by Bryan O'Sullivan

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

Eric W. Biederman wrote:

> Ok. It looks good except you aren't calling ht_destroy_irq on driver unload.
> Which is a small resource leak.

Yeah, I'm also not reprogramming that register if the interrupt routing
changes. Ran out of time today. I'll fix those tomorrow.

<b

2006-11-08 11:12:06

by Eric W. Biederman

[permalink] [raw]
Subject: Re: 2.6.19-rc4: known unfixed regressions (v3)

"Bryan O'Sullivan" <[email protected]> writes:

> Eric W. Biederman wrote:
>
>> Ok. It looks good except you aren't calling ht_destroy_irq on driver unload.
>> Which is a small resource leak.
>
> Yeah, I'm also not reprogramming that register if the interrupt routing changes.
> Ran out of time today. I'll fix those tomorrow.

Good point. You generate the value and then never put it anywhere...
Anyway Andrew appears to have the rest of the fixes so this should be
easy to finish.

Eric