Hi Ross,
I have a version of your timer patch (io_apic.c) for kernel 2.6.1. It is
attached. I have been monitoring a problem with it. It seems that with the
patch, I gain about 1 second every 10 minutes (roughly). So I gain about 2-3
minutes a day. I haven't taken exact measurements, but I know it ends up about
20 minutes off after a week. This is not good, since it would require
resetting the time often.
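As a rough cross-check of these figures, here is a small sketch of the arithmetic. The rate of 1 second per 10 minutes is only the approximate observation above, and the function name is made up for illustration:

```python
def drift(gain_s: float, interval_s: float) -> dict:
    """Extrapolate an observed clock gain to longer periods."""
    rate = gain_s / interval_s              # seconds gained per real second
    return {
        "ppm": rate * 1e6,                  # parts per million
        "min_per_day": rate * 86400 / 60,   # minutes gained per day
        "min_per_week": rate * 7 * 86400 / 60,
    }

d = drift(gain_s=1.0, interval_s=600.0)
print(d)
```

That works out to about 2.4 minutes a day and about 17 minutes a week, which is in line with the 2-3 minutes and roughly 20 minutes I have seen.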
I tried the 2.6.1 kernel without the timer patch. The timer is now back in PIC
mode, and interrupt 7 has the old noise. I synched the time with my watch. At
first, I noticed no gain in time over 10 minutes. However, the next day I found
it had gained 1-2 seconds. A few days later it is now about 7 seconds ahead.
This is much better.
So I'm left thinking the patch does two things, maybe one thing right and
one possibly very wrong:
1) It does place the timer in APIC mode.
2) But the timer seems to be fed extra interrupts, maybe the same ones that are
found on irq 7 without the patch (is this possible?)
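If the extra-interrupt theory is right, the observed gain puts a rough number on it. The sketch below assumes the kernel simply advances the clock by 1/HZ seconds per timer interrupt and that HZ is 1000 (the usual i386 value in 2.6; 2.4 uses 100). The function name is hypothetical:

```python
def spurious_ticks_per_second(gain_s: float, interval_s: float, hz: int) -> float:
    # Each timer interrupt advances the clock by 1/hz seconds, so a gain
    # of gain_s seconds corresponds to gain_s * hz extra interrupts.
    return gain_s * hz / interval_s

# Assumed inputs: 1 second gained per 600 seconds, HZ = 1000.
extra = spurious_ticks_per_second(1.0, 600.0, hz=1000)
print(f"{extra:.2f} extra timer interrupts per second")
```

Under those assumptions that is on the order of one or two extra interrupts per second on irq 0, which could be compared with how fast the irq 7 count grows in /proc/interrupts on the unpatched kernel.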
I remember someone making a comment which might explain the issue:
http://marc.theaimsgroup.com/?l=linux-kernel&m=107098440019588&w=2
I don't think the patch is much different now than it was then. So I think
there is something wrong with setting up the timer this way. I don't know if
you worked something out with Maciej. I don't know much about interrupt
controller programming, so maybe you can explain anything I'm
missing. For now I've dropped the patch.
Jesse
PS: I have run with disconnect on, and without your ack patch, since I got that
surprise BIOS update. No lockups have occurred in the past month since then. So
the disconnect problem is a BIOS bug. (Shuttle has not responded.)
PPS: CC me, I'm not subscribed right now.
Hi,
I can confirm this behaviour. My clock has been skewing recently, but I had not
made the link myself that this started happening after I started using the
APIC/IOAPIC nforce fixes.
If there's any debug info I can provide, let me know. I run an AMD XP2600+ on an
Abit NF7-S V2.0 motherboard.
Daniel
Ditto here, using nforce2. I've been up for about a week and a half and my
clock is skewed by at least 20 minutes.
Matt H.
Jesse Allen wrote:
> Hi Ross,
>
> I have a version of your timer patch (io_apic.c) for kernel 2.6.1. It is
> attached. I have been monitoring a problem with it. It seems that with the
> patch, I gain about 1 second every 10 minutes (roughly). So I gain about 2-3
> minutes a day. I haven't taken exact measurements, but I know it ends up about
> 20 minutes off after a week. This is not good, since it would require
> resetting the time often.
Not a good thing.
I have not noticed it on my patched 2.4.24 kernel; it seems to keep
good time against the clock on the wall. I will have to check further to be certain.
I do not run 2.6.x by default as yet.
>
> I tried the 2.6.1 kernel without the timer patch. The timer is now back in PIC
> mode, and interrupt 7 has the old noise. I synched the time with my watch. At
> first, I noticed no gain in time over 10 minutes. However, the next day I found
> it had gained 1-2 seconds. A few days later it is now about 7 seconds ahead.
> This is much better.
>
> So I'm left thinking the patch does two things, maybe one thing right and
> one possibly very wrong:
>
> 1) It does place the timer in APIC mode.
> 2) But the timer seems to be fed extra interrupts, maybe the same ones that are
> found on irq 7 without the patch (is this possible?)
>
> I remember someone making a comment which might explain the issue:
> http://marc.theaimsgroup.com/?l=linux-kernel&m=107098440019588&w=2
>
> I don't think the patch is much different now than it was then. So I think
> there is something wrong with setting up the timer this way. I don't know if
> you worked something out with Maciej. I don't know much about interrupt
> controller programming, so maybe you can explain anything I'm
> missing. For now I've dropped the patch.
We looked into the issue. These emails (from my 2.4.23 kernel) indicate that
the 8259 PIC is fully masked off when the check timer looked for interrupts:
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/2288.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/2375.html
So if there are additional interrupts getting into irq0, they are very
unexpected and may be from an undocumented source. Or perhaps something
is different in how time is counted in 2.6.x compared with 2.4.xx? Could the
same interrupt occasionally be getting counted twice or something?
>
>
> Jesse
>
>
> PS: I have run with disconnect on, and without your ack patch since I got that
> surprise BIOS update. No lockups have occurred in the past month, since that.
> So the disconnect problem is a BIOS bug. (Shuttle has not responded)
>
I would love a BIOS update like that for my motherboards!
Given that Daniel has had lockups with disconnect off
http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/4769.html
and your machine no longer has them with disconnect on, I now think the
cause is something other than the AMD C1 disconnect issue, but what it is I do
not know. I believe AMD are still looking into it as per my support request.
I would really like to hear from Nvidia on this issue. I have tried emailing
them, form-mailing them, and also posting to their Linux forum, with no success
or response. Given it's been over a month since my first posting, I am having
serious second thoughts about my choice of chipsets and motherboards!
I currently do not feel very warm and fuzzy about their Linux support!
> PPS: CC me, I'm not subscribed right now.
Ross Dickson wrote:
> Given that Daniel has had lockups with disconnect off
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/4769.html
> and your machine no longer has them with disconnect on, I now think the
> cause is something other than the AMD C1 disconnect issue, but what it is I do
> not know. I believe AMD are still looking into it as per my support request.
Hi Ross,
I feel that I should clarify your point here, as I've become a bit confused
myself as to which patch does what!
I compiled 2.6.0-test11-mm1 with IOAPIC and APIC support (I had never used these
options before). Within minutes I met my first lockup ever on this PC. (This
kernel included the other recent nforce2 patches:
nforce2-disconnect-quirk.patch and +nforce2-apic.patch)
I then tried your patches with default settings, and I thought that the
problem was solved, as you can see in the URL you referenced. That was a false
alarm: I hit another lockup a few hours later, and another one a few
hours after that. I then reverted the nforce2-disconnect-quirk patch (as
per your suggestion), and started booting with the apic_tack=2 argument.
See: http://linux.derkeiler.com/Mailing-Lists/Kernel/2003-12/5259.html
No lockups since then, but I have been experiencing the clock skew.
But there might be more to it. I had forgotten until now, but I am using
this command in my /etc/conf.d/local.start :
setpci -v -H1 -s 0:0.0 6F=$(printf %x $((0x$(setpci -H1 -s 0:0.0 6F) | 0x10)))
and this one in /etc/conf.d/local.stop :
setpci -v -H1 -s 0:0.0 6F=$(printf %x $((0x$(setpci -H1 -s 0:0.0 6F) & 0xef)))
I got these commands from
http://www.tldp.org/HOWTO/Athlon-Powersaving-HOWTO/approaches.html#commandline
a while back - they are supposed to enable powersaving and make your CPU run
cooler. I ran a test when I originally started using this, and I did find that
my CPU ran cooler when idle after I had run these commands.
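For reference, what the two commands do to the register can be sketched as follows. This only illustrates the bit arithmetic: the starting value 0x0f is made up and the helper names are hypothetical. The real commands read and write byte 0x6F of PCI device 0:0.0 through setpci.

```python
BIT4 = 0x10  # the bit toggled in config byte 0x6F by the commands above

def enable_bit(value: int) -> int:
    # local.start: OR with 0x10 sets bit 4 and leaves the other bits alone
    return (value | BIT4) & 0xFF

def disable_bit(value: int) -> int:
    # local.stop: AND with 0xef clears bit 4 and leaves the other bits alone
    return value & 0xEF

reg = 0x0F                 # made-up starting value, for illustration only
on = enable_bit(reg)
off = disable_bit(on)
print(f"{reg:#04x} -> {on:#04x} -> {off:#04x}")
```

So the start script sets one bit and the stop script clears the same bit again; nothing else in the register is touched.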
I don't know if this would influence my system's behaviour with/without the
various patches that have been flying around, but I hope you can make some
sense out of it and continue bugging AMD/nvidia!
> I would really like to hear from Nvidia on this issue, I have tried emailing
> them, form mailing them and also posting to their linux forum with no
> success or response. Given it's been over a month since my first posting I
> am having serious second thoughts about my choice of chipsets and
> motherboards! I currently do not feel very warm and fuzzy about their linux
> support!
I'd certainly investigate a different type of chipset on my next board, if I
knew of a manufacturer that *would* approach issues like this one out in
the open...
Daniel
> But, there might be more to it. I had forgotten up until now, but I am
> using this code in my /etc/conf.d/local.start :
> setpci -v -H1 -s 0:0.0 6F=$(printf %x $((0x$(setpci -H1 -s 0:0.0 6F) | 0x10)))
>
> and this one in /etc/conf.d/local.stop :
> setpci -v -H1 -s 0:0.0 6F=$(printf %x $((0x$(setpci -H1 -s 0:0.0 6F) & 0xef)))
>
> I got these codes from
> http://www.tldp.org/HOWTO/Athlon-Powersaving-HOWTO/approaches.html#commandline
Well, you are turning disconnect "on" at boot and "off" at shutdown,
if I am not mistaken. The quirk wanted to turn it off at boot
time, so it is strange that it led to lockups for you. I have lockup problems
introduced with the 2.6.1 mm line. I am trying to find out what the cause is...
I haven't tried Ross' patches, btw.
Prakash