I just upgraded from -mm3 (which I'd been running solidly for over a
week) to -mm8, and find that the system freezes hard after I start the X
server. After X starts, lifetime varies from zero to maybe 20 seconds
of app launching, then everything locks up.
At this point, the machine is still pingable, but daemons like sshd
don't respond, and I can't see any logs. After a reboot back to -mm3,
there's nothing suspicious in /var/log.
<b
On Wed, 2003-06-11 at 15:17, Bryan O'Sullivan wrote:
> I just upgraded from -mm3 (which I'd been running solidly for over a
> week) to -mm8, and find that the system freezes hard after I start the X
> server. After X starts, lifetime varies from zero to maybe 20 seconds
> of app launching, then everything locks up.
>
> At this point, the machine is still pingable, but daemons like sshd
> don't respond, and I can't see any logs. After a reboot back to -mm3,
> there's nothing suspicious in /var/log.
Same problem here. It started happening in -mm6.
I have not narrowed it down to anything suspicious, and because it is in
X and my normal desktop machine I have not really debugged the issue.
Interrupts are still on, though...
Robert Love
"Bryan O'Sullivan" <[email protected]> wrote:
>
> I just upgraded from -mm3 (which I'd been running solidly for over a
> week) to -mm8, and find that the system freezes hard after I start the X
> server. After X starts, lifetime varies from zero to maybe 20 seconds
> of app launching, then everything locks up.
You might try reverting
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.70/2.5.70-mm8/broken-out/pci-init-ordering-fix.patch
> At this point, the machine is still pingable, but daemons like sshd
> don't respond, and I can't see any logs. After a reboot back to -mm3,
> there's nothing suspicious in /var/log.
Something oopsed I'd say. You using radeon? That seems pretty oopsy
lately.
On Wed, 2003-06-11 at 15:41, Andrew Morton wrote:
> You might try reverting
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.70/2.5.70-mm8/broken-out/pci-init-ordering-fix.patch
Will do.
> Something oopsed I'd say. You using radeon? That seems pretty oopsy
> lately.
Yep, R7500.
<b
On Wed, 2003-06-11 at 15:41, Andrew Morton wrote:
> You might try reverting
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.70/2.5.70-mm8/broken-out/pci-init-ordering-fix.patch
No good.
<b
On Wed, 2003-06-11 at 15:41, Andrew Morton wrote:
> You might try reverting
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.70/2.5.70-mm8/broken-out/pci-init-ordering-fix.patch
>
> Something oopsed I'd say. You using radeon? That seems pretty oopsy
> lately.
I will debunk both theories: its not Radeon (I have a Matrox) and its
not the pci-init-ordering-fix patch (I already tried that).
Robert Love
On Wed, 2003-06-11 at 16:34, Robert Love wrote:
> I will debunk both theories: its not Radeon (I have a Matrox) and its
> not the pci-init-ordering-fix patch (I already tried that).
Ah, it is the anticipatory I/O scheduler.
There is a logic thinko somewhere... I have not found it yet, but I have
narrowed it down to something which the attached patch fixes (i.e.,
apply this patch and the problem is fixed).
Maybe Nick can see the bug and short circuit my search? The problem is
related to the as-autotune-write-batches patch.
Robert Love
Fix system deadlock in 2.5.70-mm6 and beyond.
drivers/block/as-iosched.c | 68 +++++++--------------------------------------
1 files changed, 11 insertions(+), 57 deletions(-)
diff -urN linux-2.5.70-mm8/drivers/block/as-iosched.c linux/drivers/block/as-iosched.c
--- linux-2.5.70-mm8/drivers/block/as-iosched.c 2003-06-11 17:12:02.919110360 -0700
+++ linux/drivers/block/as-iosched.c 2003-06-11 17:05:59.000000000 -0700
@@ -52,7 +52,7 @@
* See, the problem is: we can send a lot of writes to disk cache / TCQ in
* a short amount of time...
*/
-#define default_write_batch_expire (HZ / 20)
+#define default_write_batch_expire (5)
/*
* max time we may wait to anticipate a read (default around 6ms)
@@ -135,9 +135,6 @@
unsigned long last_check_fifo[2];
int changed_batch;
int batch_data_dir; /* current batch REQ_SYNC / REQ_ASYNC */
- int write_batch_count; /* max # of reqs in a write batch */
- int current_write_count; /* how many requests left this batch */
- int write_batch_idled; /* has the write batch gone idle? */
mempool_t *arq_pool;
enum anticipation_status antic_status;
@@ -938,35 +935,6 @@
}
/*
- * Gathers timings and resizes the write batch automatically
- */
-void update_write_batch(struct as_data *ad)
-{
- unsigned long batch = ad->batch_expire[REQ_ASYNC];
- long write_time;
-
- write_time = (jiffies - ad->current_batch_expires) + batch;
- if (write_time < 0)
- write_time = 0;
-
- if (write_time > batch + 5 && !ad->write_batch_idled) {
- if (write_time / batch > 2)
- ad->write_batch_count /= 2;
- else
- ad->write_batch_count--;
-
- } else if (write_time + 5 < batch && ad->current_write_count == 0) {
- if (batch / write_time > 2)
- ad->write_batch_count *= 2;
- else
- ad->write_batch_count++;
- }
-
- if (ad->write_batch_count < 1)
- ad->write_batch_count = 1;
-}
-
-/*
* as_completed_request is to be called when a request has completed and
* returned something to the requesting process, be it an error or data.
*/
@@ -981,7 +949,8 @@
return;
}
- WARN_ON(blk_fs_request(rq) && arq->state == AS_RQ_NEW);
+ if (blk_fs_request(rq) && arq->state == AS_RQ_NEW)
+ printk(KERN_INFO "warning: as_completed_request got bad request\n");
if (arq->state != AS_RQ_DISPATCHED)
return;
@@ -999,7 +968,6 @@
*/
if (ad->batch_data_dir == REQ_SYNC && ad->changed_batch
&& ad->batch_data_dir == arq->is_sync) {
- update_write_batch(ad);
ad->current_batch_expires = jiffies +
ad->batch_expire[REQ_SYNC];
ad->changed_batch = 0;
@@ -1151,11 +1119,10 @@
return 0;
if (ad->batch_data_dir == REQ_SYNC)
- /* TODO! add a check so a complete fifo gets written? */
- return time_after(jiffies, ad->current_batch_expires);
+ return time_after(jiffies, ad->current_batch_expires) &&
+ time_after(jiffies, ad->fifo_expire[REQ_SYNC]);
- return time_after(jiffies, ad->current_batch_expires)
- || ad->current_write_count == 0;
+ return !ad->current_batch_expires;
}
/*
@@ -1187,9 +1154,8 @@
put_as_io_context(&ad->as_io_context);
ad->as_io_context = NULL;
}
-
- if (ad->current_write_count != 0)
- ad->current_write_count--;
+ if (ad->current_batch_expires)
+ ad->current_batch_expires--;
}
ad->aic_finished = 0;
@@ -1218,12 +1184,6 @@
const int reads = !list_empty(&ad->fifo_list[REQ_SYNC]);
const int writes = !list_empty(&ad->fifo_list[REQ_ASYNC]);
- /* Signal that the write batch was uncontended, so we can't time it */
- if (ad->batch_data_dir == REQ_ASYNC && !reads) {
- if (ad->current_write_count == 0 || !writes)
- ad->write_batch_idled = 1;
- }
-
if (!(reads || writes)
|| ad->antic_status == ANTIC_WAIT_REQ
|| ad->antic_status == ANTIC_WAIT_NEXT
@@ -1288,8 +1248,7 @@
if (ad->batch_data_dir == REQ_SYNC)
ad->changed_batch = 1;
ad->batch_data_dir = REQ_ASYNC;
- ad->current_write_count = ad->write_batch_count;
- ad->write_batch_idled = 0;
+ ad->current_batch_expires = ad->batch_expire[REQ_ASYNC];
arq = ad->next_arq[ad->batch_data_dir];
goto dispatch_request;
}
@@ -1311,11 +1270,9 @@
if (ad->changed_batch) {
if (ad->changed_batch == 1 && ad->nr_dispatched)
return 0;
- if (ad->batch_data_dir == REQ_ASYNC) {
- ad->current_batch_expires = jiffies +
- ad->batch_expire[REQ_ASYNC];
+ if (ad->changed_batch == 1 && ad->batch_data_dir == REQ_ASYNC)
ad->changed_batch = 0;
- } else
+ else
ad->changed_batch = 2;
arq->request->flags |= REQ_HARDBARRIER;
}
@@ -1727,9 +1684,6 @@
e->elevator_data = ad;
ad->current_batch_expires = jiffies + ad->batch_expire[REQ_SYNC];
- ad->write_batch_count = ad->batch_expire[REQ_ASYNC] / 10;
- if (ad->write_batch_count < 2)
- ad->write_batch_count = 2;
return 0;
}
Robert Love <[email protected]> wrote:
>
> On Wed, 2003-06-11 at 16:34, Robert Love wrote:
>
> > I will debunk both theories: its not Radeon (I have a Matrox) and its
> > not the pci-init-ordering-fix patch (I already tried that).
>
> Ah, it is the anticipatory I/O scheduler.
>
> There is a logic thinko somewhere... I have not found it yet, but I have
> narrowed it down to something which the attached patch fixes (i.e.,
> apply this patch and the problem is fixed).
>
> Maybe Nick can see the bug and short circuit my search? The problem is
> related to the as-autotune-write-batches patch.
Do you know what the actual oops is?
Odd that starting the X server triggers it. Be interesting if your patch
fixes things for Brian.
There appear to be several divide-by-zero possibilities in there. A random
patch would be:
diff -puN drivers/block/as-iosched.c~a drivers/block/as-iosched.c
--- 25/drivers/block/as-iosched.c~a Wed Jun 11 17:23:42 2003
+++ 25-akpm/drivers/block/as-iosched.c Wed Jun 11 17:23:42 2003
@@ -950,13 +950,13 @@ void update_write_batch(struct as_data *
write_time = 0;
if (write_time > batch + 5 && !ad->write_batch_idled) {
- if (write_time / batch > 2)
+ if (batch && (write_time / batch > 2))
ad->write_batch_count /= 2;
else
ad->write_batch_count--;
} else if (write_time + 5 < batch && ad->current_write_count == 0) {
- if (batch / write_time > 2)
+ if (write_time && (batch / write_time > 2))
ad->write_batch_count *= 2;
else
ad->write_batch_count++;
_
On Wed, 2003-06-11 at 17:24, Andrew Morton wrote:
> Do you know what the actual oops is?
I got it all figured out now.
It is a divide by zero in update_write_batch() called from
as_completed_request().
> Odd that starting the X server triggers it. Be interesting if your patch
> fixes things for Brian.
I reproduced it without X.
The divide by zero is on line 959 with the divide by 'write_time'. It
can obviously be zero (see line 950). The divide by 'batch' on line 953
seems safe.
The correct patch is below.
Most important question: why are only some of us seeing this?
Robert Love
Fix as-iosched divide-by-zero bug.
drivers/block/as-iosched.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff -urN linux-2.5.70-mm8/drivers/block/as-iosched.c linux/drivers/block/as-iosched.c
--- linux-2.5.70-mm8/drivers/block/as-iosched.c 2003-06-11 17:12:02.000000000 -0700
+++ linux/drivers/block/as-iosched.c 2003-06-11 18:04:15.222619392 -0700
@@ -954,9 +954,9 @@
ad->write_batch_count /= 2;
else
ad->write_batch_count--;
-
+
} else if (write_time + 5 < batch && ad->current_write_count == 0) {
- if (batch / write_time > 2)
+ if (write_time && (batch / write_time > 2))
ad->write_batch_count *= 2;
else
ad->write_batch_count++;
Robert Love <[email protected]> wrote:
>
> Fix as-iosched divide-by-zero bug.
hrm, OK. Still not convinced about `batch'.
How about this?
--- 25/drivers/block/as-iosched.c~as-div-by-zero-fix 2003-06-11 18:17:04.000000000 -0700
+++ 25-akpm/drivers/block/as-iosched.c 2003-06-11 18:20:58.000000000 -0700
@@ -930,13 +930,12 @@ void update_write_batch(struct as_data *
write_time = 0;
if (write_time > batch + 5 && !ad->write_batch_idled) {
- if (write_time / batch > 2)
+ if (write_time > batch * 2)
ad->write_batch_count /= 2;
else
ad->write_batch_count--;
-
} else if (write_time + 5 < batch && ad->current_write_count == 0) {
- if (batch / write_time > 2)
+ if (batch > write_time * 2)
ad->write_batch_count *= 2;
else
ad->write_batch_count++;
_
On Wed, 2003-06-11 at 18:22, Andrew Morton wrote:
> hrm, OK. Still not convinced about `batch'.
batch is only zero if {read,write}_batch_expire are zero. Nick, is that
legal/desirable? Or should we prevent that in the sysfs interface?
> - if (write_time / batch > 2)
> + if (write_time > batch * 2)
>
> - if (batch / write_time > 2)
> + if (batch > write_time * 2)
Much better! :)
Robert Love
Robert Love wrote:
>On Wed, 2003-06-11 at 17:24, Andrew Morton wrote:
>
>
>>Do you know what the actual oops is?
>>
>
>I got it all figured out now.
>
>It is a divide by zero in update_write_batch() called from
>as_completed_request().
>
>
>>Odd that starting the X server triggers it. Be interesting if your patch
>>fixes things for Brian.
>>
>
>I reproduced it without X.
>
>The divide by zero is on line 959 with the divide by 'write_time'. It
>can obviously be zero (see line 950). The divide by 'batch' on line 953
>seems safe.
>
>The correct patch is below.
>
Probably put in the other check to be on the safe side.
And can the check be if (!write_time || (batch / write_time > 2)
>
>
>Most important question: why are only some of us seeing this?
>
It would occur if a write batch didn't take any jiffies, which
isn't very likely. The HZ=100 change probbly brought it out.
Thanks guys.
On Wed, 2003-06-11 at 18:22, Nick Piggin wrote:
> It would occur if a write batch didn't take any jiffies, which
> isn't very likely. The HZ=100 change probbly brought it out.
I reverted that change (yah yah, I am no help). I have seen the problem
since mm6, but just today got around to looking into it.
Anyhow, its fixed.
Robert Love
Andrew Morton wrote:
>Robert Love <[email protected]> wrote:
>
>>Fix as-iosched divide-by-zero bug.
>>
>
>hrm, OK. Still not convinced about `batch'.
>
>How about this?
>
Yeah, thats the way to do it, of course. It was too
jumpy at that setting though, so make it batch*3
(or <<1+batch if you don't want the multiply).
>
>--- 25/drivers/block/as-iosched.c~as-div-by-zero-fix 2003-06-11 18:17:04.000000000 -0700
>+++ 25-akpm/drivers/block/as-iosched.c 2003-06-11 18:20:58.000000000 -0700
>@@ -930,13 +930,12 @@ void update_write_batch(struct as_data *
> write_time = 0;
>
> if (write_time > batch + 5 && !ad->write_batch_idled) {
>- if (write_time / batch > 2)
>+ if (write_time > batch * 2)
> ad->write_batch_count /= 2;
> else
> ad->write_batch_count--;
>-
> } else if (write_time + 5 < batch && ad->current_write_count == 0) {
>- if (batch / write_time > 2)
>+ if (batch > write_time * 2)
> ad->write_batch_count *= 2;
> else
> ad->write_batch_count++;
>
>_
>
>
>
>
Robert Love wrote:
>On Wed, 2003-06-11 at 18:22, Andrew Morton wrote:
>
>
>>hrm, OK. Still not convinced about `batch'.
>>
>
>batch is only zero if {read,write}_batch_expire are zero. Nick, is that
>legal/desirable? Or should we prevent that in the sysfs interface?
>
Yeah it shouldn't really be allowed, but I think everything
will keep working (with this change), even writes.
On Wed, 2003-06-11 at 19:10, Robert Love wrote:
> On Wed, 2003-06-11 at 17:24, Andrew Morton wrote:
>
> > Do you know what the actual oops is?
>
> I got it all figured out now.
>
> It is a divide by zero in update_write_batch() called from
> as_completed_request().
>
> > Odd that starting the X server triggers it. Be interesting if your patch
> > fixes things for Brian.
>
> I reproduced it without X.
>
> The divide by zero is on line 959 with the divide by 'write_time'. It
> can obviously be zero (see line 950). The divide by 'batch' on line 953
> seems safe.
>
> The correct patch is below.
>
> Most important question: why are only some of us seeing this?
>
> Robert Love
With regards to the last, here is an anti-AOL! for the oops. I ran
2.5.70-mm8 for several hours today, doing kernel compiles and running
dbench 64 on ext3, xfs, and jfs. No oops.
All while running X (although that now seems moot). Base distro is RH9
if that could matter. System is UP (PIII), PREEMPT, IDE, i810 chipset.
Steven
On Wed, 2003-06-11 at 18:54, Steven Cole wrote:
> With regards to the last, here is an anti-AOL! for the oops. I ran
> 2.5.70-mm8 for several hours today, doing kernel compiles and running
> dbench 64 on ext3, xfs, and jfs. No oops.
>
> All while running X (although that now seems moot). Base distro is RH9
> if that could matter. System is UP (PIII), PREEMPT, IDE, i810 chipset.
Right. Most people are not seeing this.
I have a system very similar to yours, interestingly. It is just random
timings I guess.
Robert Love
Nick> Yeah, thats the way to do it, of course. It was too jumpy at
Nick> that setting though, so make it batch*3 (or <<1+batch if you
Nick> don't want the multiply).
Aren't we trying to get away from magic constants like this? Or at
least a better idea of why batch*3 is better than batch*2? I will
admit I haven't had the chance to peer into the code, so I'm probably
just being stupid (and lazy) here to speak up.
I guess the real question I have is what happens if we make it
batch*100, how does the affect the algorithm? And if going from 2 to
3 makes such a difference, doesn't that point to a scaling issue,
i.e. we should have 200 and 300 here, so we can try out 250 as an
intermediate value.
*shrug* Just trying to understand...
John
John Stoffel wrote:
>Nick> Yeah, thats the way to do it, of course. It was too jumpy at
>Nick> that setting though, so make it batch*3 (or <<1+batch if you
>Nick> don't want the multiply).
>
>Aren't we trying to get away from magic constants like this? Or at
>least a better idea of why batch*3 is better than batch*2? I will
>admit I haven't had the chance to peer into the code, so I'm probably
>just being stupid (and lazy) here to speak up.
>
>I guess the real question I have is what happens if we make it
>batch*100, how does the affect the algorithm? And if going from 2 to
>3 makes such a difference, doesn't that point to a scaling issue,
>i.e. we should have 200 and 300 here, so we can try out 250 as an
>intermediate value.
>
>*shrug* Just trying to understand...
>
Its not very critical. It is used to estimate how many
requests will take t ms to complete, based on how many
requests we sent last time around, and how long that
took.
As long as it increment when we were below estimate and
decrement when above it should be OK. The mul/div were
put there in order to adapt more quickly to changes in
the request pattern.
You would think that if a batch of 10 requests took
twice as long as we wanted, we might as well submit 5
requests, but in testing I found that this makes the
numbers jump around too much, while a 3* threshold
smoothed it over while still adapting to changes nicely.
On Wed, 2003-06-11 at 17:24, Andrew Morton wrote:
> Odd that starting the X server triggers it. Be interesting if your patch
> fixes things for Brian.
I think Robert and I are seeing different things. For me, -mm6 is fine
(unlike Robert's case), -mm7 oopses in the PCI init code during early
boot (somewhere in the radeon init stuff, can't capture the oops
easily), and -mm8 gives itself a wedgie a few seconds after starting X.
I'm about to try, um, whichever of the umpty-ump patches that went back
and forth looks most plausible.
<b
On Thu, 2003-06-12 at 10:19, Bryan O'Sullivan wrote:
> I'm about to try, um, whichever of the umpty-ump patches that went back
> and forth looks most plausible.
Tried your two-liner, Andrew, but -mm8 is as freezy as ever. I'll see
if I can hook up a serial console and find an oops at some point.
<b
On Thu, 12 Jun 2003, Bryan O'Sullivan wrote:
> On Wed, 2003-06-11 at 17:24, Andrew Morton wrote:
>
> > Odd that starting the X server triggers it. Be interesting if your patch
> > fixes things for Brian.
>
> I think Robert and I are seeing different things. For me, -mm6 is fine
> (unlike Robert's case), -mm7 oopses in the PCI init code during early
> boot (somewhere in the radeon init stuff, can't capture the oops
> easily), and -mm8 gives itself a wedgie a few seconds after starting X.
>
> I'm about to try, um, whichever of the umpty-ump patches that went back
> and forth looks most plausible.
I'm having total freezes with 2.5.69 in both my home laptop with a SiS650
chipset and in my machine at work with Intel Corp. 82845G/GL. Using X with
Gnome (RH9) the system will end up to a completely frozen state after a
random amount of time. This happens with almost no activity on the machine
that makes me thing to be not related to some kind of load. IRQ are
disabled and the IDE drive light remains on. I planned to debug the thing
but I didn't have time yet. I set up the NMI oopser and I need to do
something to get the dump since when the NMI trigger I'm in graphic mode.
I was thinking about LKCD. It has never happened in console mode so it
must be X/Gnome+2.5.69
- Davide