2003-03-21 07:47:25

by Andrew Morton

Subject: 2.5.65-mm3


http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.65/2.5.65-mm3/

Will appear later at:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.65/2.5.65-mm3/


. There is a significant one-line fix to the CFQ IO scheduler here. It
possibly invalidates testing which was previously performed against CFQ.

. Added Hugh's new rmap-without-pte_chains-for-anon-pages patches. Mainly
for interested parties to test and benchmark at this stage.

It seems to be stable, however it is not clear that this passes the
benefit-vs-disruption test.



Changes since 2.5.65-mm2:


+posix-timers-fixes.patch

Fix the nanosleep-sleeps-forever bug

+as-remove-frontmerge.patch
+as-misc-cleanups.patch

Anticipatory scheduler cleanups and simplifications

+cfq-fix.patch

Fix the large pauses which the CFQ scheduler was prone to hitting.

+anobjrmap-1-rmap_h.patch
+anobjrmap-2-mapping.patch
+anobjrmap-3-unchained.patch
+anobjrmap-4-anonmm.patch
+anobjrmap-5-rechained.patch
+anobjrmap-6-arches.patch

Remove pte_chains for anonymous page reverse mappings.

+anobjrmap-ttfb-no-BUG.patch

Don't go BUG over truncated ext3 pages

+timer-simplification.patch

Remove some duplicated info from timer data structures

+timer-lockup-fix-simplification.patch

Simplify the timer lockup fix.

+slab-large-obj-tuning.patch

Don't cache huge objects in slab.

-pagecache-accounting-speedup.patch

This got broken and I need to fix it.

+floppy-oops-fix.patch

Fix an oops in the floppy driver

+ext3_writepage-use-after-free-fix.patch

Fix a rare ext3 bug

+list-barriers-on-smp-only.patch

Optimise list_head operations for uniprocessors.

+sync_filesystems-docco-lock.patch

Documentation and livelock/starvation avoidance

+awe_wave-linkage-error-fix.patch

__init section fixes

+conntrack-use-after-free-fix.patch

Maybe fix use-after-free in netfilter

+syscalls-return-long.patch
+syscalls-return-long-2.patch

Correct return type for system calls



All 127 patches

linus.patch
Latest from Linus

mm.patch
add -mmN to EXTRAVERSION

kgdb.patch

kgdb-cleanup.patch
make kgdb less invasive (when disabled)

posix-timers-fixes.patch
sys_nanosleep() fix

proc-sys-debug.patch
create /proc/sys/debug/0 ... 7

config_spinline.patch
uninline spinlocks for profiling accuracy.

ppc64-reloc_hide.patch

ppc64-pci-patch.patch
Subject: pci patch

ppc64-aio-32bit-emulation.patch
32/64bit emulation for aio

ppc64-scruffiness.patch
Fix some PPC64 compile warnings

sym-do-160.patch
make the SYM driver do 160 MB/sec

config-PAGE_OFFSET.patch
Configurable kernel/user memory split

ptrace-flush.patch
cache flushing in the ptrace code

buffer-debug.patch
buffer.c debugging

warn-null-wakeup.patch

ext3-truncate-ordered-pages.patch
ext3: explicitly free truncated pages

reiserfs_file_write-5.patch

tcp-wakeups.patch
Use fast wakeups in TCP/IPV4

rcu-stats.patch
RCU statistics reporting

ext3-journalled-data-assertion-fix.patch
Remove incorrect assertion from ext3

nfs-speedup.patch

nfs-oom-fix.patch
nfs oom fix

sk-allocation.patch
Subject: Re: nfs oom

nfs-more-oom-fix.patch

rpciod-atomic-allocations.patch
Make rpciod use atomic allocations

linux-isp.patch

isp-update-1.patch

kblockd.patch
Create `kblockd' workqueue

as-iosched.patch
anticipatory I/O scheduler

as-debug-BUG-fix.patch

as-eject-BUG-fix.patch
AS: don't go BUG during cdrom eject

as-jumbo-fix.patch
AS: OSDL fixes

as-request_fn-in-timer.patch
Remove the scheduled_work thing

as-remove-request-fix.patch

as-np-1.patch
as: cleanups & comments

as-use-kblockd.patch

as-cleanup-2.patch
AS: cleanup + comments

as-as_remove_request-simplification.patch
as: as_remove_request simplification

as-dont-go-BUG-again.patch

as-handle-non-block-requests.patch
AS: handle non-block requests

as-np-reads-1.patch
AS: read-vs-read fixes

as-np-reads-2.patch
AS: more read-vs-read fixes

as-predict-data-direction.patch
as: predict direction of next IO

as-remove-frontmerge.patch
AS: remove frontmerge tunable

as-misc-cleanups.patch
AS: misc cleanups

cfq-2.patch
CFQ scheduler, #2

cfq-fix.patch
cfq queued bugfix

unplug-use-kblockd.patch
Use kblockd for running request queues

remap-file-pages-2.5.63-a1.patch
Subject: [patch] remap-file-pages-2.5.63-A1

hugh-remap-fix.patch
hugh's file-offset-in-pte fix

fremap-limit-offsets.patch
fremap: limit remap_file_pages() file offsets

fremap-all-mappings.patch
Make all executable mappings be nonlinear

filemap_populate-speedup.patch
filemap_populate speedup

file-offset-in-pte-x86_64.patch
x86_64: support for file offsets in pte's

file-offset-in-pte-ppc64.patch

objrmap-2.5.62-5.patch
object-based rmap

objrmap-nonlinear-fixes.patch
objrmap fix for nonlinear

anobjrmap-1-rmap_h.patch
anobjrmap 1/6 rmap.h

anobjrmap-2-mapping.patch
Subject: [PATCH] anobjrmap 2/6 mapping

anobjrmap-3-unchained.patch
anobjrmap 3/6 unchained

anobjrmap-4-anonmm.patch
anobjrmap 4/6 anonmm

anobjrmap-5-rechained.patch
anobjrmap 5/6 rechained

anobjrmap-6-arches.patch
anobjrmap 6/6 arches

anobjrmap-ttfb-no-BUG.patch

sched-2.5.64-D3.patch
sched-2.5.64-D3, more interactivity changes

scheduler-tunables.patch
scheduler tunables

show_task-free-stack-fix.patch
show_task() fix and cleanup

yellowfin-set_bit-fix.patch
yellowfin driver set_bit fix

htree-nfs-fix.patch
Fix ext3 htree / NFS compatibility problems

update_atime-ng.patch
inode a/c/mtime modification speedup

one-sec-times.patch
Implement a/c/time speedup in ext2 & ext3

task_prio-fix.patch
simple task_prio() fix

slab_store_user-large-objects.patch
slab debug: perform redzoning against larger objects

pcmcia-2.patch

pcmcia-3b.patch

pcmcia-3.patch

pcmcia-4.patch

pcmcia-5.patch

pcmcia-6.patch

pcmcia-7b.patch

pcmcia-7.patch

pcmcia-8.patch

pcmcia-9.patch

pcmcia-10.patch

htree-nfs-fix-2.patch
htree nfs fix

ext2-no-lock_super.patch
concurrent block allocation for ext2

ext2-ialloc-no-lock_super.patch
concurrent inode allocation for ext2

brlock-1b.patch
Re: 2.5.64-mm8 breaks MASQ

brlock-removal-2.patch
brlock removal 2/5: remove brlock from snap and vlan

brlock-removal-3.patch
brlock removal 3/5: remove brlock from bridge

brlock-removal-4.patch
brlock removal 4/5: removal from ipv4/ipv6

brlock-removal-5.patch
brlock removal 5/5: remove brlock code

lseek-ext2_readdir.patch
remove lock_kernel() from readdir implementations.

inode_setattr-lock_kernel-removal.patch
remove lock_kernel() from inode_setattr's vmtruncate() call

ide_probe-init_irq-fix.patch
ide-probe init_irq cleanup

raid1-fix.patch
MD RAID1 fix

nmi-watchdog-fix.patch
NMI watchdog fix

vm_enough_memory-speedup.patch
speed up vm_enough_memory()

nanosleep-accuracy-fix-2.patch
fix nanosleep() granularity bumps

linear-oops-fix-1.patch
md/linear oops fix

dev_t-1-kill-cdev.patch
dev_t [1/3]: kill cdev

dev_t-2-remove-MAX_CHRDEV.patch
dev_t [2/3] - remove MAX_CHRDEV

dev_t-3-major_h-cleanup.patch
dev_t [3/3]: major.h cleanups

dev_t-32-bit.patch
[for playing only] change type of dev_t

dev_t-drm-warnings.patch
dev_t: fix drm printk warnings

dev_t-remove-B_FREE.patch
dev_t: eliminate B_FREE

smalldevfs.patch
smalldevfs

cpufreq-xtime-locking.patch
add write_seqlock to cpufreq change notifier for TSC

cs46xx-fixes.patch
cs46xx minor fixes

notsclock-option.patch
boot time parameter to turn off TSC usage

tty-put_user-checks.patch
Add missing put_user checks in n_tty

fail-setup_irq-for-unconfigured-IRQs.patch
Fail setup_irq for unconfigured IRQs

raw-fix-address_space-rewriting.patch
raw driver: rewrite i_mapping only on final close

raw-cleanups-and-fixlets.patch
raw driver: cleanups and small fixes

oops-dump-preceding-code.patch
i386 oops output: dump preceding code

timer-simplification.patch
timer simplification

timer-lockup-fix-simplification.patch
simplify the timer lockup avoidance code

slab-large-obj-tuning.patch
slab: tune batchcounts for large objects

floppy-oops-fix.patch
Fix floppy oops on forced unload

ext3_writepage-use-after-free-fix.patch
ext3: fix use-after-free bug

list-barriers-on-smp-only.patch
make list.h barriers smp-only

sync_filesystems-docco-lock.patch
sync_filesystems commentary and latency fix

awe_wave-linkage-error-fix.patch
fix .text.exit error in OSS awe_wave.c

conntrack-use-after-free-fix.patch
fix use-after-free in ip_conntrack

syscalls-return-long.patch
Make arch-independent syscalls return long

syscalls-return-long-2.patch
More syscalls-returning-long




2003-03-21 10:54:55

by Andrew Morton

Subject: Re: 2.5.65-mm3

Alexander Hoogerhuis wrote:
>
> Andrew Morton writes:
> >
> > [SNIP]
> >
>
> ...
> make[4]: *** [net/ipv4/netfilter/ip_conntrack_core.o] Error 1

Bah, sorry.

--- 25/net/ipv4/netfilter/ip_conntrack_core.c~a 2003-03-21 03:04:45.000000000 -0800
+++ 25-akpm/net/ipv4/netfilter/ip_conntrack_core.c 2003-03-21 03:04:48.000000000 -0800
@@ -274,7 +274,7 @@ static void remove_expectations(struct i
* the un-established ones only */
if (exp->sibling) {
DEBUGP("remove_expectations: skipping established %p of %p\n", exp->sibling, ct);
- exp->sibling =3D NULL;
+ exp->sibling = NULL;
continue;
}


_

2003-03-21 11:54:20

by Alexander Hoogerhuis

Subject: Re: 2.5.65-mm3

Andrew Morton writes:

> Alexander Hoogerhuis wrote:
> >
> > Andrew Morton writes:
> > >
> > > [SNIP]
> > >
> >
> > ...
> > make[4]: *** [net/ipv4/netfilter/ip_conntrack_core.o] Error 1
>
> Bah, sorry.
>
> --- 25/net/ipv4/netfilter/ip_conntrack_core.c~a 2003-03-21 03:04:45.000000000 -0800
> +++ 25-akpm/net/ipv4/netfilter/ip_conntrack_core.c 2003-03-21 03:04:48.000000000 -0800
> @@ -274,7 +274,7 @@ static void remove_expectations(struct i
> * the un-established ones only */
> if (exp->sibling) {
> DEBUGP("remove_expectations: skipping established %p of %p\n", exp->sibling, ct);
> - exp->sibling =3D NULL;
> + exp->sibling = NULL;
> continue;
> }
>

Restarting my PCMCIA init.d script ended with this one in my log:

Unable to handle kernel NULL pointer dereference at virtual address 00000004
printing eip:
c0211457
*pde = 00000000
Oops: 0002 [#1]
CPU: 0
EIP: 0060:[<c0211457>] Not tainted VLI
EFLAGS: 00210246
EIP is at devclass_remove_driver+0x4b/0x8f
eax: f4d88b48 ebx: f0964664 ecx: 00000000 edx: 00000000
esi: f0964620 edi: f4d88b00 ebp: c9a59f28 esp: c9a59f14
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 5665, threadinfo=c9a58000 task=de569380)
Stack: c02b8c07 00000042 c0331440 c0331400 f4d88b00 c9a59f44 c0210d5b f4d88b00
00000042 f4d88b0c f4d88b00 c02fa6f8 c9a59f5c c0211129 f4d88b00 f4d88c80
f4d88c80 c02fa6f8 c9a59f98 f0948326 f4d88b00 0000001a 00000000 00000019
Call Trace:
[<f4d88b00>] i82365_driver+0x0/0x80 [i82365]
[<c0210d5b>] bus_remove_driver+0x5f/0x97
[<f4d88b00>] i82365_driver+0x0/0x80 [i82365]
[<f4d88b0c>] i82365_driver+0xc/0x80 [i82365]
[<f4d88b00>] i82365_driver+0x0/0x80 [i82365]
[<c0211129>] driver_unregister+0x1a/0x44
[<f4d88b00>] i82365_driver+0x0/0x80 [i82365]
[<f4d88c80>] +0x0/0x200 [i82365]
[<f4d88c80>] +0x0/0x200 [i82365]
[<f0948326>] init_i82365+0x127/0x131 [i82365]
[<f4d88b00>] i82365_driver+0x0/0x80 [i82365]
[<f095d940>] +0x1e0/0x397 [pcmcia_core]
[<c01300ed>] sys_init_module+0x13f/0x21d
[<c010ad8f>] syscall_call+0x7/0xb

Code: 42 00 00 00 c7 04 24 07 8c 2b c0 e8 2a 89 f0 ff 89 d8 ba 01 00 ff ff 0f c1 10 85 d2 0f 85 f1 03 00 00 8d 47 48 8b 57 48 8b 48 04 <89> 4a 04 89 11 89 40 04 89 47 48 89 3c 24 e8 68 fe ff ff 89 d8
<6>cs: IO port probe 0x0c00-0x0cff: clean.
cs: IO port probe 0x0800-0x08ff: clean.
cs: IO port probe 0x0100-0x04ff: excluding 0x3c0-0x3df 0x3f8-0x3ff 0x4d0-0x4d7
cs: IO port probe 0x1000-0x17ff: excluding 0x1000-0x107f 0x1100-0x113f 0x1200-0x121f
cs: IO port probe 0x0a00-0x0aff: clean.
lapper root #

Apart from that, the machine seems alive :)

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-21 14:04:16

by Felipe Alfaro Solana

Subject: Re: 2.5.65-mm3

----- Original Message -----
From: Alexander Hoogerhuis
Date: 21 Mar 2003 11:58:18 +0100
To: Andrew Morton
Subject: Re: 2.5.65-mm3

> Andrew Morton writes:
> >
> > [SNIP]
> >
>
> gcc -Wp,-MD,net/ipv4/netfilter/.ip_conntrack_core.o.d -D__KERNEL__ -Iinclude -Wall -Wstrict-prototypes
> -Wno-trigraphs -O2 -fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=pentium4
> -Iinclude/asm-i386/mach-default
> -nostdinc -iwithprefix include -DMODULE -DKBUILD_BASENAME=ip_conntrack_core -c -o
> net/ipv4/netfilter/.tmp_ip_conntrack_core.o net/ipv4/netfilter/ip_conntrack_core.c
> net/ipv4/netfilter/ip_conntrack_core.c: In function `remove_expectations':
> net/ipv4/netfilter/ip_conntrack_core.c:276: invalid suffix on integer constant
> net/ipv4/netfilter/ip_conntrack_core.c:276: called object is not a function
> make[4]: *** [net/ipv4/netfilter/ip_conntrack_core.o] Error 1
> make[3]: *** [net/ipv4/netfilter] Error 2
> make[2]: *** [net/ipv4] Error 2
> make[1]: *** [net] Error 2
> make: *** [modules] Error 2
>

Edit line 276 of net/ipv4/netfilter/ip_conntrack_core.c and simply
remove the '3D' characters that follow the equals sign (=).
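
For context, "=3D" is how quoted-printable mail encoding writes a literal "=", so the stray characters most likely came from a patch that went through a mailer undecoded. The following is a minimal sketch of decoding such escapes, not a full quoted-printable decoder (soft line breaks are not handled); the function names qp_decode() and hex_value() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/*
 * Hypothetical helper: decode "=XX" escapes from `in` into `out`.
 * `out` must have room for at least strlen(in) + 1 bytes.
 * Anything that is not a well-formed "=XX" escape is copied unchanged.
 * Returns the length of the decoded string.
 */
size_t qp_decode(const char *in, char *out)
{
    size_t n = 0;

    while (*in) {
        int hi, lo;

        /* Short-circuit evaluation keeps us from reading past the NUL. */
        if (in[0] == '=' &&
            (hi = hex_value((unsigned char)in[1])) >= 0 &&
            (lo = hex_value((unsigned char)in[2])) >= 0) {
            out[n++] = (char)(hi * 16 + lo);
            in += 3;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Passing the broken source line "exp->sibling =3D NULL;" through qp_decode() yields "exp->sibling = NULL;", which matches the one-line fix posted earlier in the thread.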


2003-03-21 20:06:25

by Seth Chandler

Subject: Re: 2.5.65-mm3



Andrew,

I'm getting some (sort of) random NFS auth errors with -mm2 and -mm3.
Sometimes the directories I export get exported read-only, so I can't edit
them on my NFS clients.

When I'm running 2.5.65 from BK, the problem doesn't exist; it's only when I
switch to the -mm branch that it manifests itself. I was going to back out the
nfs patches and see if I could find the culprit.


thanks,

seth
On Friday 21 March 2003 02:58, Andrew Morton wrote:
> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.65/2.5.65-mm3/
>
> Will appear later at:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.65/2.5.65-mm3/
>

2003-03-21 20:08:55

by Robert Love

Subject: Re: 2.5.65-mm3

On Fri, 2003-03-21 at 02:58, Andrew Morton wrote:

> dev_t-3-major_h-cleanup.patch
> dev_t [3/3]: major.h cleanups
>
> dev_t-32-bit.patch
> [for playing only] change type of dev_t

Now that dev_t is an unsigned long, MKDEV() correspondingly returns an
unsigned long. This causes a compiler warning and potential bug on
64-bit architectures in drivers/scsi/sg.c :: sg_device_kdev_read().

This patch needs to be applied on top of the dev_t patches.

Robert Love


drivers/scsi/sg.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)


diff -urN linux-2.5.65-mm3/drivers/scsi/sg.c linux/drivers/scsi/sg.c
--- linux-2.5.65-mm3/drivers/scsi/sg.c 2003-03-17 16:44:05.000000000 -0500
+++ linux/drivers/scsi/sg.c 2003-03-19 11:35:50.706607408 -0500
@@ -1331,9 +1331,11 @@
sg_device_kdev_read(struct device *driverfs_dev, char *page)
{
Sg_device *sdp = list_entry(driverfs_dev, Sg_device, sg_driverfs_dev);
- return sprintf(page, "%x\n", MKDEV(sdp->disk->major,
- sdp->disk->first_minor));
+
+ return sprintf(page, "%lx\n", MKDEV(sdp->disk->major,
+ sdp->disk->first_minor));
}
+
static DEVICE_ATTR(kdev,S_IRUGO,sg_device_kdev_read,NULL);

static ssize_t
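
The point of the patch above can be shown in userspace: once the value being printed is an unsigned long, the format string needs "%lx" rather than "%x", otherwise the argument type and the conversion disagree on 64-bit targets. This is a minimal sketch only; EX_MKDEV, EX_MINORBITS and format_kdev() are hypothetical stand-ins and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for illustration, assuming an 8-bit minor field. */
#define EX_MINORBITS 8
#define EX_MKDEV(ma, mi) \
    ((((unsigned long)(ma)) << EX_MINORBITS) | (unsigned long)(mi))

/*
 * Format a device number as hex followed by a newline, the way the
 * patched sysfs read routine does. "%lx" matches the unsigned long
 * that EX_MKDEV() produces.
 * Returns the number of characters snprintf() reports.
 */
int format_kdev(char *page, size_t len, unsigned int major, unsigned int minor)
{
    return snprintf(page, len, "%lx\n", EX_MKDEV(major, minor));
}
```

With major 21 and minor 0 the buffer holds "1500" and a newline, since 21 << 8 is 0x1500. With "%x" the same call would draw a format warning from gcc and, on 64-bit ABIs, is undefined behaviour.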



2003-03-21 20:31:00

by Andrew Morton

Subject: Re: [BUG] 2.5.65-mm3 kernel BUG at fs/ext3/super.c:1795!

Alexander Hoogerhuis wrote:
>
> Andrew Morton writes:
> >
> > [SNIP]
> >
>
> Disk I/O on my machine froze up during very light work after a few
> hours, luckily I had a window open on another machine so I could do a
> simple capture and save the info:
>
> kernel BUG at fs/ext3/super.c:1795!
> invalid operand: 0000 [#1]
> CPU: 0
> EIP: 0060:[<c018b522>] Not tainted VLI
> EFLAGS: 00010246
> EIP is at ext3_write_super+0x36/0x94
> eax: 00000000 ebx: c8834000 ecx: efb5904c edx: efb59000
> esi: efb59000 edi: c8834000 ebp: c8835ecc esp: c8835ec0
> ds: 007b es: 007b ss: 0068
> Process pdflush (pid: 7853, threadinfo=c8834000 task=ed0a5880)
> Stack: c8835ee4 00000287 efb5904c c8835ee4 c0153148 efb59000 00000077 51eb851f
> c8835fcc c8835fa4 c0137fd0 c03892fc 007b9f47 007b168f 00000000 00000000
> c8835ef4 00000000 00000001 00000000 00000001 00000000 00000053 00000000
> Call Trace:
> [<c0153148>] sync_supers+0xde/0xea
> [<c0137fd0>] wb_kupdate+0x68/0x161
> [<c0118985>] schedule+0x1a4/0x3ac
> [<c01386e8>] __pdflush+0xdc/0x1d8
> [<c01387e4>] pdflush+0x0/0x15
> [<c01387f5>] pdflush+0x11/0x15
> [<c0137f68>] wb_kupdate+0x0/0x161
> [<c0108e69>] kernel_thread_helper+0x5/0xb

How on earth did you do that?

sync_supers() does lock_super, then calls ext3_write_super.

ext3_write_super() does a down_trylock() on sb->s_lock and goes BUG
if it acquired the lock.

So you've effectively done this:

down(&sem);
if (down_trylock(&sem) == 0)	/* 0 means the lock was acquired */
	BUG();

This can only be a random memory scribble, a hardware bug or a
preempt-related bug in down_trylock().
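
The invariant behind that reasoning can be checked with a userspace analogy. down_trylock() returns 0 when it acquires the semaphore, so a trylock on a lock that is already held should never return 0. The sketch below uses a POSIX mutex as a stand-in for the kernel semaphore, which is an assumption made purely for illustration; trylock_result_while_held() is a made-up name.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Userspace analogy only: pthread_mutex_lock()/pthread_mutex_trylock()
 * stand in for down()/down_trylock(). Both trylock variants return 0
 * on success and nonzero when the lock is already held.
 * Returns the result of a trylock attempted while the lock is held.
 */
int trylock_result_while_held(void)
{
    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    int rc;

    pthread_mutex_lock(&lock);          /* like down(&sem) */
    rc = pthread_mutex_trylock(&lock);  /* like down_trylock(&sem) */
    if (rc == 0)
        pthread_mutex_unlock(&lock);    /* undo an unexpected second acquire */
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return rc;
}
```

On a correctly working system the function returns EBUSY. A return of 0 would be the equivalent of the BUG() reported here: the second acquire succeeded even though the lock was supposedly held.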

2003-03-22 02:44:26

by Alexander Hoogerhuis

Subject: Re: [BUG] 2.5.65-mm3 kernel BUG at fs/ext3/super.c:1795!

Andrew Morton writes:

> Alexander Hoogerhuis wrote:
> >
> > Andrew Morton writes:
> > >
> > > [SNIP]
> > >
> >
> > Disk I/O on my machine froze up during very light work after a few
> > hours, luckily I had a window open on another machine so I could do a
> > simple capture and save the info:
> >
> > kernel BUG at fs/ext3/super.c:1795!
> > invalid operand: 0000 [#1]
> > CPU: 0
> > EIP: 0060:[<c018b522>] Not tainted VLI
> > EFLAGS: 00010246
> > EIP is at ext3_write_super+0x36/0x94
> > eax: 00000000 ebx: c8834000 ecx: efb5904c edx: efb59000
> > esi: efb59000 edi: c8834000 ebp: c8835ecc esp: c8835ec0
> > ds: 007b es: 007b ss: 0068
> > Process pdflush (pid: 7853, threadinfo=c8834000 task=ed0a5880)
> > Stack: c8835ee4 00000287 efb5904c c8835ee4 c0153148 efb59000 00000077 51eb851f
> > c8835fcc c8835fa4 c0137fd0 c03892fc 007b9f47 007b168f 00000000 00000000
> > c8835ef4 00000000 00000001 00000000 00000001 00000000 00000053 00000000
> > Call Trace:
> > [<c0153148>] sync_supers+0xde/0xea
> > [<c0137fd0>] wb_kupdate+0x68/0x161
> > [<c0118985>] schedule+0x1a4/0x3ac
> > [<c01386e8>] __pdflush+0xdc/0x1d8
> > [<c01387e4>] pdflush+0x0/0x15
> > [<c01387f5>] pdflush+0x11/0x15
> > [<c0137f68>] wb_kupdate+0x0/0x161
> > [<c0108e69>] kernel_thread_helper+0x5/0xb
>
> How on earth did you do that?
>
> sync_supers() does lock_super, then calls ext3_write_super.
>
> ext3_write_super() does a down_trylock() on sb->s_lock and goes BUG
> if it acquired the lock.
>
> So you've effectively done this:
>
> down(&sem);
> if (down_trylock(&sem) == 0)	/* 0 means the lock was acquired */
> 	BUG();
>
> This can only be a random memory scribble, a hardware bug or a
> preempt-related bug in down_trylock().

Heh. My "portable murphy field" is powerful. Honestly, all I did was
to have a few gnome-terminals, an emacs or two, a few mozillas and a
bit more up, same as always, and it "just happened" (that's what all
kids claim when they break stuff) :)

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-22 12:48:38

by Alexander Hoogerhuis

Subject: 2.5.65-mm3 bad: scheduling while atomic! [SCSI]

Andrew Morton writes:
>
> [SNIP]
>

Here's a few more funnies caught while burning a CD:

leep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Debug: sleeping function called from illegal context at include/asm/semaphore.h:119
Call Trace:
[<c0119d92>] __might_sleep+0x5f/0x65
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6ce9>] scsi_sleep+0x77/0xa6 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c011c8a4>] printk+0x11d/0x17b
[<c0109bbe>] __down+0x91/0xf9
[<c0118baa>] default_wake_function+0x0/0x12
[<c010b1dd>] dump_stack+0x11/0x15
[<c0109dcb>] __down_failed+0xb/0x14
[<f08a76e8>] .text.lock.scsi_error+0x37/0x47 [scsi_mod]
[<f08ae5d9>] +0x2119/0x2c80 [scsi_mod]
[<f08a6c5e>] scsi_sleep_done+0x0/0x14 [scsi_mod]
[<f4d548e7>] idescsi_abort+0xf1/0xfa [ide_scsi]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a63bd>] scsi_try_to_abort_cmd+0x65/0x80 [scsi_mod]
[<f08a6506>] scsi_eh_abort_cmds+0x41/0xdb [scsi_mod]
[<f08a72f8>] scsi_unjam_host+0x165/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

ide-scsi: reset called for 12543
bad: scheduling while atomic!
Call Trace:
[<c0118b55>] schedule+0x3a4/0x3a9
[<c0122d9e>] add_timer+0x99/0xa5
[<c01238e4>] schedule_timeout+0x5a/0xab
[<c011c8a4>] printk+0x11d/0x17b
[<c012387e>] process_timeout+0x0/0xc
[<f4d549f8>] idescsi_reset+0x108/0x11e [ide_scsi]
[<f4d54fa0>] +0x400/0x423 [ide_scsi]
[<f08a65f7>] scsi_try_bus_device_reset+0x57/0x8d [scsi_mod]
[<f08a66b0>] scsi_eh_bus_device_reset+0x83/0x17c [scsi_mod]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a7071>] scsi_eh_ready_devs+0x28/0x74 [scsi_mod]
[<f08a7315>] scsi_unjam_host+0x182/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

hdc: DMA disabled
------------[ cut here ]------------
kernel BUG at kernel/timer.c:172!
invalid operand: 0000 [#1]
CPU: 0
EIP: 0060:[<c0122da0>] Not tainted
EFLAGS: 00010002 VLI
EIP is at add_timer+0x9b/0xa5
eax: 00000001 ebx: efc4a480 ecx: c02ffe80 edx: c03986f8
esi: e7000000 edi: efc4a4a4 ebp: e7001e58 esp: e7001e44
ds: 007b es: 007b ss: 0068
Process scsi_eh_0 (pid: 4794, threadinfo=e7000000 task=e7082780)
Stack: c03986f8 c03986f8 efc4a480 e7000000 00000086 e7001e88 c0223fb8 efc4a4a4
c022f45b c03986f8 c03986f8 00000000 00000032 c02240a6 00000000 00000000
c03986f8 e7001eb8 c022468f c03986f8 c02240a6 00000032 00000000 efc4a480
Call Trace:
[<c0223fb8>] ide_set_handler+0x5f/0xa0
[<c022f45b>] __ide_dma_off+0x2b/0x32
[<c02240a6>] atapi_reset_pollfunc+0x0/0x11a
[<c022468f>] do_reset1+0x211/0x22e
[<c02240a6>] atapi_reset_pollfunc+0x0/0x11a
[<c02246dd>] ide_do_reset+0x31/0x84
[<f4d549d1>] idescsi_reset+0xe1/0x11e [ide_scsi]
[<f08a65f7>] scsi_try_bus_device_reset+0x57/0x8d [scsi_mod]
[<f08a66b0>] scsi_eh_bus_device_reset+0x83/0x17c [scsi_mod]
[<f4d547f6>] idescsi_abort+0x0/0xfa [ide_scsi]
[<f08a7071>] scsi_eh_ready_devs+0x28/0x74 [scsi_mod]
[<f08a7315>] scsi_unjam_host+0x182/0x217 [scsi_mod]
[<c0117020>] do_page_fault+0x0/0x4bf
[<f08a74fe>] scsi_error_handler+0x154/0x1c0 [scsi_mod]
[<f08a73aa>] scsi_error_handler+0x0/0x1c0 [scsi_mod]
[<c0108e69>] kernel_thread_helper+0x5/0xb

Code: e0 08 75 20 83 6e 14 01 8b 46 08 83 e0 08 75 08 83 c4 08 5b 5e 5f 5d c3 83 c4 08 5b 5e 5f 5d e9 c1 5d ff ff e8 bc 5d ff ff eb d9 <0f> 0b ac 00 2e 3f 2c c0 eb 8b 55 31 c0 89 e5 83 ec 14 89 7d fc
<6>note: scsi_eh_0[4794] exited with preempt_count 3
hdc: ATAPI reset complete

mvh,
A
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-22 13:26:22

by Alan

Subject: Re: 2.5.65-mm3 bad: scheduling while atomic! [SCSI]

On Sat, 2003-03-22 at 12:38, Alexander Hoogerhuis wrote:
> Andrew Morton <[email protected]> writes:
> >
> > [SNIP]
> >
>
> Here's a few more funnies caught while burning a CD:

ide-scsi is known broken in 2.5, and will stay that way for a little
while yet I suspect. I sent Linus the infrastructure needed to fix
it yesterday.

2003-03-22 15:57:46

by Martin J. Bligh

Subject: 2.5.65-mm2 vs 2.5.65-mm3 (full objrmap)

> . Added Hugh's new rmap-without-pte_chains-for-anon-pages patches. Mainly
> for interested parties to test and benchmark at this stage.
>
> It seems to be stable, however it is not clear that this passes the
> benefit-vs-disruption test.

I see very little impact either way. My initial analysis showed that 90%
of the anonymous mappings were singletons, so the chain manipulation costs
are probably very low. If there's a workload that has long anonymous chains,
and manipulates them a lot, that might benefit.

However, I thought there might be some benefit in the fork/exec cycle
(which presumably sets up a new chain instead of the direct mapping then
tears it down again) ... but seemingly not. Did you keep the pte_direct
optimisation? That seems worth keeping, with partial objrmap as well
(I think that was removed in Dave's patch, but would presumably be easy
to put back). Or maybe we just need some more tuning ;-)

Results from 16x NUMA-Q below (that seems to have severe problems with
pte_chains, so is a good testbed for these things... )

Kernbench: (make -j N vmlinux, where N = 2 x num_cpus)
                Elapsed   System     User       CPU
   2.5.65-mm2     44.04    80.63   566.83   1469.25
   2.5.65-mm3     44.21    80.57   567.61   1466.00

Kernbench: (make -j N vmlinux, where N = 16 x num_cpus)
                Elapsed   System     User       CPU
   2.5.65-mm2     44.27    88.56   574.21   1496.75
   2.5.65-mm3     44.10    89.24   574.70   1503.75

Kernbench: (make -j vmlinux, maximal tasks)
                Elapsed   System     User       CPU
   2.5.65-mm2     44.30    86.09   572.75   1488.25
   2.5.65-mm3     44.36    86.86   573.28   1486.25


DISCLAIMER: SPEC(tm) and the benchmark name SDET(tm) are registered
trademarks of the Standard Performance Evaluation Corporation. This
benchmarking was performed for research purposes only, and the run results
are non-compliant and not-comparable with any published results.

Results are shown as percentages of the first set displayed

SDET 1 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       2.2%
   2.5.65-mm3        98.9%       2.0%

SDET 2 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       2.7%
   2.5.65-mm3        97.1%       2.3%

SDET 4 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       1.3%
   2.5.65-mm3       103.5%       1.3%

SDET 8 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       1.0%
   2.5.65-mm3        98.0%       1.0%

SDET 16 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       0.6%
   2.5.65-mm3        99.4%       1.0%

SDET 32 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       0.2%
   2.5.65-mm3       101.7%       0.5%

SDET 64 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       0.1%
   2.5.65-mm3       101.0%       0.4%

SDET 128 (see disclaimer)
                Throughput   Std. Dev
   2.5.65-mm2       100.0%       0.5%
   2.5.65-mm3       100.8%       0.6%


diffprofile (kernbench, + worse in -mm3)

353 0.0% set_page_dirty
284 0.0% try_to_unmap_one
224 17.1% page_add_rmap
193 5.8% find_get_page
142 1.8% d_lookup
113 10.6% link_path_walk
67 0.0% page_dup_rmap
56 9.4% __wake_up
53 0.0% rmap_get_cpu
46 8.8% find_vma
45 8.6% fd_install
44 0.0% page_turn_rmap
40 14.6% do_lookup
37 9.9% .text.lock.file_table
36 2.1% buffered_rmqueue
35 5.5% copy_process
34 10.5% pgd_ctor
33 97.1% exit_mmap
31 4.9% handle_mm_fault
...
-36 -80.0% profile_exit_mmap
-51 -14.9% pte_alloc_map
-53 -27.5% install_page
-99 -100.0% pte_chain_alloc
-127 -8.4% free_hot_cold_page
-128 -0.9% do_anonymous_page
-158 -100.0% __pte_chain_free
-238 -16.4% do_no_page
-293 -12.0% page_remove_rmap
-330 -0.2% total
-355 -100.0% __set_page_dirty_buffers
-666 -1.4% default_idle



diffprofile (sdet, + worse in -mm3)

2139 0.0% try_to_unmap_one
2032 0.0% page_dup_rmap
1508 0.0% set_page_dirty
1448 0.0% page_turn_rmap
223 9.9% link_path_walk
169 2.8% .text.lock.dcache
168 3.0% .text.lock.namei
158 1.9% find_get_page
139 1.2% d_lookup
104 6.8% .text.lock.attr
98 32.9% exit_mmap
97 5.0% d_alloc
93 21.8% generic_delete_inode
92 89.3% __blk_queue_bounce
90 0.0% rmap_get_cpu
83 0.9% .text.lock.dec_and_lock
70 0.0% dup_rmap
69 0.6% atomic_dec_and_lock
67 24.6% find_group_other
65 8.2% new_inode
59 0.9% path_lookup
50 4.8% prune_dcache
50 2.9% .text.lock.base
...
-51 -22.6% ext2_new_block
-56 -6.1% read_block_bitmap
-60 -3.6% __read_lock_failed
-66 -3.7% current_kernel_time
-67 -4.0% __find_get_block
-78 -21.9% group_reserve_blocks
-83 -2.5% do_anonymous_page
-84 -22.3% truncate_inode_pages
-85 -5.4% real_lookup
-87 -27.6% unlock_page
-106 -59.6% profile_exit_mmap
-107 -100.0% pte_chain_alloc
-134 -41.0% install_page
-158 -100.0% __pte_chain_free
-170 -6.4% kmem_cache_free
-170 -9.9% __wake_up
-202 -41.7% grab_block
-253 -1.7% copy_page_range
-266 -3.9% __copy_to_user_ll
-328 -1.7% zap_pte_range
-626 -6.9% release_pages
-679 -19.0% __down
-730 -23.9% do_no_page
-1492 -100.0% __set_page_dirty_buffers
-2051 -0.6% default_idle
-3399 -22.3% page_remove_rmap
-4052 -54.8% page_add_rmap
-6486 -1.0% total
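
For readers decoding the two diffprofile columns, a minimal sketch of the
arithmetic follows. It assumes the first column is the change in profile
ticks (-mm3 minus -mm2) and the second is that change as a percentage of
the -mm2 count, printed as 0.0% when the function had no ticks in -mm2.
That reading is inferred from the rows above (for example pte_chain_alloc
at -99 and -100.0%), not taken from the diffprofile script itself, and
toy_diffprofile is a hypothetical name.

```c
/* Toy model of one diffprofile row; not the real script. */
struct toy_delta {
    long ticks;      /* new_ticks - old_ticks, positive means worse */
    double percent;  /* change relative to the old count */
};

static struct toy_delta toy_diffprofile(long old_ticks, long new_ticks)
{
    struct toy_delta d;

    d.ticks = new_ticks - old_ticks;
    /* Assumption: a function absent from the old profile shows 0.0%. */
    d.percent = old_ticks ? 100.0 * (double)d.ticks / (double)old_ticks : 0.0;
    return d;
}
```

Under that assumption, a function that vanished from the profile (99 ticks
before, none after) yields -99 and -100.0%, and a function that only exists
in the new kernel yields its full tick count and 0.0%.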



2003-03-22 23:35:44

by Hugh Dickins

Subject: Re: 2.5.65-mm2 vs 2.5.65-mm3 (full objrmap)

On Sat, 22 Mar 2003, Martin J. Bligh wrote:
>
> I see very little impact either way. My initial analysis showed that 90%
> of the anonymous mappings were singletons, so the chain manipulation costs
> are probably very low. If there's a workload that has long anonymous chains,
> and manipulates them a lot, that might benefit.

It would, yes - but like you I'm unable to name that workload
(aside from one of my own tests, not much use to the wide world).

> However, I thought there might be some benefit in the fork/exec cycle
> (which presumably sets up a new chain instead of the direct mapping then
> tears it down again) ... but seemingly not.

I do see such benefit, but disappointingly little. In kernel builds,
say 1% to 3% consistently (on a given machine with given jN) off the
system time; but the user time correspondingly up (eh? lock step tick
issue? cache oddity?), and elapsed time either same or slightly up.
oprofiles didn't enlighten me.

Your figures don't seem to show even that reduction in system time;
though I think you were comparing 2.5.65-mm2 against 2.5.65-mm3,
whereas I was comparing 2.5.65-mm3 with 2.5.65-mm3 minus anobjrmap.
It's conceivable there's something else in -mm3 affecting results.

> Did you keep the pte_direct
> optimisation? That seems worth keeping, with partial objrmap as well
> (I think that was removed in Dave's patch, but would presumably be easy
> to put back).

Dave didn't remove it at all, just went another way so that it became
irrelevant to obj rmaps (or you could say, every obj rmap direct,
apart from the sys_remap_file_pages). I did the same with anonymous,
they're almost all direct (since a given anon page is almost always
mapped at the same user virtual address in whatever mms it appears),
the exception needing chains coming from a perverse use of mremap.

The clearest advantage of anobjrmap so far is for your HIGHMEM64G
HIGHPTE configurations: which had a 64-bit direct pte_addr_t in
struct page, now just a 32-bit count like in the non-PAE configs.
(Though that saving could have been achieved in other ways.)

> Or maybe we just need some more tuning ;-)

Be nice if a magic wand would make it go faster, but it seems too
simple for tuning. A lot of effort went into speeding up pte_chains,
looks like the effort paid off. (It's particularly helpful that the
chains got collapsed back to direct lazily, by page_referenced, not
by page_remove_rmap - that means a repetitively forking process
was not perpetually convulsed in allocating and freeing chains).

Hugh

2003-03-23 00:56:56

by Martin J. Bligh

Subject: Re: 2.5.65-mm2 vs 2.5.65-mm3 (full objrmap)

>> Did you keep the pte_direct
>> optimisation? That seems worth keeping, with partial objrmap as well
>> (I think that was removed in Dave's patch, but would presumably be easy
>> to put back).
>
> Dave didn't remove it at all, just went another way so that it became
> irrelevant to obj rmaps (or you could say, every obj rmap direct,
> apart from the sys_remap_file_pages). I did the same with anonymous,
> they're almost all direct (since a given anon page is almost always
> mapped at the same user virtual address in whatever mms it appears),
> the exception needing chains coming from a perverse use of mremap.

OK, so you're saying we can still use the direct map for singletons
that are filebacked? I thought that disappeared for some reason ...
don't recall what. I just thought it'd save some time on the lookup
side of the equation .. but I'm not sure our testing is doing any
lookup ;-)

> The clearest advantage of anobjrmap so far is for your HIGHMEM64G
> HIGHPTE configurations: which had a 64-bit direct pte_addr_t in
> struct page, now just a 32-bit count like in the non-PAE configs.
> (Though that saving could have been achieved in other ways.)

Ah, I don't run highpte, too much performance impact from kmap, even
once they were made atomic instead of persistent. Were you using highpte
in your tests? shpte seems to work much better in terms of performance,
and control the high-use cases for ptes ... I think the UKVA based
version with each process permanently mapping its own pagetables will
perform much better.

>> Or maybe we just need some more tuning ;-)
>
> Be nice if a magic wand would make it go faster, but it seems too
> simple for tuning. A lot of effort went into speeding up pte_chains,
> looks like the effort paid off.

Well, I think the real key is that we're hardly using pte_chains any
more with the partial objrmap code ... they're mostly direct mapped
singletons anyway, so you're not saving much. I had a crude /proc
thingy to draw histograms of chain length somewhere that I did my
initial analysis on, I'll try to dig it out.

Did you measure either partial objrmap or anon-objrmap under memory
pressure?

> (It's particularly helpful that the
> chains got collapsed back to direct lazily, by page_referenced, not
> by page_remove_rmap - that means a repetitively forking process
> was not perpetually convulsed in allocating and freeing chains).

mmm ... can you explain that one a bit more? I think I missed that
bit, and maybe it explains why we don't see too much impact from
the fork/exec stuff for anon pages.

Thanks,

M.

2003-03-23 08:01:26

by Hugh Dickins

Subject: Re: 2.5.65-mm2 vs 2.5.65-mm3 (full objrmap)

On Sat, 22 Mar 2003, Martin J. Bligh wrote:
> >> Did you keep the pte_direct
> >> optimisation? That seems worth keeping, with partial objrmap as well
> >> (I think that was removed in Dave's patch, but would presumably be easy
> >> to put back).
> >
> > Dave didn't remove it at all, just went another way so that it became
> > irrelevant to obj rmaps (or you could say, every obj rmap direct,
> > apart from the sys_remap_file_pages). I did the same with anonymous,
> > they're almost all direct (since a given anon page is almost always
> > mapped at the same user virtual address in whatever mms it appears),
> > the exception needing chains coming from a perverse use of mremap.
>
> OK, so you're saying we can still use the direct map for singletons
> that are filebacked? I thought that disappeared for some reason ...
> don't recall what. I just thought it'd save some time on the lookup
> side of the equation .. but I'm not sure our testing is doing any
> lookup ;-)

Sorry, no, that's not what I meant. I was looking at it from the
perspective that with objrmap the file page has no pte_chains at all
(except in the sys_remap_file_pages case), and with anobjrmap the
anon page also has no chains at all (except in odd mremap case).
Thinking of the chain as the thing you waste time on adding a page
to and removing a page from (more the latter).

But when it comes to lookup (page_referenced or try_to_unmap), yes,
with objrmap all file pages are chained (via page->mapping->i_mmap
and page->mapping->i_mmap_shared lists of vmas); and with anobjrmap
all anon pages are chained (via page->anonmm->mm). So you could
say they've abandoned the page direct map (but I think it came in
as a space saving, to prevent every single mapped page from needing
a struct pte_chain attached, not as a lookup optimization).

There's no doubt (except insofar as actual measurement can spring
surprises!) that the pte_chain-based rmap is much more efficient at
locating ptes referencing a page, than objrmap or objrmap+anobjrmap.
The problem with pte_chain-based rmap is that it's faster at the
operations we expect to be slow, and slower at the common operations
we need to be fast (adding and removing a page).
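
The lookup cost on the objrmap side can be illustrated with a toy
userspace model of the file-page case. This is a sketch under stated
assumptions, not the -mm code: struct toy_vma and toy_obj_lookup are
hypothetical names, the vma list stands in for the i_mmap lists, and 4KB
pages are assumed. For each vma mapping the file, the candidate user
address is derived from the page's index in the file; a real lookup would
then still have to walk that mm's page tables to find the pte, whereas a
pte_chain hands over the pte pointers directly.

```c
#include <stddef.h>

#define TOY_PAGE_SHIFT 12   /* assumption: 4KB pages */

struct toy_vma {
    unsigned long vm_start;      /* first user address covered */
    unsigned long vm_end;        /* one past the last address covered */
    unsigned long vm_pgoff;      /* file offset of vm_start, in pages */
    const struct toy_vma *next;  /* next vma mapping the same file */
};

/*
 * Visit every vma that maps the file and record the address at which
 * file page 'index' would appear, if that vma covers it.  Returns the
 * number of addresses stored; *visited counts vmas examined, which is
 * the per-lookup cost that pte_chains avoid.
 */
static int toy_obj_lookup(const struct toy_vma *vma, unsigned long index,
                          unsigned long *addrs, int max_addrs, int *visited)
{
    int found = 0;

    *visited = 0;
    for (; vma != NULL; vma = vma->next) {
        unsigned long addr;

        (*visited)++;
        if (index < vma->vm_pgoff)
            continue;            /* page lies before this mapping */
        addr = vma->vm_start + ((index - vma->vm_pgoff) << TOY_PAGE_SHIFT);
        if (addr >= vma->vm_end)
            continue;            /* page lies beyond this mapping */
        if (found < max_addrs)
            addrs[found++] = addr;
    }
    return found;
}
```

Adding or removing a mapping of a page costs nothing in this scheme, since
no per-page structure is touched; the work is deferred to the lookup, which
scales with the number of vmas rather than the number of actual mappers.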

To make up your mind whether we've preserved or abandoned the
page direct optimization, I think you'd better look at the code:
it's just different.

> > The clearest advantage of anobjrmap so far is for your HIGHMEM64G
> > HIGHPTE configurations: which had a 64-bit direct pte_addr_t in
> > struct page, now just a 32-bit count like in the non-PAE configs.
> > (Though that saving could have been achieved in other ways.)
>
> Ah, I don't run highpte, too much performance impact from kmap, even
> once they were made atomic instead of persistent. Were you using highpte
> in your tests? shpte seems to work much better in terms of performance,
> and control the high-use cases for ptes ... I think the UKVA based
> version with each process permanently mapping its own pagetables will
> perform much better.

I build with highpte on one machine for testing purposes, I don't have
enough memory for it actually to be important. I'm almost always
working with Andrew's trees, so was using shpte when it was in, but
not since. I like the idea of shpte (and the UKVA idea), but I couldn't
see its constituency - the small processes immediately needed to cow
all their page tables, and the large ones should have been using
huge pages instead (or such was my misperception).

> >> Or maybe we just need some more tuning ;-)
> >
> > Be nice if a magic wand would make it go faster, but it seems too
> > simple for tuning. A lot of effort went into speeding up pte_chains,
> > looks like the effort paid off.
>
> Well, I think the real key is that we're hardly using pte_chains any
> more with the partial objrmap code ... they're mostly direct mapped
> singletons anyway, so you're not saving much. I had a crude /proc
> thingy to draw histograms of chain length somewhere that I did my
> initial analysis on, I'll try to dig it out.
>
> Did you measure either partial objrmap or anon-objrmap under memory
> pressure?

No. I'd expect, and be content with, some slowdown there:
if it's not obvious then it does not matter.

> > (It's particularly helpful that the
> > chains got collapsed back to direct lazily, by page_referenced, not
> > by page_remove_rmap - that means a repetitively forking process
> > was not perpetually convulsed in allocating and freeing chains).
>
> mmm ... can you explain that one a bit more? I think I missed that
> bit, and maybe it explains why we don't see too much impact from
> the fork/exec stuff for anon pages.

When a process forks, each page it had mapped gains one more reference.
With pte_chain-based rmap that means one more pte pointer has to be
added to the page: so if it was PageDirect before, now a struct
pte_chain has to be allocated and the now two pointers put there.
If the forked child immediately execs, its copy of the mm is
immediately torn down and the page references return to what they
were before. Naively I'd expect page_remove_rmap to be tidy and
collapse pte_chain back to PageDirect and free the struct, but in
fact it doesn't bother, leaving that collapse for the next
page_referenced check. And that's a good strategy for processes
which do a lot of forking+execing, they won't be forever switching
between direct and chained.

Hugh
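
The lazy-collapse behaviour described above can be modelled in a few lines
of userspace C. This is a toy sketch under stated assumptions, not the 2.5
rmap code: struct toy_page, struct toy_chain and the toy_* functions are
hypothetical stand-ins for struct page, struct pte_chain, page_add_rmap,
page_remove_rmap and page_referenced, and ptes are plain numbers. The point
it shows is that removal never frees the chain, so a fork+exec loop
allocates a chain once instead of once per cycle.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_CHAIN_MAX 4     /* toy limit, enough for this illustration */

struct toy_chain {
    int nr;                             /* live entries in ptes[] */
    unsigned long ptes[TOY_CHAIN_MAX];
};

struct toy_page {
    int direct;                 /* 1: 'pte' is the only mapping */
    unsigned long pte;          /* valid when direct */
    struct toy_chain *chain;    /* valid when not direct, may be NULL */
};

static int toy_chain_allocs;    /* how many chains were ever allocated */

static void toy_add_rmap(struct toy_page *p, unsigned long pte)
{
    if (!p->direct && p->chain == NULL) {   /* first mapping: go direct */
        p->direct = 1;
        p->pte = pte;
        return;
    }
    if (p->direct) {            /* second mapping: convert to a chain */
        p->chain = calloc(1, sizeof(*p->chain));
        assert(p->chain != NULL);
        toy_chain_allocs++;
        p->chain->ptes[p->chain->nr++] = p->pte;
        p->direct = 0;
    }
    assert(p->chain->nr < TOY_CHAIN_MAX);
    p->chain->ptes[p->chain->nr++] = pte;
}

static void toy_remove_rmap(struct toy_page *p, unsigned long pte)
{
    int i;

    if (p->direct) {
        assert(p->pte == pte);
        p->direct = 0;
        return;
    }
    assert(p->chain != NULL);
    for (i = 0; i < p->chain->nr; i++) {
        if (p->chain->ptes[i] == pte) {
            p->chain->ptes[i] = p->chain->ptes[--p->chain->nr];
            return;             /* deliberately no collapse here */
        }
    }
}

/* The lazy part: only the reference scan tidies a short chain. */
static void toy_page_referenced(struct toy_page *p)
{
    if (p->direct || p->chain == NULL || p->chain->nr > 1)
        return;
    if (p->chain->nr == 1) {
        p->pte = p->chain->ptes[0];
        p->direct = 1;
    }
    free(p->chain);
    p->chain = NULL;
}

/* Parent maps one page, then forks and execs 'cycles' times. */
static int toy_fork_exec_cycles(int cycles)
{
    struct toy_page page = { 0, 0, NULL };
    int start = toy_chain_allocs;
    int i;

    toy_add_rmap(&page, 0x1000);            /* parent's pte */
    for (i = 0; i < cycles; i++) {
        toy_add_rmap(&page, 0x2000);        /* fork: child maps the page */
        toy_remove_rmap(&page, 0x2000);     /* exec: child mm torn down */
    }
    toy_page_referenced(&page);             /* collapse happens only now */
    return toy_chain_allocs - start;
}
```

In this model toy_fork_exec_cycles(1000) allocates a single chain. Had
toy_remove_rmap collapsed eagerly, every cycle would have paid for an
allocation and a free.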

2003-03-23 08:45:29

by William Lee Irwin III

Subject: Re: 2.5.65-mm2 vs 2.5.65-mm3 (full objrmap)

On Sun, Mar 23, 2003 at 08:14:12AM +0000, Hugh Dickins wrote:
> I build with highpte on one machine for testing purposes, I don't have
> enough memory for it actually to be important. I'm almost always
> working with Andrew's trees, so was using shpte when it was in, but
> not since. I like the idea of shpte (and the UKVA idea), but I couldn't
> see its constituency - the small processes immediately needed to cow
> all their page tables, and the large ones should have been using
> huge pages instead (or such was my misperception).

There's some recent benchmark data from hrandoz showing shpte is
actually doing very well on the speed front lately.

As far as constituency goes, I mostly see two groups benefiting:
unintelligent forking servers (unfortunately these are all too common
and tend not to cooperate by using hugetlb etc.) and smaller machines
wanting to trim kernel memory consumption. I personally see 2-3MB of
pagetable memory savings from it with end-user workstation loads (X,
xterms, xmms, web browsers, very little dynamic fork()/exec()'ing etc.),
which IMHO is a substantial reduction of the runtime footprint of the
kernel. It alone also conserves 5-6MB of pte_chains in addition to ptes
w/o objrmap. I've also overheard strong interest from software vendors.

-- wli