2003-01-04 08:52:17

by Andrew Morton

Subject: 2.5.54-mm3


http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/

Several patches here which fix pretty much the last source of long
scheduling latency stalls in the core kernel - long-held page_table_lock
during pagetable teardown.
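
The general idea behind this kind of fix can be sketched outside the kernel. The following is only an illustration of the lock-break pattern (do a bounded batch of work, drop the lock, retake it), not code from the patches. The Python lock stands in for page_table_lock, and teardown_in_batches, batch_size and on_unlock are hypothetical names.

```python
import threading


def teardown_in_batches(pages, lock, batch_size=256, on_unlock=None):
    """Clear every entry in `pages`, holding `lock` for one batch at a time.

    Returns how many times the lock was acquired. The batch size of 256
    is an arbitrary illustrative value, not a number taken from the patches.
    """
    acquisitions = 0
    for start in range(0, len(pages), batch_size):
        with lock:
            acquisitions += 1
            end = min(start + batch_size, len(pages))
            for i in range(start, end):
                pages[i] = None  # stand-in for tearing down one mapping
        # The lock is released here, so a waiter can run before the next batch.
        if on_unlock is not None:
            on_unlock()
    return acquisitions


if __name__ == "__main__":
    table = list(range(1000))
    print(teardown_in_batches(table, threading.Lock()))
```

The worst-case hold time is then bounded by the batch size rather than by the size of the whole address space being torn down.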

The preemptible kernel now achieves around 500 microsecond worst-case
latency on a 500MHz PIII (with a slow memory system). This is about as
good as the 2.4 low-latency patch. Maybe better.

This is with ext2, and only with ext2. Other filesystems need work
to reach that level of performance.

Non-preemptible kernels will benefit as well. This sort of means that
preemptibility is only really needed for specialised multimedia/control
type apps. Opinions vary ;)

Filesystem mount and unmount is a problem. Probably, this will not be
addressed. People who have specialised latency requirements should avoid
using automounters and those gadgets which poll CDROMs for insertion events.

This work has broken the shared pagetable patch - it touches the same code
in many places. I shall put Humpty together again, but will not be
including it for some time. This is because there may be bugs in this
patch series which are accidentally fixed in the shared pagetable patch. So
shared pagetables will be reintegrated when these changes have had sufficient
testing.

Hugh, could you please closely review these changes sometime? Thanks.




Changes since 2.5.54-mm2:


-no-stem-compression.patch

It got fixed.

+linus.patch

latest drop from Linus

+devfs-mount-fix.patch

Fix a problem with mounting a devfs=y system with root=<number>

+nfsd-fix.patch

Fix a knfsd problem

+3c920.patch

3c920 support for 3c59x.c

-shpte-ng.patch

See above

+untypedef-mmu_gather.patch

Replace the mmu_gather_t typedef with `struct mmu_gather'.

+low-latency-page-unmapping.patch

Fix long-held spinlocks in exit_mmap() and unmap_region()

+smp-preempt-latency-fix.patch

Fix a cross-cpu problem which causes long scheduling stalls
on SMP+preempt

-#smaller-head-arrays.patch
+smaller-head-arrays.patch

I really do think those slab head arrays are too big.

-fix-ethernet-hash.patch

Jeff merged a fix



All 70 patches:

linus.patch
cset-1.930.1.15-to-1.977.txt.gz

kgdb.patch

log_buf_size.patch
move LOG_BUF_SIZE to header/config

nfsd-fix.patch
Subject: Re: nfsd woes

rcf.patch
run-child-first after fork

devfs-fix.patch

devfs-mount-fix.patch
devfs mount-time readdir fix and cleanup

dio-return-partial-result.patch

aio-direct-io-infrastructure.patch
AIO support for raw/O_DIRECT

deferred-bio-dirtying.patch
bio dirtying infrastructure

aio-direct-io.patch
AIO support for raw/O_DIRECT

aio-dio-debug.patch

dio-reduce-context-switch-rate.patch
Reduced wakeup rate in direct-io code

cputimes_stat.patch
Restore per-cpu time accounting, with a config option

misc.patch
misc fixes

3c920.patch
3c59x: 3c920 support

inlines-net.patch

rbtree-iosched.patch
rbtree-based IO scheduler

deadsched-fix.patch
deadline scheduler fix

copy_page_range-cleanup.patch
copy_page_range: minor cleanup

pte_chain_alloc-fix.patch
infrastructure for handling pte_chain_alloc() failures

page_add_rmap-rework.patch
handle pte_chain_alloc() failures

rat-preload.patch
infrastructure for handling radix_tree_node allocation failures

use-rat-preallocation.patch
handle radix_tree_node allocation failures

i_shared_sem.patch
turn i_shared_lock into a semaphore

cond_resched_lock-rework.patch
simplify and generalise cond_resched_lock

untypedef-mmu_gather.patch
replace `typedef mmu_gather_t' with `struct mmu_gather'

low-latency-page-unmapping.patch
low-latency pagetable teardown

smp-preempt-latency-fix.patch
Fix an SMP+preempt latency problem

smaller-head-arrays.patch

mempool_resize-fix.patch
mempool_resize fix

slab-redzone-cleanup.patch
slab: redzoning cleanup

shrink-kmap-space.patch
shrink the amount of vmalloc space reserved for kmap

setuid-exec-no-lock_kernel.patch
remove lock_kernel() from exec of setuid apps

ptrace-flush.patch
Subject: [PATCH] ptrace on 2.5.44

buffer-debug.patch
buffer.c debugging

warn-null-wakeup.patch

pentium-II.patch
Pentium-II support bits

rcu-stats.patch
RCU statistics reporting

auto-unplug.patch
self-unplugging request queues

less-unplugging.patch
Remove most of the blk_run_queues() calls

ext3-fsync-speedup.patch
Clean up ext3_sync_file()

lockless-current_kernel_time.patch
Lockless current_kernel_time()

scheduler-tunables.patch
scheduler tunables

dio-always-kmalloc.patch
direct-io: dynamically allocate struct dio

set_page_dirty_lock.patch
fix set_page_dirty vs truncate&free races

htlb-2.patch
hugetlb: fix MAP_FIXED handling

route-cache-kmalloc-per-cpu.patch
use kmalloc-per-cpu for the routecache stats

wli-01_numaq_io.patch
(undescribed patch)

wli-02_do_sak.patch
(undescribed patch)

wli-03_proc_super.patch
(undescribed patch)

wli-06_uml_get_task.patch
(undescribed patch)

wli-07_numaq_mem_map.patch
(undescribed patch)

wli-08_numaq_pgdat.patch
(undescribed patch)

wli-09_has_stopped_jobs.patch
(undescribed patch)

wli-10_inode_wait.patch
(undescribed patch)

wli-11_pgd_ctor.patch
(undescribed patch)

wli-11_pgd_ctor-update.patch
pgd_ctor update

wli-12_pidhash_size.patch
Dynamically size the pidhash hash table.

wli-13_rmap_nrpte.patch
(undescribed patch)

dcache_rcu-2.patch
dcache_rcu-2-2.5.51.patch

dcache_rcu-3.patch
dcache_rcu-3-2.5.51.patch

page-walk-api.patch

page-walk-api-2.5.53-mm2-update.patch
pagewalk API update

page-walk-scsi.patch

page-walk-scsi-2.5.53-mm2.patch
pagewalk scsi update


2003-01-04 15:39:24

by Steven Barnhart

Subject: Re: 2.5.54-mm3

On Sat, 04 Jan 2003 01:00:38 +0000, Andrew Morton wrote:

> Filesystem mount and unmount is a problem. Probably, this will not be
> addressed. People who have specialised latency requirements should avoid
> using automounters and those gadgets which poll CDROMs for insertion events.

That stinks...it doesn't work in .54 and I'd like to have my automounter
functioning again. Oh well it *is* 2.5.

> This work has broken the shared pagetable patch - it touches the same code
> in many places. I shall put Humpty together again, but will not be
> including it for some time. This is because there may be bugs in this
> patch series which are accidentally fixed in the shared pagetable patch. So
> shared pagetables will be reintegrated when these changes have had sufficient
> testing.

Also for some reason I always have to do a "touch /fastboot" and boot in
rw mode to boot the kernel. The kernel fails on remounting the fs in r-w mode.
X also doesn't work, saying /dev/agpgart doesn't exist even though it does and
I saw it. The agpgart module is loaded..maybe it would work as built into the
kernel? .config attached.

Steven


2003-01-04 21:10:32

by Andrew Morton

Subject: Re: 2.5.54-mm3

Steven Barnhart wrote:
>
> On Sat, 04 Jan 2003 01:00:38 +0000, Andrew Morton wrote:
>
> > Filesystem mount and unmount is a problem. Probably, this will not be
> > addressed. People who have specialised latency requirements should avoid
> > using automounters and those gadgets which poll CDROMs for insertion events.
>
> That stinks...it doesn't work in .54 and I'd like to have my automounter
> functioning again. Oh well it *is* 2.5.

autofsv4 has been working fine across the 2.5 series. You'll need to
send a (much) better report.

> > This work has broken the shared pagetable patch - it touches the same code
> > in many places. I shall put Humpty together again, but will not be
> > including it for some time. This is because there may be bugs in this
> > patch series which are accidentally fixed in the shared pagetable patch. So
> > shared pagetables will be reintegrated when these changes have had sufficient
> > testing.
>
> Also for some reason I always have to do a "touch /fastboot" and boot in
> rw mode to boot the kernel. The kernel fails on remounting the fs in r-w mode.

Many more details are needed. Sufficient for a developer to be able to
reproduce the problem.

> X also doesn't work, saying /dev/agpgart doesn't exist even though it does and
> I saw it. The agpgart module is loaded..maybe it would work as built into the
> kernel? .config attached.

You could try statically linking it, yes. More details are needed,
such as a description of what hardware you have and what driver you're
using.

2003-01-04 22:24:58

by Steven Barnhart

Subject: Re: 2.5.54-mm3


> autofsv4 has been working fine across the 2.5 series. You'll need to
> send a (much) better report.

I don't really know what the problem is. Everything seems to be working
right except when it goes to mount the system from ro mode to rw mode.
Everything goes downhill after that. I looked through
/var/log/messages and all those files but found nothing specific to the problem.
If I disable fsck and append rw, the kernel boots fine. One minor note: boot
also fails during "Mounting other filesystems" and gives the typical mount
error about a bad superblock or too many mounted filesystems. My .config was
attached before(?) and that's all I have. Is there anything particular you are
looking for?

> You could try statically linking it, yes. More details are needed,
> such as a description of what hardware you have and what driver you're
> using.

I have an Intel i810 graphics card/motherboard, an Intel Celeron 1.06 GHz
processor, and AGP 3 enabled; could that be the problem? I have enabled the
Intel i810 driver in the graphics area, as you can see in the .config. The
Intel driver seems to be enabled fine, as the Xfree/GDM log says
something about Intel. The only error is that it can't find the device
/dev/agpgart even though it *is* there. Is there any more info you need?

2003-01-04 22:38:17

by Andrew Morton

Subject: Re: 2.5.54-mm3

Steven Barnhart wrote:
>
> > autofsv4 has been working fine across the 2.5 series. You'll need to
> > send a (much) better report.
>
> I don't really know what the problem is. Everything seems to be working
> right except when it goes to mount the system from ro mode to rw mode.
> Everything goes downhill after that. I looked through
> /var/log/messages and all those files but found nothing specific to the problem.
> If I disable fsck and append rw, the kernel boots fine. One minor note: boot
> also fails during "Mounting other filesystems" and gives the typical mount
> error about a bad superblock or too many mounted filesystems. My .config was
> attached before(?) and that's all I have. Is there anything particular you are
> looking for?

Your .config was not attached.

There is a devfs mounting problem in 2.5.54. If you're using devfs
you may find that
http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/broken-out/devfs-mount-fix.patch
will help

> > You could try statically linking it, yes. More details are needed,
> > such as a description of what hardware you have and what driver you're
> > using.
>
> I have an Intel i810 graphics card/motherboard, an Intel Celeron 1.06 GHz
> processor, and AGP 3 enabled; could that be the problem? I have enabled the
> Intel i810 driver in the graphics area, as you can see in the .config. The
> Intel driver seems to be enabled fine, as the Xfree/GDM log says
> something about Intel. The only error is that it can't find the device
> /dev/agpgart even though it *is* there. Is there any more info you need?

The device node exists in /dev. It sounds like no kernel driver
has registered itself against that node's major/minor. Make really
sure that you have compiled the appropriate driver for your hardware;
things may have changed. If all else fails, send lspci and dmesg output
to this list and/or [email protected]
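
One way to check that from userspace is to read the node's major/minor and compare the major against the character-device list in /proc/devices. The following is only a rough sketch of that check. parse_char_majors and node_has_driver are hypothetical helper names, and for nodes on the misc major the per-minor registrations would have to be looked up separately (the assumption here is that /proc/misc is the place for that).

```python
import os
import stat


def parse_char_majors(proc_devices_text):
    """Return {major: name} from the 'Character devices:' section."""
    majors = {}
    in_char = False
    for line in proc_devices_text.splitlines():
        line = line.strip()
        if line == "Character devices:":
            in_char = True
        elif line == "Block devices:":
            in_char = False
        elif in_char and line:
            number, name = line.split(None, 1)
            majors[int(number)] = name
    return majors


def node_has_driver(path, proc_devices_path="/proc/devices"):
    """Return (major, minor, driver_name_or_None) for a character device node."""
    st = os.stat(path)
    if not stat.S_ISCHR(st.st_mode):
        raise ValueError(f"{path} is not a character device node")
    major, minor = os.major(st.st_rdev), os.minor(st.st_rdev)
    with open(proc_devices_path) as f:
        majors = parse_char_majors(f.read())
    # None means nothing has registered that major with the kernel.
    return major, minor, majors.get(major)


if __name__ == "__main__":
    print(node_has_driver("/dev/null"))
```

A result whose last element is None would match the symptom described above: the node is present in /dev, but nothing in the kernel answers for it.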

2003-01-04 23:33:04

by Steven Barnhart

Subject: Re: 2.5.54-mm3

On Sat, 04 Jan 2003 14:46:37 +0000, Andrew Morton wrote:

> Your .config was not attached.
>
> There is a devfs mounting problem in 2.5.54. If you're using devfs
> you may find that
> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/broken-out/devfs-mount-fix.patch
> will help

.config is now attached and no, I am not using devfs.

> The device node exists in /dev. It sounds like no kernel driver
> has registered itself against that node's major/minor. Make really
> sure that you have compiled the appropriate driver for your hardware;
> things may have changed. If all else fails, send lspci and dmesg output
> to this list and/or [email protected]

I also attached the dmesg output from 2.5.54. Hope that helps. Sometime
tonight/tomorrow I will reboot 2.5 and attempt to get the lspci output for
the list.

Steven



2003-01-05 02:12:30

by Steven Barnhart

Subject: Re: 2.5.54-mm3

Hey Andrew,

I changed some options in the filesystems (only) category. I seem to
have been able to reproduce the oops that I got at the very
beginning, when I was forced to basically do a minimized config. This time
the oops didn't flood my screen so I was able to get it. I hope the oops
report and the new .config will shed some light on this annoying
problem.

--snip--
unable to handle kernel paging request at virtual address ffffff8d
printing eip:
c0130693
*pde=00002067
*pte=00000000
Oops:0002
CPU:0
EIP:0060:[<c0130693>] Not tainted
E-flags:00010000
--snip--
--
Steven
[email protected]
GnuPG Fingerprint: 9357 F403 B0A1 E18D 86D5 2230 BB92 6D64 D516 0A94


Attachments:
.config (21.10 kB)

2003-01-05 14:26:28

by Michael Abshoff

Subject: Re: 2.5.54-mm3

I use an IBM X-30 laptop, based on an Intel i830 chipset, and experience
similar problems. Once or twice a week I get a read-only root filesystem
when booting. In that case dmesg shows the line
"Trying to unmount old root ... failed".

If I do a 'mount -o remount /' and then a 'rm /dev/null; mknod -m 666
/dev/null c 1 3', everything reverts back to normal.

I currently run a SuSE 8.0 kernel on ext2; .config, dmesg and lspci -vv
are attached. Since I have also experienced the problem with ext3 and
reiserfs on SuSE 8.0 and 8.1 kernels, the problem seems to be independent
of the filesystem used. I don't have a clue what to do next.

Any ideas?

Michael

--
Michael Abshoff - MRB - Universität Dortmund - Telefon 755-3463 (intern)

Where do you want to RTFM today?



Attachments:
config (39.84 kB)
dmesg (7.17 kB)
lspci (8.59 kB)

2003-01-05 17:56:16

by uaca

Subject: Re: 2.5.54-mm3

On Sat, Jan 04, 2003 at 01:00:38AM -0800, Andrew Morton wrote:
>
> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/

It seems to me that the patch you pointed to here doesn't include the latency
instrumentation.

Where is the instrumentation needed to measure it?

In http://www.zip.com.au/~akpm/linux/ there are no timepeg/intlat patches for
2.5...

>
> Several patches here which fix pretty much the last source of long
> scheduling latency stalls in the core kernel - long-held page_table_lock
> during pagetable teardown.
>
> The preemptible kernel now achieves around 500 microsecond worst-case
> latency on a 500MHz PIII (with a slow memory system). This is about as

[...]

Ulisses

Debian GNU/Linux: a dream come true
-----------------------------------------------------------------------------
"Computers are useless. They can only give answers." Pablo Picasso

---> Visita http://www.valux.org/ para saber acerca de la <---
---> Asociación Valenciana de Usuarios de Linux <---

2003-01-05 20:29:51

by Andrew Morton

Subject: Re: 2.5.54-mm3

[email protected] wrote:
>
> On Sat, Jan 04, 2003 at 01:00:38AM -0800, Andrew Morton wrote:
> >
> > http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/
>
> It seems to me that the patch you pointed to here doesn't include the latency
> instrumentation.

No, it doesn't. You can monitor the latency using realfeel or realfeel2
from http://www.zip.com.au/~akpm/linux/amlat.tar.gz

But that won't tell you _why_ large latencies are occurring. For that,
you'll need to apply
http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/experimental/rtc-debug.patch
and run `amlat'. This combination will spit out stack backtraces whenever
there is a 2 millisecond scheduling overrun.
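
For a rough feel of what "scheduling overrun" means here, the measurement can be approximated in userspace: sleep until a deadline and record how late the wakeup actually was. The following is only a sketch of that idea. It is not realfeel or amlat, it uses an ordinary sleep rather than any RTC-based mechanism, and worst_wakeup_overrun with its parameters is a hypothetical name.

```python
import time


def worst_wakeup_overrun(period_s=0.001, iterations=200):
    """Sleep to a deadline repeatedly; return the worst lateness in seconds."""
    worst = 0.0
    for _ in range(iterations):
        deadline = time.monotonic() + period_s
        time.sleep(period_s)
        # How far past the requested wakeup time did we actually resume?
        overrun = time.monotonic() - deadline
        worst = max(worst, overrun)
    return worst


if __name__ == "__main__":
    # Run this while generating load to watch the worst case grow.
    print(f"worst overrun: {worst_wakeup_overrun() * 1e6:.0f} us")
```

Like realfeel, this only reports how bad the latency was; it says nothing about which kernel code path caused it.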

> Where is the instrumentation needed to measure it?
>
> In http://www.zip.com.au/~akpm/linux/ there are no timepeg/intlat patches for
> 2.5...

That's not suitable for this work. intlat is OK for locating and
measuring interrupts-off code paths. But it's a bit hard to drive.

2003-01-05 21:08:43

by uaca

Subject: Re: 2.5.54-mm3


Thanks for your reply

Ulisses


On Sun, Jan 05, 2003 at 12:38:16PM -0800, Andrew Morton wrote:
> [email protected] wrote:
> >
> > On Sat, Jan 04, 2003 at 01:00:38AM -0800, Andrew Morton wrote:
> > >
> > > http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/
> >
> > It seems to me that the patch you pointed to here doesn't include the latency
> > instrumentation.
>
> No, it doesn't. You can monitor the latency using realfeel or realfeel2
> from http://www.zip.com.au/~akpm/linux/amlat.tar.gz
>
> But that won't tell you _why_ large latencies are occurring. For that,
> you'll need to apply
> http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.54/2.5.54-mm3/experimental/rtc-debug.patch
> and run `amlat'. This combination will spit out stack backtraces whenever
> there is a 2 millisecond scheduling overrun.
>
> > Where is the instrumentation needed to measure it?
> >
> > In http://www.zip.com.au/~akpm/linux/ there are no timepeg/intlat patches for
> > 2.5...
>
> That's not suitable for this work. intlat is OK for locating and
> measuring interrupts-off code paths. But it's a bit hard to drive.

--
Debian GNU/Linux: a dream come true
-----------------------------------------------------------------------------
"Computers are useless. They can only give answers." Pablo Picasso

---> Visita http://www.valux.org/ para saber acerca de la <---
---> Asociación Valenciana de Usuarios de Linux <---