2003-03-06 06:56:05

by Andrew Morton

Subject: 2.5.64-mm1


http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.64/2.5.64-mm1/

. Included Ingo's file-offset-in-pte patch which allows pages which are in
nonlinear mappings to be reestablished by the kernel's pagefault handler.
This is enabled against all mappings for testing purposes.

. No functional changes to the anticipatory scheduler this time. Just
stabilisation work. It doesn't seem to oops any more.

. A bunch of bugfixes plus the usual sweepings off the factory floor.




Changes since 2.5.63-mm2:

linus.patch

Latest from Linus

-separate.patch
-ppc64-e100-fix.patch
-deadline-dispatching-fix.patch
-loop-hack.patch
-oprofile-up-fix.patch
-presto_get_sb-fix.patch
-on_each_cpu.patch
-on_each_cpu-ldt-cleanup.patch
-notsc-panic.patch
-alloc_pages_cleanup.patch
-ext2-handle-htree-flag.patch
-mpparse-typo-fix.patch
-i386-no-swap-fix.patch
-remove-hugetlb_key.patch
-hugetlbpage-doc-update.patch
-hugetlb-valid-page-ranges.patch
-cciss-startup-problem-fix.patch
-cciss-retry-bus-reset.patch
-cciss-add-cmd-type.patch
-cciss-getluninfo-ioctl.patch
-cciss-passthrough-ioctl.patch

Merged

+balance_irq-cleanup.patch

Clean up some stuff in io_apic.c

+balance_irq-fix.patch

Fix a system lockup.

-sysfs-dget-fix-2.patch

Dropped - fixed in 2.5.64.

-irq-sharing-fix.patch

Dropped - mixing SA_INTERRUPT and SA_SHIRQ handlers is illegal anyway.

+shared-irq-warning.patch

Warn about mixed SA_INTERRUPT & SA_SHIRQ handlers.

+as-naming-comments-BUG.patch
+as-unnecessary-test.patch
+as-atomicity-fix.patch

Anticipatory scheduler work.

-update_atime-speedup.patch
-ext2-update_atime_speedup.patch
-ext3-update_atime_speedup.patch
-UPDATE_ATIME-to-update_atime.patch

Dropped. Was junk.

+objrmap-atomic_t-fix.patch

Tighten up objrmap's handling of page->pte.mapcount

+scheduler-tunables.patch

Put the CPU scheduler tunables back (/proc/sys/sched)

+rtc-locking-fix.patch

rtc.c lock ranking bugfix

+yellowfin-set_bit-fix.patch

Don't do set_bit() on a ushort.

+sk98-build-fix.patch

Don't do 64-bit divides

+cciss-pci-hotplug-fix.patch

cciss fix

+export-pfn_to_nid.patch

An EXPORT_SYMBOL for discontigmem

+move-CONFIG_SWAP.patch

Tidy up the config menus.

+random-stack-use.patch

Reduce stack use in the random driver

+inode-pruning-fix.patch

Fix the icache shrinking logic

+remap-file-pages-2.5.63-a1.patch

Allow pages in nonlinear mappings to be faulted back in by the kernel.

+pte_file-always.patch

Force the new remap-file-pages logic to apply to _all_ mappings, for
testing.

+remove-__pgd_offset.patch
+remove-__pmd_offset.patch
+remove-__pte_offset.patch

Cleanups

+htree-lock_kernel-fix.patch

Missing unlock_kernel() on htree error path

+pci-1.patch
+pci-2.patch
+pci-3.patch
+pci-4.patch
+pci-5.patch

PCI/Cardbus handling changes

+elf_core_dump-stack-size-reduction.patch

Reduce stack size in elf core dumping code

+uninline-binfmt_elf.patch

Nuke some inlines

+htree-nfs-fix.patch

Maybe fix the NFS-server-on-ext3/htree problems

+bonding-zerodiv-fix.patch

Fix a div-by-zero in the bonding driver

+update_atime-ng.patch

Speed up update_atime, and mtime and ctimes too. (Haven't tested that this
is actually working yet).

+one-sec-times.patch

Implement the above for ext2 and ext3.




All 83 patches:

linus.patch
Latest from Linus

mm.patch
add -mmN to EXTRAVERSION

balance_irq-cleanup.patch
i386 IRQ balancing cleanup

balance_irq-fix.patch
balance_irq lockup fix

rpc_rmdir-fix.patch
Fix nfs oops during mount

ppc64-reloc_hide.patch

ppc64-pci-patch.patch
Subject: pci patch

ppc64-aio-32bit-emulation.patch
32/64bit emulation for aio

ppc64-64-bit-exec-fix.patch
Subject: 64bit exec

ppc64-scruffiness.patch
Fix some PPC64 compile warnings

sym-do-160.patch
make the SYM driver do 160 MB/sec

kgdb.patch

nfsd-disable-softirq.patch
Fix race in svcsock.c in 2.5.61

report-lost-ticks.patch
make lost-tick detection more informative

ptrace-flush.patch
cache flushing in the ptrace code

buffer-debug.patch
buffer.c debugging

warn-null-wakeup.patch

ext3-truncate-ordered-pages.patch
ext3: explicitly free truncated pages

limit-write-latency.patch
fix possible latency in balance_dirty_pages()

reiserfs_file_write-5.patch

tcp-wakeups.patch
Use fast wakeups in TCP/IPV4

lockd-lockup-fix-2.patch
Subject: Re: Fw: Re: 2.4.20 NFS server lock-up (SMP)

rcu-stats.patch
RCU statistics reporting

ext3-journalled-data-assertion-fix.patch
Remove incorrect assertion from ext3

nfs-speedup.patch

nfs-oom-fix.patch
nfs oom fix

sk-allocation.patch
Subject: Re: nfs oom

nfs-more-oom-fix.patch

nfs-sendfile.patch
Implement sendfile() for NFS

rpciod-atomic-allocations.patch
Make rpciod use atomic allocations

linux-isp.patch

isp-update-1.patch

remove-unused-congestion-stuff.patch
Subject: [PATCH] remove unused congestion stuff

aic-makefile-fix.patch
aicasm Makefile fix

atm_dev_sem.patch
convert atm_dev_lock from spinlock to semaphore

flock-fix.patch
flock fixes for 2.5.62

shared-irq-warning.patch
detect and warn about attempts to share SA_INTERRUPT handlers

as-iosched.patch
anticipatory I/O scheduler

as-random-fixes.patch
Subject: [PATCH] important fixes

as-comment-fix.patch
AS: comment fix

as-naming-comments-BUG.patch
AS: fix up naming, comments, add more BUGs

as-unnecessary-test.patch

as-atomicity-fix.patch

readahead-shrink-to-zero.patch
Allow VFS readahead to fall to zero

cfq-2.patch
CFQ scheduler, #2

smalldevfs.patch
smalldevfs

objrmap-2.5.62-5.patch
object-based rmap

objrmap-X-fix.patch
objrmap fix for X

objrmap-nr_mapped-fix.patch
objrmap: fix /proc/meminfo:Mapped

objrmap-mapped-mem-fix-2.patch
fix objrmap mapped mem accounting again

objrmap-atomic_t-fix.patch
Make objrmap mapcount non-atomic

per-cpu-disk-stats.patch
Make diskstats per-cpu using kmalloc_percpu

sched-b3.patch
HT scheduler, sched-2.5.63-B3

scheduler-tunables.patch
scheduler tunables

show_task-free-stack-fix.patch
show_task() fix and cleanup

use-after-free-check.patch
slab use-after-free detector

reiserfs-fix-memleaks.patch
ReiserFS: fix memleaks on journal opening failures

copy_page_range-invalid-page-fix.patch
Fix copy_page_range()'s handling of invalid pages

rtc-locking-fix.patch
rtc lock ranking fix

yellowfin-set_bit-fix.patch
yellowfin driver set_bit fix

sk98-build-fix.patch
sk98lin 64-bit divide fix

cciss-pci-hotplug-fix.patch
cciss: fix initialization for PCI hotplug

export-pfn_to_nid.patch
export pfn_to_nid to modules

move-CONFIG_SWAP.patch
move the CONFIG_SWAP menu option to somewhere logical

random-stack-use.patch
Reduced stack usage in random.c

inode-pruning-fix.patch
fix inode reclaim imbalance.

remap-file-pages-2.5.63-a1.patch
Subject: [patch] remap-file-pages-2.5.63-A1

pte_file-always.patch
enable file-offset-in-pte's for all mappings

remove-__pgd_offset.patch
remove __pgd_offset

remove-__pmd_offset.patch
remove __pmd_offset

remove-__pte_offset.patch
remove __pte_offset

htree-lock_kernel-fix.patch
missed unlock_kernel() in ext3+htree

pci-1.patch
PCI probing for cardbus (1/5)

pci-2.patch
PCI probing for cardbus (2/5)

pci-3.patch
PCI probing for cardbus (3/5)

pci-4.patch
PCI probing for cardbus (4/5)

pci-5.patch
PCI probing for cardbus (5/5)

elf_core_dump-stack-size-reduction.patch
reduce stack size: elf_core_dump()

uninline-binfmt_elf.patch
uninlining in fs/binfmt_elf.c

htree-nfs-fix.patch
Fix ext3 htree / NFS compatibility problems

bonding-zerodiv-fix.patch
Subject: [PATCH][bonding] division by zero bug

update_atime-ng.patch
inode a/c/mtime modification speedup

one-sec-times.patch
Implement a/c/mtime speedup in ext2 & ext3




2003-03-06 09:56:21

by Alex Tomas

Subject: Re: 2.5.64-mm1


As far as I understand, this isn't an error path.

	lock_kernel();

	sb = inode->i_sb;

	if (is_dx(inode)) {
		err = ext3_dx_readdir(filp, dirent, filldir);
		if (err != ERR_BAD_DX_DIR)
			return err;
		/*
		 * We don't set the inode dirty flag since it's not
		 * critical that it get flushed back to the disk.
		 */
		EXT3_I(filp->f_dentry->d_inode)->i_flags &= ~EXT3_INDEX_FL;
	}

So, if ext3_dx_readdir() returns 0 (the OK path), then ext3_readdir() finishes
w/o unlock_kernel(). The remaining part of ext3_readdir() is used only if
ext3_dx_readdir() can't use HTree and returns ERR_BAD_DX_DIR.

Am I missing something?

>>>>> Andrew Morton (AM) writes:

AM> +htree-lock_kernel-fix.patch

AM> Missing unlock_kernel() on htree error path


2003-03-06 10:11:16

by Andrew Morton

Subject: Re: 2.5.64-mm1

Alex Tomas wrote:
>
>
> As far as I understand, this isn't an error path.
>
> 	lock_kernel();
>
> 	sb = inode->i_sb;
>
> 	if (is_dx(inode)) {
> 		err = ext3_dx_readdir(filp, dirent, filldir);
> 		if (err != ERR_BAD_DX_DIR)
> 			return err;
> 		/*
> 		 * We don't set the inode dirty flag since it's not
> 		 * critical that it get flushed back to the disk.
> 		 */
> 		EXT3_I(filp->f_dentry->d_inode)->i_flags &= ~EXT3_INDEX_FL;
> 	}
>
> So, if ext3_dx_readdir() returns 0 (the OK path), then ext3_readdir() finishes
> w/o unlock_kernel(). The remaining part of ext3_readdir() is used only if
> ext3_dx_readdir() can't use HTree and returns ERR_BAD_DX_DIR.
>

hm, yes, it does look that way.

It could be that any task which travels that path ends up running under
lock_kernel() for the rest of its existence, and nobody noticed.
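
For illustration only, here is a minimal user-space sketch of the control flow
discussed above. It is not the kernel code and not the contents of
htree-lock_kernel-fix.patch: lock_depth, the lock stubs, dx_readdir_stub() and
both readdir_*() functions are hypothetical stand-ins, and the sketch assumes
the convention that a depth of -1 means "lock not held".

```c
#include <stdio.h>

#define ERR_BAD_DX_DIR (-75000)	/* placeholder value for this sketch */

static int lock_depth = -1;	/* assumed convention: -1 == lock not held */

/* Stubs that only track nesting depth; they do no real locking. */
static void lock_kernel(void)   { lock_depth++; }
static void unlock_kernel(void) { lock_depth--; }

/* Stand-in for ext3_dx_readdir(): returns 0, i.e. the normal OK path. */
static int dx_readdir_stub(void) { return 0; }

/* Models the quoted flow: the early return skips unlock_kernel(). */
static int readdir_leaky(void)
{
	int err;

	lock_kernel();
	err = dx_readdir_stub();
	if (err != ERR_BAD_DX_DIR)
		return err;	/* returns with the lock still held */
	/* the non-htree fallback would run here */
	unlock_kernel();
	return 0;
}

/* One possible shape of a fix: every return goes through one unlock. */
static int readdir_balanced(void)
{
	int err;

	lock_kernel();
	err = dx_readdir_stub();
	if (err != ERR_BAD_DX_DIR)
		goto out;
	err = 0;		/* the non-htree fallback would run here */
out:
	unlock_kernel();
	return err;
}
```

Starting from lock_depth == -1, calling readdir_leaky() leaves the depth at 0,
while readdir_balanced() returns it to -1.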

2003-03-06 10:51:05

by Andrew Morton

Subject: Re: 2.5.64-mm1

Andrew Morton wrote:
>
>
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.5/2.5.64/2.5.64-mm1/
>

It doesn't build with gcc-3.2.1. Please put a

#include <linux/string.h>

into include/linux/genhd.h


Also, the remap_file_pages changes make 2.5.64-mm1 an x86-only kernel.


And, with gcc-3.2.1:

mnm:/usr/src/25> nm vmlinux|grep __constant_memcpy | wc
129 387 3741
mnm:/usr/src/25> nm vmlinux|grep __constant_c_and_count_memset | wc
233 699 9553

2003-03-06 13:46:02

by Alex Tomas

Subject: Re: 2.5.64-mm1

>>>>> Andrew Morton (AM) writes:

AM> hm, yes, it does look that way.

AM> It could be that any task which travels that path ends up running
AM> under lock_kernel() for the rest of its existence, and nobody
AM> noticed.

This patch may help us. It checks current->lock_depth after each syscall
and prints a warning if the value is neither 0 nor -1.

diff -uNr linux/arch/i386/kernel/entry.S edited/arch/i386/kernel/entry.S
--- linux/arch/i386/kernel/entry.S Thu Mar 6 14:57:38 2003
+++ edited/arch/i386/kernel/entry.S Thu Mar 6 16:40:27 2003
@@ -282,6 +282,17 @@
syscall_call:
call *sys_call_table(,%eax,4)
movl %eax,EAX(%esp) # store the return value
+
+ movl TI_TASK(%ebp), %edx # check current->lock_depth
+ movl 20(%edx), %ecx
+ cmpl $0, %ecx
+ je syscall_exit
+ cmpl $-1, %ecx
+ je syscall_exit
+
+ GET_THREAD_INFO(%ebp)
+ call warn_invalid_lock_depth
+
syscall_exit:
cli # make sure we don't miss an interrupt
# setting need_resched or sigpending
diff -uNr linux/arch/i386/kernel/l edited/arch/i386/kernel/l
--- linux/arch/i386/kernel/l Thu Jan 1 03:00:00 1970
+++ edited/arch/i386/kernel/l Thu Mar 6 13:44:03 2003
@@ -0,0 +1 @@
+make: *** No rule to make target `bzImage'. Stop.
diff -uNr linux/arch/i386/kernel/process.c edited/arch/i386/kernel/process.c
--- linux/arch/i386/kernel/process.c Thu Mar 6 14:57:25 2003
+++ edited/arch/i386/kernel/process.c Thu Mar 6 16:32:17 2003
@@ -714,3 +714,14 @@
return 0;
}

+asmlinkage void warn_invalid_lock_depth(void)
+{
+ struct task_struct * tsk = current;
+
+ if (!(tsk->flags & 0x10000000)) {
+ printk("WARNING: non-zero(%d) lock_depth, pid %u\n",
+ tsk->lock_depth, tsk->pid);
+ tsk->flags |= 0x10000000;
+ }
+}
+
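
For readers who prefer C to assembly, the check made by the patch above can be
modelled in user space roughly as follows. This is only a sketch: struct
task_model, depth_needs_warning() and check_after_syscall() are hypothetical
names, and only the flag value 0x10000000 and the message text are taken from
the patch itself.

```c
#include <stdio.h>

#define WARNED_FLAG 0x10000000u	/* same bit value the patch sets in tsk->flags */

struct task_model {		/* hypothetical stand-in for task_struct */
	int lock_depth;
	unsigned int flags;
	unsigned int pid;
};

/* Mirrors the two cmpl tests in the entry.S hunk: 0 and -1 pass silently. */
static int depth_needs_warning(int lock_depth)
{
	return lock_depth != 0 && lock_depth != -1;
}

/*
 * Mirrors warn_invalid_lock_depth(): warn at most once per task.
 * Returns 1 if a warning was printed, 0 otherwise.
 */
static int check_after_syscall(struct task_model *tsk)
{
	if (!depth_needs_warning(tsk->lock_depth))
		return 0;
	if (tsk->flags & WARNED_FLAG)
		return 0;	/* this task was already reported */
	printf("WARNING: non-zero(%d) lock_depth, pid %u\n",
	       tsk->lock_depth, tsk->pid);
	tsk->flags |= WARNED_FLAG;
	return 1;
}
```

If lock_depth uses -1 for "not held" (an assumption here, not something the
patch states), a depth of 0 would pass this check silently, so a single leaked
lock_kernel() might only be reported once a second one stacks on top of it.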

2003-03-09 16:00:44

by Alexander Hoogerhuis

Subject: Re: 2.5.64-mm1

And a quick note: the IO scheduler and all the other improvements seem
like heaven for a poor guy stuck on 2.4 for a long time. I've kept
reading lk-ml for the progress reports and feedback, and would just like
to add my 2,000 rupiahs: this stuff certainly makes life enjoyable
under load on a slow hard drive on a laptop.

mvh,
A

Andrew Morton writes:

> [SNIP AGAIN]
--
Alexander Hoogerhuis | [email protected]
CCNP - CCDP - MCNE - CCSE | +47 908 21 485
"You have zero privacy anyway. Get over it." --Scott McNealy

2003-03-11 22:35:54

by Ed Tomlinson

Subject: [opps] 2.5.64-mm1

Got home this afternoon and found my box had panicked with:

Unable to handle kernel NULL pointer dereference at virtual address 0000003c
printing eip:
c011372b
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0060:[<c011372b>] Not tainted
EFLAGS: 00010007
EIP is at do_schedule+0x193/0x330
eax: dff8a000 ebx: ffffffe0 ecx: c030e0a0 edx: 00000003
esi: dff8c660 edi: 00000000 ebp: dff8bfdc esp: dff8bfc4
ds: 007b es: 007b ss: 0068
Process ksoftirqd/0 (pid: 2, threadinfo=dff8a000 task=dff8c660)
Stack: dff8a000 dff8a000 00000000 dff8a000 00000000 dff8c660 dff8bfec c011a3be
c011a358 00000000 00000000 c01070e9 00000000 00000000 00000000
Call Trace:
[<c011a3be>] ksoftirqd+0x66/0xa4
[<c011a358>] ksoftirqd+0x0/0xa4
[<c01070e9>] kernel_thread_helper+0x5/0xc

Code: 8b 53 5c 8b 7e 60 85 d2 75 0b 89 7b 60 ff 47 18 eb 78 8d 76

and feeding the eip and code to ksymoops:

>>EIP; c011372b <do_schedule+193/330> <=====

Code; c011372b <do_schedule+193/330>
00000000 <_EIP>:
Code; c011372b <do_schedule+193/330> <=====
0: 8b 53 5c mov 0x5c(%ebx),%edx <=====
Code; c011372e <do_schedule+196/330>
3: 8b 7e 60 mov 0x60(%esi),%edi
Code; c0113731 <do_schedule+199/330>
6: 85 d2 test %edx,%edx
Code; c0113733 <do_schedule+19b/330>
8: 75 0b jne 15 <_EIP+0x15>
Code; c0113735 <do_schedule+19d/330>
a: 89 7b 60 mov %edi,0x60(%ebx)
Code; c0113738 <do_schedule+1a0/330>
d: ff 47 18 incl 0x18(%edi)
Code; c011373b <do_schedule+1a3/330>
10: eb 78 jmp 8a <_EIP+0x8a>
Code; c011373d <do_schedule+1a5/330>
12: 8d 76 00 lea 0x0(%esi),%esi

Does this ring bells anywhere?

Ed Tomlinson
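
As a small worked check of the dump above: the faulting instruction is
mov 0x5c(%ebx),%edx and the register dump shows ebx: ffffffe0, so the address
it touched can be recomputed with 32-bit wrap-around. The helper name
effective_address() is made up for this sketch, and the result is only an
arithmetic observation, not a diagnosis of the oops.

```c
#include <stdint.h>

/*
 * Effective address of "mov disp(%reg),..." on a 32-bit CPU:
 * base plus displacement, wrapping modulo 2^32.
 */
static uint32_t effective_address(uint32_t base, uint32_t disp)
{
	return (uint32_t)(base + disp);
}
```

With base 0xffffffe0 (ebx) and displacement 0x5c this gives 0x0000003c, which
matches the "virtual address 0000003c" in the first line of the oops. Read as
a signed 32-bit value, ebx is -0x20.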