2003-08-25 00:10:59

by Andrew Morton

Subject: 2.6.0-test4-mm1


ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test4/2.6.0-test4-mm1/


. Lots of random fixes.

. m68k, x86_64 and networking syncups.


Linus is away for the rest of this month, so if you have stuff it is probably
best to wait until he returns or to send it to me.




Changes since 2.6.0-test3-mm3:


linus.patch

Current Linus bk tree

-si_band-type-fix.patch
-ext3-block-allocation-cleanup.patch
-nfs-revert-backoff.patch
-signal-race-fix.patch
-vmscan-defer-writepage.patch
-local-apic-enable-fixes.patch
-awe-core.patch
-awe-use-gfp_flags.patch
-awe-fix-truncate-errors.patch
-ikconfig-enable.patch
-bd-claim-whole-disk.patch
-O_EXCL-claim-blockdevs.patch
-opl3sa2-lock-init-fix.patch
-dscc4-1.patch
-dscc4-2.patch
-dscc4-3.patch
-dscc4-4.patch
-dscc4-5.patch
-dscc4-6.patch
-dscc4-7.patch
-dscc4-8.patch
-aio-mm-leak-fix.patch
-selinux-avc_log_lock-fix.patch
-selinux-check-behaviour-fix.patch
-ymfpci-oops-fix.patch
-ymf_devs-lock.patch
-slab-drain_array-fix.patch
-loop-oops-fix.patch
-atp870u_detect-lockup-fix.patch
-copy_user-handle-kernel-fault.patch
-Locking-update.patch
-sysctl_h-needs-compiler_h.patch
-aio-mm-refcounting-fix.patch

Merged

-x86_64-fixes.patch
+kgdb-x86_64-fixes.patch

Renamed

+handle-unreadable-dot-config.patch

build system fixes

+huge-net-update.patch

Current davem tree

+x86_64-update-3.patch

Current ak tree

+dac960-GAM-IOCTLs-cleanup.patch

DAC960 rework

+thread-pgrp-fix-2.patch

Temp fix for the setpgrp-vs-threading problem

+kj-maintainers.patch

Add the kernel janitor project to ./MAINTAINERS

+ramdisk-cleanup.patch

Random whitespace fiddling

+v4l-use-after-free-fix.patch

v4l bugfix

+ikconfig-makefile-update.patch

/proc/ikconfig build tweaks

+ftape-warning-fix.patch

Fix a warning

+jffs-retval-fix.patch

Fix return value types

+make-ACPI_SLEEP-select-SOFTWARE_SUSPEND.patch

Config dependency

+3GB-personality.patch

Create a 3GB exec personality to support buggy apps on x86_64 and, later,
ia32 with the 4G/4G split.

+zeromap_pmd_range-fix.patch

pagetable initialisation fix

+no-async-write-errors-on-close.patch

Don't try to report EIO and ENOSPC errors through close()

+sis190-fix.patch

build/bug fix

+remove-add_wait_queue_cond.patch

dead code

+spin_lock_irqrestore-fixes.patch

typos

+pcmciamtd-fix.patch

build fix

+zoran-memleak-fixes.patch
+zoran-rename-debug.patch
+zoran-release-callback.patch
+zoran-pci_disable_device.patch
+zoran-cleanups.patch
+zoran-cleanups-2.patch
+zoran-naming-fix.patch

Zoran driver update

+airo-build-fix.patch

Make airo build with CONFIG_PCI=n

+m68k-vmlinux_lds-move.patch
+mac-ide-fix.patch
+m68k-asm-sections-fix.patch
+m68k-asm-local.patch
+amiga-z2ram-fix.patch
+amiga-floppy-fix.patch
+atari-floppy-fix.patch
+m68k-switch_to-fix.patch

m68k updates

+pcxx-warning-fix.patch

Warning fix

+pcnet32-unregister_pci-fix.patch

rmmod crash fix

+hwifs-oops-unregister-fix.patch

ide unregistration oops fix

+proc-pid-maps-32-bit-fix.patch

Make /proc/pid/maps output more palatable to apps which expect 32-bit
addresses.

-kjournald-PF_SYNCWRITE.patch

Dropped - it slows down ext3.

+sched-balance-fix-2.6.0-test3-mm3-A0.patch

CPU scheduler runqueue balancing fix

+o18int.patch
+o18.1int.patch

Interactivity tweaks from Con

-blacklist-asus-L3800C-dmi.patch

Dropped - the ACPI guys fixed it.






All 182 patches


linus.patch

mm.patch
add -mmN to EXTRAVERSION

kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
kgdb: warning fix

kgdb-warning-fix.patch
kgdb: warning fix

kgdb-build-fix.patch

kgdb-spinlock-fix.patch

kgdb-fix-debug-info.patch
kgdb: CONFIG_DEBUG_INFO fix

kgdb-cpumask_t.patch

kgdb-x86_64-fixes.patch
x86_64 fixes

handle-unreadable-dot-config.patch
correctly handle unreadable .configs

huge-net-update.patch
net update

config_spinline.patch
uninline spinlocks for profiling accuracy.

ppc64-bar-0-fix.patch
Allow PCI BARs that start at 0

ppc64-reloc_hide.patch

ppc64-semaphore-reimplementation.patch
ppc64: use the ia32 semaphore implementation

ppc64-local.patch
ppc64: local.h implementation

ppc64-sched_clock.patch
ppc64: sched_clock()

sym-do-160.patch
make the SYM driver do 160 MB/sec

x86_64-update-3.patch
x86-64 update for test4

random-locking-fixes.patch
random: SMP locking

random-accounting-and-sleeping-fixes.patch
random: accounting and sleeping fixes

rt-tasks-special-vm-treatment.patch
real-time enhanced page allocator and throttling

rt-tasks-special-vm-treatment-2.patch

input-use-after-free-checks.patch
input layer debug checks

deadline-requeue-workaround.patch
deadline requeue workaround

fbdev.patch

cursor-flashing-fix.patch
fbdev: fix cursor leftovers

disable-athlon-prefetch.patch

sis900-atomicity-fix.patch
sis900 atomicity fix

slab-hexdump.patch
slab: hexdump structures when things go wrong

aic7xxx-parallel-build-fix.patch
fix parallel builds for aic7xxx

yenta-20030817-1-zv.patch

yenta-20030817-2-override.patch

yenta-20030817-3-sockinit.patch

yenta-20030817-4-pm.patch

yenta-20030817-5-pm2.patch

yenta-20030817-6-init.patch

yenta-20030817-7-quirks.patch

proc-pid-setuid-ownership-fix.patch
fix /proc/pid/fd ownership across setuid()

pid-revalidate-security-hook.patch
Call security hook from pid*_revalidate

dac960-GAM-IOCTLs-cleanup.patch
move DAC960 GAM IOCTLs into a new device

thread-pgrp-fix-2.patch
Fix setpgid and threads

kj-maintainers.patch
Add the kernel janitors to MAINTAINERS

ramdisk-cleanup.patch

v4l-use-after-free-fix.patch
Fix bug in v4l core for 2.6.0-test3-bk

ikconfig-makefile-update.patch
ikconfig - Makefile update

ftape-warning-fix.patch
Fix ftape warning

jffs-retval-fix.patch
jffs aops return type fix

delay-ksoftirqd-fallback.patch
Try harder in IRQ context before falling back to ksoftirqd

rcu-grace-period.patch
Monitor RCU grace period

intel8x0-cleanup.patch
intel8x0 cleanups

make-ACPI_SLEEP-select-SOFTWARE_SUSPEND.patch
Make ACPI_SLEEP select SOFTWARE_SUSPEND

3GB-personality.patch
Add 3GB personality

zeromap_pmd_range-fix.patch
zeromap_pmd_range bugfix

no-async-write-errors-on-close.patch
don't report async write errors on close() after all

sis190-fix.patch
sis190 synchronize_irq fix

remove-add_wait_queue_cond.patch
remove add_wait_queue_cond()

spin_lock_irqrestore-fixes.patch
spin_lock_irqrestore() typo fixes

pcmciamtd-fix.patch
pcmciamtd.c: remove release timer

zoran-memleak-fixes.patch
zoran: memleak fixes

zoran-rename-debug.patch
zoran: debug->zr_debug

zoran-release-callback.patch
zoran: add release callback

zoran-pci_disable_device.patch
zoran: add pci_disable_device() call

zoran-cleanups.patch
zoran: cleanups

zoran-cleanups-2.patch
zoran: more cleanups

zoran-naming-fix.patch
zoran: correct name field breakage

airo-build-fix.patch
airo CONFIG_PCI=n build fix

m68k-vmlinux_lds-move.patch
move m68k vmlinux.lds files

mac-ide-fix.patch
Fix Mac IDE

m68k-asm-sections-fix.patch
m68k asm/sections.h

m68k-asm-local.patch
m68k asm/local.h

amiga-z2ram-fix.patch
Amiga z2ram

amiga-floppy-fix.patch
Amiga floppy

atari-floppy-fix.patch
Atari floppy

m68k-switch_to-fix.patch
M68k switch_to fix

pcxx-warning-fix.patch
drivers/char/pcxx.c warning fix

pcnet32-unregister_pci-fix.patch
pcnet32 needs unregister_pci

hwifs-oops-unregister-fix.patch
Fix ide unregister vs. driver model

p00001_synaptics-restore-on-close.patch

p00002_psmouse-reset-timeout.patch

p00003_synaptics-multi-button.patch

p00004_synaptics-optional.patch

p00005_synaptics-pass-through.patch

p00006_psmouse-suspend-resume.patch

p00007_synaptics-old-proto.patch

synaptics-mode-set.patch
Synaptics mode setting

syn-multi-btn-fix.patch
synaptics multibutton fix

keyboard-resend-fix.patch
keyboard resend fix

proc-pid-maps-32-bit-fix.patch
Do 32bit addresses in /proc/self/maps if possible

linux-isp-2.patch

linux-isp-2-fix-again.patch
lost feral fix

feral-bounce-fix.patch
Feral driver - highmem issues

feral-bounce-fix-2.patch
Feral driver bouncing fix

list_del-debug.patch
list_del debug check

print-build-options-on-oops.patch
print a few config options on oops

show_task-free-stack-fix.patch
show_task() fix and cleanup

put_task_struct-debug.patch

ia32-mknod64.patch
mknod64 for ia32

ext2-64-bit-special-inodes.patch
ext2: support for 64-bit device nodes

ext3-64-bit-special-inodes.patch
ext3: support for 64-bit device nodes

64-bit-dev_t-kdev_t.patch
64-bit dev_t and kdev_t

64-bit-dev_t-other-archs.patch
enable 64-bit dev_t for other archs

mknod64-64-bit-fix.patch
dev_t: fix mknod for 64-bit archs

ustat64.patch
ustat64

ppc-64-bit-stat.patch
fix ppc stat.h for 64-bit dev_t

64-bit-dev_t-init_rd-fixes.patch
initrd fixes for 64-bit dev_t

arch-dev_t-stat-fixes.patch
Fix all asm-*/stat.h dev_t instances

oops-dump-preceding-code.patch
i386 oops output: dump preceding code

lockmeter.patch

sparc64-lockmeter-fix.patch

sparc64-lockmeter-fix-2.patch
Fix lockmeter on sparc64

printk-oops-mangle-fix.patch
disentangle printk's whilst oopsing on SMP

20-odirect_enable.patch

21-odirect_cruft.patch

22-read_proc.patch

23-write_proc.patch

24-commit_proc.patch

25-odirect.patch

nfs-O_DIRECT-always-enabled.patch
Force CONFIG_NFS_DIRECTIO

sched-balance-fix-2.6.0-test3-mm3-A0.patch
sched-balance-fix-2.6.0-test3-mm3-A0

sched-2.6.0-test2-mm2-A3.patch
sched-2.6.0-test2-mm2-A3

ppc-sched_clock.patch

sparc64_sched_clock.patch

x86_64-sched_clock.patch
Add sched_clock for x86-64

sched-warning-fix.patch

sched-balance-tuning.patch
CPU scheduler balancing fix

sched-no-tsc-on-numa.patch
Subject: Re: Fw: Re: 2.6.0-test2-mm3

o12.2int.patch
O12.2int for interactivity

o12.3.patch
O12.3 for interactivity

o13int.patch
O13int for interactivity

o13.1int.patch
O13.1int

o14int.patch
O14int

o14int-div-fix.patch
o14int 64-bit-divide fix

o14.1int.patch
O14.1int

o15int.patch
O15int for interactivity

o16int.patch
From: Con Kolivas <[email protected]>
Subject: [PATCH] O16int for interactivity

o16.1int.patch
O16.1int for interactivity

o16.2int.patch
O16.2int

o16.3int.patch
O16.3int

o18int.patch
O18int

o18.1int.patch
O18.1int

4g-2.6.0-test2-mm2-A5.patch
4G/4G split patch

4g4g-vmlinux-update-got-lost.patch

4g4g-do_page_fault-cleanup.patch
4G/4G: remove debug code

4g4g-cleanups.patch

kgdb-4g4g-fix-2.patch

4g4g-config-fix.patch

4g4g-pmd-fix.patch
4g4g: pmd fix

4g4g-wli-fixes.patch
4g/4g: fixes from Bill

4g4g-fpu-fix.patch
4g4g: fpu emulation fix

4g4g-show_registers-fix.patch
4g4g: show_registers() fix

4g4g-pin_page-atomicity-fix.patch
4g/4g usercopy atomicity fix

4g4g-remove-touch_all_pages.patch

4g4g-debug-flags-fix.patch
4g4g: debug flags fix

4g4g-TI_task-fix.patch
4g4g: Fix wrong asm-offsets entry

cyclone-fixmap-fix.patch
cyclone time fixmap fix

ppc-fixes.patch
make mm4 compile on ppc

aic7xxx_old-oops-fix.patch

aio-01-retry.patch
AIO: Core retry infrastructure

io_submit_one-EINVAL-fix.patch
Fix aio process hang on EINVAL

aio-02-lockpage_wq.patch
AIO: Async page wait

aio-03-fs_read.patch
AIO: Filesystem aio read

aio-04-buffer_wq.patch
AIO: Async buffer wait

aio-05-fs_write.patch
AIO: Filesystem aio write

aio-05-fs_write-fix.patch

aio-06-bread_wq.patch
AIO: Async block read

aio-06-bread_wq-fix.patch

aio-07-ext2getblk_wq.patch
AIO: Async get block for ext2

O_SYNC-speedup-2.patch
speed up O_SYNC writes

aio-09-o_sync.patch
aio O_SYNC

aio-10-BUG-fix.patch
AIO: fix a BUG

aio-11-workqueue-flush.patch
AIO: flush workqueues before destroying ioctx'es

aio-12-readahead.patch
AIO: readahead fixes

aio-dio-no-readahead.patch
aio O_DIRECT no readahead

lock_buffer_wq-fix.patch
lock_buffer_wq fix

unuse_mm-locked.patch
AIO: hold the context lock across unuse_mm

aio-take-task_lock.patch
From: Suparna Bhattacharya <[email protected]>
Subject: Re: 2.5.72-mm1 - Under heavy testing with AIO,.. vmstat seems to blow the kernel

aio-O_SYNC-fix.patch
Unify o_sync changes for aio and regular writes

O_SYNC-speedup-nolock-fix.patch

aio-remove-lseek-triggerable-BUG_ONs.patch

aio-readahead-rework.patch
Unified page range readahead for aio and regular reads

aio-readahead-speedup.patch
Readahead issues and AIO read speedup




2003-08-25 06:17:05

by Barry K. Nathan

Subject: pcnet32 oops patches (was Re: 2.6.0-test4-mm1)

On Sun, Aug 24, 2003 at 05:13:18PM -0700, Andrew Morton wrote:
> +pcnet32-unregister_pci-fix.patch
>
> rmmod crash fix

Here's another (conflicting) patch by the same author:
http://bugme.osdl.org/attachment.cgi?id=684&action=view

There's an oops I'm having (bugzilla bug 976 -- basically, after
modprobing pcnet32 on a box without pcnet32 hardware, the next ethernet
driver to be modprobed blows up) which is not fixed by the patch in
test4-mm1, but which is fixed by attachment 684...

-Barry K. Nathan <[email protected]>

2003-08-25 11:00:15

by Domen Puncer

Subject: Re: pcnet32 oops patches (was Re: 2.6.0-test4-mm1)

On Monday 25 of August 2003 08:16, Barry K. Nathan wrote:
> On Sun, Aug 24, 2003 at 05:13:18PM -0700, Andrew Morton wrote:
> > +pcnet32-unregister_pci-fix.patch
> >
> > rmmod crash fix
>
> Here's another (conflicting) patch by the same author:
> http://bugme.osdl.org/attachment.cgi?id=684&action=view
>
> There's an oops I'm having (bugzilla bug 976 -- basically, after
> modprobing pcnet32 on a box without pcnet32 hardware, the next ethernet
> driver to be modprobed blows up) which is not fixed by the patch in
> test4-mm1, but which is fixed by attachment 684...

That patch in test4-mm1... someone must have made my patch shorter...
and looks like he/she broke it. :-(

Domen

2003-08-25 17:30:27

by Adrian Bunk

Subject: 2.6.0-test4-mm1: wl3501_cs.c doesn't compile

I got the following compile error in 2.6.0-test4-mm1:

<-- snip -->

...
CC drivers/net/wireless/wl3501_cs.o
drivers/net/wireless/wl3501_cs.c: In function `wl3501_mgmt_join':
drivers/net/wireless/wl3501_cs.c:641: unknown field `id' specified in
initializer
drivers/net/wireless/wl3501_cs.c:641: warning: missing braces around
initializer
drivers/net/wireless/wl3501_cs.c:641: warning: (near initialization for
`sig.ds_pset.el')
drivers/net/wireless/wl3501_cs.c:642: unknown field `el' specified in
initializer
drivers/net/wireless/wl3501_cs.c:643: unknown field `chan' specified in
initializer
drivers/net/wireless/wl3501_cs.c: In function `wl3501_mgmt_start':
drivers/net/wireless/wl3501_cs.c:658: unknown field `id' specified in
initializer
drivers/net/wireless/wl3501_cs.c:658: warning: missing braces around
initializer
drivers/net/wireless/wl3501_cs.c:658: warning: (near initialization for
`sig.ds_pset.el')
drivers/net/wireless/wl3501_cs.c:659: unknown field `el' specified in
initializer
drivers/net/wireless/wl3501_cs.c:660: unknown field `chan' specified in
initializer
drivers/net/wireless/wl3501_cs.c:663: unknown field `id' specified in
initializer
drivers/net/wireless/wl3501_cs.c:664: unknown field `el' specified in
initializer
drivers/net/wireless/wl3501_cs.c:665: unknown field `data_rate_labels'
specified in initializer
drivers/net/wireless/wl3501_cs.c:673: unknown field `id' specified in
initializer
drivers/net/wireless/wl3501_cs.c:674: unknown field `el' specified in
initializer
drivers/net/wireless/wl3501_cs.c:675: unknown field `data_rate_labels'
specified in initializer
drivers/net/wireless/wl3501_cs.c:683: unknown field `id' specified in
initializer
drivers/net/wireless/wl3501_cs.c:684: unknown field `el' specified in
initializer
drivers/net/wireless/wl3501_cs.c:685: unknown field `atim_window'
specified in initializer
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_mgmt_scan_confirm':
drivers/net/wireless/wl3501_cs.c:702: parse error before `)'
drivers/net/wireless/wl3501_cs.c:705: parse error before `)'
drivers/net/wireless/wl3501_cs.c:740: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function `wl3501_mgmt_auth':
drivers/net/wireless/wl3501_cs.c:899: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function `wl3501_mgmt_association':
drivers/net/wireless/wl3501_cs.c:913: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_mgmt_join_confirm':
drivers/net/wireless/wl3501_cs.c:923: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_md_confirm_interrupt':
drivers/net/wireless/wl3501_cs.c:982: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_get_confirm_interrupt':
drivers/net/wireless/wl3501_cs.c:1038: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_start_confirm_interrupt':
drivers/net/wireless/wl3501_cs.c:1050: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_assoc_confirm_interrupt':
drivers/net/wireless/wl3501_cs.c:1062: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function
`wl3501_auth_confirm_interrupt':
drivers/net/wireless/wl3501_cs.c:1074: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function `wl3501_rx_interrupt':
drivers/net/wireless/wl3501_cs.c:1090: parse error before `)'
drivers/net/wireless/wl3501_cs.c: In function `wl3501_exit_module':
drivers/net/wireless/wl3501_cs.c:2350: parse error before `)'
make[3]: *** [drivers/net/wireless/wl3501_cs.o] Error 1


<-- snip -->


cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

2003-08-25 17:38:15

by Arnaldo Carvalho de Melo

Subject: Re: 2.6.0-test4-mm1: wl3501_cs.c doesn't compile

On Mon, Aug 25, 2003 at 07:30:07PM +0200, Adrian Bunk wrote:
> I got the following compile error in 2.6.0-test4-mm1:

I'm checking this now...

- Arnaldo

2003-08-25 18:16:55

by Arnaldo Carvalho de Melo

Subject: Re: 2.6.0-test4-mm1: wl3501_cs.c doesn't compile

On Mon, Aug 25, 2003 at 02:46:27PM -0300, Arnaldo C. Melo wrote:
> On Mon, Aug 25, 2003 at 07:30:07PM +0200, Adrian Bunk wrote:
> > I got the following compile error in 2.6.0-test4-mm1:
>
> I'm checking this now...

The problem doesn't exist in vanilla 2.6.0-test4 (OK, it has patch-2.6.0-test4-pa2,
the latest parisc patchset, but that doesn't touch what we're looking at here).
Now to test 2.6.0-test4-mm1...

Ah, compiling it as a module.

- Arnaldo

2003-08-25 18:31:18

by Arnaldo Carvalho de Melo

Subject: Re: 2.6.0-test4-mm1: wl3501_cs.c doesn't compile

On Mon, Aug 25, 2003 at 03:24:42PM -0300, Arnaldo C. Melo wrote:
> On Mon, Aug 25, 2003 at 02:46:27PM -0300, Arnaldo C. Melo wrote:
> > On Mon, Aug 25, 2003 at 07:30:07PM +0200, Adrian Bunk wrote:
> > > I got the following compile error in 2.6.0-test4-mm1:
> >
> > I'm checking this now...
>
> The problem doesn't exist in vanilla 2.6.0-test4 (OK, it has patch-2.6.0-test4-pa2,
> the latest parisc patchset, but that doesn't touch what we're looking at here).
> Now to test 2.6.0-test4-mm1...
>
> Ah, compiling it as a module.

No problems with 2.6.0-test4-mm1 (also with the patch-2.6.0-test4-pa2 parisc
patchset). Could you please send me your .config?

- Arnaldo

2003-08-25 19:37:44

by Barry K. Nathan

Subject: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

I'm really short on time right now, so this bug report might be vague,
but it's important enough for me to try:

I have an NFS fileserver (running 2.6.0-test4-mm1) exporting stuff from
three filesystems: ReiserFS, ext3, and XFS. I'm seeing no problems with
my ReiserFS and ext3 filesystems. XFS is a different story.

My client machine is running 2.4.21bkn1 (my own kernel, not released to
the public; the differences from vanilla 2.4.21 are XFS and Win4Lin).

If I use my client machine to sign RPM packages (rpm --addsign ...),
using rpm-4.2-16mdk, and the packages are on the XFS partition on the
NFS server, about half of the packages are truncated by a couple hundred
bytes afterwards (and GPG sig verification fails on those packages).

It's always the same packages that get truncated by the same amounts of
data. This is 100% reproducible. It doesn't matter whether I compile the
kernel with gcc 2.95.3 or 3.1.1. If I perform the operation on my non-XFS
filesystem the problem doesn't happen. If I run 2.6.0-test4-bk2 instead of
test4-mm1 on the NFS server, the problem goes away. (I have never run
any previous -mm kernels on this server.)

Hmmm... If I sign the packages on the NFS server itself, even with
test4-mm1 on the XFS partition, I can't reproduce the problem.
*However*, that's a different version of RPM (4.0.4).

Is this enough information to help find the cause of the bug? If not,
it might be several days (if I'm unlucky, maybe even a week or two)
before I have time to do anything more...

-Barry K. Nathan <[email protected]>

2003-08-25 20:01:29

by Andrew Morton

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

"Barry K. Nathan" <[email protected]> wrote:
>
> I'm really short on time right now, so this bug report might be vague,
> but it's important enough for me to try:
>
> I have an NFS fileserver (running 2.6.0-test4-mm1) exporting stuff from
> three filesystems: ReiserFS, ext3, and XFS. I'm seeing no problems with
> my ReiserFS and ext3 filesystems. XFS is a different story.
>
> My client machine is running 2.4.21bkn1 (my own kernel, not released to
> the public; the differences from vanilla 2.4.21 are XFS and Win4Lin).
>
> If I use my client machine to sign RPM packages (rpm --addsign ...),
> using rpm-4.2-16mdk, and the packages are on the XFS partition on the
> NFS server, about half of the packages are truncated by a couple hundred
> bytes afterwards (and GPG sig verification fails on those packages).
>
> It's always the same packages that get truncated by the same amounts of
> data. This is 100% reproducible. It doesn't matter whether I compile the
> kernel with gcc 2.95.3 or 3.1.1. If I perform the operation on my non-XFS
> filesystem the problem doesn't happen. If I run 2.6.0-test4-bk2 instead of
> test4-mm1 on the NFS server, the problem goes away. (I have never run
> any previous -mm kernels on this server.)
>
> Hmmm... If I sign the packages on the NFS server itself, even with
> test4-mm1 on the XFS partition, I can't reproduce the problem.
> *However*, that's a different version of RPM (4.0.4).
>
> Is this enough information to help find the cause of the bug? If not,
> it might be several days (if I'm unlucky, maybe even a week or two)
> before I have time to do anything more...
>

-mm kernels have O_DIRECT-for-NFS patches in them. And some versions of
RPM use O_DIRECT. Whether O_DIRECT makes any difference at the server end
I do not know, but it would be useful if you could repeat the test on stock
2.6.0-test4.

Alternatively, run

export LD_ASSUME_KERNEL=2.2.5

before running RPM. I think that should tell RPM to not try O_DIRECT.

2003-08-25 22:06:11

by Adrian Bunk

Subject: Re: 2.6.0-test4-mm1: wl3501_cs.c doesn't compile

On Mon, Aug 25, 2003 at 03:39:48PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Aug 25, 2003 at 03:24:42PM -0300, Arnaldo C. Melo wrote:
> > On Mon, Aug 25, 2003 at 02:46:27PM -0300, Arnaldo C. Melo wrote:
> > > On Mon, Aug 25, 2003 at 07:30:07PM +0200, Adrian Bunk wrote:
> > > > I got the following compile error in 2.6.0-test4-mm1:
> > >
> > > I'm checking this now...
> >
> > The problem doesn't exist in vanilla 2.6.0-test4 (OK, it has patch-2.6.0-test4-pa2,
> > the latest parisc patchset, but that doesn't touch what we're looking at here).
> > Now to test 2.6.0-test4-mm1...
> >
> > Ah, compiling it as a module.
>
> No problems with 2.6.0-test4-mm1 (also with the patch-2.6.0-test4-pa2 parisc
> patchset). Could you please send me your .config?

It's attached.

I traced the problem down to the gcc version: The file compiles with
gcc 3.3 but fails to compile with gcc 2.95.

> - Arnaldo

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed


Attachments:
(No filename) (1.14 kB)
.config (47.26 kB)

2003-08-25 22:58:35

by Steve Lord

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

On Mon, 2003-08-25 at 14:45, Andrew Morton wrote:
> "Barry K. Nathan" <[email protected]> wrote:
> >
> > I'm really short on time right now, so this bug report might be vague,
> > but it's important enough for me to try:
> >
> > I have an NFS fileserver (running 2.6.0-test4-mm1) exporting stuff from
> > three filesystems: ReiserFS, ext3, and XFS. I'm seeing no problems with
> > my ReiserFS and ext3 filesystems. XFS is a different story.
> >
> > My client machine is running 2.4.21bkn1 (my own kernel, not released to
> > the public; the differences from vanilla 2.4.21 are XFS and Win4Lin).
> >
> > If I use my client machine to sign RPM packages (rpm --addsign ...),
> > using rpm-4.2-16mdk, and the packages are on the XFS partition on the
> > NFS server, about half of the packages are truncated by a couple hundred
> > bytes afterwards (and GPG sig verification fails on those packages).
> >
> > It's always the same packages that get truncated by the same amounts of
> > data. This is 100% reproducible. It doesn't matter whether I compile the
> > kernel with gcc 2.95.3 or 3.1.1. If I perform the operation on my non-XFS
> > filesystem the problem doesn't happen. If I run 2.6.0-test4-bk2 instead of
> > test4-mm1 on the NFS server, the problem goes away. (I have never run
> > any previous -mm kernels on this server.)
> >
> > Hmmm... If I sign the packages on the NFS server itself, even with
> > test4-mm1 on the XFS partition, I can't reproduce the problem.
> > *However*, that's a different version of RPM (4.0.4).
> >
> > Is this enough information to help find the cause of the bug? If not,
> > it might be several days (if I'm unlucky, maybe even a week or two)
> > before I have time to do anything more...
> >
>
> -mm kernels have O_DIRECT-for-NFS patches in them. And some versions of
> RPM use O_DIRECT. Whether O_DIRECT makes any difference at the server end
> I do not know, but it would be useful if you could repeat the test on stock
> 2.6.0-test4.
>
> Alternatively, run
>
> export LD_ASSUME_KERNEL=2.2.5
>
> before running RPM. I think that should tell RPM to not try O_DIRECT.

I doubt the NFS client is O_DIRECT capable here, I have run some rpm
builds over nfs to 2.6.0-test4 and an xfs filesystem, everything is
behaving so far. I will try mm1 tomorrow.

Do we know if this is NFS V3 or V2, by the way?

Steve

--

Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]

2003-08-25 23:21:32

by Martin J. Bligh

Subject: Re: 2.6.0-test4-mm1

System time is still rather higher in kernbench, though maybe the
elapsed time isn't degraded so much any more. Not sure if this is
scheduler changes or not, but last time, we isolated a change of
exactly this magnitude to one of those patches (Ingo's IIRC).

I tried "set TIMESLICE_GRANULARITY to MAX_TIMESLICE in sched.c" as
requested, makes no difference really (-max result below).

Kernbench: (make -j vmlinux, maximal tasks)
                              Elapsed      System        User         CPU
              2.6.0-test4       45.87      116.92      571.10     1499.00
          2.6.0-test4-mm1       46.29      121.39      570.52     1494.75
      2.6.0-test4-mm1-max       46.00      122.18      570.73     1505.75
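
To put the kernbench figures above in relative terms, here is a minimal
sketch in C. pct_change is a hypothetical helper (it is not part of
kernbench); it computes the percentage change of a timing against a
baseline such as the 2.6.0-test4 row.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage change from base to cur.
 * A positive result means cur is larger (slower, for a timing).
 */
static double pct_change(double base, double cur)
{
	assert(base != 0.0);	/* a zero baseline has no relative change */
	return (cur - base) / base * 100.0;
}
```

With the table's numbers, pct_change(116.92, 121.39) comes out at roughly
3.8% more system time for -mm1, and pct_change(45.87, 46.29) at roughly
0.9% more elapsed time.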

diffprofile:

7763 4.8% total
2921 6.4% default_idle
949 0.0% direct_strnlen_user
719 20.6% __copy_from_user_ll
554 10.4% __copy_to_user_ll
544 33.5% kmem_cache_free
425 0.0% kpmd_ctor
372 26.1% schedule
349 18.7% atomic_dec_and_lock
322 4.1% __d_lookup
318 8.6% find_get_page
283 165.5% may_open
279 1.2% page_remove_rmap
275 16.0% buffered_rmqueue
263 42.4% __wake_up
212 15.3% free_hot_cold_page
119 6.4% path_lookup
117 3.7% zap_pte_range
114 0.0% direct_strncpy_from_user
107 17.3% generic_file_open
...
-102 -1.6% page_add_rmap
-122 -100.0% strncpy_from_user
-288 -79.8% dentry_open
-305 -66.2% do_page_cache_readahead
-353 -100.0% pgd_ctor
-447 -80.4% file_ra_state_init
-558 -74.9% filp_close
-854 -100.0% strnlen_user

2003-08-26 10:11:37

by Andrew Morton

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

Steve Lord <[email protected]> wrote:
>
> > > Is this enough information to help find the cause of the bug? If not,
> > > it might be several days (if I'm unlucky, maybe even a week or two)
> > > before I have time to do anything more...
> > >
> >
> > -mm kernels have O_DIRECT-for-NFS patches in them. And some versions of
> > RPM use O_DIRECT. Whether O_DIRECT makes any difference at the server end
> > I do not know, but it would be useful if you could repeat the test on stock
> > 2.6.0-test4.
> >
> > Alternatively, run
> >
> > export LD_ASSUME_KERNEL=2.2.5
> >
> > before running RPM. I think that should tell RPM to not try O_DIRECT.
>
> I doubt the NFS client is O_DIRECT capable here, I have run some rpm
> builds over nfs to 2.6.0-test4 and an xfs filesystem, everything is
> behaving so far. I will try mm1 tomorrow.
>
> Do we know if this is NFS V3 or V2, by the way?

OK, sorry for the noise. It appears that this is due to the AIO patches in
-mm. fsx-linux fails instantly on nfsv3 to localhost on XFS. It's OK on
ext2 for some reason.

Binary searching reveals that the offending patch is
O_SYNC-speedup-nolock-fix.patch

testcase:

mkfs.xfs -f /dev/hda5
mount /dev/hda5 /mnt/hda5
chmod a+rw /mnt/hda5
service nfs start
mount localhost:/mnt/hda5 /mnt/localhost
cd /mnt/localhost
fsx-linux foo


truncating to largest ever: 0x13e76
READ BAD DATA: offset = 0x18f13, size = 0xee06, fname = foo
OFFSET GOOD BAD RANGE
0x26000 0x02eb 0x0000 0x 0
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x26001 0xeb02 0x0000 0x 1
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops
0x26002 0x0228 0x0000 0x 2
operation# (mod 256) for the bad data unknown, check HOLE and EXTEND ops

2003-08-26 10:07:14

by William Lee Irwin III

Subject: Re: 2.6.0-test4-mm1

On Mon, Aug 25, 2003 at 04:10:42PM -0700, Martin J. Bligh wrote:
> 7763 4.8% total
> 2921 6.4% default_idle
> 949 0.0% direct_strnlen_user
> 719 20.6% __copy_from_user_ll
> 554 10.4% __copy_to_user_ll
> 544 33.5% kmem_cache_free
> 425 0.0% kpmd_ctor
> 372 26.1% schedule
> 349 18.7% atomic_dec_and_lock
> 322 4.1% __d_lookup
> 318 8.6% find_get_page
> 283 165.5% may_open

Hmm, seeing functions I wrote in diffprofiles like this gives me the
wli's. Any chance you could snapshot /proc/slabinfo say every 1s during
a run so I can see what's going on?
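
A minimal sketch in C of that kind of periodic snapshotting, assuming the
file is readable by the caller. snapshot_file and snapshot_loop are
hypothetical names, and the timestamp separator line is an arbitrary
choice, not an established format.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Copy the current contents of path to out, preceded by a timestamp
 * line. Returns the number of bytes copied, or -1 if path cannot be
 * opened.
 */
static long snapshot_file(const char *path, FILE *out)
{
	char buf[4096];
	size_t n;
	long total = 0;
	FILE *in;

	assert(path != NULL && out != NULL);
	in = fopen(path, "r");
	if (!in)
		return -1;

	fprintf(out, "--- %ld ---\n", (long)time(NULL));
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0) {
		fwrite(buf, 1, n, out);
		total += (long)n;
	}
	fclose(in);
	return total;
}

/*
 * Take count snapshots of path, sleeping interval_sec seconds between
 * them. Returns 0 on success, -1 as soon as one snapshot fails.
 */
static int snapshot_loop(const char *path, FILE *out, int count,
			 unsigned int interval_sec)
{
	int i;

	for (i = 0; i < count; i++) {
		if (snapshot_file(path, out) < 0)
			return -1;
		fflush(out);
		if (i + 1 < count)
			sleep(interval_sec);
	}
	return 0;
}
```

Calling snapshot_loop("/proc/slabinfo", stdout, 60, 1) during a run would
log one snapshot per second for a minute.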


-- wli

2003-08-26 10:56:31

by Suparna Bhattacharya

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

On Tue, Aug 26, 2003 at 03:14:12AM -0700, Andrew Morton wrote:
> Steve Lord <[email protected]> wrote:
> >
> > > > Is this enough information to help find the cause of the bug? If not,
> > > > it might be several days (if I'm unlucky, maybe even a week or two)
> > > > before I have time to do anything more...
> > > >
> > >
> > > -mm kernels have O_DIRECT-for-NFS patches in them. And some versions of
> > > RPM use O_DIRECT. Whether O_DIRECT makes any difference at the server end
> > > I do not know, but it would be useful if you could repeat the test on stock
> > > 2.6.0-test4.
> > >
> > > Alternatively, run
> > >
> > > export LD_ASSUME_KERNEL=2.2.5
> > >
> > > before running RPM. I think that should tell RPM to not try O_DIRECT.
> >
> > I doubt the NFS client is O_DIRECT capable here, I have run some rpm
> > builds over nfs to 2.6.0-test4 and an xfs filesystem, everything is
> > behaving so far. I will try mm1 tomorrow.
> >
> > Do we know if this is NFS V3 or V2, by the way?
>
> OK, sorry for the noise. It appears that this is due to the AIO patches in
> -mm. fsx-linux fails instantly on nfsv3 to localhost on XFS. It's OK on
> ext2 for some reason.
>
> Binary searching reveals that the offending patch is
> O_SYNC-speedup-nolock-fix.patch
>

I'm not sure if this would help here, but there is
one bug which I just spotted which would affect writev from
XFS. I wasn't passing the nr_segs down properly.

Regards
Suparna

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India


--- linux-2.6.0-test4-mm1/mm/filemap.c 2003-08-26 10:09:50.000000000 +0530
+++ fix-mm/mm/filemap.c 2003-08-26 16:23:55.000000000 +0530
@@ -1942,7 +1942,7 @@ generic_file_aio_write_nolock(struct kio
 		goto osync;
 	}
 
-	ret = __generic_file_aio_write_nolock(iocb, iov, 1, ppos);
+	ret = __generic_file_aio_write_nolock(iocb, iov, nr_segs, ppos);
 
 	/*
 	 * Avoid doing a sync in parts for aio - its more efficient to
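
To see why the hard-coded 1 matters, here is a small userspace model of the same mistake. It is only a sketch and not kernel code: `inner_write_len`, `wrapper_buggy` and `wrapper_fixed` are hypothetical names, and the "write" merely adds up segment lengths to show how much of the iovec array the inner routine gets to see.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Hypothetical stand-in for the inner write routine: report how many
 * bytes it was asked to write across the first nr_segs segments.
 */
size_t inner_write_len(const struct iovec *iov, unsigned long nr_segs)
{
    size_t total = 0;

    assert(iov != NULL);
    for (unsigned long i = 0; i < nr_segs; i++)
        total += iov[i].iov_len;
    return total;
}

/* Models the bug: the caller's segment count is dropped and 1 is passed. */
size_t wrapper_buggy(const struct iovec *iov, unsigned long nr_segs)
{
    (void)nr_segs;
    return inner_write_len(iov, 1);
}

/* Models the fix: the caller's segment count is forwarded unchanged. */
size_t wrapper_fixed(const struct iovec *iov, unsigned long nr_segs)
{
    return inner_write_len(iov, nr_segs);
}
```

With a single-segment request the two wrappers agree, which would fit the problem only showing up for callers that pass more than one segment.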

2003-08-26 14:27:37

by Martin J. Bligh

Subject: Re: 2.6.0-test4-mm1

--William Lee Irwin III <[email protected]> wrote (on Tuesday, August 26, 2003 03:08:24 -0700):

> On Mon, Aug 25, 2003 at 04:10:42PM -0700, Martin J. Bligh wrote:
>> 7763 4.8% total
>> 2921 6.4% default_idle
>> 949 0.0% direct_strnlen_user
>> 719 20.6% __copy_from_user_ll
>> 554 10.4% __copy_to_user_ll
>> 544 33.5% kmem_cache_free
>> 425 0.0% kpmd_ctor
>> 372 26.1% schedule
>> 349 18.7% atomic_dec_and_lock
>> 322 4.1% __d_lookup
>> 318 8.6% find_get_page
>> 283 165.5% may_open
>
> Hmm, seeing functions I wrote in diffprofiles like this gives me the
> wli's. Any chance you could snapshot /proc/slabinfo say every 1s during
> a run so I can see what's going on?

You should be able to recreate this easily yourself, but on closer
inspection, it seems the cost is just shifted from pgd_ctor.

M.

2003-08-26 14:32:09

by William Lee Irwin III

Subject: Re: 2.6.0-test4-mm1

William Lee Irwin III <[email protected]> wrote (on Tuesday, August 26, 2003 03:08:24 -0700):
>> Hmm, seeing functions I wrote in diffprofiles like this gives me the
>> wli's. Any chance you could snapshot /proc/slabinfo say every 1s during
>> a run so I can see what's going on?

On Tue, Aug 26, 2003 at 07:23:10AM -0700, Martin J. Bligh wrote:
> You should be able to recreate this easily yourself, but on closer
> inspection, it seems the cost is just shifted from pgd_ctor.

That's a big relief.

Thanks.


-- wli

2003-08-26 15:14:02

by Arnaldo Carvalho de Melo

Subject: Re: 2.6.0-test4-mm1: wl3501_cs.c doesn't compile

Em Tue, Aug 26, 2003 at 12:01:08AM +0200, Adrian Bunk escreveu:
> It's attached.

Not needed as per what you said below :)

> I traced the problem down to the gcc version: The file compiles with
> gcc 3.3 but fails to compile with gcc 2.95.

I see, I'll fix this one, I had a problem with this in another source file,
IIRC. Die, gcc 2.95 die! :-)

- Arnaldo

2003-08-26 17:42:19

by Andrew Morton

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

Suparna Bhattacharya <[email protected]> wrote:
>
> > Binary searching reveals that the offending patch is
> > O_SYNC-speedup-nolock-fix.patch
> >
>
> I'm not sure if this would help here, but there is
> one bug which I just spotted which would affect writev from
> XFS. I wasn't passing the nr_segs down properly.

That fixes it, thanks.

2003-08-26 17:58:44

by Steve Lord

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

On Tue, 2003-08-26 at 12:44, Andrew Morton wrote:
> Suparna Bhattacharya <[email protected]> wrote:
> >
> > > Binary searching reveals that the offending patch is
> > > O_SYNC-speedup-nolock-fix.patch
> > >
> >
> > I'm not sure if this would help here, but there is
> > one bug which I just spotted which would affect writev from
> > XFS. I wasn't passing the nr_segs down properly.
>
> That fixes it, thanks.

Does rpm use readv/writev though? Or does the NFS server? I'm not sure
how this change would affect the original problem report.

Steve

--

Steve Lord voice: +1-651-683-3511
Principal Engineer, Filesystem Software email: [email protected]

2003-08-26 18:31:49

by Andrew Morton

Subject: Re: [BUG] 2.6.0-test4-mm1: NFS+XFS=data corruption

Steve Lord <[email protected]> wrote:
>
> Does rpm use readv/writev though? Or does the nfs server? not sure
> how this change would affect the original problem report.

The NFS server uses multisegment writev. RPM was running at the other end
of the ethernet, so it doesn't really matter what sort of write RPM
is issuing.
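
For readers unfamiliar with the term, a "multisegment writev" is one writev(2) call that gathers several separate buffers into a single write. The sketch below shows the userspace form of such a call; the helper name `write_three_segments` is hypothetical, and it says nothing about how the NFS server builds its vectors internally.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write three separate strings to fd with a single writev() call.
 * Returns the number of bytes written, or -1 on error.
 * (Hypothetical helper for illustration only.)
 */
ssize_t write_three_segments(int fd, const char *a, const char *b, const char *c)
{
    struct iovec iov[3];

    assert(a != NULL && b != NULL && c != NULL);

    /* writev() only reads from the buffers, so dropping const is safe. */
    iov[0].iov_base = (void *)a;
    iov[0].iov_len = strlen(a);
    iov[1].iov_base = (void *)b;
    iov[1].iov_len = strlen(b);
    iov[2].iov_base = (void *)c;
    iov[2].iov_len = strlen(c);

    /* The third argument is the segment count. */
    return writev(fd, iov, 3);
}
```

If a layer underneath treated that segment count as 1, only the first buffer would be considered, which is the kind of loss the nr_segs fix addresses.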