2003-11-13 07:26:05

by Andrew Morton

Subject: 2.6.0-test9-mm3


http://www.zip.com.au/~akpm/linux/patches/2.6.0-test9-mm3.gz

kernel.org is being slow. Will appear at:

ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test9/2.6.0-test9-mm3/

- Various new fixes; generally non-critical ones.

- Significant changes to the AIO and direct-io code. This needs beating
on; hopefully we're now close to a solution to the fairly complex problems
in there.

- Several ext2 and ext3 allocator fixes. These need serious testing on big
SMP.

- Anyone who has patches in here which they think should go into 2.6.0,
please retest them in -mm3 and let me know, thanks.



linus.patch

Latest Linus tree

-as-badness-warning-fix.patch
-3c509-mca-fix.patch
-ext2-allocation-fix.patch
-ohci-locking-fix.patch
-disable-ide-tcq.patch
-via-quirk-fix.patch
-raid1-recovery-fix.patch
-journal_remove_journal_head-assertion-fix.patch
-x86_64-tss-limit-fix.patch
-keyboard-repeat-rate-setting-fix.patch
-aio-refcounting-fix.patch

Merged

-RD16-rest-B6.patch

Al said to drop this.

+cramfs-use-pagecache.patch

cramfs fixes

-ia32-MSI-support-tweaks.patch

Folded into ia32-MSI-support.patch

+ia32-MSI-support-x86_64-fixes.patch

x86_64 build fix

-ia32-efi-asm-warning-fix.patch
-ia32-efi-support-mem-equals-fix.patch
-CONFIG_ACPI_EFI-defaults-off.patch
-ia32-efi-support-warning-fixes.patch
-ia32-efi-support-tidy.patch
-ia32-efi-other-arch-fix.patch
-efi-constant-sizing-fix.patch
-ia32-efi-config-option.patch
-ia32-efi-config-option-tweaks.patch
-ia32-efi-config-help-update.patch
-ia64-CONFIG_EFI-update.patch

Folded into ia32-efi-support.patch

+ia64-ia32-missing-compat-syscalls.patch
+compat-layer-fixes.patch

32-bit compat layer fixes

+compat-ioctl-for-i2c.patch

compat layer for i2c (old version)

+loop-bio-handling-fix.patch

Loop driver fixlet

-gcc-Os-if-embedded-better-help.patch

Folded into gcc-Os-if-embedded.patch

+as-request-poisoning-fix.patch
+as-fix-all-known-bugs.patch

Anticipatory scheduler fixes.

+more-than-256-cpus.patch

cpumask fixes for huge SMP

+acpi-pm-timer.patch
+acpi-pm-timer-fixes.patch

Yet another timer source for ia32

+ZONE_SHIFT-from-NODES_SHIFT.patch

Memory zone arith fixup

+ext2_new_inode-fixes.patch
+ext2_new_inode-fixes-tweaks.patch
+remove-ext2_reverve_inode.patch

ext2 fixes

+memmove-speedup.patch

Make memmove() faster.

+percpu-counter-linkage-fix.patch

Fix the build for when ext2 and ext3 are modular

+ide-scsi-warnings.patch

Print warnings when someone tries to use ide-scsi for a cdrom

+pipe-readv-writev.patch

pipe readv() and writev() correctness fix and speedup

+ext3_new_inode-scan-fix.patch

ext3 inode allocator fix

+lockless-semop.patch

sysv semaphore SMP speedup

+percpu_counter-use-alloc_percpu.patch

Fix the percpu counters for huge SMP.

+i450nx-scanning-fix.patch

PCI bridge fix for i450nx chipset machines

+serio-pm-fix.patch

Fix psmouse PM resume

+find_busiest_queue-commentary.patch

CPU scheduler comments

+ext2-block-allocator-fixes.patch

More ext2 allocator fixes.

+SOUND_CMPCI-config-typo-fix.patch

Sound driver config fix

+atkbd-24-compatibility.patch

Make AT keyboard userspace interface compatible with 2.4's.

+init_h-needs-compiler_h.patch
+init_h-needs-compiler_h-fix.patch

Compile fix

+cpu_sibling_map-fix.patch

cpu_sibling_map is broken on summit.

+tulip-hash-fix.patch

Fix multicast hash generation for some tulips

+context-switch-accounting-fix.patch

Fix CPU scheduler beancounting with CONFIG_PREEMPT.

+access-vfs_permission-fix.patch

Fix access()

+eicon-linkage-fix.patch

ISDN build fix

+kobject-docco-additions.patch

Documentation additions.

-O_DIRECT-race-fixes-rework-XFS-fix.patch
-O_DIRECT-race-fixes-rework-XFS-fix-fix.patch

Folded into O_DIRECT-race-fixes-rollup.patch

+dio-aio-fixes.patch
+dio-aio-fixes-fixes.patch

AIO/direct-io fixes

+promise-sata-id.patch

Additional SATA PCI ID.
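
For anyone wanting to exercise the pipe-readv-writev.patch entry above, the following is a minimal userspace sketch of a gathered write to a pipe. It does not reproduce the bug the patch fixes; it only shows the call pattern involved. The function name `pipe_writev_roundtrip` and the buffer contents are made up for illustration. The total size is kept far below PIPE_BUF, which is the range in which POSIX asks for pipe writes not to be interleaved with other writers.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: gather two buffers into a single writev() on a
 * pipe, then read the result back into 'out'.
 * Returns the number of bytes read, or -1 on failure.
 */
ssize_t pipe_writev_roundtrip(char *out, size_t outlen)
{
	int fds[2];
	char part1[] = "hello, ";
	char part2[] = "pipe\n";
	struct iovec iov[2];
	ssize_t written, got;

	assert(out != NULL);
	if (pipe(fds) != 0)
		return -1;

	iov[0].iov_base = part1;
	iov[0].iov_len = strlen(part1);
	iov[1].iov_base = part2;
	iov[1].iov_len = strlen(part2);

	/* One gathered write; both segments should arrive back to back. */
	written = writev(fds[1], iov, 2);
	close(fds[1]);
	if (written != (ssize_t)(iov[0].iov_len + iov[1].iov_len)) {
		close(fds[0]);
		return -1;
	}

	/* The write end is closed, so this read drains what was written. */
	got = read(fds[0], out, outlen);
	close(fds[0]);
	return got;
}
```

A real stress test would run several writers against the same pipe and check that no two records are interleaved; this sketch only covers the single-writer case.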




All 201 patches


linus.patch

mm.patch
add -mmN to EXTRAVERSION

kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
kgdb: warning fix

kgdb-buff-too-big.patch
kgdb buffer overflow fix

kgdb-warning-fix.patch
kgdb: warning fix

kgdb-build-fix.patch

kgdb-spinlock-fix.patch

kgdb-fix-debug-info.patch
kgdb: CONFIG_DEBUG_INFO fix

kgdb-cpumask_t.patch

kgdb-x86_64-fixes.patch
x86_64 fixes

kgdb-over-ethernet.patch
kgdb-over-ethernet patch

kgdb-over-ethernet-fixes.patch
kgdb-over-ethernet fixlets

kgdb-CONFIG_NET_POLL_CONTROLLER.patch
kgdb: replace CONFIG_KGDB with CONFIG_NET_RX_POLL in net drivers

kgdb-handle-stopped-NICs.patch
kgdb: handle netif_stopped NICs

eepro100-poll-controller.patch

tlan-poll_controller.patch

tulip-poll_controller.patch

tg3-poll_controller.patch
kgdb: tg3 poll_controller

8139too-poll_controller.patch
8139too poll controller

kgdb-eth-smp-fix.patch
kgdb-over-ethernet: fix SMP

kgdb-eth-reattach.patch

kgdb-skb_reserve-fix.patch
kgdb-over-ethernet: skb_reserve() fix

must-fix.patch

should-fix.patch

must-fix-update-01.patch
must fix lists update

RD1-cdrom_ioctl-B6.patch

RD2-ioctl-B6.patch

RD2-ioctl-B6-fix.patch
RD2-ioctl-B6 fixes

RD3-cdrom_open-B6.patch

RD4-open-B6.patch

RD5-cdrom_release-B6.patch

RD6-release-B6.patch

RD7-presto_journal_close-B6.patch

RD8-f_mapping-B6.patch

RD9-f_mapping2-B6.patch

RD10-i_sem-B6.patch

RD11-f_mapping3-B6.patch

RD12-generic_osync_inode-B6.patch

RD13-bd_acquire-B6.patch

RD14-generic_write_checks-B6.patch

RD15-I_BDEV-B6.patch

cramfs-use-pagecache.patch
cramfs: use pagecache better

invalidate_inodes-speedup.patch
invalidate_inodes speedup

invalidate_inodes-speedup-fixes-2.patch
more invalidate_inodes speedup fixes

serio-01-renaming.patch
serio: rename serio_[un]register_slave_port to __serio_[un]register_port

serio-02-race-fix.patch
serio: possible race between port removal and kseriod

serio-03-blacklist.patch
Add black list to handler<->device matching

serio-04-synaptics-cleanup.patch
Synaptics: code cleanup

serio-05-reconnect-facility.patch
serio: reconnect facility

serio-06-synaptics-use-reconnect.patch
Synaptics: use serio_reconnect

acpi_off-fix.patch
fix acpi=off

cfq-4.patch
CFQ io scheduler
CFQ fixes

config_spinline.patch
uninline spinlocks for profiling accuracy.

ppc64-bar-0-fix.patch
Allow PCI BARs that start at 0

ppc64-reloc_hide.patch

sym-do-160.patch
make the SYM driver do 160 MB/sec

input-use-after-free-checks.patch
input layer debug checks

aic7xxx-parallel-build-fix.patch
fix parallel builds for aic7xxx

ramdisk-cleanup.patch

intel8x0-cleanup.patch
intel8x0 cleanups

pdflush-diag.patch

kobject-oops-fixes.patch
fix oopses if kobject parent is removed before child

futex-uninlinings.patch
futex uninlining

zap_page_range-debug.patch
zap_page_range() debug

call_usermodehelper-retval-fix-3.patch
Make call_usermodehelper report exit status

asus-L5-fix.patch
Asus L5 framebuffer fix

jffs-use-daemonize.patch

tulip-NAPI-support.patch
tulip NAPI support

tulip-napi-disable.patch
tulip NAPI: disable poll in close

get_user_pages-handle-VM_IO.patch

ia32-MSI-support.patch
Updated ia32 MSI Patches

ia32-MSI-support-x86_64-fixes.patch

ia32-efi-support.patch
EFI support for ia32
efi warning fix
fix EFI for ppc64, ia64
efi: warning fixes
ia32 EFI: Add CONFIG_EFI
efi: Update Kconfig help
efi update patch (ia64)

support-zillions-of-scsi-disks.patch
support many SCSI disks

SGI-IOC4-IDE-chipset-support.patch
Add support for SGI's IOC4 chipset

sparc32-sched_clock.patch

pcibios_test_irq-fix.patch
Fix pcibios test IRQ handler return

fixmap-in-proc-pid-maps.patch
report user-readable fixmap area in /proc/PID/maps

i82365-sysfs-ordering-fix.patch
Fix init_i82365 sysfs ordering oops

pci_set_power_state-might-sleep.patch

ia64-ia32-missing-compat-syscalls.patch
From: Arun Sharma
Subject: Missing compat syscalls in ia64

compat-layer-fixes.patch
Minor bug fixes to the compat layer

compat-ioctl-for-i2c.patch
compat_ioctl for i2c

compat_ioctl-cleanup.patch
cleanup of compat_ioctl functions

fix-sqrt.patch
sqrt() fixes

scale-min_free_kbytes.patch
scale the initial value of min_free_kbytes

cdrom-allocation-try-harder.patch
Use __GFP_REPEAT for cdrom buffer

sym-2.1.18f.patch

CONFIG_STANDALONE-default-to-n.patch
Make CONFIG_STANDALONE default to N

extra-buffer-diags.patch

nosysfs.patch

constant_test_bit-doesnt-like-zwanes-gcc.patch
gcc bug workaround for constant_test_bit()

slab-leak-detector.patch
slab leak detector

early-serial-registration-fix.patch
serial console registration bugfix

3c527-smp-update.patch
SMP support on 3c527 net driver

3c527-race-fix.patch

ext3-latency-fix.patch
ext3 scheduling latency fix

videobuf_waiton-race-fix.patch

firmware-kernel_thread-on-demand.patch
Remove workqueue usage from request_firmware_async()

loop-autoloading-fix.patch
Fix loop module auto loading

loop-module-alias.patch
loop needs MODULE_ALIAS_BLOCK

loop-remove-blkdev-special-case.patch

loop-highmem.patch
remove useless highmem bounce from loop/cryptoloop

loop-highmem-fixes.patch

loop-bio-handling-fix.patch
loop: BIO handling fix

cmpci-set_fs-fix.patch
cmpci.c: remove pointless set_fs()

dentry-bloat-fix-2.patch
Fix dcache and icache bloat with deep directories

nls-config-fixes.patch
NLS config fixes

proc_pid_lookup-vs-exit-race-fix.patch
Fix proc_pid_lookup vs exit race

gcc-Os-if-embedded.patch
Add `gcc -Os' config option

aic7xxx-sleep-in-spinlock-fix.patch

vm86-sysenter-fix.patch
Fix sysenter disabling in vm86 mode

gettimeofday-resolution-fix.patch
gettimeofday resolution fix

refill_counter-overflow-fix.patch
vmscan: reset refill_counter after refilling the inactive list

verbose-timesource.patch
be verbose about the time source

as-regression-fix.patch
Fix IO scheduler regression

as-request-poisoning.patch
AS: request poisoning

as-request-poisoning-fix.patch
AS: request poisoning fix

as-fix-all-known-bugs.patch
AS fixes

as-new-process-estimation.patch
AS: new process estimation

as-cooperative-thinktime.patch
AS: thinktime improvement

scale-nr_requests.patch
scale nr_requests with TCQ depth

truncate_inode_pages-check.patch

local_bh_enable-warning-fix.patch

cdc-acm-softirq-rx.patch
cdc-acm: move rx processing to softirq

forcedeth.patch
forcedeth: nForce ethernet driver

reiserfs-pinned-buffer-fix.patch
reiserfs pinned buffer fix

proc-pid-maps-output-fix.patch
Restore /proc/pid/maps formatting

atomic_dec-debug.patch
atomic_dec debug

sis900-pm-support.patch
Add PM support to sis900 network driver

8139too-locking-fix.patch
8139too locking fix

ia32-wp-test-cleanup.patch
ia32 WP test cleanup

hugetlb-needs-pse.patch
ia32: hugetlb needs pse

powermate-payload-size-fix.patch
Griffin Powermate fix

more-than-256-cpus.patch
Fix for more than 256 CPUs

acpi-pm-timer.patch
ACPI PM Timer

acpi-pm-timer-fixes.patch
ACPI PM-Timer fixes

ZONE_SHIFT-from-NODES_SHIFT.patch
Use NODES_SHIFT to calculate ZONE_SHIFT

ext2_new_inode-fixes.patch
Fix bugs in ext2_new_inode()

ext2_new_inode-fixes-tweaks.patch
ext2_new_inode: more tweaking

remove-ext2_reverve_inode.patch

memmove-speedup.patch
optimize ia32 memmove

percpu-counter-linkage-fix.patch
fix percpu_counter_mod linkage problem

ide-scsi-warnings.patch
ide-scsi: warn when used for cdroms

pipe-readv-writev.patch
Fix writev atomicity on pipe/fifo

ext3_new_inode-scan-fix.patch
ext3_new_inode fixlet

lockless-semop.patch
lockless semop

percpu_counter-use-alloc_percpu.patch
use alloc_percpu in percpu_counters

i450nx-scanning-fix.patch
i450nx PCI scanning fix

serio-pm-fix.patch
psmouse pm resume fix

find_busiest_queue-commentary.patch
find_busiest_queue() commentary fix

ext2-block-allocator-fixes.patch
ext2 block allocator fixes

SOUND_CMPCI-config-typo-fix.patch
fix SOUND_CMPCI Configure help entry

atkbd-24-compatibility.patch
Fixes for keyboard 2.4 compatibility

init_h-needs-compiler_h.patch
init.h needs to include compiler.h

init_h-needs-compiler_h-fix.patch
compile fix for older gcc's

cpu_sibling_map-fix.patch
cpu_sibling_map fix

tulip-hash-fix.patch
tulip filter hash fix

context-switch-accounting-fix.patch
Fix context switch accounting

access-vfs_permission-fix.patch
Subject: Re: [PATCH] fix access() / vfs_permission() bug

eicon-linkage-fix.patch
eicon/ and hardware/eicon/ drivers using the same symbols

kobject-docco-additions.patch
Improve documentation for kobjects

list_del-debug.patch
list_del debug check

print-build-options-on-oops.patch

show_task-free-stack-fix.patch
show_task() fix and cleanup

oops-dump-preceding-code.patch
i386 oops output: dump preceding code

lockmeter.patch

printk-oops-mangle-fix.patch
disentangle printk's whilst oopsing on SMP

4g-2.6.0-test2-mm2-A5.patch
4G/4G split patch
4G/4G: remove debug code
4g4g: pmd fix
4g/4g: fixes from Bill
4g4g: fpu emulation fix
4g/4g usercopy atomicity fix
4G/4G preempt on vstack
4G/4G: even number of kmap types
4g4g: fix __get_user in slab
4g4g: Remove extra .data.idt section definition
4g/4g linker error (overlapping sections)
4g4g: show_registers() fix
4g4g: debug flags fix
4g4g: Fix wrong asm-offsets entry
cyclone time fixmap fix
use direct_copy_{to,from}_user for kernel access in mm/usercopy.c
4G/4G might_sleep warning fix
4g/4g pagetable accounting fix

4g4g-athlon-prefetch-handling-fix.patch

4g4g-wp-test-fix.patch
Fix 4G/4G and WP test lockup

4g4g-KERNEL_DS-usercopy-fix.patch
4G/4G KERNEL_DS usercopy again

ppc-fixes.patch
make mm4 compile on ppc

aic7xxx_old-oops-fix.patch

O_DIRECT-race-fixes-rollup.patch
DIO fixes forward port and AIO-DIO fix
O_DIRECT race fixes comments
O_DIRECT race fixes fix fix fix
DIO locking rework
O_DIRECT XFS fix

dio-aio-fixes.patch
direct-io AIO fixes

dio-aio-fixes-fixes.patch
dio-aio fix fix

readahead-multiple-fixes.patch
readahead: multiple performance fixes

readahead-simplification.patch
readahead simplification

aio-sysctl-parms.patch
aio sysctl parms

aio-01-retry.patch
AIO: Core retry infrastructure
Fix aio process hang on EINVAL
AIO: flush workqueues before destroying ioctx'es
AIO: hold the context lock across unuse_mm
take task_lock in use_mm()

4g4g-aio-hang-fix.patch
Fix AIO and 4G-4G hang

aio-retry-elevated-refcount.patch
aio: extra ref count during retry

aio-splice-runlist.patch
Splice AIO runlist for fairer handling of multiple io contexts

aio-02-lockpage_wq.patch
AIO: Async page wait

aio-03-fs_read.patch
AIO: Filesystem aio read

aio-04-buffer_wq.patch
AIO: Async buffer wait
lock_buffer_wq fix

aio-05-fs_write.patch
AIO: Filesystem aio write

aio-06-bread_wq.patch
AIO: Async block read

aio-07-ext2getblk_wq.patch
AIO: Async get block for ext2

O_SYNC-speedup-2.patch
speed up O_SYNC writes

O_SYNC-speedup-2-f_mapping-fixes.patch

aio-09-o_sync.patch
aio O_SYNC
AIO: fix a BUG
Unify o_sync changes for aio and regular writes
aio-O_SYNC-fix bits got lost
aio: writev nr_segs fix
More AIO O_SYNC related fixes

aio-09-o_sync-f_mapping-fixes.patch

gang_lookup_next.patch
Change the page gang lookup API

aio-gang_lookup-fix.patch
AIO gang lookup fixes

aio-O_SYNC-short-write-fix.patch
Fix for O_SYNC short writes

aio-12-readahead.patch
AIO: readahead fixes
aio O_DIRECT no readahead
Unified page range readahead for aio and regular reads

aio-12-readahead-f_mapping-fix.patch

aio-readahead-speedup.patch
Readahead issues and AIO read speedup

promise-sata-id.patch
add Promise 20376 PCI ID




2003-11-13 20:07:48

by john stultz

Subject: [PATCH] linux-2.6.0-test9-mm3_verbose-timesource-acpi-pm_A0

On Wed, 2003-11-12 at 23:30, Andrew Morton wrote:
> +acpi-pm-timer.patch
> +acpi-pm-timer-fixes.patch
>
> Yet another timer source for ia32
>
[snip]
> verbose-timesource.patch
> be verbose about the time source

Andrew,
I forgot that I sent you the verbose-timesource patch. The ACPI PM time
source will need this simple fix to work alongside that patch.

thanks
-john

===== arch/i386/kernel/timers/timer_pm.c 1.6 vs edited =====
--- 1.6/arch/i386/kernel/timers/timer_pm.c Tue Nov 4 11:39:50 2003
+++ edited/arch/i386/kernel/timers/timer_pm.c Thu Nov 13 11:12:23 2003
@@ -185,6 +185,7 @@

 /* acpi timer_opts struct */
 struct timer_opts timer_pmtmr = {
+	.name = "pmtmr",
 	.init = init_pmtmr,
 	.mark_offset = mark_offset_pmtmr,
 	.get_offset = get_offset_pmtmr,
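
To illustrate why the one-line fix matters, here is a small userspace sketch. It assumes the verbose-timesource patch prints the selected timer's `name` field; the struct below is a cut-down stand-in, not the kernel's real `struct timer_opts`, and the message text and the `_sketch` names are assumptions made for this example. In C, a designated initializer that omits `.name` leaves the pointer NULL, so a timer without the field has nothing meaningful to report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Cut-down stand-in for the kernel structure (illustration only). */
struct timer_opts_sketch {
	const char *name;
	int (*init)(void);
};

int init_pmtmr_sketch(void)
{
	return 0;
}

/* With the fix: the timer carries a name. */
const struct timer_opts_sketch timer_pmtmr_sketch = {
	.name = "pmtmr",
	.init = init_pmtmr_sketch,
};

/* Without the fix: .name is omitted and therefore NULL. */
const struct timer_opts_sketch timer_unnamed_sketch = {
	.init = init_pmtmr_sketch,
};

/*
 * Format a report line for the chosen timer into buf.
 * The wording of the message is an assumption, not taken from the patch.
 */
int describe_timesource(const struct timer_opts_sketch *t,
			char *buf, size_t len)
{
	assert(t != NULL && buf != NULL);
	return snprintf(buf, len, "Using %s for high-res timesource",
			t->name ? t->name : "(unnamed)");
}
```

Calling `describe_timesource()` on the two sample structures shows the difference between a named and an unnamed time source.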



2003-11-13 22:07:03

by John Cherry

Subject: Re: 2.6.0-test9-mm3 (compile stats)

Linux 2.6 (mm tree) Compile Statistics (gcc 3.2.2)
Warnings/Errors Summary

Kernel            bzImage      bzImage   bzImage   modules   bzImage   modules
                  (defconfig)  (allno)   (allyes)  (allyes)  (allmod)  (allmod)
---------------   -----------  --------  --------  --------  --------  --------
2.6.0-test9-mm3   0w/0e        0w/0e     172w/0e   12w/0e    3w/0e     211w/0e
2.6.0-test9-mm2   0w/0e        0w/0e     172w/0e   12w/0e    3w/0e     211w/1e
2.6.0-test9-mm1   0w/0e        0w/0e     179w/1e   12w/0e    3w/0e     213w/1e
2.6.0-test8-mm1   0w/0e        0w/0e     183w/1e   13w/0e    3w/0e     223w/1e
2.6.0-test7-mm1   0w/0e        1w/0e     176w/1e   9w/0e     3w/0e     231w/1e
2.6.0-test6-mm4   0w/0e        1w/0e     179w/1e   9w/0e     3w/0e     234w/1e
2.6.0-test6-mm3   0w/0e        1w/0e     178w/1e   9w/0e     3w/0e     252w/2e
2.6.0-test6-mm2   0w/0e        1w/0e     179w/1e   9w/0e     3w/0e     252w/2e
2.6.0-test6-mm1   0w/0e        1w/0e     179w/1e   9w/0e     3w/0e     252w/2e

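
Each cell in the table above packs a warning count and an error count into one token of the form `<N>w/<M>e`. For anyone who wants to track these numbers across releases, a parser for one cell could look roughly like the sketch below. The function name `parse_stat_cell` is hypothetical and not part of any OSDL tooling; the sketch accepts an optional space after the slash, as in `172w/ 0e`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse a "<warnings>w/<errors>e" cell.
 * Returns 0 on success and stores the two counts, or -1 if the
 * token does not match the expected shape.
 */
int parse_stat_cell(const char *cell, int *warnings, int *errors)
{
	int w, e;
	int consumed = -1;

	assert(cell != NULL && warnings != NULL && errors != NULL);

	/* %n is only reached if the trailing 'e' matched as well. */
	if (sscanf(cell, "%dw/%de%n", &w, &e, &consumed) != 2)
		return -1;
	if (consumed < 0 || cell[consumed] != '\0')
		return -1;
	if (w < 0 || e < 0)
		return -1;

	*warnings = w;
	*errors = e;
	return 0;
}
```

Feeding it the allmodconfig modules column for -mm2 and -mm3 (`211w/1e` and `211w/0e`) gives the same warning count for both releases, with the error count dropping from one to zero.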
Web page with links to complete details:
http://developer.osdl.org/cherry/compile/

Version information for host [ cherrypit.pdx.osdl.net ]
gcc: 3.2.2
patch: 2.5.4

Kernel version: 2.6.0-test9-mm3
Kernel build:
Making bzImage (defconfig): 0 warnings, 0 errors
Making modules (defconfig): 0 warnings, 0 errors
Making bzImage (allnoconfig): 0 warnings, 0 errors
Making bzImage (allyesconfig): 172 warnings, 0 errors
Making modules (allyesconfig): 12 warnings, 0 errors
Making bzImage (allmodconfig): 3 warnings, 0 errors
Making modules (allmodconfig): 211 warnings, 0 errors

Building directories:
Building fs/adfs: clean
Building fs/affs: clean
Building fs/afs: clean
Building fs/autofs: clean
Building fs/autofs4: clean
Building fs/befs: clean
Building fs/bfs: clean
Building fs/cifs: clean
Building fs/coda: clean
Building fs/cramfs: clean
Building fs/devfs: clean
Building fs/devpts: clean
Building fs/efs: clean
Building fs/exportfs: clean
Building fs/ext2: clean
Building fs/ext3: clean
Building fs/fat: clean
Building fs/freevxfs: clean
Building fs/hfs: clean
Building fs/hpfs: clean
Building fs/hugetlbfs: clean
Building fs/intermezzo: clean
Building fs/isofs: clean
Building fs/jbd: clean
Building fs/jffs: clean
Building fs/jffs2: clean
Building fs/jfs: clean
Building fs/lockd: clean
Building fs/minix: clean
Building fs/msdos: clean
Building fs/ncpfs: clean
Building fs/nfs: clean
Building fs/nfsd: clean
Building fs/nls: clean
Building fs/ntfs: clean
Building fs/partitions: clean
Building fs/proc: clean
Building fs/qnx4: clean
Building fs/ramfs: clean
Building fs/reiserfs: clean
Building fs/romfs: clean
Building fs/smbfs: clean
Building fs/sysfs: clean
Building fs/sysv: clean
Building fs/udf: clean
Building fs/ufs: clean
Building fs/vfat: clean
Building fs/xfs: clean
Building drivers/i2c: clean
Building drivers/net: 31 warnings, 0 errors
Building drivers/media: 1 warnings, 0 errors
Building drivers/base: clean
Building drivers/pci: clean
Building drivers/eisa: clean
Building drivers/isdn: clean
Building drivers/char: 1 warnings, 0 errors
Building drivers/acpi: clean
Building drivers/serial: 1 warnings, 0 errors
Building drivers/fc4: clean
Building drivers/parport: clean
Building drivers/mtd: 23 warnings, 0 errors
Building drivers/usb: clean
Building drivers/block: 1 warnings, 0 errors
Building drivers/pcmcia: 3 warnings, 0 errors
Building drivers/input: clean
Building drivers/atm: clean
Building drivers/ide: 30 warnings, 0 errors
Building drivers/pnp: clean
Building drivers/oprofile: clean
Building drivers/ieee1394: clean
Building drivers/cdrom: 3 warnings, 0 errors
Building drivers/md: clean
Building drivers/message: 1 warnings, 0 errors
Building drivers/cpufreq: clean
Building drivers/sbus: clean
Building drivers/bluetooth: clean
Building drivers/telephony: 5 warnings, 0 errors
Building drivers/zorro: clean
Building drivers/acorn: clean
Building drivers/tc: clean
Building drivers/mca: clean
Building drivers/nubus: clean
Building drivers/misc: clean
Building drivers/dio: clean
Building drivers/scsi/aacraid: clean
Building drivers/scsi/aic7xxx: clean
Building drivers/scsi/pcmcia: 4 warnings, 0 errors
Building drivers/scsi/sym53c8xx_2: clean
Building drivers/video/aty: 3 warnings, 0 errors
Building drivers/video/console: 2 warnings, 0 errors
Building drivers/video/i810: clean
Building drivers/video/logo: clean
Building drivers/video/matrox: 5 warnings, 0 errors
Building drivers/video/riva: clean
Building drivers/video/sis: 1 warnings, 0 errors
Building sound/core: clean
Building sound/drivers: clean
Building sound/i2c: clean
Building sound/isa: 3 warnings, 0 errors
Building sound/oss: 33 warnings, 0 errors
Building sound/pci: clean
Building sound/pcmcia: clean
Building sound/synth: clean
Building sound/usb: clean
Building arch/i386: clean
Building crypto: clean
Building lib: clean
Building net: 9 warnings, 0 errors
Building security: clean
Building sound: clean
Building usr: clean
Building fs: clean
Building drivers/video: 8 warnings, 0 errors
Building drivers/scsi: 44 warnings, 0 errors
Building drivers/net: 0 warnings, 1 errors


Error Summary (individual module builds):

drivers/net: 0 warnings, 1 errors


Warning Summary (individual module builds):

drivers/block: 1 warnings, 0 errors
drivers/cdrom: 3 warnings, 0 errors
drivers/char: 1 warnings, 0 errors
drivers/ide: 30 warnings, 0 errors
drivers/media: 1 warnings, 0 errors
drivers/message: 1 warnings, 0 errors
drivers/mtd: 23 warnings, 0 errors
drivers/net: 31 warnings, 0 errors
drivers/pcmcia: 3 warnings, 0 errors
drivers/scsi/pcmcia: 4 warnings, 0 errors
drivers/scsi: 44 warnings, 0 errors
drivers/serial: 1 warnings, 0 errors
drivers/telephony: 5 warnings, 0 errors
drivers/video/aty: 3 warnings, 0 errors
drivers/video/console: 2 warnings, 0 errors
drivers/video/matrox: 5 warnings, 0 errors
drivers/video/sis: 1 warnings, 0 errors
drivers/video: 8 warnings, 0 errors
net: 9 warnings, 0 errors
sound/isa: 3 warnings, 0 errors
sound/oss: 33 warnings, 0 errors


Error List:

make[1]: [arch/i386/boot/bzImage] Error 1 (ignored)
make[2]: [drivers/net/wan/wanxlfw.inc] Error 127 (ignored)
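
A note on reading the second entry: make reports the exit status of the failed recipe, and 127 is the status a POSIX shell returns when it cannot find the command it was asked to run. That suggests, though the report does not say so, that a tool needed to generate wanxlfw.inc was missing on the build host. The sketch below demonstrates the convention from userspace; `shell_exit_status` and the command name used with it are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: run a command through the shell and return its
 * exit status, or -1 if it could not be run or did not exit normally.
 */
int shell_exit_status(const char *cmd)
{
	int status;

	assert(cmd != NULL);
	status = system(cmd);
	if (status == -1 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

Calling it with a command name that does not exist on the system, for example `shell_exit_status("hypothetical-missing-tool-xyz 2>/dev/null")`, should yield 127 on a POSIX shell, which is the same number make printed above.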


Warning List:

arch/i386/kernel/cpu/cpufreq/powernow-k8.c:38:2: warning: #warning this
driver has not been tested on a preempt system
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:938:2: warning: #warning
pol->policy is in undefined state here
drivers/cdrom/aztcd.c:379: warning: `pa_ok' defined but not used
drivers/cdrom/isp16.c:124: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/cdrom/mcdx.h:180:2: warning: #warning You have not edited mcdx.h
drivers/cdrom/mcdx.h:181:2: warning: #warning Perhaps irq and i/o
settings are wrong.
drivers/cdrom/sjcd.c:1700: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/char/applicom.c:522:2: warning: #warning "Je suis stupide. DW. -
copy*user in cli"
drivers/char/applicom.c:67: warning: `applicom_pci_tbl' defined but not
used
drivers/char/watchdog/alim1535_wdt.c:320: warning: `ali_pci_tbl' defined
but not used
drivers/ide/ide-probe.c:1326: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/ide-probe.c:1353: warning: `MOD_DEC_USE_COUNT' is deprecated
(declared at include/linux/module.h:494)
drivers/ide/ide-tape.c:6213: warning: duplicate `const'
drivers/ide/ide.c:2470: warning: implicit declaration of function
`pnpide_init'
drivers/ide/legacy/ide-cs.c:365: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/legacy/ide-cs.c:411: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/ide/pci/aec62xx.c:533: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/alim15x3.c:871: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/amd74xx.c:451: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/cmd64x.c:755: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/cs5520.c:294: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/cs5530.c:416: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/cy82c693.c:437: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/hpt34x.c:334: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/hpt366.c:1223: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/ns87415.c:228: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/opti621.c:364: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/pdc202xx_new.c:631: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/pdc202xx_old.c:925: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/piix.c:746: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/rz1000.c:65: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/sc1200.c:557: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/serverworks.c:804: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/siimage.c:1174: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/sis5513.c:956: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/slc90e66.c:376: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/triflex.c:227: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/trm290.c:378: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/ide/pci/trm290.c:406: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/via82cxxx.c:618: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/input/gameport/ns558.c:121: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
drivers/input/gameport/ns558.c:80: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/media/common/saa7146_vbi.c:6: warning: `vbi_workaround' defined
but not used
drivers/media/video/zoran_card.c:149: warning: `zr36067_pci_tbl' defined
but not used
drivers/message/fusion/mptscsih.c:6922: warning: `mptscsih_setup'
defined but not used
drivers/message/i2o/i2o_block.c:1506: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/mtd/chips/amd_flash.c:783: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/mtd/chips/cfi_cmdset_0001.c:381: warning: unsigned int format,
different type arg (arg 2)
drivers/mtd/chips/cfi_cmdset_0001.c:965: warning: unsigned int format,
different type arg (arg 2)
drivers/mtd/chips/cfi_cmdset_0002.c:1157: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0002.c:513: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0002.c:651: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0002.c:977: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0020.c:1139: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/cfi_cmdset_0020.c:1288: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/cfi_cmdset_0020.c:493: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/cfi_cmdset_0020.c:853: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/sharp.c:157: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/mtd/cmdlinepart.c:344: warning: `mtdpart_setup' defined but not
used
drivers/mtd/devices/doc2000.c:567: warning: assignment from incompatible
pointer type
drivers/mtd/devices/doc2000.c:568: warning: assignment from incompatible
pointer type
drivers/mtd/devices/doc2001.c:376: warning: assignment from incompatible
pointer type
drivers/mtd/devices/doc2001.c:377: warning: assignment from incompatible
pointer type
drivers/mtd/nftlcore.c:354: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:358: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:363: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:632: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:696: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlmount.c:220: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/net/3c515.c:529: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/acenic.c:135: warning: `acenic_pci_tbl' defined but not used
drivers/net/arcnet/arc-rimi.c:319: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com20020-isa.c:152: warning: `dev_alloc' is
deprecated (declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com20020-pci.c:71: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com90io.c:385: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com90xx.c:146: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/net/arcnet/com90xx.c:412: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com90xx.c:609: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/dgrs.c:124: warning: `dgrs_pci_tbl' defined but not used
drivers/net/eepro.c:575: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/ewrk3.c:1291: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/net/ewrk3.c:1335: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/net/hp100.c:288: warning: `hp100_pci_tbl' defined but not used
drivers/net/hp100.c:385: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/hp100.c:432: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/hp100.c:463: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/hp100.c:471: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/sk98lin/skaddr.c:1092: warning: `ReturnCode' might be used
uninitialized in this function
drivers/net/sk98lin/skaddr.c:1624: warning: `ReturnCode' might be used
uninitialized in this function
drivers/net/skfp/skfddi.c:185: warning: `skfddi_pci_tbl' defined but not
used
drivers/net/tokenring/smctr.c:3494: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/tokenring/smctr.c:733: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/tulip/winbond-840.c:149: warning: `version' defined but not
used
drivers/net/wan/cycx_drv.c:430: warning: long unsigned int format, u32
arg (arg 2)
drivers/net/wan/farsync.c:1316: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/farsync.c:1329: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/hostess_sv11.c:125: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/hostess_sv11.c:157: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/lmc/lmc_main.c:1063: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
drivers/net/wan/lmc/lmc_main.c:1184: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/lmc/lmc_main.c:1355: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/pc300_drv.c:3168: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/pc300_drv.c:3204: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/sbni.c:308: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/pcmcia/i82365.c:680: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/pcmcia/i82365.c:817: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/pcmcia/tcic.c:340: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:1003: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:1008: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:700: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:704: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:708: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:712: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:716: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:720: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:973: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:988: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:993: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:998: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/NCR5380.c:396: warning: `phases' defined but not used
drivers/scsi/NCR5380.c:699: warning: `NCR5380_probe_irq' defined but not
used
drivers/scsi/NCR5380.c:756: warning: `NCR5380_print_options' defined but
not used
drivers/scsi/NCR53c406a.c:611: warning: `NCR53c406a_setup' defined but
not used
drivers/scsi/NCR53c406a.c:660: warning: initialization from incompatible
pointer type
drivers/scsi/NCR53c406a.c:669: warning: `wait_intr' defined but not used
drivers/scsi/advansys.c:10006: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/advansys.c:4622: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/aha152x.c:396: warning: `id_table' defined but not used
drivers/scsi/aha152x.c:793: warning: `aha152x_setup' defined but not
used
drivers/scsi/aha152x.c:852: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/aha152x.c:870: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/atp870u.c:2350: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/atp870u.c:2422: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/cpqfcTSinit.c:1583: warning: unused variable `timeout'
drivers/scsi/cpqfcTSinit.c:1584: warning: unused variable `retries'
drivers/scsi/cpqfcTSinit.c:1585: warning: unused variable `scsi_cdb'
drivers/scsi/cpqfcTSinit.c:471: warning: `my_ioctl_done' defined but not
used
drivers/scsi/dtc.c:187: warning: `dtc_setup' defined but not used
drivers/scsi/eata_pio.c:596: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/fd_mcs.c:300: warning: `fd_mcs_setup' defined but not used
drivers/scsi/fd_mcs.c:311: warning: initialization from incompatible
pointer type
drivers/scsi/fd_mcs.h:27: warning: `fd_mcs_command' declared `static'
but never defined
drivers/scsi/fdomain.c:763: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/g_NCR5380.c:926: warning: `id_table' defined but not used
drivers/scsi/gdth.c:881: warning: `gdthtable' defined but not used
drivers/scsi/inia100.h:70: warning: `inia100_detect' declared `static'
but never defined
drivers/scsi/inia100.h:71: warning: `inia100_release' declared `static'
but never defined
drivers/scsi/inia100.h:72: warning: `inia100_queue' declared `static'
but never defined
drivers/scsi/inia100.h:73: warning: `inia100_abort' declared `static'
but never defined
drivers/scsi/inia100.h:74: warning: `inia100_device_reset' declared
`static' but never defined
drivers/scsi/inia100.h:75: warning: `inia100_bus_reset' declared
`static' but never defined
drivers/scsi/libata-core.c:2133: warning: `ata_qc_push' defined but not
used
drivers/scsi/psi240i.c:713: warning: initialization from incompatible
pointer type
drivers/scsi/psi240i.c:714: warning: initialization from incompatible
pointer type
drivers/scsi/sym53c416.c:627: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/sym53c416.c:715: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/wd7000.c:1611: warning: `wd7000_abort' defined but not used
drivers/serial/8250.c:693: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.c:7737: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.c:7799: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.c:7835: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.h:41: warning: `ixj_h_rcsid' defined but not used
drivers/usb/class/usb-midi.h:150: warning: `usb_midi_ids' defined but
not used
drivers/video/aty/aty128fb.c:2335: warning: `aty128fb_exit' defined but
not used
drivers/video/aty/aty128fb.c:254: warning: `mode' defined but not used
drivers/video/aty/aty128fb.c:256: warning: `nomtrr' defined but not used
drivers/video/console/mdacon.c:374: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/video/console/mdacon.c:384: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/video/hgafb.c:452: warning: `hgafb_fillrect' defined but not
used
drivers/video/hgafb.c:472: warning: `hgafb_copyarea' defined but not
used
drivers/video/hgafb.c:502: warning: `hgafb_imageblit' defined but not
used
drivers/video/imsttfb.c:1089: warning: `imsttfb_load_cursor_image'
defined but not used
drivers/video/imsttfb.c:1159: warning: `imstt_set_cursor' defined but
not used
drivers/video/matrox/matroxfb_base.c:1250: warning: `inverse' defined
but not used
drivers/video/matrox/matroxfb_g450.c:129: warning: duplicate `const'
drivers/video/matrox/matroxfb_g450.c:130: warning: duplicate `const'
drivers/video/matrox/matroxfb_maven.c:347: warning: duplicate `const'
drivers/video/matrox/matroxfb_maven.c:348: warning: duplicate `const'
drivers/video/sis/sis_main.c:622: warning: unused variable `reg'
drivers/video/tdfxfb.c:1005: warning: `tdfxfb_cursor' defined but not
used
drivers/video/tdfxfb.c:198: warning: `inverse' defined but not used
drivers/video/tdfxfb.c:199: warning: `mode_option' defined but not used
drivers/video/tridentfb.c:455: warning: `tridentfb_fillrect' defined but
not used
drivers/video/tridentfb.c:473: warning: `tridentfb_copyarea' defined but
not used
include/linux/ixjuser.h:45: warning: `ixjuser_h_rcsid' defined but not
used
include/linux/mca-legacy.h:12:2: warning: #warning "MCA legacy - please
move your driver to the new sysfs api"
net/decnet/dn_nsp_in.c:805: warning: `skb_linearize' is deprecated
(declared at include/linux/skbuff.h:1136)
net/decnet/dn_route.c:639: warning: `skb_linearize' is deprecated
(declared at include/linux/skbuff.h:1136)
net/ipv4/ipcomp.c:189: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv4/ipcomp.c:72: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv6/ipcomp6.c:174: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv6/ipcomp6.c:61: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv6/netfilter/ip6_tables.c:349: warning: `skb_linearize' is
deprecated (declared at include/linux/skbuff.h:1136)
net/ipv6/netfilter/ip6table_mangle.c:162: warning: `skb_linearize' is
deprecated (declared at include/linux/skbuff.h:1136)
net/wanrouter/wanmain.c:729: warning: `dev_get' is deprecated (declared
at include/linux/netdevice.h:514)
sound/isa/opti9xx/opti92x-ad1848.c:1670: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
sound/isa/opti9xx/opti92x-ad1848.c:1686: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
sound/isa/opti9xx/opti92x-ad1848.c:314: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
sound/oss/ad1848.c:1580: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/ad1848.c:2530: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/ad1848.c:2967: warning: `id_table' defined but not used
sound/oss/cmpci.c:1465: warning: unused variable `s'
sound/oss/cmpci.c:2865: warning: `cmpci_pci_tbl' defined but not used
sound/oss/cs4232.c:141: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/cs4232.c:193: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:76: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:78: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:93: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:94: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/mad16.c:322: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/maui.c:307: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/mpu401.c:1217: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/msnd.c:74: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
sound/oss/msnd.c:95: warning: `MOD_DEC_USE_COUNT' is deprecated
(declared at include/linux/module.h:494)
sound/oss/msnd_pinnacle.c:1123: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/msnd_pinnacle.c:1811: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/opl3sa.c:114: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/opl3sa.c:122: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/pss.c:1004: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/pss.c:191: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/pss.c:640: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/pss.c:710: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/sb_common.c:1224: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/sb_common.c:523: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/sgalaxy.c:89: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sgalaxy.c:97: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:1113: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:1132: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:1137: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:737: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/trix.c:147: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/trix.c:292: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/trix.c:85: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/wavfront.c:2426: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/wf_midi.c:788: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)



2003-11-13 22:04:20

by Daniel McNeil

Subject: Re: 2.6.0-test9-mm3 - AIO test results

Andrew,

I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
I tested using the test programs aiocp and aiodio_sparse.
(see http://developer.osdl.org/daniel/AIO/)

Using aiocp with i/o sizes from 1k to 512k to copy files worked
without any errors or kernel debug messages.

With 64k i/o, the aiodio_sparse program completes without any errors.
There are no kernel error messages, so that is good.

There are still problems with non-power-of-2 i/o sizes using AIO and
O_DIRECT. The test hangs with AIOs that do not seem to complete. It
does exit on ^C, and there are no kernel messages. Test output
below:

$ ./aiodio_sparse

$ ./aiodio_sparse -dd -s 1751k -r 18k -w 11k
child 1843, read loop count 0
io_submit() return 16
aiodio_sparse: 16 i/o in flight
aiodio_sparse: offset 180224 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 191488 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 202752 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 214016 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 225280 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 236544 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 247808 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 259072 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 270336 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 281600 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 292864 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 304128 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
child 1843, read loop count 10
io_submit() return 1
aiodio_sparse: offset 315392 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 326656 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 337920 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 349184 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 360448 filesize 1793024 inflight 16
child 1843, read loop count 20
child 1843, read loop count 30
child 1843, read loop count 40
child 1843, read loop count 50
child 1843, read loop count 60
child 1843, read loop count 70

$ ./aiodio_sparse -i 9 -d -s 180k -r 18k -w 18k
io_submit() return 9
aiodio_sparse: 9 i/o in flight
aiodio_sparse: offset 165888 filesize 184320 inflight 9
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 18432 res2 0
io_submit() return 1
child 2060, read loop count 0
child 2060, read loop count 10
child 2060, read loop count 20

Daniel

On Wed, 2003-11-12 at 23:30, Andrew Morton wrote:

> - Significant changes to the AIO and direct-io code. This needs beating
> on; hopefully we're now close to a solution to the fairly complex problems
> in there.
>
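
The offsets in the two failing runs above follow a simple pattern: a batch of writes is queued up front, and every completion is replaced by one new write at the next offset. The sketch below only reproduces that arithmetic from the log. It is not taken from aiodio_sparse, and the helper names kib() and next_offset() are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a size given in "k" units to bytes. */
uint64_t kib(uint64_t n)
{
	return n * 1024;
}

/*
 * Offset of the next write, assuming `depth` writes of `write_size`
 * bytes were submitted up front at consecutive offsets starting at 0,
 * and `completed` of them have since been replaced by new writes.
 */
uint64_t next_offset(uint64_t write_size, unsigned depth, unsigned completed)
{
	assert(write_size > 0 && depth > 0);
	return (uint64_t)(depth + completed) * write_size;
}
```

With -s 1751k -w 11k and 16 requests in flight this gives 180224 before the first completion and 360448 after 16 completions, which is the last offset printed before the first run stalls. With -i 9 -s 180k -w 18k it gives 165888, one write short of the 184320-byte file size.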


2003-11-14 07:20:13

by Martin J. Bligh

Subject: Re: 2.6.0-test9-mm3

> - Several ext2 and ext3 allocator fixes. These need serious testing on big
> SMP.

Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
later.

M.

2003-11-14 18:55:15

by Andrew Morton

Subject: Re: 2.6.0-test9-mm3

"Martin J. Bligh" wrote:
>
>
>
> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> > SMP.
>
> OK, ext3 survived a swatting on the 16-way as well.

Great, thanks.

> It's still slow as snot, but it does work ;-)

I think SDET generates storms of metadata updates. Making the journal
larger may help get that idle time down.

Probably the default journal size is too small nowadays. Most tests seem
to run faster when it is enlarged.


2003-11-14 18:50:06

by Martin J. Bligh

Subject: Re: 2.6.0-test9-mm3



> - Several ext2 and ext3 allocator fixes. These need serious testing on big
> SMP.

OK, ext3 survived a swatting on the 16-way as well. It's still slow as snot,
but it does work ;-) No changes from before, methinks.

Diffprofile for kernbench (-j) from ext2 to ext3 on mm3

27022 16.3% total
24069 53.3% default_idle
583 2.4% page_remove_rmap
539 248.4% fd_install
478 388.6% __blk_queue_bounce
319 4.0% __d_lookup
220 122.9% may_open
204 68.2% filemap_nopage
124 0.0% journal_add_journal_head
122 321.1% __find_get_block_slow
122 0.0% do_get_write_access
101 57.1% generic_fillattr
...
-52 -73.2% .text.lock.highmem
-52 -94.5% generic_file_read
-53 -18.7% do_generic_mapping_read
-58 -3.3% do_no_page
-65 -13.0% page_address
-65 -60.2% kmap_high
-74 -100.0% grab_block
-75 -3.3% do_page_fault
-85 -1.9% __copy_from_user_ll
-273 -19.5% link_path_walk
-299 -6.5% find_get_page
-758 -100.0% generic_file_open

SDET:

1726439 214.7% total
1383611 345.4% default_idle
115417 0.0% .text.lock.transaction
79362 0.0% find_next_usable_block
38003 0.0% do_get_write_access
32429 2316.4% __down
31231 0.0% journal_dirty_metadata
15114 553.8% schedule
14350 1253.3% __wake_up
13459 0.0% start_this_handle
13100 0.0% journal_stop
...
-1105 -25.1% copy_mm
-1144 -100.0% generic_file_open
-1205 -45.0% .text.lock.dec_and_lock
-1342 -100.0% ext2_new_inode
-1365 -50.5% follow_mount
-1453 -100.0% grab_block
-1580 -30.5% remove_shared_vm_struct
-1759 -11.0% copy_page_range
-2145 -18.4% __d_lookup
-2157 -35.6% path_lookup
-2222 -33.7% atomic_dec_and_lock
-2813 -25.0% release_pages
-3764 -19.1% zap_pte_range
-8954 -21.2% page_add_rmap
-22707 -25.0% page_remove_rmap

2003-11-14 19:18:03

by Badari Pulavarty

Subject: Re: 2.6.0-test9-mm3

On Friday 14 November 2003 11:08 am, Martin J. Bligh wrote:
> > - Several ext2 and ext3 allocator fixes. These need serious testing on
> > big SMP.
>
> OK, ext3 survived a swatting on the 16-way as well. It's still slow as
> snot, but it does work ;-) No changes from before, methinks.
>
> Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
>
> 27022 16.3% total
> 24069 53.3% default_idle
> 583 2.4% page_remove_rmap
> 539 248.4% fd_install
> 478 388.6% __blk_queue_bounce

What driver are you using ? Why are you bouncing ?

Thanks,
Badari

2003-11-14 19:32:51

by Mike Fedyk

Subject: Re: 2.6.0-test9-mm3

On Fri, Nov 14, 2003 at 10:59:47AM -0800, Andrew Morton wrote:
> "Martin J. Bligh" wrote:
> >
> >
> >
> > > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> > > SMP.
> >
> > OK, ext3 survived a swatting on the 16-way as well.
>
> Great, thanks.
>
> > It's still slow as snot, but it does work ;-)
>
> I think SDET generates storms of metadata updates. Making the journal
> larger may help get that idle time down.
>
> Probably the default journal size is too small nowadays. Most tests seem
> to run faster when it is enlarged.

Or maybe if it didn't start sync committing from the journal once it hits 50%.

2003-11-14 20:04:22

by Martin J. Bligh

Subject: Re: 2.6.0-test9-mm3

>> > - Several ext2 and ext3 allocator fixes. These need serious testing on
>> > big SMP.
>>
>> OK, ext3 survived a swatting on the 16-way as well. It's still slow as
>> snot, but it does work ;-) No changes from before, methinks.
>>
>> Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
>>
>> 27022 16.3% total
>> 24069 53.3% default_idle
>> 583 2.4% page_remove_rmap
>> 539 248.4% fd_install
>> 478 388.6% __blk_queue_bounce
>
> What driver are you using ? Why are you bouncing ?

qlogicisp. Because the driver is crap? ;-)

M.

2003-11-14 20:58:08

by Zwane Mwaikambo

Subject: Re: 2.6.0-test9-mm3

On Thu, 13 Nov 2003, Martin J. Bligh wrote:

> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> > SMP.
>
> Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
> later.

It's actually triple faulting my laptop (K6 family=5 model=8 step=12) when
I have CONFIG_X86_4G enabled and try and run X11. The same kernel is fine
on all my other test boxes. Any hints?

2003-11-14 21:33:12

by Martin J. Bligh

Subject: Re: 2.6.0-test9-mm3

>> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
>> > SMP.
>>
>> Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
>> later.
>
> It's actually triple faulting my laptop (K6 family=5 model=8 step=12) when
> I have CONFIG_X86_4G enabled and try and run X11. The same kernel is fine
> on all my other test boxes. Any hints?

Linus had some debug thing for triple faults, a few months ago, IIRC ...
probably in the archives somewhere ...

M.

2003-11-14 21:48:04

by Linus Torvalds

Subject: Re: 2.6.0-test9-mm3


On Fri, 14 Nov 2003, Martin J. Bligh wrote:
>
> Linus had some debug thing for triple faults, a few months ago, IIRC ...
> probably in the archives somewhere ...

Triple faults you can't debug, they raise a line outside the CPU, and
normal PC hardware will cause that to just trigger a reboot.

But double faults do get caught, and that debugging stuff actually is in
the standard kernel. It won't give _nearly_ as good a debug report as a
"normal" oops, since I didn't want the double-fault handler to touch
anything even remotely unsafe, but it often gives a good hint about what
might be wrong. Certainly better than triple-faulting did (which we still
do for _catastrophic_ corruption, eg totally munged kernel page tables etc
- it's just very hard to avoid once you get corrupted enough).

Linus

2003-11-14 21:39:00

by John Stoffel

Subject: Re: 2.6.0-test9-mm3


Mike> Or maybe if it didn't start sync committing from the journal
Mike> once it hits 50%.

Instead of using a percentage like this, would it make sense to flush
the journal when there are only N number of free journal slots/entries
left? Now the question is how to compute N in a sane way that works
for small (memory) systems, as well as for larger systems.

You don't want to grow N too aggressively, or base it on the memory of
the system, do you? When you have a 20mb journal, maybe starting
writeout after 10mb is used makes sense, because you've only got 10
transaction slots open. But when you have a 200mb journal, does it
make sense to start writeout when you only have 100 transaction slots
left?

Since I don't know the internals of Ext3 at all, I'm probably
completely missing the idea here, but my gut feeling is that the
scaling we use in these cases shouldn't be linear at all, but more
likely inverse logarithmic instead. Basically, the larger we get with
a resource, the slower we grow our usage, or the smaller we grow the
absolute size of the writeout buffer(s).

Hmmm... this doesn't sound clear even to me. But the idea I think I'm
trying to get at is that if we have X size of a journal, we want to
start writeout when we have X/2 available. But when we have Y size of
a journal, where Y is X*10 (or larger), we don't want Y/2 as the
cutover point, we want something like Y/10. The idea is that we grow
the denominator here at a slow rate, since it will shrink the free
buffer percentage nicely, yet not let us get too close to a truly zero
sized buffer.

    X     X/N
-----  ------
   10       5
  100      10
 1000      25
10000     125

Does this make any sense to anyone?

John
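
One way to read the table above is as a handful of anchor points with the threshold interpolated between them, so that X/N keeps growing but far more slowly than X. The sketch below does exactly that with integer arithmetic. It is only an illustration of the idea: writeout_threshold() and the tier table are hypothetical, the units are whatever the table's X is measured in, and none of this is ext3 or JBD code.

```c
#include <assert.h>
#include <stddef.h>

/* Anchor points copied from the table: journal size -> writeout threshold. */
struct tier {
	unsigned long size;
	unsigned long threshold;
};

static const struct tier tiers[] = {
	{ 10, 5 }, { 100, 10 }, { 1000, 25 }, { 10000, 125 },
};

/* Hypothetical: free space at which writeout should start for a journal of size x. */
unsigned long writeout_threshold(unsigned long x)
{
	size_t n = sizeof(tiers) / sizeof(tiers[0]);
	size_t i;

	assert(x > 0);
	if (x <= tiers[0].size)
		return x / 2;		/* small journals: plain X/2 */

	for (i = 1; i < n; i++) {
		if (x <= tiers[i].size) {
			const struct tier *lo = &tiers[i - 1];
			const struct tier *hi = &tiers[i];

			/* linear interpolation between two neighbouring anchors */
			return lo->threshold + (x - lo->size) *
				(hi->threshold - lo->threshold) /
				(hi->size - lo->size);
		}
	}
	return x / 80;			/* beyond the table: keep the last ratio */
}
```

Interpolating keeps the threshold monotonic in X, which a plain per-tier divisor would not: dividing 11 by the next tier's N of 10 would give a smaller threshold than the 5 used for a journal of size 10.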

2003-11-14 21:38:23

by Zwane Mwaikambo

Subject: Re: 2.6.0-test9-mm3

On Fri, 14 Nov 2003, Martin J. Bligh wrote:

> >> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> >> > SMP.
> >>
> >> Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
> >> later.
> >
> > It's actually triple faulting my laptop (K6 family=5 model=8 step=12) when
> > I have CONFIG_X86_4G enabled and try and run X11. The same kernel is fine
> > on all my other test boxes. Any hints?
>
> Linus had some debug thing for triple faults, a few months ago, IIRC ...
> probably in the archives somewhere ...

It should all be in the kernel right now; arch/i386/kernel/doublefault.c
but I think I may be a bit low on luck =)

2003-11-15 00:56:52

by Zwane Mwaikambo

Subject: Re: 2.6.0-test9-mm3

On Fri, 14 Nov 2003, Linus Torvalds wrote:

> Triple faults you can't debug, they raise a line outside the CPU, and
> normal PC hardware will cause that to just trigger a reboot.
>
> But double faults do get caught, and that debugging stuff actually is in
> the standard kernel. It won't give _nearly_ as good a debug report as a
> "normal" oops, since I didn't want the double-fault handler to touch
> anything even remotely unsafe, but it often gives a good hint about what
> might be wrong. Certainly better than triple-faulting did (which we still
> do for _catastrophic_ corruption, eg totally munged kernel page tables etc
> - it's just very hard to avoid once you get corrupted enough).

"Catastrophic" seems to be rather apt here. 2.6.0-test8-mm1 produced the
following; I'm still doing a binary search.

Unable to handle kernel paging request at virtual address 00002000
printing eip:
00007341
*pde = 00000000
Oops: 0004 [#1]
PREEMPT SMP DEBUG_PAGEALLOC
CPU: 0
EIP: c000:[<00007341>] Not tainted VLI
EFLAGS: 00033246
EIP is at 0x7341
eax: 32454256 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00002000 ebp: 00000fd6 esp: 08763f24
ds: 0000 es: 0000 ss: 0068
Process X (pid: 939, threadinfo=08762000 task=0890b330)
Stack: 00000fcb 00000100 00000000 0000c000 00000000 00000000 00000000 00000000
00000005 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Call Trace:

Code: Bad EIP value.

2003-11-15 01:01:10

by Mike Fedyk

Subject: Re: 2.6.0-test9-mm3

On Fri, Nov 14, 2003 at 03:27:01PM -0500, John Stoffel wrote:
> You don't want to grow N too aggressively, or base it on the memory of
> the system, do you? When you have a 20mb journal, maybe starting
> writeout after 10mb is used makes sense, because you've only got 10
> transaction slots open. But when you have a 200mb journal, does it
> make sense to start writeout when you only have 100 transaction slots
> left?

The minimum transaction size is one block (since ext3 is the only journaling
FS to log entire blocks, instead of the specific logical changes made during
the transaction), and your blocks are 1k, 2k, or 4k.

Though many times you'll have several blocks per transaction since each
transaction can change bitmaps, directory blocks, etc.

> Since I don't know the internals of Ext3 at all, I'm probably
> completely missing the idea here, but my gut feeling is that the
> scaling we use in these cases shouldn't be linear at all, but more
> likely inverse logarithmic instead. Basically, the larger we get with
> a resource, the slower we grow our usage, or the smaller we grow the
> absolute size of the writeout buffer(s).
>
> Hmmm... this doesn't sound clear even to me. But the idea I think I'm
> trying to get at is that if we have X size of a journal, we want to
> start writeout when we have X/2 available. But when we have Y size of
> a journal, where Y is X*10 (or larger), we don't want Y/2 as the
> cutover point, we want something like Y/10. The idea is that we grow
> the denominator here at a slow rate, since it will shrink the free
> buffer percentage nicely, yet not let us get too close to a truly zero
> sized buffer.

Last I heard, ext3 will try to flush the journal with an async process and
if that isn't able to keep up, once the journal hits 50% full, the system
will write synchronously until the journal is empty (or was that until it was
25% full or less, I forget...).

AFAIK everyone agrees that this is not optimal, but nobody's taken the time
to fix it yet either.

Mike
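
The behaviour described above is a two-state hysteresis: flush asynchronously until the journal is half full, then flush synchronously until usage falls back to some low-water mark. A minimal sketch of that state machine follows. It is written from the description in this mail, not from the ext3/JBD sources, and the names (flush_mode, next_flush_mode, low_water) are hypothetical. The low-water mark is a parameter because the mail itself is unsure whether it is "empty" or "25% full".

```c
#include <assert.h>

enum flush_mode { FLUSH_ASYNC, FLUSH_SYNC };

/*
 * Hypothetical: pick the next flushing mode for a journal of `size`
 * blocks with `used` blocks occupied.  Asynchronous flushing continues
 * until the journal is at least half full; synchronous flushing then
 * continues until usage drops to `low_water` blocks (0 means "empty").
 */
enum flush_mode next_flush_mode(enum flush_mode mode, unsigned long used,
				unsigned long size, unsigned long low_water)
{
	assert(size > 0 && used <= size);
	/* the low-water mark must sit below the half-full trigger */
	assert(low_water <= size && low_water < size - low_water);

	if (mode == FLUSH_ASYNC)
		return (used >= size - used) ? FLUSH_SYNC : FLUSH_ASYNC;

	return (used <= low_water) ? FLUSH_ASYNC : FLUSH_SYNC;
}
```

The gap between the 50% trigger and the low-water mark is what makes every writer wait for a long stretch once the trigger is hit, which is the part described above as not optimal.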

2003-11-15 19:35:26

by Zwane Mwaikambo

Subject: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

The 4G/4G page fault handling path doesn't appear to handle faults
happening whilst in vm86. In that mode regs->xcs != __USER_CS, which confuses
the in-kernel test.

However, I'm still debugging the X11 triple fault in test9-mm3:

Unable to handle kernel paging request at virtual address 00002000
printing eip:
00007341
*pde = 00000000
Oops: 0004 [#1]
SMP DEBUG_PAGEALLOC
CPU: 0
EIP: c000:[<00007341>] Not tainted VLI
EFLAGS: 00033246
EIP is at 0x7341
eax: 32454256 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00002000 ebp: 00000fd6 esp: 087bbf24
ds: 0000 es: 0000 ss: 0068
Process X (pid: 939, threadinfo=087ba000 task=0891c690)
Stack: 00000fcb 00000100 00000000 0000c000 00000000 00000000 00000000 00000000
00000005 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Call Trace:

Index: linux-2.6.0-test9-mm3/arch/i386/mm/fault.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test9-mm3/arch/i386/mm/fault.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 fault.c
--- linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 13 Nov 2003 08:07:17 -0000 1.1.1.1
+++ linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 15 Nov 2003 19:08:34 -0000
@@ -264,7 +264,9 @@ asmlinkage void do_page_fault(struct pt_
if (error_code & 3)
goto bad_area_nosemaphore;

- goto vmalloc_fault;
+ /* If it's vm86 fall through */
+ if (!(error_code & 4))
+ goto vmalloc_fault;
}
#else
if (unlikely(address >= TASK_SIZE)) {

2003-11-15 19:56:32

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Sat, 15 Nov 2003, Zwane Mwaikambo wrote:

> The 4G/4G page fault handling path doesn't appear to handle faults
> happening whilst in vm86 mode. In that case regs->xcs != __USER_CS, which
> confuses the in-kernel test.

Perhaps this would be more desirable?

Index: linux-2.6.0-test9-mm3/arch/i386/mm/fault.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test9-mm3/arch/i386/mm/fault.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 fault.c
--- linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 13 Nov 2003 08:07:17 -0000 1.1.1.1
+++ linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 15 Nov 2003 19:40:17 -0000
@@ -264,7 +264,9 @@ asmlinkage void do_page_fault(struct pt_
if (error_code & 3)
goto bad_area_nosemaphore;

- goto vmalloc_fault;
+ /* If it's vm86 fall through */
+ if (!(regs->eflags & VM_MASK))
+ goto vmalloc_fault;
}
#else
if (unlikely(address >= TASK_SIZE)) {
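
The two variants of the patch test different things: the first looks at the user/supervisor bit of the page-fault error code, the second at the VM bit in the saved EFLAGS. A minimal sketch of both predicates follows. The PF_* and EFLAGS_VM names and the two helper functions are made up for illustration (the kernel headers of this era spell the flag VM_MASK); the bit positions are the architectural ones.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits, as pushed by the CPU. */
#define PF_PROT   0x1u	/* 0: page not present, 1: protection violation */
#define PF_WRITE  0x2u	/* 0: read access, 1: write access */
#define PF_USER   0x4u	/* 0: fault at CPL 0-2, 1: fault at CPL 3 */

/* EFLAGS.VM (bit 17): set while the CPU runs in virtual-8086 mode. */
#define EFLAGS_VM 0x00020000u

/* First variant: take the vmalloc path unless the fault came from CPL 3. */
static bool take_vmalloc_path_v1(uint32_t error_code)
{
	return !(error_code & PF_USER);
}

/* Second variant: take the vmalloc path unless the task was in vm86 mode. */
static bool take_vmalloc_path_v2(uint32_t eflags)
{
	return !(eflags & EFLAGS_VM);
}
```

Fed the values from the oops above ("Oops: 0004", "EFLAGS: 00033246"), both helpers return false, so either variant would skip the vmalloc_fault path for that fault.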

2003-11-17 05:19:54

by Suparna Bhattacharya

Subject: Re: 2.6.0-test9-mm3 - AIO test results

On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> Andrew,
>
> I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> I tested using the test programs aiocp and aiodio_sparse.
> (see http://developer.osdl.org/daniel/AIO/)
>
> Using aiocp with i/o sizes from 1k to 512k to copy files worked
> without any errors or kernel debug messages.
>
> With 64k i/o, the aiodio_sparse program completes without any errors.
> There are no kernel error messages, so that is good.
>
> There are still problems with non power of 2 i/o sizes using AIO and
> O_DIRECT. It hangs with aio's that do not seem to complete. The test
> does exit when hitting ^c and there are no kernel messages. Test output
> below:

Could you check if the following patch fixes the problem for you?

Regards
Suparna

--------------------------------------------------------------

With this patch, when the DIO code falls back to buffered i/o after
having submitted part of the i/o, then buffered i/o is issued only
for the remaining part of the request (i.e. the part not already
covered by DIO).

diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
--- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
@@ -74,6 +74,7 @@
been performed at the start of a
write */
int pages_in_io; /* approximate total IO pages */
+ size_t size; /* total request size (doesn't change)*/
sector_t block_in_file; /* Current offset into the underlying
file in dio_block units. */
unsigned blocks_available; /* At block_in_file. changes */
@@ -226,7 +227,7 @@
dio_complete(dio, dio->block_in_file << dio->blkbits,
dio->result);
/* Complete AIO later if falling back to buffered i/o */
- if (dio->result != -ENOTBLK) {
+ if (dio->result >= dio->size || dio->rw == READ) {
aio_complete(dio->iocb, dio->result, 0);
kfree(dio);
} else {
@@ -889,6 +890,7 @@
dio->blkbits = blkbits;
dio->blkfactor = inode->i_blkbits - blkbits;
dio->start_zero_done = 0;
+ dio->size = 0;
dio->block_in_file = offset >> blkbits;
dio->blocks_available = 0;
dio->cur_page = NULL;
@@ -925,7 +927,7 @@

for (seg = 0; seg < nr_segs; seg++) {
user_addr = (unsigned long)iov[seg].iov_base;
- bytes = iov[seg].iov_len;
+ dio->size += bytes = iov[seg].iov_len;

/* Index into the first page of the first block */
dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
@@ -956,6 +958,13 @@
}
} /* end iovec loop */

+ if (ret == -ENOTBLK && rw == WRITE) {
+ /*
+ * The remaining part of the request will be
+ * handled by buffered I/O when we return
+ */
+ ret = 0;
+ }
/*
* There may be some unwritten disk at the end of a part-written
* fs-block-sized block. Go zero that now.
@@ -986,19 +995,13 @@
*/
if (dio->is_async) {
if (ret == 0)
- ret = dio->result; /* Bytes written */
- if (ret == -ENOTBLK) {
- /*
- * The request will be reissued via buffered I/O
- * when we return; Any I/O already issued
- * effectively becomes redundant.
- */
- dio->result = ret;
+ ret = dio->result;
+ if (ret > 0 && dio->result < dio->size && rw == WRITE) {
dio->waiter = current;
}
finished_one_bio(dio); /* This can free the dio */
blk_run_queues();
- if (ret == -ENOTBLK) {
+ if (dio->waiter) {
/*
* Wait for already issued I/O to drain out and
* release its references to user-space pages
@@ -1032,7 +1035,8 @@
}
dio_complete(dio, offset, ret);
/* We could have also come here on an AIO file extend */
- if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
+ if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
+ dio->result < dio->size))
aio_complete(iocb, ret, 0);
kfree(dio);
}
diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
--- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
+++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
@@ -1895,14 +1895,16 @@
*/
if (written >= 0 && file->f_flags & O_SYNC)
status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
- if (written >= 0 && !is_sync_kiocb(iocb))
+ if (written >= count && !is_sync_kiocb(iocb))
written = -EIOCBQUEUED;
- if (written != -ENOTBLK)
+ if (written < 0 || written >= count)
goto out_status;
/*
* direct-io write to a hole: fall through to buffered I/O
+ * for completing the rest of the request.
*/
- written = 0;
+ pos += written;
+ count -= written;
}

buf = iov->iov_base;
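
The filemap.c hunk is the heart of the change: instead of restarting the whole write through the page cache, only the tail that direct I/O did not cover is handed to buffered I/O. A minimal userspace sketch of that bookkeeping, assuming a hypothetical struct write_req in place of the kernel's pos and count variables:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the (pos, count) pair in the write path. */
struct write_req {
	long long pos;	/* file offset of the next byte to write */
	size_t count;	/* bytes still to be written */
};

/*
 * Mirror of the filemap.c hunk: given the result `written` of the
 * direct I/O attempt, return true if buffered I/O must finish the
 * request, and advance the request past the part already covered.
 */
static bool fall_back_to_buffered(struct write_req *req, long written)
{
	if (written < 0 || (size_t)written >= req->count)
		return false;	/* error, or direct I/O did everything */

	req->pos += written;
	req->count -= (size_t)written;
	return true;
}
```

For an 18432-byte write at offset 0 where direct I/O covered the first 4096 bytes, the helper returns true and leaves pos = 4096, count = 14336 for the buffered path.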

2003-11-17 21:09:20

by Bill Davidsen

Subject: Re: 2.6.0-test9-mm3

In article <100480000.1068841761@flay>,
Martin J. Bligh <[email protected]> wrote:
| >> > - Several ext2 and ext3 allocator fixes. These need serious testing on
| >> > big SMP.
| >>
| >> OK, ext3 survived a swatting on the 16-way as well. It's still slow as
| >> snot, but it does work ;-) No changes from before, methinks.
| >>
| >> Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
| >>
| >> 27022 16.3% total
| >> 24069 53.3% default_idle
| >> 583 2.4% page_remove_rmap
| >> 539 248.4% fd_install
| >> 478 388.6% __blk_queue_bounce
| >
| > What driver are you using ? Why are you bouncing ?
|
| qlogicisp. Because the driver is crap? ;-)

The question is, does that make your testing better or worse in terms of
checking the new code? Clearly you have done a good job of checking the
"disk can't keep up" case, is there a need to test further with a much
higher transaction rate?

I would assume that if there were lock issues they would have shown up,
which is probably all that's needed.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2003-11-17 21:47:30

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Sat, 15 Nov 2003, Zwane Mwaikambo wrote:

> The 4G/4G page fault handling path doesn't appear to handle faults
> happening whilst in vm86 mode. In that case regs->xcs != __USER_CS, which
> confuses the in-kernel test.
>
> However, I'm still debugging the X11 triple fault in test9-mm3.

I've managed to `fix` the triple fault (see further below for the patch
in all its glory). Unfortunately I have been unable to come up with a
simpler workaround which is fewer instructions and easier to debug. I have
tried the following:

mb()/barrier()
flush_tlb_all()
wbinvd()
outb(0x80,0x00)
local_irq_save(flags); local_irq_enable(); loop(); local_irq_restore(flags);
long_loop()

What I do know is that in the following code:

__asm__ __volatile__(
"xorl %%eax,%%eax; movl %%eax,%%fs; movl %%eax,%%gs\n\t"
"movl %0,%%esp\n\t"
"movl %1,%%ebp\n\t"
"jmp resume_userspace"
: /* no outputs */
:"r" (&info->regs), "r" (tsk->thread_info) : "ax");

It does get to resume_userspace, as putting a $0 into %ebp will oops in
__switch_to.

And here is the current 'workaround'. Any hints?

Index: arch/i386/kernel/vm86.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test9-mm3/arch/i386/kernel/vm86.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 vm86.c
--- arch/i386/kernel/vm86.c 13 Nov 2003 08:07:17 -0000 1.1.1.1
+++ arch/i386/kernel/vm86.c 17 Nov 2003 21:45:13 -0000
@@ -312,6 +311,8 @@ static void do_sys_vm86(struct kernel_vm
tsk->thread.screen_bitmap = info->screen_bitmap;
if (info->flags & VM86_SCREEN_BITMAP)
mark_screen_rdonly(tsk);
+
+ printk("ooh la la\n");
__asm__ __volatile__(
"xorl %%eax,%%eax; movl %%eax,%%fs; movl %%eax,%%gs\n\t"
"movl %0,%%esp\n\t"

2003-11-17 22:43:17

by Linus Torvalds

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops


On Mon, 17 Nov 2003, Zwane Mwaikambo wrote:
>
> I've managed to `fix` the triple fault (see further below for the patch
> in all its glory).

What's the generated assembly language for this function with and without
the "fix"?

If adding that printk fixes a triple fault, the issue is not likely to be
the printk itself as much as the difference in code that the compiler
generates - stack frame, memory re-ordering etc...

Linus

2003-11-17 23:02:22

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Mon, 17 Nov 2003, Linus Torvalds wrote:

> What's the generated assembly language for this function with and without
> the "fix"?
>
> If adding that printk fixes a triple fault, the issue is not likely to be
> the printk itself as much as the difference in code that the compiler
> generates - stack frame, memory re-ordering etc...

This would be my 'trusty' gcc 3.2.2 from RedHat 9
(gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5))

With the fix:
0x0210e860 <do_sys_vm86+0>: push %edi
0x0210e861 <do_sys_vm86+1>: mov $0xffffe000,%eax
0x0210e866 <do_sys_vm86+6>: push %esi
0x0210e867 <do_sys_vm86+7>: and %esp,%eax
0x0210e869 <do_sys_vm86+9>: push %ebx
0x0210e86a <do_sys_vm86+10>: mov 0x10(%esp,1),%edi
0x0210e86e <do_sys_vm86+14>: mov 0x14(%esp,1),%esi
0x0210e872 <do_sys_vm86+18>: movl $0x0,0x1c(%edi)
0x0210e879 <do_sys_vm86+25>: movl $0x0,0x20(%edi)
0x0210e880 <do_sys_vm86+32>: mov (%eax),%edx
0x0210e882 <do_sys_vm86+34>: mov 0x30(%edi),%eax
0x0210e885 <do_sys_vm86+37>: mov %eax,0x5b8(%edx)
0x0210e88b <do_sys_vm86+43>: mov 0x30(%edi),%edx
0x0210e88e <do_sys_vm86+46>: mov 0xbc(%edi),%eax
0x0210e894 <do_sys_vm86+52>: and $0xdd5,%edx
0x0210e89a <do_sys_vm86+58>: mov %edx,0x30(%edi)
0x0210e89d <do_sys_vm86+61>: mov 0x30(%eax),%eax
0x0210e8a0 <do_sys_vm86+64>: and $0xfffff22a,%eax
0x0210e8a5 <do_sys_vm86+69>: or %eax,%edx
0x0210e8a7 <do_sys_vm86+71>: mov 0x54(%edi),%eax
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
0x0210e8b6 <do_sys_vm86+86>: je 0x210e9f0 <do_sys_vm86+400>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
0x0210e8bf <do_sys_vm86+95>: ja 0x210e9d5 <do_sys_vm86+373>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
0x0210e8c8 <do_sys_vm86+104>: je 0x210e9c6 <do_sys_vm86+358>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
0x0210e8e5 <do_sys_vm86+133>: mov 0x360(%esi),%eax
0x0210e8eb <do_sys_vm86+139>: mov %eax,0x5c0(%esi)
0x0210e8f1 <do_sys_vm86+145>: movl %fs,0x5c4(%esi)
0x0210e8f7 <do_sys_vm86+151>: movl %gs,0x5c8(%esi)
0x0210e8fd <do_sys_vm86+157>: mov $0xffffe000,%ebx
0x0210e902 <do_sys_vm86+162>: and %esp,%ebx
0x0210e904 <do_sys_vm86+164>: mov 0x14(%ebx),%eax
0x0210e907 <do_sys_vm86+167>: inc %eax
0x0210e908 <do_sys_vm86+168>: mov %eax,0x14(%ebx)
0x0210e90b <do_sys_vm86+171>: mov 0x10(%ebx),%eax
0x0210e90e <do_sys_vm86+174>: mov 0x4(%esi),%edx
0x0210e911 <do_sys_vm86+177>: shl $0x9,%eax
0x0210e914 <do_sys_vm86+180>: lea 0x26ff000(%eax),%ecx
0x0210e91a <do_sys_vm86+186>: lea 0x4c(%edi),%eax
0x0210e91d <do_sys_vm86+189>: mov %eax,0x360(%esi)
0x0210e923 <do_sys_vm86+195>: sub 0x1c(%edx),%eax
0x0210e926 <do_sys_vm86+198>: add 0x20(%edx),%eax
0x0210e929 <do_sys_vm86+201>: mov %eax,0x4(%ecx)
0x0210e92c <do_sys_vm86+204>: mov 0x25fe52c,%eax
0x0210e931 <do_sys_vm86+209>: test $0x800,%eax
0x0210e936 <do_sys_vm86+214>: je 0x210e942 <do_sys_vm86+226>
0x0210e938 <do_sys_vm86+216>: movl $0x0,0x364(%esi)
0x0210e942 <do_sys_vm86+226>: lea 0x340(%esi),%edx
0x0210e948 <do_sys_vm86+232>: mov 0x20(%edx),%eax
0x0210e94b <do_sys_vm86+235>: mov %eax,0x4(%ecx)
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
0x0210e95a <do_sys_vm86+250>: jne 0x210e9b0 <do_sys_vm86+336>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
0x0210e969 <do_sys_vm86+265>: jne 0x210e9a9 <do_sys_vm86+329>
0x0210e96b <do_sys_vm86+267>: push $0x255f121
0x0210e970 <do_sys_vm86+272>: call 0x21285a0 <printk>
0x0210e975 <do_sys_vm86+277>: mov 0x50(%edi),%eax
0x0210e978 <do_sys_vm86+280>: mov %eax,0x5b4(%esi)
0x0210e97e <do_sys_vm86+286>: pop %eax
0x0210e97f <do_sys_vm86+287>: testb $0x1,0x4c(%edi)
0x0210e983 <do_sys_vm86+291>: jne 0x210e9a0 <do_sys_vm86+320>
0x0210e985 <do_sys_vm86+293>: mov 0x4(%esi),%edx
0x0210e988 <do_sys_vm86+296>: xor %eax,%eax
0x0210e98a <do_sys_vm86+298>: mov %eax,%fs
0x0210e98c <do_sys_vm86+300>: mov %eax,%gs
0x0210e98e <do_sys_vm86+302>: mov %edi,%esp
0x0210e990 <do_sys_vm86+304>: mov %edx,%ebp
0x0210e992 <do_sys_vm86+306>: jmp 0xfffeb100 <resume_userspace>
0x0210e997 <do_sys_vm86+311>: pop %ebx
0x0210e998 <do_sys_vm86+312>: pop %esi
0x0210e999 <do_sys_vm86+313>: pop %edi
0x0210e99a <do_sys_vm86+314>: ret
0x0210e99b <do_sys_vm86+315>: nop
0x0210e99c <do_sys_vm86+316>: lea 0x0(%esi,1),%esi
0x0210e9a0 <do_sys_vm86+320>: push %esi
0x0210e9a1 <do_sys_vm86+321>: call 0x210e5b0 <mark_screen_rdonly>
0x0210e9a6 <do_sys_vm86+326>: pop %eax
0x0210e9a7 <do_sys_vm86+327>: jmp 0x210e985 <do_sys_vm86+293>
0x0210e9a9 <do_sys_vm86+329>: call 0x21222d0 <preempt_schedule>
0x0210e9ae <do_sys_vm86+334>: jmp 0x210e96b <do_sys_vm86+267>
0x0210e9b0 <do_sys_vm86+336>: mov 0x24(%edx),%ax
0x0210e9b4 <do_sys_vm86+340>: mov %ax,0x10(%ecx)
0x0210e9b8 <do_sys_vm86+344>: mov $0x174,%ecx
0x0210e9bd <do_sys_vm86+349>: mov 0x24(%edx),%eax
0x0210e9c0 <do_sys_vm86+352>: xor %edx,%edx
0x0210e9c2 <do_sys_vm86+354>: wrmsr
0x0210e9c4 <do_sys_vm86+356>: jmp 0x210e95c <do_sys_vm86+252>
0x0210e9c6 <do_sys_vm86+358>: movl $0x0,0x5bc(%esi)
0x0210e9d0 <do_sys_vm86+368>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9d5 <do_sys_vm86+373>: cmp $0x4,%eax
0x0210e9d8 <do_sys_vm86+376>: jne 0x210e8ce <do_sys_vm86+110>
0x0210e9de <do_sys_vm86+382>: movl $0x47000,0x5bc(%esi)
0x0210e9e8 <do_sys_vm86+392>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9ed <do_sys_vm86+397>: lea 0x0(%esi),%esi
0x0210e9f0 <do_sys_vm86+400>: movl $0x7000,0x5bc(%esi)
0x0210e9fa <do_sys_vm86+410>: jmp 0x210e8d8 <do_sys_vm86+120>

Without the fix:
0x0210e860 <do_sys_vm86+0>: push %edi
0x0210e861 <do_sys_vm86+1>: mov $0xffffe000,%eax
0x0210e866 <do_sys_vm86+6>: push %esi
0x0210e867 <do_sys_vm86+7>: and %esp,%eax
0x0210e869 <do_sys_vm86+9>: push %ebx
0x0210e86a <do_sys_vm86+10>: mov 0x10(%esp,1),%edi
0x0210e86e <do_sys_vm86+14>: mov 0x14(%esp,1),%esi
0x0210e872 <do_sys_vm86+18>: movl $0x0,0x1c(%edi)
0x0210e879 <do_sys_vm86+25>: movl $0x0,0x20(%edi)
0x0210e880 <do_sys_vm86+32>: mov (%eax),%edx
0x0210e882 <do_sys_vm86+34>: mov 0x30(%edi),%eax
0x0210e885 <do_sys_vm86+37>: mov %eax,0x5b8(%edx)
0x0210e88b <do_sys_vm86+43>: mov 0x30(%edi),%edx
0x0210e88e <do_sys_vm86+46>: mov 0xbc(%edi),%eax
0x0210e894 <do_sys_vm86+52>: and $0xdd5,%edx
0x0210e89a <do_sys_vm86+58>: mov %edx,0x30(%edi)
0x0210e89d <do_sys_vm86+61>: mov 0x30(%eax),%eax
0x0210e8a0 <do_sys_vm86+64>: and $0xfffff22a,%eax
0x0210e8a5 <do_sys_vm86+69>: or %eax,%edx
0x0210e8a7 <do_sys_vm86+71>: mov 0x54(%edi),%eax
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
0x0210e8b6 <do_sys_vm86+86>: je 0x210e9e0 <do_sys_vm86+384>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
0x0210e8bf <do_sys_vm86+95>: ja 0x210e9c5 <do_sys_vm86+357>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
0x0210e8c8 <do_sys_vm86+104>: je 0x210e9b6 <do_sys_vm86+342>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
0x0210e8e5 <do_sys_vm86+133>: mov 0x360(%esi),%eax
0x0210e8eb <do_sys_vm86+139>: mov %eax,0x5c0(%esi)
0x0210e8f1 <do_sys_vm86+145>: movl %fs,0x5c4(%esi)
0x0210e8f7 <do_sys_vm86+151>: movl %gs,0x5c8(%esi)
0x0210e8fd <do_sys_vm86+157>: mov $0xffffe000,%ebx
0x0210e902 <do_sys_vm86+162>: and %esp,%ebx
0x0210e904 <do_sys_vm86+164>: mov 0x14(%ebx),%eax
0x0210e907 <do_sys_vm86+167>: inc %eax
0x0210e908 <do_sys_vm86+168>: mov %eax,0x14(%ebx)
0x0210e90b <do_sys_vm86+171>: mov 0x10(%ebx),%eax
0x0210e90e <do_sys_vm86+174>: mov 0x4(%esi),%edx
0x0210e911 <do_sys_vm86+177>: shl $0x9,%eax
0x0210e914 <do_sys_vm86+180>: lea 0x26ff000(%eax),%ecx
0x0210e91a <do_sys_vm86+186>: lea 0x4c(%edi),%eax
0x0210e91d <do_sys_vm86+189>: mov %eax,0x360(%esi)
0x0210e923 <do_sys_vm86+195>: sub 0x1c(%edx),%eax
0x0210e926 <do_sys_vm86+198>: add 0x20(%edx),%eax
0x0210e929 <do_sys_vm86+201>: mov %eax,0x4(%ecx)
0x0210e92c <do_sys_vm86+204>: mov 0x25fe52c,%eax
0x0210e931 <do_sys_vm86+209>: test $0x800,%eax
0x0210e936 <do_sys_vm86+214>: je 0x210e942 <do_sys_vm86+226>
0x0210e938 <do_sys_vm86+216>: movl $0x0,0x364(%esi)
0x0210e942 <do_sys_vm86+226>: lea 0x340(%esi),%edx
0x0210e948 <do_sys_vm86+232>: mov 0x20(%edx),%eax
0x0210e94b <do_sys_vm86+235>: mov %eax,0x4(%ecx)
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
0x0210e95a <do_sys_vm86+250>: jne 0x210e9a0 <do_sys_vm86+320>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
0x0210e969 <do_sys_vm86+265>: jne 0x210e999 <do_sys_vm86+313>
0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax
0x0210e96e <do_sys_vm86+270>: mov %eax,0x5b4(%esi)
0x0210e974 <do_sys_vm86+276>: testb $0x1,0x4c(%edi)
0x0210e978 <do_sys_vm86+280>: jne 0x210e990 <do_sys_vm86+304>
0x0210e97a <do_sys_vm86+282>: mov 0x4(%esi),%edx
0x0210e97d <do_sys_vm86+285>: xor %eax,%eax
0x0210e97f <do_sys_vm86+287>: mov %eax,%fs
0x0210e981 <do_sys_vm86+289>: mov %eax,%gs
0x0210e983 <do_sys_vm86+291>: mov %edi,%esp
0x0210e985 <do_sys_vm86+293>: mov %edx,%ebp
0x0210e987 <do_sys_vm86+295>: jmp 0xfffeb100 <resume_userspace>
0x0210e98c <do_sys_vm86+300>: pop %ebx
0x0210e98d <do_sys_vm86+301>: pop %esi
0x0210e98e <do_sys_vm86+302>: pop %edi
0x0210e98f <do_sys_vm86+303>: ret
0x0210e990 <do_sys_vm86+304>: push %esi
0x0210e991 <do_sys_vm86+305>: call 0x210e5b0 <mark_screen_rdonly>
0x0210e996 <do_sys_vm86+310>: pop %eax
0x0210e997 <do_sys_vm86+311>: jmp 0x210e97a <do_sys_vm86+282>
0x0210e999 <do_sys_vm86+313>: call 0x21222c0 <preempt_schedule>
0x0210e99e <do_sys_vm86+318>: jmp 0x210e96b <do_sys_vm86+267>
0x0210e9a0 <do_sys_vm86+320>: mov 0x24(%edx),%ax
0x0210e9a4 <do_sys_vm86+324>: mov %ax,0x10(%ecx)
0x0210e9a8 <do_sys_vm86+328>: mov $0x174,%ecx
0x0210e9ad <do_sys_vm86+333>: mov 0x24(%edx),%eax
0x0210e9b0 <do_sys_vm86+336>: xor %edx,%edx
0x0210e9b2 <do_sys_vm86+338>: wrmsr
0x0210e9b4 <do_sys_vm86+340>: jmp 0x210e95c <do_sys_vm86+252>
0x0210e9b6 <do_sys_vm86+342>: movl $0x0,0x5bc(%esi)
0x0210e9c0 <do_sys_vm86+352>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9c5 <do_sys_vm86+357>: cmp $0x4,%eax
0x0210e9c8 <do_sys_vm86+360>: jne 0x210e8ce <do_sys_vm86+110>
0x0210e9ce <do_sys_vm86+366>: movl $0x47000,0x5bc(%esi)
0x0210e9d8 <do_sys_vm86+376>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9dd <do_sys_vm86+381>: lea 0x0(%esi),%esi
0x0210e9e0 <do_sys_vm86+384>: movl $0x7000,0x5bc(%esi)
0x0210e9ea <do_sys_vm86+394>: jmp 0x210e8d8 <do_sys_vm86+120>
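
Both listings open with "mov $0xffffe000,%eax" followed by "and %esp,%eax". That is the usual i386 way of locating thread_info: mask the kernel stack pointer down to the base of the stack area. A minimal sketch, assuming 8 KiB kernel stacks as the mask implies (STACK_SIZE and thread_info_base() are names made up for illustration):

```c
#include <stdint.h>

/* Assumes 8 KiB kernel stacks, matching the 0xffffe000 mask in the listing. */
#define STACK_SIZE 8192u

/* Round a kernel stack pointer down to the base of its stack area. */
static uint32_t thread_info_base(uint32_t esp)
{
	return esp & ~(STACK_SIZE - 1u);
}
```

The oops earlier in this thread is consistent with that: esp is 087bbf24 and threadinfo is 087ba000, which is exactly what thread_info_base(0x087bbf24) yields.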

2003-11-17 23:18:02

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Mon, 17 Nov 2003, Zwane Mwaikambo wrote:

> On Mon, 17 Nov 2003, Linus Torvalds wrote:
>
> > What's the generated assembly language for this function with and without
> > the "fix"?
> >
> > If adding that printk fixes a triple fault, the issue is not likely to be
> > the printk itself as much as the difference in code that the compiler
> > generates - stack frame, memory re-ordering etc...
>
> This would be my 'trusty' gcc 3.2.2 from RedHat 9
> (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5))

A little bird told me to send diffs... But there is a lot of noise due to
offsets, I'm afraid.

--- buggy 2003-11-17 18:09:35.302964248 -0500
+++ works 2003-11-17 18:09:47.744072912 -0500
@@ -21,11 +21,11 @@
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
-0x0210e8b6 <do_sys_vm86+86>: je 0x210e9e0 <do_sys_vm86+384>
+0x0210e8b6 <do_sys_vm86+86>: je 0x210e9f0 <do_sys_vm86+400>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
-0x0210e8bf <do_sys_vm86+95>: ja 0x210e9c5 <do_sys_vm86+357>
+0x0210e8bf <do_sys_vm86+95>: ja 0x210e9d5 <do_sys_vm86+373>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
-0x0210e8c8 <do_sys_vm86+104>: je 0x210e9b6 <do_sys_vm86+342>
+0x0210e8c8 <do_sys_vm86+104>: je 0x210e9c6 <do_sys_vm86+358>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
@@ -57,47 +57,52 @@
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
-0x0210e95a <do_sys_vm86+250>: jne 0x210e9a0 <do_sys_vm86+320>
+0x0210e95a <do_sys_vm86+250>: jne 0x210e9b0 <do_sys_vm86+336>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
-0x0210e969 <do_sys_vm86+265>: jne 0x210e999 <do_sys_vm86+313>
-0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax
-0x0210e96e <do_sys_vm86+270>: mov %eax,0x5b4(%esi)
-0x0210e974 <do_sys_vm86+276>: testb $0x1,0x4c(%edi)
-0x0210e978 <do_sys_vm86+280>: jne 0x210e990 <do_sys_vm86+304>
-0x0210e97a <do_sys_vm86+282>: mov 0x4(%esi),%edx
-0x0210e97d <do_sys_vm86+285>: xor %eax,%eax
-0x0210e97f <do_sys_vm86+287>: mov %eax,%fs
-0x0210e981 <do_sys_vm86+289>: mov %eax,%gs
-0x0210e983 <do_sys_vm86+291>: mov %edi,%esp
-0x0210e985 <do_sys_vm86+293>: mov %edx,%ebp
-0x0210e987 <do_sys_vm86+295>: jmp 0xfffeb100 <resume_userspace>
-0x0210e98c <do_sys_vm86+300>: pop %ebx
-0x0210e98d <do_sys_vm86+301>: pop %esi
-0x0210e98e <do_sys_vm86+302>: pop %edi
-0x0210e98f <do_sys_vm86+303>: ret
-0x0210e990 <do_sys_vm86+304>: push %esi
-0x0210e991 <do_sys_vm86+305>: call 0x210e5b0 <mark_screen_rdonly>
-0x0210e996 <do_sys_vm86+310>: pop %eax
-0x0210e997 <do_sys_vm86+311>: jmp 0x210e97a <do_sys_vm86+282>
-0x0210e999 <do_sys_vm86+313>: call 0x21222c0 <preempt_schedule>
-0x0210e99e <do_sys_vm86+318>: jmp 0x210e96b <do_sys_vm86+267>
-0x0210e9a0 <do_sys_vm86+320>: mov 0x24(%edx),%ax
-0x0210e9a4 <do_sys_vm86+324>: mov %ax,0x10(%ecx)
-0x0210e9a8 <do_sys_vm86+328>: mov $0x174,%ecx
-0x0210e9ad <do_sys_vm86+333>: mov 0x24(%edx),%eax
-0x0210e9b0 <do_sys_vm86+336>: xor %edx,%edx
-0x0210e9b2 <do_sys_vm86+338>: wrmsr
-0x0210e9b4 <do_sys_vm86+340>: jmp 0x210e95c <do_sys_vm86+252>
-0x0210e9b6 <do_sys_vm86+342>: movl $0x0,0x5bc(%esi)
-0x0210e9c0 <do_sys_vm86+352>: jmp 0x210e8d8 <do_sys_vm86+120>
-0x0210e9c5 <do_sys_vm86+357>: cmp $0x4,%eax
-0x0210e9c8 <do_sys_vm86+360>: jne 0x210e8ce <do_sys_vm86+110>
-0x0210e9ce <do_sys_vm86+366>: movl $0x47000,0x5bc(%esi)
-0x0210e9d8 <do_sys_vm86+376>: jmp 0x210e8d8 <do_sys_vm86+120>
-0x0210e9dd <do_sys_vm86+381>: lea 0x0(%esi),%esi
-0x0210e9e0 <do_sys_vm86+384>: movl $0x7000,0x5bc(%esi)
-0x0210e9ea <do_sys_vm86+394>: jmp 0x210e8d8 <do_sys_vm86+120>
+0x0210e969 <do_sys_vm86+265>: jne 0x210e9a9 <do_sys_vm86+329>
+0x0210e96b <do_sys_vm86+267>: push $0x255f121
+0x0210e970 <do_sys_vm86+272>: call 0x21285a0 <printk>
+0x0210e975 <do_sys_vm86+277>: mov 0x50(%edi),%eax
+0x0210e978 <do_sys_vm86+280>: mov %eax,0x5b4(%esi)
+0x0210e97e <do_sys_vm86+286>: pop %eax
+0x0210e97f <do_sys_vm86+287>: testb $0x1,0x4c(%edi)
+0x0210e983 <do_sys_vm86+291>: jne 0x210e9a0 <do_sys_vm86+320>
+0x0210e985 <do_sys_vm86+293>: mov 0x4(%esi),%edx
+0x0210e988 <do_sys_vm86+296>: xor %eax,%eax
+0x0210e98a <do_sys_vm86+298>: mov %eax,%fs
+0x0210e98c <do_sys_vm86+300>: mov %eax,%gs
+0x0210e98e <do_sys_vm86+302>: mov %edi,%esp
+0x0210e990 <do_sys_vm86+304>: mov %edx,%ebp
+0x0210e992 <do_sys_vm86+306>: jmp 0xfffeb100 <resume_userspace>
+0x0210e997 <do_sys_vm86+311>: pop %ebx
+0x0210e998 <do_sys_vm86+312>: pop %esi
+0x0210e999 <do_sys_vm86+313>: pop %edi
+0x0210e99a <do_sys_vm86+314>: ret
+0x0210e99b <do_sys_vm86+315>: nop
+0x0210e99c <do_sys_vm86+316>: lea 0x0(%esi,1),%esi
+0x0210e9a0 <do_sys_vm86+320>: push %esi
+0x0210e9a1 <do_sys_vm86+321>: call 0x210e5b0 <mark_screen_rdonly>
+0x0210e9a6 <do_sys_vm86+326>: pop %eax
+0x0210e9a7 <do_sys_vm86+327>: jmp 0x210e985 <do_sys_vm86+293>
+0x0210e9a9 <do_sys_vm86+329>: call 0x21222d0 <preempt_schedule>
+0x0210e9ae <do_sys_vm86+334>: jmp 0x210e96b <do_sys_vm86+267>
+0x0210e9b0 <do_sys_vm86+336>: mov 0x24(%edx),%ax
+0x0210e9b4 <do_sys_vm86+340>: mov %ax,0x10(%ecx)
+0x0210e9b8 <do_sys_vm86+344>: mov $0x174,%ecx
+0x0210e9bd <do_sys_vm86+349>: mov 0x24(%edx),%eax
+0x0210e9c0 <do_sys_vm86+352>: xor %edx,%edx
+0x0210e9c2 <do_sys_vm86+354>: wrmsr
+0x0210e9c4 <do_sys_vm86+356>: jmp 0x210e95c <do_sys_vm86+252>
+0x0210e9c6 <do_sys_vm86+358>: movl $0x0,0x5bc(%esi)
+0x0210e9d0 <do_sys_vm86+368>: jmp 0x210e8d8 <do_sys_vm86+120>
+0x0210e9d5 <do_sys_vm86+373>: cmp $0x4,%eax
+0x0210e9d8 <do_sys_vm86+376>: jne 0x210e8ce <do_sys_vm86+110>
+0x0210e9de <do_sys_vm86+382>: movl $0x47000,0x5bc(%esi)
+0x0210e9e8 <do_sys_vm86+392>: jmp 0x210e8d8 <do_sys_vm86+120>
+0x0210e9ed <do_sys_vm86+397>: lea 0x0(%esi),%esi
+0x0210e9f0 <do_sys_vm86+400>: movl $0x7000,0x5bc(%esi)
+0x0210e9fa <do_sys_vm86+410>: jmp 0x210e8d8 <do_sys_vm86+120>
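
One way to cut the offset noise is to drop the address and symbol+offset prefix from every line before diffing, so only the instruction text is compared. A minimal sketch, assuming the line layout of the listings above (label, then ">: ", then the instruction); insn_text() is a hypothetical helper, not part of gdb or any existing tool:

```c
#include <string.h>

/*
 * Return a pointer to the instruction text of one disassembly line,
 * e.g. "0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax" yields
 * "mov 0x50(%edi),%eax". Lines without a ">: " separator are returned
 * unchanged.
 */
static const char *insn_text(const char *line)
{
	const char *sep = strstr(line, ">: ");

	return sep ? sep + 3 : line;
}
```

Branch targets would still differ between the two builds, since they are part of the instruction text, but unchanged instructions such as the mov/xor/jmp tail before resume_userspace would then compare equal.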

2003-11-18 01:15:26

by Daniel McNeil

Subject: Re: 2.6.0-test9-mm3 - AIO test results

Suparna,

Good news and bad news. Your patch does fix the non-power of two i/o
size problems where AIO previously did not complete:

$ ./aiodio_sparse -s 1751k -r 18k -w 11k
$ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
io_submit() return 9
aiodio_sparse: 9 i/o in flight
aiodio_sparse: offset 165888 filesize 184320 inflight 9
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 18432 res2 0
io_submit() return 1
AIO DIO write done unlinking file
dio_sparse done writing, kill children
aiodio_sparse 0 children had errors

But when testing with aiocp using O_DIRECT to copy a file to
an already allocated file, the aiocp process hangs. I used an i/o
size of 4k and that completed. Using i/o sizes of 1k and 2k,
the aiocp processes hung during io_submit() and are unkillable.
Here are the stack traces:

# ps -fu daniel | grep aiocp
daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2


aiocp D 00000001 1920 1 1902 (NOTLB)
e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660 c0289a16
f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712 e70aa000
Call Trace:
[<c02897fc>] generic_unplug_device+0x50/0xbd
[<c0289a16>] blk_run_queues+0xa9/0x15c
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
[<c0121b70>] schedule+0x3ac/0x7ef
[<c0145f48>] generic_file_aio_read+0x33/0x37
[<c0194ad3>] aio_pread+0x34/0x5f
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194a9f>] aio_pread+0x0/0x5f
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb
[<c03c007b>] rpc_depopulate+0x1aa/0x24b


aiocp D 366EDC94 2083 2037 (NOTLB)
e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
Call Trace:
[<c02897fc>] generic_unplug_device+0x50/0xbd
[<c0289a16>] blk_run_queues+0xa9/0x15c
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
[<c0259d3e>] write_chan+0x165/0x21e
[<c0145f48>] generic_file_aio_read+0x33/0x37
[<c0194ad3>] aio_pread+0x34/0x5f
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194a9f>] aio_pread+0x0/0x5f
[<c02532ab>] tty_write+0x1e8/0x3b2
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb
[<c03c007b>] rpc_depopulate+0x1aa/0x24b



Daniel

On Sun, 2003-11-16 at 21:25, Suparna Bhattacharya wrote:
> On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> > Andrew,
> >
> > I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> > I tested using the test programs aiocp and aiodio_sparse.
> > (see http://developer.osdl.org/daniel/AIO/)
> >
> > Using aiocp with i/o sizes from 1k to 512k to copy files worked
> > without any errors or kernel debug messages.
> >
> > With 64k i/o, the aiodio_sparse program completes without any errors.
> > There are no kernel error messages, so that is good.
> >
> > There are still problems with non power of 2 i/o sizes using AIO and
> > O_DIRECT. It hangs with aio's that do not seem to complete. The test
> > does exit when hitting ^c and there are no kernel messages. Test output
> > below:
>
> Could you check if the following patch fixes the problem for you?
>
> Regards
> Suparna
>
> --------------------------------------------------------------
>
> With this patch, when the DIO code falls back to buffered i/o after
> having submitted part of the i/o, then buffered i/o is issued only
> for the remaining part of the request (i.e. the part not already
> covered by DIO).
>
> diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
> --- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
> +++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
> @@ -74,6 +74,7 @@
> been performed at the start of a
> write */
> int pages_in_io; /* approximate total IO pages */
> + size_t size; /* total request size (doesn't change)*/
> sector_t block_in_file; /* Current offset into the underlying
> file in dio_block units. */
> unsigned blocks_available; /* At block_in_file. changes */
> @@ -226,7 +227,7 @@
> dio_complete(dio, dio->block_in_file << dio->blkbits,
> dio->result);
> /* Complete AIO later if falling back to buffered i/o */
> - if (dio->result != -ENOTBLK) {
> + if (dio->result >= dio->size || dio->rw == READ) {
> aio_complete(dio->iocb, dio->result, 0);
> kfree(dio);
> } else {
> @@ -889,6 +890,7 @@
> dio->blkbits = blkbits;
> dio->blkfactor = inode->i_blkbits - blkbits;
> dio->start_zero_done = 0;
> + dio->size = 0;
> dio->block_in_file = offset >> blkbits;
> dio->blocks_available = 0;
> dio->cur_page = NULL;
> @@ -925,7 +927,7 @@
>
> for (seg = 0; seg < nr_segs; seg++) {
> user_addr = (unsigned long)iov[seg].iov_base;
> - bytes = iov[seg].iov_len;
> + dio->size += bytes = iov[seg].iov_len;
>
> /* Index into the first page of the first block */
> dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> @@ -956,6 +958,13 @@
> }
> } /* end iovec loop */
>
> + if (ret == -ENOTBLK && rw == WRITE) {
> + /*
> + * The remaining part of the request will be
> + * be handled by buffered I/O when we return
> + */
> + ret = 0;
> + }
> /*
> * There may be some unwritten disk at the end of a part-written
> * fs-block-sized block. Go zero that now.
> @@ -986,19 +995,13 @@
> */
> if (dio->is_async) {
> if (ret == 0)
> - ret = dio->result; /* Bytes written */
> - if (ret == -ENOTBLK) {
> - /*
> - * The request will be reissued via buffered I/O
> - * when we return; Any I/O already issued
> - * effectively becomes redundant.
> - */
> - dio->result = ret;
> + ret = dio->result;
> + if (ret > 0 && dio->result < dio->size && rw == WRITE) {
> dio->waiter = current;
> }
> finished_one_bio(dio); /* This can free the dio */
> blk_run_queues();
> - if (ret == -ENOTBLK) {
> + if (dio->waiter) {
> /*
> * Wait for already issued I/O to drain out and
> * release its references to user-space pages
> @@ -1032,7 +1035,8 @@
> }
> dio_complete(dio, offset, ret);
> /* We could have also come here on an AIO file extend */
> - if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
> + if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
> + dio->result < dio->size))
> aio_complete(iocb, ret, 0);
> kfree(dio);
> }
> diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
> --- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
> +++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
> @@ -1895,14 +1895,16 @@
> */
> if (written >= 0 && file->f_flags & O_SYNC)
> status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> - if (written >= 0 && !is_sync_kiocb(iocb))
> + if (written >= count && !is_sync_kiocb(iocb))
> written = -EIOCBQUEUED;
> - if (written != -ENOTBLK)
> + if (written < 0 || written >= count)
> goto out_status;
> /*
> * direct-io write to a hole: fall through to buffered I/O
> + * for completing the rest of the request.
> */
> - written = 0;
> + pos += written;
> + count -= written;
> }
>
> buf = iov->iov_base;
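
The filemap.c hunk above boils down to a small piece of arithmetic: if direct i/o covered only part of the write, advance past that part and hand the rest to buffered i/o. A minimal user-space sketch of that decision follows. The struct and function names are hypothetical and exist only for illustration; this is not the kernel code, just the same pos/count bookkeeping under the assumption that `written` is the byte count (or negative error) returned by the direct-io attempt.

```c
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for the pos/count pair in the write path. */
struct write_req {
	long long pos;   /* file offset of the next byte to write */
	size_t count;    /* bytes still to be written */
};

/*
 * Given the result of the direct-io attempt, decide what is left.
 * Returns 1 if buffered i/o should handle the remainder (pos/count are
 * advanced past the part direct i/o already covered), or 0 if nothing
 * is left to do here (an error, or the whole request was covered).
 */
int dio_fallback_needed(struct write_req *req, ssize_t written)
{
	if (written < 0 || (size_t)written >= req->count)
		return 0;

	req->pos += written;
	req->count -= (size_t)written;
	return 1;
}
```

With an 18k request at offset 4096 of which direct i/o wrote 8k, the helper leaves pos at 12288 and count at 10240 for the buffered path. A return of 0 bytes leaves the whole request for buffered i/o, which matches the patch turning -ENOTBLK into 0 for writes.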

2003-11-18 01:37:30

by Daniel McNeil

[permalink] [raw]
Subject: Re: 2.6.0-test9-mm3 - AIO test results

Obviously, the ps output in my previous email showed that the hangs were
with 1k i/o sizes.

More testing using 2k, 4k, 16k, 32k, 64k, 128k, 256k and 512k all
completed correctly.

Even 11k and 17k worked.

$ ls -l
-rw------- 1 daniel daniel 88289280 Jun 9 16:54 glibc-2.3.2.tar
-rw-rw-r-- 1 daniel daniel 88289280 Nov 17 17:32 ff2


So, only 1k is hanging so far.

Daniel

On Mon, 2003-11-17 at 17:15, Daniel McNeil wrote:
> Suparna,
>
> Good news and bad news. Your patch does fix the non-power of two i/o
> size problems where AIO previously did not complete:
>
> $ ./aiodio_sparse -s 1751k -r 18k -w 11k
> $ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
> io_submit() return 9
> aiodio_sparse: 9 i/o in flight
> aiodio_sparse: offset 165888 filesize 184320 inflight 9
> aiodio_sparse: io_getevent() returned 1
> aiodio_sparse: io_getevent() res 18432 res2 0
> io_submit() return 1
> AIO DIO write done unlinking file
> dio_sparse done writing, kill children
> aiodio_sparse 0 children had errors
>
> But when using aiocp with O_DIRECT to copy a file to an already
> allocated file, the aiocp process hangs. An i/o size of 4k completed.
> With i/o sizes of 1k and 2k, the aiocp processes hung during
> io_submit() and are unkillable. Here are the stack traces:
>
> # ps -fu daniel | grep aiocp
> daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
> daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
>
>
> aiocp D 00000001 1920 1 1902 (NOTLB)
> e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
> f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660 c0289a16
> f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712 e70aa000
> Call Trace:
> [<c02897fc>] generic_unplug_device+0x50/0xbd
> [<c0289a16>] blk_run_queues+0xa9/0x15c
> [<c0123712>] io_schedule+0x26/0x30
> [<c0192242>] direct_io_worker+0x376/0x5ab
> [<c014840f>] generic_file_direct_IO+0x70/0x89
> [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c014840f>] generic_file_direct_IO+0x70/0x89
> [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> [<c0121b70>] schedule+0x3ac/0x7ef
> [<c0145f48>] generic_file_aio_read+0x33/0x37
> [<c0194ad3>] aio_pread+0x34/0x5f
> [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> [<c019316f>] __aio_get_req+0x27/0x158
> [<c0194a9f>] aio_pread+0x0/0x5f
> [<c0194f62>] io_submit_one+0x1ea/0x2b7
> [<c0195110>] sys_io_submit+0xe1/0x194
> [<c03c29a7>] syscall_call+0x7/0xb
> [<c03c007b>] rpc_depopulate+0x1aa/0x24b
>
>
> aiocp D 366EDC94 2083 2037 (NOTLB)
> e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
> 00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
> f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
> Call Trace:
> [<c02897fc>] generic_unplug_device+0x50/0xbd
> [<c0289a16>] blk_run_queues+0xa9/0x15c
> [<c0123712>] io_schedule+0x26/0x30
> [<c0192242>] direct_io_worker+0x376/0x5ab
> [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c014840f>] generic_file_direct_IO+0x70/0x89
> [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> [<c0259d3e>] write_chan+0x165/0x21e
> [<c0145f48>] generic_file_aio_read+0x33/0x37
> [<c0194ad3>] aio_pread+0x34/0x5f
> [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> [<c019316f>] __aio_get_req+0x27/0x158
> [<c0194a9f>] aio_pread+0x0/0x5f
> [<c02532ab>] tty_write+0x1e8/0x3b2
> [<c0194f62>] io_submit_one+0x1ea/0x2b7
> [<c0195110>] sys_io_submit+0xe1/0x194
> [<c03c29a7>] syscall_call+0x7/0xb
> [<c03c007b>] rpc_depopulate+0x1aa/0x24b
>
>
>
> Daniel

2003-11-18 07:22:13

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Mon, 17 Nov 2003, Zwane Mwaikambo wrote:

> A little bird told me to send diffs... But there is a lot of noise due to
> offsets i'm afraid.

Another note from our avian friends; i seem to have sent a slightly
different dump from the patch, although they do both achieve the same
effect. I shall append it for completeness.

0x0210e860 <do_sys_vm86+0>: push %edi
0x0210e861 <do_sys_vm86+1>: mov $0xffffe000,%eax
0x0210e866 <do_sys_vm86+6>: push %esi
0x0210e867 <do_sys_vm86+7>: and %esp,%eax
0x0210e869 <do_sys_vm86+9>: push %ebx
0x0210e86a <do_sys_vm86+10>: mov 0x10(%esp,1),%edi
0x0210e86e <do_sys_vm86+14>: mov 0x14(%esp,1),%esi
0x0210e872 <do_sys_vm86+18>: movl $0x0,0x1c(%edi)
0x0210e879 <do_sys_vm86+25>: movl $0x0,0x20(%edi)
0x0210e880 <do_sys_vm86+32>: mov (%eax),%edx
0x0210e882 <do_sys_vm86+34>: mov 0x30(%edi),%eax
0x0210e885 <do_sys_vm86+37>: mov %eax,0x5b8(%edx)
0x0210e88b <do_sys_vm86+43>: mov 0x30(%edi),%edx
0x0210e88e <do_sys_vm86+46>: mov 0xbc(%edi),%eax
0x0210e894 <do_sys_vm86+52>: and $0xdd5,%edx
0x0210e89a <do_sys_vm86+58>: mov %edx,0x30(%edi)
0x0210e89d <do_sys_vm86+61>: mov 0x30(%eax),%eax
0x0210e8a0 <do_sys_vm86+64>: and $0xfffff22a,%eax
0x0210e8a5 <do_sys_vm86+69>: or %eax,%edx
0x0210e8a7 <do_sys_vm86+71>: mov 0x54(%edi),%eax
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
0x0210e8b6 <do_sys_vm86+86>: je 0x210e9f0 <do_sys_vm86+400>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
0x0210e8bf <do_sys_vm86+95>: ja 0x210e9d5 <do_sys_vm86+373>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
0x0210e8c8 <do_sys_vm86+104>: je 0x210e9c6 <do_sys_vm86+358>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
0x0210e8e5 <do_sys_vm86+133>: mov 0x360(%esi),%eax
0x0210e8eb <do_sys_vm86+139>: mov %eax,0x5c0(%esi)
0x0210e8f1 <do_sys_vm86+145>: movl %fs,0x5c4(%esi)
0x0210e8f7 <do_sys_vm86+151>: movl %gs,0x5c8(%esi)
0x0210e8fd <do_sys_vm86+157>: mov $0xffffe000,%ebx
0x0210e902 <do_sys_vm86+162>: and %esp,%ebx
0x0210e904 <do_sys_vm86+164>: mov 0x14(%ebx),%eax
0x0210e907 <do_sys_vm86+167>: inc %eax
0x0210e908 <do_sys_vm86+168>: mov %eax,0x14(%ebx)
0x0210e90b <do_sys_vm86+171>: mov 0x10(%ebx),%eax
0x0210e90e <do_sys_vm86+174>: mov 0x4(%esi),%edx
0x0210e911 <do_sys_vm86+177>: shl $0x9,%eax
0x0210e914 <do_sys_vm86+180>: lea 0x26ff000(%eax),%ecx
0x0210e91a <do_sys_vm86+186>: lea 0x4c(%edi),%eax
0x0210e91d <do_sys_vm86+189>: mov %eax,0x360(%esi)
0x0210e923 <do_sys_vm86+195>: sub 0x1c(%edx),%eax
0x0210e926 <do_sys_vm86+198>: add 0x20(%edx),%eax
0x0210e929 <do_sys_vm86+201>: mov %eax,0x4(%ecx)
0x0210e92c <do_sys_vm86+204>: mov 0x25fe52c,%eax
0x0210e931 <do_sys_vm86+209>: test $0x800,%eax
0x0210e936 <do_sys_vm86+214>: je 0x210e942 <do_sys_vm86+226>
0x0210e938 <do_sys_vm86+216>: movl $0x0,0x364(%esi)
0x0210e942 <do_sys_vm86+226>: lea 0x340(%esi),%edx
0x0210e948 <do_sys_vm86+232>: mov 0x20(%edx),%eax
0x0210e94b <do_sys_vm86+235>: mov %eax,0x4(%ecx)
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
0x0210e95a <do_sys_vm86+250>: jne 0x210e9b0 <do_sys_vm86+336>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
0x0210e969 <do_sys_vm86+265>: jne 0x210e9a9 <do_sys_vm86+329>
0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax
0x0210e96e <do_sys_vm86+270>: mov %eax,0x5b4(%esi)
0x0210e974 <do_sys_vm86+276>: testb $0x1,0x4c(%edi)
0x0210e978 <do_sys_vm86+280>: jne 0x210e9a0 <do_sys_vm86+320>
0x0210e97a <do_sys_vm86+282>: push $0x255f121
0x0210e97f <do_sys_vm86+287>: call 0x21285a0 <printk>
0x0210e984 <do_sys_vm86+292>: mov 0x4(%esi),%edx
0x0210e987 <do_sys_vm86+295>: xor %eax,%eax
0x0210e989 <do_sys_vm86+297>: mov %eax,%fs
0x0210e98b <do_sys_vm86+299>: mov %eax,%gs
0x0210e98d <do_sys_vm86+301>: mov %edi,%esp
0x0210e98f <do_sys_vm86+303>: mov %edx,%ebp
0x0210e991 <do_sys_vm86+305>: jmp 0xfffeb100 <resume_userspace>
0x0210e996 <do_sys_vm86+310>: pop %esi
0x0210e997 <do_sys_vm86+311>: pop %ebx
0x0210e998 <do_sys_vm86+312>: pop %esi
0x0210e999 <do_sys_vm86+313>: pop %edi
0x0210e99a <do_sys_vm86+314>: ret
0x0210e99b <do_sys_vm86+315>: nop
0x0210e99c <do_sys_vm86+316>: lea 0x0(%esi,1),%esi
0x0210e9a0 <do_sys_vm86+320>: push %esi
0x0210e9a1 <do_sys_vm86+321>: call 0x210e5b0 <mark_screen_rdonly>
0x0210e9a6 <do_sys_vm86+326>: pop %eax
0x0210e9a7 <do_sys_vm86+327>: jmp 0x210e97a <do_sys_vm86+282>
0x0210e9a9 <do_sys_vm86+329>: call 0x21222d0 <preempt_schedule>
0x0210e9ae <do_sys_vm86+334>: jmp 0x210e96b <do_sys_vm86+267>
0x0210e9b0 <do_sys_vm86+336>: mov 0x24(%edx),%ax
0x0210e9b4 <do_sys_vm86+340>: mov %ax,0x10(%ecx)
0x0210e9b8 <do_sys_vm86+344>: mov $0x174,%ecx
0x0210e9bd <do_sys_vm86+349>: mov 0x24(%edx),%eax
0x0210e9c0 <do_sys_vm86+352>: xor %edx,%edx
0x0210e9c2 <do_sys_vm86+354>: wrmsr
0x0210e9c4 <do_sys_vm86+356>: jmp 0x210e95c <do_sys_vm86+252>
0x0210e9c6 <do_sys_vm86+358>: movl $0x0,0x5bc(%esi)
0x0210e9d0 <do_sys_vm86+368>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9d5 <do_sys_vm86+373>: cmp $0x4,%eax
0x0210e9d8 <do_sys_vm86+376>: jne 0x210e8ce <do_sys_vm86+110>
0x0210e9de <do_sys_vm86+382>: movl $0x47000,0x5bc(%esi)
0x0210e9e8 <do_sys_vm86+392>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9ed <do_sys_vm86+397>: lea 0x0(%esi),%esi
0x0210e9f0 <do_sys_vm86+400>: movl $0x7000,0x5bc(%esi)
0x0210e9fa <do_sys_vm86+410>: jmp 0x210e8d8 <do_sys_vm86+120>

2003-11-18 11:50:00

by Suparna Bhattacharya

[permalink] [raw]
Subject: Re: 2.6.0-test9-mm3 - AIO test results

I don't seem to be able to recreate this at my end, even with 1k
block sizes. Did you notice whether this problem occurs without
the latest patch?

Regards
Suparna

On Mon, Nov 17, 2003 at 05:37:14PM -0800, Daniel McNeil wrote:
> Obviously, the ps output in my previous email showed that the hangs were
> with 1k i/o sizes.
>
> More testing using 2k, 4k, 16k, 32k, 64k, 128k, 256k and 512k all
> completed correctly.
>
> Even 11k and 17k worked.
>
> $ ls -l
> -rw------- 1 daniel daniel 88289280 Jun 9 16:54 glibc-2.3.2.tar
> -rw-rw-r-- 1 daniel daniel 88289280 Nov 17 17:32 ff2
>
>
> So, only 1k is hanging so far.
>
> Daniel

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India

2003-11-18 15:47:23

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops


On Tue, 18 Nov 2003, Zwane Mwaikambo wrote:
>
> Another note from our avian friends; i seem to have sent a slightly
> different dump from the patch, although they do both achieve the same
> effect. I shall append it for completeness.

Hmm. I don't see anything. However, it's a lot easier to read the
gcc-generated assembly ("make arch/i386/kernel/vm86.s") than it is to read
the objdump disassembly.

It's also a lot easier to see what the assembly language is when giving the

-fno-reorder-blocks

switch to gcc. Without it, modern gcc's tend to have _way_ too many jumps
around. But maybe that actually changes the behaviour too.

Linus
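
For anyone who wants to see what the flag does on something small before rebuilding vm86.c, here is a rough sketch. The function reproduces the cpu-type dispatch visible in the do_sys_vm86 disassembly earlier in the thread (compare against 2, 3 and 4, storing 0, 0x7000, 0x47000 or 0x247000). The function name and the FLAG_* macro names are made up for illustration; only the constants are taken from the dump, and they are the usual x86 EFLAGS bits.

```c
/* x86 EFLAGS bits; the combined values match the immediates in the dump. */
#define FLAG_IOPL 0x00003000UL	/* I/O privilege level field */
#define FLAG_NT   0x00004000UL	/* nested task */
#define FLAG_AC   0x00040000UL	/* alignment check */
#define FLAG_ID   0x00200000UL	/* CPUID available */

/*
 * Hypothetical stand-alone version of the dispatch seen in the dump:
 * pick the flag mask for a given cpu type code.
 */
unsigned long v86mask_for(unsigned long cpu_type)
{
	switch (cpu_type) {
	case 2:
		return 0;
	case 3:
		return FLAG_NT | FLAG_IOPL;			/* 0x7000 */
	case 4:
		return FLAG_AC | FLAG_NT | FLAG_IOPL;		/* 0x47000 */
	default:
		return FLAG_ID | FLAG_AC | FLAG_NT | FLAG_IOPL;	/* 0x247000 */
	}
}
```

Saving this as, say, v86mask.c and running `gcc -O2 -S v86mask.c` once with and once without `-fno-reorder-blocks` gives two .s files to diff. On a function this small the layouts may well come out identical, depending on the gcc version, so treat it only as a way to get familiar with the output format.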

2003-11-18 16:17:16

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Tue, 18 Nov 2003, Linus Torvalds wrote:

> Hmm. I don't see anything. However, it's a lot easier to read the
> gcc-generated assembly ("make arch/i386/kernel/vm86.s") than it is to read
> the objdump disassembly.
>
>
> It's also a lot easier to see what the assembly language is when giving
> the
>
> -fno-reorder-blocks

I'll recompile and verify that the bug can be reproduced and worked around
with that flag.

> switch to gcc. Without it, modern gcc's tend to have _way_ too many jumps
> around. But maybe that actually changes the behaviour too.

Here are diffs from the do_sys_vm86 only.

--- asm-before 2003-11-18 10:56:02.967643808 -0500
+++ asm-after 2003-11-18 10:55:37.880457640 -0500
@@ -897,6 +897,10 @@
.LFE473:
.Lfe4:
.size sys_vm86,.Lfe4-sys_vm86
+ .section .rodata.str1.1
+.LC6:
+ .string "ooh la la\n"
+ .text
.p2align 4,,15
.type do_sys_vm86,@function
do_sys_vm86:
@@ -1053,29 +1057,37 @@
jne .L213
.L210:
.loc 1 315 0
+ pushl $.LC6
+.LCFI98:
+ call printk
+ .loc 1 316 0
movl 4(%esi), %edx
#APP
xorl %eax,%eax; movl %eax,%fs; movl %eax,%gs
movl %edi,%esp
movl %edx,%ebp
jmp resume_userspace
- .loc 1 323 0
#NO_APP
- popl %ebx
-.LCFI98:
+.LBE53:
popl %esi
.LCFI99:
- popl %edi
+ .loc 1 324 0
+ popl %ebx
.LCFI100:
+ popl %esi
+.LCFI101:
+ popl %edi
+.LCFI102:
ret
.loc 1 313 0
.p2align 4,,7
.L213:
+.LBB65:
pushl %esi
-.LCFI101:
+.LCFI103:
call mark_screen_rdonly
popl %eax
-.LCFI102:
+.LCFI104:
jmp .L210
.loc 1 310 0
.L212:
@@ -1083,7 +1095,7 @@
jmp .L197
.loc 14 454 0
.L211:
-.LBB65:
+.LBB66:
movw 36(%edx), %ax
movw %ax, 16(%ecx)
.loc 14 455 0
@@ -1097,7 +1109,7 @@
.p2align 4,,7
.L183:
.loc 1 283 0
-.LBE65:
+.LBE66:
movl $0, 1468(%esi)
.loc 1 284 0
jmp .L182
@@ -1115,7 +1127,7 @@
movl $28672, 1468(%esi)
.loc 1 287 0
jmp .L182
-.LBE53:
+.LBE65:
.LFE475:
.Lfe5:
.size do_sys_vm86,.Lfe5-do_sys_vm86

2003-11-18 16:37:35

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops


On Tue, 18 Nov 2003, Zwane Mwaikambo wrote:
>
> Here are diffs from the do_sys_vm86 only.

Ok. Much more readable.

And there is something very suspicious there.

The code with and without the printk() looks _identical_ apart from some
trivial label renumbering, and the added

pushl $.LC6
call printk
.. asm ..
popl %esi

which all looks fine (esi is dead at that point, so the compiler is just
using a "popl" as a shorter form of "addl $4,%esp").

Btw, you seem to compile with debugging, which makes the assembly
language pretty much unreadable and accounts for most of the
differences: the line numbers change. If you compile a kernel where the
line numbers don't change (by commenting _out_ the printk rather than
removing the whole line), your diff would be more readable.

Anyway, there are _zero_ differences.

Just for fun, try this: move the "printk()" to _below_ the "asm"
statement. It will never actually get executed, but if it's an issue of
some subtle code or data placement things (cache lines etc), maybe that
also hides the oops, since all the same code and data will be generated,
just not run...

Linus

2003-11-18 17:13:53

by Martin J. Bligh

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

>> Btw, you seem to compile with debugging, which makes the assembly
>> language pretty much unreadable and accounts for most of the
>> differences: the line numbers change. If you compile a kernel where the
>> line numbers don't change (by commenting _out_ the printk rather than
>> removing the whole line), your diff would be more readable.
>
> Aha! Thanks for mentioning that, noted.
>
>> Anyway, there are _zero_ differences.
>>
>> Just for fun, try this: move the "printk()" to _below_ the "asm"
>> statement. It will never actually get executed, but if it's an issue of
>> some subtle code or data placement things (cache lines etc), maybe that
>> also hides the oops, since all the same code and data will be generated,
>> just not run...
>
> Ok i just tried that and it still fails. Matt Mackall suggested i also try
> writing a minimal printk which has the same effect.

The other thing I've found printks to hide before is timing bugs / races.
Unfortunately I can't see one here, but maybe someone else can ;-)
Maybe inserting a 1ms delay or something in place of the printk would
have the same effect?

M.

2003-11-18 17:09:49

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Tue, 18 Nov 2003, Linus Torvalds wrote:

> Ok. Much more readable.
>
> And there is something very suspicious there.
>
> The code with and without the printk() looks _identical_ apart from some
> trivial label renumbering, and the added
>
> pushl $.LC6
> call printk
> .. asm ..
> popl %esi
>
> which all looks fine (esi is dead at that point, so the compiler is just
> using a "popl" as a shorter form of "addl $4,%esp").
>
> Btw, you seem to compile with debugging, which makes the assembly
> language pretty much unreadable and accounts for most of the
> differences: the line numbers change. If you compile a kernel where the
> line numbers don't change (by commenting _out_ the printk rather than
> removing the whole line), your diff would be more readable.

Aha! Thanks for mentioning that, noted.

> Anyway, there are _zero_ differences.
>
> Just for fun, try this: move the "printk()" to _below_ the "asm"
> statement. It will never actually get executed, but if it's an issue of
> some subtle code or data placement things (cache lines etc), maybe that
> also hides the oops, since all the same code and data will be generated,
> just not run...

Ok i just tried that and it still fails. Matt Mackall suggested i also try
writing a minimal printk which has the same effect.

2003-11-18 17:23:32

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Tue, 18 Nov 2003, Martin J. Bligh wrote:

> The other thing I've found printks to hide before is timing bugs / races.
> Unfortunately I can't see one here, but maybe someone else can ;-)
> Maybe inserting a 1ms delay or something in place of the printk would
> have the same effect?

I've tried a number of timing related workarounds, namely;
schedule_timeout(2*HZ) and some long spinning loops. I've also thrown a
schedule() in there at some point.

2003-11-18 23:48:19

by Daniel McNeil

Subject: Re: 2.6.0-test9-mm3 - AIO test results

Suparna,

I was unable to reproduce the hang in io_submit() without your patch.
I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.

I re-ran with your patch with both as-iosched and deadline and both
hung in io_submit(). aiocp would run a few times, but I put the
aiocp in a while loop and it hung on the 1st or 2nd time. It
did get most of the way through copying the file before hanging.
This is on a 2-proc to ide disks running ext3.

Here is the stack trace and other info for as-iosched:
daniel 2005 0.7 0.0 1388 384 pts/0 D 13:51 0:08 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
cat /proc/2005/wchan
io_schedule

aiocp D 00000001 2005 1870 (NOTLB)
e53cfc08 00200086 c18d3c80 00000001 00000003 c02897fc 00000060 00200246
f7cdb8b4 c0191630 c18d3c80 0000bfc6 78d5d3e5 00000233 e4dc1980 c0289a16
f7cdb8b4 d92978e4 c18d3c80 00000000 00000001 e53cfc14 c0123712 e53ce000
Call Trace:
[<c02897fc>] generic_unplug_device+0x50/0xbd
[<c0191630>] dio_bio_add_page+0x34/0x79
[<c0289a16>] blk_run_queues+0xa9/0x15c
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0147a80>] __generic_file_aio_write_nolock+0xa3a/0xda5
[<c025b049>] pty_write+0x1c8/0x1ca
[<c01480a4>] generic_file_aio_write+0x7e/0x115
[<c0256d12>] opost+0x9e/0x1cf
[<c01aa4a3>] ext3_file_write+0x3f/0xcc
[<c0194b3a>] aio_pwrite+0x3c/0xad
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194afe>] aio_pwrite+0x0/0xad
[<c02532ab>] tty_write+0x1e8/0x3b2
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb

For deadline iosched:

daniel 1889 0.1 0.0 1388 384 pts/0 D 15:12 0:01 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2

$ cat /proc/1889/wchan
io_schedule

$ cat /sys/block/hdb/stat
209058 23145 45744 58542 209022 22069 0 20758 45210

aiocp D 0AD7701D 1889 1752 (NOTLB)
ee2ddd04 00200086 f75e6660 0ad7701d 0000004e 00200282 ebd37cbc 0ad7701d
0000004e f75e6660 c18d3c80 00060539 0ad7701d 0000004e f75e6000 0000006b
ee2ddd10 c0192212 c18d3c80 00000000 00000001 ee2ddd10 c0123712 ee2dc000
Call Trace:
[<c0192212>] direct_io_worker+0x346/0x5ab
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
[<c0259d3e>] write_chan+0x165/0x21e
[<c0145f48>] generic_file_aio_read+0x33/0x37
[<c0194ad3>] aio_pread+0x34/0x5f
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194a9f>] aio_pread+0x0/0x5f
[<c02532ab>] tty_write+0x1e8/0x3b2
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb


The hung processes are stuck in the 'D' state and unkillable, of course.
It would appear something is wrong with your patch. Any ideas?

Daniel




On Tue, 2003-11-18 at 03:55, Suparna Bhattacharya wrote:
> I don't seem to be able to recreate this at my end - even with 1k
> block sizes. Did you notice if this problem occurs without
> the latest patch ?
>
> Regards
> Suparna
>
> On Mon, Nov 17, 2003 at 05:37:14PM -0800, Daniel McNeil wrote:
> > Obviously, the ps output in my previous email showed that the hangs were
> > with 1k i/o sizes.
> >
> > More testing using 2k, 4k, 16k, 32k, 64k, 128k, 256k and 512k all
> > completed correctly.
> >
> > Even 11k and 17k worked.
> >
> > $ ls -l
> > -rw------- 1 daniel daniel 88289280 Jun 9 16:54 glibc-2.3.2.tar
> > -rw-rw-r-- 1 daniel daniel 88289280 Nov 17 17:32 ff2
> >
> >
> > So, only 1k is hanging so far.
> >
> > Daniel
> >
> > On Mon, 2003-11-17 at 17:15, Daniel McNeil wrote:
> > > Suparna,
> > >
> > > Good news and bad news. Your patch does fix the non-power of two i/o
> > > size problems where AIO previously did not complete:
> > >
> > > $ ./aiodio_sparse -s 1751k -r 18k -w 11k
> > > $ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
> > > io_submit() return 9
> > > aiodio_sparse: 9 i/o in flight
> > > aiodio_sparse: offset 165888 filesize 184320 inflight 9
> > > aiodio_sparse: io_getevent() returned 1
> > > aiodio_sparse: io_getevent() res 18432 res2 0
> > > io_submit() return 1
> > > AIO DIO write done unlinking file
> > > dio_sparse done writing, kill children
> > > aiodio_sparse 0 children had errors
> > >
> > > But when testing using aiocp using O_DIRECT to copy a file to
> > > an already allocated file, the aiocp process hangs. I used i/o
> > > size of 4k and that completed. Using i/o sizes of 1k and 2k,
> > > the aiocp processes hung during io_submit() and are unkillable.
> > > Here are the stack traces:
> > >
> > > # ps -fu daniel | grep aiocp
> > > daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
> > > daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
> > >
> > >
> > > aiocp D 00000001 1920 1 1902 (NOTLB)
> > > e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
> > > f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660
> > > c0289a16
> > > f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712
> > > e70aa000
> > > Call Trace:
> > > [<c02897fc>] generic_unplug_device+0x50/0xbd
> > > [<c0289a16>] blk_run_queues+0xa9/0x15c
> > > [<c0123712>] io_schedule+0x26/0x30
> > > [<c0192242>] direct_io_worker+0x376/0x5ab
> > > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > > [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > > [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> > > [<c0121b70>] schedule+0x3ac/0x7ef
> > > [<c0145f48>] generic_file_aio_read+0x33/0x37
> > > [<c0194ad3>] aio_pread+0x34/0x5f
> > > [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> > > [<c019316f>] __aio_get_req+0x27/0x158
> > > [<c0194a9f>] aio_pread+0x0/0x5f
> > > [<c0194f62>] io_submit_one+0x1ea/0x2b7
> > > [<c0195110>] sys_io_submit+0xe1/0x194
> > > [<c03c29a7>] syscall_call+0x7/0xb
> > > [<c03c007b>] rpc_depopulate+0x1aa/0x24b
> > >
> > >
> > > aiocp D 366EDC94 2083 2037 (NOTLB)
> > > e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
> > > 00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
> > > f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
> > > Call Trace:
> > > [<c02897fc>] generic_unplug_device+0x50/0xbd
> > > [<c0289a16>] blk_run_queues+0xa9/0x15c
> > > [<c0123712>] io_schedule+0x26/0x30
> > > [<c0192242>] direct_io_worker+0x376/0x5ab
> > > [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > > [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> > > [<c0259d3e>] write_chan+0x165/0x21e
> > > [<c0145f48>] generic_file_aio_read+0x33/0x37
> > > [<c0194ad3>] aio_pread+0x34/0x5f
> > > [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> > > [<c019316f>] __aio_get_req+0x27/0x158
> > > [<c0194a9f>] aio_pread+0x0/0x5f
> > > [<c02532ab>] tty_write+0x1e8/0x3b2
> > > [<c0194f62>] io_submit_one+0x1ea/0x2b7
> > > [<c0195110>] sys_io_submit+0xe1/0x194
> > > [<c03c29a7>] syscall_call+0x7/0xb
> > > [<c03c007b>] rpc_depopulate+0x1aa/0x24b
> > >
> > >
> > >
> > > Daniel
> > >
> > > On Sun, 2003-11-16 at 21:25, Suparna Bhattacharya wrote:
> > > > On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> > > > > Andrew,
> > > > >
> > > > > I'm testing test9-mm3 on a 2-proc Xeon with a ext3 file system.
> > > > > I tested using the test programs aiocp and aiodio_sparse.
> > > > > (see http://developer.osdl.org/daniel/AIO/)
> > > > >
> > > > > Using aiocp with i/o sizes from 1k to 512k to copy files worked
> > > > > without any errors or kernel debug messages.
> > > > >
> > > > > With 64k i/o, the aiodio_sparse program completes without any errors.
> > > > > There are no kernel error messages, so that is good.
> > > > >
> > > > > There are still problems with non power of 2 i/o sizes using AIO and
> > > > > O_DIRECT. It hangs with aio's that do not seem to complete. The test
> > > > > does exit when hitting ^c and there are no kernel messages. Test output
> > > > > below:
> > > >
> > > > Could you check if the following patch fixes the problem for you ?
> > > >
> > > > Regards
> > > > Suparna
> > > >
> > > > --------------------------------------------------------------
> > > >
> > > > With this patch, when the DIO code falls back to buffered i/o after
> > > > having submitted part of the i/o, then buffered i/o is issued only
> > > > for the remaining part of the request (i.e. the part not already
> > > > covered by DIO).
> > > >
> > > > diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
> > > > --- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
> > > > +++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
> > > > @@ -74,6 +74,7 @@
> > > > been performed at the start of a
> > > > write */
> > > > int pages_in_io; /* approximate total IO pages */
> > > > + size_t size; /* total request size (doesn't change)*/
> > > > sector_t block_in_file; /* Current offset into the underlying
> > > > file in dio_block units. */
> > > > unsigned blocks_available; /* At block_in_file. changes */
> > > > @@ -226,7 +227,7 @@
> > > > dio_complete(dio, dio->block_in_file << dio->blkbits,
> > > > dio->result);
> > > > /* Complete AIO later if falling back to buffered i/o */
> > > > - if (dio->result != -ENOTBLK) {
> > > > + if (dio->result >= dio->size || dio->rw == READ) {
> > > > aio_complete(dio->iocb, dio->result, 0);
> > > > kfree(dio);
> > > > } else {
> > > > @@ -889,6 +890,7 @@
> > > > dio->blkbits = blkbits;
> > > > dio->blkfactor = inode->i_blkbits - blkbits;
> > > > dio->start_zero_done = 0;
> > > > + dio->size = 0;
> > > > dio->block_in_file = offset >> blkbits;
> > > > dio->blocks_available = 0;
> > > > dio->cur_page = NULL;
> > > > @@ -925,7 +927,7 @@
> > > >
> > > > for (seg = 0; seg < nr_segs; seg++) {
> > > > user_addr = (unsigned long)iov[seg].iov_base;
> > > > - bytes = iov[seg].iov_len;
> > > > + dio->size += bytes = iov[seg].iov_len;
> > > >
> > > > /* Index into the first page of the first block */
> > > > dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> > > > @@ -956,6 +958,13 @@
> > > > }
> > > > } /* end iovec loop */
> > > >
> > > > + if (ret == -ENOTBLK && rw == WRITE) {
> > > > + /*
> > > > + * The remaining part of the request will be
> > > > + * be handled by buffered I/O when we return
> > > > + */
> > > > + ret = 0;
> > > > + }
> > > > /*
> > > > * There may be some unwritten disk at the end of a part-written
> > > > * fs-block-sized block. Go zero that now.
> > > > @@ -986,19 +995,13 @@
> > > > */
> > > > if (dio->is_async) {
> > > > if (ret == 0)
> > > > - ret = dio->result; /* Bytes written */
> > > > - if (ret == -ENOTBLK) {
> > > > - /*
> > > > - * The request will be reissued via buffered I/O
> > > > - * when we return; Any I/O already issued
> > > > - * effectively becomes redundant.
> > > > - */
> > > > - dio->result = ret;
> > > > + ret = dio->result;
> > > > + if (ret > 0 && dio->result < dio->size && rw == WRITE) {
> > > > dio->waiter = current;
> > > > }
> > > > finished_one_bio(dio); /* This can free the dio */
> > > > blk_run_queues();
> > > > - if (ret == -ENOTBLK) {
> > > > + if (dio->waiter) {
> > > > /*
> > > > * Wait for already issued I/O to drain out and
> > > > * release its references to user-space pages
> > > > @@ -1032,7 +1035,8 @@
> > > > }
> > > > dio_complete(dio, offset, ret);
> > > > /* We could have also come here on an AIO file extend */
> > > > - if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
> > > > + if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
> > > > + dio->result < dio->size))
> > > > aio_complete(iocb, ret, 0);
> > > > kfree(dio);
> > > > }
> > > > diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
> > > > --- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
> > > > +++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
> > > > @@ -1895,14 +1895,16 @@
> > > > */
> > > > if (written >= 0 && file->f_flags & O_SYNC)
> > > > status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> > > > - if (written >= 0 && !is_sync_kiocb(iocb))
> > > > + if (written >= count && !is_sync_kiocb(iocb))
> > > > written = -EIOCBQUEUED;
> > > > - if (written != -ENOTBLK)
> > > > + if (written < 0 || written >= count)
> > > > goto out_status;
> > > > /*
> > > > * direct-io write to a hole: fall through to buffered I/O
> > > > + * for completing the rest of the request.
> > > > */
> > > > - written = 0;
> > > > + pos += written;
> > > > + count -= written;
> > > > }
> > > >
> > > > buf = iov->iov_base;
> > >
> > > --
> > > To unsubscribe, send a message with 'unsubscribe linux-aio' in
> > > the body to [email protected]. For more info on Linux AIO,
> > > see: http://www.kvack.org/aio/
> > > Don't email: <a href=mailto:"[email protected]">[email protected]</a>
> >

2003-11-18 23:49:00

by Jon Foster

Subject: Re:Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

Hi,

> The other thing I've found printks to hide before is timing bugs / races.
> Unfortunately I can't see one here, but maybe someone else can ;-)
> Maybe inserting a 1ms delay or something in place of the printk would
> have the same effect?

One of my colleagues had an interesting bug caused by an
uninitialized variable - a printk() in the right place happened
to set the variable (which gcc had put in a register) to the
correct value for his code to work.

I've tried looking for uses of uninitialized registers in entry.S,
but the assembly there isn't easy to follow.

What happens if you replace the printk with assembly code
that clobbers eax, ecx, edx and (most of) eflags? (Assuming
I've remembered the calling convention correctly, those are
the registers that printk will be overwriting).
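
One way such a replacement could look, as a minimal sketch assuming GCC-style extended inline asm and the usual i386 C calling convention (eax, ecx and edx are caller-saved). The helper name is hypothetical, and the #if guard only keeps the file compilable on other architectures.

```c
/*
 * Hypothetical stand-in for the register side effects of a C call:
 * zero the three caller-saved registers. xorl also rewrites the
 * arithmetic flags, which the "cc" clobber tells the compiler about.
 */
void clobber_scratch_registers(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__(
		"xorl %%eax, %%eax\n\t"
		"xorl %%ecx, %%ecx\n\t"
		"xorl %%edx, %%edx"
		: /* no outputs */
		: /* no inputs */
		: "eax", "ecx", "edx", "cc");
#endif
}
```

Because the clobber list is declared, the compiler will not keep live values in those registers across the statement, so this only perturbs state that a real call to printk would also have been free to overwrite.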

Kind regards,

Jon

2003-11-19 03:28:47

by Zwane Mwaikambo

Subject: Re:Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Tue, 18 Nov 2003, Jon Foster wrote:

> > The other thing I've found printks to hide before is timing bugs / races.
> > Unfortunately I can't see one here, but maybe someone else can ;-)
> > Maybe inserting a 1ms delay or something in place of the printk would
> > have the same effect?
>
> One of my colleagues had an interesting bug caused by an
> uninitialized variable - a printk() in the right place happened
> to set the variable (which gcc had put in a register) to the
> correct value for his code to work.

Very nice =)

> I've tried looking for uses of uninitialized registers in entry.S,
> but the assembly there isn't easy to follow.

I've walked that code and can't see anything wrong anywhere.

> What happens if you replace the printk with assembly code
> that clobbers eax, ecx, edx and (most of) eflags? (Assuming
> I've remembered the calling convention correctly, those are
> the registers that printk will be overwriting).

Well i have tried a number of heavyweight functions, so far none of them
have had the effect that a printk has had. It's also worth noting that a
printk lookalike function such as the following, does not fix things
either.

asmlinkage int kooh_la_la(const char *fmt, ...)
{
	return strlen(fmt);
}

2003-11-19 05:45:37

by Andrew Morton

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

Zwane Mwaikambo <[email protected]> wrote:
>
> I've walked that code and can't see anything wrong anywhere.

fwiw, X comes up happily on a couple of boxes here, with the 4g/4g split
enabled.

Have you tried a different compiler?

2003-11-19 06:58:37

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Tue, 18 Nov 2003, Andrew Morton wrote:

> Zwane Mwaikambo <[email protected]> wrote:
> >
> > I've walked that code and can't see anything wrong anywhere.
>
> fwiw, X comes up happily on a couple of boxes here, with the 4g/4g split
> enabled.

The exact same kernel runs fine on my other test boxes. But i really don't
have faith in this compiler, it's the same one which constantly seems to be
tripping into various problems.

> Have you tried a different compiler?

I just tried the RH9 2.96 and it also triple faulted. Oh my.. The only
unique thing about this hardware compared to the other stuff i have here
is that it's an AMD K6. Everything else is Intel.

2003-11-19 07:24:43

by Linus Torvalds

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops


On Wed, 19 Nov 2003, Zwane Mwaikambo wrote:
>
> I just tried the RH9 2.96 and it also triple faulted. Oh my.. The only
> unique thing about this hardware compared to the other stuff i have here
> is that it's an AMD K6. Everything else is Intel.

Different TLB sizes (and organizations) etc can _easily_ matter, if the
Intel one just happens to work because something stays in the TLB while
the page table mapping is incorrect and keeps the system afloat.

Or - and in this case more likely - since the problem is fixed by running
a (complex) thing that trashes all over the DTLB/ITLB, it's more likely
that there might be a _missing_ TLB invalidate somewhere, and that the
Intel boxes stay up because they have a smaller TLB and the stale entry
gets flushed out early from them.

But you already tried a "flush_tlb_all()" which _should_ have flushed
absolutely everything, including global tables. I dunno. It could be
hitting a CPU bug too, of course.

It would be interesting to hear if other K6 users see problems..

Linus

2003-11-19 20:32:45

by Matt Mackall

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Tue, Nov 18, 2003 at 08:37:25AM -0800, Linus Torvalds wrote:
>
> On Tue, 18 Nov 2003, Zwane Mwaikambo wrote:
> >
> > Here are diffs from the do_sys_vm86 only.
>
> Ok. Much more readable.
>
> And there is something very suspicious there.
>
> The code with and without the printk() looks _identical_ apart from some
> trivial label renumbering, and the added
>
> pushl $.LC6
> call printk
> .. asm ..
> popl %esi
>
> which all looks fine (esi is dead at that point, so the compiler is just
> using a "popl" as a shorter form of "addl $4,%esp").
>
> Btw, you seem to compile with debugging, which makes the assembly
> language pretty much unreadable and accounts for most of the
> differences: the line numbers change. If you compile a kernel where the
> line numbers don't change (by commenting _out_ the printk rather than
> removing the whole line), your diff would be more readable.
>
> Anyway, there are _zero_ differences.
>
> Just for fun, try this: move the "printk()" to _below_ the "asm"
> statement. It will never actually get executed, but if it's an issue of
> some subtle code or data placement things (cache lines etc), maybe that
> also hides the oops, since all the same code and data will be generated,
> just not run...

Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
doesn't help. So my suspicion is that the printk is changing the
timing just enough on Zwane's box that he's getting a timer interrupt
knocking him out of vm86 mode before he hits a fatal bit in the fault
handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
there's probably something amiss in the trampoline code.

--
Matt Mackall : http://www.selenic.com : Linux development and consulting

2003-11-19 23:09:45

by Matt Mackall

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
>
> Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> doesn't help. So my suspicion is that the printk is changing the
> timing just enough on Zwane's box that he's getting a timer interrupt
> knocking him out of vm86 mode before he hits a fatal bit in the fault
> handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> there's probably something amiss in the trampoline code.

Some more datapoints:

CPU distro compiler video X result
K6-2/500 connectiva 9 2.96 trident 4.3 reboot (zwane)
K6-2/500 connectiva 9 3.2.2 trident 4.3 reboot (zwane)
Opteron 240 debian unstable 3.2 S3 4.2.1 reboot
Athlon 2100 debian unstable 3.2 radeon 7500 4.2.1 works
P4M 1800 debian unstable 3.2 radeon m7 4.2.1 reboot

--
Matt Mackall : http://www.selenic.com : Linux development and consulting

2003-11-20 07:15:44

by Zwane Mwaikambo

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Wed, 19 Nov 2003, Matt Mackall wrote:

> On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
> >
> > Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> > 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> > doesn't help. So my suspicion is that the printk is changing the
> > timing just enough on Zwane's box that he's getting a timer interrupt
> > knocking him out of vm86 mode before he hits a fatal bit in the fault
> > handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> > do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> > there's probably something amiss in the trampoline code.
>
> Some more datapoints:

Thanks for trying those out, i got another one to add.

> CPU distro compiler video X result
> K6-2/500 connectiva 9 2.96 trident 4.3 reboot (zwane)
> K6-2/500 connectiva 9 3.2.2 trident 4.3 reboot (zwane)
> Opteron 240 debian unstable 3.2 S3 4.2.1 reboot
> Athlon 2100 debian unstable 3.2 radeon 7500 4.2.1 works
> P4M 1800 debian unstable 3.2 radeon m7 4.2.1 reboot

P4/Xeon 2000 Fedora Core 1 3.3.2 ATI Rage XL 4.3.0 reboot

2003-11-20 07:44:29

by Matt Mackall

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Wed, Nov 19, 2003 at 05:09:28PM -0600, Matt Mackall wrote:
> On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
> >
> > Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> > 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> > doesn't help. So my suspicion is that the printk is changing the
> > timing just enough on Zwane's box that he's getting a timer interrupt
> > knocking him out of vm86 mode before he hits a fatal bit in the fault
> > handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> > do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> > there's probably something amiss in the trampoline code.
>
> Some more datapoints:
>
> CPU distro compiler video X result
> K6-2/500 connectiva 9 2.96 trident 4.3 reboot (zwane)
> K6-2/500 connectiva 9 3.2.2 trident 4.3 reboot (zwane)
> Opteron 240 debian unstable 3.2 S3 4.2.1 reboot
> Athlon 2100 debian unstable 3.2 radeon 7500 4.2.1 works
> P4M 1800 debian unstable 3.2 radeon m7 4.2.1 reboot

And indeed it does turn out to be a problem with the trampoline
mechanics. The fix for -mm4:


Fix triple faulting on some boxes with 4G/4G


mm-mpm/arch/i386/kernel/vm86.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)

diff -puN arch/i386/kernel/vm86.c~virtual-esp arch/i386/kernel/vm86.c
--- mm/arch/i386/kernel/vm86.c~virtual-esp 2003-11-20 01:36:32.000000000 -0600
+++ mm-mpm/arch/i386/kernel/vm86.c 2003-11-20 01:36:32.000000000 -0600
@@ -306,7 +306,7 @@ static void do_sys_vm86(struct kernel_vm
tss->esp0 = virtual_esp0(tsk);
if (cpu_has_sep)
tsk->thread.sysenter_cs = 0;
- load_esp0(tss, &tsk->thread);
+ load_virtual_esp0(tss, tsk);
put_cpu();

tsk->thread.screen_bitmap = info->screen_bitmap;

_


--
Matt Mackall : http://www.selenic.com : Linux development and consulting

2003-11-20 07:48:33

by Andrew Morton

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

Matt Mackall <[email protected]> wrote:
>
> - load_esp0(tss, &tsk->thread);
> + load_virtual_esp0(tss, tsk);

Thanks guys.

Now I'll have to put something else in there to keep you amused ;)


2003-11-20 08:13:40

by Matt Mackall

Subject: Re: [PATCH][2.6-mm] Fix 4G/4G X11/vm86 oops

On Thu, Nov 20, 2003 at 01:44:05AM -0600, Matt Mackall wrote:
> On Wed, Nov 19, 2003 at 05:09:28PM -0600, Matt Mackall wrote:
> > On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
> > >
> > > Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> > > 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> > > doesn't help. So my suspicion is that the printk is changing the
> > > timing just enough on Zwane's box that he's getting a timer interrupt
> > > knocking him out of vm86 mode before he hits a fatal bit in the fault
> > > handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> > > do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> > > there's probably something amiss in the trampoline code.
> >
> > Some more datapoints:
> >
> > CPU distro compiler video X result
> > K6-2/500 connectiva 9 2.96 trident 4.3 reboot (zwane)
> > K6-2/500 connectiva 9 3.2.2 trident 4.3 reboot (zwane)
> > Opteron 240 debian unstable 3.2 S3 4.2.1 reboot
> > Athlon 2100 debian unstable 3.2 radeon 7500 4.2.1 works
> > P4M 1800 debian unstable 3.2 radeon m7 4.2.1 reboot
>
> And indeed it does turn out to be a problem with the trampoline
> mechanics. The fix for -mm4:

Cleanup, as pointed out by Zwane:

Fix triple faulting on some boxes with 4G/4G


mm-mpm/arch/i386/kernel/vm86.c | 3 +--
1 files changed, 1 insertion(+), 2 deletions(-)

diff -puN arch/i386/kernel/vm86.c~virtual-esp arch/i386/kernel/vm86.c
--- mm/arch/i386/kernel/vm86.c~virtual-esp 2003-11-20 01:36:32.000000000 -0600
+++ mm-mpm/arch/i386/kernel/vm86.c 2003-11-20 02:08:38.000000000 -0600
@@ -303,10 +303,9 @@ static void do_sys_vm86(struct kernel_vm

tss = init_tss + get_cpu();
tsk->thread.esp0 = (unsigned long) &info->VM86_TSS_ESP0;
- tss->esp0 = virtual_esp0(tsk);
if (cpu_has_sep)
tsk->thread.sysenter_cs = 0;
- load_esp0(tss, &tsk->thread);
+ load_virtual_esp0(tss, tsk);
put_cpu();

tsk->thread.screen_bitmap = info->screen_bitmap;

_



--
Matt Mackall : http://www.selenic.com : Linux development and consulting

2003-11-24 09:37:21

by Suparna Bhattacharya

Subject: Re: 2.6.0-test9-mm3 - AIO test results

On Tue, Nov 18, 2003 at 03:47:53PM -0800, Daniel McNeil wrote:
> Suparna,
>
> I was unable to reproduce the hang in io_submit() without your patch.
> I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.
>
> I re-ran with your patch with both as-iosched and deadline and both
> hung in io_submit(). aiocp would run a few times, but I put the
> aiocp in a while loop and it hung on the 1st or 2nd time. It
> did get most of the way through copying the file before hanging.
> This is on a 2-proc to ide disks running ext3.
>

Found one race ... not sure if it's the one causing the hangs
you see. The attached patch is not a complete fix (there is one
other race to close), but it would be interesting to see if
this makes any difference for you.

Regards
Suparna

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India

------------------------------------------------------
Don't access dio fields if it's possible that the dio could
already have been freed asynchronously during i/o completion.
Fixme: This still leaves a window between decrement of
bio_count and accessing dio->waiter during i/o completion
wherein the dio could get freed by the submission path.


--- pure-mm3/fs/direct-io.c 2003-11-24 13:00:33.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-24 14:15:30.000000000 +0530
@@ -994,14 +995,17 @@
* reflect the number of to-be-processed BIOs.
*/
if (dio->is_async) {
- if (ret == 0)
- ret = dio->result;
- if (ret > 0 && dio->result < dio->size && rw == WRITE) {
+ int should_wait = 0;
+
+ if (dio->result < dio->size && rw == WRITE) {
dio->waiter = current;
+ should_wait = 1;
}
+ if (ret == 0)
+ ret = dio->result;
finished_one_bio(dio); /* This can free the dio */
blk_run_queues();
- if (dio->waiter) {
+ if (should_wait) {
/*
* Wait for already issued I/O to drain out and
* release its references to user-space pages
@@ -1013,7 +1017,7 @@
set_current_state(TASK_UNINTERRUPTIBLE);
}
set_current_state(TASK_RUNNING);
- dio->waiter = NULL;
+ kfree(dio);
}
} else {
finished_one_bio(dio);

2003-11-25 23:49:56

by Daniel McNeil

Subject: [PATCH 2.6.0-test9-mm5] aio-dio-fallback-bio_count-race.patch

Suparna,

Yes, your patch did help. I originally had CONFIG_DEBUG_SLAB=y, which
was helping me see problems because the freed dio was getting
poisoned. I also tested with CONFIG_DEBUG_PAGEALLOC=y, which is
very good at catching these.

I combined your AIO fallback patch and your AIO race fix, and added
my fix for the bio_count decrement race. This patch has all three
fixes and it is working for me.

I fixed the bio_count race by changing bio_list_lock into bio_lock
and using that for all the bio fields. I changed bio_count and
bios_in_flight from atomics into ints; they are now protected by
the bio_lock. In finished_one_bio(), I fixed the race by
leaving the bio_count at 1 until after the dio_complete(),
and then doing the bio_count decrement and wakeup while holding the bio_lock.
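
For illustration only, the ordering described above can be sketched in
userspace C. This is not the attached patch: struct drain_dio, the
drain_* helpers and the "completed" flag are hypothetical names, a
pthread mutex stands in for the bio_lock spinlock, and a condition
variable stands in for the wakeup. The point it tries to show is that
the last reference runs the completion step while bio_count is still 1,
and only then drops the count to 0 and wakes the waiter under the lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct dio. */
struct drain_dio {
    pthread_mutex_t bio_lock;  /* stands in for the bio_lock spinlock */
    pthread_cond_t drained;    /* stands in for the wakeup of the waiter */
    int bio_count;             /* plain int, only touched under bio_lock */
    bool completed;            /* set by the stand-in for dio_complete() */
};

static void drain_dio_init(struct drain_dio *dio)
{
    pthread_mutex_init(&dio->bio_lock, NULL);
    pthread_cond_init(&dio->drained, NULL);
    dio->bio_count = 1;        /* the submitter's own reference */
    dio->completed = false;
}

/* Account for one more BIO in flight. */
static void drain_submit_bio(struct drain_dio *dio)
{
    pthread_mutex_lock(&dio->bio_lock);
    dio->bio_count++;
    pthread_mutex_unlock(&dio->bio_lock);
}

/* Analogue of finished_one_bio(): drop one reference. */
static void drain_finished_one_bio(struct drain_dio *dio)
{
    bool last;

    pthread_mutex_lock(&dio->bio_lock);
    assert(dio->bio_count > 0);
    last = (dio->bio_count == 1);
    if (!last)
        dio->bio_count--;      /* not the last one: just decrement */
    pthread_mutex_unlock(&dio->bio_lock);
    if (!last)
        return;

    /* Completion work runs while bio_count is still 1, so a waiter
     * polling bio_count cannot conclude that everything is done yet. */
    dio->completed = true;     /* stand-in for dio_complete() */

    /* Final decrement and wakeup happen together under the lock. */
    pthread_mutex_lock(&dio->bio_lock);
    dio->bio_count--;
    pthread_cond_broadcast(&dio->drained);
    pthread_mutex_unlock(&dio->bio_lock);
}

/* Waiter: sleep until bio_count has dropped to zero. */
static void drain_wait(struct drain_dio *dio)
{
    pthread_mutex_lock(&dio->bio_lock);
    while (dio->bio_count != 0)
        pthread_cond_wait(&dio->drained, &dio->bio_lock);
    pthread_mutex_unlock(&dio->bio_lock);
}
```

In this sketch a waiter that sees bio_count == 0 under the lock also
sees the completion step as finished, which is the property the
description above is after.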

Take a look, give it a try, and let me know what you think.

I've tested this on my 2-way and so far all my tests have passed.
I have more testing to do, but this is working better.

Thanks,

Daniel



On Mon, 2003-11-24 at 01:42, Suparna Bhattacharya wrote:
> On Tue, Nov 18, 2003 at 03:47:53PM -0800, Daniel McNeil wrote:
> > Suparna,
> >
> > I was unable to reproduce the hang in io_submit() without your patch.
> > I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.
> >
> > I re-ran with your patch with both as-iosched and deadline and both
> > hung in io_submit(). aiocp would run a few times, but I put the
> > aiocp in a while loop and it hung on the 1st or 2nd time. It
> > did get most of the way through copying the file before hanging.
> > This is on a 2-proc to ide disks running ext3.
> >
>
> Found one race ... not sure if it's the one causing the hangs
> you see. The attached patch is not a complete fix (there is one
> other race to close), but it would be interesting to see if
> this makes any difference for you.
>
> Regards
> Suparna


Attachments:
2.6.0-test9-mm5.aio-dio-fallback-bio_count-race.patch (8.88 kB)

2003-11-26 07:49:49

by Suparna Bhattacharya

Subject: Re: [PATCH 2.6.0-test9-mm5] aio-dio-fallback-bio_count-race.patch

On Tue, Nov 25, 2003 at 03:49:31PM -0800, Daniel McNeil wrote:
> Suparna,
>
> Yes, your patch did help. I originally had CONFIG_DEBUG_SLAB=y, which
> was helping me see problems because the freed dio was getting
> poisoned. I also tested with CONFIG_DEBUG_PAGEALLOC=y, which is
> very good at catching these.

Ah I see - perhaps that explains why neither Janet nor I could
recreate the problem that you were hitting so easily. So we
should probably try running with CONFIG_DEBUG_SLAB and
CONFIG_DEBUG_PAGEALLOC as well.

>
> I combined your AIO fallback patch and your AIO race fix, and added
> my fix for the bio_count decrement race. This patch has all three
> fixes and it is working for me.
>
> I fixed the bio_count race by changing bio_list_lock into bio_lock
> and using that for all the bio fields. I changed bio_count and
> bios_in_flight from atomics into ints; they are now protected by
> the bio_lock. In finished_one_bio(), I fixed the race by
> leaving the bio_count at 1 until after the dio_complete(),
> and then doing the bio_count decrement and wakeup while holding the bio_lock.
>
> Take a look, give it a try, and let me know what you think.

I had been trying a slightly different kind of fix -- appended is
the updated version of the patch I last posted. It uses the bio_list_lock
to protect the dio->waiter field, which finished_one_bio sets back
to NULL after it has issued the wakeup; and the code that waits for
i/o to drain out checks the dio->waiter field instead of bio_count.
This might not seem very obvious given the nomenclature of the
bio_list_lock, so I was holding back wondering if it could be
improved.
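
As a rough userspace illustration of that handshake (a sketch only; the
real change is in the patch appended at the end of this mail), the idea
can be written with a pthread mutex standing in for bio_list_lock and a
condition variable standing in for wake_up_process(). struct waiter_dio
and the waiter_* helpers are hypothetical names used only here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct dio; not the kernel type. */
struct waiter_dio {
    pthread_mutex_t bio_list_lock; /* stands in for the spinlock */
    pthread_cond_t wake;           /* stands in for wake_up_process() */
    const void *waiter;            /* non-NULL while someone is waiting */
};

static void waiter_dio_init(struct waiter_dio *dio)
{
    pthread_mutex_init(&dio->bio_list_lock, NULL);
    pthread_cond_init(&dio->wake, NULL);
    dio->waiter = NULL;
}

/* Submission side: register as the waiter before dropping our reference. */
static void waiter_set(struct waiter_dio *dio, const void *self)
{
    assert(self != NULL);
    pthread_mutex_lock(&dio->bio_list_lock);
    dio->waiter = self;
    pthread_mutex_unlock(&dio->bio_list_lock);
}

/* Completion side: clear the waiter field and issue the wakeup, both
 * under the lock. Returns 1 if a waiter was woken, 0 otherwise. */
static int waiter_wake(struct waiter_dio *dio)
{
    int woke = 0;

    pthread_mutex_lock(&dio->bio_list_lock);
    if (dio->waiter) {
        dio->waiter = NULL;
        pthread_cond_signal(&dio->wake);
        woke = 1;
    }
    pthread_mutex_unlock(&dio->bio_list_lock);
    return woke;
}

/* Waiting side: the loop condition is the waiter field, not bio_count. */
static void waiter_wait_for_drain(struct waiter_dio *dio)
{
    pthread_mutex_lock(&dio->bio_list_lock);
    while (dio->waiter)
        pthread_cond_wait(&dio->wake, &dio->bio_list_lock);
    pthread_mutex_unlock(&dio->bio_list_lock);
}
```

Because the field is only cleared by the completion side after it has
finished touching the structure, the waiting side in this sketch can
treat waiter == NULL as the signal that it is safe to free it.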

Your approach looks clearer in that sense -- it's pretty unambiguous
about which lock protects which fields. The only thing that bothers me (and
this is what I was trying to avoid in my patch) is the increased
use of spin_lock_irq calls (the overhead of turning interrupts off and on)
instead of simple atomic inc/dec in most places.

Thoughts?

Regards
Suparna

--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India

------------------------------------

Don't access dio fields if it's possible that the dio could
already have been freed asynchronously during i/o completion.
The dio->bio_list_lock protects the dio->waiter field as in
the case of synchronous i/o.

--- pure-mm3/fs/direct-io.c 2003-11-24 13:00:33.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-25 14:08:26.000000000 +0530
@@ -231,8 +231,17 @@
aio_complete(dio->iocb, dio->result, 0);
kfree(dio);
} else {
- if (dio->waiter)
- wake_up_process(dio->waiter);
+ struct task_struct *waiter;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dio->bio_list_lock, flags);
+ waiter = dio->waiter;
+ if (waiter) {
+ dio->waiter = NULL;
+ wake_up_process(waiter);
+ }
+ spin_unlock_irqrestore(&dio->bio_list_lock,
+ flags);
}
}
}
@@ -994,26 +1004,35 @@
* reflect the number of to-be-processed BIOs.
*/
if (dio->is_async) {
- if (ret == 0)
- ret = dio->result;
- if (ret > 0 && dio->result < dio->size && rw == WRITE) {
+ int should_wait = 0;
+
+ if (dio->result < dio->size && rw == WRITE) {
dio->waiter = current;
+ should_wait = 1;
}
+ if (ret == 0)
+ ret = dio->result;
finished_one_bio(dio); /* This can free the dio */
blk_run_queues();
- if (dio->waiter) {
+ if (should_wait) {
+ unsigned long flags;
/*
* Wait for already issued I/O to drain out and
* release its references to user-space pages
* before returning to fallback on buffered I/O
*/
+ spin_lock_irqsave(&dio->bio_list_lock, flags);
set_current_state(TASK_UNINTERRUPTIBLE);
- while (atomic_read(&dio->bio_count)) {
+ while (dio->waiter) {
+ spin_unlock_irqrestore(&dio->bio_list_lock,
+ flags);
io_schedule();
set_current_state(TASK_UNINTERRUPTIBLE);
+ spin_lock_irqsave(&dio->bio_list_lock, flags);
}
set_current_state(TASK_RUNNING);
- dio->waiter = NULL;
+ spin_unlock_irqrestore(&dio->bio_list_lock, flags);
+ kfree(dio);
}
} else {
finished_one_bio(dio);