http://www.zip.com.au/~akpm/linux/patches/2.6.0-test9-mm3.gz
kernel.org is being slow. Will appear at:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.0-test9/2.6.0-test9-mm3/
- Various new fixes; generally uncritical ones.
- Significant changes to the AIO and direct-io code. This needs beating
on; hopefully we're now close to a solution to the fairly complex problems
in there.
- Several ext2 and ext3 allocator fixes. These need serious testing on big
SMP.
- Anyone who has patches in here which they think should go into 2.6.0,
please retest them in -mm3 and let me know, thanks.
linus.patch
Latest Linus tree
-as-badness-warning-fix.patch
-3c509-mca-fix.patch
-ext2-allocation-fix.patch
-ohci-locking-fix.patch
-disable-ide-tcq.patch
-via-quirk-fix.patch
-raid1-recovery-fix.patch
-journal_remove_journal_head-assertion-fix.patch
-x86_64-tss-limit-fix.patch
-keyboard-repeat-rate-setting-fix.patch
-aio-refcounting-fix.patch
Merged
-RD16-rest-B6.patch
Al said to drop this.
+cramfs-use-pagecache.patch
cramfs fixes
-ia32-MSI-support-tweaks.patch
Folded into ia32-MSI-support.patch
+ia32-MSI-support-x86_64-fixes.patch
x86_64 build fix
-ia32-efi-asm-warning-fix.patch
-ia32-efi-support-mem-equals-fix.patch
-CONFIG_ACPI_EFI-defaults-off.patch
-ia32-efi-support-warning-fixes.patch
-ia32-efi-support-tidy.patch
-ia32-efi-other-arch-fix.patch
-efi-constant-sizing-fix.patch
-ia32-efi-config-option.patch
-ia32-efi-config-option-tweaks.patch
-ia32-efi-config-help-update.patch
-ia64-CONFIG_EFI-update.patch
Folded into ia32-efi-support.patch
+ia64-ia32-missing-compat-syscalls.patch
+compat-layer-fixes.patch
32-bit compat layer fixes
+compat-ioctl-for-i2c.patch
compat layer for i2c (old version)
+loop-bio-handling-fix.patch
Loop driver fixlet
-gcc-Os-if-embedded-better-help.patch
Folded into gcc-Os-if-embedded.patch
+as-request-poisoning-fix.patch
+as-fix-all-known-bugs.patch
Anticipatory scheduler fixes.
+more-than-256-cpus.patch
cpumask fixes for huge SMP
+acpi-pm-timer.patch
+acpi-pm-timer-fixes.patch
Yet another timer source for ia32
+ZONE_SHIFT-from-NODES_SHIFT.patch
Memory zone arith fixup
+ext2_new_inode-fixes.patch
+ext2_new_inode-fixes-tweaks.patch
+remove-ext2_reverve_inode.patch
ext2 fixes
+memmove-speedup.patch
Make memmove() faster.
+percpu-counter-linkage-fix.patch
Fix the build for when ext2 and ext3 are modular
+ide-scsi-warnings.patch
Print warnings when someone tries to use ide-scsi for a cdrom
+pipe-readv-writev.patch
pipe readv() and writev() correctness fix and speedup
+ext3_new_inode-scan-fix.patch
ext3 inode allocator fix
+lockless-semop.patch
sysv semaphore SMP speedup
+percpu_counter-use-alloc_percpu.patch
Fix the percpu counters for huge SMP.
+i450nx-scanning-fix.patch
PCI bridge fix for i450nx chipset machines
+serio-pm-fix.patch
Fix psmouse PM resume
+find_busiest_queue-commentary.patch
CPU scheduler comments
+ext2-block-allocator-fixes.patch
More ext2 allocator fixes.
+SOUND_CMPCI-config-typo-fix.patch
Sound driver config fix
+atkbd-24-compatibility.patch
Make AT keyboard userspace interface compatible with 2.4's.
+init_h-needs-compiler_h.patch
+init_h-needs-compiler_h-fix.patch
Compile fix
+cpu_sibling_map-fix.patch
cpu_sibling_map is broken on summit.
+tulip-hash-fix.patch
Fix multicast hash generation for some tulips
+context-switch-accounting-fix.patch
Fix CPU scheduler beancounting with CONFIG_PREEMPT.
+access-vfs_permission-fix.patch
Fix access()
+eicon-linkage-fix.patch
ISDN build fix
+kobject-docco-additions.patch
Documentation additions.
-O_DIRECT-race-fixes-rework-XFS-fix.patch
-O_DIRECT-race-fixes-rework-XFS-fix-fix.patch
Folded into O_DIRECT-race-fixes-rollup.patch
+dio-aio-fixes.patch
+dio-aio-fixes-fixes.patch
AIO/direct-io fixes
+promise-sata-id.patch
Additional SATA PCI ID.
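For anyone wanting to poke at the pipe-readv-writev.patch entry above from
userspace, here is a minimal sketch of a gathered write to a pipe. It does
not reproduce the bug the patch fixes (the changelog gives no details); it
only shows the behaviour a correct kernel should give for a small writev():
the segments arrive back to back. The helper name pipe_writev_roundtrip is
made up for illustration, and the inputs are assumed to be small enough to
fit in the pipe buffer, since nothing reads concurrently.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write two strings to a pipe with a single writev() call, then read
 * them back. Returns the number of bytes read into out, or -1 on error.
 * Illustrative helper only; it is not part of any patch listed here.
 */
static ssize_t pipe_writev_roundtrip(const char *a, const char *b,
                                     char *out, size_t outlen)
{
    int fds[2];
    struct iovec iov[2];
    ssize_t written, got;

    assert(a != NULL && b != NULL && out != NULL && outlen > 0);

    if (pipe(fds) != 0)
        return -1;

    iov[0].iov_base = (void *)a;
    iov[0].iov_len = strlen(a);
    iov[1].iov_base = (void *)b;
    iov[1].iov_len = strlen(b);

    /* One call, two segments: the reader should see them contiguously. */
    written = writev(fds[1], iov, 2);
    close(fds[1]);
    if (written < 0) {
        close(fds[0]);
        return -1;
    }

    got = read(fds[0], out, outlen);
    close(fds[0]);
    return got;
}
```

Calling it with "hello, " and "pipe" should hand back the eleven bytes
"hello, pipe". A real stress test would run several writers against one
pipe and check that their records never interleave.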
All 201 patches
linus.patch
mm.patch
add -mmN to EXTRAVERSION
kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
kgdb: warning fix
kgdb-buff-too-big.patch
kgdb buffer overflow fix
kgdb-warning-fix.patch
kgdb: warning fix
kgdb-build-fix.patch
kgdb-spinlock-fix.patch
kgdb-fix-debug-info.patch
kgdb: CONFIG_DEBUG_INFO fix
kgdb-cpumask_t.patch
kgdb-x86_64-fixes.patch
x86_64 fixes
kgdb-over-ethernet.patch
kgdb-over-ethernet patch
kgdb-over-ethernet-fixes.patch
kgdb-over-ethernet fixlets
kgdb-CONFIG_NET_POLL_CONTROLLER.patch
kgdb: replace CONFIG_KGDB with CONFIG_NET_RX_POLL in net drivers
kgdb-handle-stopped-NICs.patch
kgdb: handle netif_stopped NICs
eepro100-poll-controller.patch
tlan-poll_controller.patch
tulip-poll_controller.patch
tg3-poll_controller.patch
kgdb: tg3 poll_controller
8139too-poll_controller.patch
8139too poll controller
kgdb-eth-smp-fix.patch
kgdb-over-ethernet: fix SMP
kgdb-eth-reattach.patch
kgdb-skb_reserve-fix.patch
kgdb-over-ethernet: skb_reserve() fix
must-fix.patch
should-fix.patch
must-fix-update-01.patch
must fix lists update
RD1-cdrom_ioctl-B6.patch
RD2-ioctl-B6.patch
RD2-ioctl-B6-fix.patch
RD2-ioctl-B6 fixes
RD3-cdrom_open-B6.patch
RD4-open-B6.patch
RD5-cdrom_release-B6.patch
RD6-release-B6.patch
RD7-presto_journal_close-B6.patch
RD8-f_mapping-B6.patch
RD9-f_mapping2-B6.patch
RD10-i_sem-B6.patch
RD11-f_mapping3-B6.patch
RD12-generic_osync_inode-B6.patch
RD13-bd_acquire-B6.patch
RD14-generic_write_checks-B6.patch
RD15-I_BDEV-B6.patch
cramfs-use-pagecache.patch
cramfs: use pagecache better
invalidate_inodes-speedup.patch
invalidate_inodes speedup
invalidate_inodes-speedup-fixes-2.patch
more invalidate_inodes speedup fixes
serio-01-renaming.patch
serio: rename serio_[un]register_slave_port to __serio_[un]register_port
serio-02-race-fix.patch
serio: possible race between port removal and kseriod
serio-03-blacklist.patch
Add black list to handler<->device matching
serio-04-synaptics-cleanup.patch
Synaptics: code cleanup
serio-05-reconnect-facility.patch
serio: reconnect facility
serio-06-synaptics-use-reconnect.patch
Synaptics: use serio_reconnect
acpi_off-fix.patch
fix acpi=off
cfq-4.patch
CFQ io scheduler
CFQ fixes
config_spinline.patch
uninline spinlocks for profiling accuracy.
ppc64-bar-0-fix.patch
Allow PCI BARs that start at 0
ppc64-reloc_hide.patch
sym-do-160.patch
make the SYM driver do 160 MB/sec
input-use-after-free-checks.patch
input layer debug checks
aic7xxx-parallel-build-fix.patch
fix parallel builds for aic7xxx
ramdisk-cleanup.patch
intel8x0-cleanup.patch
intel8x0 cleanups
pdflush-diag.patch
kobject-oops-fixes.patch
fix oopses if kobject parent is removed before child
futex-uninlinings.patch
futex uninlining
zap_page_range-debug.patch
zap_page_range() debug
call_usermodehelper-retval-fix-3.patch
Make call_usermodehelper report exit status
asus-L5-fix.patch
Asus L5 framebuffer fix
jffs-use-daemonize.patch
tulip-NAPI-support.patch
tulip NAPI support
tulip-napi-disable.patch
tulip NAPI: disable poll in close
get_user_pages-handle-VM_IO.patch
ia32-MSI-support.patch
Updated ia32 MSI Patches
ia32-MSI-support-x86_64-fixes.patch
ia32-efi-support.patch
EFI support for ia32
efi warning fix
fix EFI for ppc64, ia64
efi: warning fixes
ia32 EFI: Add CONFIG_EFI
efi: Update Kconfig help
efi update patch (ia64)
support-zillions-of-scsi-disks.patch
support many SCSI disks
SGI-IOC4-IDE-chipset-support.patch
Add support for SGI's IOC4 chipset
sparc32-sched_clock.patch
pcibios_test_irq-fix.patch
Fix pcibios test IRQ handler return
fixmap-in-proc-pid-maps.patch
report user-readable fixmap area in /proc/PID/maps
i82365-sysfs-ordering-fix.patch
Fix init_i82365 sysfs ordering oops
pci_set_power_state-might-sleep.patch
ia64-ia32-missing-compat-syscalls.patch
Missing compat syscalls in ia64 (from Arun Sharma)
compat-layer-fixes.patch
Minor bug fixes to the compat layer
compat-ioctl-for-i2c.patch
compat_ioctl for i2c
compat_ioctl-cleanup.patch
cleanup of compat_ioctl functions
fix-sqrt.patch
sqrt() fixes
scale-min_free_kbytes.patch
scale the initial value of min_free_kbytes
cdrom-allocation-try-harder.patch
Use __GFP_REPEAT for cdrom buffer
sym-2.1.18f.patch
CONFIG_STANDALONE-default-to-n.patch
Make CONFIG_STANDALONE default to N
extra-buffer-diags.patch
nosysfs.patch
constant_test_bit-doesnt-like-zwanes-gcc.patch
gcc bug workaround for constant_test_bit()
slab-leak-detector.patch
slab leak detector
early-serial-registration-fix.patch
serial console registration bugfix
3c527-smp-update.patch
SMP support on 3c527 net driver
3c527-race-fix.patch
ext3-latency-fix.patch
ext3 scheduling latency fix
videobuf_waiton-race-fix.patch
firmware-kernel_thread-on-demand.patch
Remove workqueue usage from request_firmware_async()
loop-autoloading-fix.patch
Fix loop module auto loading
loop-module-alias.patch
loop needs MODULE_ALIAS_BLOCK
loop-remove-blkdev-special-case.patch
loop-highmem.patch
remove useless highmem bounce from loop/cryptoloop
loop-highmem-fixes.patch
loop-bio-handling-fix.patch
loop: BIO handling fix
cmpci-set_fs-fix.patch
cmpci.c: remove pointless set_fs()
dentry-bloat-fix-2.patch
Fix dcache and icache bloat with deep directories
nls-config-fixes.patch
NLS config fixes
proc_pid_lookup-vs-exit-race-fix.patch
Fix proc_pid_lookup vs exit race
gcc-Os-if-embedded.patch
Add `gcc -Os' config option
aic7xxx-sleep-in-spinlock-fix.patch
vm86-sysenter-fix.patch
Fix sysenter disabling in vm86 mode
gettimeofday-resolution-fix.patch
gettimeofday resolution fix
refill_counter-overflow-fix.patch
vmscan: reset refill_counter after refilling the inactive list
verbose-timesource.patch
be verbose about the time source
as-regression-fix.patch
Fix IO scheduler regression
as-request-poisoning.patch
AS: request poisoning
as-request-poisoning-fix.patch
AS: request poisoning fix
as-fix-all-known-bugs.patch
AS fixes
as-new-process-estimation.patch
AS: new process estimation
as-cooperative-thinktime.patch
AS: thinktime improvement
scale-nr_requests.patch
scale nr_requests with TCQ depth
truncate_inode_pages-check.patch
local_bh_enable-warning-fix.patch
cdc-acm-softirq-rx.patch
cdc-acm: move rx processing to softirq
forcedeth.patch
forcedeth: nForce ethernet driver
reiserfs-pinned-buffer-fix.patch
reiserfs pinned buffer fix
proc-pid-maps-output-fix.patch
Restore /proc/pid/maps formatting
atomic_dec-debug.patch
atomic_dec debug
sis900-pm-support.patch
Add PM support to sis900 network driver
8139too-locking-fix.patch
8139too locking fix
ia32-wp-test-cleanup.patch
ia32 WP test cleanup
hugetlb-needs-pse.patch
ia32: hugetlb needs pse
powermate-payload-size-fix.patch
Griffin Powermate fix
more-than-256-cpus.patch
Fix for more than 256 CPUs
acpi-pm-timer.patch
ACPI PM Timer
acpi-pm-timer-fixes.patch
ACPI PM-Timer fixes
ZONE_SHIFT-from-NODES_SHIFT.patch
Use NODES_SHIFT to calculate ZONE_SHIFT
ext2_new_inode-fixes.patch
Fix bugs in ext2_new_inode()
ext2_new_inode-fixes-tweaks.patch
ext2_new_inode: more tweaking
remove-ext2_reverve_inode.patch
memmove-speedup.patch
optimize ia32 memmove
percpu-counter-linkage-fix.patch
fix percpu_counter_mod linkage problem
ide-scsi-warnings.patch
ide-scsi: warn when used for cdroms
pipe-readv-writev.patch
Fix writev atomicity on pipe/fifo
ext3_new_inode-scan-fix.patch
ext3_new_inode fixlet
lockless-semop.patch
lockless semop
percpu_counter-use-alloc_percpu.patch
use alloc_percpu in percpu_counters
i450nx-scanning-fix.patch
i450nx PCI scanning fix
serio-pm-fix.patch
psmouse pm resume fix
find_busiest_queue-commentary.patch
find_busiest_queue() commentary fix
ext2-block-allocator-fixes.patch
ext2 block allocator fixes
SOUND_CMPCI-config-typo-fix.patch
fix SOUND_CMPCI Configure help entry
atkbd-24-compatibility.patch
Fixes for keyboard 2.4 compatibility
init_h-needs-compiler_h.patch
init.h needs to include compiler.h
init_h-needs-compiler_h-fix.patch
compile fix for older gcc's
cpu_sibling_map-fix.patch
cpu_sibling_map fix
tulip-hash-fix.patch
tulip filter hash fix
context-switch-accounting-fix.patch
Fix context switch accounting
access-vfs_permission-fix.patch
fix access() / vfs_permission() bug
eicon-linkage-fix.patch
eicon/ and hardware/eicon/ drivers using the same symbols
kobject-docco-additions.patch
Improve documentation for kobjects
list_del-debug.patch
list_del debug check
print-build-options-on-oops.patch
show_task-free-stack-fix.patch
show_task() fix and cleanup
oops-dump-preceding-code.patch
i386 oops output: dump preceding code
lockmeter.patch
printk-oops-mangle-fix.patch
disentangle printk's whilst oopsing on SMP
4g-2.6.0-test2-mm2-A5.patch
4G/4G split patch
4G/4G: remove debug code
4g4g: pmd fix
4g/4g: fixes from Bill
4g4g: fpu emulation fix
4g/4g usercopy atomicity fix
4G/4G preempt on vstack
4G/4G: even number of kmap types
4g4g: fix __get_user in slab
4g4g: Remove extra .data.idt section definition
4g/4g linker error (overlapping sections)
4g4g: show_registers() fix
4g4g: debug flags fix
4g4g: Fix wrong asm-offsets entry
cyclone time fixmap fix
use direct_copy_{to,from}_user for kernel access in mm/usercopy.c
4G/4G might_sleep warning fix
4g/4g pagetable accounting fix
4g4g-athlon-prefetch-handling-fix.patch
4g4g-wp-test-fix.patch
Fix 4G/4G and WP test lockup
4g4g-KERNEL_DS-usercopy-fix.patch
4G/4G KERNEL_DS usercopy again
ppc-fixes.patch
make mm4 compile on ppc
aic7xxx_old-oops-fix.patch
O_DIRECT-race-fixes-rollup.patch
DIO fixes forward port and AIO-DIO fix
O_DIRECT race fixes comments
O_DIRECT race fixes fix fix fix
DIO locking rework
O_DIRECT XFS fix
dio-aio-fixes.patch
direct-io AIO fixes
dio-aio-fixes-fixes.patch
dio-aio fix fix
readahead-multiple-fixes.patch
readahead: multiple performance fixes
readahead-simplification.patch
readahead simplification
aio-sysctl-parms.patch
aio sysctl parms
aio-01-retry.patch
AIO: Core retry infrastructure
Fix aio process hang on EINVAL
AIO: flush workqueues before destroying ioctx'es
AIO: hold the context lock across unuse_mm
take task_lock in use_mm()
4g4g-aio-hang-fix.patch
Fix AIO and 4G-4G hang
aio-retry-elevated-refcount.patch
aio: extra ref count during retry
aio-splice-runlist.patch
Splice AIO runlist for fairer handling of multiple io contexts
aio-02-lockpage_wq.patch
AIO: Async page wait
aio-03-fs_read.patch
AIO: Filesystem aio read
aio-04-buffer_wq.patch
AIO: Async buffer wait
lock_buffer_wq fix
aio-05-fs_write.patch
AIO: Filesystem aio write
aio-06-bread_wq.patch
AIO: Async block read
aio-07-ext2getblk_wq.patch
AIO: Async get block for ext2
O_SYNC-speedup-2.patch
speed up O_SYNC writes
O_SYNC-speedup-2-f_mapping-fixes.patch
aio-09-o_sync.patch
aio O_SYNC
AIO: fix a BUG
Unify o_sync changes for aio and regular writes
aio-O_SYNC-fix bits got lost
aio: writev nr_segs fix
More AIO O_SYNC related fixes
aio-09-o_sync-f_mapping-fixes.patch
gang_lookup_next.patch
Change the page gang lookup API
aio-gang_lookup-fix.patch
AIO gang lookup fixes
aio-O_SYNC-short-write-fix.patch
Fix for O_SYNC short writes
aio-12-readahead.patch
AIO: readahead fixes
aio O_DIRECT no readahead
Unified page range readahead for aio and regular reads
aio-12-readahead-f_mapping-fix.patch
aio-readahead-speedup.patch
Readahead issues and AIO read speedup
promise-sata-id.patch
add Promise 20376 PCI ID
On Wed, 2003-11-12 at 23:30, Andrew Morton wrote:
> +acpi-pm-timer.patch
> +acpi-pm-timer-fixes.patch
>
> Yet another timer source for ia32
>
[snip]
> verbose-timesource.patch
> be verbose about the time source
Andrew,
I forgot that I sent you the verbose-timesource patch. The ACPI PM time
source will need this simple fix to work alongside that patch.
thanks
-john
===== arch/i386/kernel/timers/timer_pm.c 1.6 vs edited =====
--- 1.6/arch/i386/kernel/timers/timer_pm.c Tue Nov 4 11:39:50 2003
+++ edited/arch/i386/kernel/timers/timer_pm.c Thu Nov 13 11:12:23 2003
@@ -185,6 +185,7 @@
 /* acpi timer_opts struct */
 struct timer_opts timer_pmtmr = {
+	.name = "pmtmr",
 	.init = init_pmtmr,
 	.mark_offset = mark_offset_pmtmr,
 	.get_offset = get_offset_pmtmr,
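To illustrate why the missing .name matters once something prints the time
source, here is a rough userspace sketch. The struct below is a simplified
stand-in containing only the fields visible in the diff, with guessed
function signatures; it is not the kernel's real struct timer_opts. The
helpers timer_label() and timer_probe() are hypothetical as well.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified stand-in for struct timer_opts. Assumption: only the
 * fields shown in the diff, and the signatures are invented here.
 */
struct timer_opts {
    const char *name;
    int (*init)(void);
    void (*mark_offset)(void);
    unsigned long (*get_offset)(void);
};

/* Dummy hooks so the initializer below has something to point at. */
static int init_pmtmr(void) { return 0; }
static void mark_offset_pmtmr(void) { }
static unsigned long get_offset_pmtmr(void) { return 0; }

static struct timer_opts timer_pmtmr = {
    .name = "pmtmr",            /* the field the fix adds */
    .init = init_pmtmr,
    .mark_offset = mark_offset_pmtmr,
    .get_offset = get_offset_pmtmr,
};

/* Hypothetical helper: the label a verbose boot message could print. */
static const char *timer_label(const struct timer_opts *t)
{
    return (t != NULL && t->name != NULL) ? t->name : "(unnamed)";
}

/* Hypothetical helper: run a timer's init hook, which must be set. */
static int timer_probe(const struct timer_opts *t)
{
    assert(t != NULL && t->init != NULL);
    return t->init();
}
```

With designated initializers, any field left out is zero-initialized, so a
timer_opts without .name carries a NULL pointer there. Code that prints the
name has to either guard against that, as timer_label() does, or rely on
every time source setting it, which is what the one-line fix ensures for
pmtmr.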
Linux 2.6 (mm tree) Compile Statistics (gcc 3.2.2)
Warnings/Errors Summary
Kernel            bzImage      bzImage   bzImage   modules   bzImage   modules
                  (defconfig)  (allno)   (allyes)  (allyes)  (allmod)  (allmod)
---------------   -----------  --------  --------  --------  --------  --------
2.6.0-test9-mm3   0w/0e        0w/0e     172w/0e   12w/0e    3w/0e     211w/0e
2.6.0-test9-mm2   0w/0e        0w/0e     172w/0e   12w/0e    3w/0e     211w/1e
2.6.0-test9-mm1   0w/0e        0w/0e     179w/1e   12w/0e    3w/0e     213w/1e
2.6.0-test8-mm1   0w/0e        0w/0e     183w/1e   13w/0e    3w/0e     223w/1e
2.6.0-test7-mm1   0w/0e        1w/0e     176w/1e   9w/0e     3w/0e     231w/1e
2.6.0-test6-mm4   0w/0e        1w/0e     179w/1e   9w/0e     3w/0e     234w/1e
2.6.0-test6-mm3   0w/0e        1w/0e     178w/1e   9w/0e     3w/0e     252w/2e
2.6.0-test6-mm2   0w/0e        1w/0e     179w/1e   9w/0e     3w/0e     252w/2e
2.6.0-test6-mm1   0w/0e        1w/0e     179w/1e   9w/0e     3w/0e     252w/2e
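For comparing releases mechanically, each cell in the summary above can be
split into its warning and error counts. The following is a small sketch
under the assumption that a cell always has the form "<N>w/<M>e", with
optional whitespace after the slash; parse_cell is a made-up name and is
not taken from the scripts that produced this report.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse one summary cell such as "211w/1e" into warning and error
 * counts. Returns 1 on success, 0 if the text does not match. On
 * failure the output values may have been partially written.
 */
static int parse_cell(const char *cell, int *warnings, int *errors)
{
    char suffix;

    assert(cell != NULL && warnings != NULL && errors != NULL);

    /* %d skips leading whitespace, so a space after '/' is accepted. */
    if (sscanf(cell, "%dw/%d%c", warnings, errors, &suffix) != 3)
        return 0;
    return suffix == 'e';
}
```

A caller could feed it the modules (allmod) column of two adjacent rows and
report the difference in warnings between releases.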
Web page with links to complete details:
http://developer.osdl.org/cherry/compile/
Version information for host [ cherrypit.pdx.osdl.net ]
gcc: 3.2.2
patch: 2.5.4
Kernel version: 2.6.0-test9-mm3
Kernel build:
Making bzImage (defconfig): 0 warnings, 0 errors
Making modules (defconfig): 0 warnings, 0 errors
Making bzImage (allnoconfig): 0 warnings, 0 errors
Making bzImage (allyesconfig): 172 warnings, 0 errors
Making modules (allyesconfig): 12 warnings, 0 errors
Making bzImage (allmodconfig): 3 warnings, 0 errors
Making modules (allmodconfig): 211 warnings, 0 errors
Building directories:
Building fs/adfs: clean
Building fs/affs: clean
Building fs/afs: clean
Building fs/autofs: clean
Building fs/autofs4: clean
Building fs/befs: clean
Building fs/bfs: clean
Building fs/cifs: clean
Building fs/coda: clean
Building fs/cramfs: clean
Building fs/devfs: clean
Building fs/devpts: clean
Building fs/efs: clean
Building fs/exportfs: clean
Building fs/ext2: clean
Building fs/ext3: clean
Building fs/fat: clean
Building fs/freevxfs: clean
Building fs/hfs: clean
Building fs/hpfs: clean
Building fs/hugetlbfs: clean
Building fs/intermezzo: clean
Building fs/isofs: clean
Building fs/jbd: clean
Building fs/jffs: clean
Building fs/jffs2: clean
Building fs/jfs: clean
Building fs/lockd: clean
Building fs/minix: clean
Building fs/msdos: clean
Building fs/ncpfs: clean
Building fs/nfs: clean
Building fs/nfsd: clean
Building fs/nls: clean
Building fs/ntfs: clean
Building fs/partitions: clean
Building fs/proc: clean
Building fs/qnx4: clean
Building fs/ramfs: clean
Building fs/reiserfs: clean
Building fs/romfs: clean
Building fs/smbfs: clean
Building fs/sysfs: clean
Building fs/sysv: clean
Building fs/udf: clean
Building fs/ufs: clean
Building fs/vfat: clean
Building fs/xfs: clean
Building drivers/i2c: clean
Building drivers/net: 31 warnings, 0 errors
Building drivers/media: 1 warnings, 0 errors
Building drivers/base: clean
Building drivers/pci: clean
Building drivers/eisa: clean
Building drivers/isdn: clean
Building drivers/char: 1 warnings, 0 errors
Building drivers/acpi: clean
Building drivers/serial: 1 warnings, 0 errors
Building drivers/fc4: clean
Building drivers/parport: clean
Building drivers/mtd: 23 warnings, 0 errors
Building drivers/usb: clean
Building drivers/block: 1 warnings, 0 errors
Building drivers/pcmcia: 3 warnings, 0 errors
Building drivers/input: clean
Building drivers/atm: clean
Building drivers/ide: 30 warnings, 0 errors
Building drivers/pnp: clean
Building drivers/oprofile: clean
Building drivers/ieee1394: clean
Building drivers/cdrom: 3 warnings, 0 errors
Building drivers/md: clean
Building drivers/message: 1 warnings, 0 errors
Building drivers/cpufreq: clean
Building drivers/sbus: clean
Building drivers/bluetooth: clean
Building drivers/telephony: 5 warnings, 0 errors
Building drivers/zorro: clean
Building drivers/acorn: clean
Building drivers/tc: clean
Building drivers/mca: clean
Building drivers/nubus: clean
Building drivers/misc: clean
Building drivers/dio: clean
Building drivers/scsi/aacraid: clean
Building drivers/scsi/aic7xxx: clean
Building drivers/scsi/pcmcia: 4 warnings, 0 errors
Building drivers/scsi/sym53c8xx_2: clean
Building drivers/video/aty: 3 warnings, 0 errors
Building drivers/video/console: 2 warnings, 0 errors
Building drivers/video/i810: clean
Building drivers/video/logo: clean
Building drivers/video/matrox: 5 warnings, 0 errors
Building drivers/video/riva: clean
Building drivers/video/sis: 1 warnings, 0 errors
Building sound/core: clean
Building sound/drivers: clean
Building sound/i2c: clean
Building sound/isa: 3 warnings, 0 errors
Building sound/oss: 33 warnings, 0 errors
Building sound/pci: clean
Building sound/pcmcia: clean
Building sound/synth: clean
Building sound/usb: clean
Building arch/i386: clean
Building crypto: clean
Building lib: clean
Building net: 9 warnings, 0 errors
Building security: clean
Building sound: clean
Building usr: clean
Building fs: clean
Building drivers/video: 8 warnings, 0 errors
Building drivers/scsi: 44 warnings, 0 errors
Building drivers/net: 0 warnings, 1 errors
Error Summary (individual module builds):
drivers/net: 0 warnings, 1 errors
Warning Summary (individual module builds):
drivers/block: 1 warnings, 0 errors
drivers/cdrom: 3 warnings, 0 errors
drivers/char: 1 warnings, 0 errors
drivers/ide: 30 warnings, 0 errors
drivers/media: 1 warnings, 0 errors
drivers/message: 1 warnings, 0 errors
drivers/mtd: 23 warnings, 0 errors
drivers/net: 31 warnings, 0 errors
drivers/pcmcia: 3 warnings, 0 errors
drivers/scsi/pcmcia: 4 warnings, 0 errors
drivers/scsi: 44 warnings, 0 errors
drivers/serial: 1 warnings, 0 errors
drivers/telephony: 5 warnings, 0 errors
drivers/video/aty: 3 warnings, 0 errors
drivers/video/console: 2 warnings, 0 errors
drivers/video/matrox: 5 warnings, 0 errors
drivers/video/sis: 1 warnings, 0 errors
drivers/video: 8 warnings, 0 errors
net: 9 warnings, 0 errors
sound/isa: 3 warnings, 0 errors
sound/oss: 33 warnings, 0 errors
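The per-directory counts above can be approximated from raw compiler output
by pulling the directory out of each warning line and tallying. Below is a
minimal sketch of the extraction step only. It assumes gcc's usual
"path/file.c:LINE: warning: ..." shape; warning_dir is a hypothetical
helper and not the tool used for this report, which appears to group some
subdirectories under their parent (drivers/ide, for example).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If line looks like "path/to/file.c:LINE: warning: ...", copy the
 * directory part of the path ("path/to") into dir and return 1.
 * Return 0 for any other line, or if dir is too small.
 */
static int warning_dir(const char *line, char *dir, size_t dirlen)
{
    const char *colon, *slash, *p;
    size_t n;

    assert(line != NULL && dir != NULL && dirlen > 0);

    if (strstr(line, ": warning:") == NULL)
        return 0;

    colon = strchr(line, ':');          /* end of the file name */
    if (colon == NULL)
        return 0;

    slash = NULL;
    for (p = line; p < colon; p++)      /* last '/' before that colon */
        if (*p == '/')
            slash = p;
    if (slash == NULL)
        return 0;                       /* file has no directory part */

    n = (size_t)(slash - line);
    if (n >= dirlen)
        return 0;                       /* would not fit */
    memcpy(dir, line, n);
    dir[n] = '\0';
    return 1;
}
```

Lines that carry a column number as well ("mcdx.h:180:2: warning: ...") are
handled the same way, because only the first colon is used to find the end
of the file name. Wrapped continuation lines in a mail like this one would
need to be rejoined before being fed to it.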
Error List:
make[1]: [arch/i386/boot/bzImage] Error 1 (ignored)
make[2]: [drivers/net/wan/wanxlfw.inc] Error 127 (ignored)
Warning List:
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:38:2: warning: #warning this
driver has not been tested on a preempt system
arch/i386/kernel/cpu/cpufreq/powernow-k8.c:938:2: warning: #warning
pol->policy is in undefined state here
drivers/cdrom/aztcd.c:379: warning: `pa_ok' defined but not used
drivers/cdrom/isp16.c:124: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/cdrom/mcdx.h:180:2: warning: #warning You have not edited mcdx.h
drivers/cdrom/mcdx.h:181:2: warning: #warning Perhaps irq and i/o
settings are wrong.
drivers/cdrom/sjcd.c:1700: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/char/applicom.c:522:2: warning: #warning "Je suis stupide. DW. -
copy*user in cli"
drivers/char/applicom.c:67: warning: `applicom_pci_tbl' defined but not
used
drivers/char/watchdog/alim1535_wdt.c:320: warning: `ali_pci_tbl' defined
but not used
drivers/ide/ide-probe.c:1326: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/ide-probe.c:1353: warning: `MOD_DEC_USE_COUNT' is deprecated
(declared at include/linux/module.h:494)
drivers/ide/ide-tape.c:6213: warning: duplicate `const'
drivers/ide/ide.c:2470: warning: implicit declaration of function
`pnpide_init'
drivers/ide/legacy/ide-cs.c:365: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/legacy/ide-cs.c:411: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/ide/pci/aec62xx.c:533: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/alim15x3.c:871: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/amd74xx.c:451: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/cmd64x.c:755: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/cs5520.c:294: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/cs5530.c:416: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/cy82c693.c:437: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/hpt34x.c:334: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/hpt366.c:1223: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/ns87415.c:228: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/opti621.c:364: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/pdc202xx_new.c:631: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/pdc202xx_old.c:925: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/piix.c:746: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/rz1000.c:65: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/sc1200.c:557: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/serverworks.c:804: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/siimage.c:1174: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/sis5513.c:956: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/slc90e66.c:376: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/triflex.c:227: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/ide/pci/trm290.c:378: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/ide/pci/trm290.c:406: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
drivers/ide/pci/via82cxxx.c:618: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/input/gameport/ns558.c:121: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
drivers/input/gameport/ns558.c:80: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/media/common/saa7146_vbi.c:6: warning: `vbi_workaround' defined
but not used
drivers/media/video/zoran_card.c:149: warning: `zr36067_pci_tbl' defined
but not used
drivers/message/fusion/mptscsih.c:6922: warning: `mptscsih_setup'
defined but not used
drivers/message/i2o/i2o_block.c:1506: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/mtd/chips/amd_flash.c:783: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/mtd/chips/cfi_cmdset_0001.c:381: warning: unsigned int format,
different type arg (arg 2)
drivers/mtd/chips/cfi_cmdset_0001.c:965: warning: unsigned int format,
different type arg (arg 2)
drivers/mtd/chips/cfi_cmdset_0002.c:1157: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0002.c:513: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0002.c:651: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0002.c:977: warning: unsigned int format,
different type arg (arg 4)
drivers/mtd/chips/cfi_cmdset_0020.c:1139: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/cfi_cmdset_0020.c:1288: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/cfi_cmdset_0020.c:493: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/cfi_cmdset_0020.c:853: warning: unsigned int format,
different type arg (arg 3)
drivers/mtd/chips/sharp.c:157: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/mtd/cmdlinepart.c:344: warning: `mtdpart_setup' defined but not
used
drivers/mtd/devices/doc2000.c:567: warning: assignment from incompatible
pointer type
drivers/mtd/devices/doc2000.c:568: warning: assignment from incompatible
pointer type
drivers/mtd/devices/doc2001.c:376: warning: assignment from incompatible
pointer type
drivers/mtd/devices/doc2001.c:377: warning: assignment from incompatible
pointer type
drivers/mtd/nftlcore.c:354: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:358: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:363: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:632: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlcore.c:696: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/mtd/nftlmount.c:220: warning: passing arg 7 of pointer to
function makes pointer from integer without a cast
drivers/net/3c515.c:529: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/acenic.c:135: warning: `acenic_pci_tbl' defined but not used
drivers/net/arcnet/arc-rimi.c:319: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com20020-isa.c:152: warning: `dev_alloc' is
deprecated (declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com20020-pci.c:71: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com90io.c:385: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com90xx.c:146: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/net/arcnet/com90xx.c:412: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/arcnet/com90xx.c:609: warning: `dev_alloc' is deprecated
(declared at include/linux/netdevice.h:525)
drivers/net/dgrs.c:124: warning: `dgrs_pci_tbl' defined but not used
drivers/net/eepro.c:575: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/ewrk3.c:1291: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/net/ewrk3.c:1335: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/net/hp100.c:288: warning: `hp100_pci_tbl' defined but not used
drivers/net/hp100.c:385: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/hp100.c:432: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/hp100.c:463: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/hp100.c:471: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
drivers/net/sk98lin/skaddr.c:1092: warning: `ReturnCode' might be used
uninitialized in this function
drivers/net/sk98lin/skaddr.c:1624: warning: `ReturnCode' might be used
uninitialized in this function
drivers/net/skfp/skfddi.c:185: warning: `skfddi_pci_tbl' defined but not
used
drivers/net/tokenring/smctr.c:3494: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/tokenring/smctr.c:733: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/tulip/winbond-840.c:149: warning: `version' defined but not
used
drivers/net/wan/cycx_drv.c:430: warning: long unsigned int format, u32
arg (arg 2)
drivers/net/wan/farsync.c:1316: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/farsync.c:1329: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/hostess_sv11.c:125: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/hostess_sv11.c:157: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/lmc/lmc_main.c:1063: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
drivers/net/wan/lmc/lmc_main.c:1184: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/lmc/lmc_main.c:1355: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/pc300_drv.c:3168: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/net/wan/pc300_drv.c:3204: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/net/wan/sbni.c:308: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/pcmcia/i82365.c:680: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/pcmcia/i82365.c:817: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/pcmcia/tcic.c:340: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:1003: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:1008: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:700: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:704: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:708: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:712: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:716: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:720: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:973: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:988: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:993: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/BusLogic.c:998: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/NCR5380.c:396: warning: `phases' defined but not used
drivers/scsi/NCR5380.c:699: warning: `NCR5380_probe_irq' defined but not
used
drivers/scsi/NCR5380.c:756: warning: `NCR5380_print_options' defined but
not used
drivers/scsi/NCR53c406a.c:611: warning: `NCR53c406a_setup' defined but
not used
drivers/scsi/NCR53c406a.c:660: warning: initialization from incompatible
pointer type
drivers/scsi/NCR53c406a.c:669: warning: `wait_intr' defined but not used
drivers/scsi/advansys.c:10006: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/advansys.c:4622: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/aha152x.c:396: warning: `id_table' defined but not used
drivers/scsi/aha152x.c:793: warning: `aha152x_setup' defined but not
used
drivers/scsi/aha152x.c:852: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/aha152x.c:870: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/atp870u.c:2350: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/atp870u.c:2422: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/cpqfcTSinit.c:1583: warning: unused variable `timeout'
drivers/scsi/cpqfcTSinit.c:1584: warning: unused variable `retries'
drivers/scsi/cpqfcTSinit.c:1585: warning: unused variable `scsi_cdb'
drivers/scsi/cpqfcTSinit.c:471: warning: `my_ioctl_done' defined but not
used
drivers/scsi/dtc.c:187: warning: `dtc_setup' defined but not used
drivers/scsi/eata_pio.c:596: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/fd_mcs.c:300: warning: `fd_mcs_setup' defined but not used
drivers/scsi/fd_mcs.c:311: warning: initialization from incompatible
pointer type
drivers/scsi/fd_mcs.h:27: warning: `fd_mcs_command' declared `static'
but never defined
drivers/scsi/fdomain.c:763: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/g_NCR5380.c:926: warning: `id_table' defined but not used
drivers/scsi/gdth.c:881: warning: `gdthtable' defined but not used
drivers/scsi/inia100.h:70: warning: `inia100_detect' declared `static'
but never defined
drivers/scsi/inia100.h:71: warning: `inia100_release' declared `static'
but never defined
drivers/scsi/inia100.h:72: warning: `inia100_queue' declared `static'
but never defined
drivers/scsi/inia100.h:73: warning: `inia100_abort' declared `static'
but never defined
drivers/scsi/inia100.h:74: warning: `inia100_device_reset' declared
`static' but never defined
drivers/scsi/inia100.h:75: warning: `inia100_bus_reset' declared
`static' but never defined
drivers/scsi/libata-core.c:2133: warning: `ata_qc_push' defined but not
used
drivers/scsi/psi240i.c:713: warning: initialization from incompatible
pointer type
drivers/scsi/psi240i.c:714: warning: initialization from incompatible
pointer type
drivers/scsi/sym53c416.c:627: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/sym53c416.c:715: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/scsi/wd7000.c:1611: warning: `wd7000_abort' defined but not used
drivers/serial/8250.c:693: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.c:7737: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.c:7799: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.c:7835: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
drivers/telephony/ixj.h:41: warning: `ixj_h_rcsid' defined but not used
drivers/usb/class/usb-midi.h:150: warning: `usb_midi_ids' defined but
not used
drivers/video/aty/aty128fb.c:2335: warning: `aty128fb_exit' defined but
not used
drivers/video/aty/aty128fb.c:254: warning: `mode' defined but not used
drivers/video/aty/aty128fb.c:256: warning: `nomtrr' defined but not used
drivers/video/console/mdacon.c:374: warning: `MOD_INC_USE_COUNT' is
deprecated (declared at include/linux/module.h:482)
drivers/video/console/mdacon.c:384: warning: `MOD_DEC_USE_COUNT' is
deprecated (declared at include/linux/module.h:494)
drivers/video/hgafb.c:452: warning: `hgafb_fillrect' defined but not
used
drivers/video/hgafb.c:472: warning: `hgafb_copyarea' defined but not
used
drivers/video/hgafb.c:502: warning: `hgafb_imageblit' defined but not
used
drivers/video/imsttfb.c:1089: warning: `imsttfb_load_cursor_image'
defined but not used
drivers/video/imsttfb.c:1159: warning: `imstt_set_cursor' defined but
not used
drivers/video/matrox/matroxfb_base.c:1250: warning: `inverse' defined
but not used
drivers/video/matrox/matroxfb_g450.c:129: warning: duplicate `const'
drivers/video/matrox/matroxfb_g450.c:130: warning: duplicate `const'
drivers/video/matrox/matroxfb_maven.c:347: warning: duplicate `const'
drivers/video/matrox/matroxfb_maven.c:348: warning: duplicate `const'
drivers/video/sis/sis_main.c:622: warning: unused variable `reg'
drivers/video/tdfxfb.c:1005: warning: `tdfxfb_cursor' defined but not
used
drivers/video/tdfxfb.c:198: warning: `inverse' defined but not used
drivers/video/tdfxfb.c:199: warning: `mode_option' defined but not used
drivers/video/tridentfb.c:455: warning: `tridentfb_fillrect' defined but
not used
drivers/video/tridentfb.c:473: warning: `tridentfb_copyarea' defined but
not used
include/linux/ixjuser.h:45: warning: `ixjuser_h_rcsid' defined but not
used
include/linux/mca-legacy.h:12:2: warning: #warning "MCA legacy - please
move your driver to the new sysfs api"
net/decnet/dn_nsp_in.c:805: warning: `skb_linearize' is deprecated
(declared at include/linux/skbuff.h:1136)
net/decnet/dn_route.c:639: warning: `skb_linearize' is deprecated
(declared at include/linux/skbuff.h:1136)
net/ipv4/ipcomp.c:189: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv4/ipcomp.c:72: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv6/ipcomp6.c:174: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv6/ipcomp6.c:61: warning: `skb_linearize' is deprecated (declared
at include/linux/skbuff.h:1136)
net/ipv6/netfilter/ip6_tables.c:349: warning: `skb_linearize' is
deprecated (declared at include/linux/skbuff.h:1136)
net/ipv6/netfilter/ip6table_mangle.c:162: warning: `skb_linearize' is
deprecated (declared at include/linux/skbuff.h:1136)
net/wanrouter/wanmain.c:729: warning: `dev_get' is deprecated (declared
at include/linux/netdevice.h:514)
sound/isa/opti9xx/opti92x-ad1848.c:1670: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
sound/isa/opti9xx/opti92x-ad1848.c:1686: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
sound/isa/opti9xx/opti92x-ad1848.c:314: warning: `check_region' is
deprecated (declared at include/linux/ioport.h:119)
sound/oss/ad1848.c:1580: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/ad1848.c:2530: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/ad1848.c:2967: warning: `id_table' defined but not used
sound/oss/cmpci.c:1465: warning: unused variable `s'
sound/oss/cmpci.c:2865: warning: `cmpci_pci_tbl' defined but not used
sound/oss/cs4232.c:141: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/cs4232.c:193: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:76: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:78: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:93: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/gus_card.c:94: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/mad16.c:322: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/maui.c:307: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/mpu401.c:1217: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/msnd.c:74: warning: `MOD_INC_USE_COUNT' is deprecated
(declared at include/linux/module.h:482)
sound/oss/msnd.c:95: warning: `MOD_DEC_USE_COUNT' is deprecated
(declared at include/linux/module.h:494)
sound/oss/msnd_pinnacle.c:1123: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/msnd_pinnacle.c:1811: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/opl3sa.c:114: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/opl3sa.c:122: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/pss.c:1004: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/pss.c:191: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/pss.c:640: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/pss.c:710: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/sb_common.c:1224: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/sb_common.c:523: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/sgalaxy.c:89: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sgalaxy.c:97: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:1113: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:1132: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:1137: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/sscape.c:737: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
sound/oss/trix.c:147: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/trix.c:292: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/trix.c:85: warning: `check_region' is deprecated (declared at
include/linux/ioport.h:119)
sound/oss/wavfront.c:2426: warning: `check_region' is deprecated
(declared at include/linux/ioport.h:119)
sound/oss/wf_midi.c:788: warning: `check_region' is deprecated (declared
at include/linux/ioport.h:119)
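Most of the warnings in this log are `check_region' deprecations. The usual fix is to drop the separate check and call request_region() directly, testing its return value, because checking first and requesting later leaves a window in which another driver can claim the same ports. What follows is only a userspace sketch of that pattern: mock_request_region() and mock_probe() are hypothetical stand-ins rather than kernel code, and the 16-port extent is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_REGIONS 8

/* Userspace stand-in for the kernel's I/O-port resource list. */
struct mock_region {
    unsigned long start;
    unsigned long len;
    const char *name;
};

struct mock_region mock_regions[MAX_REGIONS];
size_t mock_region_count;

/*
 * Stand-in for request_region(): claim [start, start + len) and return
 * the new entry, or NULL if the range overlaps an existing claim.
 */
struct mock_region *mock_request_region(unsigned long start,
                                        unsigned long len, const char *name)
{
    size_t i;

    assert(len > 0);
    for (i = 0; i < mock_region_count; i++) {
        struct mock_region *r = &mock_regions[i];

        if (start < r->start + r->len && r->start < start + len)
            return NULL;    /* overlap: somebody else owns it */
    }
    if (mock_region_count == MAX_REGIONS)
        return NULL;        /* mock table is full */
    mock_regions[mock_region_count].start = start;
    mock_regions[mock_region_count].len = len;
    mock_regions[mock_region_count].name = name;
    return &mock_regions[mock_region_count++];
}

/*
 * Probe in the non-deprecated style: claim the ports first and only
 * then touch the hardware, so no separate "check" step is needed.
 */
int mock_probe(unsigned long ioaddr)
{
    if (!mock_request_region(ioaddr, 16, "mock-driver"))
        return -EBUSY;
    /* ... hardware detection would go here ... */
    return 0;
}
```

In a real driver every failure path after a successful claim would also have to call release_region(), which this sketch leaves out.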
Andrew,
I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
I tested using the test programs aiocp and aiodio_sparse.
(see http://developer.osdl.org/daniel/AIO/)
Using aiocp with i/o sizes from 1k to 512k to copy files worked
without any errors or kernel debug messages.
With 64k i/o, the aiodio_sparse program completes without any errors.
There are no kernel error messages, so that is good.
There are still problems with non-power-of-2 i/o sizes using AIO and
O_DIRECT. It hangs with aio's that do not seem to complete. The test
does exit when hitting ^c and there are no kernel messages. Test output
below:
$ ./aiodio_sparse
$ ./aiodio_sparse -dd -s 1751k -r 18k -w 11k
child 1843, read loop count 0
io_submit() return 16
aiodio_sparse: 16 i/o in flight
aiodio_sparse: offset 180224 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 191488 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 202752 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 214016 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 225280 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 236544 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 247808 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 259072 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 270336 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 281600 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 292864 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 304128 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
child 1843, read loop count 10
io_submit() return 1
aiodio_sparse: offset 315392 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 326656 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 337920 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 349184 filesize 1793024 inflight 16
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 11264 res2 0
io_submit() return 1
aiodio_sparse: offset 360448 filesize 1793024 inflight 16
child 1843, read loop count 20
child 1843, read loop count 30
child 1843, read loop count 40
child 1843, read loop count 50
child 1843, read loop count 60
child 1843, read loop count 70
$ ./aiodio_sparse -i 9 -d -s 180k -r 18k -w 18k
io_submit() return 9
aiodio_sparse: 9 i/o in flight
aiodio_sparse: offset 165888 filesize 184320 inflight 9
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 18432 res2 0
io_submit() return 1
child 2060, read loop count 0
child 2060, read loop count 10
child 2060, read loop count 20
Daniel
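The numbers in the test output above can be checked by hand. The sketch below assumes that -s, -r and -w take the file size, read size and write size in KiB, and that the filesystem block size is 4096 bytes (an assumption; ext3 also supports 1k and 2k blocks). The helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Convert a size given in KiB (as in "-w 11k") to bytes. */
size_t kib(size_t n)
{
    return n * 1024;
}

/*
 * File offset of the next write once `submitted` sequential writes of
 * `io_size` bytes each have been queued starting at offset 0.
 */
size_t next_write_offset(size_t submitted, size_t io_size)
{
    return submitted * io_size;
}

/* Non-zero if `len` is a whole multiple of `align`. */
int is_multiple_of(size_t len, size_t align)
{
    assert(align != 0);
    return len % align == 0;
}
```

With 16 writes of 11k in flight the next offset is 16 * 11264 = 180224, and 1751k is 1793024 bytes, matching the first "offset ... filesize ..." line. An 11k write is a multiple of 512 but not of 4096, whereas the 64k size that works is a multiple of both, so under the 4096-byte assumption the failing sizes are the ones that are sector-aligned but not block-aligned.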
On Wed, 2003-11-12 at 23:30, Andrew Morton wrote:
> - Significant changes to the AIO and direct-io code. This needs beating
> on; hopefully we're now close to a solution to the fairly complex problems
> in there.
>
> - Several ext2 and ext3 allocator fixes. These need serious testing on big
> SMP.
Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
later.
M.
"Martin J. Bligh" <[email protected]> wrote:
>
>
>
> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> > SMP.
>
> OK, ext3 survived a swatting on the 16-way as well.
Great, thanks.
> It's still slow as snot, but it does work ;-)
I think SDET generates storms of metadata updates. Making the journal
larger may help get that idle time down.
Probably the default journal size is too small nowadays. Most tests seem
to run faster when it is enlarged.
> - Several ext2 and ext3 allocator fixes. These need serious testing on big
> SMP.
OK, ext3 survived a swatting on the 16-way as well. It's still slow as snot,
but it does work ;-) No changes from before, methinks.
Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
     27022    16.3% total
     24069    53.3% default_idle
       583     2.4% page_remove_rmap
       539   248.4% fd_install
       478   388.6% __blk_queue_bounce
       319     4.0% __d_lookup
       220   122.9% may_open
       204    68.2% filemap_nopage
       124     0.0% journal_add_journal_head
       122   321.1% __find_get_block_slow
       122     0.0% do_get_write_access
       101    57.1% generic_fillattr
...
       -52   -73.2% .text.lock.highmem
       -52   -94.5% generic_file_read
       -53   -18.7% do_generic_mapping_read
       -58    -3.3% do_no_page
       -65   -13.0% page_address
       -65   -60.2% kmap_high
       -74  -100.0% grab_block
       -75    -3.3% do_page_fault
       -85    -1.9% __copy_from_user_ll
      -273   -19.5% link_path_walk
      -299    -6.5% find_get_page
      -758  -100.0% generic_file_open
SDET:
   1726439   214.7% total
   1383611   345.4% default_idle
    115417     0.0% .text.lock.transaction
     79362     0.0% find_next_usable_block
     38003     0.0% do_get_write_access
     32429  2316.4% __down
     31231     0.0% journal_dirty_metadata
     15114   553.8% schedule
     14350  1253.3% __wake_up
     13459     0.0% start_this_handle
     13100     0.0% journal_stop
...
     -1105   -25.1% copy_mm
     -1144  -100.0% generic_file_open
     -1205   -45.0% .text.lock.dec_and_lock
     -1342  -100.0% ext2_new_inode
     -1365   -50.5% follow_mount
     -1453  -100.0% grab_block
     -1580   -30.5% remove_shared_vm_struct
     -1759   -11.0% copy_page_range
     -2145   -18.4% __d_lookup
     -2157   -35.6% path_lookup
     -2222   -33.7% atomic_dec_and_lock
     -2813   -25.0% release_pages
     -3764   -19.1% zap_pte_range
     -8954   -21.2% page_add_rmap
    -22707   -25.0% page_remove_rmap
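For anyone reading these diffprofiles: each row is the change in profile ticks from the ext2 run to the ext3 run, followed by that change as a percentage of the ext2 count. The sketch below shows the arithmetic. It assumes that a symbol with no ticks in the baseline is printed as 0.0% (which would explain the journal_* rows, since ext2 has no journal) and that -100.0% means the symbol vanished in the second run. The function names are made up and this is not the diffprofile script itself.

```c
#include <assert.h>

/*
 * Percentage change in tenths of a percent (163 means 16.3%).
 * A zero baseline yields 0, mirroring the 0.0% rows in the tables.
 */
long long pct_change_tenths(long long base, long long delta)
{
    if (base == 0)
        return 0;
    return (delta * 1000) / base;
}

/* Approximate baseline count implied by a delta and its percentage. */
long long implied_base(long long delta, long long pct_tenths)
{
    assert(pct_tenths != 0);
    return (delta * 1000) / pct_tenths;
}
```

Going the other way, the kernbench "total" row (27022 ticks, 16.3%) implies an ext2 baseline of about 165779 ticks, subject to rounding in the printed percentage.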
On Friday 14 November 2003 11:08 am, Martin J. Bligh wrote:
> > - Several ext2 and ext3 allocator fixes. These need serious testing on
> > big SMP.
>
> OK, ext3 survived a swatting on the 16-way as well. It's still slow as
> snot, but it does work ;-) No changes from before, methinks.
>
> Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
>
> 27022 16.3% total
> 24069 53.3% default_idle
> 583 2.4% page_remove_rmap
> 539 248.4% fd_install
> 478 388.6% __blk_queue_bounce
What driver are you using? Why are you bouncing?
Thanks,
Badari
On Fri, Nov 14, 2003 at 10:59:47AM -0800, Andrew Morton wrote:
> "Martin J. Bligh" <[email protected]> wrote:
> >
> >
> >
> > > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> > > SMP.
> >
> > OK, ext3 survived a swatting on the 16-way as well.
>
> Great, thanks.
>
> > It's still slow as snot, but it does work ;-)
>
> I think SDET generates storms of metadata updates. Making the journal
> larger may help get that idle time down.
>
> Probably the default journal size is too small nowadays. Most tests seem
> to run faster when it is enlarged.
Or maybe if it didn't start sync committing from the journal once it hits 50%.
>> > - Several ext2 and ext3 allocator fixes. These need serious testing on
>> > big SMP.
>>
>> OK, ext3 survived a swatting on the 16-way as well. It's still slow as
>> snot, but it does work ;-) No changes from before, methinks.
>>
>> Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
>>
>> 27022 16.3% total
>> 24069 53.3% default_idle
>> 583 2.4% page_remove_rmap
>> 539 248.4% fd_install
>> 478 388.6% __blk_queue_bounce
>
> What driver are you using? Why are you bouncing?
qlogicisp. Because the driver is crap? ;-)
M.
On Thu, 13 Nov 2003, Martin J. Bligh wrote:
> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> > SMP.
>
> Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
> later.
It's actually triple faulting my laptop (K6 family=5 model=8 step=12) when
i have CONFIG_X86_4G enabled and try and run X11. The same kernel is fine
on all my other test boxes. Any hints?
>> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
>> > SMP.
>>
>> Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
>> later.
>
> It's actually triple faulting my laptop (K6 family=5 model=8 step=12) when
> i have CONFIG_X86_4G enabled and try and run X11. The same kernel is fine
> on all my other test boxes. Any hints?
Linus had some debug thing for triple faults, a few months ago, IIRC ...
probably in the archives somewhere ...
M.
On Fri, 14 Nov 2003, Martin J. Bligh wrote:
>
> Linus had some debug thing for triple faults, a few months ago, IIRC ...
> probably in the archives somewhere ...
Triple faults you can't debug, they raise a line outside the CPU, and
normal PC hardware will cause that to just trigger a reboot.
But double faults do get caught, and that debugging stuff actually is in
the standard kernel. It won't give _nearly_ as good a debug report as a
"normal" oops, since I didn't want the double-fault handler to touch
anything even remotely unsafe, but it often gives a good hint about what
might be wrong. Certainly better than triple-faulting did (which we still
do for _catastrophic_ corruption, eg totally munged kernel page tables etc
- it's just very hard to avoid once you get corrupted enough).
Linus
Mike> Or maybe if it didn't start sync committing from the journal
Mike> once it hits 50%.
Instead of using a percentage like this, would it make sense to flush
the journal when there are only N number of free journal slots/entries
left? Now the question is how to compute N in a sane way that works
for small (memory) systems, as well as for larger systems.
You don't want to grow N too aggressively, or base it on the memory of
the system, do you? When you have a 20mb journal, maybe starting
writeout after 10mb is used makes sense, because you've only got 10
transaction slots open. But when you have a 200mb journal, does it
make sense to start writeout when you only have 100 transaction slots
left?
Since I don't know the internals of Ext3 at all, I'm probably
completely missing the idea here, but my gut feeling is that the
scaling we use in these cases shouldn't be linear at all, but more
likely inverse logarithmic instead. Basically, the larger we get with
a resource, the slower we grow our usage, or the smaller we grow the
absolute size of the writeout buffer(s).
Hmmm... this doesn't sound clear even to me. But the idea I think I'm
trying to get at is that if we have X size of a journal, we want to
start writeout when we have X/2 available. But when we have Y size of
a journal, where Y is X*10 (or larger), we don't want Y/2 as the
cutover point, we want something like Y/10. The idea is that we grow
the denominator here at a slow rate, since it will shrink the free
buffer percentage nicely, yet not let us get too close to a truly zero
sized buffer.
    X       X/N
-----  --------
   10         5
  100        10
 1000        25
10000       125
Does this make any sense to anyone?
John
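One way to read the table above is as a stepped denominator: N = 2 below 100, 10 below 1000, 40 below 10000 and 80 from there on. The sketch below reproduces exactly those four rows and nothing more. The cut-over points between the rows are an assumption, the function name is hypothetical, and none of this reflects how ext3 actually sizes its journal writeout.

```c
#include <assert.h>

/*
 * Free-space threshold (in the same units as journal_size) at which
 * writeout would start, using a denominator that grows in steps with
 * the journal size so that the threshold grows more slowly than X/2.
 */
unsigned long writeout_threshold(unsigned long journal_size)
{
    assert(journal_size > 0);
    if (journal_size < 100)
        return journal_size / 2;
    if (journal_size < 1000)
        return journal_size / 10;
    if (journal_size < 10000)
        return journal_size / 40;
    return journal_size / 80;
}
```

A stepped function like this is discontinuous: a journal of size 99 gets a threshold of 49 while size 100 gets only 10. A smooth sublinear function would avoid that jump, at the cost of no longer matching the table exactly.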
On Fri, 14 Nov 2003, Martin J. Bligh wrote:
> >> > - Several ext2 and ext3 allocator fixes. These need serious testing on big
> >> > SMP.
> >>
> >> Survives kernbench and SDET on ext2 at least on 16-way. I'll try ext3
> >> later.
> >
> > It's actually triple faulting my laptop (K6 family=5 model=8 step=12) when
> > i have CONFIG_X86_4G enabled and try and run X11. The same kernel is fine
> > on all my other test boxes. Any hints?
>
> Linus had some debug thing for triple faults, a few months ago, IIRC ...
> probably in the archives somewhere ...
It should all be in the kernel right now; arch/i386/kernel/doublefault.c
but i think i may be a bit low on luck =)
On Fri, 14 Nov 2003, Linus Torvalds wrote:
> Triple faults you can't debug, they raise a line outside the CPU, and
> normal PC hardware will cause that to just trigger a reboot.
>
> But double faults do get caught, and that debugging stuff actually is in
> the standard kernel. It won't give _nearly_ as good a debug report as a
> "normal" oops, since I didn't want the double-fault handler to touch
> anything even remotely unsafe, but it often gives a good hint about what
> might be wrong. Certainly better than triple-faulting did (which we still
> do for _catastrophic_ corruption, eg totally munged kernel page tables etc
> - it's just very hard to avoid once you get corrupted enough).
"Catastrophic" seems to be rather apt here. 2.6.0-test8-mm1 produced the
following, i'm still doing a binary search.
Unable to handle kernel paging request at virtual address 00002000
printing eip:
00007341
*pde = 00000000
Oops: 0004 [#1]
PREEMPT SMP DEBUG_PAGEALLOC
CPU: 0
EIP: c000:[<00007341>] Not tainted VLI
EFLAGS: 00033246
EIP is at 0x7341
eax: 32454256 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00002000 ebp: 00000fd6 esp: 08763f24
ds: 0000 es: 0000 ss: 0068
Process X (pid: 939, threadinfo=08762000 task=0890b330)
Stack: 00000fcb 00000100 00000000 0000c000 00000000 00000000 00000000 00000000
00000005 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Call Trace:
Code: Bad EIP value.
On Fri, Nov 14, 2003 at 03:27:01PM -0500, John Stoffel wrote:
> You don't want to grow N too aggressively, or base it on the memory of
> the system, do you? When you have a 20mb journal, maybe starting
> writeout after 10mb is used makes sense, because you've only got 10
> transaction slots open. But when you have a 200mb journal, does it
> make sense to start writeout when you only have 100 transaction slots
> left?
The minimum transaction size is one block (since ext3 is the only journaling
FS to log entire blocks, instead of the specific logical changes made during
the transaction), and your blocks are 1k, 2k, or 4k.
Though many times you'll have several blocks per transaction since each
transaction can change bitmaps, directory blocks, and etc.
> Since I don't know the internals of Ext3 at all, I'm probably
> completely missing the idea here, but my gut feeling is that the
> scaling we use in these cases shouldn't be linear at all, but more
> likely inverse logarithmic instead. Basically, the larger we get with
> a resource, the slower we grow our usage, or the smaller we grow the
> absolute size of the writeout buffer(s).
>
> Hmmm... this doesn't sound clear even to me. But the idea I think I'm
> trying to get at is that if we have X size of a journal, we want to
> start writeout when we have X/2 available. But when we have Y size of
> a journal, where Y is X*10 (or larger), we don't want Y/2 as the
> cutover point, we want something like Y/10. The idea is that we grow
> the denominator here at a slow rate, since it will shrink the free
> buffer percentage nicely, yet not let us get too close to a truly zero
> sized buffer.
Last I heard, ext3 will try to flush the journal with an async process and
if that isn't able to keep up, once the journal hits 50% full, the system
will write synchronously until the journal is empty (or was that until it was
25% full or less, I forget...).
AFAIK everyone agrees that this is not optimal, but nobody's taken the time
to fix it yet either.
Mike
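The behaviour described above amounts to a two-state hysteresis, sketched below. This models only the description in this mail (async flushing, switch to synchronous writes at 50% full, stay there until the journal drains to a low watermark); it has not been checked against the ext3/JBD source, and the enum, the function and the low_pct parameter are all hypothetical. Passing low_pct = 0 gives the "until empty" variant and low_pct = 25 the "25% full or less" variant.

```c
#include <assert.h>

enum flush_mode { FLUSH_ASYNC, FLUSH_SYNC };

/*
 * Decide the next flush mode from the current one and the journal fill
 * level. `used` and `size` are in the same units (e.g. blocks).
 */
enum flush_mode next_flush_mode(enum flush_mode cur, unsigned long used,
                                unsigned long size, unsigned int low_pct)
{
    assert(size > 0 && used <= size && low_pct < 50);

    if (cur == FLUSH_ASYNC)
        /* Async flusher fell behind: half full triggers sync mode. */
        return (used * 2 >= size) ? FLUSH_SYNC : FLUSH_ASYNC;

    /* Already synchronous: stay until drained to the low watermark. */
    return (used * 100 <= size * low_pct) ? FLUSH_ASYNC : FLUSH_SYNC;
}
```

The gap between the two watermarks is what makes the stall long: once the 50% mark is hit, writers stay throttled until the journal has drained all the way down, which would fit the large default_idle increase in the SDET profile.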
The 4G/4G page fault handling path doesn't appear to handle faults
happening whilst in vm86. In vm86 mode regs->xcs != __USER_CS, so the
in-kernel test is confused.
However i'm still debugging the X11 triple fault in test9-mm3
Unable to handle kernel paging request at virtual address 00002000
printing eip:
00007341
*pde = 00000000
Oops: 0004 [#1]
SMP DEBUG_PAGEALLOC
CPU: 0
EIP: c000:[<00007341>] Not tainted VLI
EFLAGS: 00033246
EIP is at 0x7341
eax: 32454256 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: 00000000 edi: 00002000 ebp: 00000fd6 esp: 087bbf24
ds: 0000 es: 0000 ss: 0068
Process X (pid: 939, threadinfo=087ba000 task=0891c690)
Stack: 00000fcb 00000100 00000000 0000c000 00000000 00000000 00000000 00000000
00000005 ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Call Trace:
Index: linux-2.6.0-test9-mm3/arch/i386/mm/fault.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test9-mm3/arch/i386/mm/fault.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 fault.c
--- linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 13 Nov 2003 08:07:17 -0000 1.1.1.1
+++ linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 15 Nov 2003 19:08:34 -0000
@@ -264,7 +264,9 @@ asmlinkage void do_page_fault(struct pt_
 		if (error_code & 3)
 			goto bad_area_nosemaphore;
-		goto vmalloc_fault;
+		/* If it's vm86 fall through */
+		if (!(error_code & 4))
+			goto vmalloc_fault;
 	}
 #else
 	if (unlikely(address >= TASK_SIZE)) {
On Sat, 15 Nov 2003, Zwane Mwaikambo wrote:
> The 4G/4G page fault handling path doesn't appear to handle faults
> happening whilst in vm86. In vm86 mode regs->xcs != __USER_CS, so the
> in-kernel test is confused.
Perhaps this would be more desirable?
Index: linux-2.6.0-test9-mm3/arch/i386/mm/fault.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test9-mm3/arch/i386/mm/fault.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 fault.c
--- linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 13 Nov 2003 08:07:17 -0000 1.1.1.1
+++ linux-2.6.0-test9-mm3/arch/i386/mm/fault.c 15 Nov 2003 19:40:17 -0000
@@ -264,7 +264,9 @@ asmlinkage void do_page_fault(struct pt_
 		if (error_code & 3)
 			goto bad_area_nosemaphore;
-		goto vmalloc_fault;
+		/* If it's vm86 fall through */
+		if (!(regs->eflags & VM_MASK))
+			goto vmalloc_fault;
 	}
 #else
 	if (unlikely(address >= TASK_SIZE)) {
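The two variants of the patch test different bits, and both can be checked against the oops posted earlier (EFLAGS 00033246, "Oops: 0004"). The sketch below decodes them in userspace. EFLAGS_VM_BIT is a local name for bit 17 of EFLAGS, the bit the kernel calls VM_MASK, and the helper functions are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define EFLAGS_VM_BIT  0x00020000u  /* EFLAGS bit 17: virtual-8086 mode */

/* i386 page-fault error code bits */
#define PF_PROTECTION  0x1u  /* 0: page not present, 1: protection fault */
#define PF_WRITE       0x2u  /* 0: read access,      1: write access */
#define PF_USER        0x4u  /* 0: kernel mode,      1: user mode */

/* Non-zero if the saved EFLAGS say the fault happened in vm86 mode. */
int fault_in_vm86(uint32_t eflags)
{
    return (eflags & EFLAGS_VM_BIT) != 0;
}

/* Non-zero if the error code says the CPU was in user mode. */
int fault_from_user(uint32_t error_code)
{
    return (error_code & PF_USER) != 0;
}

/* Non-zero for a read of a not-present page (low two bits clear). */
int fault_is_missing_read(uint32_t error_code)
{
    return (error_code & (PF_PROTECTION | PF_WRITE)) == 0;
}
```

For the posted oops, fault_in_vm86(0x00033246) and fault_from_user(0x0004) are both true, so either patch would have skipped the vmalloc_fault path for it. The VM_MASK form singles out vm86 specifically, while the error_code & 4 form also covers any other user-mode fault that reaches this branch.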
On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> Andrew,
>
> I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> I tested using the test programs aiocp and aiodio_sparse.
> (see http://developer.osdl.org/daniel/AIO/)
>
> Using aiocp with i/o sizes from 1k to 512k to copy files worked
> without any errors or kernel debug messages.
>
> With 64k i/o, the aiodio_sparse program completes without any errors.
> There are no kernel error messages, so that is good.
>
> There are still problems with non power of 2 i/o sizes using AIO and
> O_DIRECT. It hangs with aio's that do not seem to complete. The test
> does exit when hitting ^c and there are no kernel messages. Test output
> below:
Could you check if the following patch fixes the problem for you ?
Regards
Suparna
--------------------------------------------------------------
With this patch, when the DIO code falls back to buffered i/o after
having submitted part of the i/o, then buffered i/o is issued only
for the remaining part of the request (i.e. the part not already
covered by DIO).
diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
--- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
@@ -74,6 +74,7 @@
been performed at the start of a
write */
int pages_in_io; /* approximate total IO pages */
+ size_t size; /* total request size (doesn't change)*/
sector_t block_in_file; /* Current offset into the underlying
file in dio_block units. */
unsigned blocks_available; /* At block_in_file. changes */
@@ -226,7 +227,7 @@
dio_complete(dio, dio->block_in_file << dio->blkbits,
dio->result);
/* Complete AIO later if falling back to buffered i/o */
- if (dio->result != -ENOTBLK) {
+ if (dio->result >= dio->size || dio->rw == READ) {
aio_complete(dio->iocb, dio->result, 0);
kfree(dio);
} else {
@@ -889,6 +890,7 @@
dio->blkbits = blkbits;
dio->blkfactor = inode->i_blkbits - blkbits;
dio->start_zero_done = 0;
+ dio->size = 0;
dio->block_in_file = offset >> blkbits;
dio->blocks_available = 0;
dio->cur_page = NULL;
@@ -925,7 +927,7 @@
for (seg = 0; seg < nr_segs; seg++) {
user_addr = (unsigned long)iov[seg].iov_base;
- bytes = iov[seg].iov_len;
+ dio->size += bytes = iov[seg].iov_len;
/* Index into the first page of the first block */
dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
@@ -956,6 +958,13 @@
}
} /* end iovec loop */
+ if (ret == -ENOTBLK && rw == WRITE) {
+ /*
+ * The remaining part of the request will be
+ * be handled by buffered I/O when we return
+ */
+ ret = 0;
+ }
/*
* There may be some unwritten disk at the end of a part-written
* fs-block-sized block. Go zero that now.
@@ -986,19 +995,13 @@
*/
if (dio->is_async) {
if (ret == 0)
- ret = dio->result; /* Bytes written */
- if (ret == -ENOTBLK) {
- /*
- * The request will be reissued via buffered I/O
- * when we return; Any I/O already issued
- * effectively becomes redundant.
- */
- dio->result = ret;
+ ret = dio->result;
+ if (ret > 0 && dio->result < dio->size && rw == WRITE) {
dio->waiter = current;
}
finished_one_bio(dio); /* This can free the dio */
blk_run_queues();
- if (ret == -ENOTBLK) {
+ if (dio->waiter) {
/*
* Wait for already issued I/O to drain out and
* release its references to user-space pages
@@ -1032,7 +1035,8 @@
}
dio_complete(dio, offset, ret);
/* We could have also come here on an AIO file extend */
- if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
+ if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
+ dio->result < dio->size))
aio_complete(iocb, ret, 0);
kfree(dio);
}
diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
--- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
+++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
@@ -1895,14 +1895,16 @@
*/
if (written >= 0 && file->f_flags & O_SYNC)
status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
- if (written >= 0 && !is_sync_kiocb(iocb))
+ if (written >= count && !is_sync_kiocb(iocb))
written = -EIOCBQUEUED;
- if (written != -ENOTBLK)
+ if (written < 0 || written >= count)
goto out_status;
/*
* direct-io write to a hole: fall through to buffered I/O
+ * for completing the rest of the request.
*/
- written = 0;
+ pos += written;
+ count -= written;
}
buf = iov->iov_base;
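The behaviour described at the top of this patch can be modelled outside the kernel. The sketch below assumes a much-reduced view of struct dio and of the write path; all type and function names are hypothetical. It also treats a negative result as "complete now", which is an assumption of the model rather than something stated in the patch description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum rw_dir { DIR_READ, DIR_WRITE };  /* stand-ins for READ and WRITE */

/* Hypothetical, reduced view of the fields the patch consults. */
struct dio_model {
    enum rw_dir rw;
    long result;   /* bytes transferred by direct I/O, or a negative error */
    size_t size;   /* total request size, fixed once the iovec is summed */
};

/* Models the patched test in finished_one_bio(): complete the AIO now,
 * or leave completion to the buffered-I/O fallback? */
static bool complete_aio_now(const struct dio_model *d)
{
    assert(d != NULL);
    if (d->rw == DIR_READ)
        return true;                     /* reads never fall back */
    if (d->result < 0)
        return true;                     /* model assumption: errors complete */
    return (size_t)d->result >= d->size; /* whole write done by direct I/O */
}

/* Hypothetical view of the position and length the write path tracks. */
struct write_state {
    long long pos;
    size_t count;
};

/* Models the patched filemap.c hunk: returns true when buffered I/O must
 * handle a remainder, after advancing past what direct I/O already wrote. */
static bool needs_buffered_fallback(struct write_state *w, long written)
{
    assert(w != NULL);
    if (written < 0 || (size_t)written >= w->count)
        return false;                    /* error, or nothing left to write */
    w->pos += written;
    w->count -= (size_t)written;
    return true;
}
```

For an 18k write where direct I/O covered only the first 4k, complete_aio_now() is false and needs_buffered_fallback() leaves pos advanced by 4096 and count reduced to 14336, so the buffered path is asked only for the uncovered tail.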
In article <100480000.1068841761@flay>,
Martin J. Bligh <[email protected]> wrote:
| >> > - Several ext2 and ext3 allocator fixes. These need serious testing on
| >> > big SMP.
| >>
| >> OK, ext3 survived a swatting on the 16-way as well. It's still slow as
| >> snot, but it does work ;-) No changes from before, methinks.
| >>
| >> Diffprofile for kernbench (-j) from ext2 to ext3 on mm3
| >>
| >> 27022 16.3% total
| >> 24069 53.3% default_idle
| >> 583 2.4% page_remove_rmap
| >> 539 248.4% fd_install
| >> 478 388.6% __blk_queue_bounce
| >
| > What driver are you using ? Why are you bouncing ?
|
| qlogicisp. Because the driver is crap? ;-)
The question is, does that make your testing better or worse in terms of
checking the new code? Clearly you have done a good job of checking the
"disk can't keep up" case; is there a need to test further with a much
higher transaction rate?
I would assume that if there were lock issues they would have shown up,
which is probably all that's needed.
--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.
On Sat, 15 Nov 2003, Zwane Mwaikambo wrote:
> The 4G/4G page fault handling path doesn't appear to handle faults
> happening whilst in vm86. In vm86, regs->xcs != __USER_CS, so it confused
> the in-kernel test.
>
> However i'm still debugging the X11 triple fault in test9-mm3
I've managed to `fix` the triple fault (see further below for the patch
in all its glory). Unfortunately I have been unable to come up with a
simpler workaround that uses fewer instructions and is easier to debug.
I have tried the following:
mb()/barrier()
flush_tlb_all()
wbinvd()
outb(0x80,0x00)
local_irq_save(flags); local_irq_enable(); loop(); local_irq_restore(flags);
long_loop()
What I do know is that in the following code:
__asm__ __volatile__(
"xorl %%eax,%%eax; movl %%eax,%%fs; movl %%eax,%%gs\n\t"
"movl %0,%%esp\n\t"
"movl %1,%%ebp\n\t"
"jmp resume_userspace"
: /* no outputs */
:"r" (&info->regs), "r" (tsk->thread_info) : "ax");
execution does reach resume_userspace, since putting a $0 into %ebp
causes an oops in __switch_to.
And here is the current 'workaround'. Any hints?
Index: arch/i386/kernel/vm86.c
===================================================================
RCS file: /build/cvsroot/linux-2.6.0-test9-mm3/arch/i386/kernel/vm86.c,v
retrieving revision 1.1.1.1
diff -u -p -B -r1.1.1.1 vm86.c
--- arch/i386/kernel/vm86.c 13 Nov 2003 08:07:17 -0000 1.1.1.1
+++ arch/i386/kernel/vm86.c 17 Nov 2003 21:45:13 -0000
@@ -312,6 +311,8 @@ static void do_sys_vm86(struct kernel_vm
tsk->thread.screen_bitmap = info->screen_bitmap;
if (info->flags & VM86_SCREEN_BITMAP)
mark_screen_rdonly(tsk);
+
+ printk("ooh la la\n");
__asm__ __volatile__(
"xorl %%eax,%%eax; movl %%eax,%%fs; movl %%eax,%%gs\n\t"
"movl %0,%%esp\n\t"
On Mon, 17 Nov 2003, Zwane Mwaikambo wrote:
>
> I've managed to `fix` the triple fault (see further below for the patch
> in all its glory).
What's the generated assembly language for this function with and without
the "fix"?
If adding that printk fixes a triple fault, the issue is not likely to be
the printk itself as much as the difference in code that the compiler
generates - stack frame, memory re-ordering etc...
Linus
On Mon, 17 Nov 2003, Linus Torvalds wrote:
> What's the generated assembly language for this function with and without
> the "fix"?
>
> If adding that printk fixes a triple fault, the issue is not likely to be
> the printk itself as much as the difference in code that the compiler
> generates - stack frame, memory re-ordering etc...
This would be my 'trusty' gcc 3.2.2 from RedHat 9
(gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5))
With the fix:
0x0210e860 <do_sys_vm86+0>: push %edi
0x0210e861 <do_sys_vm86+1>: mov $0xffffe000,%eax
0x0210e866 <do_sys_vm86+6>: push %esi
0x0210e867 <do_sys_vm86+7>: and %esp,%eax
0x0210e869 <do_sys_vm86+9>: push %ebx
0x0210e86a <do_sys_vm86+10>: mov 0x10(%esp,1),%edi
0x0210e86e <do_sys_vm86+14>: mov 0x14(%esp,1),%esi
0x0210e872 <do_sys_vm86+18>: movl $0x0,0x1c(%edi)
0x0210e879 <do_sys_vm86+25>: movl $0x0,0x20(%edi)
0x0210e880 <do_sys_vm86+32>: mov (%eax),%edx
0x0210e882 <do_sys_vm86+34>: mov 0x30(%edi),%eax
0x0210e885 <do_sys_vm86+37>: mov %eax,0x5b8(%edx)
0x0210e88b <do_sys_vm86+43>: mov 0x30(%edi),%edx
0x0210e88e <do_sys_vm86+46>: mov 0xbc(%edi),%eax
0x0210e894 <do_sys_vm86+52>: and $0xdd5,%edx
0x0210e89a <do_sys_vm86+58>: mov %edx,0x30(%edi)
0x0210e89d <do_sys_vm86+61>: mov 0x30(%eax),%eax
0x0210e8a0 <do_sys_vm86+64>: and $0xfffff22a,%eax
0x0210e8a5 <do_sys_vm86+69>: or %eax,%edx
0x0210e8a7 <do_sys_vm86+71>: mov 0x54(%edi),%eax
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
0x0210e8b6 <do_sys_vm86+86>: je 0x210e9f0 <do_sys_vm86+400>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
0x0210e8bf <do_sys_vm86+95>: ja 0x210e9d5 <do_sys_vm86+373>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
0x0210e8c8 <do_sys_vm86+104>: je 0x210e9c6 <do_sys_vm86+358>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
0x0210e8e5 <do_sys_vm86+133>: mov 0x360(%esi),%eax
0x0210e8eb <do_sys_vm86+139>: mov %eax,0x5c0(%esi)
0x0210e8f1 <do_sys_vm86+145>: movl %fs,0x5c4(%esi)
0x0210e8f7 <do_sys_vm86+151>: movl %gs,0x5c8(%esi)
0x0210e8fd <do_sys_vm86+157>: mov $0xffffe000,%ebx
0x0210e902 <do_sys_vm86+162>: and %esp,%ebx
0x0210e904 <do_sys_vm86+164>: mov 0x14(%ebx),%eax
0x0210e907 <do_sys_vm86+167>: inc %eax
0x0210e908 <do_sys_vm86+168>: mov %eax,0x14(%ebx)
0x0210e90b <do_sys_vm86+171>: mov 0x10(%ebx),%eax
0x0210e90e <do_sys_vm86+174>: mov 0x4(%esi),%edx
0x0210e911 <do_sys_vm86+177>: shl $0x9,%eax
0x0210e914 <do_sys_vm86+180>: lea 0x26ff000(%eax),%ecx
0x0210e91a <do_sys_vm86+186>: lea 0x4c(%edi),%eax
0x0210e91d <do_sys_vm86+189>: mov %eax,0x360(%esi)
0x0210e923 <do_sys_vm86+195>: sub 0x1c(%edx),%eax
0x0210e926 <do_sys_vm86+198>: add 0x20(%edx),%eax
0x0210e929 <do_sys_vm86+201>: mov %eax,0x4(%ecx)
0x0210e92c <do_sys_vm86+204>: mov 0x25fe52c,%eax
0x0210e931 <do_sys_vm86+209>: test $0x800,%eax
0x0210e936 <do_sys_vm86+214>: je 0x210e942 <do_sys_vm86+226>
0x0210e938 <do_sys_vm86+216>: movl $0x0,0x364(%esi)
0x0210e942 <do_sys_vm86+226>: lea 0x340(%esi),%edx
0x0210e948 <do_sys_vm86+232>: mov 0x20(%edx),%eax
0x0210e94b <do_sys_vm86+235>: mov %eax,0x4(%ecx)
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
0x0210e95a <do_sys_vm86+250>: jne 0x210e9b0 <do_sys_vm86+336>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
0x0210e969 <do_sys_vm86+265>: jne 0x210e9a9 <do_sys_vm86+329>
0x0210e96b <do_sys_vm86+267>: push $0x255f121
0x0210e970 <do_sys_vm86+272>: call 0x21285a0 <printk>
0x0210e975 <do_sys_vm86+277>: mov 0x50(%edi),%eax
0x0210e978 <do_sys_vm86+280>: mov %eax,0x5b4(%esi)
0x0210e97e <do_sys_vm86+286>: pop %eax
0x0210e97f <do_sys_vm86+287>: testb $0x1,0x4c(%edi)
0x0210e983 <do_sys_vm86+291>: jne 0x210e9a0 <do_sys_vm86+320>
0x0210e985 <do_sys_vm86+293>: mov 0x4(%esi),%edx
0x0210e988 <do_sys_vm86+296>: xor %eax,%eax
0x0210e98a <do_sys_vm86+298>: mov %eax,%fs
0x0210e98c <do_sys_vm86+300>: mov %eax,%gs
0x0210e98e <do_sys_vm86+302>: mov %edi,%esp
0x0210e990 <do_sys_vm86+304>: mov %edx,%ebp
0x0210e992 <do_sys_vm86+306>: jmp 0xfffeb100 <resume_userspace>
0x0210e997 <do_sys_vm86+311>: pop %ebx
0x0210e998 <do_sys_vm86+312>: pop %esi
0x0210e999 <do_sys_vm86+313>: pop %edi
0x0210e99a <do_sys_vm86+314>: ret
0x0210e99b <do_sys_vm86+315>: nop
0x0210e99c <do_sys_vm86+316>: lea 0x0(%esi,1),%esi
0x0210e9a0 <do_sys_vm86+320>: push %esi
0x0210e9a1 <do_sys_vm86+321>: call 0x210e5b0 <mark_screen_rdonly>
0x0210e9a6 <do_sys_vm86+326>: pop %eax
0x0210e9a7 <do_sys_vm86+327>: jmp 0x210e985 <do_sys_vm86+293>
0x0210e9a9 <do_sys_vm86+329>: call 0x21222d0 <preempt_schedule>
0x0210e9ae <do_sys_vm86+334>: jmp 0x210e96b <do_sys_vm86+267>
0x0210e9b0 <do_sys_vm86+336>: mov 0x24(%edx),%ax
0x0210e9b4 <do_sys_vm86+340>: mov %ax,0x10(%ecx)
0x0210e9b8 <do_sys_vm86+344>: mov $0x174,%ecx
0x0210e9bd <do_sys_vm86+349>: mov 0x24(%edx),%eax
0x0210e9c0 <do_sys_vm86+352>: xor %edx,%edx
0x0210e9c2 <do_sys_vm86+354>: wrmsr
0x0210e9c4 <do_sys_vm86+356>: jmp 0x210e95c <do_sys_vm86+252>
0x0210e9c6 <do_sys_vm86+358>: movl $0x0,0x5bc(%esi)
0x0210e9d0 <do_sys_vm86+368>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9d5 <do_sys_vm86+373>: cmp $0x4,%eax
0x0210e9d8 <do_sys_vm86+376>: jne 0x210e8ce <do_sys_vm86+110>
0x0210e9de <do_sys_vm86+382>: movl $0x47000,0x5bc(%esi)
0x0210e9e8 <do_sys_vm86+392>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9ed <do_sys_vm86+397>: lea 0x0(%esi),%esi
0x0210e9f0 <do_sys_vm86+400>: movl $0x7000,0x5bc(%esi)
0x0210e9fa <do_sys_vm86+410>: jmp 0x210e8d8 <do_sys_vm86+120>
Without the fix:
0x0210e860 <do_sys_vm86+0>: push %edi
0x0210e861 <do_sys_vm86+1>: mov $0xffffe000,%eax
0x0210e866 <do_sys_vm86+6>: push %esi
0x0210e867 <do_sys_vm86+7>: and %esp,%eax
0x0210e869 <do_sys_vm86+9>: push %ebx
0x0210e86a <do_sys_vm86+10>: mov 0x10(%esp,1),%edi
0x0210e86e <do_sys_vm86+14>: mov 0x14(%esp,1),%esi
0x0210e872 <do_sys_vm86+18>: movl $0x0,0x1c(%edi)
0x0210e879 <do_sys_vm86+25>: movl $0x0,0x20(%edi)
0x0210e880 <do_sys_vm86+32>: mov (%eax),%edx
0x0210e882 <do_sys_vm86+34>: mov 0x30(%edi),%eax
0x0210e885 <do_sys_vm86+37>: mov %eax,0x5b8(%edx)
0x0210e88b <do_sys_vm86+43>: mov 0x30(%edi),%edx
0x0210e88e <do_sys_vm86+46>: mov 0xbc(%edi),%eax
0x0210e894 <do_sys_vm86+52>: and $0xdd5,%edx
0x0210e89a <do_sys_vm86+58>: mov %edx,0x30(%edi)
0x0210e89d <do_sys_vm86+61>: mov 0x30(%eax),%eax
0x0210e8a0 <do_sys_vm86+64>: and $0xfffff22a,%eax
0x0210e8a5 <do_sys_vm86+69>: or %eax,%edx
0x0210e8a7 <do_sys_vm86+71>: mov 0x54(%edi),%eax
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
0x0210e8b6 <do_sys_vm86+86>: je 0x210e9e0 <do_sys_vm86+384>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
0x0210e8bf <do_sys_vm86+95>: ja 0x210e9c5 <do_sys_vm86+357>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
0x0210e8c8 <do_sys_vm86+104>: je 0x210e9b6 <do_sys_vm86+342>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
0x0210e8e5 <do_sys_vm86+133>: mov 0x360(%esi),%eax
0x0210e8eb <do_sys_vm86+139>: mov %eax,0x5c0(%esi)
0x0210e8f1 <do_sys_vm86+145>: movl %fs,0x5c4(%esi)
0x0210e8f7 <do_sys_vm86+151>: movl %gs,0x5c8(%esi)
0x0210e8fd <do_sys_vm86+157>: mov $0xffffe000,%ebx
0x0210e902 <do_sys_vm86+162>: and %esp,%ebx
0x0210e904 <do_sys_vm86+164>: mov 0x14(%ebx),%eax
0x0210e907 <do_sys_vm86+167>: inc %eax
0x0210e908 <do_sys_vm86+168>: mov %eax,0x14(%ebx)
0x0210e90b <do_sys_vm86+171>: mov 0x10(%ebx),%eax
0x0210e90e <do_sys_vm86+174>: mov 0x4(%esi),%edx
0x0210e911 <do_sys_vm86+177>: shl $0x9,%eax
0x0210e914 <do_sys_vm86+180>: lea 0x26ff000(%eax),%ecx
0x0210e91a <do_sys_vm86+186>: lea 0x4c(%edi),%eax
0x0210e91d <do_sys_vm86+189>: mov %eax,0x360(%esi)
0x0210e923 <do_sys_vm86+195>: sub 0x1c(%edx),%eax
0x0210e926 <do_sys_vm86+198>: add 0x20(%edx),%eax
0x0210e929 <do_sys_vm86+201>: mov %eax,0x4(%ecx)
0x0210e92c <do_sys_vm86+204>: mov 0x25fe52c,%eax
0x0210e931 <do_sys_vm86+209>: test $0x800,%eax
0x0210e936 <do_sys_vm86+214>: je 0x210e942 <do_sys_vm86+226>
0x0210e938 <do_sys_vm86+216>: movl $0x0,0x364(%esi)
0x0210e942 <do_sys_vm86+226>: lea 0x340(%esi),%edx
0x0210e948 <do_sys_vm86+232>: mov 0x20(%edx),%eax
0x0210e94b <do_sys_vm86+235>: mov %eax,0x4(%ecx)
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
0x0210e95a <do_sys_vm86+250>: jne 0x210e9a0 <do_sys_vm86+320>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
0x0210e969 <do_sys_vm86+265>: jne 0x210e999 <do_sys_vm86+313>
0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax
0x0210e96e <do_sys_vm86+270>: mov %eax,0x5b4(%esi)
0x0210e974 <do_sys_vm86+276>: testb $0x1,0x4c(%edi)
0x0210e978 <do_sys_vm86+280>: jne 0x210e990 <do_sys_vm86+304>
0x0210e97a <do_sys_vm86+282>: mov 0x4(%esi),%edx
0x0210e97d <do_sys_vm86+285>: xor %eax,%eax
0x0210e97f <do_sys_vm86+287>: mov %eax,%fs
0x0210e981 <do_sys_vm86+289>: mov %eax,%gs
0x0210e983 <do_sys_vm86+291>: mov %edi,%esp
0x0210e985 <do_sys_vm86+293>: mov %edx,%ebp
0x0210e987 <do_sys_vm86+295>: jmp 0xfffeb100 <resume_userspace>
0x0210e98c <do_sys_vm86+300>: pop %ebx
0x0210e98d <do_sys_vm86+301>: pop %esi
0x0210e98e <do_sys_vm86+302>: pop %edi
0x0210e98f <do_sys_vm86+303>: ret
0x0210e990 <do_sys_vm86+304>: push %esi
0x0210e991 <do_sys_vm86+305>: call 0x210e5b0 <mark_screen_rdonly>
0x0210e996 <do_sys_vm86+310>: pop %eax
0x0210e997 <do_sys_vm86+311>: jmp 0x210e97a <do_sys_vm86+282>
0x0210e999 <do_sys_vm86+313>: call 0x21222c0 <preempt_schedule>
0x0210e99e <do_sys_vm86+318>: jmp 0x210e96b <do_sys_vm86+267>
0x0210e9a0 <do_sys_vm86+320>: mov 0x24(%edx),%ax
0x0210e9a4 <do_sys_vm86+324>: mov %ax,0x10(%ecx)
0x0210e9a8 <do_sys_vm86+328>: mov $0x174,%ecx
0x0210e9ad <do_sys_vm86+333>: mov 0x24(%edx),%eax
0x0210e9b0 <do_sys_vm86+336>: xor %edx,%edx
0x0210e9b2 <do_sys_vm86+338>: wrmsr
0x0210e9b4 <do_sys_vm86+340>: jmp 0x210e95c <do_sys_vm86+252>
0x0210e9b6 <do_sys_vm86+342>: movl $0x0,0x5bc(%esi)
0x0210e9c0 <do_sys_vm86+352>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9c5 <do_sys_vm86+357>: cmp $0x4,%eax
0x0210e9c8 <do_sys_vm86+360>: jne 0x210e8ce <do_sys_vm86+110>
0x0210e9ce <do_sys_vm86+366>: movl $0x47000,0x5bc(%esi)
0x0210e9d8 <do_sys_vm86+376>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9dd <do_sys_vm86+381>: lea 0x0(%esi),%esi
0x0210e9e0 <do_sys_vm86+384>: movl $0x7000,0x5bc(%esi)
0x0210e9ea <do_sys_vm86+394>: jmp 0x210e8d8 <do_sys_vm86+120>
On Mon, 17 Nov 2003, Zwane Mwaikambo wrote:
> On Mon, 17 Nov 2003, Linus Torvalds wrote:
>
> > What's the generated assembly language for this function with and without
> > the "fix"?
> >
> > If adding that printk fixes a triple fault, the issue is not likely to be
> > the printk itself as much as the difference in code that the compiler
> > generates - stack frame, memory re-ordering etc...
>
> This would be my 'trusty' gcc 3.2.2 from RedHat 9
> (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)
A little bird told me to send diffs... But there is a lot of noise due to
offsets, I'm afraid.
--- buggy 2003-11-17 18:09:35.302964248 -0500
+++ works 2003-11-17 18:09:47.744072912 -0500
@@ -21,11 +21,11 @@
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
-0x0210e8b6 <do_sys_vm86+86>: je 0x210e9e0 <do_sys_vm86+384>
+0x0210e8b6 <do_sys_vm86+86>: je 0x210e9f0 <do_sys_vm86+400>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
-0x0210e8bf <do_sys_vm86+95>: ja 0x210e9c5 <do_sys_vm86+357>
+0x0210e8bf <do_sys_vm86+95>: ja 0x210e9d5 <do_sys_vm86+373>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
-0x0210e8c8 <do_sys_vm86+104>: je 0x210e9b6 <do_sys_vm86+342>
+0x0210e8c8 <do_sys_vm86+104>: je 0x210e9c6 <do_sys_vm86+358>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
@@ -57,47 +57,52 @@
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
-0x0210e95a <do_sys_vm86+250>: jne 0x210e9a0 <do_sys_vm86+320>
+0x0210e95a <do_sys_vm86+250>: jne 0x210e9b0 <do_sys_vm86+336>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
-0x0210e969 <do_sys_vm86+265>: jne 0x210e999 <do_sys_vm86+313>
-0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax
-0x0210e96e <do_sys_vm86+270>: mov %eax,0x5b4(%esi)
-0x0210e974 <do_sys_vm86+276>: testb $0x1,0x4c(%edi)
-0x0210e978 <do_sys_vm86+280>: jne 0x210e990 <do_sys_vm86+304>
-0x0210e97a <do_sys_vm86+282>: mov 0x4(%esi),%edx
-0x0210e97d <do_sys_vm86+285>: xor %eax,%eax
-0x0210e97f <do_sys_vm86+287>: mov %eax,%fs
-0x0210e981 <do_sys_vm86+289>: mov %eax,%gs
-0x0210e983 <do_sys_vm86+291>: mov %edi,%esp
-0x0210e985 <do_sys_vm86+293>: mov %edx,%ebp
-0x0210e987 <do_sys_vm86+295>: jmp 0xfffeb100 <resume_userspace>
-0x0210e98c <do_sys_vm86+300>: pop %ebx
-0x0210e98d <do_sys_vm86+301>: pop %esi
-0x0210e98e <do_sys_vm86+302>: pop %edi
-0x0210e98f <do_sys_vm86+303>: ret
-0x0210e990 <do_sys_vm86+304>: push %esi
-0x0210e991 <do_sys_vm86+305>: call 0x210e5b0 <mark_screen_rdonly>
-0x0210e996 <do_sys_vm86+310>: pop %eax
-0x0210e997 <do_sys_vm86+311>: jmp 0x210e97a <do_sys_vm86+282>
-0x0210e999 <do_sys_vm86+313>: call 0x21222c0 <preempt_schedule>
-0x0210e99e <do_sys_vm86+318>: jmp 0x210e96b <do_sys_vm86+267>
-0x0210e9a0 <do_sys_vm86+320>: mov 0x24(%edx),%ax
-0x0210e9a4 <do_sys_vm86+324>: mov %ax,0x10(%ecx)
-0x0210e9a8 <do_sys_vm86+328>: mov $0x174,%ecx
-0x0210e9ad <do_sys_vm86+333>: mov 0x24(%edx),%eax
-0x0210e9b0 <do_sys_vm86+336>: xor %edx,%edx
-0x0210e9b2 <do_sys_vm86+338>: wrmsr
-0x0210e9b4 <do_sys_vm86+340>: jmp 0x210e95c <do_sys_vm86+252>
-0x0210e9b6 <do_sys_vm86+342>: movl $0x0,0x5bc(%esi)
-0x0210e9c0 <do_sys_vm86+352>: jmp 0x210e8d8 <do_sys_vm86+120>
-0x0210e9c5 <do_sys_vm86+357>: cmp $0x4,%eax
-0x0210e9c8 <do_sys_vm86+360>: jne 0x210e8ce <do_sys_vm86+110>
-0x0210e9ce <do_sys_vm86+366>: movl $0x47000,0x5bc(%esi)
-0x0210e9d8 <do_sys_vm86+376>: jmp 0x210e8d8 <do_sys_vm86+120>
-0x0210e9dd <do_sys_vm86+381>: lea 0x0(%esi),%esi
-0x0210e9e0 <do_sys_vm86+384>: movl $0x7000,0x5bc(%esi)
-0x0210e9ea <do_sys_vm86+394>: jmp 0x210e8d8 <do_sys_vm86+120>
+0x0210e969 <do_sys_vm86+265>: jne 0x210e9a9 <do_sys_vm86+329>
+0x0210e96b <do_sys_vm86+267>: push $0x255f121
+0x0210e970 <do_sys_vm86+272>: call 0x21285a0 <printk>
+0x0210e975 <do_sys_vm86+277>: mov 0x50(%edi),%eax
+0x0210e978 <do_sys_vm86+280>: mov %eax,0x5b4(%esi)
+0x0210e97e <do_sys_vm86+286>: pop %eax
+0x0210e97f <do_sys_vm86+287>: testb $0x1,0x4c(%edi)
+0x0210e983 <do_sys_vm86+291>: jne 0x210e9a0 <do_sys_vm86+320>
+0x0210e985 <do_sys_vm86+293>: mov 0x4(%esi),%edx
+0x0210e988 <do_sys_vm86+296>: xor %eax,%eax
+0x0210e98a <do_sys_vm86+298>: mov %eax,%fs
+0x0210e98c <do_sys_vm86+300>: mov %eax,%gs
+0x0210e98e <do_sys_vm86+302>: mov %edi,%esp
+0x0210e990 <do_sys_vm86+304>: mov %edx,%ebp
+0x0210e992 <do_sys_vm86+306>: jmp 0xfffeb100 <resume_userspace>
+0x0210e997 <do_sys_vm86+311>: pop %ebx
+0x0210e998 <do_sys_vm86+312>: pop %esi
+0x0210e999 <do_sys_vm86+313>: pop %edi
+0x0210e99a <do_sys_vm86+314>: ret
+0x0210e99b <do_sys_vm86+315>: nop
+0x0210e99c <do_sys_vm86+316>: lea 0x0(%esi,1),%esi
+0x0210e9a0 <do_sys_vm86+320>: push %esi
+0x0210e9a1 <do_sys_vm86+321>: call 0x210e5b0 <mark_screen_rdonly>
+0x0210e9a6 <do_sys_vm86+326>: pop %eax
+0x0210e9a7 <do_sys_vm86+327>: jmp 0x210e985 <do_sys_vm86+293>
+0x0210e9a9 <do_sys_vm86+329>: call 0x21222d0 <preempt_schedule>
+0x0210e9ae <do_sys_vm86+334>: jmp 0x210e96b <do_sys_vm86+267>
+0x0210e9b0 <do_sys_vm86+336>: mov 0x24(%edx),%ax
+0x0210e9b4 <do_sys_vm86+340>: mov %ax,0x10(%ecx)
+0x0210e9b8 <do_sys_vm86+344>: mov $0x174,%ecx
+0x0210e9bd <do_sys_vm86+349>: mov 0x24(%edx),%eax
+0x0210e9c0 <do_sys_vm86+352>: xor %edx,%edx
+0x0210e9c2 <do_sys_vm86+354>: wrmsr
+0x0210e9c4 <do_sys_vm86+356>: jmp 0x210e95c <do_sys_vm86+252>
+0x0210e9c6 <do_sys_vm86+358>: movl $0x0,0x5bc(%esi)
+0x0210e9d0 <do_sys_vm86+368>: jmp 0x210e8d8 <do_sys_vm86+120>
+0x0210e9d5 <do_sys_vm86+373>: cmp $0x4,%eax
+0x0210e9d8 <do_sys_vm86+376>: jne 0x210e8ce <do_sys_vm86+110>
+0x0210e9de <do_sys_vm86+382>: movl $0x47000,0x5bc(%esi)
+0x0210e9e8 <do_sys_vm86+392>: jmp 0x210e8d8 <do_sys_vm86+120>
+0x0210e9ed <do_sys_vm86+397>: lea 0x0(%esi),%esi
+0x0210e9f0 <do_sys_vm86+400>: movl $0x7000,0x5bc(%esi)
+0x0210e9fa <do_sys_vm86+410>: jmp 0x210e8d8 <do_sys_vm86+120>
Suparna,
Good news and bad news. Your patch does fix the non-power-of-two i/o
size problems where AIO previously did not complete:
$ ./aiodio_sparse -s 1751k -r 18k -w 11k
$ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
io_submit() return 9
aiodio_sparse: 9 i/o in flight
aiodio_sparse: offset 165888 filesize 184320 inflight 9
aiodio_sparse: io_getevent() returned 1
aiodio_sparse: io_getevent() res 18432 res2 0
io_submit() return 1
AIO DIO write done unlinking file
dio_sparse done writing, kill children
aiodio_sparse 0 children had errors
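The figures in that output are self-consistent. A small sketch of the arithmetic, assuming that "k" means 1024 bytes and that the test writes the file sequentially in -w sized chunks (both are assumptions about aiodio_sparse, not taken from its source; the helper names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

#define KIB ((size_t)1024)

/* Number of writes of write_size bytes needed to cover file_size bytes. */
static size_t writes_needed(size_t file_size, size_t write_size)
{
    assert(write_size > 0);
    return (file_size + write_size - 1) / write_size;
}

/* Byte offset of the n-th write, counting from zero. */
static size_t write_offset(size_t n, size_t write_size)
{
    return n * write_size;
}
```

With -s 180k and -w 18k this gives a file size of 184320 bytes covered by ten writes of 18432 bytes each. Once the first nine are submitted, the next offset is 9 * 18432 = 165888, matching the "offset 165888 filesize 184320 inflight 9" line, and each completion reports res 18432.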
But when testing with aiocp, using O_DIRECT to copy a file to
an already allocated file, the aiocp process hangs. I used an i/o
size of 4k and that completed. Using i/o sizes of 1k and 2k,
the aiocp processes hung during io_submit() and are unkillable.
Here are the stack traces:
# ps -fu daniel | grep aiocp
daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
aiocp D 00000001 1920 1 1902 (NOTLB)
e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660 c0289a16
f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712 e70aa000
Call Trace:
[<c02897fc>] generic_unplug_device+0x50/0xbd
[<c0289a16>] blk_run_queues+0xa9/0x15c
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
[<c0121b70>] schedule+0x3ac/0x7ef
[<c0145f48>] generic_file_aio_read+0x33/0x37
[<c0194ad3>] aio_pread+0x34/0x5f
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194a9f>] aio_pread+0x0/0x5f
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb
[<c03c007b>] rpc_depopulate+0x1aa/0x24b
aiocp D 366EDC94 2083 2037 (NOTLB)
e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
Call Trace:
[<c02897fc>] generic_unplug_device+0x50/0xbd
[<c0289a16>] blk_run_queues+0xa9/0x15c
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
[<c0259d3e>] write_chan+0x165/0x21e
[<c0145f48>] generic_file_aio_read+0x33/0x37
[<c0194ad3>] aio_pread+0x34/0x5f
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194a9f>] aio_pread+0x0/0x5f
[<c02532ab>] tty_write+0x1e8/0x3b2
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb
[<c03c007b>] rpc_depopulate+0x1aa/0x24b
Daniel
On Sun, 2003-11-16 at 21:25, Suparna Bhattacharya wrote:
> On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> > Andrew,
> >
> > I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> > I tested using the test programs aiocp and aiodio_sparse.
> > (see http://developer.osdl.org/daniel/AIO/)
> >
> > Using aiocp with i/o sizes from 1k to 512k to copy files worked
> > without any errors or kernel debug messages.
> >
> > With 64k i/o, the aiodio_sparse program completes without any errors.
> > There are no kernel error messages, so that is good.
> >
> > There are still problems with non power of 2 i/o sizes using AIO and
> > O_DIRECT. It hangs with aio's that do not seem to complete. The test
> > does exit when hitting ^c and there are no kernel messages. Test output
> > below:
>
> Could you check if the following patch fixes the problem for you ?
>
> Regards
> Suparna
>
> --------------------------------------------------------------
>
> With this patch, when the DIO code falls back to buffered i/o after
> having submitted part of the i/o, then buffered i/o is issued only
> for the remaining part of the request (i.e. the part not already
> covered by DIO).
>
> diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
> --- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
> +++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
> @@ -74,6 +74,7 @@
> been performed at the start of a
> write */
> int pages_in_io; /* approximate total IO pages */
> + size_t size; /* total request size (doesn't change)*/
> sector_t block_in_file; /* Current offset into the underlying
> file in dio_block units. */
> unsigned blocks_available; /* At block_in_file. changes */
> @@ -226,7 +227,7 @@
> dio_complete(dio, dio->block_in_file << dio->blkbits,
> dio->result);
> /* Complete AIO later if falling back to buffered i/o */
> - if (dio->result != -ENOTBLK) {
> + if (dio->result >= dio->size || dio->rw == READ) {
> aio_complete(dio->iocb, dio->result, 0);
> kfree(dio);
> } else {
> @@ -889,6 +890,7 @@
> dio->blkbits = blkbits;
> dio->blkfactor = inode->i_blkbits - blkbits;
> dio->start_zero_done = 0;
> + dio->size = 0;
> dio->block_in_file = offset >> blkbits;
> dio->blocks_available = 0;
> dio->cur_page = NULL;
> @@ -925,7 +927,7 @@
>
> for (seg = 0; seg < nr_segs; seg++) {
> user_addr = (unsigned long)iov[seg].iov_base;
> - bytes = iov[seg].iov_len;
> + dio->size += bytes = iov[seg].iov_len;
>
> /* Index into the first page of the first block */
> dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> @@ -956,6 +958,13 @@
> }
> } /* end iovec loop */
>
> + if (ret == -ENOTBLK && rw == WRITE) {
> + /*
> + * The remaining part of the request will be
> + * be handled by buffered I/O when we return
> + */
> + ret = 0;
> + }
> /*
> * There may be some unwritten disk at the end of a part-written
> * fs-block-sized block. Go zero that now.
> @@ -986,19 +995,13 @@
> */
> if (dio->is_async) {
> if (ret == 0)
> - ret = dio->result; /* Bytes written */
> - if (ret == -ENOTBLK) {
> - /*
> - * The request will be reissued via buffered I/O
> - * when we return; Any I/O already issued
> - * effectively becomes redundant.
> - */
> - dio->result = ret;
> + ret = dio->result;
> + if (ret > 0 && dio->result < dio->size && rw == WRITE) {
> dio->waiter = current;
> }
> finished_one_bio(dio); /* This can free the dio */
> blk_run_queues();
> - if (ret == -ENOTBLK) {
> + if (dio->waiter) {
> /*
> * Wait for already issued I/O to drain out and
> * release its references to user-space pages
> @@ -1032,7 +1035,8 @@
> }
> dio_complete(dio, offset, ret);
> /* We could have also come here on an AIO file extend */
> - if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
> + if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
> + dio->result < dio->size))
> aio_complete(iocb, ret, 0);
> kfree(dio);
> }
> diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
> --- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
> +++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
> @@ -1895,14 +1895,16 @@
> */
> if (written >= 0 && file->f_flags & O_SYNC)
> status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> - if (written >= 0 && !is_sync_kiocb(iocb))
> + if (written >= count && !is_sync_kiocb(iocb))
> written = -EIOCBQUEUED;
> - if (written != -ENOTBLK)
> + if (written < 0 || written >= count)
> goto out_status;
> /*
> * direct-io write to a hole: fall through to buffered I/O
> + * for completing the rest of the request.
> */
> - written = 0;
> + pos += written;
> + count -= written;
> }
>
> buf = iov->iov_base;
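
A minimal userspace sketch of the bookkeeping in the mm/filemap.c hunk above, assuming a simplified model. The names fallback_range and plan_fallback are hypothetical and not kernel API. The sketch only illustrates how a short direct-io write leaves a remainder, starting at pos + written, for buffered I/O to finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these names are illustrative only. */
struct fallback_range {
    bool needed;      /* true if buffered I/O must finish the request */
    long long pos;    /* file offset where buffered I/O would resume */
    size_t count;     /* bytes left for buffered I/O */
};

/*
 * pos/count describe the original request; written is what direct I/O
 * reported (a negative value stands for an error code).
 */
static struct fallback_range plan_fallback(long long pos, size_t count,
                                           long long written)
{
    struct fallback_range r = { false, pos, 0 };

    assert(pos >= 0);

    /* Error, or direct I/O covered the whole request: no fallback. */
    if (written < 0 || (unsigned long long)written >= count)
        return r;

    /* Mirrors the patch: pos += written; count -= written; */
    r.needed = true;
    r.pos = pos + written;
    r.count = count - (size_t)written;
    return r;
}
```

For example, an 18k request at offset 4096 of which direct I/O wrote 8k would leave 10240 bytes, starting at offset 12288, for the buffered path.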
Obviously, the ps output in my previous email showed that the hangs were
with 1k i/o sizes.
More testing using 2k, 4k, 16k, 32k, 64k, 128k, 256k and 512k all
completed correctly.
Even 11k and 17k worked.
$ ls -l
-rw------- 1 daniel daniel 88289280 Jun 9 16:54 glibc-2.3.2.tar
-rw-rw-r-- 1 daniel daniel 88289280 Nov 17 17:32 ff2
So, only 1k is hanging so far.
Daniel
On Mon, 2003-11-17 at 17:15, Daniel McNeil wrote:
> Suparna,
>
> Good news and bad news. Your patch does fix the non-power of two i/o
> size problems where AIO previously did not complete:
>
> $ ./aiodio_sparse -s 1751k -r 18k -w 11k
> $ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
> io_submit() return 9
> aiodio_sparse: 9 i/o in flight
> aiodio_sparse: offset 165888 filesize 184320 inflight 9
> aiodio_sparse: io_getevent() returned 1
> aiodio_sparse: io_getevent() res 18432 res2 0
> io_submit() return 1
> AIO DIO write done unlinking file
> dio_sparse done writing, kill children
> aiodio_sparse 0 children had errors
>
> But when using aiocp with O_DIRECT to copy a file to
> an already allocated file, the aiocp process hangs. With an i/o
> size of 4k the copy completed. With i/o sizes of 1k and 2k,
> the aiocp processes hung during io_submit() and are unkillable.
> Here are the stack traces:
>
> # ps -fu daniel | grep aiocp
> daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
> daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
>
>
> aiocp D 00000001 1920 1 1902 (NOTLB)
> e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
> f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660 c0289a16
> f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712 e70aa000
> Call Trace:
> [<c02897fc>] generic_unplug_device+0x50/0xbd
> [<c0289a16>] blk_run_queues+0xa9/0x15c
> [<c0123712>] io_schedule+0x26/0x30
> [<c0192242>] direct_io_worker+0x376/0x5ab
> [<c014840f>] generic_file_direct_IO+0x70/0x89
> [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c014840f>] generic_file_direct_IO+0x70/0x89
> [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> [<c0121b70>] schedule+0x3ac/0x7ef
> [<c0145f48>] generic_file_aio_read+0x33/0x37
> [<c0194ad3>] aio_pread+0x34/0x5f
> [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> [<c019316f>] __aio_get_req+0x27/0x158
> [<c0194a9f>] aio_pread+0x0/0x5f
> [<c0194f62>] io_submit_one+0x1ea/0x2b7
> [<c0195110>] sys_io_submit+0xe1/0x194
> [<c03c29a7>] syscall_call+0x7/0xb
> [<c03c007b>] rpc_depopulate+0x1aa/0x24b
>
>
> aiocp D 366EDC94 2083 2037 (NOTLB)
> e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
> 00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
> f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
> Call Trace:
> [<c02897fc>] generic_unplug_device+0x50/0xbd
> [<c0289a16>] blk_run_queues+0xa9/0x15c
> [<c0123712>] io_schedule+0x26/0x30
> [<c0192242>] direct_io_worker+0x376/0x5ab
> [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> [<c014840f>] generic_file_direct_IO+0x70/0x89
> [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> [<c0259d3e>] write_chan+0x165/0x21e
> [<c0145f48>] generic_file_aio_read+0x33/0x37
> [<c0194ad3>] aio_pread+0x34/0x5f
> [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> [<c019316f>] __aio_get_req+0x27/0x158
> [<c0194a9f>] aio_pread+0x0/0x5f
> [<c02532ab>] tty_write+0x1e8/0x3b2
> [<c0194f62>] io_submit_one+0x1ea/0x2b7
> [<c0195110>] sys_io_submit+0xe1/0x194
> [<c03c29a7>] syscall_call+0x7/0xb
> [<c03c007b>] rpc_depopulate+0x1aa/0x24b
>
>
>
> Daniel
>
> On Sun, 2003-11-16 at 21:25, Suparna Bhattacharya wrote:
> > On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> > > Andrew,
> > >
> > > I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> > > I tested using the test programs aiocp and aiodio_sparse.
> > > (see http://developer.osdl.org/daniel/AIO/)
> > >
> > > Using aiocp with i/o sizes from 1k to 512k to copy files worked
> > > without any errors or kernel debug messages.
> > >
> > > With 64k i/o, the aiodio_sparse program completes without any errors.
> > > There are no kernel error messages, so that is good.
> > >
> > > There are still problems with non power of 2 i/o sizes using AIO and
> > > O_DIRECT. It hangs with aio's that do not seem to complete. The test
> > > does exit when hitting ^c and there are no kernel messages. Test output
> > > below:
> >
> > Could you check if the following patch fixes the problem for you ?
> >
> > Regards
> > Suparna
> >
> > --------------------------------------------------------------
> >
> > With this patch, when the DIO code falls back to buffered i/o after
> > having submitted part of the i/o, then buffered i/o is issued only
> > for the remaining part of the request (i.e. the part not already
> > covered by DIO).
> >
> > diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
> > --- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
> > +++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
> > @@ -74,6 +74,7 @@
> > been performed at the start of a
> > write */
> > int pages_in_io; /* approximate total IO pages */
> > + size_t size; /* total request size (doesn't change)*/
> > sector_t block_in_file; /* Current offset into the underlying
> > file in dio_block units. */
> > unsigned blocks_available; /* At block_in_file. changes */
> > @@ -226,7 +227,7 @@
> > dio_complete(dio, dio->block_in_file << dio->blkbits,
> > dio->result);
> > /* Complete AIO later if falling back to buffered i/o */
> > - if (dio->result != -ENOTBLK) {
> > + if (dio->result >= dio->size || dio->rw == READ) {
> > aio_complete(dio->iocb, dio->result, 0);
> > kfree(dio);
> > } else {
> > @@ -889,6 +890,7 @@
> > dio->blkbits = blkbits;
> > dio->blkfactor = inode->i_blkbits - blkbits;
> > dio->start_zero_done = 0;
> > + dio->size = 0;
> > dio->block_in_file = offset >> blkbits;
> > dio->blocks_available = 0;
> > dio->cur_page = NULL;
> > @@ -925,7 +927,7 @@
> >
> > for (seg = 0; seg < nr_segs; seg++) {
> > user_addr = (unsigned long)iov[seg].iov_base;
> > - bytes = iov[seg].iov_len;
> > + dio->size += bytes = iov[seg].iov_len;
> >
> > /* Index into the first page of the first block */
> > dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> > @@ -956,6 +958,13 @@
> > }
> > } /* end iovec loop */
> >
> > + if (ret == -ENOTBLK && rw == WRITE) {
> > + /*
> > + * The remaining part of the request will be
> > + * be handled by buffered I/O when we return
> > + */
> > + ret = 0;
> > + }
> > /*
> > * There may be some unwritten disk at the end of a part-written
> > * fs-block-sized block. Go zero that now.
> > @@ -986,19 +995,13 @@
> > */
> > if (dio->is_async) {
> > if (ret == 0)
> > - ret = dio->result; /* Bytes written */
> > - if (ret == -ENOTBLK) {
> > - /*
> > - * The request will be reissued via buffered I/O
> > - * when we return; Any I/O already issued
> > - * effectively becomes redundant.
> > - */
> > - dio->result = ret;
> > + ret = dio->result;
> > + if (ret > 0 && dio->result < dio->size && rw == WRITE) {
> > dio->waiter = current;
> > }
> > finished_one_bio(dio); /* This can free the dio */
> > blk_run_queues();
> > - if (ret == -ENOTBLK) {
> > + if (dio->waiter) {
> > /*
> > * Wait for already issued I/O to drain out and
> > * release its references to user-space pages
> > @@ -1032,7 +1035,8 @@
> > }
> > dio_complete(dio, offset, ret);
> > /* We could have also come here on an AIO file extend */
> > - if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
> > + if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
> > + dio->result < dio->size))
> > aio_complete(iocb, ret, 0);
> > kfree(dio);
> > }
> > diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
> > --- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
> > +++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
> > @@ -1895,14 +1895,16 @@
> > */
> > if (written >= 0 && file->f_flags & O_SYNC)
> > status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> > - if (written >= 0 && !is_sync_kiocb(iocb))
> > + if (written >= count && !is_sync_kiocb(iocb))
> > written = -EIOCBQUEUED;
> > - if (written != -ENOTBLK)
> > + if (written < 0 || written >= count)
> > goto out_status;
> > /*
> > * direct-io write to a hole: fall through to buffered I/O
> > + * for completing the rest of the request.
> > */
> > - written = 0;
> > + pos += written;
> > + count -= written;
> > }
> >
> > buf = iov->iov_base;
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-aio' in
> the body to [email protected]. For more info on Linux AIO,
> see: http://www.kvack.org/aio/
> Don't email: [email protected]
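
The completion rule in the fs/direct-io.c hunks quoted above can be modelled in userspace as follows. This is only a sketch under stated assumptions: dio_dir, seg, total_request_size and defer_aio_completion are hypothetical names, and the real code works on struct dio and the kiocb. The predicate follows the final hunk of the patch, where AIO completion is held back for a write that returned a non-negative byte count smaller than the total request size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's READ/WRITE and struct iovec. */
enum dio_dir { DIO_READ, DIO_WRITE };
struct seg { size_t len; };

/* Sum of all segment lengths, like dio->size += iov[seg].iov_len. */
static size_t total_request_size(const struct seg *segs, size_t nr_segs)
{
    size_t size = 0;

    assert(segs != NULL || nr_segs == 0);
    for (size_t i = 0; i < nr_segs; i++)
        size += segs[i].len;
    return size;
}

/*
 * True when completion should wait for the buffered fallback:
 * a write that succeeded so far but covered less than the request.
 * Reads and errors (negative ret) are completed immediately.
 */
static bool defer_aio_completion(enum dio_dir rw, long long ret, size_t size)
{
    return rw == DIO_WRITE && ret >= 0 && (unsigned long long)ret < size;
}
```

With three segments of 1024, 2048 and 512 bytes the request size is 3584. A write that returns 1024 would be deferred, while a write that returns 3584, any read, or any error would be completed right away.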
On Mon, 17 Nov 2003, Zwane Mwaikambo wrote:
> A little bird told me to send diffs... But there is a lot of noise due to
> offsets i'm afraid.
Another note from our avian friends; i seem to have sent a slightly
different dump from the patch, although they do both achieve the same
effect. I shall append it for completeness.
0x0210e860 <do_sys_vm86+0>: push %edi
0x0210e861 <do_sys_vm86+1>: mov $0xffffe000,%eax
0x0210e866 <do_sys_vm86+6>: push %esi
0x0210e867 <do_sys_vm86+7>: and %esp,%eax
0x0210e869 <do_sys_vm86+9>: push %ebx
0x0210e86a <do_sys_vm86+10>: mov 0x10(%esp,1),%edi
0x0210e86e <do_sys_vm86+14>: mov 0x14(%esp,1),%esi
0x0210e872 <do_sys_vm86+18>: movl $0x0,0x1c(%edi)
0x0210e879 <do_sys_vm86+25>: movl $0x0,0x20(%edi)
0x0210e880 <do_sys_vm86+32>: mov (%eax),%edx
0x0210e882 <do_sys_vm86+34>: mov 0x30(%edi),%eax
0x0210e885 <do_sys_vm86+37>: mov %eax,0x5b8(%edx)
0x0210e88b <do_sys_vm86+43>: mov 0x30(%edi),%edx
0x0210e88e <do_sys_vm86+46>: mov 0xbc(%edi),%eax
0x0210e894 <do_sys_vm86+52>: and $0xdd5,%edx
0x0210e89a <do_sys_vm86+58>: mov %edx,0x30(%edi)
0x0210e89d <do_sys_vm86+61>: mov 0x30(%eax),%eax
0x0210e8a0 <do_sys_vm86+64>: and $0xfffff22a,%eax
0x0210e8a5 <do_sys_vm86+69>: or %eax,%edx
0x0210e8a7 <do_sys_vm86+71>: mov 0x54(%edi),%eax
0x0210e8aa <do_sys_vm86+74>: or $0x20000,%edx
0x0210e8b0 <do_sys_vm86+80>: cmp $0x3,%eax
0x0210e8b3 <do_sys_vm86+83>: mov %edx,0x30(%edi)
0x0210e8b6 <do_sys_vm86+86>: je 0x210e9f0 <do_sys_vm86+400>
0x0210e8bc <do_sys_vm86+92>: cmp $0x3,%eax
0x0210e8bf <do_sys_vm86+95>: ja 0x210e9d5 <do_sys_vm86+373>
0x0210e8c5 <do_sys_vm86+101>: cmp $0x2,%eax
0x0210e8c8 <do_sys_vm86+104>: je 0x210e9c6 <do_sys_vm86+358>
0x0210e8ce <do_sys_vm86+110>: movl $0x247000,0x5bc(%esi)
0x0210e8d8 <do_sys_vm86+120>: mov 0xbc(%edi),%eax
0x0210e8de <do_sys_vm86+126>: movl $0x0,0x18(%eax)
0x0210e8e5 <do_sys_vm86+133>: mov 0x360(%esi),%eax
0x0210e8eb <do_sys_vm86+139>: mov %eax,0x5c0(%esi)
0x0210e8f1 <do_sys_vm86+145>: movl %fs,0x5c4(%esi)
0x0210e8f7 <do_sys_vm86+151>: movl %gs,0x5c8(%esi)
0x0210e8fd <do_sys_vm86+157>: mov $0xffffe000,%ebx
0x0210e902 <do_sys_vm86+162>: and %esp,%ebx
0x0210e904 <do_sys_vm86+164>: mov 0x14(%ebx),%eax
0x0210e907 <do_sys_vm86+167>: inc %eax
0x0210e908 <do_sys_vm86+168>: mov %eax,0x14(%ebx)
0x0210e90b <do_sys_vm86+171>: mov 0x10(%ebx),%eax
0x0210e90e <do_sys_vm86+174>: mov 0x4(%esi),%edx
0x0210e911 <do_sys_vm86+177>: shl $0x9,%eax
0x0210e914 <do_sys_vm86+180>: lea 0x26ff000(%eax),%ecx
0x0210e91a <do_sys_vm86+186>: lea 0x4c(%edi),%eax
0x0210e91d <do_sys_vm86+189>: mov %eax,0x360(%esi)
0x0210e923 <do_sys_vm86+195>: sub 0x1c(%edx),%eax
0x0210e926 <do_sys_vm86+198>: add 0x20(%edx),%eax
0x0210e929 <do_sys_vm86+201>: mov %eax,0x4(%ecx)
0x0210e92c <do_sys_vm86+204>: mov 0x25fe52c,%eax
0x0210e931 <do_sys_vm86+209>: test $0x800,%eax
0x0210e936 <do_sys_vm86+214>: je 0x210e942 <do_sys_vm86+226>
0x0210e938 <do_sys_vm86+216>: movl $0x0,0x364(%esi)
0x0210e942 <do_sys_vm86+226>: lea 0x340(%esi),%edx
0x0210e948 <do_sys_vm86+232>: mov 0x20(%edx),%eax
0x0210e94b <do_sys_vm86+235>: mov %eax,0x4(%ecx)
0x0210e94e <do_sys_vm86+238>: mov 0x10(%ecx),%ax
0x0210e952 <do_sys_vm86+242>: and $0xffff,%eax
0x0210e957 <do_sys_vm86+247>: cmp 0x24(%edx),%eax
0x0210e95a <do_sys_vm86+250>: jne 0x210e9b0 <do_sys_vm86+336>
0x0210e95c <do_sys_vm86+252>: mov 0x14(%ebx),%eax
0x0210e95f <do_sys_vm86+255>: dec %eax
0x0210e960 <do_sys_vm86+256>: mov %eax,0x14(%ebx)
0x0210e963 <do_sys_vm86+259>: mov 0x8(%ebx),%eax
0x0210e966 <do_sys_vm86+262>: and $0x8,%eax
0x0210e969 <do_sys_vm86+265>: jne 0x210e9a9 <do_sys_vm86+329>
0x0210e96b <do_sys_vm86+267>: mov 0x50(%edi),%eax
0x0210e96e <do_sys_vm86+270>: mov %eax,0x5b4(%esi)
0x0210e974 <do_sys_vm86+276>: testb $0x1,0x4c(%edi)
0x0210e978 <do_sys_vm86+280>: jne 0x210e9a0 <do_sys_vm86+320>
0x0210e97a <do_sys_vm86+282>: push $0x255f121
0x0210e97f <do_sys_vm86+287>: call 0x21285a0 <printk>
0x0210e984 <do_sys_vm86+292>: mov 0x4(%esi),%edx
0x0210e987 <do_sys_vm86+295>: xor %eax,%eax
0x0210e989 <do_sys_vm86+297>: mov %eax,%fs
0x0210e98b <do_sys_vm86+299>: mov %eax,%gs
0x0210e98d <do_sys_vm86+301>: mov %edi,%esp
0x0210e98f <do_sys_vm86+303>: mov %edx,%ebp
0x0210e991 <do_sys_vm86+305>: jmp 0xfffeb100 <resume_userspace>
0x0210e996 <do_sys_vm86+310>: pop %esi
0x0210e997 <do_sys_vm86+311>: pop %ebx
0x0210e998 <do_sys_vm86+312>: pop %esi
0x0210e999 <do_sys_vm86+313>: pop %edi
0x0210e99a <do_sys_vm86+314>: ret
0x0210e99b <do_sys_vm86+315>: nop
0x0210e99c <do_sys_vm86+316>: lea 0x0(%esi,1),%esi
0x0210e9a0 <do_sys_vm86+320>: push %esi
0x0210e9a1 <do_sys_vm86+321>: call 0x210e5b0 <mark_screen_rdonly>
0x0210e9a6 <do_sys_vm86+326>: pop %eax
0x0210e9a7 <do_sys_vm86+327>: jmp 0x210e97a <do_sys_vm86+282>
0x0210e9a9 <do_sys_vm86+329>: call 0x21222d0 <preempt_schedule>
0x0210e9ae <do_sys_vm86+334>: jmp 0x210e96b <do_sys_vm86+267>
0x0210e9b0 <do_sys_vm86+336>: mov 0x24(%edx),%ax
0x0210e9b4 <do_sys_vm86+340>: mov %ax,0x10(%ecx)
0x0210e9b8 <do_sys_vm86+344>: mov $0x174,%ecx
0x0210e9bd <do_sys_vm86+349>: mov 0x24(%edx),%eax
0x0210e9c0 <do_sys_vm86+352>: xor %edx,%edx
0x0210e9c2 <do_sys_vm86+354>: wrmsr
0x0210e9c4 <do_sys_vm86+356>: jmp 0x210e95c <do_sys_vm86+252>
0x0210e9c6 <do_sys_vm86+358>: movl $0x0,0x5bc(%esi)
0x0210e9d0 <do_sys_vm86+368>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9d5 <do_sys_vm86+373>: cmp $0x4,%eax
0x0210e9d8 <do_sys_vm86+376>: jne 0x210e8ce <do_sys_vm86+110>
0x0210e9de <do_sys_vm86+382>: movl $0x47000,0x5bc(%esi)
0x0210e9e8 <do_sys_vm86+392>: jmp 0x210e8d8 <do_sys_vm86+120>
0x0210e9ed <do_sys_vm86+397>: lea 0x0(%esi),%esi
0x0210e9f0 <do_sys_vm86+400>: movl $0x7000,0x5bc(%esi)
0x0210e9fa <do_sys_vm86+410>: jmp 0x210e8d8 <do_sys_vm86+120>
I don't seem to be able to recreate this at my end, even with 1k
block sizes. Did you notice whether this problem occurs without
the latest patch?
Regards
Suparna
On Mon, Nov 17, 2003 at 05:37:14PM -0800, Daniel McNeil wrote:
> Obviously, the ps output in my previous email showed that the hangs were
> with 1k i/o sizes.
>
> More testing using 2k, 4k, 16k, 32k, 64k, 128k, 256k and 512k all
> completed correctly.
>
> Even 11k and 17k worked.
>
> $ ls -l
> -rw------- 1 daniel daniel 88289280 Jun 9 16:54 glibc-2.3.2.tar
> -rw-rw-r-- 1 daniel daniel 88289280 Nov 17 17:32 ff2
>
>
> So, only 1k is hanging so far.
>
> Daniel
>
> On Mon, 2003-11-17 at 17:15, Daniel McNeil wrote:
> > Suparna,
> >
> > Good news and bad news. Your patch does fix the non-power of two i/o
> > size problems where AIO previously did not complete:
> >
> > $ ./aiodio_sparse -s 1751k -r 18k -w 11k
> > $ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
> > io_submit() return 9
> > aiodio_sparse: 9 i/o in flight
> > aiodio_sparse: offset 165888 filesize 184320 inflight 9
> > aiodio_sparse: io_getevent() returned 1
> > aiodio_sparse: io_getevent() res 18432 res2 0
> > io_submit() return 1
> > AIO DIO write done unlinking file
> > dio_sparse done writing, kill children
> > aiodio_sparse 0 children had errors
> >
> > But when using aiocp with O_DIRECT to copy a file to
> > an already allocated file, the aiocp process hangs. With an i/o
> > size of 4k the copy completed. With i/o sizes of 1k and 2k,
> > the aiocp processes hung during io_submit() and are unkillable.
> > Here are the stack traces:
> >
> > # ps -fu daniel | grep aiocp
> > daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
> > daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
> >
> >
> > aiocp D 00000001 1920 1 1902 (NOTLB)
> > e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
> > f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660 c0289a16
> > f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712 e70aa000
> > Call Trace:
> > [<c02897fc>] generic_unplug_device+0x50/0xbd
> > [<c0289a16>] blk_run_queues+0xa9/0x15c
> > [<c0123712>] io_schedule+0x26/0x30
> > [<c0192242>] direct_io_worker+0x376/0x5ab
> > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> > [<c0121b70>] schedule+0x3ac/0x7ef
> > [<c0145f48>] generic_file_aio_read+0x33/0x37
> > [<c0194ad3>] aio_pread+0x34/0x5f
> > [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> > [<c019316f>] __aio_get_req+0x27/0x158
> > [<c0194a9f>] aio_pread+0x0/0x5f
> > [<c0194f62>] io_submit_one+0x1ea/0x2b7
> > [<c0195110>] sys_io_submit+0xe1/0x194
> > [<c03c29a7>] syscall_call+0x7/0xb
> > [<c03c007b>] rpc_depopulate+0x1aa/0x24b
> >
> >
> > aiocp D 366EDC94 2083 2037 (NOTLB)
> > e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
> > 00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
> > f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
> > Call Trace:
> > [<c02897fc>] generic_unplug_device+0x50/0xbd
> > [<c0289a16>] blk_run_queues+0xa9/0x15c
> > [<c0123712>] io_schedule+0x26/0x30
> > [<c0192242>] direct_io_worker+0x376/0x5ab
> > [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> > [<c0259d3e>] write_chan+0x165/0x21e
> > [<c0145f48>] generic_file_aio_read+0x33/0x37
> > [<c0194ad3>] aio_pread+0x34/0x5f
> > [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> > [<c019316f>] __aio_get_req+0x27/0x158
> > [<c0194a9f>] aio_pread+0x0/0x5f
> > [<c02532ab>] tty_write+0x1e8/0x3b2
> > [<c0194f62>] io_submit_one+0x1ea/0x2b7
> > [<c0195110>] sys_io_submit+0xe1/0x194
> > [<c03c29a7>] syscall_call+0x7/0xb
> > [<c03c007b>] rpc_depopulate+0x1aa/0x24b
> >
> >
> >
> > Daniel
> >
> > On Sun, 2003-11-16 at 21:25, Suparna Bhattacharya wrote:
> > > On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> > > > Andrew,
> > > >
> > > > I'm testing test9-mm3 on a 2-proc Xeon with an ext3 file system.
> > > > I tested using the test programs aiocp and aiodio_sparse.
> > > > (see http://developer.osdl.org/daniel/AIO/)
> > > >
> > > > Using aiocp with i/o sizes from 1k to 512k to copy files worked
> > > > without any errors or kernel debug messages.
> > > >
> > > > With 64k i/o, the aiodio_sparse program completes without any errors.
> > > > There are no kernel error messages, so that is good.
> > > >
> > > > There are still problems with non power of 2 i/o sizes using AIO and
> > > > O_DIRECT. It hangs with aio's that do not seem to complete. The test
> > > > does exit when hitting ^c and there are no kernel messages. Test output
> > > > below:
> > >
> > > Could you check if the following patch fixes the problem for you ?
> > >
> > > Regards
> > > Suparna
> > >
> > > --------------------------------------------------------------
> > >
> > > With this patch, when the DIO code falls back to buffered i/o after
> > > having submitted part of the i/o, then buffered i/o is issued only
> > > for the remaining part of the request (i.e. the part not already
> > > covered by DIO).
> > >
> > > diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
> > > --- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
> > > +++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
> > > @@ -74,6 +74,7 @@
> > > been performed at the start of a
> > > write */
> > > int pages_in_io; /* approximate total IO pages */
> > > + size_t size; /* total request size (doesn't change)*/
> > > sector_t block_in_file; /* Current offset into the underlying
> > > file in dio_block units. */
> > > unsigned blocks_available; /* At block_in_file. changes */
> > > @@ -226,7 +227,7 @@
> > > dio_complete(dio, dio->block_in_file << dio->blkbits,
> > > dio->result);
> > > /* Complete AIO later if falling back to buffered i/o */
> > > - if (dio->result != -ENOTBLK) {
> > > + if (dio->result >= dio->size || dio->rw == READ) {
> > > aio_complete(dio->iocb, dio->result, 0);
> > > kfree(dio);
> > > } else {
> > > @@ -889,6 +890,7 @@
> > > dio->blkbits = blkbits;
> > > dio->blkfactor = inode->i_blkbits - blkbits;
> > > dio->start_zero_done = 0;
> > > + dio->size = 0;
> > > dio->block_in_file = offset >> blkbits;
> > > dio->blocks_available = 0;
> > > dio->cur_page = NULL;
> > > @@ -925,7 +927,7 @@
> > >
> > > for (seg = 0; seg < nr_segs; seg++) {
> > > user_addr = (unsigned long)iov[seg].iov_base;
> > > - bytes = iov[seg].iov_len;
> > > + dio->size += bytes = iov[seg].iov_len;
> > >
> > > /* Index into the first page of the first block */
> > > dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> > > @@ -956,6 +958,13 @@
> > > }
> > > } /* end iovec loop */
> > >
> > > + if (ret == -ENOTBLK && rw == WRITE) {
> > > + /*
> > > + * The remaining part of the request will be
> > > + * be handled by buffered I/O when we return
> > > + */
> > > + ret = 0;
> > > + }
> > > /*
> > > * There may be some unwritten disk at the end of a part-written
> > > * fs-block-sized block. Go zero that now.
> > > @@ -986,19 +995,13 @@
> > > */
> > > if (dio->is_async) {
> > > if (ret == 0)
> > > - ret = dio->result; /* Bytes written */
> > > - if (ret == -ENOTBLK) {
> > > - /*
> > > - * The request will be reissued via buffered I/O
> > > - * when we return; Any I/O already issued
> > > - * effectively becomes redundant.
> > > - */
> > > - dio->result = ret;
> > > + ret = dio->result;
> > > + if (ret > 0 && dio->result < dio->size && rw == WRITE) {
> > > dio->waiter = current;
> > > }
> > > finished_one_bio(dio); /* This can free the dio */
> > > blk_run_queues();
> > > - if (ret == -ENOTBLK) {
> > > + if (dio->waiter) {
> > > /*
> > > * Wait for already issued I/O to drain out and
> > > * release its references to user-space pages
> > > @@ -1032,7 +1035,8 @@
> > > }
> > > dio_complete(dio, offset, ret);
> > > /* We could have also come here on an AIO file extend */
> > > - if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
> > > + if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
> > > + dio->result < dio->size))
> > > aio_complete(iocb, ret, 0);
> > > kfree(dio);
> > > }
> > > diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
> > > --- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
> > > +++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
> > > @@ -1895,14 +1895,16 @@
> > > */
> > > if (written >= 0 && file->f_flags & O_SYNC)
> > > status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> > > - if (written >= 0 && !is_sync_kiocb(iocb))
> > > + if (written >= count && !is_sync_kiocb(iocb))
> > > written = -EIOCBQUEUED;
> > > - if (written != -ENOTBLK)
> > > + if (written < 0 || written >= count)
> > > goto out_status;
> > > /*
> > > * direct-io write to a hole: fall through to buffered I/O
> > > + * for completing the rest of the request.
> > > */
> > > - written = 0;
> > > + pos += written;
> > > + count -= written;
> > > }
> > >
> > > buf = iov->iov_base;
> >
--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India
On Tue, 18 Nov 2003, Zwane Mwaikambo wrote:
>
> Another note from our avian friends; i seem to have sent a slightly
> different dump from the patch, although they do both achieve the same
> effect. I shall append it for completeness.
Hmm. I don't see anything. However, it's a lot easier to read the
gcc-generated assembly ("make arch/i386/kernel/vm86.s") than it is to read
the objdump disassembly.
It's also a lot easier to see what the assembly language is when giving
the
-fno-reorder-blocks
switch to gcc. Without it, modern gcc's tend to have _way_ too many jumps
around. But maybe that actually changes the behaviour too.
Linus
On Tue, 18 Nov 2003, Linus Torvalds wrote:
> Hmm. I don't see anything. However, it's a lot easier to read the
> gcc-generated assembly ("make arch/i386/kernel/vm86.s") than it is to read
> the objdump disassembly.
>
>
> It's also a lot easier to see what the assembly language is when giving
> the
>
> -fno-reorder-blocks
I'll recompile and verify that the bug can be reproduced and worked around
with that flag.
> switch to gcc. Without it, modern gcc's tend to have _way_ too many jumps
> around. But maybe that actually changes the behaviour too.
Here are diffs from the do_sys_vm86 only.
--- asm-before 2003-11-18 10:56:02.967643808 -0500
+++ asm-after 2003-11-18 10:55:37.880457640 -0500
@@ -897,6 +897,10 @@
.LFE473:
.Lfe4:
.size sys_vm86,.Lfe4-sys_vm86
+ .section .rodata.str1.1
+.LC6:
+ .string "ooh la la\n"
+ .text
.p2align 4,,15
.type do_sys_vm86,@function
do_sys_vm86:
@@ -1053,29 +1057,37 @@
jne .L213
.L210:
.loc 1 315 0
+ pushl $.LC6
+.LCFI98:
+ call printk
+ .loc 1 316 0
movl 4(%esi), %edx
#APP
xorl %eax,%eax; movl %eax,%fs; movl %eax,%gs
movl %edi,%esp
movl %edx,%ebp
jmp resume_userspace
- .loc 1 323 0
#NO_APP
- popl %ebx
-.LCFI98:
+.LBE53:
popl %esi
.LCFI99:
- popl %edi
+ .loc 1 324 0
+ popl %ebx
.LCFI100:
+ popl %esi
+.LCFI101:
+ popl %edi
+.LCFI102:
ret
.loc 1 313 0
.p2align 4,,7
.L213:
+.LBB65:
pushl %esi
-.LCFI101:
+.LCFI103:
call mark_screen_rdonly
popl %eax
-.LCFI102:
+.LCFI104:
jmp .L210
.loc 1 310 0
.L212:
@@ -1083,7 +1095,7 @@
jmp .L197
.loc 14 454 0
.L211:
-.LBB65:
+.LBB66:
movw 36(%edx), %ax
movw %ax, 16(%ecx)
.loc 14 455 0
@@ -1097,7 +1109,7 @@
.p2align 4,,7
.L183:
.loc 1 283 0
-.LBE65:
+.LBE66:
movl $0, 1468(%esi)
.loc 1 284 0
jmp .L182
@@ -1115,7 +1127,7 @@
movl $28672, 1468(%esi)
.loc 1 287 0
jmp .L182
-.LBE53:
+.LBE65:
.LFE475:
.Lfe5:
.size do_sys_vm86,.Lfe5-do_sys_vm86
On Tue, 18 Nov 2003, Zwane Mwaikambo wrote:
>
> Here are diffs from the do_sys_vm86 only.
Ok. Much more readable.
And there is something very suspicious there.
The code with and without the printk() looks _identical_ apart from some
trivial label renumbering, and the added
pushl $.LC6
call printk
.. asm ..
popl %esi
which all looks fine (esi is dead at that point, so the compiler is just
using a "popl" as a shorter form of "addl $4,%esp").
Btw, you seem to compile with debugging, which makes the assembly
language pretty much unreadable and accounts for most of the
differences: the line numbers change. If you compile a kernel where the
line numbers don't change (by commenting _out_ the printk rather than
removing the whole line), your diff would be more readable.
Anyway, there are _zero_ differences.
Just for fun, try this: move the "printk()" to _below_ the "asm"
statement. It will never actually get executed, but if it's an issue of
some subtle code or data placement things (cache lines etc), maybe that
also hides the oops, since all the same code and data will be generated,
just not run...
Linus
>> Btw, you seem to compile with debugging, which makes the assembly
>> language pretty much unreadable and accounts for most of the
>> differences: the line numbers change. If you compile a kernel where the
>> line numbers don't change (by commenting _out_ the printk rather than
>> removing the whole line), your diff would be more readable.
>
> Aha! Thanks for mentioning that, noted.
>
>> Anyway, there are _zero_ differences.
>>
>> Just for fun, try this: move the "printk()" to _below_ the "asm"
>> statement. It will never actually get executed, but if it's an issue of
>> some subtle code or data placement things (cache lines etc), maybe that
>> also hides the oops, since all the same code and data will be generated,
>> just not run...
>
> Ok i just tried that and it still fails. Matt Mackall suggested i also try
> writing a minimal printk which has the same effect.
The other thing I've found printks to hide before is timing bugs / races.
Unfortunately I can't see one here, but maybe someone else can ;-)
Maybe inserting a 1ms delay or something in place of the printk would
have the same effect?
M.
On Tue, 18 Nov 2003, Linus Torvalds wrote:
> Ok. Much more readable.
>
> And there is something very suspicious there.
>
> The code with and without the printk() looks _identical_ apart from some
> trivial label renumbering, and the added
>
> pushl $.LC6
> call printk
> .. asm ..
> popl %esi
>
> which all looks fine (esi is dead at that point, so the compiler is just
> using a "popl" as a shorter form of "addl $4,%esp").
>
> Btw, you seem to compile with debugging, which makes the assembly
> language pretty much unreadable and accounts for most of the
> differences: the line numbers change. If you compile a kernel where the
> line numbers don't change (by commenting _out_ the printk rather than
> removing the whole line), your diff would be more readable.
Aha! Thanks for mentioning that, noted.
> Anyway, there are _zero_ differences.
>
> Just for fun, try this: move the "printk()" to _below_ the "asm"
> statement. It will never actually get executed, but if it's an issue of
> some subtle code or data placement things (cache lines etc), maybe that
> also hides the oops, since all the same code and data will be generated,
> just not run...
Ok i just tried that and it still fails. Matt Mackall suggested i also try
writing a minimal printk which has the same effect.
On Tue, 18 Nov 2003, Martin J. Bligh wrote:
> The other thing I've found printks to hide before is timing bugs / races.
> Unfortunately I can't see one here, but maybe someone else can ;-)
> Maybe inserting a 1ms delay or something in place of the printk would
> have the same effect?
I've tried a number of timing-related workarounds, namely
schedule_timeout(2*HZ) and some long spinning loops. I've also thrown a
schedule() in there at some point.
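
The experiment discussed above (swap the printk for a delay and see whether the oops is still hidden) can be sketched in userspace C. Everything here is an assumption for illustration: probe_mode, spin_for_ms and debug_probe are hypothetical names, and kernel code would use its own helpers such as mdelay() rather than clock().

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical names; a userspace stand-in for the kernel experiment. */
enum probe_mode { PROBE_NONE, PROBE_PRINT, PROBE_SPIN };

/* Busy-wait until roughly `ms` milliseconds of CPU time have passed. */
static void spin_for_ms(unsigned ms)
{
    assert(ms < 60000u);    /* guard against accidental very long spins */

    clock_t start = clock();
    if (start == (clock_t)-1)
        return;             /* CPU clock unavailable: do not spin */

    clock_t ticks = (clock_t)ms * CLOCKS_PER_SEC / 1000;
    for (;;) {
        clock_t now = clock();
        if (now == (clock_t)-1 || now - start >= ticks)
            break;
    }
}

/* Same call site, three behaviours: nothing, a message, or a pure delay. */
static void debug_probe(enum probe_mode mode, unsigned spin_ms)
{
    switch (mode) {
    case PROBE_PRINT:
        fputs("ooh la la\n", stderr);
        break;
    case PROBE_SPIN:
        spin_for_ms(spin_ms);
        break;
    case PROBE_NONE:
        break;
    }
}
```

Keeping the call site fixed and switching only the mode separates the two effects a printk can have: the time it takes, and the code and data it drags in.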
Suparna,
I was unable to reproduce the hang in io_submit() without your patch.
I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.
I re-ran with your patch with both as-iosched and deadline and both
hung in io_submit(). aiocp would run a few times, but I put the
aiocp in a while loop and it hung on the 1st or 2nd time. It
did get most of the way through copying the file before hanging.
This is on a 2-proc machine with IDE disks running ext3.
Here is the stack trace and other info for as-iosched:
daniel 2005 0.7 0.0 1388 384 pts/0 D 13:51 0:08 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
cat /proc/2005/wchan
io_schedule
aiocp D 00000001 2005 1870 (NOTLB)
e53cfc08 00200086 c18d3c80 00000001 00000003 c02897fc 00000060 00200246
f7cdb8b4 c0191630 c18d3c80 0000bfc6 78d5d3e5 00000233 e4dc1980 c0289a16
f7cdb8b4 d92978e4 c18d3c80 00000000 00000001 e53cfc14 c0123712 e53ce000
Call Trace:
[<c02897fc>] generic_unplug_device+0x50/0xbd
[<c0191630>] dio_bio_add_page+0x34/0x79
[<c0289a16>] blk_run_queues+0xa9/0x15c
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0147a80>] __generic_file_aio_write_nolock+0xa3a/0xda5
[<c025b049>] pty_write+0x1c8/0x1ca
[<c01480a4>] generic_file_aio_write+0x7e/0x115
[<c0256d12>] opost+0x9e/0x1cf
[<c01aa4a3>] ext3_file_write+0x3f/0xcc
[<c0194b3a>] aio_pwrite+0x3c/0xad
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194afe>] aio_pwrite+0x0/0xad
[<c02532ab>] tty_write+0x1e8/0x3b2
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb
For deadline iosched:
daniel 1889 0.1 0.0 1388 384 pts/0 D 15:12 0:01 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
$ cat /proc/1889/wchan
io_schedule
$ cat /sys/block/hdb/stat
209058 23145 45744 58542 209022 22069 0 20758 45210
aiocp D 0AD7701D 1889 1752 (NOTLB)
ee2ddd04 00200086 f75e6660 0ad7701d 0000004e 00200282 ebd37cbc 0ad7701d
0000004e f75e6660 c18d3c80 00060539 0ad7701d 0000004e f75e6000 0000006b
ee2ddd10 c0192212 c18d3c80 00000000 00000001 ee2ddd10 c0123712 ee2dc000
Call Trace:
[<c0192212>] direct_io_worker+0x346/0x5ab
[<c0123712>] io_schedule+0x26/0x30
[<c0192242>] direct_io_worker+0x376/0x5ab
[<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
[<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
[<c014840f>] generic_file_direct_IO+0x70/0x89
[<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
[<c0259d3e>] write_chan+0x165/0x21e
[<c0145f48>] generic_file_aio_read+0x33/0x37
[<c0194ad3>] aio_pread+0x34/0x5f
[<c0193bec>] aio_run_iocb+0xa6/0x1ed
[<c019316f>] __aio_get_req+0x27/0x158
[<c0194a9f>] aio_pread+0x0/0x5f
[<c02532ab>] tty_write+0x1e8/0x3b2
[<c0194f62>] io_submit_one+0x1ea/0x2b7
[<c0195110>] sys_io_submit+0xe1/0x194
[<c03c29a7>] syscall_call+0x7/0xb
The hung processes are stuck in the 'D' state and unkillable, of course.
It would appear something is wrong with your patch. Any ideas?
Daniel
On Tue, 2003-11-18 at 03:55, Suparna Bhattacharya wrote:
> I don't seem to be able to recreate this at my end - even with 1k
> block sizes. Did you notice if this problem occurs without
> the latest patch ?
>
> Regards
> Suparna
>
> On Mon, Nov 17, 2003 at 05:37:14PM -0800, Daniel McNeil wrote:
> > Obviously, the ps output in my previous email showed that the hangs were
> > with 1k i/o sizes.
> >
> > More testing using 2k, 4k, 16k, 32k, 64k, 128k, 256k and 512k all
> > completed correctly.
> >
> > Even 11k and 17k worked.
> >
> > $ ls -l
> > -rw------- 1 daniel daniel 88289280 Jun 9 16:54 glibc-2.3.2.tar
> > -rw-rw-r-- 1 daniel daniel 88289280 Nov 17 17:32 ff2
> >
> >
> > So, only 1k is hanging so far.
> >
> > Daniel
> >
> > On Mon, 2003-11-17 at 17:15, Daniel McNeil wrote:
> > > Suparna,
> > >
> > > Good news and bad news. Your patch does fix the non-power of two i/o
> > > size problems where AIO previously did not complete:
> > >
> > > $ ./aiodio_sparse -s 1751k -r 18k -w 11k
> > > $ aiodio_sparse -i 9 -dd -s 180k -r 18k -w 18k
> > > io_submit() return 9
> > > aiodio_sparse: 9 i/o in flight
> > > aiodio_sparse: offset 165888 filesize 184320 inflight 9
> > > aiodio_sparse: io_getevent() returned 1
> > > aiodio_sparse: io_getevent() res 18432 res2 0
> > > io_submit() return 1
> > > AIO DIO write done unlinking file
> > > dio_sparse done writing, kill children
> > > aiodio_sparse 0 children had errors
> > >
> > > But when testing using aiocp using O_DIRECT to copy a file to
> > > an already allocated file, the aiocp process hangs. I used i/o
> > > size of 4k and that completed. Using i/o sizes of 1k and 2k,
> > > the aiocp processes hung during io_submit() and are unkillable.
> > > Here are the stack traces:
> > >
> > > # ps -fu daniel | grep aiocp
> > > daniel 1920 1 0 16:45 ? 00:00:07 aiocp -b 1k -n 1 -f DIRECT glibc-2.3.2.tar ff2
> > > daniel 2083 2037 0 17:00 pts/2 00:00:03 aiocp -dd -b 1k -n 8 -f DIRECT glibc-2.3.2.tar ff2
> > >
> > >
> > > aiocp D 00000001 1920 1 1902 (NOTLB)
> > > e70abd04 00200086 c18dbc80 00000001 00000003 c02897fc 00000060 00200246
> > > f7cdb8b4 c16522f0 c18dbc80 0000309c 640a05eb 0000008b e6d9e660 c0289a16
> > > f7cdb8b4 e87e95cc c18dbc80 00000000 00000001 e70abd10 c0123712 e70aa000
> > > Call Trace:
> > > [<c02897fc>] generic_unplug_device+0x50/0xbd
> > > [<c0289a16>] blk_run_queues+0xa9/0x15c
> > > [<c0123712>] io_schedule+0x26/0x30
> > > [<c0192242>] direct_io_worker+0x376/0x5ab
> > > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > > [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > > [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> > > [<c0121b70>] schedule+0x3ac/0x7ef
> > > [<c0145f48>] generic_file_aio_read+0x33/0x37
> > > [<c0194ad3>] aio_pread+0x34/0x5f
> > > [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> > > [<c019316f>] __aio_get_req+0x27/0x158
> > > [<c0194a9f>] aio_pread+0x0/0x5f
> > > [<c0194f62>] io_submit_one+0x1ea/0x2b7
> > > [<c0195110>] sys_io_submit+0xe1/0x194
> > > [<c03c29a7>] syscall_call+0x7/0xb
> > > [<c03c007b>] rpc_depopulate+0x1aa/0x24b
> > >
> > >
> > > aiocp D 366EDC94 2083 2037 (NOTLB)
> > > e758bd04 00200082 f71ba000 366edc94 00000161 c02897fc 00000060 366edc94
> > > 00000161 f71ba000 c18d3c80 000069a9 366f5a0e 00000161 e8d4acc0 c0289a16
> > > f7cdb8b4 e960465c c18d3c80 00000000 00000001 e758bd10 c0123712 e758a000
> > > Call Trace:
> > > [<c02897fc>] generic_unplug_device+0x50/0xbd
> > > [<c0289a16>] blk_run_queues+0xa9/0x15c
> > > [<c0123712>] io_schedule+0x26/0x30
> > > [<c0192242>] direct_io_worker+0x376/0x5ab
> > > [<c019264a>] __blockdev_direct_IO+0x1d3/0x2d5
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c01ad72d>] ext3_direct_IO+0xc0/0x1e1
> > > [<c01ac73e>] ext3_direct_io_get_blocks+0x0/0xbf
> > > [<c014840f>] generic_file_direct_IO+0x70/0x89
> > > [<c0145e11>] __generic_file_aio_read+0xfb/0x1ff
> > > [<c0259d3e>] write_chan+0x165/0x21e
> > > [<c0145f48>] generic_file_aio_read+0x33/0x37
> > > [<c0194ad3>] aio_pread+0x34/0x5f
> > > [<c0193bec>] aio_run_iocb+0xa6/0x1ed
> > > [<c019316f>] __aio_get_req+0x27/0x158
> > > [<c0194a9f>] aio_pread+0x0/0x5f
> > > [<c02532ab>] tty_write+0x1e8/0x3b2
> > > [<c0194f62>] io_submit_one+0x1ea/0x2b7
> > > [<c0195110>] sys_io_submit+0xe1/0x194
> > > [<c03c29a7>] syscall_call+0x7/0xb
> > > [<c03c007b>] rpc_depopulate+0x1aa/0x24b
> > >
> > >
> > >
> > > Daniel
> > >
> > > On Sun, 2003-11-16 at 21:25, Suparna Bhattacharya wrote:
> > > > On Thu, Nov 13, 2003 at 02:03:58PM -0800, Daniel McNeil wrote:
> > > > > Andrew,
> > > > >
> > > > > I'm testing test9-mm3 on a 2-proc Xeon with a ext3 file system.
> > > > > I tested using the test programs aiocp and aiodio_sparse.
> > > > > (see http://developer.osdl.org/daniel/AIO/)
> > > > >
> > > > > Using aiocp with i/o sizes from 1k to 512k to copy files worked
> > > > > without any errors or kernel debug messages.
> > > > >
> > > > > With 64k i/o, the aiodio_sparse program completes without any errors.
> > > > > There are no kernel error messages, so that is good.
> > > > >
> > > > > There are still problems with non power of 2 i/o sizes using AIO and
> > > > > O_DIRECT. It hangs with aio's that do not seem to complete. The test
> > > > > does exit when hitting ^c and there are no kernel messages. Test output
> > > > > below:
> > > >
> > > > Could you check if the following patch fixes the problem for you ?
> > > >
> > > > Regards
> > > > Suparna
> > > >
> > > > --------------------------------------------------------------
> > > >
> > > > With this patch, when the DIO code falls back to buffered i/o after
> > > > having submitted part of the i/o, then buffered i/o is issued only
> > > > for the remaining part of the request (i.e. the part not already
> > > > covered by DIO).
> > > >
> > > > diff -ur pure-mm3/fs/direct-io.c linux-2.6.0-test9-mm3/fs/direct-io.c
> > > > --- pure-mm3/fs/direct-io.c 2003-11-14 09:09:06.000000000 +0530
> > > > +++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-17 09:00:47.000000000 +0530
> > > > @@ -74,6 +74,7 @@
> > > > been performed at the start of a
> > > > write */
> > > > int pages_in_io; /* approximate total IO pages */
> > > > + size_t size; /* total request size (doesn't change)*/
> > > > sector_t block_in_file; /* Current offset into the underlying
> > > > file in dio_block units. */
> > > > unsigned blocks_available; /* At block_in_file. changes */
> > > > @@ -226,7 +227,7 @@
> > > > dio_complete(dio, dio->block_in_file << dio->blkbits,
> > > > dio->result);
> > > > /* Complete AIO later if falling back to buffered i/o */
> > > > - if (dio->result != -ENOTBLK) {
> > > > + if (dio->result >= dio->size || dio->rw == READ) {
> > > > aio_complete(dio->iocb, dio->result, 0);
> > > > kfree(dio);
> > > > } else {
> > > > @@ -889,6 +890,7 @@
> > > > dio->blkbits = blkbits;
> > > > dio->blkfactor = inode->i_blkbits - blkbits;
> > > > dio->start_zero_done = 0;
> > > > + dio->size = 0;
> > > > dio->block_in_file = offset >> blkbits;
> > > > dio->blocks_available = 0;
> > > > dio->cur_page = NULL;
> > > > @@ -925,7 +927,7 @@
> > > >
> > > > for (seg = 0; seg < nr_segs; seg++) {
> > > > user_addr = (unsigned long)iov[seg].iov_base;
> > > > - bytes = iov[seg].iov_len;
> > > > + dio->size += bytes = iov[seg].iov_len;
> > > >
> > > > /* Index into the first page of the first block */
> > > > dio->first_block_in_page = (user_addr & ~PAGE_MASK) >> blkbits;
> > > > @@ -956,6 +958,13 @@
> > > > }
> > > > } /* end iovec loop */
> > > >
> > > > + if (ret == -ENOTBLK && rw == WRITE) {
> > > > + /*
> > > > + * The remaining part of the request will be
> > > > + * be handled by buffered I/O when we return
> > > > + */
> > > > + ret = 0;
> > > > + }
> > > > /*
> > > > * There may be some unwritten disk at the end of a part-written
> > > > * fs-block-sized block. Go zero that now.
> > > > @@ -986,19 +995,13 @@
> > > > */
> > > > if (dio->is_async) {
> > > > if (ret == 0)
> > > > - ret = dio->result; /* Bytes written */
> > > > - if (ret == -ENOTBLK) {
> > > > - /*
> > > > - * The request will be reissued via buffered I/O
> > > > - * when we return; Any I/O already issued
> > > > - * effectively becomes redundant.
> > > > - */
> > > > - dio->result = ret;
> > > > + ret = dio->result;
> > > > + if (ret > 0 && dio->result < dio->size && rw == WRITE) {
> > > > dio->waiter = current;
> > > > }
> > > > finished_one_bio(dio); /* This can free the dio */
> > > > blk_run_queues();
> > > > - if (ret == -ENOTBLK) {
> > > > + if (dio->waiter) {
> > > > /*
> > > > * Wait for already issued I/O to drain out and
> > > > * release its references to user-space pages
> > > > @@ -1032,7 +1035,8 @@
> > > > }
> > > > dio_complete(dio, offset, ret);
> > > > /* We could have also come here on an AIO file extend */
> > > > - if (!is_sync_kiocb(iocb) && (ret != -ENOTBLK))
> > > > + if (!is_sync_kiocb(iocb) && !(rw == WRITE && ret >= 0 &&
> > > > + dio->result < dio->size))
> > > > aio_complete(iocb, ret, 0);
> > > > kfree(dio);
> > > > }
> > > > diff -ur pure-mm3/mm/filemap.c linux-2.6.0-test9-mm3/mm/filemap.c
> > > > --- pure-mm3/mm/filemap.c 2003-11-14 09:15:08.000000000 +0530
> > > > +++ linux-2.6.0-test9-mm3/mm/filemap.c 2003-11-15 11:11:16.000000000 +0530
> > > > @@ -1895,14 +1895,16 @@
> > > > */
> > > > if (written >= 0 && file->f_flags & O_SYNC)
> > > > status = generic_osync_inode(inode, mapping, OSYNC_METADATA);
> > > > - if (written >= 0 && !is_sync_kiocb(iocb))
> > > > + if (written >= count && !is_sync_kiocb(iocb))
> > > > written = -EIOCBQUEUED;
> > > > - if (written != -ENOTBLK)
> > > > + if (written < 0 || written >= count)
> > > > goto out_status;
> > > > /*
> > > > * direct-io write to a hole: fall through to buffered I/O
> > > > + * for completing the rest of the request.
> > > > */
> > > > - written = 0;
> > > > + pos += written;
> > > > + count -= written;
> > > > }
> > > >
> > > > buf = iov->iov_base;
> > >
> > > --
> > > To unsubscribe, send a message with 'unsubscribe linux-aio' in
> > > the body to [email protected]. For more info on Linux AIO,
> > > see: http://www.kvack.org/aio/
> > > Don't email: [email protected]
Hi,
> The other thing I've found printks to hide before is timing bugs / races.
> Unfortunately I can't see one here, but maybe someone else can ;-)
> Maybe inserting a 1ms delay or something in place of the printk would
> have the same effect?
One of my colleagues had an interesting bug caused by an
uninitialized variable - a printk() in the right place happened
to set the variable (which gcc had put in a register) to the
correct value for his code to work.
I've tried looking for uses of uninitialized registers in entry.S,
but the assembly there isn't easy to follow.
What happens if you replace the printk with assembly code
that clobbers eax, ecx, edx and (most of) eflags? (Assuming
I've remembered the calling convention correctly, those are
the registers that printk will be overwriting).
Kind regards,
Jon
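The experiment Jon proposes could look roughly like the helper below. This is only a sketch under stated assumptions: it relies on GCC-style extended inline assembly on x86, the name clobber_scratch_regs() is hypothetical, and nothing in the thread says this exact code was run. It overwrites the three caller-saved registers of the i386 C calling convention and, through the xor instructions, the arithmetic flags, without doing any of the other work a printk() does.

```c
/*
 * Hypothetical stand-in for the printk() call: overwrite the registers a
 * C callee is allowed to clobber on i386 (eax, ecx, edx) plus the
 * arithmetic flags, with no I/O and no stack traffic.
 *
 * Returns 1 if the registers were overwritten, 0 when built for a
 * non-x86 target (where the helper does nothing).
 */
static int clobber_scratch_regs(void)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
	__asm__ __volatile__(
		"xorl %%eax, %%eax\n\t"
		"xorl %%ecx, %%ecx\n\t"
		"xorl %%edx, %%edx\n\t"	/* each xor also rewrites the flags */
		: /* no outputs */
		: /* no inputs */
		: "eax", "ecx", "edx", "cc", "memory");
	return 1;
#else
	return 0;
#endif
}
```

Dropping a call to such a helper where the printk() used to be would separate "some register happened to get a lucky value" from the timing and cache effects of a full printk().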
On Tue, 18 Nov 2003, Jon Foster wrote:
> > The other thing I've found printks to hide before is timing bugs / races.
> > Unfortunately I can't see one here, but maybe someone else can ;-)
> > Maybe inserting a 1ms delay or something in place of the printk would
> > have the same effect?
>
> One of my colleagues had an interesting bug caused by an
> uninitialized variable - a printk() in the right place happened
> to set the variable (which gcc had put in a register) to the
> correct value for his code to work.
Very nice =)
> I've tried looking for uses of uninitialized registers in entry.S,
> but the assembly there isn't easy to follow.
I've walked that code and can't see anything wrong anywhere.
> What happens if you replace the printk with assembly code
> that clobbers eax, ecx, edx and (most of) eflags? (Assuming
> I've remembered the calling convention correctly, those are
> the registers that printk will be overwriting).
Well i have tried a number of heavyweight functions, so far none of them
have had the effect that a printk has had. It's also worth noting that a
printk lookalike function such as the following, does not fix things
either.
asmlinkage int kooh_la_la(const char *fmt, ...)
{
	return strlen(fmt);
}
Zwane Mwaikambo <[email protected]> wrote:
>
> I've walked that code and can't see anything wrong anywhere.
fwiw, X comes up happily on a couple of boxes here, with the 4g/4g split
enabled.
Have you tried a different compiler?
On Tue, 18 Nov 2003, Andrew Morton wrote:
> Zwane Mwaikambo <[email protected]> wrote:
> >
> > I've walked that code and can't see anything wrong anywhere.
>
> fwiw, X comes up happily on a couple of boxes here, with the 4g/4g split
> enabled.
The exact same kernel runs fine on my other test boxes. But i really don't
have faith in this compiler, it's the same one which constantly seems to be
tripping into various problems.
> Have you tried a different compiler?
I just tried the RH9 2.96 and it also triple faulted. Oh my.. The only
unique thing about this hardware compared to the other stuff i have here
is that it's an AMD K6. Everything else is Intel.
On Wed, 19 Nov 2003, Zwane Mwaikambo wrote:
>
> I just tried the RH9 2.96 and it also triple faulted. Oh my.. The only
> unique thing about this hardware compared to the other stuff i have here
> is that it's an AMD K6. Everything else is Intel.
Different TLB sizes (and organizations) etc can _easily_ matter, if the
Intel one just happens to work because something stays in the TLB while
the page table mapping is incorrect and keeps the system afloat.
Or - and in this case more likely - since the problem is fixed by running
a (complex) thing that trashes all over the DTLB/ITLB, it's more likely
that there might be a _missing_ TLB invalidate somewhere, and that the
Intel boxes stay up because they have a smaller TLB and the stale entry
gets flushed out early from them.
But you already tried a "flush_tlb_all()" which _should_ have flushed
absolutely everything, including global tables. I dunno. It could be
hitting a CPU bug too, of course.
It would be interesting to hear if other K6 users see problems..
Linus
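The "missing TLB invalidate" hypothesis above can be illustrated with a toy model. Everything in it is an assumption made for illustration: a direct-mapped, software-simulated TLB, made-up sizes and frame numbers, and hypothetical names. Real TLBs are set-associative and split into ITLB/DTLB, so this only shows the general mechanism: a stale entry survives longer in a larger TLB, while unrelated memory traffic evicts it sooner from a smaller one.

```c
#define TOY_PAGES   256
#define TOY_MAX_TLB 64

struct toy_tlb_entry { int valid; unsigned vpn; unsigned pfn; };

struct toy_mmu {
	unsigned page_table[TOY_PAGES];		/* vpn -> pfn, the "truth" */
	struct toy_tlb_entry tlb[TOY_MAX_TLB];	/* direct-mapped cache of it */
	unsigned tlb_size;			/* slots in use, 1..TOY_MAX_TLB */
};

static void toy_mmu_init(struct toy_mmu *m, unsigned tlb_size)
{
	unsigned i;

	m->tlb_size = tlb_size;
	for (i = 0; i < TOY_PAGES; i++)
		m->page_table[i] = 1000 + i;
	for (i = 0; i < TOY_MAX_TLB; i++)
		m->tlb[i].valid = 0;
}

/* Translate vpn; on a TLB miss, refill the slot from the page table. */
static unsigned toy_translate(struct toy_mmu *m, unsigned vpn)
{
	struct toy_tlb_entry *e = &m->tlb[vpn % m->tlb_size];

	if (!e->valid || e->vpn != vpn) {
		e->valid = 1;
		e->vpn = vpn;
		e->pfn = m->page_table[vpn];
	}
	return e->pfn;
}

/*
 * Cache vpn 0, change its mapping WITHOUT invalidating the TLB (the bug
 * class being discussed), touch `touched` other pages (must be below
 * TOY_PAGES), then look vpn 0 up again.
 * Returns 1 if the stale translation is still being used.
 */
static int toy_stale_after_workload(unsigned tlb_size, unsigned touched)
{
	struct toy_mmu m;
	unsigned i;

	toy_mmu_init(&m, tlb_size);
	toy_translate(&m, 0);		/* TLB now holds vpn 0 -> 1000 */
	m.page_table[0] = 42;		/* remap, but forget the flush */
	for (i = 1; i <= touched; i++)
		toy_translate(&m, i);	/* unrelated memory traffic */
	return toy_translate(&m, 0) != 42;
}
```

With 16 unrelated pages touched, an 8-entry toy TLB has already evicted the stale entry, while a 64-entry one still returns the old frame.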
On Tue, Nov 18, 2003 at 08:37:25AM -0800, Linus Torvalds wrote:
>
> On Tue, 18 Nov 2003, Zwane Mwaikambo wrote:
> >
> > Here are diffs from the do_sys_vm86 only.
>
> Ok. Much more readable.
>
> And there is something very suspicious there.
>
> The code with and without the printk() looks _identical_ apart from some
> trivial label renumbering, and the added
>
> pushl $.LC6
> call printk
> .. asm ..
> popl %esi
>
> which all looks fine (esi is dead at that point, so the compiler is just
> using a "popl" as a shorter form of "addl $4,%esp").
>
> Btw, you seem to compile with debugging, which makes the assembly
> language pretty much unreadable and accounts for most of the
> differences: the line numbers change. If you compile a kernel where the
> line numbers don't change (by commenting _out_ the printk rather than
> removing the whole line), your diff would be more readable.
>
> Anyway, there are _zero_ differences.
>
> Just for fun, try this: move the "printk()" to _below_ the "asm"
> statement. It will never actually get executed, but if it's an issue of
> some subtle code or data placement things (cache lines etc), maybe that
> also hides the oops, since all the same code and data will be generated,
> just not run...
Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
doesn't help. So my suspicion is that the printk is changing the
timing just enough on Zwane's box that he's getting a timer interrupt
knocking him out of vm86 mode before he hits a fatal bit in the fault
handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
there's probably something amiss in the trampoline code.
--
Matt Mackall : http://www.selenic.com : Linux development and consulting
On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
>
> Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> doesn't help. So my suspicion is that the printk is changing the
> timing just enough on Zwane's box that he's getting a timer interrupt
> knocking him out of vm86 mode before he hits a fatal bit in the fault
> handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> there's probably something amiss in the trampoline code.
Some more datapoints:
CPU           distro           compiler  video        X      result
K6-2/500      Conectiva 9      2.96      trident      4.3    reboot (zwane)
K6-2/500      Conectiva 9      3.2.2     trident      4.3    reboot (zwane)
Opteron 240   debian unstable  3.2       S3           4.2.1  reboot
Athlon 2100   debian unstable  3.2       radeon 7500  4.2.1  works
P4M 1800      debian unstable  3.2       radeon m7    4.2.1  reboot
--
Matt Mackall : http://www.selenic.com : Linux development and consulting
On Wed, 19 Nov 2003, Matt Mackall wrote:
> On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
> >
> > Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> > 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> > doesn't help. So my suspicion is that the printk is changing the
> > timing just enough on Zwane's box that he's getting a timer interrupt
> > knocking him out of vm86 mode before he hits a fatal bit in the fault
> > handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> > do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> > there's probably something amiss in the trampoline code.
>
> Some more datapoints:
Thanks for trying those out, i got another one to add.
> CPU           distro           compiler  video        X      result
> K6-2/500      Conectiva 9      2.96      trident      4.3    reboot (zwane)
> K6-2/500      Conectiva 9      3.2.2     trident      4.3    reboot (zwane)
> Opteron 240   debian unstable  3.2       S3           4.2.1  reboot
> Athlon 2100   debian unstable  3.2       radeon 7500  4.2.1  works
> P4M 1800      debian unstable  3.2       radeon m7    4.2.1  reboot
  P4/Xeon 2000  Fedora Core 1    3.3.2     ATI Rage XL  4.3.0  reboot
On Wed, Nov 19, 2003 at 05:09:28PM -0600, Matt Mackall wrote:
> On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
> >
> > Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> > 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> > doesn't help. So my suspicion is that the printk is changing the
> > timing just enough on Zwane's box that he's getting a timer interrupt
> > knocking him out of vm86 mode before he hits a fatal bit in the fault
> > handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> > do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> > there's probably something amiss in the trampoline code.
>
> Some more datapoints:
>
> CPU           distro           compiler  video        X      result
> K6-2/500      Conectiva 9      2.96      trident      4.3    reboot (zwane)
> K6-2/500      Conectiva 9      3.2.2     trident      4.3    reboot (zwane)
> Opteron 240   debian unstable  3.2       S3           4.2.1  reboot
> Athlon 2100   debian unstable  3.2       radeon 7500  4.2.1  works
> P4M 1800      debian unstable  3.2       radeon m7    4.2.1  reboot
And indeed it does turn out to be a problem with the trampoline
mechanics. The fix for -mm4:
Fix triple faulting on some boxes with 4G/4G
mm-mpm/arch/i386/kernel/vm86.c | 2 +-
1 files changed, 1 insertion(+), 1 deletion(-)
diff -puN arch/i386/kernel/vm86.c~virtual-esp arch/i386/kernel/vm86.c
--- mm/arch/i386/kernel/vm86.c~virtual-esp 2003-11-20 01:36:32.000000000 -0600
+++ mm-mpm/arch/i386/kernel/vm86.c 2003-11-20 01:36:32.000000000 -0600
@@ -306,7 +306,7 @@ static void do_sys_vm86(struct kernel_vm
tss->esp0 = virtual_esp0(tsk);
if (cpu_has_sep)
tsk->thread.sysenter_cs = 0;
- load_esp0(tss, &tsk->thread);
+ load_virtual_esp0(tss, tsk);
put_cpu();
tsk->thread.screen_bitmap = info->screen_bitmap;
_
--
Matt Mackall : http://www.selenic.com : Linux development and consulting
Matt Mackall <[email protected]> wrote:
>
> - load_esp0(tss, &tsk->thread);
> + load_virtual_esp0(tss, tsk);
Thanks guys.
Now I'll have to put something else in there to keep you amused ;)
On Thu, Nov 20, 2003 at 01:44:05AM -0600, Matt Mackall wrote:
> On Wed, Nov 19, 2003 at 05:09:28PM -0600, Matt Mackall wrote:
> > On Wed, Nov 19, 2003 at 02:32:10PM -0600, Matt Mackall wrote:
> > >
> > > Zwane's got a K6-2 500MHz. I've just managed to reproduce this on my
> > > 1.4GHz Opteron box (with Debian gcc 3.2). Here, the "ooh la la" bit
> > > doesn't help. So my suspicion is that the printk is changing the
> > > timing just enough on Zwane's box that he's getting a timer interrupt
> > > knocking him out of vm86 mode before he hits a fatal bit in the fault
> > > handling path for 4/4. Printks in handle_vm86_trap, handle_vm86_fault,
> > > do_trap:vm86_trap, and do_general_protection:gp_in_vm86 never fire so
> > > there's probably something amiss in the trampoline code.
> >
> > Some more datapoints:
> >
> > CPU           distro           compiler  video        X      result
> > K6-2/500      Conectiva 9      2.96      trident      4.3    reboot (zwane)
> > K6-2/500      Conectiva 9      3.2.2     trident      4.3    reboot (zwane)
> > Opteron 240   debian unstable  3.2       S3           4.2.1  reboot
> > Athlon 2100   debian unstable  3.2       radeon 7500  4.2.1  works
> > P4M 1800      debian unstable  3.2       radeon m7    4.2.1  reboot
>
> And indeed it does turn out to be a problem with the trampoline
> mechanics. The fix for -mm4:
Cleanup, as pointed out by Zwane:
Fix triple faulting on some boxes with 4G/4G
mm-mpm/arch/i386/kernel/vm86.c | 3 +--
1 files changed, 1 insertion(+), 2 deletions(-)
diff -puN arch/i386/kernel/vm86.c~virtual-esp arch/i386/kernel/vm86.c
--- mm/arch/i386/kernel/vm86.c~virtual-esp 2003-11-20 01:36:32.000000000 -0600
+++ mm-mpm/arch/i386/kernel/vm86.c 2003-11-20 02:08:38.000000000 -0600
@@ -303,10 +303,9 @@ static void do_sys_vm86(struct kernel_vm
tss = init_tss + get_cpu();
tsk->thread.esp0 = (unsigned long) &info->VM86_TSS_ESP0;
- tss->esp0 = virtual_esp0(tsk);
if (cpu_has_sep)
tsk->thread.sysenter_cs = 0;
- load_esp0(tss, &tsk->thread);
+ load_virtual_esp0(tss, tsk);
put_cpu();
tsk->thread.screen_bitmap = info->screen_bitmap;
_
--
Matt Mackall : http://www.selenic.com : Linux development and consulting
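One plausible reading of the two diffs above is that the old sequence first stored the virtual stack address in the TSS and then called load_esp0(), which put the plain thread.esp0 value back. The toy model below only illustrates that reading. The structures, the toy_ function bodies and the fixed offset standing in for the virtual address translation are all assumptions; the real helpers live in the 4G/4G patch and may well do more (for example, sysenter handling).

```c
/* Toy stand-ins; only the field names echo the patch. */
struct toy_thread { unsigned long esp0; };
struct toy_task   { struct toy_thread thread; };
struct toy_tss    { unsigned long esp0; };

#define TOY_VIRT_OFFSET 0x1000UL	/* made-up address translation */

static unsigned long toy_virtual_esp0(const struct toy_task *tsk)
{
	return tsk->thread.esp0 + TOY_VIRT_OFFSET;
}

/* Assumed behaviour: copy the thread's esp0 into the TSS unchanged. */
static void toy_load_esp0(struct toy_tss *tss, const struct toy_thread *t)
{
	tss->esp0 = t->esp0;
}

/* Assumed behaviour: store the translated (virtual) esp0 instead. */
static void toy_load_virtual_esp0(struct toy_tss *tss,
				  const struct toy_task *tsk)
{
	tss->esp0 = toy_virtual_esp0(tsk);
}

/* Sequence as it stood before the fix: the second store wins. */
static unsigned long toy_before_fix(const struct toy_task *tsk)
{
	struct toy_tss tss;

	tss.esp0 = toy_virtual_esp0(tsk);
	toy_load_esp0(&tss, &tsk->thread);	/* overwrites the line above */
	return tss.esp0;
}

/* Sequence after the cleanup: a single store of the virtual address. */
static unsigned long toy_after_fix(const struct toy_task *tsk)
{
	struct toy_tss tss;

	toy_load_virtual_esp0(&tss, tsk);
	return tss.esp0;
}
```

Under these assumptions the cleanup patch is also easy to read: once load_virtual_esp0() does the store, the explicit tss->esp0 assignment is redundant.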
On Tue, Nov 18, 2003 at 03:47:53PM -0800, Daniel McNeil wrote:
> Suparna,
>
> I was unable to reproduce the hang in io_submit() without your patch.
> I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.
>
> I re-ran with your patch with both as-iosched and deadline and both
> hung in io_submit(). aiocp would run a few times, but I put the
> aiocp in a while loop and it hung on the 1st or 2nd time. It
> did get most of the way through copying the file before hanging.
> This is on a 2-proc to ide disks running ext3.
>
Found one race ... not sure if it's the one causing the hangs
you see. The attached patch is not a complete fix (there is one
other race to close), but it would be interesting to see if
this makes any difference for you.
Regards
Suparna
--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India
------------------------------------------------------
Don't access dio fields if it's possible that the dio could
already have been freed asynchronously during i/o completion.
Fixme: This still leaves a window between decrement of
bio_count and accessing dio->waiter during i/o completion
wherein the dio could get freed by the submission path.
--- pure-mm3/fs/direct-io.c 2003-11-24 13:00:33.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-24 14:15:30.000000000 +0530
@@ -994,14 +995,17 @@
* reflect the number of to-be-processed BIOs.
*/
if (dio->is_async) {
- if (ret == 0)
- ret = dio->result;
- if (ret > 0 && dio->result < dio->size && rw == WRITE) {
+ int should_wait = 0;
+
+ if (dio->result < dio->size && rw == WRITE) {
dio->waiter = current;
+ should_wait = 1;
}
+ if (ret == 0)
+ ret = dio->result;
finished_one_bio(dio); /* This can free the dio */
blk_run_queues();
- if (dio->waiter) {
+ if (should_wait) {
/*
* Wait for already issued I/O to drain out and
* release its references to user-space pages
@@ -1013,7 +1017,7 @@
set_current_state(TASK_UNINTERRUPTIBLE);
}
set_current_state(TASK_RUNNING);
- dio->waiter = NULL;
+ kfree(dio);
}
} else {
finished_one_bio(dio);
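The core idea of the patch above, deciding whether to wait before making the call that may free the object and then consulting only the local copy, can be shown in a few lines of userspace C. This is a simplified sketch, not the kernel code: toy_dio, toy_finish() and toy_submit() are hypothetical names, and the actual sleeping is left out.

```c
#include <stdlib.h>

struct toy_dio {
	int result;	/* bytes completed by direct I/O */
	int size;	/* total request size */
	int has_waiter;	/* stands in for dio->waiter != NULL */
};

/* Stand-in for finished_one_bio(): frees the object unless someone waits. */
static void toy_finish(struct toy_dio *dio)
{
	if (!dio->has_waiter)
		free(dio);
}

/*
 * Returns 1 if the submitter had to wait (partial write that falls back
 * to buffered I/O), 0 if not, -1 on allocation failure.
 */
static int toy_submit(int result, int size, int is_write)
{
	struct toy_dio *dio = malloc(sizeof(*dio));
	int should_wait = 0;

	if (!dio)
		return -1;
	dio->result = result;
	dio->size = size;
	dio->has_waiter = 0;

	if (dio->result < dio->size && is_write) {
		dio->has_waiter = 1;
		should_wait = 1;	/* remember the decision locally */
	}

	toy_finish(dio);		/* dio may be gone after this line */

	if (should_wait) {		/* local copy, NOT dio->has_waiter */
		/* the real code sleeps here until in-flight I/O drains */
		free(dio);		/* the waiting path owns the object */
	}
	return should_wait;
}
```

Testing dio->has_waiter after toy_finish() would be the use-after-free that the original `if (dio->waiter)` check amounted to.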
Suparna,
Yes, your patch did help. I originally had CONFIG_DEBUG_SLAB=y, which
was helping me see problems because the freed dio was getting
poisoned. I also tested with CONFIG_DEBUG_PAGEALLOC=y, which is
very good at catching these.
I combined your AIO fallback patch with your AIO race patch and added a
fix for the bio_count decrement race. This patch has all three fixes and
it is working for me.
I fixed the bio_count race by changing bio_list_lock into bio_lock
and using that for all the bio fields. I changed bio_count and
bios_in_flight from atomics into ints; they are now protected by
the bio_lock. In finished_one_bio(), I leave bio_count at 1 until
after the dio_complete(), and then do the bio_count decrement and the
wakeup while holding the bio_lock.
Take a look, give it a try, and let me know what you think.
I've tested this on my 2-way and so far all my tests have passed.
I have more testing to do, but this is working better.
Thanks,
Daniel
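The ordering Daniel describes (keep the count at 1 across the completion work, then do the last decrement and the wakeup under the lock) might be sketched in userspace as follows. This is not his patch, which is not reproduced in this thread; a pthread mutex stands in for the irq-disabling spinlock, a flag stands in for wake_up_process(), and all toy_ names are hypothetical.

```c
#include <pthread.h>

struct toy_dio {
	pthread_mutex_t bio_lock;	/* protects bio_count and woken */
	int bio_count;			/* plain int, no longer atomic */
	int woken;			/* stands in for wake_up_process() */
	int completed;			/* stands in for dio_complete() */
};

static void toy_dio_init(struct toy_dio *d, int nr_bios)
{
	pthread_mutex_init(&d->bio_lock, NULL);
	d->bio_count = nr_bios;
	d->woken = 0;
	d->completed = 0;
}

/* Called once per finished BIO. Returns 1 if this was the last one. */
static int toy_finished_one_bio(struct toy_dio *d)
{
	int last;

	pthread_mutex_lock(&d->bio_lock);
	last = (d->bio_count == 1);
	if (!last)
		d->bio_count--;
	pthread_mutex_unlock(&d->bio_lock);
	if (!last)
		return 0;

	/*
	 * bio_count is still 1 here, so a waiter that checks it under the
	 * lock cannot see zero and free the object while we complete.
	 */
	d->completed = 1;

	pthread_mutex_lock(&d->bio_lock);
	d->bio_count--;		/* final decrement ... */
	d->woken = 1;		/* ... and wakeup, both under the lock */
	pthread_mutex_unlock(&d->bio_lock);
	return 1;
}
```

The point of the design is that a waiter can never observe bio_count == 0 before the completion work has finished touching the object.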
On Mon, 2003-11-24 at 01:42, Suparna Bhattacharya wrote:
> On Tue, Nov 18, 2003 at 03:47:53PM -0800, Daniel McNeil wrote:
> > Suparna,
> >
> > I was unable to reproduce the hang in io_submit() without your patch.
> > I ran aiocp with 1k i/o size constantly for 2 hours and it never hung.
> >
> > I re-ran with your patch with both as-iosched and deadline and both
> > hung in io_submit(). aiocp would run a few times, but I put the
> > aiocp in a while loop and it hung on the 1st or 2nd time. It
> > did get most of the way through copying the file before hanging.
> > This is on a 2-proc to ide disks running ext3.
> >
>
> Found one race ... not sure if it's the one causing the hangs
> you see. The attached patch is not a complete fix (there is one
> other race to close), but it would be interesting to see if
> this makes any difference for you.
>
> Regards
> Suparna
On Tue, Nov 25, 2003 at 03:49:31PM -0800, Daniel McNeil wrote:
> Suparna,
>
> Yes, your patch did help. I originally had CONFIG_DEBUG_SLAB=y, which
> was helping me see problems because the freed dio was getting
> poisoned. I also tested with CONFIG_DEBUG_PAGEALLOC=y, which is
> very good at catching these.
Ah I see - perhaps that explains why neither Janet nor I could
recreate the problem that you were hitting so easily. So we
should probably try running with CONFIG_DEBUG_SLAB and
CONFIG_DEBUG_PAGEALLOC as well.
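Why poisoning makes this kind of bug so much easier to hit can be shown with a toy allocator. The sketch below is an assumption-laden illustration, not the slab code: it uses a small static pool so that reading a released object stays well-defined in C, and the toy_ names are hypothetical. The fill byte 0x6b is the value the Linux slab debugging code writes over freed objects.

```c
#include <stdint.h>
#include <string.h>

#define TOY_POISON_FREE 0x6b	/* byte written over freed objects */
#define TOY_POOL_SIZE   4

struct toy_obj {
	uint32_t waiter;
	uint32_t bio_count;
};

static struct toy_obj toy_pool[TOY_POOL_SIZE];
static int toy_in_use[TOY_POOL_SIZE];

/* Hand out the first free slot, zero-filled, or NULL if none is left. */
static struct toy_obj *toy_alloc(void)
{
	int i;

	for (i = 0; i < TOY_POOL_SIZE; i++) {
		if (!toy_in_use[i]) {
			toy_in_use[i] = 1;
			memset(&toy_pool[i], 0, sizeof(toy_pool[i]));
			return &toy_pool[i];
		}
	}
	return NULL;
}

/* Release a slot and overwrite it so stale readers see garbage. */
static void toy_free(struct toy_obj *o)
{
	memset(o, TOY_POISON_FREE, sizeof(*o));
	toy_in_use[o - toy_pool] = 0;
}

/* Returns 1 if every byte of the object carries the poison value. */
static int toy_looks_poisoned(const struct toy_obj *o)
{
	const unsigned char *p = (const unsigned char *)o;
	size_t i;

	for (i = 0; i < sizeof(*o); i++)
		if (p[i] != TOY_POISON_FREE)
			return 0;
	return 1;
}
```

Without poisoning, a stale read of a freed object often still returns the old, plausible values, so the race goes unnoticed; with it, a stale `waiter` reads back as 0x6b6b6b6b and the code misbehaves promptly.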
>
> I combined your AIO fallback patch with your AIO race patch and added a
> fix for the bio_count decrement race. This patch has all three fixes and
> it is working for me.
>
> I fixed the bio_count race by changing bio_list_lock into bio_lock
> and using that for all the bio fields. I changed bio_count and
> bios_in_flight from atomics into ints; they are now protected by
> the bio_lock. In finished_one_bio(), I leave bio_count at 1 until
> after the dio_complete(), and then do the bio_count decrement and the
> wakeup while holding the bio_lock.
>
> Take a look, give it a try, and let me know what you think.
I had been trying a slightly different kind of fix -- appended is
the updated version of the patch I last posted. It uses the bio_list_lock
to protect the dio->waiter field, which finished_one_bio sets back
to NULL after it has issued the wakeup; and the code that waits for
i/o to drain out checks the dio->waiter field instead of bio_count.
This might not seem very obvious given the nomenclature of the
bio_list_lock, so I was holding back wondering if it could be
improved.
Your approach looks clearer in that sense -- it's pretty unambiguous
about what lock protects what fields. The only thing that bothers me (and
this is what I was trying to avoid in my patch) is the increased
use of spin_lock_irq's (overhead of turning interrupts off and on)
instead of simple atomic inc/dec in most places.
Thoughts ?
Regards
Suparna
--
Suparna Bhattacharya ([email protected])
Linux Technology Center
IBM Software Labs, India
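The trade-off raised above, a lock around every counter update versus a bare atomic decrement, can be put side by side in a userspace sketch. It assumes C11 atomics and a pthread mutex as stand-ins for the kernel's atomic_t and spin_lock_irq(); the toy_ names are hypothetical, and it says nothing about the actual cost of disabling interrupts.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Lock-free variant: one atomic read-modify-write per put. */
static int toy_put_atomic(atomic_int *count)
{
	/* atomic_fetch_sub returns the value BEFORE the subtraction */
	return atomic_fetch_sub(count, 1) == 1;
}

/* Locked variant: the same lock can also cover other fields. */
struct toy_counter {
	pthread_mutex_t lock;
	int count;
};

static void toy_counter_init(struct toy_counter *c, int n)
{
	pthread_mutex_init(&c->lock, NULL);
	c->count = n;
}

static int toy_put_locked(struct toy_counter *c)
{
	int last;

	pthread_mutex_lock(&c->lock);
	last = (--c->count == 0);
	/* a wakeup or a waiter-field update could go here, still locked */
	pthread_mutex_unlock(&c->lock);
	return last;
}
```

Both return 1 only for the final put. The atomic form is cheaper per call, but it orders nothing except the counter itself, which is why a separate rule is then needed for fields such as the waiter pointer.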
------------------------------------
Don't access dio fields if it's possible that the dio could
already have been freed asynchronously during i/o completion.
The dio->bio_list_lock protects the dio->waiter field as in
the case of synchronous i/o.
--- pure-mm3/fs/direct-io.c 2003-11-24 13:00:33.000000000 +0530
+++ linux-2.6.0-test9-mm3/fs/direct-io.c 2003-11-25 14:08:26.000000000 +0530
@@ -231,8 +231,17 @@
aio_complete(dio->iocb, dio->result, 0);
kfree(dio);
} else {
- if (dio->waiter)
- wake_up_process(dio->waiter);
+ struct task_struct *waiter;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dio->bio_list_lock, flags);
+ waiter = dio->waiter;
+ if (waiter) {
+ dio->waiter = NULL;
+ wake_up_process(waiter);
+ }
+ spin_unlock_irqrestore(&dio->bio_list_lock,
+ flags);
}
}
}
@@ -994,26 +1004,35 @@
* reflect the number of to-be-processed BIOs.
*/
if (dio->is_async) {
- if (ret == 0)
- ret = dio->result;
- if (ret > 0 && dio->result < dio->size && rw == WRITE) {
+ int should_wait = 0;
+
+ if (dio->result < dio->size && rw == WRITE) {
dio->waiter = current;
+ should_wait = 1;
}
+ if (ret == 0)
+ ret = dio->result;
finished_one_bio(dio); /* This can free the dio */
blk_run_queues();
- if (dio->waiter) {
+ if (should_wait) {
+ unsigned long flags;
/*
* Wait for already issued I/O to drain out and
* release its references to user-space pages
* before returning to fallback on buffered I/O
*/
+ spin_lock_irqsave(&dio->bio_list_lock, flags);
set_current_state(TASK_UNINTERRUPTIBLE);
- while (atomic_read(&dio->bio_count)) {
+ while (dio->waiter) {
+ spin_unlock_irqrestore(&dio->bio_list_lock,
+ flags);
io_schedule();
set_current_state(TASK_UNINTERRUPTIBLE);
+ spin_lock_irqsave(&dio->bio_list_lock, flags);
}
set_current_state(TASK_RUNNING);
- dio->waiter = NULL;
+ spin_unlock_irqrestore(&dio->bio_list_lock, flags);
+ kfree(dio);
}
} else {
finished_one_bio(dio);