ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
- The CPU scheduler changes in -mm (sched-domains) have been hanging about
for too long. I had been hoping that the people who care about SMT and
NUMA performance would have some results by now but all seems to be silent.
I do not wish to merge these up until the big-iron guys can say that they
suit their requirements, with a reasonable expectation that we will not
need to churn this code later in the 2.6 series.
So. If you have been testing, please speak up. If you have not been
testing, please do so.
- Major surgery against the pagecache, radix-tree and writeback code. This
work is to address the O_DIRECT-vs-buffered data exposure horrors which
we've been struggling with for months.
As a side-effect, 32 bytes are saved from struct inode and eight bytes
are removed from struct page.
This change will break any arch code which is using page->list and will
also break any arch code which is using page->lru of memory which was
obtained from slab.
It seems to work OK here, but I suggest people not rush out and convert
all of the corporate finance department's servers to 2.6.4-mm1.
The basic problem which we (mainly Daniel McNeil) have been struggling
with is in getting a really reliable fsync() across the page lists while
other processes are performing writeback against the same file. It's like
juggling four bars of wet soap with your eyes shut while someone is
whacking you with a baseball bat. Daniel pretty much has the problem
plugged but I suspect that's just because we don't have testcases to
trigger the remaining problems. The complexity and additional locking
which those patches add is worrisome.
So the approach taken here is to remove the page lists altogether and
replace the list-based writeback and wait operations with in-order
radix-tree walks.
The radix-tree code has been enhanced to support "tagging" of pages, for
later searches for pages which have a particular tag set. This means that
we can ask the radix tree code "find me the next 16 dirty pages starting at
pagecache index N" and it will do that in O(log64(N)) time.
This affects I/O scheduling potentially quite significantly. It is no
longer the case that the kernel will submit pages for I/O in the order in
which the application dirtied them. We instead submit them in file-offset
order all the time.
This is likely to be advantageous when applications are seeking all over
a large file randomly writing small amounts of data. I haven't performed
much benchmarking, but tiobench random write throughput seems to be
increased by 30%. Other tests appear to be unaltered. dbench may have got
10-20% quicker, but it's variable.
There is one large file which everyone seeks all over randomly writing
small amounts of data: the blockdev mapping which caches filesystem
metadata. The kernel's IO submission patterns for this are now ideal.
Because writeback and wait-for-writeback use a tree walk instead of a
list walk they are no longer livelockable. This probably means that we no
longer need to hold i_sem across O_SYNC writes and perhaps fsync() and
fdatasync(). This may be beneficial for databases: multiple processes
writing and syncing different parts of the same file at the same time can
now all submit and wait upon writes to just their own little bit of the
file, so we can get a lot more data into the queues.
It is trivial to implement a part-file-fdatasync() as well, so
applications can say "sync the file from byte N to byte M", and multiple
applications can do this concurrently. This is easy for ext2 filesystems,
but probably needs lots of work for data-journalled filesystems and XFS and
it probably doesn't offer much benefit over an i_semless O_SYNC write.
- Dropped the hotplug CPU patches: bits of them were merged into Linus's
kernel and things broke.
- Various little fixes as usual.
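
A minimal user-space sketch of the radix-tree tagging idea described above. It is not the kernel's radix-tree code: it assumes a fixed two-level tree with 64 slots per node (indices 0..4095) and a single "dirty" tag stored as one 64-bit bitmap per node. The names `node`, `tree_set_dirty` and `tree_gang_lookup_dirty` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SHIFT     6
#define FANOUT    (1u << SHIFT)       /* 64 slots per node, hence log64 */
#define MASK      (FANOUT - 1)
#define MAX_INDEX (FANOUT * FANOUT)   /* two levels: indices 0..4095 */

/* Hypothetical node layout; the real kernel structure differs. */
struct node {
	void *slot[FANOUT]; /* children at the top level, items in a leaf */
	uint64_t dirty;     /* bit i set: slot i is (or contains) a dirty item */
};

static struct node *node_new(void)
{
	return calloc(1, sizeof(struct node));
}

/* Store item at index and tag it dirty in the leaf and in the root. */
static int tree_set_dirty(struct node *root, unsigned index, void *item)
{
	unsigned hi = index >> SHIFT, lo = index & MASK;
	struct node *leaf;

	assert(root != NULL);
	if (index >= MAX_INDEX)
		return -1;
	if (!root->slot[hi] && !(root->slot[hi] = node_new()))
		return -1;
	leaf = root->slot[hi];
	leaf->slot[lo] = item;
	leaf->dirty |= 1ULL << lo;
	root->dirty |= 1ULL << hi; /* propagate the tag upwards */
	return 0;
}

/* Collect up to max dirty indices >= start, in ascending index order. */
static unsigned tree_gang_lookup_dirty(const struct node *root, unsigned start,
				       unsigned *out, unsigned max)
{
	unsigned n = 0;

	for (unsigned hi = start >> SHIFT; hi < FANOUT && n < max; hi++) {
		const struct node *leaf = root->slot[hi];
		unsigned lo = (hi == (start >> SHIFT)) ? (start & MASK) : 0;

		if (!(root->dirty & (1ULL << hi)))
			continue; /* nothing dirty below: skip 64 indices at once */
		for (; lo < FANOUT && n < max; lo++)
			if (leaf->dirty & (1ULL << lo))
				out[n++] = (hi << SHIFT) | lo;
	}
	return n;
}
```

In this model, dirtying indices 700, 5 and 64 in that order and then asking for up to 16 dirty entries from index 0 returns 5, 64, 700: file-offset order rather than the order in which the entries were dirtied.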
Changes since 2.6.4-rc2-mm1:
bk-acpi.patch
bk-alsa.patch
bk-driver-core.patch
bk-i2c.patch
bk-input.patch
bk-netdev.patch
bk-pci.patch
bk-scsi.patch
bk-usb.patch
Latest external trees
-export-filemap_flush.patch
-vma-corruption-fix.patch
-centaur-crypto-core-support.patch
Merged
+bk-acpi-warning-fix.patch
Fix a warning
+x86_64-update.patch
Latest x86_64 code drop
+print-kernel-version-in-oops.patch
Display the kernel version in the x86 oops message
+ppc64-iseries-virtual-console-fix.patch
iSeries device number fix
-zap_page_range-debug.patch
Turns out the code path which this patch was trying to detect the deadness
of is in fact used.
+sched-stats-64-bit.patch
Use 64-bit numbers for various CPU scheduler statistics
-hotplugcpu-generalise-bogolock.patch
-hotplugcpu-generalise-bogolock-fix-for-kthread-stop-using-signals.patch
-hotplugcpu-use-bogolock-in-modules.patch
-hotplugcpu-core.patch
-stop_machine-warning-fix.patch
-hotplugcpu-core-sparc64-build-fix.patch
-hotplugcpu-core-fix-for-kthread-stop-using-signals.patch
-migrate_to_cpu-dependency-fix.patch
-hotplugcpu-core-drain_local_pages-fix.patch
-hotplugcpu-rcupdate-many-cpus-fix.patch
Dropped
-ext3-dirty-debug-patch.patch
This debug trap never triggered
-fusion-use-min-max.patch
Other changes broke this
+dm-map-rwlock-ng.patch
New version of spinlocking for the device mapper map tables
+dm-remove-__dm_request.patch
Remove __dm_request()
+md-array-assembly-major-fix.patch
RAID fix
+fadvise-fixups.patch
Fix some fadvise() boundary conditions
+validate_mm-fixes.patch
Enhance validate_mm()
+3ware-update.patch
3ware driver update
+3c59x-xcvr-fix.patch
Fix 3c59x transceiver handling
+current_is_keventd-speedup.patch
Simplify current_is_keventd()
+root-ramdisk-fix.patch
Make "root=/dev/ram" work again
+cciss-per-device-queues.patch
per-device queues for the cciss driver
+blkdev-fix-final-page.patch
Fix reads of the final block of blockdevs
+wavfront-needs-syscalls_h.patch
Warning (and possible oops) fixes
+edd-legacy-parameters-fix.patch
EDD back-compatibility
+cciss-section-fix.patch
__init section fix
+pte_chain-nowarns.patch
Prevent possible-but-expected page allocator warnings
+macintosh-config-fix.patch
Don't offer mac drivers on other platforms
+applicom-warning-fix.patch
Fix a warning
+CONFIG_NVRAM-dependencies.patch
Fix NVRAM dependencies
+move-job-control-stuff-tosignal_struct.patch
Move various job control fields out of the task_struct and into the
signal_struct.
+module_h-attribute_used-fix.patch
__attribute_used__ sanity
+kobject-module-request-64-bit-fix.patch
Fix for 64-bit machines
+sch_htb-fix.patch
netfilter 64-bit fix
+blk-congestion-races.patch
Conceivably fix rare races in blk_congestion_wait()
+vm-lrutopage-cleanup.patch
Add a handy macro to tidy up vmscan.c
+radix-tree-tagging.patch
Add search tagging to radix trees.
+irq-safe-pagecache-lock.patch
Make mapping->page_lock irq-safe, and rename it to tree_lock to detect
missed conversions.
+tag-dirty-pages.patch
Tag dirty pages as being dirty within their radix trees.
+tag-writeback-pages.patch
Tag writeback pages as being under writeback in their radix trees
+stop-using-dirty-pages.patch
+stop-using-io-pages.patch
+stop-using-locked-pages.patch
+stop-using-clean-pages.patch
Wean the kernel off the four address_space page lists
+unslabify-pgds-and-pmds.patch
We cannot use page->lru to manage slab-derived pages: slab itself wants to
use it.
+slab-stop-using-page-list.patch
Switch slab page management from page->list to page->lru.
+page_alloc-stop-using-page-list.patch
Switch the page allocator from using page->list to using page->lru.
+hugetlb-stop-using-page-list.patch
Switch the hugetlbpage implementations from using page->list to using
page->lru.
+pageattr-stop-using-page-list.patch
Switch the pageattr code (CONFIG_DEBUG_PAGEALLOC) from using page->list to
using page->lru.
+readahead-stop-using-page-list.patch
Switch the readpages() API from using page->list over to using page->lru.
+compound-pages-stop-using-lru.patch
Teach the compound page management to use page fields other than page->list.
+remove-page-list.patch
Remove the `list' field from struct page.
+remap-file-pages-prot-ia64-2.6.4-rc2-mm1-A0.patch
Implement the per-page-permissions-in-remap_file_pages for ia64. Hasn't
been tested.
-4g4g-THREAD_SIZE-fixes.patch
-4g4g-handle_BUG-fix.patch
Folded into 4g-2.6.0-test2-mm2-A5.patch
-O_DIRECT-vs-buffered-fix.patch
-O_DIRECT-vs-buffered-fix-pdflush-hang-fix.patch
-serialise-writeback-fdatawait.patch
-restore-writeback-trylock.patch
Dropped. Hopefully we don't need these any more.
All 258 patches:
bk-acpi.patch
bk-alsa.patch
bk-driver-core.patch
bk-i2c.patch
bk-input.patch
bk-netdev.patch
bk-pci.patch
bk-scsi.patch
bk-usb.patch
mm.patch
add -mmN to EXTRAVERSION
dma_sync_for_device-cpu.patch
dma_sync_for_{cpu,device}()
bk-acpi-warning-fix.patch
bk-acpi warning fixes
x86_64-update.patch
x86-64 merge for 2.6.4
move-dma_consistent_dma_mask.patch
move consistent_dma_mask to the generic device
move-dma_consistent_dma_mask-x86_64-fix.patch
move-dma_consistent_dma_mask-sn-fix.patch
Fix dma_mask patch for sn platform
print-kernel-version-in-oops.patch
print kernel version in oops messages
kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
kgdb: warning fix
kgdb buffer overflow fix
kgdb: warning fix
kgdb: CONFIG_DEBUG_INFO fix
x86_64 fixes
correct kgdb.txt Documentation link (against 2.6.1-rc1-mm2)
kgdb-ga-recent-gcc-fix.patch
kgdb: fix for recent gcc
kgdboe-netpoll.patch
kgdb-over-ethernet via netpoll
kgdboe-non-ia32-build-fix.patch
kgdb-warning-fixes.patch
kgdb warning fixes
kgdb-x86_64-support.patch
kgdb-x86_64-support.patch for 2.6.2-rc1-mm3
kgdb-THREAD_SIZE-fixes.patch
THREAD_SIZE fixes for kgdb
must-fix.patch
must fix lists update
must fix list update
mustfix update
must-fix-update-5.patch
must-fix update
ppc64-iseries-virtual-console-fix.patch
ppc64: fix iSeries virtual console devices
ppc64-reloc_hide.patch
compat-signal-noarch-2004-01-29.patch
Generic 32-bit compat for copy_siginfo_to_user
compat-generic-ipc-emulation.patch
generic 32 bit emulation for System-V IPC
remove-sys_ioperm-stubs.patch
Clean up sys_ioperm stubs
readdir-cleanups.patch
readdir() cleanups
ext3-journalled-quotas-2.patch
ext3: journalled quota
invalidate_inodes-speedup.patch
invalidate_inodes speedup
more invalidate_inodes speedup fixes
cfq-4.patch
CFQ io scheduler
CFQ fixes
config_spinline.patch
uninline spinlocks for profiling accuracy.
pdflush-diag.patch
get_user_pages-handle-VM_IO.patch
fix get_user_pages() against mappings of /dev/mem
pci_set_power_state-might-sleep.patch
CONFIG_STANDALONE-default-to-n.patch
Make CONFIG_STANDALONE default to N
extra-buffer-diags.patch
CONFIG_SYSFS.patch
From: Pat Mochel <[email protected]>
Subject: [PATCH] Add CONFIG_SYSFS
CONFIG_SYSFS-boot-from-disk-fix.patch
slab-leak-detector.patch
slab leak detector
mm/slab.c warning in cache_alloc_debugcheck_after
scale-nr_requests.patch
scale nr_requests with TCQ depth
truncate_inode_pages-check.patch
local_bh_enable-warning-fix.patch
sched-stats-64-bit.patch
Use 64-bit counters for scheduler stats
sched-find_busiest_node-resolution-fix.patch
sched: improved resolution in find_busiest_node
sched-domains.patch
sched: scheduler domain support
sched: fix for NR_CPUS > BITS_PER_LONG
sched: clarify find_busiest_group
sched: find_busiest_group arithmetic fix
sched-domains-improvements.patch
sched domains kernbench improvements
sched-clock-fixes.patch
fix sched_clock()
sched-sibling-map-to-cpumask.patch
sched: cpu_sibling_map to cpu_mask
p4-clockmod sibling_map fix
p4-clockmod: handle more than two siblings
sched-domains-i386-ht.patch
sched: implement domains for i386 HT
sched: Fix CONFIG_SMT oops on UP
sched: fix SMT + NUMA bug
Change arch_init_sched_domains to use cpu_online_map
Fix build with NR_CPUS > BITS_PER_LONG
sched-domain-tweak.patch
i386-sched-domain code consolidation
sched-no-drop-balance.patch
sched: handle inter-CPU jiffies skew
sched-directed-migration.patch
sched_balance_exec(): don't fiddle with the cpus_allowed mask
sched-domain-debugging.patch
sched_domain debugging
sched-domain-balancing-improvements.patch
scheduler domain balancing improvements
sched-group-power.patch
sched-group-power
sched-group-power warning fixes
sched-domains-use-cpu_possible_map.patch
sched_domains: use cpu_possible_map
sched-smt-nice-handling.patch
sched: SMT niceness handling
sched-smt-nice-optimisation.patch
sched: SMT-nice optimisation
fa311-mac-address-fix.patch
wrong mac address with netgear FA311 ethernet card
laptop-mode-2.patch
laptop-mode for 2.6, version 6
Documentation/laptop-mode.txt
laptop-mode documentation updates
Laptop mode documentation addition
laptop mode simplification
pid_max-fix.patch
Bug when setting pid_max > 32k
use-soft-float.patch
Use -msoft-float
DRM-cvs-update.patch
DRM cvs update
drm-include-fix.patch
process-migration-speedup.patch
Reduce TLB flushing during process migration
nfs-31-attr.patch
NFSv2/v3/v4: New attribute revalidation code
nfs-reconnect-fix.patch
nfs-mount-fix.patch
Update to NFS mount....
nfs-d_drop-lowmem.patch
NFS: handle nfs_fhget() error
nfs-avoid-i_size_write.patch
NFS: avoid unlocked i_size_write()
nfs_unlink-oops-fix.patch
nfs: fix "busy inodes after umount"
nfs-remove-XID-spinlock.patch
nfs: Remove an unnecessary spinlock from XID generation...
nfs-misc-rpc-fixes.patch
nfs: Misc RPC fixes...
nfs-improved-writeback-strategy.patch
nfs: improve writeback caching
nfs-simplify-config-options.patch
nfs: simplify client configuration options.
nfs-fix-msync.patch
nfs: fix msync()
nfs-mount-return-useful-errors.patch
nfs: make mount command return more useful errors
nfs-misc-minor-fixes.patch
nfs: misc minor fixes
nfs-lockd-sync-01.patch
nfs: sync lockd to 2.4.x
nfs-lockd-sync-02.patch
nfs: sync lockd to 2.4.x
nfs-lockd-sync-03.patch
nfs: sync lockd to 2.4.x
nfs-lockd-sync-04.patch
nfs: sync lockd to 2.4.x
nfs-rpc-remove-redundant-memset.patch
nfs: remove unnecessary memset() in RPC
nfs-tunable-rpc-slot-table.patch
nfs: make the RPC slot table size a tunable value.
nfs-short-read-fix.patch
nfs: fix an NFSv2 read bug
nfs-server-in-root_server_path.patch
Pull NFS server address out of root_server_path
non-readable-binaries.patch
Handle non-readable binfmt_misc executables
binfmt_misc-credentials.patch
binfmt_misc: improve calculation of interpreter's credentials
initramfs-search-for-init.patch
search for /init for initramfs boots
adaptive-lazy-readahead.patch
adaptive lazy readahead
sysfs_remove_dir-race-fix.patch
sysfs_remove_dir-vs-dcache_readdir race fix
sysfs_remove_subdir-dentry-leak-fix.patch
Fix dentry refcounting in sysfs_remove_group()
per-node-rss-tracking.patch
Track per-node RSS for NUMA
aic7xxx-deadlock-fix.patch
aic7xxx deadlock fix
futex_wait-debug.patch
futex_wait debug
module_exit-deadlock-fix.patch
module unload deadlock fix
selinux-inode-race-trap.patch
Try to diagnose Bug 2153
ufs2-01.patch
read-only support for UFS2
ide-scsi-error-handling-fixes.patch
ide-scsi error handling fixes
ide-scsi-error-handling-update.patch
ide-scsi error handler update
fb_console_init-fix.patch
fb_console_init fix
poll-select-longer-timeouts.patch
poll()/select(): support longer timeouts
poll-select-range-check-fix.patch
poll()/select() range checking fix
poll-select-handle-large-timeouts.patch
poll()/select(): handle long timeouts
pcmcia-debugging-rework-1.patch
Overhaul PCMCIA debugging (1)
cs_err-compile-fix.patch
pcmcia: workaround for gcc-2.95 bug in cs_err()
pcmcia-debugging-rework-2.patch
Overhaul PCMCIA debugging (2)
distribute-early-allocations-across-nodes.patch
Manfred's patch to distribute boot allocations across nodes
time-interpolator-fix.patch
time interpolator fix
kmsg-nonblock.patch
teach /proc/kmsg about O_NONBLOCK
mixart-build-fix.patch
CONFIG_SND_MIXART doesn't compile
add-a-slab-for-ethernet.patch
Add a kmalloc slab for ethernet packets
remove-__io_virt_debug.patch
remove __io_virt_debug
genrtc-cleanups.patch
genrtc: cleanups
piix_ide_init-can-be-__init.patch
piix_ide_init can be __init
i386-early-memory-cleanup.patch
i386 very early memory detection cleanup patch
modular-mce-handler.patch
Allow X86_MCE_NONFATAL to be a module
remove-more-KERNEL_SYSCALLS.patch
further __KERNEL_SYSCALLS__ removal
build fix for remove-more-KERNEL_SYSCALLS.patch
fix the build for remove-more-KERNEL_SYSCALLS
mq-01-codemove.patch
posix message queues: code move
mq-02-syscalls.patch
posix message queues: syscall stubs
mq-03-core.patch
posix message queues: implementation
mq-03-core-update.patch
posix message queues: update to core patch
mq-04-linuxext-poll.patch
posix message queues: linux-specific poll extension
mq-05-linuxext-mount.patch
posix message queues: made user mountable
mq-update-01.patch
posix message queue update
mq-security-fix.patch
security bugfix for mqueue
dm-01-endio-method.patch
dm: endio method
dm-03-list_for_each_entry-audit.patch
dm: list_for_each_entry audit
dm-04-default-queue-limits-fix.patch
dm: default queue limits
dm-05-list-targets-command.patch
dm: list targets cmd
dm-06-stripe-width-fix.patch
dm: stripe width fix
queue-congestion-callout.patch
Add queue congestion callout
queue-congestion-dm-implementation.patch
Implement queue congestion callout for device mapper
dm-maplock.patch
devicemapper: use rwlock for map alterations
dm-map-rwlock-ng.patch
Another DM maplock implementation
dm-remove-__dm_request.patch
dm: remove __dm_request
use-wait_task_inactive-in-kthread_bind.patch
use wait_task_inactive() in kthread_bind()
HPFS1-hpfs2-RC4-rc1.patch
HPFS2-hpfs_namei-RC4-rc1.patch
selinux-cleanup-binary-mount-data.patch
selinux: clean up binary mount data
udffs-update.patch
UDF filesystem update
kbuild-redundant-CFLAGS.patch
kbuild: Remove CFLAGS assignment in i386/mach-*/Makefile
numa-aware-zonelist-builder.patch
NUMA-aware zonelist builder
numa-aware zonelist builder fix
numa-aware node builder fix #2
remove-redundant-unplug_timer-deletion.patch
Redundant unplug_timer deletion
queue_work_on_cpu.patch
Add queue_work_on_cpu() workqueue function
m68k-rename-sys_functions.patch
m68k: rename sys_* functions
pdc202xx_new-update.patch
ide: update for pdc202xx_new driver
siimage-update.patch
ide: update for siimage driver
ide-cleanups-01.patch
ide: IDE cleanups
ide-cleanups-02.patch
ide: IDE cleanups
ide-cleanups-03.patch
ide: IDE cleanups
cdromaudio-use-dma.patch
use DMA for CDROM audio reading
sysfs-pin-kobject.patch
sysfs: pin kobjects to fix use-after-free crashes
ATI-IXP-IDE-support.patch
ATI IXP IDE support
ipmi-updates-3.patch
IPMI driver updates
ipmi-socket-interface.patch
IPMI: socket interface
md-use-schedule_timeout.patch
md: use "schedule_timeout(2)" instead of yield()
md-array-assembly-fix.patch
md: allow assembling of partitioned arrays at boot time.
md-array-assembly-major-fix.patch
md array assembly major number fix
compiler_h-scope-fixes.patch
compiler.h scoping fixes
nmi_watchdog-local-apic-fix.patch
Fix nmi_watchdog=2 and P4 HT
nmi-1-hz.patch
set nmi_hz to 1 with nmi_watchdog=2 and SMP
elf-mmap-fix.patch
Fix elf mapping of the zero page
kbuild-more-cleaning.patch
kbuild: Cause `make clean' to remove more files
LOOP_CHANGE_FD.patch
LOOP_CHANGE_FD ioctl
loop-setup-race-fix.patch
loop setup race fix
handle-dot-o-paths.patch
kbuild: fix usage with directories containing '.o'
acpi-asmlinkage-fix.patch
gcc-3.5: acpi build fix
ipc-sem-extra-sem_unlock.patch
Remove unneeded unlock in ipc/sem.c
procfs-dangling-subdir-fix.patch
/proc data corruption check
AMD-768MPX-bootmem-fix.patch
Work around an AMD768MPX erratum
i810fb-on-x86_64.patch
Enable i810 fb on x86-64
ext23-remove-acl-limits.patch
Remove arbitrary #acl entries limits on ext[23] when reading
watchdog-moduleparam-patches.patch
watchdog: moduleparam-patches
amd-elan-fix.patch
AMD ELAN Kconfig fix
pcmcia-netdev-ordering-fixes.patch
PCMCIA netdevice ordering issues
fadvise-fixups.patch
fadvise(POSIX_FADV_DONTNEED) fixups
validate_mm-fixes.patch
Fix and harden validate_mm
3ware-update.patch
3ware driver update
3c59x-xcvr-fix.patch
Fix 3c59x transceiver handling
current_is_keventd-speedup.patch
current_is_keventd() speedup
root-ramdisk-fix.patch
Fix rootfs on ramdisk
cciss-per-device-queues.patch
cciss: per device queues
blkdev-fix-final-page.patch
Fix reading the last block on a bdev
wavfront-needs-syscalls_h.patch
wavfront.c needs syscalls.h
edd-legacy-parameters-fix.patch
EDD: Get Legacy Parameters
cciss-section-fix.patch
cciss: init section fix
pte_chain-nowarns.patch
add nowarn to a few pte chain allocators
macintosh-config-fix.patch
Disable Macintosh device drivers for all but PPC || MAC
applicom-warning-fix.patch
Applicom warning
CONFIG_NVRAM-dependencies.patch
Fix CONFIG_NVRAM dependencies
move-job-control-stuff-tosignal_struct.patch
move job control fields from task_struct to signal_struct
module_h-attribute_used-fix.patch
module.h __attribute_used__ fix
kobject-module-request-64-bit-fix.patch
Fix a 64bit bug in kobject module request
sch_htb-fix.patch
net: fix sch_htb on 64-bit
instrument-highmem-page-reclaim.patch
vm: per-zone vmscan instrumentation
blk_congestion_wait-return-remaining.patch
return remaining jiffies from blk_congestion_wait()
blk-congestion-races.patch
Narrow blk_congestion_wait races
vmscan-remove-priority.patch
mm/vmscan.c: remove unused priority argument.
kswapd-throttling-fixes.patch
kswapd throttling fixes
vm-refill_inactive-preserve-referenced.patch
vmscan: preserve page referenced info in refill_inactive()
shrink_slab-precision-fix.patch
shrink_slab: math precision fix
try_to_free_pages-shrink_slab-evenness.patch
vm: shrink slab evenly in try_to_free_pages()
vmscan-total_scanned-fix.patch
vmscan: fix calculation of number of pages scanned
shrink_slab-for-all-zones-2.patch
vm: scan slab in response to highmem scanning
zone-balancing-fix-2.patch
vmscan: zone balancing fix
vmscan-control-by-nr_to_scan-only.patch
vmscan: drive everything via nr_to_scan
vmscan-balance-zone-scanning-rates.patch
Balance inter-zone scan rates
vmscan-dont-throttle-if-zero-max_scan.patch
vmscan: avoid bogus throttling
kswapd-avoid-higher-zones.patch
kswapd: avoid unnecessary reclaiming from higher zones
kswapd-avoid-higher-zones-reverse-direction.patch
kswapd: fix lumpy page reclaim
kswapd-avoid-higher-zones-reverse-direction-fix.patch
fix the kswapd zone scanning algorithm
vmscan-throttle-later.patch
vmscan: less throttling of page allocators and kswapd
vm-batch-inactive-scanning.patch
vmscan: batch up inactive list scanning work
vm-batch-inactive-scanning-fix.patch
fix vm-batch-inactive-scanning.patch
vm-balance-refill-rate.patch
vm: balance inactive zone refill rates
vm-lrutopage-cleanup.patch
vmscan: add lru_to_page() helper
slab-no-higher-order.patch
slab: avoid higher-order allocations
O_DIRECT-race-fixes-rollup.patch
O_DIRECT data exposure fixes
O_DIRECT-ll_rw_block-vs-block_write_full_page-fix.patch
Fix race between ll_rw_block() and block_write_full_page()
blockdev-direct-io-speedup.patch
blockdev direct-io speedups
dio-aio-fixes.patch
direct-io AIO fixes
aio-fallback-bio_count-race-fix-2.patch
AIO+DIO bio_count race fix
aio-direct-io-oops-fix.patch
AIO/direct-io oops fix
radix-tree-tagging.patch
radix-tree tags for selective lookup
irq-safe-pagecache-lock.patch
make the pagecache lock irq-safe.
tag-dirty-pages.patch
tag dirty pages as such in the radix tree
tag-writeback-pages.patch
tag writeback pages as such in their radix tree
stop-using-dirty-pages.patch
stop using the address_space dirty_pages list
stop-using-io-pages.patch
remove address_space.io_pages
stop-using-locked-pages.patch
Stop using address_space.locked_pages
stop-using-clean-pages.patch
stop using address_space.clean_pages
unslabify-pgds-and-pmds.patch
revert the slabification of i386 pgd's and pmd's
slab-stop-using-page-list.patch
slab: stop using page.list
page_alloc-stop-using-page-list.patch
stop using page.list in the page allocator
hugetlb-stop-using-page-list.patch
stop using page->list in the hugetlbpage implementations
pageattr-stop-using-page-list.patch
stop using page.list in pageattr.c
readahead-stop-using-page-list.patch
stop using page.list in readahead
compound-pages-stop-using-lru.patch
stop using page->lru in compound pages
remove-page-list.patch
remove page.list
remap-file-pages-prot-2.6.4-rc1-mm1-A1.patch
per-page protections for remap_file_pages()
remap-file-pages-prot-ia64-2.6.4-rc2-mm1-A0.patch
remap_file_pages page-prot implementation for ia64
list_del-debug.patch
list_del debug check
oops-dump-preceding-code.patch
i386 oops output: dump preceding code
lockmeter.patch
lockmeter
lockmeter-ia64-fix.patch
ia64 CONFIG_LOCKMETER fix
4g-2.6.0-test2-mm2-A5.patch
4G/4G split patch
4G/4G: remove debug code
4g4g: pmd fix
4g/4g: fixes from Bill
4g4g: fpu emulation fix
4g/4g usercopy atomicity fix
4G/4G preempt on vstack
4G/4G: even number of kmap types
4g4g: fix __get_user in slab
4g4g: Remove extra .data.idt section definition
4g/4g linker error (overlapping sections)
4g4g: show_registers() fix
4g4g: debug flags fix
4g4g: Fix wrong asm-offsets entry
cyclone time fixmap fix
use direct_copy_{to,from}_user for kernel access in mm/usercopy.c
4G/4G might_sleep warning fix
4g/4g pagetable accounting fix
Fix 4G/4G and WP test lockup
4G/4G KERNEL_DS usercopy again
Fix 4G/4G X11/vm86 oops
Fix 4G/4G athlon triplefault
4g4g SEP fix
Fix 4G/4G split fix for pre-pentiumII machines
4g/4g PAE ACPI low mappings fix
zap_low_mappings() cannot be __init
4g/4g: remove printk at boot
4g4g: fix handle_BUG()
4g4g: acpi sleep fixes
4g4g-locked-userspace-copy.patch
Do a locked user-space copy for 4g/4g
ia32-4k-stacks.patch
ia32: 4Kb stacks (and irqstacks) patch
ia32-4k-stacks-build-fix.patch
4k stacks build fix
4k-stacks-in-modversions-magic.patch
Add 4k stacks to module version magic
ppc-fixes.patch
make mm4 compile on ppc
ppc-fixes-dependency-fix.patch
ppc-fixes dependency fix
On Wed, Mar 10 2004, Andrew Morton wrote:
> - Major surgery against the pagecache, radix-tree and writeback code. This
> work is to address the O_DIRECT-vs-buffered data exposure horrors which
> we've been struggling with for months.
[snip]
Looks extremely kick ass! mpage has a left-over spin_unlock in there,
though. I need this to boot:
--- /opt/kernel/linux-2.6.4-mm1/fs/mpage.c 2004-03-11 09:10:02.070434880 +0100
+++ fs/mpage.c 2004-03-11 09:23:19.718019755 +0100
@@ -672,7 +672,6 @@
}
pagevec_release(&pvec);
}
- spin_unlock_irq(&mapping->tree_lock);
if (bio)
mpage_bio_submit(WRITE, bio);
return ret;
--
Jens Axboe
Andrew Morton <[email protected]> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
Needs this fix if you use CONFIG_DEBUG_SPINLOCK:
--- 25/fs/mpage.c~mpage-locking-bug 2004-03-11 00:29:21.000000000 -0800
+++ 25-akpm/fs/mpage.c 2004-03-11 00:29:25.000000000 -0800
@@ -672,7 +672,6 @@ mpage_writepages(struct address_space *m
}
pagevec_release(&pvec);
}
- spin_unlock_irq(&mapping->tree_lock);
if (bio)
mpage_bio_submit(WRITE, bio);
return ret;
_
Hi,
On my config (Opteron box) I need this patch to get it to compile:
--- fs/compat_ioctl.c.orig 2004-03-11 08:57:49.472074584 +0000
+++ fs/compat_ioctl.c 2004-03-11 08:57:01.770326352 +0000
@@ -1604,7 +1604,7 @@
* To have permissions to do most of the vt ioctls, we either have
* to be the owner of the tty, or super-user.
*/
- if (current->tty == tty || capable(CAP_SYS_ADMIN))
+ if (current->signal->tty == tty || capable(CAP_SYS_ADMIN))
return 1;
return 0;
}
I guess this conversion was missed by one of the other patches (and I hope this is the right fix :)
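
For context, the changelog lists move-job-control-stuff-tosignal_struct.patch, which moves job control fields out of task_struct and into signal_struct; the one-line change above follows that move. Below is a simplified user-space model of the resulting access pattern, assuming (as in the kernel) that all threads of a process share one signal structure. Every type and function name here is a hypothetical stand-in, not the kernel's definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct tty_model {
	const char *name;
};

struct signal_model {
	struct tty_model *tty;       /* controlling tty, shared per process */
};

struct task_model {
	struct signal_model *signal; /* threads of a process point at one copy */
};

/*
 * Models the permission check from the diff: allowed when the caller's
 * controlling tty is the target tty, or when the caller is privileged.
 */
static int may_use_vt(const struct task_model *task,
		      const struct tty_model *tty, int is_admin)
{
	assert(task != NULL && task->signal != NULL);
	return task->signal->tty == tty || is_admin;
}
```

Because the tty pointer lives in the shared structure, two tasks that point at the same `signal_model` get the same answer, which is the point of keeping job control state per process rather than per thread.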
While I am at it: I am running a 64-bit kernel with a 32-bit Debian testing
userland, and some ioctl conversions seem to fail.
This has happened with every 2.6 kernel I have tried.
Here is the relevant part of the kernel messages:
ioctl32(dmsetup:26199): Unknown cmd fd(3) cmd(c134fd00){01} arg(0804c0b0) on /dev/mapper/control
ioctl32(fsck.reiserfs:201): Unknown cmd fd(4) cmd(80081272){00} arg(ffffdab8) on /dev/ide/host0/bus0/target0/lun0/part4
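
The cmd values in these messages can be split into their fields to see which ioctls are missing a 32-bit translation. A minimal sketch, assuming the common Linux ioctl encoding used on x86 and x86-64 (8-bit number, 8-bit type, 14-bit size, 2-bit direction; some architectures use different field widths). `ioc_fields` and `ioc_decode` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded parts of an ioctl command word (hypothetical helper type). */
struct ioc_fields {
	unsigned dir;   /* 0 = none, 1 = write, 2 = read, 3 = read and write */
	unsigned size;  /* size in bytes of the argument structure */
	unsigned type;  /* "magic" byte identifying the driver or subsystem */
	unsigned nr;    /* command number within that type */
};

/* Split a 32-bit command word, assuming the 8/8/14/2 bit layout. */
static void ioc_decode(uint32_t cmd, struct ioc_fields *out)
{
	assert(out != NULL);
	out->nr   = cmd & 0xff;
	out->type = (cmd >> 8) & 0xff;
	out->size = (cmd >> 16) & 0x3fff;
	out->dir  = (cmd >> 30) & 0x3;
}
```

Under that assumption, c134fd00 decodes to type 0xfd, number 0, a 308-byte argument, read and write, and 80081272 decodes to type 0x12, number 0x72 (114), an 8-byte argument, read only. These look like the device-mapper version ioctl and BLKGETSIZE64 respectively, though that identification is inferred from the numbers and not stated in the report.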
Cheers,
Mik
On Thursday 11 March 2004 08:31, you wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
>
>
>
> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> for too long. I had been hoping that the people who care about SMT and
> NUMA performance would have some results by now but all seems to be
> silent.
>
> I do not wish to merge these up until the big-iron guys can say that they
> suit their requirements, with a reasonable expectation that we will not
> need to churn this code later in the 2.6 series.
>
> So. If you have been testing, please speak up. If you have not been
> testing, please do so.
>
>
> - Major surgery against the pagecache, radix-tree and writeback code. This
> work is to address the O_DIRECT-vs-buffered data exposure horrors which
> we've been struggling with for months.
>
> As a side-effect, 32 bytes are saved from struct inode and eight bytes
> are removed from struct page.
>
> This change will break any arch code which is using page->list and will
> also break any arch code which is using page->lru of memory which was
> obtained from slab.
>
> It seems to work OK here, but I suggest people not rush out and convert
> all of the corporate finance department's servers to 2.6.4-mm1.
>
> The basic problem which we (mainly Daniel McNeil) have been struggling
> with is in getting a really reliable fsync() across the page lists while
> other processes are performing writeback against the same file. It's
> like juggling four bars of wet soap with your eyes shut while someone is
> whacking you with a baseball bat. Daniel pretty much has the problem
> plugged but I suspect that's just because we don't have testcases to
> trigger the remaining problems. The complexity and additional locking
> which those patches add is worrisome.
>
> So the approach taken here is to remove the page lists altogether and
> replace the list-based writeback and wait operations with in-order
> radix-tree walks.
>
> The radix-tree code has been enhanced to support "tagging" of pages, for
> later searches for pages which have a particular tag set. This means
> that we can ask the radix tree code "find me the next 16 dirty pages
> starting at pagecache index N" and it will do that in O(log64(N)) time.
>
> This affects I/O scheduling potentially quite significantly. It is no
> longer the case that the kernel will submit pages for I/O in the order in
> which the application dirtied them. We instead submit them in
> file-offset order all the time.
>
> This is likely to be advantageous when applications are seeking all over
> a large file randomly writing small amounts of data. I haven't performed
> much benchmarking, but tiobench random write throughput seems to be
> increased by 30%. Other tests appear to be unaltered. dbench may have
> got 10-20% quicker, but it's variable.
>
> There is one large file which everyone seeks all over randomly writing
> small amounts of data: the blockdev mapping which caches filesystem
> metadata. The kernel's IO submission patterns for this are now ideal.
>
>
> Because writeback and wait-for-writeback use a tree walk instead of a
> list walk they are no longer livelockable. This probably means that we
> no longer need to hold i_sem across O_SYNC writes and perhaps fsync() and
> fdatasync(). This may be beneficial for databases: multiple processes
> writing and syncing different parts of the same file at the same time can
> now all submit and wait upon writes to just their own little bit of the
> file, so we can get a lot more data into the queues.
>
> It is trivial to implement a part-file-fdatasync() as well, so
> applications can say "sync the file from byte N to byte M", and multiple
> applications can do this concurrently. This is easy for ext2
> filesystems, but probably needs lots of work for data-journalled
> filesystems and XFS and it probably doesn't offer much benefit over an
> i_semless O_SYNC write.
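A part-file sync of "byte N to byte M" first has to be turned into a range of pagecache indices before any tree walk can start. A minimal sketch of that arithmetic follows, assuming 4096-byte pages and an inclusive end byte; demo_page_range and demo_bytes_to_pages are hypothetical names used only for illustration.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12   /* assumption: 4096-byte pages */

struct demo_page_range {
    uint64_t first;   /* index of the first page touched */
    uint64_t last;    /* index of the last page touched (inclusive) */
};

/* Map the inclusive byte range [start_byte, end_byte] onto page indices. */
struct demo_page_range demo_bytes_to_pages(uint64_t start_byte,
                                           uint64_t end_byte)
{
    struct demo_page_range r;

    r.first = start_byte >> DEMO_PAGE_SHIFT;
    r.last = end_byte >> DEMO_PAGE_SHIFT;
    return r;
}
```

Two callers syncing disjoint byte ranges get disjoint index ranges unless their ranges share a page, which is why they can submit and wait concurrently.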
>
> - Dropped the hotplug CPU patches: bits of them were merged into Linus's
> kernel and things broke.
>
> - Various little fixes as usual.
>
>
>
>
> Changes since 2.6.4-rc2-mm1:
>
>
> bk-acpi.patch
> bk-alsa.patch
> bk-driver-core.patch
> bk-i2c.patch
> bk-input.patch
> bk-netdev.patch
> bk-pci.patch
> bk-scsi.patch
> bk-usb.patch
>
> Latest external trees
>
> -export-filemap_flush.patch
> -vma-corruption-fix.patch
> -centaur-crypto-core-support.patch
>
> Merged
>
> +bk-acpi-warning-fix.patch
>
> Fix a warning
>
> +x86_64-update.patch
>
> Latest x86_64 code drop
>
> +print-kernel-version-in-oops.patch
>
> Display the kernel version in the x86 oops message
>
> +ppc64-iseries-virtual-console-fix.patch
>
> iSeries device number fix
>
> -zap_page_range-debug.patch
>
> Turns out the code path which this patch was trying to detect the deadness
> of is in fact used.
>
> +sched-stats-64-bit.patch
>
> Use 64-bit numbers for various CPU scheduler statistics
>
> -hotplugcpu-generalise-bogolock.patch
> -hotplugcpu-generalise-bogolock-fix-for-kthread-stop-using-signals.patch
> -hotplugcpu-use-bogolock-in-modules.patch
> -hotplugcpu-core.patch
> -stop_machine-warning-fix.patch
> -hotplugcpu-core-sparc64-build-fix.patch
> -hotplugcpu-core-fix-for-kthread-stop-using-signals.patch
> -migrate_to_cpu-dependency-fix.patch
> -hotplugcpu-core-drain_local_pages-fix.patch
> -hotplugcpu-rcupdate-many-cpus-fix.patch
>
> Dropped
>
> -ext3-dirty-debug-patch.patch
>
> This debug trap never triggered
>
> -fusion-use-min-max.patch
>
> Other changes broke this
>
> +dm-map-rwlock-ng.patch
>
> New version of spinlocking for the device mapper map tables
>
> +dm-remove-__dm_request.patch
>
> Remove __dm_request()
>
> +md-array-assembly-major-fix.patch
>
> RAID fix
>
> +fadvise-fixups.patch
>
> Fix some fadvise() boundary conditions
>
> +validate_mm-fixes.patch
>
> Enhance validate_mm()
>
> +3ware-update.patch
>
> 3ware driver update
>
> +3c59x-xcvr-fix.patch
>
> Fix 3c59x transceiver handling
>
> +current_is_keventd-speedup.patch
>
> Simplify current_is_keventd()
>
> +root-ramdisk-fix.patch
>
> Make "root=/dev/ram" work again
>
> +cciss-per-device-queues.patch
>
> per-device queues for the cciss driver
>
> +blkdev-fix-final-page.patch
>
> Fix reads of the final block of blockdevs
>
> +wavfront-needs-syscalls_h.patch
>
> Warning (and possible oops) fixes
>
> +edd-legacy-parameters-fix.patch
>
> EDD back-compatibility
>
> +cciss-section-fix.patch
>
> __init section fix
>
> +pte_chain-nowarns.patch
>
> Prevent possible-but-expected page allocator warnings
>
> +macintosh-config-fix.patch
>
> Don't offer mac drivers on other platforms
>
> +applicom-warning-fix.patch
>
> Fix a warning
>
> +CONFIG_NVRAM-dependencies.patch
>
> Fix NVRAM dependencies
>
> +move-job-control-stuff-tosignal_struct.patch
>
> Move various job control fields out of the task_struct and into the
> signal_struct.
>
> +module_h-attribute_used-fix.patch
>
> __attribute_used__ sanity
>
> +kobject-module-request-64-bit-fix.patch
>
> Fix for 64-bit machines
>
> +sch_htb-fix.patch
>
> netfilter 64-bit fix
>
> +blk-congestion-races.patch
>
> Conceivably fix rare races in blk_congestion_wait()
>
> +vm-lrutopage-cleanup.patch
>
> Add a handy macro to tidy up vmscan.c
>
> +radix-tree-tagging.patch
>
> Add search tagging to radix trees.
>
> +irq-safe-pagecache-lock.patch
>
> Make mapping->page_lock irq-safe, and rename it to tree_lock to detect
> missed conversions.
>
> +tag-dirty-pages.patch
>
> Tag dirty pages as being dirty within their radix trees.
>
> +tag-writeback-pages.patch
>
> Tag writeback pages as being under writeback in their radix trees
>
> +stop-using-dirty-pages.patch
> +stop-using-io-pages.patch
> +stop-using-locked-pages.patch
> +stop-using-clean-pages.patch
>
> Wean the kernel off the four address_space page lists
>
> +unslabify-pgds-and-pmds.patch
>
> We cannot use page->lru to manage slab-derived pages: slab itself wants to
> use it.
>
> +slab-stop-using-page-list.patch
>
> Switch slab page management from page->list to page->lru.
>
> +page_alloc-stop-using-page-list.patch
>
> Switch the page allocator from using page->list to using page->lru.
>
> +hugetlb-stop-using-page-list.patch
>
> Switch the hugetlbpage implementations from using page->list to using
> page->lru.
>
> +pageattr-stop-using-page-list.patch
>
> Switch the pageattr code (CONFIG_DEBUG_PAGEALLOC) from using page->list to
> using page->lru.
>
> +readahead-stop-using-page-list.patch
>
> Switch the readpages() API from using page->list over to using page->lru.
>
> +compound-pages-stop-using-lru.patch
>
> Teach the compound page management to use page fields other than
> page->list.
>
> +remove-page-list.patch
>
> Remove the `list' field from struct page.
>
> +remap-file-pages-prot-ia64-2.6.4-rc2-mm1-A0.patch
>
> Implement the per-page-permissions-in-remap_file_pages for ia64. Hasn't
> been tested.
>
> -4g4g-THREAD_SIZE-fixes.patch
> -4g4g-handle_BUG-fix.patch
>
> Folded into 4g-2.6.0-test2-mm2-A5.patch
>
> O_DIRECT-vs-buffered-fix.patch
> O_DIRECT-vs-buffered-fix-pdflush-hang-fix.patch
> serialise-writeback-fdatawait.patch
> restore-writeback-trylock.patch
>
> Dropped. Hopefully we don't need these any more.
>
>
>
>
>
>
> All 258 patches:
>
>
>
> bk-acpi.patch
>
> bk-alsa.patch
>
> bk-driver-core.patch
>
> bk-i2c.patch
>
> bk-input.patch
>
> bk-netdev.patch
>
> bk-pci.patch
>
> bk-scsi.patch
>
> bk-usb.patch
>
> mm.patch
> add -mmN to EXTRAVERSION
>
> dma_sync_for_device-cpu.patch
> dma_sync_for_{cpu,device}()
>
> bk-acpi-warning-fix.patch
> bk-acpi warning fixes
>
> x86_64-update.patch
> x86-64 merge for 2.6.4
>
> move-dma_consistent_dma_mask.patch
> move consistent_dma_mask to the generic device
>
> move-dma_consistent_dma_mask-x86_64-fix.patch
>
> move-dma_consistent_dma_mask-sn-fix.patch
> Fix dma_mask patch for sn platform
>
> print-kernel-version-in-oops.patch
> print kernel version in oops messages
>
> kgdb-ga.patch
> kgdb stub for ia32 (George Anzinger's one)
> kgdb: warning fix
> kgdb buffer overflow fix
> kgdb: warning fix
> kgdb: CONFIG_DEBUG_INFO fix
> x86_64 fixes
> correct kgdb.txt Documentation link (against 2.6.1-rc1-mm2)
>
> kgdb-ga-recent-gcc-fix.patch
> kgdb: fix for recent gcc
>
> kgdboe-netpoll.patch
> kgdb-over-ethernet via netpoll
>
> kgdboe-non-ia32-build-fix.patch
>
> kgdb-warning-fixes.patch
> kgdb warning fixes
>
> kgdb-x86_64-support.patch
> kgdb-x86_64-support.patch for 2.6.2-rc1-mm3
>
> kgdb-THREAD_SIZE-fixes.patch
> THREAD_SIZE fixes for kgdb
>
> must-fix.patch
> must fix lists update
> must fix list update
> mustfix update
>
> must-fix-update-5.patch
> must-fix update
>
> ppc64-iseries-virtual-console-fix.patch
> ppc64: fix iSeries virtual console devices
>
> ppc64-reloc_hide.patch
>
> compat-signal-noarch-2004-01-29.patch
> Generic 32-bit compat for copy_siginfo_to_user
>
> compat-generic-ipc-emulation.patch
> generic 32 bit emulation for System-V IPC
>
> remove-sys_ioperm-stubs.patch
> Clean up sys_ioperm stubs
>
> readdir-cleanups.patch
> readdir() cleanups
>
> ext3-journalled-quotas-2.patch
> ext3: journalled quota
>
> invalidate_inodes-speedup.patch
> invalidate_inodes speedup
> more invalidate_inodes speedup fixes
>
> cfq-4.patch
> CFQ io scheduler
> CFQ fixes
>
> config_spinline.patch
> uninline spinlocks for profiling accuracy.
>
> pdflush-diag.patch
>
> get_user_pages-handle-VM_IO.patch
> fix get_user_pages() against mappings of /dev/mem
>
> pci_set_power_state-might-sleep.patch
>
> CONFIG_STANDALONE-default-to-n.patch
> Make CONFIG_STANDALONE default to N
>
> extra-buffer-diags.patch
>
> CONFIG_SYSFS.patch
> From: Pat Mochel <[email protected]>
> Subject: [PATCH] Add CONFIG_SYSFS
>
> CONFIG_SYSFS-boot-from-disk-fix.patch
>
> slab-leak-detector.patch
> slab leak detector
> mm/slab.c warning in cache_alloc_debugcheck_after
>
> scale-nr_requests.patch
> scale nr_requests with TCQ depth
>
> truncate_inode_pages-check.patch
>
> local_bh_enable-warning-fix.patch
>
> sched-stats-64-bit.patch
> Use 64-bit counters for scheduler stats
>
> sched-find_busiest_node-resolution-fix.patch
> sched: improved resolution in find_busiest_node
>
> sched-domains.patch
> sched: scheduler domain support
> sched: fix for NR_CPUS > BITS_PER_LONG
> sched: clarify find_busiest_group
> sched: find_busiest_group arithmetic fix
>
> sched-domains-improvements.patch
> sched domains kernbench improvements
>
> sched-clock-fixes.patch
> fix sched_clock()
>
> sched-sibling-map-to-cpumask.patch
> sched: cpu_sibling_map to cpu_mask
> p4-clockmod sibling_map fix
> p4-clockmod: handle more than two siblings
>
> sched-domains-i386-ht.patch
> sched: implement domains for i386 HT
> sched: Fix CONFIG_SMT oops on UP
> sched: fix SMT + NUMA bug
> Change arch_init_sched_domains to use cpu_online_map
> Fix build with NR_CPUS > BITS_PER_LONG
>
> sched-domain-tweak.patch
> i386-sched-domain code consolidation
>
> sched-no-drop-balance.patch
> sched: handle inter-CPU jiffies skew
>
> sched-directed-migration.patch
> sched_balance_exec(): don't fiddle with the cpus_allowed mask
>
> sched-domain-debugging.patch
> sched_domain debugging
>
> sched-domain-balancing-improvements.patch
> scheduler domain balancing improvements
>
> sched-group-power.patch
> sched-group-power
> sched-group-power warning fixes
>
> sched-domains-use-cpu_possible_map.patch
> sched_domains: use cpu_possible_map
>
> sched-smt-nice-handling.patch
> sched: SMT niceness handling
>
> sched-smt-nice-optimisation.patch
> sched: SMT-nice optimisation
>
> fa311-mac-address-fix.patch
> wrong mac address with netgear FA311 ethernet card
>
> laptop-mode-2.patch
> laptop-mode for 2.6, version 6
> Documentation/laptop-mode.txt
> laptop-mode documentation updates
> Laptop mode documentation addition
> laptop mode simplification
>
> pid_max-fix.patch
> Bug when setting pid_max > 32k
>
> use-soft-float.patch
> Use -msoft-float
>
> DRM-cvs-update.patch
> DRM cvs update
>
> drm-include-fix.patch
>
> process-migration-speedup.patch
> Reduce TLB flushing during process migration
>
> nfs-31-attr.patch
> NFSv2/v3/v4: New attribute revalidation code
>
> nfs-reconnect-fix.patch
>
> nfs-mount-fix.patch
> Update to NFS mount....
>
> nfs-d_drop-lowmem.patch
> NFS: handle nfs_fhget() error
>
> nfs-avoid-i_size_write.patch
> NFS: avoid unlocked i_size_write()
>
> nfs_unlink-oops-fix.patch
> nfs: fix "busy inodes after umount"
>
> nfs-remove-XID-spinlock.patch
> nfs: Remove an unnecessary spinlock from XID generation...
>
> nfs-misc-rpc-fixes.patch
> nfs: Misc RPC fixes...
>
> nfs-improved-writeback-strategy.patch
> nfs: improve writeback caching
>
> nfs-simplify-config-options.patch
> nfs: simplify client configuration options.
>
> nfs-fix-msync.patch
> nfs: fix msync()
>
> nfs-mount-return-useful-errors.patch
> nfs: make mount command return more useful errors
>
> nfs-misc-minor-fixes.patch
> nfs: misc minor fixes
>
> nfs-lockd-sync-01.patch
> nfs: sync lockd to 2.4.x
>
> nfs-lockd-sync-02.patch
> nfs: sync lockd to 2.4.x
>
> nfs-lockd-sync-03.patch
> nfs: sync lockd to 2.4.x
>
> nfs-lockd-sync-04.patch
> nfs: sync lockd to 2.4.x
>
> nfs-rpc-remove-redundant-memset.patch
> nfs: remove unnecessary memset() in RPC
>
> nfs-tunable-rpc-slot-table.patch
> nfs: make the RPC slot table size a tunable value.
>
> nfs-short-read-fix.patch
> nfs: fix an NFSv2 read bug
>
> nfs-server-in-root_server_path.patch
> Pull NFS server address out of root_server_path
>
> non-readable-binaries.patch
> Handle non-readable binfmt_misc executables
>
> binfmt_misc-credentials.patch
> binfmt_misc: improve calculation of interpreter's credentials
>
> initramfs-search-for-init.patch
> search for /init for initramfs boots
>
> adaptive-lazy-readahead.patch
> adaptive lazy readahead
>
> sysfs_remove_dir-race-fix.patch
> sysfs_remove_dir-vs-dcache_readdir race fix
>
> sysfs_remove_subdir-dentry-leak-fix.patch
> Fix dentry refcounting in sysfs_remove_group()
>
> per-node-rss-tracking.patch
> Track per-node RSS for NUMA
>
> aic7xxx-deadlock-fix.patch
> aic7xxx deadlock fix
>
> futex_wait-debug.patch
> futex_wait debug
>
> module_exit-deadlock-fix.patch
> module unload deadlock fix
>
> selinux-inode-race-trap.patch
> Try to diagnose Bug 2153
>
> ufs2-01.patch
> read-only support for UFS2
>
> ide-scsi-error-handling-fixes.patch
> ide-scsi error handling fixes
>
> ide-scsi-error-handling-update.patch
> ide-scsi error handler update
>
> fb_console_init-fix.patch
> fb_console_init fix
>
> poll-select-longer-timeouts.patch
> poll()/select(): support longer timeouts
>
> poll-select-range-check-fix.patch
> poll()/select() range checking fix
>
> poll-select-handle-large-timeouts.patch
> poll()/select(): handle long timeouts
>
> pcmcia-debugging-rework-1.patch
> Overhaul PCMCIA debugging (1)
>
> cs_err-compile-fix.patch
> pcmcia: workaround for gcc-2.95 bug in cs_err()
>
> pcmcia-debugging-rework-2.patch
> Overhaul PCMCIA debugging (2)
>
> distribute-early-allocations-across-nodes.patch
> Manfred's patch to distribute boot allocations across nodes
>
> time-interpolator-fix.patch
> time interpolator fix
>
> kmsg-nonblock.patch
> teach /proc/kmsg about O_NONBLOCK
>
> mixart-build-fix.patch
> CONFIG_SND_MIXART doesn't compile
>
> add-a-slab-for-ethernet.patch
> Add a kmalloc slab for ethernet packets
>
> remove-__io_virt_debug.patch
> remove __io_virt_debug
>
> genrtc-cleanups.patch
> genrtc: cleanups
>
> piix_ide_init-can-be-__init.patch
> piix_ide_init can be __init
>
> i386-early-memory-cleanup.patch
> i386 very early memory detection cleanup patch
>
> modular-mce-handler.patch
> Allow X86_MCE_NONFATAL to be a module
>
> remove-more-KERNEL_SYSCALLS.patch
> further __KERNEL_SYSCALLS__ removal
> build fix for remove-more-KERNEL_SYSCALLS.patch
> fix the build for remove-more-KERNEL_SYSCALLS
>
> mq-01-codemove.patch
> posix message queues: code move
>
> mq-02-syscalls.patch
> posix message queues: syscall stubs
>
> mq-03-core.patch
> posix message queues: implementation
>
> mq-03-core-update.patch
> posix message queues: update to core patch
>
> mq-04-linuxext-poll.patch
> posix message queues: linux-specific poll extension
>
> mq-05-linuxext-mount.patch
> posix message queues: made user mountable
>
> mq-update-01.patch
> posix message queue update
>
> mq-security-fix.patch
> security bugfix for mqueue
>
> dm-01-endio-method.patch
> dm: endio method
>
> dm-03-list_for_each_entry-audit.patch
> dm: list_for_each_entry audit
>
> dm-04-default-queue-limits-fix.patch
> dm: default queue limits
>
> dm-05-list-targets-command.patch
> dm: list targets cmd
>
> dm-06-stripe-width-fix.patch
> dm: stripe width fix
>
> queue-congestion-callout.patch
> Add queue congestion callout
>
> queue-congestion-dm-implementation.patch
> Implement queue congestion callout for device mapper
>
> dm-maplock.patch
> devicemapper: use rwlock for map alterations
>
> dm-map-rwlock-ng.patch
> Another DM maplock implementation
>
> dm-remove-__dm_request.patch
> dm: remove __dm_request
>
> use-wait_task_inactive-in-kthread_bind.patch
> use wait_task_inactive() in kthread_bind()
>
> HPFS1-hpfs2-RC4-rc1.patch
>
> HPFS2-hpfs_namei-RC4-rc1.patch
>
> selinux-cleanup-binary-mount-data.patch
> selinux: clean up binary mount data
>
> udffs-update.patch
> UDF filesystem update
>
> kbuild-redundant-CFLAGS.patch
> kbuild: Remove CFLAGS assignment in i386/mach-*/Makefile
>
> numa-aware-zonelist-builder.patch
> NUMA-aware zonelist builder
> numa-aware zonelist builder fix
> numa-aware node builder fix #2
>
> remove-redundant-unplug_timer-deletion.patch
> Redundant unplug_timer deletion
>
> queue_work_on_cpu.patch
> Add queue_work_on_cpu() workqueue function
>
> m68k-rename-sys_functions.patch
> m68k: rename sys_* functions
>
> pdc202xx_new-update.patch
> ide: update for pdc202xx_new driver
>
> siimage-update.patch
> ide: update for siimage driver
>
> ide-cleanups-01.patch
> ide: IDE cleanups
>
> ide-cleanups-02.patch
> ide: IDE cleanups
>
> ide-cleanups-03.patch
> ide: IDE cleanups
>
> cdromaudio-use-dma.patch
> use DMA for CDROM audio reading
>
> sysfs-pin-kobject.patch
> sysfs: pin kobjects to fix use-after-free crashes
>
> ATI-IXP-IDE-support.patch
> ATI IXP IDE support
>
> ipmi-updates-3.patch
> IPMI driver updates
>
> ipmi-socket-interface.patch
> IPMI: socket interface
>
> md-use-schedule_timeout.patch
> md: use "schedule_timeout(2)" instead of yield()
>
> md-array-assembly-fix.patch
> md: allow assembling of partitioned arrays at boot time.
>
> md-array-assembly-major-fix.patch
> md array assembly major number fix
>
> compiler_h-scope-fixes.patch
> compiler.h scoping fixes
>
> nmi_watchdog-local-apic-fix.patch
> Fix nmi_watchdog=2 and P4 HT
>
> nmi-1-hz.patch
> set nmi_hz to 1 with nmi_watchdog=2 and SMP
>
> elf-mmap-fix.patch
> Fix elf mapping of the zero page
>
> kbuild-more-cleaning.patch
> kbuild: Cause `make clean' to remove more files
>
> LOOP_CHANGE_FD.patch
> LOOP_CHANGE_FD ioctl
>
> loop-setup-race-fix.patch
> loop setup race fix
>
> handle-dot-o-paths.patch
> kbuild: fix usage with directories containing '.o'
>
> acpi-asmlinkage-fix.patch
> gcc-3.5: acpi build fix
>
> ipc-sem-extra-sem_unlock.patch
> Remove unneeded unlock in ipc/sem.c
>
> procfs-dangling-subdir-fix.patch
> /proc data corruption check
>
> AMD-768MPX-bootmem-fix.patch
> Work around an AMD768MPX erratum
>
> i810fb-on-x86_64.patch
> Enable i810 fb on x86-64
>
> ext23-remove-acl-limits.patch
> Remove arbitrary #acl entries limits on ext[23] when reading
>
> watchdog-moduleparam-patches.patch
> watchdog: moduleparam-patches
>
> amd-elan-fix.patch
> AMD ELAN Kconfig fix
>
> pcmcia-netdev-ordering-fixes.patch
> PCMCIA netdevice ordering issues
>
> fadvise-fixups.patch
> fadvise(POSIX_FADV_DONTNEED) fixups
>
> validate_mm-fixes.patch
> Fix and harden validate_mm
>
> 3ware-update.patch
> 3ware driver update
>
> 3c59x-xcvr-fix.patch
> Fix 3c59x transceiver handling
>
> current_is_keventd-speedup.patch
> current_is_keventd() speedup
>
> root-ramdisk-fix.patch
> Fix rootfs on ramdisk
>
> cciss-per-device-queues.patch
> cciss: per device queues
>
> blkdev-fix-final-page.patch
> Fix reading the last block on a bdev
>
> wavfront-needs-syscalls_h.patch
> wavfront.c needs syscalls.h
>
> edd-legacy-parameters-fix.patch
> EDD: Get Legacy Parameters
>
> cciss-section-fix.patch
> cciss: init section fix
>
> pte_chain-nowarns.patch
> add nowarn to a few pte chain allocators
>
> macintosh-config-fix.patch
> Disable Macintosh device drivers for all but PPC || MAC
>
> applicom-warning-fix.patch
> Applicom warning
>
> CONFIG_NVRAM-dependencies.patch
> Fix CONFIG_NVRAM dependencies
>
> move-job-control-stuff-tosignal_struct.patch
> move job control fields from task_struct to signal_struct
>
> module_h-attribute_used-fix.patch
> module.h __attribute_used__ fix
>
> kobject-module-request-64-bit-fix.patch
> Fix a 64bit bug in kobject module request
>
> sch_htb-fix.patch
> net: fix sch_htb on 64-bit
>
> instrument-highmem-page-reclaim.patch
> vm: per-zone vmscan instrumentation
>
> blk_congestion_wait-return-remaining.patch
> return remaining jiffies from blk_congestion_wait()
>
> blk-congestion-races.patch
> Narrow blk_congestion_wait races
>
> vmscan-remove-priority.patch
> mm/vmscan.c: remove unused priority argument.
>
> kswapd-throttling-fixes.patch
> kswapd throttling fixes
>
> vm-refill_inactive-preserve-referenced.patch
> vmscan: preserve page referenced info in refill_inactive()
>
> shrink_slab-precision-fix.patch
> shrink_slab: math precision fix
>
> try_to_free_pages-shrink_slab-evenness.patch
> vm: shrink slab evenly in try_to_free_pages()
>
> vmscan-total_scanned-fix.patch
> vmscan: fix calculation of number of pages scanned
>
> shrink_slab-for-all-zones-2.patch
> vm: scan slab in response to highmem scanning
>
> zone-balancing-fix-2.patch
> vmscan: zone balancing fix
>
> vmscan-control-by-nr_to_scan-only.patch
> vmscan: drive everything via nr_to_scan
>
> vmscan-balance-zone-scanning-rates.patch
> Balance inter-zone scan rates
>
> vmscan-dont-throttle-if-zero-max_scan.patch
> vmscan: avoid bogus throttling
>
> kswapd-avoid-higher-zones.patch
> kswapd: avoid unnecessary reclaiming from higher zones
>
> kswapd-avoid-higher-zones-reverse-direction.patch
> kswapd: fix lumpy page reclaim
>
> kswapd-avoid-higher-zones-reverse-direction-fix.patch
> fix the kswapd zone scanning algorithm
>
> vmscan-throttle-later.patch
> vmscan: less throttling of page allocators and kswapd
>
> vm-batch-inactive-scanning.patch
> vmscan: batch up inactive list scanning work
>
> vm-batch-inactive-scanning-fix.patch
> fix vm-batch-inactive-scanning.patch
>
> vm-balance-refill-rate.patch
> vm: balance inactive zone refill rates
>
> vm-lrutopage-cleanup.patch
> vmscan: add lru_to_page() helper
>
> slab-no-higher-order.patch
> slab: avoid higher-order allocations
>
> O_DIRECT-race-fixes-rollup.patch
> O_DIRECT data exposure fixes
>
> O_DIRECT-ll_rw_block-vs-block_write_full_page-fix.patch
> Fix race between ll_rw_block() and block_write_full_page()
>
> blockdev-direct-io-speedup.patch
> blockdev direct-io speedups
>
> dio-aio-fixes.patch
> direct-io AIO fixes
>
> aio-fallback-bio_count-race-fix-2.patch
> AIO+DIO bio_count race fix
>
> aio-direct-io-oops-fix.patch
> AIO/direct-io oops fix
>
> radix-tree-tagging.patch
> radix-tree tags for selective lookup
>
> irq-safe-pagecache-lock.patch
> make the pagecache lock irq-safe.
>
> tag-dirty-pages.patch
> tag dirty pages as such in the radix tree
>
> tag-writeback-pages.patch
> tag writeback pages as such in their radix tree
>
> stop-using-dirty-pages.patch
> stop using the address_space dirty_pages list
>
> stop-using-io-pages.patch
> remove address_space.io_pages
>
> stop-using-locked-pages.patch
> Stop using address_space.locked_pages
>
> stop-using-clean-pages.patch
> stop using address_space.clean_pages
>
> unslabify-pgds-and-pmds.patch
> revert the slabification of i386 pgd's and pmd's
>
> slab-stop-using-page-list.patch
> slab: stop using page.list
>
> page_alloc-stop-using-page-list.patch
> stop using page.list in the page allocator
>
> hugetlb-stop-using-page-list.patch
> stop using page->list in the hugetlbpage implementations
>
> pageattr-stop-using-page-list.patch
> stop using page.list in pageattr.c
>
> readahead-stop-using-page-list.patch
> stop using page.list in readahead
>
> compound-pages-stop-using-lru.patch
> stop using page->lru in compound pages
>
> remove-page-list.patch
> remove page.list
>
> remap-file-pages-prot-2.6.4-rc1-mm1-A1.patch
> per-page protections for remap_file_pages()
>
> remap-file-pages-prot-ia64-2.6.4-rc2-mm1-A0.patch
> remap_file_pages page-prot implementation for ia64
>
> list_del-debug.patch
> list_del debug check
>
> oops-dump-preceding-code.patch
> i386 oops output: dump preceding code
>
> lockmeter.patch
> lockmeter
>
> lockmeter-ia64-fix.patch
> ia64 CONFIG_LOCKMETER fix
>
> 4g-2.6.0-test2-mm2-A5.patch
> 4G/4G split patch
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g/4g usercopy atomicity fix
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g/4g usercopy atomicity fix
> 4G/4G preempt on vstack
> 4G/4G: even number of kmap types
> 4g4g: fix __get_user in slab
> 4g4g: Remove extra .data.idt section definition
> 4g/4g linker error (overlapping sections)
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g4g: show_registers() fix
> 4g/4g usercopy atomicity fix
> 4g4g: debug flags fix
> 4g4g: Fix wrong asm-offsets entry
> cyclone time fixmap fix
> 4G/4G preempt on vstack
> 4G/4G: even number of kmap types
> 4g4g: fix __get_user in slab
> 4g4g: Remove extra .data.idt section definition
> 4g/4g linker error (overlapping sections)
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g4g: show_registers() fix
> 4g/4g usercopy atomicity fix
> 4g4g: debug flags fix
> 4g4g: Fix wrong asm-offsets entry
> cyclone time fixmap fix
> use direct_copy_{to,from}_user for kernel access in mm/usercopy.c
> 4G/4G might_sleep warning fix
> 4g/4g pagetable accounting fix
> Fix 4G/4G and WP test lockup
> 4G/4G KERNEL_DS usercopy again
> Fix 4G/4G X11/vm86 oops
> Fix 4G/4G athlon triplefault
> 4g4g SEP fix
> Fix 4G/4G split fix for pre-pentiumII machines
> 4g/4g PAE ACPI low mappings fix
> zap_low_mappings() cannot be __init
> 4g/4g: remove printk at boot
> 4g4g: fix handle_BUG()
> 4g4g: acpi sleep fixes
>
> 4g4g-locked-userspace-copy.patch
> Do a locked user-space copy for 4g/4g
>
> ia32-4k-stacks.patch
> ia32: 4Kb stacks (and irqstacks) patch
>
> ia32-4k-stacks-build-fix.patch
> 4k stacks build fix
>
> 4k-stacks-in-modversions-magic.patch
> Add 4k stacks to module version magic
>
> ppc-fixes.patch
> make mm4 compile on ppc
>
> ppc-fixes-dependency-fix.patch
> ppc-fixes dependency fix
>
>
>
Hi, Andrew Morton wrote:
> Andrew Morton <[email protected]> wrote:
>>
>> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
>
> Needs this fix, if you use CONFIG_DEBUG_SPINLOCK
>
Imported to BitKeeper:
bk://smurf.bkbits.net/linux-2.6.4-mm1
I have that process automated rather well now. If anybody wants the import
script (it can easily be adapted for other patch series), just ask.
--
Matthias Urlichs
Mickael Marchand <[email protected]> wrote:
>
> Hi,
>
> on my config (opteron box) I need this patch to get it compiled :
>
> --- fs/compat_ioctl.c.orig 2004-03-11 08:57:49.472074584 +0000
> +++ fs/compat_ioctl.c 2004-03-11 08:57:01.770326352 +0000
> @@ -1604,7 +1604,7 @@
> * To have permissions to do most of the vt ioctls, we either have
> * to be the owner of the tty, or super-user.
> */
> - if (current->tty == tty || capable(CAP_SYS_ADMIN))
> + if (current->signal->tty == tty || capable(CAP_SYS_ADMIN))
> return 1;
> return 0;
> }
yup, thanks.
>
> while I am at it, I am running a 64 bits kernel with 32 bits debian testing and
> it seems some ioctl conversion fails
> that happened with all 2.6 I tried.
> here is the relevant kernel messages part :
> ioctl32(dmsetup:26199): Unknown cmd fd(3) cmd(c134fd00){01} arg(0804c0b0) on /dev/mapper/control
The device mapper version 1 ioctl interface was removed. Perhaps you need
to update your dm tools?
> ioctl32(fsck.reiserfs:201): Unknown cmd fd(4) cmd(80081272){00} arg(ffffdab8) on /dev/ide/host0/bus0/target0/lun0/part4
Is this something which 2.6 has always done, or is it new behaviour?
reiserfs ioctl translation appears to be incomplete...
> > ioctl32(fsck.reiserfs:201): Unknown cmd fd(4) cmd(80081272){00} arg(ffffdab8) on /dev/ide/host0/bus0/target0/lun0/part4
>
> Is this something which 2.6 has always done, or is it new behaviour?
>
> reiserfs ioctl translation appears to be incomplete...
Some clown is running around "fixing" our ioctls:
X0081272 is BLKGETSIZE64. Yeah, it's bust: it was one of those calls where
we passed in sizeof(8) instead of 8. The ioctl should be X0041272.
The definition is:
#define BLKGETSIZE64 _IOR(0x12,114,size_t)
However at least in debian unstable, util-linux has:
./fdisk/common.h:#define BLKGETSIZE64 _IOR(0x12,114,8) /* 8 = sizeof(u64) */
./lib/get_blocks.c:#define BLKGETSIZE64 _IOR(0x12,114,long long)
ie X0081272
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=233626
Anton
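The two command numbers in the message above differ only in the size field of the ioctl encoding. A minimal sketch follows, assuming the generic layout that x86 uses (8-bit number, 8-bit type, size starting at bit 16, direction in the top two bits, read direction encoded as 2); demo_ior and the DEMO_ constants are hypothetical stand-ins for the kernel's _IOR() machinery, and other architectures may use different field widths.

```c
#include <stdint.h>

/* Assumed field positions, following the generic ioctl encoding. */
#define DEMO_IOC_NRSHIFT    0
#define DEMO_IOC_TYPESHIFT  8
#define DEMO_IOC_SIZESHIFT  16
#define DEMO_IOC_DIRSHIFT   30
#define DEMO_IOC_READ       2u

/* Build a read-direction command number from type, nr and argument size.
 * The real _IOR() macro takes a C type and applies sizeof() to it; here
 * the size in bytes is passed directly so the effect is visible. */
uint32_t demo_ior(uint32_t type, uint32_t nr, uint32_t size)
{
    return (DEMO_IOC_READ << DEMO_IOC_DIRSHIFT) |
           (size << DEMO_IOC_SIZESHIFT) |
           (type << DEMO_IOC_TYPESHIFT) |
           (nr << DEMO_IOC_NRSHIFT);
}
```

With type 0x12 and number 114, a size of 8 gives 0x80081272 and a size of 4 gives 0x80041272. Since sizeof(size_t) is 4 for 32-bit code and 8 for 64-bit code, while sizeof(long long) is 8 in both, a 32-bit binary built with the long long variant issues a command number that the 32-bit definition of BLKGETSIZE64 does not produce.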
[snip]
> > while I am at it, I am running a 64 bits kernel with 32 bits debian
> > testing and it seems some ioctl conversion fails
> > that happened with all 2.6 I tried.
> > here is the relevant kernel messages part :
> > ioctl32(dmsetup:26199): Unknown cmd fd(3) cmd(c134fd00){01} arg(0804c0b0)
> > on /dev/mapper/control
>
> The device mapper version 1 ioctl interface was removed. Perhaps you need
> to update your dm tools?
the debian tools are built with ioctlv4 (and compat for v1)
I also tried with my own compiled dm tools from source without success
> > ioctl32(fsck.reiserfs:201): Unknown cmd fd(4) cmd(80081272){00}
> > arg(ffffdab8) on /dev/ide/host0/bus0/target0/lun0/part4
>
> Is this something which 2.6 has always done, or is it new behaviour?
always since 2.6 IIRC
> reiserfs ioctl translation appears to be incomplete...
ha :)
thanks,
Mik
Andrew Morton <[email protected]> writes:
> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> for too long. I had been hoping that the people who care about SMT and
> NUMA performance would have some results by now but all seems to be silent.
>
> I do not wish to merge these up until the big-iron guys can say that they
> suit their requirements, with a reasonable expectation that we will not
> need to churn this code later in the 2.6 series.
>
> So. If you have been testing, please speak up. If you have not been
> testing, please do so.
I tested them on Opteron NUMA systems and they are worse on simple
tests than the stock scheduler (e.g. the parallelized STREAM test,
which is a bit silly, but still fairly important).
For SMT there is a patch from Intel pending that teaches x86-64
to set up the SMT scheduler. They said they got slightly better
benchmark results. The SMT setup seems to be racy though.
Some kind of SMT scheduler is definitely needed, we have a serious
regression compared to 2.4 here right now. I'm not sure this
is the right approach though, it seems to be far too complex.
-Andi
Mickael Marchand <[email protected]> writes:
> [snip]
>> > while I am at it, I am running a 64 bits kernel with 32 bits debian
>> > testing and it seems some ioctl conversion fails
>> > that happened with all 2.6 I tried.
>> > here is the relevant kernel messages part :
>> > ioctl32(dmsetup:26199): Unknown cmd fd(3) cmd(c134fd00){01} arg(0804c0b0)
>> > on /dev/mapper/control
>>
>> The device mapper version 1 ioctl interface was removed. Perhaps you need
>> to update your dm tools?
> the debian tools are built with ioctlv4 (and compat for v1)
> I also tried with my own compiled dm tools from source without success
If it just uses them for compatibility probes then the ioctl handler can
be silenced.
>> > ioctl32(fsck.reiserfs:201): Unknown cmd fd(4) cmd(80081272){00}
>> > arg(ffffdab8) on /dev/ide/host0/bus0/target0/lun0/part4
>>
>> Is this something which 2.6 has always done, or is it new behaviour?
> always since 2.6 IIRC
>
>> reiserfs ioctl translation appears to be incomplete...
> ha :)
I will take a look at it.
-Andi
On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
> This affects I/O scheduling potentially quite significantly. It is no
> longer the case that the kernel will submit pages for I/O in the order in
> which the application dirtied them. We instead submit them in file-offset
> order all the time.
Hi Andrew,
I have a feeling this change might significantly improve the external
sorting benchmark I emailed you ( http://lkml.org/lkml/2003/12/20/46 ).
I will try running it when I get a chance and let you know. It gives me
a good excuse to get 2.6 kernels working on my systems :-)
Thanks,
Jim
--
http://www.jeweltran.com
On Thursday 18 March 2004 00:25, Andi Kleen wrote:
> Mickael Marchand <[email protected]> writes:
> > [snip]
> >
> >> > while I am at it, I am running a 64 bits kernel with 32 bits debian
> >> > testing and it seems some ioctl conversion fails
> >> > that happened with all 2.6 I tried.
> >> > here is the relevant kernel messages part :
> >> > ioctl32(dmsetup:26199): Unknown cmd fd(3) cmd(c134fd00){01}
> >> > arg(0804c0b0) on /dev/mapper/control
> >>
> >> The device mapper version 1 ioctl interface was removed. Perhaps you
> >> need to update your dm tools?
> >
> > the debian tools are built with ioctlv4 (and compat for v1)
> > I also tried with my own compiled dm tools from source without success
>
> If it just uses them for compatibility probes then the ioctl handler can
> be silenced.
hmm right now, dm/lvm absolutely does not work on amd64/32 bits. all ioctl
calls are failing...
> >> reiserfs ioctl translation appears to be incomplete...
> >
> > ha :)
>
> I will take a look at it.
thanks
Mik
> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> for too long. I had been hoping that the people who care about SMT and
> NUMA performance would have some results by now but all seems to be silent.
>
> I do not wish to merge these up until the big-iron guys can say that they
> suit their requirements, with a reasonable expectation that we will not
> need to churn this code later in the 2.6 series.
>
> So. If you have been testing, please speak up. If you have not been
> testing, please do so.
I sucked sched-* out of mm, added sched-ppc64bits (attached) and am
having problems with the following threaded test case. NUMA is enabled.
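#include <pthread.h>
#include <stddef.h>
#define NR_THREADS 100
/* Spin forever so that every thread stays runnable. */
void *dostuff(void *junk)
{
	while (1)
		;
	return NULL;
}
int main(void)
{
	int i;
	pthread_t tid;
	/* NR_THREADS-1 spinners plus the main thread: NR_THREADS runnable threads. */
	for (i = 0; i < NR_THREADS-1; i++)
		pthread_create(&tid, NULL, dostuff, NULL);
	dostuff(NULL);
	return 0;
}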
100 runnable threads but we never use more than one cpu:
user system idle user system idle
cpu0 0 0 100 cpu1 0 0 100
cpu2 0 0 100 cpu3 0 0 100
cpu4 0 0 100 cpu5 0 0 100
cpu6 0 0 100 cpu7 0 0 100
cpu8 0 0 100 cpu9 0 0 100
cpu10 0 0 100 cpu11 0 0 100
cpu12 0 0 100 cpu13 100 0 0
Anton
diff -puN arch/ppc64/Kconfig~sched-ppc64bits arch/ppc64/Kconfig
--- gr23_work/arch/ppc64/Kconfig~sched-ppc64bits 2004-03-03 07:43:29.762761114 -0600
+++ gr23_work-anton/arch/ppc64/Kconfig 2004-03-03 07:43:29.778758577 -0600
@@ -173,6 +173,16 @@ config NUMA
bool "NUMA support"
depends on DISCONTIGMEM
+config SCHED_SMT
+ bool "SMT (Hyperthreading) scheduler support"
+ depends on SMP
+ default off
+ help
+ SMT scheduler support improves the CPU scheduler's decision making
+ when dealing with Intel Pentium 4 chips with HyperThreading at a
+ cost of slightly increased overhead in some places. If unsure say
+ N here.
+
config PREEMPT
bool
help
diff -puN arch/ppc64/kernel/smp.c~sched-ppc64bits arch/ppc64/kernel/smp.c
--- gr23_work/arch/ppc64/kernel/smp.c~sched-ppc64bits 2004-03-03 07:43:29.768760162 -0600
+++ gr23_work-anton/arch/ppc64/kernel/smp.c 2004-03-03 07:43:29.782757942 -0600
@@ -890,3 +890,204 @@ static int __init topology_init(void)
return 0;
}
__initcall(topology_init);
+
+#ifdef CONFIG_SCHED_SMT
+#ifdef CONFIG_NUMA
+static struct sched_group sched_group_cpus[NR_CPUS];
+static struct sched_group sched_group_phys[NR_CPUS];
+static struct sched_group sched_group_nodes[MAX_NUMNODES];
+static DEFINE_PER_CPU(struct sched_domain, phys_domains);
+static DEFINE_PER_CPU(struct sched_domain, node_domains);
+__init void arch_init_sched_domains(void)
+{
+ int i;
+ struct sched_group *first_cpu = NULL, *last_cpu = NULL;
+
+ /* Set up domains */
+ for_each_online_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+ struct sched_domain *node_domain = &per_cpu(node_domains, i);
+ int node = cpu_to_node(i);
+ cpumask_t nodemask = node_to_cpumask(node);
+
+ *cpu_domain = SD_SIBLING_INIT;
+ cpumask_t tmp1 = cpumask_of_cpu(i ^ 0x1);
+ cpumask_t tmp2 = cpumask_of_cpu(i);
+ cpus_or(cpu_domain->span, tmp1, tmp2);
+
+ *phys_domain = SD_CPU_INIT;
+ phys_domain->span = nodemask;
+
+ *node_domain = SD_NODE_INIT;
+ node_domain->span = cpu_online_map;
+ }
+
+ /* Set up CPU (sibling) groups */
+ for_each_online_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ int j;
+ first_cpu = last_cpu = NULL;
+
+ if (i != first_cpu(cpu_domain->span))
+ continue;
+
+ for_each_cpu_mask(j, cpu_domain->span) {
+ struct sched_group *cpu = &sched_group_cpus[j];
+
+ cpus_clear(cpu->cpumask);
+ cpu_set(j, cpu->cpumask);
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+ }
+
+ for (i = 0; i < numnodes; i++) {
+ int j;
+ cpumask_t nodemask;
+ cpumask_t node_cpumask = node_to_cpumask(i);
+ cpus_and(nodemask, node_cpumask, cpu_online_map);
+
+ first_cpu = last_cpu = NULL;
+ /* Set up physical groups */
+ for_each_cpu_mask(j, nodemask) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(j);
+ struct sched_group *cpu = &sched_group_phys[j];
+
+ if (j != first_cpu(cpu_domain->span))
+ continue;
+
+ cpu->cpumask = cpu_domain->span;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ if (last_cpu)
+ last_cpu->next = first_cpu;
+ }
+
+ /* Set up nodes */
+ first_cpu = last_cpu = NULL;
+ for (i = 0; i < numnodes; i++) {
+ struct sched_group *cpu = &sched_group_nodes[i];
+ cpumask_t nodemask;
+ cpumask_t node_cpumask = node_to_cpumask(i);
+ cpus_and(nodemask, node_cpumask, cpu_online_map);
+
+ if (cpus_empty(nodemask))
+ continue;
+
+ cpu->cpumask = nodemask;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ if (last_cpu)
+ last_cpu->next = first_cpu;
+
+ mb();
+ for_each_online_cpu(i) {
+ int node = cpu_to_node(i);
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+ struct sched_domain *node_domain = &per_cpu(node_domains, i);
+ struct sched_group *cpu_group = &sched_group_cpus[i];
+ struct sched_group *phys_group = &sched_group_phys[first_cpu(cpu_domain->span)];
+ struct sched_group *node_group = &sched_group_nodes[node];
+
+ cpu_domain->parent = phys_domain;
+ phys_domain->parent = node_domain;
+
+ node_domain->groups = node_group;
+ phys_domain->groups = phys_group;
+ cpu_domain->groups = cpu_group;
+ }
+}
+#else /* CONFIG_NUMA */
+static struct sched_group sched_group_cpus[NR_CPUS];
+static struct sched_group sched_group_phys[NR_CPUS];
+static DEFINE_PER_CPU(struct sched_domain, phys_domains);
+__init void arch_init_sched_domains(void)
+{
+ int i;
+ struct sched_group *first_cpu = NULL, *last_cpu = NULL;
+
+ /* Set up domains */
+ for_each_cpu_mask(i, cpu_online_map) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+
+ *cpu_domain = SD_SIBLING_INIT;
+ cpu_domain->span = blah cpu_sibling_map[i];
+
+ *phys_domain = SD_CPU_INIT;
+ phys_domain->span = cpu_online_map;
+ }
+
+ /* Set up CPU (sibling) groups */
+ for_each_cpu_mask(i, cpu_online_map) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ int j;
+ first_cpu = last_cpu = NULL;
+
+ if (i != first_cpu(cpu_domain->span))
+ continue;
+
+ for_each_cpu_mask(j, cpu_domain->span) {
+ struct sched_group *cpu = &sched_group_cpus[j];
+
+ cpu->cpumask = CPU_MASK_NONE;
+ cpu_set(j, cpu->cpumask);
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+ }
+
+ first_cpu = last_cpu = NULL;
+ /* Set up physical groups */
+ for_each_cpu_mask(i, cpu_online_map) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_group *cpu = &sched_group_phys[i];
+
+ if (i != first_cpu(cpu_domain->span))
+ continue;
+
+ cpu->cpumask = cpu_domain->span;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+
+ mb();
+ for_each_cpu_mask(i, cpu_online_map) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+ struct sched_group *cpu_group = &sched_group_cpus[i];
+ struct sched_group *phys_group = &sched_group_phys[first_cpu(cpu_domain->span)];
+ cpu_domain->parent = phys_domain;
+ phys_domain->groups = phys_group;
+ cpu_domain->groups = cpu_group;
+ }
+}
+#endif /* CONFIG_NUMA */
+#endif /* CONFIG_SCHED_SMT */
diff -puN include/asm-ppc64/processor.h~sched-ppc64bits include/asm-ppc64/processor.h
--- gr23_work/include/asm-ppc64/processor.h~sched-ppc64bits 2004-03-03 07:43:29.773759370 -0600
+++ gr23_work-anton/include/asm-ppc64/processor.h 2004-03-03 07:43:29.784757625 -0600
@@ -631,6 +631,11 @@ static inline void prefetchw(const void
#define spin_lock_prefetch(x) prefetchw(x)
+#ifdef CONFIG_SCHED_SMT
+#define ARCH_HAS_SCHED_DOMAIN
+#define ARCH_HAS_SCHED_WAKE_BALANCE
+#endif
+
#endif /* ASSEMBLY */
#endif /* __ASM_PPC64_PROCESSOR_H */
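The group setup loops in the patch above all follow one pattern: walk the candidates, skip the ones that do not start a group, chain the rest through ->next, and finally close the chain into a ring. A minimal user-space sketch of that pattern follows. The struct and function names (demo_group, demo_link_ring) and the skip array are hypothetical and only stand in for struct sched_group and the first_cpu() tests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_group. */
struct demo_group {
	int cpu;
	struct demo_group *next;
};

/*
 * Link groups[0..n-1] into a circular singly linked list, leaving out
 * every entry whose skip[] flag is non-zero. Returns the first linked
 * group, or NULL if every entry was skipped.
 */
static struct demo_group *demo_link_ring(struct demo_group *groups,
					 const int *skip, size_t n)
{
	struct demo_group *first = NULL, *last = NULL;
	size_t i;

	assert(groups != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (skip && skip[i])
			continue;
		if (!first)
			first = &groups[i];
		if (last)
			last->next = &groups[i];
		last = &groups[i];
	}
	/* Close the ring; the guard matters when nothing was linked. */
	if (last)
		last->next = first;
	return first;
}
```

Some of the loops in the patch close the ring without the `if (last_cpu)` guard, which is only safe if at least one group is always linked.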
> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> for too long. I had been hoping that the people who care about SMT and
> NUMA performance would have some results by now but all seems to be silent.
>
> I do not wish to merge these up until the big-iron guys can say that they
> suit their requirements, with a reasonable expectation that we will not
> need to churn this code later in the 2.6 series.
>
> So. If you have been testing, please speak up. If you have not been
> testing, please do so.
Some quick fixes...
Anton
--
Remove unused this_rq
diff -puN kernel/sched.c~sched-fix kernel/sched.c
--- gr23_work/kernel/sched.c~sched-fix 2004-03-03 07:43:34.242850841 -0600
+++ gr23_work-anton/kernel/sched.c 2004-03-03 07:43:34.253849097 -0600
@@ -699,7 +699,6 @@ static int try_to_wake_up(task_t * p, un
unsigned long load, this_load;
int new_cpu;
struct sched_domain *sd;
- runqueue_t *this_rq;
#endif
rq = task_rq_lock(p, &flags);
@@ -730,7 +729,6 @@ static int try_to_wake_up(task_t * p, un
goto repeat_lock_task;
}
- this_rq = this_rq();
now = sched_clock();
sd = cpu_sched_domain(this_cpu);
--
remove unused load and remove some warnings (due to type checking in
min/max macros)
diff -puN kernel/sched.c~sched-morefixes kernel/sched.c
--- gr25_work/kernel/sched.c~sched-morefixes 2004-03-11 06:42:13.895877892 -0600
+++ gr25_work-anton/kernel/sched.c 2004-03-11 06:42:41.930693672 -0600
@@ -1436,7 +1436,6 @@ nextgroup:
if (*imbalance <= SCHED_LOAD_SCALE/2) {
unsigned long pwr_now = 0, pwr_move = 0;
- unsigned long load;
unsigned long tmp;
/*
diff -puN include/linux/sched.h~sched-morefixes include/linux/sched.h
--- gr25_work/include/linux/sched.h~sched-morefixes 2004-03-11 06:47:01.892015906 -0600
+++ gr25_work-anton/include/linux/sched.h 2004-03-11 06:47:30.533869203 -0600
@@ -531,7 +531,7 @@ do { if (atomic_dec_and_test(&(tsk)->usa
#ifdef CONFIG_SMP
#define SCHED_LOAD_SHIFT 7 /* increase resolution of load calculations */
-#define SCHED_LOAD_SCALE (1 << SCHED_LOAD_SHIFT)
+#define SCHED_LOAD_SCALE (1UL << SCHED_LOAD_SHIFT)
#define SD_FLAG_NEWIDLE 1 /* Balance when about to become idle */
#define SD_FLAG_EXEC 2 /* Balance on exec */
_
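For readers wondering why changing 1 to 1UL silences a warning: the kernel's min()/max() macros compare the addresses of two temporaries so that the compiler complains when the argument types differ. Below is a minimal sketch of that idea, assuming GNU C extensions (__typeof__ and statement expressions, available in GCC and Clang). The demo_ names are made up for illustration and the macro is a simplified imitation, not the kernel's definition.

```c
#include <assert.h>

/*
 * Type-checked min in the style of the kernel macro. The pointer
 * comparison has no runtime effect; it exists only so that the
 * compiler warns about "distinct pointer types" when x and y have
 * different types.
 */
#define demo_min(x, y) ({			\
	__typeof__(x) _x = (x);			\
	__typeof__(y) _y = (y);			\
	(void)(&_x == &_y);			\
	_x < _y ? _x : _y; })

#define DEMO_LOAD_SHIFT	7
/* With plain (1 << DEMO_LOAD_SHIFT) this would be an int. */
#define DEMO_LOAD_SCALE	(1UL << DEMO_LOAD_SHIFT)

/* Hypothetical helper: cap a load value at the scale constant. */
static unsigned long demo_clamp_load(unsigned long load)
{
	assert(DEMO_LOAD_SCALE == 128UL);
	return demo_min(load, DEMO_LOAD_SCALE);
}
```

If DEMO_LOAD_SCALE were defined as (1 << DEMO_LOAD_SHIFT), the call in demo_clamp_load() would compare an unsigned long * with an int * and trigger the warning.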
> hmm right now, dm/lvm absolutely does not work on amd64/32 bits. all ioctl
> calls are failing...
With no messages in the log?
Maybe they have broken data structures again, most likely
because of different long long alignment. A lot of people
who attempt to design data structures that don't need translation
get that wrong unfortunately.
Emulating that stuff would be hard unfortunately because it has a rather
overcomplicated ioctl structure that would be hard to write sane
emulation code for.
Complain to the DM maintainers.
-Andi
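As an illustration of the alignment problem described above (and not a claim about the actual dm-ioctl structures), the sketch below declares the same two fields twice. The second copy is packed to 4-byte alignment to imitate how the i386 ABI lays out a 64-bit member inside a struct. The struct names and fields are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as the native compiler sees it. */
struct demo_native {
	uint32_t flags;
	uint64_t sector;
};

/* Same fields, with 64-bit members aligned to 4 bytes as on i386. */
#pragma pack(push, 4)
struct demo_compat {
	uint32_t flags;
	uint64_t sector;
};
#pragma pack(pop)

static size_t demo_native_size(void)
{
	return sizeof(struct demo_native);
}

static size_t demo_compat_size(void)
{
	/* 4 bytes of flags followed directly by 8 bytes of sector. */
	assert(offsetof(struct demo_compat, sector) == 4);
	return sizeof(struct demo_compat);
}
```

On a typical x86-64 build the native struct is 16 bytes with sector at offset 8, while the 32-bit style layout is 12 bytes with sector at offset 4, so a 64-bit kernel reading a 32-bit caller's buffer would find the field in the wrong place unless the ioctl is translated.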
just the ioctl cmd failed I reported in my first mail.
then dmsetup just stops...
Cheers,
Mik
On Thursday 11 March 2004 15:48, Andi Kleen wrote:
> > hmm right now, dm/lvm absolutely does not work on amd64/32 bits. all
> > ioctl calls are failing...
>
> With no messages in the log?
>
> Maybe they have broken data structures again, most likely
> because of different long long alignment. A lot of people
> who attempt to design data structures that don't need translation
> get that wrong unfortunately.
>
> Emulating that stuff would be hard unfortunately because it has a rather
> overcomplicated ioctl structure that would be hard to write sane
> emulation code for.
>
> Complain to the DM maintainers.
already did.
On Thu, Mar 11, 2004 at 04:10:02PM +0100, Mickael Marchand wrote:
> just the ioctl cmd failed I reported in my first mail.
> then dmsetup just stops...
Either it doesn't handle the fallback correctly or the ioctls
are not compatible.
From a quick look at dm-ioctl.h I found some suspicious cases,
but no clear failures.
-Andi
On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> for too long. I had been hoping that the people who care about SMT and
> NUMA performance would have some results by now but all seems to be silent.
Looks like some ppl punted on arch code sweeps. Results of one-off
fixing for a box with a couple of spindles in its boot bay to check
out the writeback and unplug goodies below.
-- wli
diff -urpN mm1-2.6.4-2/arch/sparc64/kernel/process.c mm1-2.6.4-3/arch/sparc64/kernel/process.c
--- mm1-2.6.4-2/arch/sparc64/kernel/process.c 2004-03-11 04:57:42.631636000 -0800
+++ mm1-2.6.4-3/arch/sparc64/kernel/process.c 2004-03-11 06:13:09.250485000 -0800
@@ -41,6 +41,7 @@
#include <asm/fpumacro.h>
#include <asm/head.h>
#include <asm/cpudata.h>
+#include <asm/unistd.h>
/* #define VERBOSE_SHOWREGS */
diff -urpN mm1-2.6.4-2/fs/compat_ioctl.c mm1-2.6.4-3/fs/compat_ioctl.c
--- mm1-2.6.4-2/fs/compat_ioctl.c 2004-03-10 18:55:45.000000000 -0800
+++ mm1-2.6.4-3/fs/compat_ioctl.c 2004-03-11 06:32:16.431087000 -0800
@@ -1604,7 +1604,7 @@ static int vt_check(struct file *file)
* To have permissions to do most of the vt ioctls, we either have
* to be the owner of the tty, or super-user.
*/
- if (current->tty == tty || capable(CAP_SYS_ADMIN))
+ if (current->signal->tty == tty || capable(CAP_SYS_ADMIN))
return 1;
return 0;
}
diff -urpN mm1-2.6.4-2/fs/proc/proc_misc.c mm1-2.6.4-3/fs/proc/proc_misc.c
--- mm1-2.6.4-2/fs/proc/proc_misc.c 2004-03-11 04:58:08.151756000 -0800
+++ mm1-2.6.4-3/fs/proc/proc_misc.c 2004-03-11 06:37:27.426809000 -0800
@@ -383,13 +383,13 @@ int show_stat(struct seq_file *p, void *
}
seq_printf(p, "cpu %llu %llu %llu %llu %llu %llu %llu\n",
- jiffies_64_to_clock_t(user),
- jiffies_64_to_clock_t(nice),
- jiffies_64_to_clock_t(system),
- jiffies_64_to_clock_t(idle),
- jiffies_64_to_clock_t(iowait),
- jiffies_64_to_clock_t(irq),
- jiffies_64_to_clock_t(softirq));
+ (unsigned long long)jiffies_64_to_clock_t(user),
+ (unsigned long long)jiffies_64_to_clock_t(nice),
+ (unsigned long long)jiffies_64_to_clock_t(system),
+ (unsigned long long)jiffies_64_to_clock_t(idle),
+ (unsigned long long)jiffies_64_to_clock_t(iowait),
+ (unsigned long long)jiffies_64_to_clock_t(irq),
+ (unsigned long long)jiffies_64_to_clock_t(softirq));
for_each_cpu(i) {
/* two separate calls here to work around gcc-2.95.3 ICE */
seq_printf(p, "cpu%d %llu %llu %llu ",
@@ -410,7 +410,7 @@ int show_stat(struct seq_file *p, void *
(unsigned long long)
jiffies_64_to_clock_t(kstat_cpu(i).cpustat.softirq));
}
- seq_printf(p, "intr %llu", sum);
+ seq_printf(p, "intr %llu", (unsigned long long)sum);
#if !defined(CONFIG_PPC64) && !defined(CONFIG_ALPHA)
for (i = 0; i < NR_IRQS; i++)
diff -urpN mm1-2.6.4-2/fs/udf/super.c mm1-2.6.4-3/fs/udf/super.c
--- mm1-2.6.4-2/fs/udf/super.c 2004-03-11 04:58:08.573692000 -0800
+++ mm1-2.6.4-3/fs/udf/super.c 2004-03-11 06:10:50.507577000 -0800
@@ -57,6 +57,7 @@
#include <linux/smp_lock.h>
#include <linux/buffer_head.h>
#include <linux/vfs.h>
+#include <linux/vmalloc.h>
#include <asm/byteorder.h>
#include <linux/udf_fs.h>
diff -urpN mm1-2.6.4-2/include/asm-sparc64/compat.h mm1-2.6.4-3/include/asm-sparc64/compat.h
--- mm1-2.6.4-2/include/asm-sparc64/compat.h 2004-03-10 18:55:34.000000000 -0800
+++ mm1-2.6.4-3/include/asm-sparc64/compat.h 2004-03-11 06:11:53.214045000 -0800
@@ -29,6 +29,7 @@ typedef s32 compat_int_t;
typedef s32 compat_long_t;
typedef u32 compat_uint_t;
typedef u32 compat_ulong_t;
+typedef u32 compat_timer_t;
struct compat_timespec {
compat_time_t tv_sec;
diff -urpN mm1-2.6.4-2/include/asm-sparc64/pgtable.h mm1-2.6.4-3/include/asm-sparc64/pgtable.h
--- mm1-2.6.4-2/include/asm-sparc64/pgtable.h 2004-03-10 18:55:21.000000000 -0800
+++ mm1-2.6.4-3/include/asm-sparc64/pgtable.h 2004-03-11 06:27:40.704004000 -0800
@@ -322,9 +322,16 @@ static inline pte_t mk_pte_io(unsigned l
/* File offset in PTE support. */
#define pte_file(pte) (pte_val(pte) & _PAGE_FILE)
-#define pte_to_pgoff(pte) (pte_val(pte) >> PAGE_SHIFT)
-#define pgoff_to_pte(off) (__pte(((off) << PAGE_SHIFT) | _PAGE_FILE))
-#define PTE_FILE_MAX_BITS (64UL - PAGE_SHIFT - 1UL)
+#define __pte_to_pgprot(pte) \
+ __pgprot(pte_val(pte) & (_PAGE_READ|_PAGE_WRITE))
+#define __file_pte_to_pgprot(pte) \
+ __pgprot(((pte_val(pte) >> PAGE_SHIFT) & 0x3UL) << 8)
+#define pte_to_pgprot(pte) \
+ (pte_file(pte) ? __file_pte_to_pgprot(pte) : __pte_to_pgprot(pte))
+#define pte_to_pgoff(pte) (pte_val(pte) >> (PAGE_SHIFT+2))
+#define pgoff_prot_to_pte(off, prot) \
+ (__pte(((off) << (PAGE_SHIFT+2)) | _PAGE_FILE | ((prot >> 8) & 0x3UL)))
+#define PTE_FILE_MAX_BITS (64UL - PAGE_SHIFT - 3UL)
extern unsigned long prom_virt_to_phys(unsigned long, int *);
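The casts added in the fs/proc/proc_misc.c hunk above address a format mismatch: on some 64-bit architectures a 64-bit kernel type is an unsigned long rather than an unsigned long long, so passing it to %llu draws a compiler warning. A minimal user-space sketch of the same fix, using uint64_t in place of the kernel types and a made-up helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit counter with %llu. The cast
 * makes the argument type match the format regardless of whether
 * uint64_t is unsigned long or unsigned long long on this platform.
 */
static int demo_format_ticks(char *buf, size_t len, uint64_t ticks)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%llu", (unsigned long long)ticks);
}
```

The alternative is the PRIu64 macro from <inttypes.h>, which user-space code would normally use; the cast is the style the patch itself uses.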
On Thu, Mar 11, 2004 at 07:23:46AM -0800, William Lee Irwin III wrote:
> +#define pgoff_prot_to_pte(off, prot) \
> + (__pte(((off) << (PAGE_SHIFT+2)) | _PAGE_FILE | ((prot >> 8) & 0x3UL)))
> +#define PTE_FILE_MAX_BITS (64UL - PAGE_SHIFT - 3UL)
Good thing for me it's rarely exercised. Incremental (one-liner):
--- mm1-2.6.4-3/include/asm-sparc64/pgtable.h 2004-03-11 06:27:40.704004000 -0800
+++ mm1-2.6.4-4/include/asm-sparc64/pgtable.h 2004-03-11 07:35:09.766453000 -0800
@@ -330,7 +330,7 @@
(pte_file(pte) ? __file_pte_to_pgprot(pte) : __pte_to_pgprot(pte))
#define pte_to_pgoff(pte) (pte_val(pte) >> (PAGE_SHIFT+2))
#define pgoff_prot_to_pte(off, prot) \
- (__pte(((off) << (PAGE_SHIFT+2)) | _PAGE_FILE | ((prot >> 8) & 0x3UL)))
+ ((__pte(((off) | ((pgprot_val(prot) >> 8) & 0x3UL)))) << (PAGE_SHIFT+2) | _PAGE_FILE)
#define PTE_FILE_MAX_BITS (64UL - PAGE_SHIFT - 3UL)
extern unsigned long prom_virt_to_phys(unsigned long, int *);
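For anyone checking the bit layout: the decode macros in the hunk read two protection bits from just above PAGE_SHIFT and the file offset from bit PAGE_SHIFT+2 upward. The user-space sketch below shows one encoding that round-trips through decoders of that shape. It is an illustration only, not the sparc64 macro: the constants (an 8K page shift, a flag in bit 0, protection bits at bits 8 and 9) and all demo_ names are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT	13		/* assumed 8K pages */
#define DEMO_PAGE_FILE	(1ULL << 0)	/* hypothetical flag bit */
#define DEMO_PROT_SHIFT	8		/* assumed position of the prot bits */
#define DEMO_PROT_MASK	0x3ULL

/* Pack a file offset and two protection bits into one 64-bit word. */
static uint64_t demo_pgoff_prot_to_pte(uint64_t off, uint64_t prot)
{
	uint64_t prot_bits = (prot >> DEMO_PROT_SHIFT) & DEMO_PROT_MASK;

	/* The offset must fit above the page, prot and flag bits. */
	assert(off < (1ULL << (64 - DEMO_PAGE_SHIFT - 3)));
	return (off << (DEMO_PAGE_SHIFT + 2)) |
	       (prot_bits << DEMO_PAGE_SHIFT) |
	       DEMO_PAGE_FILE;
}

/* Recover the file offset: everything above the two prot bits. */
static uint64_t demo_pte_to_pgoff(uint64_t pte)
{
	return pte >> (DEMO_PAGE_SHIFT + 2);
}

/* Recover the protection bits in their original position. */
static uint64_t demo_pte_to_prot(uint64_t pte)
{
	return ((pte >> DEMO_PAGE_SHIFT) & DEMO_PROT_MASK) << DEMO_PROT_SHIFT;
}
```

The point of the sketch is that encode and decode must agree on where the protection bits live; an encoder that leaves them in the low bits, or folds them into the offset before shifting, will not round-trip through these decoders.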
Hey Andrew, I have a problem with this kernel. When it boots, it lists
vp_ide and so on, and then my screen suddenly gets flooded with traces.
They scroll past too fast for me to read, and the system does not get
any further. I haven't tried vanilla 2.6.4 yet, but I will now.
If you have any ideas, please tell me and I will test them.
On Thu, 2004-03-11 at 08:31, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
>
>
>
> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> for too long. I had been hoping that the people who care about SMT and
> NUMA performance would have some results by now but all seems to be silent.
>
> I do not wish to merge these up until the big-iron guys can say that they
> suit their requirements, with a reasonable expectation that we will not
> need to churn this code later in the 2.6 series.
>
> So. If you have been testing, please speak up. If you have not been
> testing, please do so.
>
>
> - Major surgery against the pagecache, radix-tree and writeback code. This
> work is to address the O_DIRECT-vs-buffered data exposure horrors which
> we've been struggling with for months.
>
> As a side-effect, 32 bytes are saved from struct inode and eight bytes
> are removed from struct page.
>
> This change will break any arch code which is using page->list and will
> also break any arch code which is using page->lru of memory which was
> obtained from slab.
>
> It seems to work OK here, but I suggest people not rush out and convert
> all of the corporate finance department's servers to 2.6.4-mm1.
>
> The basic problem which we (mainly Daniel McNeil) have been struggling
> with is in getting a really reliable fsync() across the page lists while
> other processes are performing writeback against the same file. It's like
> juggling four bars of wet soap with your eyes shut while someone is
> whacking you with a baseball bat. Daniel pretty much has the problem
> plugged but I suspect that's just because we don't have testcases to
> trigger the remaining problems. The complexity and additional locking
> which those patches add is worrisome.
>
> So the approach taken here is to remove the page lists altogether and
> replace the list-based writeback and wait operations with in-order
> radix-tree walks.
>
> The radix-tree code has been enhanced to support "tagging" of pages, for
> later searches for pages which have a particular tag set. This means that
> we can ask the radix tree code "find me the next 16 dirty pages starting at
> pagecache index N" and it will do that in O(log64(N)) time.
>
> This affects I/O scheduling potentially quite significantly. It is no
> longer the case that the kernel will submit pages for I/O in the order in
> which the application dirtied them. We instead submit them in file-offset
> order all the time.
>
> This is likely to be advantageous when applications are seeking all over
> a large file randomly writing small amounts of data. I haven't performed
> much benchmarking, but tiobench random write throughput seems to be
> increased by 30%. Other tests appear to be unaltered. dbench may have got
> 10-20% quicker, but it's variable.
>
> There is one large file which everyone seeks all over randomly writing
> small amounts of data: the blockdev mapping which caches filesystem
> metadata. The kernel's IO submission patterns for this are now ideal.
>
>
> Because writeback and wait-for-writeback use a tree walk instead of a
> list walk they are no longer livelockable. This probably means that we no
> longer need to hold i_sem across O_SYNC writes and perhaps fsync() and
> fdatasync(). This may be beneficial for databases: multiple processes
> writing and syncing different parts of the same file at the same time can
> now all submit and wait upon writes to just their own little bit of the
> file, so we can get a lot more data into the queues.
>
> It is trivial to implement a part-file-fdatasync() as well, so
> applications can say "sync the file from byte N to byte M", and multiple
> applications can do this concurrently. This is easy for ext2 filesystems,
> but probably needs lots of work for data-journalled filesystems and XFS and
> it probably doesn't offer much benefit over an i_semless O_SYNC write.
>
> - Dropped the hotplug CPU patches: bits of them were merged into Linus's
> kernel and things broke.
>
> - Various little fixes as usual.
>
>
>
>
> Changes since 2.6.4-rc2-mm1:
>
>
> bk-acpi.patch
> bk-alsa.patch
> bk-driver-core.patch
> bk-i2c.patch
> bk-input.patch
> bk-netdev.patch
> bk-pci.patch
> bk-scsi.patch
> bk-usb.patch
>
> Latest external trees
>
> -export-filemap_flush.patch
> -vma-corruption-fix.patch
> -centaur-crypto-core-support.patch
>
> Merged
>
> +bk-acpi-warning-fix.patch
>
> Fix a warning
>
> +x86_64-update.patch
>
> Latest x86_64 code drop
>
> +print-kernel-version-in-oops.patch
>
> Display the kernel version in the x86 oops message
>
> +ppc64-iseries-virtual-console-fix.patch
>
> iSeries device number fix
>
> -zap_page_range-debug.patch
>
> Turns out the code path which this patch was trying to detect the deadness
> of is in fact used.
>
> +sched-stats-64-bit.patch
>
> Use 64-bit numbers for various CPU scheduler statistics
>
> -hotplugcpu-generalise-bogolock.patch
> -hotplugcpu-generalise-bogolock-fix-for-kthread-stop-using-signals.patch
> -hotplugcpu-use-bogolock-in-modules.patch
> -hotplugcpu-core.patch
> -stop_machine-warning-fix.patch
> -hotplugcpu-core-sparc64-build-fix.patch
> -hotplugcpu-core-fix-for-kthread-stop-using-signals.patch
> -migrate_to_cpu-dependency-fix.patch
> -hotplugcpu-core-drain_local_pages-fix.patch
> -hotplugcpu-rcupdate-many-cpus-fix.patch
>
> Dropped
>
> -ext3-dirty-debug-patch.patch
>
> This debug trap never triggered
>
> -fusion-use-min-max.patch
>
> Other changes broke this
>
> +dm-map-rwlock-ng.patch
>
> New version of spinlocking for the device mapper map tables
>
> +dm-remove-__dm_request.patch
>
> Remove __dm_request()
>
> +md-array-assembly-major-fix.patch
>
> RAID fix
>
> +fadvise-fixups.patch
>
> Fix some fadvise() boundary conditions
>
> +validate_mm-fixes.patch
>
> Enhance validate_mm()
>
> +3ware-update.patch
>
> 3ware driver update
>
> +3c59x-xcvr-fix.patch
>
> Fix 3c59x transceiver handling
>
> +current_is_keventd-speedup.patch
>
> Simplify current_is_keventd()
>
> +root-ramdisk-fix.patch
>
> Make "root=/dev/ram" work again
>
> +cciss-per-device-queues.patch
>
> per-device queues for the cciss driver
>
> +blkdev-fix-final-page.patch
>
> Fix reads of the final block of blockdevs
>
> +wavfront-needs-syscalls_h.patch
>
> Warning (and possible oops) fixes
>
> +edd-legacy-parameters-fix.patch
>
> EDD back-compatibility
>
> +cciss-section-fix.patch
>
> __init section fix
>
> +pte_chain-nowarns.patch
>
> Prevent possible-but-expected page allocator warnings
>
> +macintosh-config-fix.patch
>
> Don't offer mac drivers on other platforms
>
> +applicom-warning-fix.patch
>
> Fix a warning
>
> +CONFIG_NVRAM-dependencies.patch
>
> Fix NVRAM dependencies
>
> +move-job-control-stuff-tosignal_struct.patch
>
> Move various job control fields out of the task_struct and into the
> signal_struct.
>
> +module_h-attribute_used-fix.patch
>
> __attribute_used__ sanity
>
> +kobject-module-request-64-bit-fix.patch
>
> Fix for 64-bit machines
>
> +sch_htb-fix.patch
>
> netfilter 64-bit fix
>
> +blk-congestion-races.patch
>
> Conceivably fix rare races in blk_congestion_wait()
>
> +vm-lrutopage-cleanup.patch
>
> Add a handy macro to tidy up vmscan.c
>
> +radix-tree-tagging.patch
>
> Add search tagging to radix trees.
>
> +irq-safe-pagecache-lock.patch
>
> Make mapping->page_lock irq-safe, and rename it to tree_lock to detect
> missed conversions.
>
> +tag-dirty-pages.patch
>
> Tag dirty pages as being dirty within their radix trees.
>
> +tag-writeback-pages.patch
>
> Tag writeback pages as being under writeback in their radix trees
>
> +stop-using-dirty-pages.patch
> +stop-using-io-pages.patch
> +stop-using-locked-pages.patch
> +stop-using-clean-pages.patch
>
> Wean the kernel off the four address_space page lists
>
> +unslabify-pgds-and-pmds.patch
>
> We cannot use page->lru to manage slab-derived pages: slab itself wants to
> use it.
>
> +slab-stop-using-page-list.patch
>
> Switch slab page management from page->list to page->lru.
>
> +page_alloc-stop-using-page-list.patch
>
> Switch the page allocator from using page->list to using page->lru.
>
> +hugetlb-stop-using-page-list.patch
>
> Switch the hugetlbpage implementations from using page->list to using
> page->lru.
>
> +pageattr-stop-using-page-list.patch
>
> Switch the pageattr code (CONFIG_DEBUG_PAGEALLOC) from using page->list to
> using page->lru.
>
> +readahead-stop-using-page-list.patch
>
> Switch the readpages() API from using page->list over to using page->lru.
>
> +compound-pages-stop-using-lru.patch
>
> Teach the compound page management to use page fields other than page->list.
>
> +remove-page-list.patch
>
> Remove the `list' field from struct page.
>
> +remap-file-pages-prot-ia64-2.6.4-rc2-mm1-A0.patch
>
> Implement the per-page-permissions-in-remap_file_pages for ia64. Hasn't
> been tested.
>
> -4g4g-THREAD_SIZE-fixes.patch
> -4g4g-handle_BUG-fix.patch
>
> Folded into 4g-2.6.0-test2-mm2-A5.patch
>
> O_DIRECT-vs-buffered-fix.patch
> O_DIRECT-vs-buffered-fix-pdflush-hang-fix.patch
> serialise-writeback-fdatawait.patch
> restore-writeback-trylock.patch
>
> Dropped. Hopefully we don't need these any more.
>
>
>
>
>
>
> All 258 patches:
>
>
>
> bk-acpi.patch
>
> bk-alsa.patch
>
> bk-driver-core.patch
>
> bk-i2c.patch
>
> bk-input.patch
>
> bk-netdev.patch
>
> bk-pci.patch
>
> bk-scsi.patch
>
> bk-usb.patch
>
> mm.patch
> add -mmN to EXTRAVERSION
>
> dma_sync_for_device-cpu.patch
> dma_sync_for_{cpu,device}()
>
> bk-acpi-warning-fix.patch
> bk-acpi warning fixes
>
> x86_64-update.patch
> x86-64 merge for 2.6.4
>
> move-dma_consistent_dma_mask.patch
> move consistent_dma_mask to the generic device
>
> move-dma_consistent_dma_mask-x86_64-fix.patch
>
> move-dma_consistent_dma_mask-sn-fix.patch
> Fix dma_mask patch for sn platform
>
> print-kernel-version-in-oops.patch
> print kernel version in oops messages
>
> kgdb-ga.patch
> kgdb stub for ia32 (George Anzinger's one)
> kgdbL warning fix
> kgdb buffer overflow fix
> kgdbL warning fix
> kgdb: CONFIG_DEBUG_INFO fix
> x86_64 fixes
> correct kgdb.txt Documentation link (against 2.6.1-rc1-mm2)
>
> kgdb-ga-recent-gcc-fix.patch
> kgdb: fix for recent gcc
>
> kgdboe-netpoll.patch
> kgdb-over-ethernet via netpoll
>
> kgdboe-non-ia32-build-fix.patch
>
> kgdb-warning-fixes.patch
> kgdb warning fixes
>
> kgdb-x86_64-support.patch
> kgdb-x86_64-support.patch for 2.6.2-rc1-mm3
>
> kgdb-THREAD_SIZE-fixes.patch
> THREAD_SIZE fixes for kgdb
>
> must-fix.patch
> must fix lists update
> must fix list update
> mustfix update
>
> must-fix-update-5.patch
> must-fix update
>
> ppc64-iseries-virtual-console-fix.patch
> ppc64: fix iSeries virtual console devices
>
> ppc64-reloc_hide.patch
>
> compat-signal-noarch-2004-01-29.patch
> Generic 32-bit compat for copy_siginfo_to_user
>
> compat-generic-ipc-emulation.patch
> generic 32 bit emulation for System-V IPC
>
> remove-sys_ioperm-stubs.patch
> Clean up sys_ioperm stubs
>
> readdir-cleanups.patch
> readdir() cleanups
>
> ext3-journalled-quotas-2.patch
> ext3: journalled quota
>
> invalidate_inodes-speedup.patch
> invalidate_inodes speedup
> more invalidate_inodes speedup fixes
>
> cfq-4.patch
> CFQ io scheduler
> CFQ fixes
>
> config_spinline.patch
> uninline spinlocks for profiling accuracy.
>
> pdflush-diag.patch
>
> get_user_pages-handle-VM_IO.patch
> fix get_user_pages() against mappings of /dev/mem
>
> pci_set_power_state-might-sleep.patch
>
> CONFIG_STANDALONE-default-to-n.patch
> Make CONFIG_STANDALONE default to N
>
> extra-buffer-diags.patch
>
> CONFIG_SYSFS.patch
> From: Pat Mochel <[email protected]>
> Subject: [PATCH] Add CONFIG_SYSFS
>
> CONFIG_SYSFS-boot-from-disk-fix.patch
>
> slab-leak-detector.patch
> slab leak detector
> mm/slab.c warning in cache_alloc_debugcheck_after
>
> scale-nr_requests.patch
> scale nr_requests with TCQ depth
>
> truncate_inode_pages-check.patch
>
> local_bh_enable-warning-fix.patch
>
> sched-stats-64-bit.patch
> Use 64-bit counters for scheduler stats
>
> sched-find_busiest_node-resolution-fix.patch
> sched: improved resolution in find_busiest_node
>
> sched-domains.patch
> sched: scheduler domain support
> sched: fix for NR_CPUS > BITS_PER_LONG
> sched: clarify find_busiest_group
> sched: find_busiest_group arithmetic fix
>
> sched-domains-improvements.patch
> sched domains kernbench improvements
>
> sched-clock-fixes.patch
> fix sched_clock()
>
> sched-sibling-map-to-cpumask.patch
> sched: cpu_sibling_map to cpu_mask
> p4-clockmod sibling_map fix
> p4-clockmod: handle more than two siblings
>
> sched-domains-i386-ht.patch
> sched: implement domains for i386 HT
> sched: Fix CONFIG_SMT oops on UP
> sched: fix SMT + NUMA bug
> Change arch_init_sched_domains to use cpu_online_map
> Fix build with NR_CPUS > BITS_PER_LONG
>
> sched-domain-tweak.patch
> i386-sched-domain code consolidation
>
> sched-no-drop-balance.patch
> sched: handle inter-CPU jiffies skew
>
> sched-directed-migration.patch
> sched_balance_exec(): don't fiddle with the cpus_allowed mask
>
> sched-domain-debugging.patch
> sched_domain debugging
>
> sched-domain-balancing-improvements.patch
> scheduler domain balancing improvements
>
> sched-group-power.patch
> sched-group-power
> sched-group-power warning fixes
>
> sched-domains-use-cpu_possible_map.patch
> sched_domains: use cpu_possible_map
>
> sched-smt-nice-handling.patch
> sched: SMT niceness handling
>
> sched-smt-nice-optimisation.patch
> sched: SMT-nice optimisation
>
> fa311-mac-address-fix.patch
> wrong mac address with netgear FA311 ethernet card
>
> laptop-mode-2.patch
> laptop-mode for 2.6, version 6
> Documentation/laptop-mode.txt
> laptop-mode documentation updates
> Laptop mode documentation addition
> laptop mode simplification
>
> pid_max-fix.patch
> Bug when setting pid_max > 32k
>
> use-soft-float.patch
> Use -msoft-float
>
> DRM-cvs-update.patch
> DRM cvs update
>
> drm-include-fix.patch
>
> process-migration-speedup.patch
> Reduce TLB flushing during process migration
>
> nfs-31-attr.patch
> NFSv2/v3/v4: New attribute revalidation code
>
> nfs-reconnect-fix.patch
>
> nfs-mount-fix.patch
> Update to NFS mount....
>
> nfs-d_drop-lowmem.patch
> NFS: handle nfs_fhget() error
>
> nfs-avoid-i_size_write.patch
> NFS: avoid unlocked i_size_write()
>
> nfs_unlink-oops-fix.patch
> nfs: fix "busy inodes after umount"
>
> nfs-remove-XID-spinlock.patch
> nfs: Remove an unnecessary spinlock from XID generation...
>
> nfs-misc-rpc-fixes.patch
> nfs: Misc RPC fixes...
>
> nfs-improved-writeback-strategy.patch
> nfs: improve writeback caching
>
> nfs-simplify-config-options.patch
> nfs: simplify client configuration options.
>
> nfs-fix-msync.patch
> nfs: fix msync()
>
> nfs-mount-return-useful-errors.patch
> nfs: make mount command return more useful errors
>
> nfs-misc-minor-fixes.patch
> nfs: misc minor fixes
>
> nfs-lockd-sync-01.patch
> nfs: sync lockd to 2.4.x
>
> nfs-lockd-sync-02.patch
> nfs: sync lockd to 2.4.x
>
> nfs-lockd-sync-03.patch
> nfs: sync lockd to 2.4.x
>
> nfs-lockd-sync-04.patch
> nfs: sync lockd to 2.4.x
>
> nfs-rpc-remove-redundant-memset.patch
> nfs: remove unnecessary memset() in RPC
>
> nfs-tunable-rpc-slot-table.patch
> nfs: make the RPC slot table size a tunable value.
>
> nfs-short-read-fix.patch
> nfs: fix an NFSv2 read bug
>
> nfs-server-in-root_server_path.patch
> Pull NFS server address out of root_server_path
>
> non-readable-binaries.patch
> Handle non-readable binfmt_misc executables
>
> binfmt_misc-credentials.patch
> binfmt_misc: improve calculation of interpreter's credentials
>
> initramfs-search-for-init.patch
> search for /init for initramfs boots
>
> adaptive-lazy-readahead.patch
> adaptive lazy readahead
>
> sysfs_remove_dir-race-fix.patch
> sysfs_remove_dir-vs-dcache_readdir race fix
>
> sysfs_remove_subdir-dentry-leak-fix.patch
> Fix dentry refcounting in sysfs_remove_group()
>
> per-node-rss-tracking.patch
> Track per-node RSS for NUMA
>
> aic7xxx-deadlock-fix.patch
> aic7xxx deadlock fix
>
> futex_wait-debug.patch
> futex_wait debug
>
> module_exit-deadlock-fix.patch
> module unload deadlock fix
>
> selinux-inode-race-trap.patch
> Try to diagnose Bug 2153
>
> ufs2-01.patch
> read-only support for UFS2
>
> ide-scsi-error-handling-fixes.patch
> ide-scsi error handling fixes
>
> ide-scsi-error-handling-update.patch
> ide-scsi error handler update
>
> fb_console_init-fix.patch
> fb_console_init fix
>
> poll-select-longer-timeouts.patch
> poll()/select(): support longer timeouts
>
> poll-select-range-check-fix.patch
> poll()/select() range checking fix
>
> poll-select-handle-large-timeouts.patch
> poll()/select(): handle long timeouts
>
> pcmcia-debugging-rework-1.patch
> Overhaul PCMCIA debugging (1)
>
> cs_err-compile-fix.patch
> pcmcia: workaround for gcc-2.95 bug in cs_err()
>
> pcmcia-debugging-rework-2.patch
> Overhaul PCMCIA debugging (2)
>
> distribute-early-allocations-across-nodes.patch
> Manfred's patch to distribute boot allocations across nodes
>
> time-interpolator-fix.patch
> time interpolator fix
>
> kmsg-nonblock.patch
> teach /proc/kmsg about O_NONBLOCK
>
> mixart-build-fix.patch
> CONFIG_SND_MIXART doesn't compile
>
> add-a-slab-for-ethernet.patch
> Add a kmalloc slab for ethernet packets
>
> remove-__io_virt_debug.patch
> remove __io_virt_debug
>
> genrtc-cleanups.patch
> genrtc: cleanups
>
> piix_ide_init-can-be-__init.patch
> piix_ide_init can be __init
>
> i386-early-memory-cleanup.patch
> i386 very early memory detection cleanup patch
>
> modular-mce-handler.patch
> Allow X86_MCE_NONFATAL to be a module
>
> remove-more-KERNEL_SYSCALLS.patch
> further __KERNEL_SYSCALLS__ removal
> build fix for remove-more-KERNEL_SYSCALLS.patch
> fix the build for remove-more-KERNEL_SYSCALLS
>
>
> mq-01-codemove.patch
> posix message queues: code move
>
> mq-02-syscalls.patch
> posix message queues: syscall stubs
>
> mq-03-core.patch
> posix message queues: implementation
>
> mq-03-core-update.patch
> posix message queues: update to core patch
>
> mq-04-linuxext-poll.patch
> posix message queues: linux-specific poll extension
>
> mq-05-linuxext-mount.patch
> posix message queues: made user mountable
>
> mq-update-01.patch
> posix message queue update
>
> mq-security-fix.patch
> security bugfix for mqueue
>
> dm-01-endio-method.patch
> dm: endio method
>
> dm-03-list_for_each_entry-audit.patch
> dm: list_for_each_entry audit
>
> dm-04-default-queue-limits-fix.patch
> dm: default queue limits
>
> dm-05-list-targets-command.patch
> dm: list targets cmd
>
> dm-06-stripe-width-fix.patch
> dm: stripe width fix
>
> queue-congestion-callout.patch
> Add queue congestion callout
>
> queue-congestion-dm-implementation.patch
> Implement queue congestion callout for device mapper
>
> dm-maplock.patch
> devicemapper: use rwlock for map alterations
>
> dm-map-rwlock-ng.patch
> Another DM maplock implementation
>
> dm-remove-__dm_request.patch
> dm: remove __dm_request
>
> use-wait_task_inactive-in-kthread_bind.patch
> use wait_task_inactive() in kthread_bind()
>
> HPFS1-hpfs2-RC4-rc1.patch
>
> HPFS2-hpfs_namei-RC4-rc1.patch
>
> selinux-cleanup-binary-mount-data.patch
> selinux: clean up binary mount data
>
> udffs-update.patch
> UDF filesystem update
>
> kbuild-redundant-CFLAGS.patch
> kbuild: Remove CFLAGS assignment in i386/mach-*/Makefile
>
> numa-aware-zonelist-builder.patch
> NUMA-aware zonelist builder
> numa-aware zonelist builder fix
> numa-aware node builder fix #2
>
> remove-redundant-unplug_timer-deletion.patch
> Redundant unplug_timer deletion
>
> queue_work_on_cpu.patch
> Add queue_work_on_cpu() workqueue function
>
> m68k-rename-sys_functions.patch
> m68k: rename sys_* functions
>
> pdc202xx_new-update.patch
> ide: update for pdc202xx_new driver
>
> siimage-update.patch
> ide: update for siimage driver
>
> ide-cleanups-01.patch
> ide: IDE cleanups
>
> ide-cleanups-02.patch
> ide: IDE cleanups
>
> ide-cleanups-03.patch
> ide: IDE cleanups
>
> cdromaudio-use-dma.patch
> use DMA for CDROM audio reading
>
> sysfs-pin-kobject.patch
> sysfs: pin kobjects to fix use-after-free crashes
>
> ATI-IXP-IDE-support.patch
> ATI IXP IDE support
>
> ipmi-updates-3.patch
> IPMI driver updates
>
> ipmi-socket-interface.patch
> IPMI: socket interface
>
> md-use-schedule_timeout.patch
> md: use "schedule_timeout(2)" instead of yield()
>
> md-array-assembly-fix.patch
> md: allow assembling of partitioned arrays at boot time.
>
> md-array-assembly-major-fix.patch
> md array assembly major number fix
>
> compiler_h-scope-fixes.patch
> compiler.h scoping fixes
>
> nmi_watchdog-local-apic-fix.patch
> Fix nmi_watchdog=2 and P4 HT
>
> nmi-1-hz.patch
> set nmi_hz to 1 with nmi_watchdog=2 and SMP
>
> elf-mmap-fix.patch
> Fix elf mapping of the zero page
>
> kbuild-more-cleaning.patch
> kbuild: Cause `make clean' to remove more files
>
> LOOP_CHANGE_FD.patch
> LOOP_CHANGE_FD ioctl
>
> loop-setup-race-fix.patch
> loop setup race fix
>
> handle-dot-o-paths.patch
> kbuild: fix usage with directories containing '.o'
>
> acpi-asmlinkage-fix.patch
> gcc-3.5: acpi build fix
>
> ipc-sem-extra-sem_unlock.patch
> Remove unneeded unlock in ipc/sem.c
>
> procfs-dangling-subdir-fix.patch
> /proc data corruption check
>
> AMD-768MPX-bootmem-fix.patch
> Work around an AMD768MPX erratum
>
> i810fb-on-x86_64.patch
> Enable i810 fb on x86-64
>
> ext23-remove-acl-limits.patch
> Remove arbitrary #acl entries limits on ext[23] when reading
>
> watchdog-moduleparam-patches.patch
> watchdog: moduleparam-patches
>
> amd-elan-fix.patch
> AMD ELAN Kconfig fix
>
> pcmcia-netdev-ordering-fixes.patch
> PCMCIA netdevice ordering issues
>
> fadvise-fixups.patch
> fadvise(POSIX_FADV_DONTNEED) fixups
>
> validate_mm-fixes.patch
> Fix and harden validate_mm
>
> 3ware-update.patch
> 3ware driver update
>
> 3c59x-xcvr-fix.patch
> Fix 3c59x transceiver handling
>
> current_is_keventd-speedup.patch
> current_is_keventd() speedup
>
> root-ramdisk-fix.patch
> Fix rootfs on ramdisk
>
> cciss-per-device-queues.patch
> cciss: per device queues
>
> blkdev-fix-final-page.patch
> Fix reading the last block on a bdev
>
> wavfront-needs-syscalls_h.patch
> wavfront.c needs syscalls.h
>
> edd-legacy-parameters-fix.patch
> EDD: Get Legacy Parameters
>
> cciss-section-fix.patch
> cciss: init section fix
>
> pte_chain-nowarns.patch
> add nowarn to a few pte chain allocators
>
> macintosh-config-fix.patch
> Disable Macintosh device drivers for all but PPC || MAC
>
> applicom-warning-fix.patch
> Applicom warning
>
> CONFIG_NVRAM-dependencies.patch
> Fix CONFIG_NVRAM dependencies
>
> move-job-control-stuff-tosignal_struct.patch
> move job control fields from task_struct to signal_struct
>
> module_h-attribute_used-fix.patch
> module.h __attribute_used__ fix
>
> kobject-module-request-64-bit-fix.patch
> Fix a 64bit bug in kobject module request
>
> sch_htb-fix.patch
> net: fix sch_htb on 64-bit
>
> instrument-highmem-page-reclaim.patch
> vm: per-zone vmscan instrumentation
>
> blk_congestion_wait-return-remaining.patch
> return remaining jiffies from blk_congestion_wait()
>
> blk-congestion-races.patch
> Narrow blk_congestion_wait races
>
> vmscan-remove-priority.patch
> mm/vmscan.c: remove unused priority argument.
>
> kswapd-throttling-fixes.patch
> kswapd throttling fixes
>
> vm-refill_inactive-preserve-referenced.patch
> vmscan: preserve page referenced info in refill_inactive()
>
> shrink_slab-precision-fix.patch
> shrink_slab: math precision fix
>
> try_to_free_pages-shrink_slab-evenness.patch
> vm: shrink slab evenly in try_to_free_pages()
>
> vmscan-total_scanned-fix.patch
> vmscan: fix calculation of number of pages scanned
>
> shrink_slab-for-all-zones-2.patch
> vm: scan slab in response to highmem scanning
>
> zone-balancing-fix-2.patch
> vmscan: zone balancing fix
>
> vmscan-control-by-nr_to_scan-only.patch
> vmscan: drive everything via nr_to_scan
>
> vmscan-balance-zone-scanning-rates.patch
> Balance inter-zone scan rates
>
> vmscan-dont-throttle-if-zero-max_scan.patch
> vmscan: avoid bogus throttling
>
> kswapd-avoid-higher-zones.patch
> kswapd: avoid unnecessary reclaiming from higher zones
>
> kswapd-avoid-higher-zones-reverse-direction.patch
> kswapd: fix lumpy page reclaim
>
> kswapd-avoid-higher-zones-reverse-direction-fix.patch
> fix the kswapd zone scanning algorithm
>
> vmscan-throttle-later.patch
> vmscan: less throttling of page allocators and kswapd
>
> vm-batch-inactive-scanning.patch
> vmscan: batch up inactive list scanning work
>
> vm-batch-inactive-scanning-fix.patch
> fix vm-batch-inactive-scanning.patch
>
> vm-balance-refill-rate.patch
> vm: balance inactive zone refill rates
>
> vm-lrutopage-cleanup.patch
> vmscan: add lru_to_page() helper
>
> slab-no-higher-order.patch
> slab: avoid higher-order allocations
>
> O_DIRECT-race-fixes-rollup.patch
> O_DIRECT data exposure fixes
>
> O_DIRECT-ll_rw_block-vs-block_write_full_page-fix.patch
> Fix race between ll_rw_block() and block_write_full_page()
>
> blockdev-direct-io-speedup.patch
> blockdev direct-io speedups
>
> dio-aio-fixes.patch
> direct-io AIO fixes
>
> aio-fallback-bio_count-race-fix-2.patch
> AIO+DIO bio_count race fix
>
> aio-direct-io-oops-fix.patch
> AIO/direct-io oops fix
>
> radix-tree-tagging.patch
> radix-tree tags for selective lookup
>
> irq-safe-pagecache-lock.patch
> make the pagecache lock irq-safe.
>
> tag-dirty-pages.patch
> tag dirty pages as such in the radix tree
>
> tag-writeback-pages.patch
> tag writeback pages as such in their radix tree
>
> stop-using-dirty-pages.patch
> stop using the address_space dirty_pages list
>
> stop-using-io-pages.patch
> remove address_space.io_pages
>
> stop-using-locked-pages.patch
> Stop using address_space.locked_pages
>
> stop-using-clean-pages.patch
> stop using address_space.clean_pages
>
> unslabify-pgds-and-pmds.patch
> revert the slabification of i386 pgd's and pmd's
>
> slab-stop-using-page-list.patch
> slab: stop using page.list
>
> page_alloc-stop-using-page-list.patch
> stop using page.list in the page allocator
>
> hugetlb-stop-using-page-list.patch
> stop using page->list in the hugetlbpage implementations
>
> pageattr-stop-using-page-list.patch
> stop using page.list in pageattr.c
>
> readahead-stop-using-page-list.patch
> stop using page.list in readahead
>
> compound-pages-stop-using-lru.patch
> stop using page->lru in compound pages
>
> remove-page-list.patch
> remove page.list
>
> remap-file-pages-prot-2.6.4-rc1-mm1-A1.patch
> per-page protections for remap_file_pages()
>
> remap-file-pages-prot-ia64-2.6.4-rc2-mm1-A0.patch
> remap_file_pages page-prot implementation for ia64
>
> list_del-debug.patch
> list_del debug check
>
> oops-dump-preceding-code.patch
> i386 oops output: dump preceding code
>
> lockmeter.patch
> lockmeter
>
> lockmeter-ia64-fix.patch
> ia64 CONFIG_LOCKMETER fix
>
> 4g-2.6.0-test2-mm2-A5.patch
> 4G/4G split patch
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g/4g usercopy atomicity fix
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g/4g usercopy atomicity fix
> 4G/4G preempt on vstack
> 4G/4G: even number of kmap types
> 4g4g: fix __get_user in slab
> 4g4g: Remove extra .data.idt section definition
> 4g/4g linker error (overlapping sections)
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g4g: show_registers() fix
> 4g/4g usercopy atomicity fix
> 4g4g: debug flags fix
> 4g4g: Fix wrong asm-offsets entry
> cyclone time fixmap fix
> 4G/4G preempt on vstack
> 4G/4G: even number of kmap types
> 4g4g: fix __get_user in slab
> 4g4g: Remove extra .data.idt section definition
> 4g/4g linker error (overlapping sections)
> 4G/4G: remove debug code
> 4g4g: pmd fix
> 4g/4g: fixes from Bill
> 4g4g: fpu emulation fix
> 4g4g: show_registers() fix
> 4g/4g usercopy atomicity fix
> 4g4g: debug flags fix
> 4g4g: Fix wrong asm-offsets entry
> cyclone time fixmap fix
> use direct_copy_{to,from}_user for kernel access in mm/usercopy.c
> 4G/4G might_sleep warning fix
> 4g/4g pagetable accounting fix
> Fix 4G/4G and WP test lockup
> 4G/4G KERNEL_DS usercopy again
> Fix 4G/4G X11/vm86 oops
> Fix 4G/4G athlon triplefault
> 4g4g SEP fix
> Fix 4G/4G split fix for pre-pentiumII machines
> 4g/4g PAE ACPI low mappings fix
> zap_low_mappings() cannot be __init
> 4g/4g: remove printk at boot
> 4g4g: fix handle_BUG()
> 4g4g: acpi sleep fixes
>
> 4g4g-locked-userspace-copy.patch
> Do a locked user-space copy for 4g/4g
>
> ia32-4k-stacks.patch
> ia32: 4Kb stacks (and irqstacks) patch
>
> ia32-4k-stacks-build-fix.patch
> 4k stacks build fix
>
> 4k-stacks-in-modversions-magic.patch
> Add 4k stacks to module version magic
>
> ppc-fixes.patch
> make mm4 compile on ppc
>
> ppc-fixes-dependency-fix.patch
> ppc-fixes dependency fix
>
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Regards, Redeeman
[email protected]
Redeeman wrote:
> Hey Andrew, I have a problem with this kernel: when it boots, it lists
> vp_ide and so on, and then suddenly after that my screen gets flooded
> with traces and such. They come so fast that I can't even read them, and
> the system does not get any further.
Same here. bad: scheduling while atomic. .config attached (no dmesg as I have
no experience with serial consoles yet.)
HTH,
Norberto
IBM Thinkpad T30, current bios
On a clean boot (not resume - I've not gotten that working):
resuming from /dev/hda8
Resuming from device hda8
bad: scheduling while atomic!
Call Trace: (abbreviated - doing this by hand on nearby PC)
schedule+0x5d5
mempool_alloc+0x64
generic_unplug_device+0x55
blk_run_queues+0x79
io_schedule+0x3
__wait_on_buffer+0xca
autoremove_wake_function+0x0
autoremove_wake_function+0x0 (no, not a typo)
__bread_slow+0x43
__bread+0x1b
bdev_read_page+0x24
read_suspend_image+0x131
printk+0x127
release_console_sem+0xd7
software_resume+0x7a
do_initcalls+0x2b
idedisk_init+0x0
init+0x0
init+0x38
kernel_thread_helper+0x5
Resume Machine: This is normal swap space
-------------- [ cut here ] ------------------
kernel BUG at kernel/printk.c:568!
invalid operand: 0000 [#1]
PREEMPT
CPU: 0
EIP: 0060:[<c0122d14>] Not tainted VLI
EFLAGS: 00010206 (2.6.4-mm1)
EIP is at acquire_console_sem+0x14/0x60
eax: dff4f00 ebx: c03f3b88 ecx: c13fb760 edx: c0350578
esi: 0000001 edi: 00000000 ebp: dff4ffa4 esp: dff4ffa0
ds: 007b es: 007b ss: 0068
Process swapper (pid: 1, threadinfo=dff4f000 task=c141d680)
...
Call Trace:
pm_restore_console+0x12
software_resume+0x83
do_initcalls+0x2b
idedisk_init+0x0
init+0x0
init+0x38
kernel_thread_helper+0x5
Kernel panic: Fatal exception in interrupt
In interrupt handler - not syncing
--
Rick Nelson
After watching my newly-retired dad spend two weeks learning how to make a new
folder, it became obvious that "intuitive" mostly means "what the writer or
speaker of intuitive likes".
-- Bruce Ediger, [email protected], on X the intuitiveness of a Mac interface
Norberto Bensa <[email protected]> wrote:
>
> Redeeman wrote:
> > Hey Andrew, I have a problem with this kernel: when it boots, it lists
> > vp_ide and so on, and then suddenly after that my screen gets flooded
> > with traces and such. They come so fast that I can't even read them, and
> > the system does not get any further.
>
> Same here. bad: scheduling while atomic. .config attached (no dmesg as I have
> no experience with serial consoles yet.)
Did you remove the spin_unlock_irq() from the end of mpage_writepages()?
I didn't do anything more than patch with mm1. Is there a patch for
that spin_unlock_irq() change? :)
On Thu, 2004-03-11 at 19:09, Andrew Morton wrote:
> Norberto Bensa <[email protected]> wrote:
> >
> > Redeeman wrote:
> > > Hey Andrew, I have a problem with this kernel: when it boots, it lists
> > > vp_ide and so on, and then suddenly after that my screen gets flooded
> > > with traces and such. They come so fast that I can't even read them, and
> > > the system does not get any further.
> >
> > Same here. bad: scheduling while atomic. .config attached (no dmesg as I have
> > no experience with serial consoles yet.)
>
> Did you remove the spin_unlock_irq() from the end of mpage_writepages()?
--
Regards, Redeeman
[email protected]
Andrew Morton wrote:
> Norberto Bensa <[email protected]> wrote:
> > Redeeman wrote:
> > > Hey Andrew, I have a problem with this kernel: when it boots, it lists
> > > vp_ide and so on, and then suddenly after that my screen gets flooded
> >
> > Same here. bad: scheduling while atomic. .config attached (no dmesg as I
> > have no experience with serial consoles yet.)
>
> Did you remove the spin_unlock_irq() from the end of mpage_writepages()?
Done now.
$ uname -a
Linux venkman 2.6.4-mm1 #2 Thu Mar 11 15:18:21 ART 2004 i686 Pentium III
(Coppermine) GenuineIntel GNU/Linux
Thanks Andrew!
Norberto
Yes, we have been testing the sched-domain scheduler. So far, the
results are all positive. We'll add more stress to it, running various
workloads.
Jun
>-----Original Message-----
>From: [email protected] [mailto:linux-kernel-
>[email protected]] On Behalf Of Andrew Morton
>Sent: Wednesday, March 10, 2004 11:32 PM
>To: [email protected]
>Subject: 2.6.4-mm1
>
>
>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.4/2.6.4-mm1/
>
>
>
>- The CPU scheduler changes in -mm (sched-domains) have been hanging about
>  for too long. I had been hoping that the people who care about SMT and
>  NUMA performance would have some results by now but all seems to be silent.
>
>  I do not wish to merge these up until the big-iron guys can say that they
>  suit their requirements, with a reasonable expectation that we will not
>  need to churn this code later in the 2.6 series.
>
> So. If you have been testing, please speak up. If you have not been
> testing, please do so.
>
Redeeman <[email protected]> wrote:
>
> I didn't do anything more than patch with mm1. Is there a patch for
> that spin_unlock_irq() change? :)
--- 25/fs/mpage.c~a 2004-03-11 10:46:29.000000000 -0800
+++ 25-akpm/fs/mpage.c 2004-03-11 10:46:31.000000000 -0800
@@ -672,7 +672,6 @@ mpage_writepages(struct address_space *m
}
pagevec_release(&pvec);
}
- spin_unlock_irq(&mapping->tree_lock);
if (bio)
mpage_bio_submit(WRITE, bio);
return ret;
_
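For readers following the fix above: one plausible reading, on a kernel built with CONFIG_PREEMPT, is that an apparently stray unlock leaves the preemption count unbalanced, which is the kind of state the "scheduling while atomic" check complains about. The sketch below is only a userspace toy model of that bookkeeping, not kernel code. The names toy_task, toy_lock, toy_unlock, toy_in_atomic and the two toy_writepages_* functions are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a task's preemption counter (hypothetical, not kernel code). */
struct toy_task {
	int preempt_count;
};

/* Taking a lock bumps the counter; releasing it drops the counter again. */
void toy_lock(struct toy_task *t)   { t->preempt_count++; }
void toy_unlock(struct toy_task *t) { t->preempt_count--; }

/* In this model, any non-zero count means "not safe to sleep". */
bool toy_in_atomic(const struct toy_task *t)
{
	return t->preempt_count != 0;
}

/* Balanced variant: every lock has exactly one unlock. */
void toy_writepages_balanced(struct toy_task *t)
{
	toy_lock(t);
	/* ... walk the pages ... */
	toy_unlock(t);
}

/* Unbalanced variant: one extra unlock at the end of the function. */
void toy_writepages_extra_unlock(struct toy_task *t)
{
	toy_lock(t);
	/* ... walk the pages ... */
	toy_unlock(t);
	toy_unlock(t);	/* stray unlock: the count drops below zero */
}

/* Small self-check comparing the two variants on fresh tasks. */
void toy_selfcheck(void)
{
	struct toy_task a = { 0 }, b = { 0 };

	toy_writepages_balanced(&a);
	toy_writepages_extra_unlock(&b);
	assert(!toy_in_atomic(&a));
	assert(toy_in_atomic(&b));
}
```

Calling toy_selfcheck() from a main() exercises both variants. The model says nothing about interrupts or real spinlocks; it only shows why lock and unlock calls need to pair up.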
[email protected] wrote:
> On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
>
>> This affects I/O scheduling potentially quite significantly. It is no
>> longer the case that the kernel will submit pages for I/O in the order in
>> which the application dirtied them. We instead submit them in file-offset
>> order all the time.
>
>
> Hi Andrew,
> I have a feeling this change might significantly improve the external
> sorting benchmark I emailed you ( http://lkml.org/lkml/2003/12/20/46 ).
> I will try running it when I get a chance and let you know. It gives me
> a good excuse to get 2.6 kernels working on my systems :-)
Hmm, what is happening with Roger Luethi's work lately?
Have there been any patches for use once in this case?
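The ordering change quoted above (pages submitted in file-offset order rather than in the order the application dirtied them) can be pictured with a small userspace sketch. It is only an illustration: the announcement describes in-order radix-tree walks, and the qsort() call here merely stands in for such a walk. The function names cmp_page_index and to_file_offset_order are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Compare two page indices (file offsets in page units) for qsort(). */
int cmp_page_index(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/*
 * Take page indices in the order they were dirtied and rearrange them
 * into ascending file-offset order, the order a tree walk would visit them.
 */
void to_file_offset_order(unsigned long *idx, size_t n)
{
	qsort(idx, n, sizeof(*idx), cmp_page_index);
}

/* Self-check: the result must be non-decreasing. */
void check_sorted(const unsigned long *idx, size_t n)
{
	for (size_t i = 1; i < n; i++)
		assert(idx[i - 1] <= idx[i]);
}
```

For example, an application that dirties pages 7, 2, 9 and 0 in that order would, under this model, see them submitted as 0, 2, 7, 9.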
Andi Kleen <[email protected]> wrote:
>
> Andrew Morton <[email protected]> writes:
>
> > - The CPU scheduler changes in -mm (sched-domains) have been hanging about
> > for too long. I had been hoping that the people who care about SMT and
> > NUMA performance would have some results by now but all seems to be silent.
> >
> > I do not wish to merge these up until the big-iron guys can say that they
> > suit their requirements, with a reasonable expectation that we will not
> > need to churn this code later in the 2.6 series.
> >
> > So. If you have been testing, please speak up. If you have not been
> > testing, please do so.
>
> I tested them on Opteron NUMA systems and they are worse on simple
> tests than the stock scheduler (e.g. the parallelized STREAM test,
> which is a bit silly, but still fairly important)
OK, thanks.
> For SMT there is a patch from Intel pending that teaches x86-64
> to set up the SMT scheduler. They said they got slightly better
> benchmark results. The SMT setup seems to be racy though.
Am I correct in thinking that this patch provides the necessary hooks to
integrate x86_64 into the new functionality which sched-domains provides, or
is the Intel patch independent of sched-domains?
> Some kind of SMT scheduler is definitely needed, we have a serious
> regression compared to 2.4 here right now. I'm not sure this
> is the right approach though, it seems to be far too complex.
Well that's discouraging. I really do want to push this thing along a bit.
Yours is the only report of regression of which I am aware. Is the reason
understood?
And is anyone developing alternative SMT enhancements?
> > For SMT there is a patch from Intel pending that teaches x86-64
> > to set up the SMT scheduler. They said they got slightly better
> > benchmark results. The SMT setup seems to be racy though.
>
> Am I correct in thinking that this patch provides the necessary hooks to
> integrate x86_64 into the new functionality which sched-domains provides, or
> is the Intel patch independent of sched-domains?
It sets up the sched-domains code to know about HyperThreading CPUs
on x86-64 too (basically same thing as the i386 code does with a
few minor tweaks)
So it's dependent on that.
I will send it to you in separate mail.
> > Some kind of SMT scheduler is definitely needed, we have a serious
> > regression compared to 2.4 here right now. I'm not sure this
> > is the right approach though, it seems to be far too complex.
>
> Well that's discouraging. I really do want to push this thing along a bit.
>
> Yours is the only report of regression of which I am aware. Is the reason
> understood?
I think the reason is that it doesn't do balance on clone/fork. The
normal scheduler also doesn't do that, but for some reason it still does
better on the benchmarks (but worse than the old 2.4 -aa/Intel O(1) HT
scheduler)
> And is anyone developing alternative SMT enhancements?
I thought there was a patch from Ingo Molnar? ("shared runqueue")
I must admit I never tried it, just remember seeing the patches.
Also I've been playing with the entitlement scheduler to fix
some of the interactivity problems I have on UP, but it also
seems to still have problems.
-Andi
On Thu, 2004-03-11 at 18:06, Redeeman wrote:
> Hey Andrew, I have a problem with this kernel: when it boots, it lists
> vp_ide and so on, and then suddenly after that my screen gets flooded
> with traces and such. They come so fast that I can't even read them, and
> the system does not get any further. I haven't tried 2.6.4 vanilla yet,
> but I will now.
I'm having similar problems: the kernel crashes with an "In interrupt
handler - not syncing" panic after a lot of oopses and BUGs. I'm trying to
find which patch is causing this, since 2.6.4-rc2-mm1 and 2.6.4 work fine.
I'll post my findings when they are ready.
On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
>...
> ext3-journalled-quotas-2.patch
> ext3: journalled quota
>...
This patch broke modular quota:
WARNING: /lib/modules/2.6.4-mm1/kernel/fs/quota_v2.ko needs unknown
symbol mark_info_dirty
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
>...
> remove-more-KERNEL_SYSCALLS.patch
> further __KERNEL_SYSCALLS__ removal
>...
This causes the following unknown symbols in modules on i386:
<-- snip -->
WARNING:
/lib/modules/2.6.4-mm1/kernel/sound/isa/wavefront/snd-wavefront.ko needs
unknown symbol sys_read
WARNING:
/lib/modules/2.6.4-mm1/kernel/sound/isa/wavefront/snd-wavefront.ko needs
unknown symbol sys_open
WARNING: /lib/modules/2.6.4-mm1/kernel/sound/oss/wavefront.ko needs
unknown symbol sys_read
WARNING: /lib/modules/2.6.4-mm1/kernel/sound/oss/wavefront.ko needs
unknown symbol sys_open
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/tda1004x.ko
needs unknown symbol sys_lseek
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/tda1004x.ko
needs unknown symbol sys_read
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/tda1004x.ko
needs unknown symbol sys_open
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/sp887x.ko
needs unknown symbol sys_lseek
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/sp887x.ko
needs unknown symbol sys_read
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/sp887x.ko
needs unknown symbol sys_open
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/alps_tdlb7.ko
needs unknown symbol sys_lseek
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/alps_tdlb7.ko
needs unknown symbol sys_read
WARNING:
/lib/modules/2.6.4-mm1/kernel/drivers/media/dvb/frontends/alps_tdlb7.ko
needs unknown symbol sys_open
<-- snip -->
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
Andi Kleen <[email protected]> wrote:
>
> Also I've been playing with the entitlement scheduler to fix
> some of the interactivity problems I have on UP, but it also
> seems to still have problems.
You may find that nicksched fixes interactivity problems. It's a
fairly fundamental rethink of the relationship between priorities and
timeslices but back when I understood it I thought it made sense.
gonna try now, already compiling... i will come back with details in a
few minutes..
On Thu, 2004-03-11 at 19:46, Andrew Morton wrote:
> Redeeman <[email protected]> wrote:
> >
> > I didn't do anything more than patch with mm1. Is there a patch for
> > that spin_unlock_irq() change? :)
>
> --- 25/fs/mpage.c~a 2004-03-11 10:46:29.000000000 -0800
> +++ 25-akpm/fs/mpage.c 2004-03-11 10:46:31.000000000 -0800
> @@ -672,7 +672,6 @@ mpage_writepages(struct address_space *m
> }
> pagevec_release(&pvec);
> }
> - spin_unlock_irq(&mapping->tree_lock);
> if (bio)
> mpage_bio_submit(WRITE, bio);
> return ret;
>
> _
>
--
Regards, Redeeman
[email protected]
Yeah Andrew, it works! You are a god!!
While I have you here, I have one more request for you.
It's about amd64-agp. I never had it working (from 2.6.1, -mm and vanilla),
but in 2.6.4-rc1-mm2 it worked! Sadly, it was a bit unstable :(
My big problem is that when using X with some windows open, it
consumes ALL the CPU if amd64-agp isn't in the kernel, so I would REALLY
appreciate it if you could look at it (sorry for my bad English).
Thanks!
On Thu, 2004-03-11 at 21:58, Redeeman wrote:
> gonna try now, already compiling... i will come back with details in a
> few minutes..
>
> On Thu, 2004-03-11 at 19:46, Andrew Morton wrote:
> > Redeeman <[email protected]> wrote:
> > >
> > > I didn't do anything more than patch with mm1. Is there a patch for
> > > that spin_unlock_irq() change? :)
> >
> > --- 25/fs/mpage.c~a 2004-03-11 10:46:29.000000000 -0800
> > +++ 25-akpm/fs/mpage.c 2004-03-11 10:46:31.000000000 -0800
> > @@ -672,7 +672,6 @@ mpage_writepages(struct address_space *m
> > }
> > pagevec_release(&pvec);
> > }
> > - spin_unlock_irq(&mapping->tree_lock);
> > if (bio)
> > mpage_bio_submit(WRITE, bio);
> > return ret;
> >
> > _
> >
--
Regards, Redeeman
[email protected]
On Thu, Mar 11, 2004 at 02:45:35PM +0100, Mickael Marchand wrote:
> hmm right now, dm/lvm absolutely does not work on amd64/32 bits. all ioctl
> calls are failing...
This one has me stumped. I've tested on sparc64/debian and Kevin
Corry has tested on PPC, and neither of us has problems. So it looks
like an amd64-only problem. Does 2.6.4 vanilla work? (I don't have
access to one of these machines.)
- Joe
On Thu, Mar 11, 2004 at 03:48:29PM +0100, Andi Kleen wrote:
> Maybe they have broken data structures again, most likely
> because of different long long alignment. A lot of people
> who attempt to design data structures that don't need translation
> get that wrong unfortunately.
I'd thought we'd been careful about this. You're suggesting that the
size of this structure has changed between kernel versions ?!
struct dm_ioctl {
uint32_t version[3];
uint32_t data_size;
uint32_t data_start;
uint32_t target_count;
int32_t open_count;
uint32_t flags;
uint32_t event_nr;
uint32_t padding;
uint64_t dev;
char name[DM_NAME_LEN];
char uuid[DM_UUID_LEN];
};
- Joe
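The alignment question about the structure above can be checked mechanically with offsetof(). The following is a minimal userspace sketch, not the kernel header: the values chosen for DM_NAME_LEN and DM_UUID_LEN are assumptions made for illustration (the thread does not quote them), and the helper names dm_ioctl_dev_offset and dm_ioctl_check_layout are hypothetical. With the fields as posted, ten 32-bit words precede dev, so it lands at byte offset 40, which is already a multiple of 8.

```c
#include <assert.h>
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Assumed lengths for illustration only; not taken from the thread. */
#define DM_NAME_LEN 128
#define DM_UUID_LEN 129

struct dm_ioctl {
	uint32_t version[3];	/* offset  0, 12 bytes */
	uint32_t data_size;	/* offset 12 */
	uint32_t data_start;	/* offset 16 */
	uint32_t target_count;	/* offset 20 */
	int32_t  open_count;	/* offset 24 */
	uint32_t flags;		/* offset 28 */
	uint32_t event_nr;	/* offset 32 */
	uint32_t padding;	/* offset 36 */
	uint64_t dev;		/* offset 40 */
	char name[DM_NAME_LEN];
	char uuid[DM_UUID_LEN];
};

/* Byte offset of the 64-bit member within the structure. */
size_t dm_ioctl_dev_offset(void)
{
	return offsetof(struct dm_ioctl, dev);
}

/* Checks that no padding is needed before or directly after 'dev'. */
void dm_ioctl_check_layout(void)
{
	assert(offsetof(struct dm_ioctl, dev) % 8 == 0);
	assert(offsetof(struct dm_ioctl, name) ==
	       offsetof(struct dm_ioctl, dev) + sizeof(uint64_t));
}
```

This only examines member offsets. The total sizeof() of the structure depends on tail padding, which can vary between ABIs and which the sketch deliberately does not check.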
On Thu, Mar 11, 2004 at 09:43:54PM +0000, Joe Thornber wrote:
> I'd thought we'd been careful about this. You're suggesting that the
> size of this structure has changed between kernel versions ?!
Ignore me, I'm being an idiot.
- Joe
On Thu, Mar 11, 2004 at 09:43:54PM +0000, Joe Thornber wrote:
> struct dm_ioctl { 0
> uint32_t version[3];
> uint32_t data_size; 4
>
> uint32_t data_start;
>
> uint32_t target_count;
> int32_t open_count;
> uint32_t flags; 8
> uint32_t event_nr;
> uint32_t padding; 10 ***
Here's probably the problem. Many 64bit arches align 64bit
numbers on a 64bit boundary. So it is adding 2 more words of padding to
start the u64 at offset 12.
> uint64_t dev;
>
> char name[DM_NAME_LEN];
> char uuid[DM_UUID_LEN];
> };
Joel
--
Life's Little Instruction Book #313
"Never underestimate the power of love."
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: [email protected]
Phone: (650) 506-8127
On Thu, Mar 11, 2004 at 01:59:55PM -0800, Joel Becker wrote:
> On Thu, Mar 11, 2004 at 09:43:54PM +0000, Joe Thornber wrote:
> > struct dm_ioctl { 0
Don't mind me, I'm an idiot.
Joel
--
"All alone at the end of the evening
When the bright lights have faded to blue.
I was thinking about a woman who had loved me
And I never knew"
Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: [email protected]
Phone: (650) 506-8127
Joel Becker wrote:
> On Thu, Mar 11, 2004 at 09:43:54PM +0000, Joe Thornber wrote:
>
>>struct dm_ioctl { 0
>> uint32_t version[3];
>> uint32_t data_size; 4
>>
>> uint32_t data_start;
>>
>> uint32_t target_count;
>> int32_t open_count;
>> uint32_t flags; 8
>> uint32_t event_nr;
>> uint32_t padding; 10 ***
>
>
> Here's probably the problem. Many 64bit arches align 64bit
> numbers on a 64bit boundary. So it is adding 2 more words of padding to
> start the u64 at offset 12.
But wouldn't this be applied across the board and therefore still work?
Or is it defined as "packed" somewhere?
Chris
--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]
[email protected] wrote:
>
> On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
> > This affects I/O scheduling potentially quite significantly. It is no
> > longer the case that the kernel will submit pages for I/O in the order in
> > which the application dirtied them. We instead submit them in file-offset
> > order all the time.
>
> Hi Andrew,
> I have a feeling this change might significantly improve the external
> sorting benchmark I emailed you ( http://lkml.org/lkml/2003/12/20/46 ).
> I will try running it when I get a chance and let you know.
That thing's still sitting in my Inbox awaiting attention :(
I just took a quick peek. The `testfile' file which it lays out is well
laid-out so yes, if you're seeking all over that file then 2.6.4-mm1 may
indeed help throughput.
But `testfile.tmp' is not well laid-out. Looks like it was created seekily,
so its blocks are all jumbled up.
The code in 2.6.4-mm1 favours unjumbled-up files. The code in 2.4 and
2.6.4 favours jumbled-up files.
When kjournald performs writeback it favours jumbled-up files, even in
2.6.4-mm1.
So it's hard to say what will happen ;) I'll take a look later.
> It gives me a good excuse to get 2.6 kernels working on my systems :-)
Luddite.
Anton Blanchard wrote:
>
>
>>- The CPU scheduler changes in -mm (sched-domains) have been hanging about
>> for too long. I had been hoping that the people who care about SMT and
>> NUMA performance would have some results by now but all seems to be silent.
>>
>> I do not wish to merge these up until the big-iron guys can say that they
>> suit their requirements, with a reasonable expectation that we will not
>> need to churn this code later in the 2.6 series.
>>
>> So. If you have been testing, please speak up. If you have not been
>> testing, please do so.
>>
>
>I sucked sched-* out of mm, added sched-ppc64bits (attached) and am
>having problems with the following threaded test case. NUMA is enabled.
>
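>#include <pthread.h>
>#define NR_THREADS 100
>
>/* Spin forever so that every thread stays runnable. */
>void *dostuff(void *junk)
>{
> while(1)
> ;
> return NULL; /* not reached */
>}
>
>int main()
>{
> int i;
> pthread_t tid;
>
> /* NR_THREADS-1 spinners; the main thread becomes the last one. */
> for (i = 0; i < NR_THREADS-1; i++)
> pthread_create(&tid, NULL, dostuff, NULL);
>
> dostuff(NULL);
> return 0; /* not reached */
>}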
>
>100 runnable threads but we never use more than one cpu:
>
OK thanks. This is probably a simple bug somewhere. I'll have a look
at it soon.
On Thu, Mar 11, 2004 at 09:38:03PM +0000, Joe Thornber wrote:
> On Thu, Mar 11, 2004 at 02:45:35PM +0100, Mickael Marchand wrote:
> > hmm right now, dm/lvm absolutely does not work on amd64/32 bits. all ioctl
> > calls are failing...
>
> This one has me stumped. I've tested on sparc64/debian and Kevin
> Corry has tested on PPC, and neither of us has problems. So it looks
ppc and sparc64 are different from x86-64 and ia64 here.
The problem on x86 is that alignof(long long) differs between the
32bit and 64bit ABIs (4 bytes on i386, 8 bytes on x86-64). That's not
the case on the RISCs.
This causes problems in two ways: fields placed at or after a 64bit
value can end up at different offsets and, worse, it changes the size
and alignment of whole structures in arrays too (because
alignof(struct) is the largest alignment needed by any member).
> like an amd64-only problem. Does vanilla 2.6.4 work? (I don't have
> access to one of these machines.)
Most likely it's one of your arrays. You pass arrays, right?
-Andi
Andi Kleen wrote:
>>>Some kind of SMT scheduler is definitely needed, we have a serious
>>>regression compared to 2.4 here right now. I'm not sure this
>>>is the right approach though, it seems to be far too complex.
>>>
Andi, I'll agree that the way domains currently get set up is pretty
ugly. Maybe some additional functions or macros could be used to make
this process a bit clearer.
The actual kernel/sched.c code is really not that complex. In some ways
it is *less* complicated than the old numa scheduler because it all goes
through one code path.
It also handles SMT, which is where a bit of complexity is coming from.
The other alternative is shared runqueues which is uglier and less flexible.
>>Well that's discouraging. I really do want to push this thing along a bit.
>>
>>Yours is the only report of regression of which I am aware. Is the reason
>>understood?
>>
>
>I think the reason is that it doesn't do balance on clone/fork. The
>normal scheduler also doesn't do that, but for some reason it still does
>better on the benchmarks (but worse than the old 2.4 -aa/Intel O(1) HT
>scheduler)
>
>
There have been a few changes and bug fixes since you last tested.
Maybe that would help.
>>And is anyone developing alternative SMT enhancements?
>>
>
>I thought there was a patch from Ingo Molnar? ("shared runqueue")
>I must admit I never tried it, just remember seeing the patches.
>
Yep shared runqueues. Ingo and Rusty both had implementations but
they both agreed sched-domains was a better alternative.
On Thu, Mar 11, 2004 at 09:43:54PM +0000, Joe Thornber wrote:
> On Thu, Mar 11, 2004 at 03:48:29PM +0100, Andi Kleen wrote:
> > Maybe they have broken data structures again, most likely
> > because of different long long alignment. A lot of people
> > who attempt to design data structures that don't need translation
> > get that wrong unfortunately.
>
> I'd thought we'd been careful about this. You're suggesting that the
> size of this structure has changed between kernel versions ?!
>
> struct dm_ioctl {
> uint32_t version[3];
> uint32_t data_size;
>
> uint32_t data_start;
>
> uint32_t target_count;
> int32_t open_count;
> uint32_t flags;
> uint32_t event_nr;
> uint32_t padding;
>
> uint64_t dev;
>
> char name[DM_NAME_LEN];
> char uuid[DM_UUID_LEN];
Are DM_NAME_LEN and DM_UUID_LEN not both a multiple of 8?
> };
There are more structures here, right?
If yes, that's the problem.
-Andi
2.6.4-mm1 doesn't work for me :-(
I get the:
Uncompressing kernel ... now booting Linux
message, and then ...... nothing.
I've seen this before when trying to boot a P4 kernel on a P-classic
etc, so I tried compiling with CONFIG_M386, and got lots of compile
errors:
include/asm/acpi.h: In function `__acpi_acquire_global_lock':
include/asm/acpi.h:74: warning: implicit declaration of function `cmpxchg'
So I tried the default (CONFIG_M686) and it still doesn't work.
So: where do I look next?
I've included some of the machine specs below together with a config
file.
Thanks.
(I did include the mpage.c fix)
NeilBrown
When 2.4.23 is booted, /proc/cpuinfo contains:
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 15
model : 2
model name : Intel(R) Xeon(TM) CPU 2.80GHz
stepping : 7
cpu MHz : 2791.078
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid
bogomips : 5570.56
... and so on with 4 processors identical except for the number. (It
is a dual-Xeon machine).
lspci shows:
00:00.0 Host bridge: ServerWorks CMIC-LE (rev 13)
00:00.1 Host bridge: ServerWorks CMIC-LE
00:00.2 Host bridge: ServerWorks: Unknown device 0000
00:04.0 Class ff00: Dell Computer Corporation Embedded Systems Management Device 4
00:04.1 Class ff00: Dell Computer Corporation PowerEdge Expandable RAID Controller 3/Di
00:04.2 Class ff00: Dell Computer Corporation: Unknown device 000d
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 Host bridge: ServerWorks CSB5 South Bridge (rev 93)
00:0f.1 IDE interface: ServerWorks CSB5 IDE Controller (rev 93)
00:0f.3 ISA bridge: ServerWorks GCLE Host Bridge
00:10.0 Host bridge: ServerWorks: Unknown device 0101 (rev 05)
00:10.2 Host bridge: ServerWorks: Unknown device 0101 (rev 05)
00:11.0 Host bridge: ServerWorks: Unknown device 0101 (rev 05)
00:11.2 Host bridge: ServerWorks: Unknown device 0101 (rev 05)
01:08.0 Ethernet controller: Intel Corp. 82544EI Gigabit Ethernet Controller (Copper) (rev 02)
02:06.0 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
02:06.1 SCSI storage controller: Adaptec AHA-3960D / AIC-7899A U160/m (rev 01)
03:06.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
03:08.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5701 Gigabit Ethernet (rev 15)
04:08.0 PCI bridge: Intel Corp.: Unknown device 0309 (rev 01)
05:06.0 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
05:06.1 SCSI storage controller: Adaptec AIC-7899P U160/m (rev 01)
Early dmesg messages when booted 2.4.23 are:
Mar 12 11:18:22 adams kernel: Linux version 2.4.23-server3 (root@adams) (gcc version 2.95.4 20011002 (Debian prerelease)) #1 SMP Wed Jan 7 13:03:33 EST 2004
Mar 12 11:18:22 adams kernel: BIOS-provided physical RAM map:
Mar 12 11:18:22 adams kernel: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Mar 12 11:18:22 adams kernel: BIOS-e820: 0000000000100000 - 00000000f7ff0000 (usable)
Mar 12 11:18:22 adams kernel: BIOS-e820: 00000000f7ff0000 - 00000000f7ffec00 (ACPI data)
Mar 12 11:18:22 adams kernel: BIOS-e820: 00000000f7ffec00 - 00000000f7fff000 (reserved)
Mar 12 11:18:22 adams kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
Mar 12 11:18:22 adams kernel: BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
Mar 12 11:18:22 adams kernel: BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
Mar 12 11:18:22 adams kernel: 3071MB HIGHMEM available.
Mar 12 11:18:22 adams kernel: 896MB LOWMEM available.
Mar 12 11:18:22 adams kernel: found SMP MP-table at 000fe710
Mar 12 11:18:22 adams kernel: hm, page 000fe000 reserved twice.
Mar 12 11:18:22 adams kernel: hm, page 000ff000 reserved twice.
Mar 12 11:18:22 adams kernel: hm, page 000f0000 reserved twice.
Mar 12 11:18:22 adams kernel: On node 0 totalpages: 1015792
Mar 12 11:18:22 adams kernel: zone(0): 4096 pages.
Mar 12 11:18:22 adams kernel: zone(1): 225280 pages.
Mar 12 11:18:22 adams kernel: zone(2): 786416 pages.
Mar 12 11:18:22 adams kernel: ACPI: RSDP (v000 DELL ) @ 0x000fdc60
Mar 12 11:18:22 adams kernel: ACPI: RSDT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdc74
Mar 12 11:18:22 adams kernel: ACPI: FADT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdca4
Mar 12 11:18:22 adams kernel: ACPI: MADT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdd18
Mar 12 11:18:22 adams kernel: ACPI: SPCR (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x000fdda0
Mar 12 11:18:22 adams kernel: ACPI: DSDT (v001 DELL PE2650 0x00000001 MSFT 0x0100000a) @ 0x00000000
Mar 12 11:18:22 adams kernel: ACPI: Local APIC address 0xfee00000
Mar 12 11:18:22 adams kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Mar 12 11:18:22 adams kernel: Processor #0 Pentium 4(tm) XEON(tm) APIC version 20
Mar 12 11:18:22 adams kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
Mar 12 11:18:22 adams kernel: Processor #2 Pentium 4(tm) XEON(tm) APIC version 20
Mar 12 11:18:22 adams kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x01] enabled)
Mar 12 11:18:22 adams kernel: Processor #1 Pentium 4(tm) XEON(tm) APIC version 20
Mar 12 11:18:22 adams kernel: ACPI: LAPIC (acpi_id[0x04] lapic_id[0x03] enabled)
Mar 12 11:18:22 adams kernel: Processor #3 Pentium 4(tm) XEON(tm) APIC version 20
Mar 12 11:18:22 adams kernel: ACPI: LAPIC_NMI (acpi_id[0x01] polarity[0x1] trigger[0x1] lint[0x1])
Mar 12 11:18:22 adams kernel: ACPI: LAPIC_NMI (acpi_id[0x02] polarity[0x1] trigger[0x1] lint[0x1])
Mar 12 11:18:22 adams kernel: ACPI: LAPIC_NMI (acpi_id[0x03] polarity[0x1] trigger[0x1] lint[0x1])
Mar 12 11:18:22 adams kernel: ACPI: LAPIC_NMI (acpi_id[0x04] polarity[0x1] trigger[0x1] lint[0x1])
Mar 12 11:18:22 adams kernel: Using ACPI for processor (LAPIC) configuration information
Mar 12 11:18:22 adams kernel: Intel MultiProcessor Specification v1.4
Mar 12 11:18:22 adams kernel: Virtual Wire compatibility mode.
Mar 12 11:18:22 adams kernel: OEM ID: DELL Product ID: PE 0121 APIC at: 0xFEE00000
Mar 12 11:18:22 adams kernel: I/O APIC #4 Version 17 at 0xFEC00000.
Mar 12 11:18:22 adams kernel: I/O APIC #5 Version 17 at 0xFEC01000.
Mar 12 11:18:22 adams kernel: I/O APIC #6 Version 17 at 0xFEC02000.
Mar 12 11:18:22 adams kernel: Enabling APIC mode: Flat.^IUsing 3 I/O APICs
Mar 12 11:18:22 adams kernel: Processors: 4
Mar 12 11:18:22 adams kernel: Kernel command line: auto BOOT_IMAGE=Linux ro root=801
Mar 12 11:18:22 adams kernel: Initializing CPU#0
Mar 12 11:18:22 adams kernel: Detected 2790.984 MHz processor.
Mar 12 11:18:22 adams kernel: Console: colour VGA+ 80x25
.config is:
#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
# CONFIG_STANDALONE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HOTPLUG=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=5
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_PPRO_FENCE=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_X86_4G is not set
# CONFIG_X86_SWITCH_PAGETABLES is not set
# CONFIG_X86_4G_VM_LAYOUT is not set
# CONFIG_X86_UACCESS_INDIRECT is not set
# CONFIG_X86_HIGH_ENTRY is not set
# CONFIG_HPET_TIMER is not set
# CONFIG_HPET_EMULATE_RTC is not set
CONFIG_SMP=y
CONFIG_NR_CPUS=8
# CONFIG_SCHED_SMT is not set
# CONFIG_PREEMPT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
# CONFIG_EDD is not set
# CONFIG_NOHIGHMEM is not set
CONFIG_HIGHMEM4G=y
# CONFIG_HIGHMEM64G is not set
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
# CONFIG_MATH_EMULATION is not set
# CONFIG_MTRR is not set
CONFIG_IRQBALANCE=y
CONFIG_HAVE_DEC_LOCK=y
CONFIG_REGPARM=y
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_SOFTWARE_SUSPEND is not set
CONFIG_PM_DISK=y
CONFIG_PM_DISK_PARTITION=""
#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BOOT=y
#
# APM (Advanced Power Management) BIOS Support
#
# CONFIG_APM is not set
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
# CONFIG_PCI_USE_VECTOR is not set
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# PCMCIA/CardBus support
#
CONFIG_PCMCIA=y
# CONFIG_YENTA is not set
# CONFIG_I82092 is not set
# CONFIG_I82365 is not set
# CONFIG_TCIC is not set
CONFIG_PCMCIA_PROBE=y
#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
# CONFIG_FW_LOADER is not set
# CONFIG_DEBUG_DRIVER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_PC_PCMCIA is not set
# CONFIG_PARPORT_OTHER is not set
# CONFIG_PARPORT_1284 is not set
#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set
#
# Protocols
#
# CONFIG_ISAPNP is not set
# CONFIG_PNPBIOS is not set
#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
CONFIG_BLK_DEV_UMEM=y
CONFIG_BLK_DEV_LOOP=y
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
# CONFIG_BLK_DEV_NBD is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
CONFIG_LBD=y
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_IDEDISK_STROKE is not set
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_TASKFILE_IO=y
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_CMD640=y
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
# CONFIG_BLK_DEV_GENERIC is not set
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_ADMA=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
CONFIG_BLK_DEV_SVWKS=y
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_DMA_NONPCI is not set
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
# CONFIG_SCSI_REPORT_LUNS is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AACRAID is not set
CONFIG_SCSI_AIC7XXX=y
CONFIG_AIC7XXX_CMDS_PER_DEVICE=253
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_BUILD_FIRMWARE is not set
CONFIG_AIC7XXX_DEBUG_ENABLE=y
CONFIG_AIC7XXX_DEBUG_MASK=0
CONFIG_AIC7XXX_REG_PRETTY_PRINT=y
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_MEGARAID is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PCI2000 is not set
# CONFIG_SCSI_PCI2220I is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA2XXX=y
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# PCMCIA SCSI adapter support
#
# CONFIG_PCMCIA_AHA152X is not set
# CONFIG_PCMCIA_FDOMAIN is not set
# CONFIG_PCMCIA_NINJA_SCSI is not set
# CONFIG_PCMCIA_QLOGIC is not set
#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set
#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=y
CONFIG_MD_RAID0=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID5=y
CONFIG_MD_RAID6=y
CONFIG_MD_MULTIPATH=y
# CONFIG_BLK_DEV_DM is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
CONFIG_NET_KEY=y
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_INET_ECN is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_IPV6 is not set
# CONFIG_DECNET is not set
# CONFIG_BRIDGE is not set
# CONFIG_NETFILTER is not set
CONFIG_XFRM=y
# CONFIG_XFRM_USER is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
CONFIG_IPV6_SCTP__=y
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
CONFIG_NETDEVICES=y
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
CONFIG_LANCE=y
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
CONFIG_PCNET32=y
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_CS89x0 is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
CONFIG_E100=y
# CONFIG_E100_NAPI is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_NET_POCKET is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
CONFIG_E1000=y
CONFIG_E1000_NAPI=y
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SIS190 is not set
# CONFIG_SK98LIN is not set
CONFIG_TIGON3=y
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
# CONFIG_NET_FC is not set
# CONFIG_RCPCI is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
#
# PCMCIA network device support
#
CONFIG_NET_PCMCIA=y
# CONFIG_PCMCIA_3C589 is not set
# CONFIG_PCMCIA_3C574 is not set
# CONFIG_PCMCIA_FMVJ18X is not set
CONFIG_PCMCIA_PCNET=y
# CONFIG_PCMCIA_NMCLAN is not set
# CONFIG_PCMCIA_SMC91C92 is not set
# CONFIG_PCMCIA_XIRC2PS is not set
# CONFIG_PCMCIA_AXNET is not set
#
# Amateur Radio support
#
# CONFIG_HAMRADIO is not set
#
# IrDA (infrared) support
#
# CONFIG_IRDA is not set
#
# Bluetooth support
#
# CONFIG_BT is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
CONFIG_SERIO_CT82C710=y
CONFIG_SERIO_PARKBD=y
# CONFIG_SERIO_PCIPS2 is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
# CONFIG_SERIAL_8250 is not set
#
# Non-8250 serial port support
#
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_PRINTER=y
# CONFIG_LP_CONSOLE is not set
# CONFIG_PPDEV is not set
# CONFIG_TIPAR is not set
# CONFIG_QIC02_TAPE is not set
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
# CONFIG_RTC is not set
# CONFIG_GEN_RTC is not set
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=y
CONFIG_AGP_ALI=y
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=y
CONFIG_AGP_NVIDIA=y
CONFIG_AGP_SIS=y
# CONFIG_AGP_SWORKS is not set
CONFIG_AGP_VIA=y
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=y
CONFIG_DRM_TDFX=y
# CONFIG_DRM_GAMMA is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_RADEON=y
# CONFIG_DRM_I810 is not set
# CONFIG_DRM_I830 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_MWAVE is not set
CONFIG_RAW_DRIVER=m
CONFIG_MAX_RAW_DEVS=256
CONFIG_HANGCHECK_TIMER=y
#
# I2C support
#
# CONFIG_I2C is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
# CONFIG_FB is not set
CONFIG_VIDEO_SELECT=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y
#
# Sound
#
# CONFIG_SOUND is not set
#
# USB support
#
# CONFIG_USB is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
# CONFIG_EXT3_FS_XATTR is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_REISERFS_FS=y
CONFIG_REISERFS_CHECK=y
CONFIG_REISERFS_PROC_INFO=y
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_QUOTA=y
CONFIG_QFMT_V1=y
# CONFIG_QFMT_V2 is not set
CONFIG_QUOTACTL=y
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
#
# CD-ROM/DVD Filesystems
#
# CONFIG_ISO9660_FS is not set
# CONFIG_UDF_FS is not set
#
# DOS/FAT/NT Filesystems
#
# CONFIG_FAT_FS is not set
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
# CONFIG_RPCSEC_GSS_KRB5 is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_INTERMEZZO_FS is not set
# CONFIG_AFS_FS is not set
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
# CONFIG_NLS is not set
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACK_USAGE is not set
CONFIG_DEBUG_SLAB=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_PAGEALLOC=y
# CONFIG_SPINLINE is not set
# CONFIG_DEBUG_HIGHMEM is not set
# CONFIG_DEBUG_INFO is not set
# CONFIG_LOCKMETER is not set
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_KGDB is not set
# CONFIG_FRAME_POINTER is not set
# CONFIG_4KSTACKS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
#
# Security options
#
# CONFIG_SECURITY is not set
#
# Cryptographic options
#
# CONFIG_CRYPTO is not set
#
# Library routines
#
CONFIG_CRC32=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
Neil Brown <[email protected]> wrote:
>
>
> 2.6.4-mm1 doesn't work for me :-(
>
> I get the:
> Uncompressing kernel ... now booting Linux
>
> message, and then ...... nothing.
>
> I've seen this before when trying to boot a P4 kernel on a P-classic
> etc, so I tried compiling with CONFIG_M386, and got lots of compile
> errors:
>
> include/asm/acpi.h: In function `__acpi_acquire_global_lock':
> include/asm/acpi.h:74: warning: implicit declaration of function `cmpxchg'
>
> So I tried the default (CONFIG_M686) and it still doesn't work.
>
> So: where do I look next?
>
> I've included some of the machine specs below together with a config
> file.
Tried adding earlyprintk=vga?
If that works, judicious addition of printks will narrow it down.
Nick Piggin wrote:
> Anton Blanchard wrote:
>
>>
>>
>>> - The CPU scheduler changes in -mm (sched-domains) have been hanging about
>>> for too long. I had been hoping that the people who care about SMT and
>>> NUMA performance would have some results by now but all seems to be silent.
>>>
>>> I do not wish to merge these up until the big-iron guys can say that they
>>> suit their requirements, with a reasonable expectation that we will not
>>> need to churn this code later in the 2.6 series.
>>>
>>> So. If you have been testing, please speak up. If you have not been
>>> testing, please do so.
>>>
>>
>> I sucked sched-* out of mm, added sched-ppc64bits (attached) and am
>> having problems with the following threaded test case. NUMA is enabled.
>
>
Hi Anton,
You need to be setting cpu_power for each of the CPU groups.
Nick
> You need to be setting cpu_power for each of the CPU groups.
Aha, thanks. I'll do that and retest.
Anton
The part I like with this scheduler is that the common scheduler code
has no idea about the domains topology; it just traverses the pointers.
The domains are built in an architecture or platform specific fashion.
So that part can be a bit complex as observed, but that's a setup
business, not a runtime behavior.
As we can have more complex architectures in the future, the scheduler
is flexible enough to represent various scheduling domains effectively,
and yet keeps the common scheduler code simple.
Jun
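A minimal user-space sketch of the point above, using simplified stand-ins for the kernel structures. The names toy_domain, narrowest_common and depth are hypothetical and do not appear in the -mm patches; the only idea taken from the thread is that generic code follows parent pointers without knowing what each level represents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for a scheduling domain: a CPU bitmask plus a parent. */
struct toy_domain {
    uint64_t span;              /* bit n set => CPU n belongs to this domain */
    struct toy_domain *parent;  /* next wider level, NULL at the top */
};

/*
 * Generic walk: starting from a CPU's lowest domain, return the first
 * (narrowest) domain that also spans other_cpu, or NULL if none does.
 * The walk never needs to know whether a level means "siblings",
 * "physical package" or "NUMA node".
 */
static struct toy_domain *narrowest_common(struct toy_domain *sd,
                                           unsigned int other_cpu)
{
    assert(other_cpu < 64);
    for (; sd != NULL; sd = sd->parent)
        if (sd->span & (UINT64_C(1) << other_cpu))
            return sd;
    return NULL;
}

/* Count the levels from sd up to and including the top domain. */
static int depth(const struct toy_domain *sd)
{
    int n = 0;

    for (; sd != NULL; sd = sd->parent)
        n++;
    return n;
}
```

Adding a level (for example a node domain above the physical one) only changes the setup code that fills in span and parent; the two walkers stay as they are.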
>-----Original Message-----
>From: [email protected] [mailto:linux-kernel-
>[email protected]] On Behalf Of Nick Piggin
>Sent: Thursday, March 11, 2004 3:38 PM
>To: Andi Kleen
>Cc: Andrew Morton; [email protected]
>Subject: Re: 2.6.4-mm1
>
>Andi Kleen wrote:
>
>>>>Some kind of SMT scheduler is definitely needed, we have a serious
>>>>regression compared to 2.4 here right now. I'm not sure this
>>>>is the right approach though, it seems to be far too complex.
>>>>
>
>Andi, I'll agree that the way domains currently get set up is pretty
>ugly. Maybe some additional functions or macros could be used to make
>this process a bit clearer.
>
>The actual kernel/sched.c code is really not that complex. In some ways
>it is *less* complicated than the old numa scheduler because it all goes
>through one code path.
>
>It also handles SMT, which is where a bit of complexity is coming from.
>The other alternative is shared runqueues which is uglier and less flexible.
>
>
>>>Well that's discouraging. I really do want to push this thing along a bit.
>>>
>>>Yours is the only report of regression of which I am aware. Is the reason
>>>understood?
>>>
>>
>>I think the reason is that it doesn't do balance on clone/fork. The
>>normal scheduler also doesn't do that, but for some reason it still does
>>better on the benchmarks (but worse than the old 2.4 -aa/Intel O(1) HT
>>scheduler)
>>
>>
>
>There have been a few changes and bug fixes since you last tested.
>Maybe that would help.
>
>>>And is anyone developing alternative SMT enhancements?
>>>
>>
>>I thought there was a patch from Ingo Molnar? ("shared runqueue")
>>I must admit I never tried it, just remember seeing the patches.
>>
>
>Yep shared runqueues. Ingo and Rusty both had implementations but
>they both agreed sched-domains was a better alternative.
>
>
On Thu, Mar 11, 2004 at 07:04:50PM -0800, Nakajima, Jun wrote:
> As we can have more complex architectures in the future, the scheduler
> is flexible enough to represent various scheduling domains effectively,
> and yet keeps the common scheduler code simple.
I think for SMT alone it's too complex and for NUMA it doesn't do
the right thing for "modern NUMAs" (where NUMA factor is very low
and you have a small number of CPUs for each node).
-Andi
Andi Kleen wrote:
>On Thu, Mar 11, 2004 at 07:04:50PM -0800, Nakajima, Jun wrote:
>
>>As we can have more complex architectures in the future, the scheduler
>>is flexible enough to represent various scheduling domains effectively,
>>and yet keeps the common scheduler code simple.
>>
>
>I think for SMT alone it's too complex and for NUMA it doesn't do
>the right thing for "modern NUMAs" (where NUMA factor is very low
>and you have a small number of CPUs for each node).
>
>
For SMT it is less complex than shared runqueues; it is actually
fewer lines of code and a smaller object size.
It is also more flexible than shared runqueues in that you can still
have control over each sibling's runqueue. Con's SMT nice patch for
example would probably be more difficult to do with shared runqueues.
Shared runqueues also give zero affinity to siblings. While current
implementations may not (do they?) care, future ones might.
For Opteron type NUMA, it actually balances much more aggressively
than the default NUMA scheduler, especially when a CPU is idle. I
don't doubt you aren't seeing great performance, but it should be
able to be fixed.
The problem is just presumably your lack of time to investigate
further, and my lack of problem descriptions or Opterons.
One thing you definitely want is a sched_balance_fork, is that right?
Have you been able to do any benchmarks on recent -mm kernels?
Hi,
> You need to be setting cpu_power for each of the CPU groups.
Thanks Nick, that fixed it. New patch attached; it's basically the x86
version with cpumask fixes when compiling with NR_CPUS > 64.
Anton
---
gr25_work-anton/arch/ppc64/Kconfig | 10 +
gr25_work-anton/arch/ppc64/kernel/smp.c | 222 ++++++++++++++++++++++++++
gr25_work-anton/include/asm-ppc64/processor.h | 5
3 files changed, 237 insertions(+)
diff -puN arch/ppc64/Kconfig~sched-ppc64bits arch/ppc64/Kconfig
--- gr25_work/arch/ppc64/Kconfig~sched-ppc64bits 2004-03-11 21:27:02.655467583 -0600
+++ gr25_work-anton/arch/ppc64/Kconfig 2004-03-11 21:27:02.671465046 -0600
@@ -175,6 +175,16 @@ config NUMA
bool "NUMA support"
depends on DISCONTIGMEM
+config SCHED_SMT
+ bool "SMT (Hyperthreading) scheduler support"
+ depends on SMP
+ default off
+ help
+ SMT scheduler support improves the CPU scheduler's decision making
+ when dealing with Intel Pentium 4 chips with HyperThreading at a
+ cost of slightly increased overhead in some places. If unsure say
+ N here.
+
config PREEMPT
bool
help
diff -puN arch/ppc64/kernel/smp.c~sched-ppc64bits arch/ppc64/kernel/smp.c
--- gr25_work/arch/ppc64/kernel/smp.c~sched-ppc64bits 2004-03-11 21:27:02.660466790 -0600
+++ gr25_work-anton/arch/ppc64/kernel/smp.c 2004-03-11 21:57:09.897044235 -0600
@@ -890,3 +890,225 @@ static int __init topology_init(void)
return 0;
}
__initcall(topology_init);
+
+#ifdef CONFIG_SCHED_SMT
+#ifdef CONFIG_NUMA
+static struct sched_group sched_group_cpus[NR_CPUS];
+static struct sched_group sched_group_phys[NR_CPUS];
+static struct sched_group sched_group_nodes[MAX_NUMNODES];
+static DEFINE_PER_CPU(struct sched_domain, phys_domains);
+static DEFINE_PER_CPU(struct sched_domain, node_domains);
+__init void arch_init_sched_domains(void)
+{
+ int i;
+ struct sched_group *first_cpu = NULL, *last_cpu = NULL;
+
+ /* Set up domains */
+ for_each_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+ struct sched_domain *node_domain = &per_cpu(node_domains, i);
+ int node = cpu_to_node(i);
+ cpumask_t nodemask = node_to_cpumask(node);
+ cpumask_t tmp1 = cpumask_of_cpu(i ^ 0x1);
+ cpumask_t tmp2 = cpumask_of_cpu(i);
+
+ *cpu_domain = SD_SIBLING_INIT;
+ cpus_or(cpu_domain->span, tmp1, tmp2);
+
+ *phys_domain = SD_CPU_INIT;
+ phys_domain->span = nodemask;
+
+ *node_domain = SD_NODE_INIT;
+ node_domain->span = cpu_possible_map;
+ }
+
+ /* Set up CPU (sibling) groups */
+ for_each_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ int j;
+ first_cpu = last_cpu = NULL;
+
+ if (i != first_cpu(cpu_domain->span)) {
+ cpu_sched_domain(i)->flags |= SD_FLAG_SHARE_CPUPOWER;
+ cpu_sched_domain(first_cpu(cpu_domain->span))->flags |=
+ SD_FLAG_SHARE_CPUPOWER;
+ continue;
+ }
+
+ for_each_cpu_mask(j, cpu_domain->span) {
+ struct sched_group *cpu = &sched_group_cpus[j];
+
+ cpus_clear(cpu->cpumask);
+ cpu_set(j, cpu->cpumask);
+ cpu->cpu_power = SCHED_LOAD_SCALE;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+ }
+
+ for (i = 0; i < MAX_NUMNODES; i++) {
+ int j;
+ cpumask_t nodemask;
+ struct sched_group *node = &sched_group_nodes[i];
+ cpumask_t node_cpumask = node_to_cpumask(i);
+ cpus_and(nodemask, node_cpumask, cpu_online_map);
+
+ if (cpus_empty(nodemask))
+ continue;
+
+ first_cpu = last_cpu = NULL;
+ /* Set up physical groups */
+ for_each_cpu_mask(j, nodemask) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(j);
+ struct sched_group *cpu = &sched_group_phys[j];
+
+ if (j != first_cpu(cpu_domain->span))
+ continue;
+
+ cpu->cpumask = cpu_domain->span;
+ /*
+ * Make each extra sibling increase power by 10% of
+ * the basic CPU. This is very arbitrary.
+ */
+ cpu->cpu_power = SCHED_LOAD_SCALE + SCHED_LOAD_SCALE*(cpus_weight(cpu->cpumask)-1) / 10;
+ node->cpu_power += cpu->cpu_power;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+ }
+
+ /* Set up nodes */
+ first_cpu = last_cpu = NULL;
+ for (i = 0; i < MAX_NUMNODES; i++) {
+ struct sched_group *cpu = &sched_group_nodes[i];
+ cpumask_t nodemask;
+ cpumask_t node_cpumask = node_to_cpumask(i);
+ cpus_and(nodemask, node_cpumask, cpu_possible_map);
+
+ if (cpus_empty(nodemask))
+ continue;
+
+ cpu->cpumask = nodemask;
+ /* ->cpu_power already setup */
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+
+ mb();
+ for_each_cpu(i) {
+ int node = cpu_to_node(i);
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+ struct sched_domain *node_domain = &per_cpu(node_domains, i);
+ struct sched_group *cpu_group = &sched_group_cpus[i];
+ struct sched_group *phys_group = &sched_group_phys[first_cpu(cpu_domain->span)];
+ struct sched_group *node_group = &sched_group_nodes[node];
+
+ cpu_domain->parent = phys_domain;
+ phys_domain->parent = node_domain;
+
+ node_domain->groups = node_group;
+ phys_domain->groups = phys_group;
+ cpu_domain->groups = cpu_group;
+ }
+}
+#else /* CONFIG_NUMA */
+static struct sched_group sched_group_cpus[NR_CPUS];
+static struct sched_group sched_group_phys[NR_CPUS];
+static DEFINE_PER_CPU(struct sched_domain, phys_domains);
+__init void arch_init_sched_domains(void)
+{
+ int i;
+ struct sched_group *first_cpu = NULL, *last_cpu = NULL;
+
+ /* Set up domains */
+ for_each_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+
+ *cpu_domain = SD_SIBLING_INIT;
+ cpu_domain->span = cpu_sibling_map[i];
+
+ *phys_domain = SD_CPU_INIT;
+ phys_domain->span = cpu_possible_map;
+ }
+
+ /* Set up CPU (sibling) groups */
+ for_each_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ int j;
+ first_cpu = last_cpu = NULL;
+
+ if (i != first_cpu(cpu_domain->span)) {
+ cpu_sched_domain(i)->flags |= SD_FLAG_SHARE_CPUPOWER;
+ cpu_sched_domain(first_cpu(cpu_domain->span))->flags |=
+ SD_FLAG_SHARE_CPUPOWER;
+ continue;
+ }
+
+ for_each_cpu_mask(j, cpu_domain->span) {
+ struct sched_group *cpu = &sched_group_cpus[j];
+
+ cpus_clear(cpu->cpumask);
+ cpu_set(j, cpu->cpumask);
+ cpu->cpu_power = SCHED_LOAD_SCALE;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+ }
+
+ first_cpu = last_cpu = NULL;
+ /* Set up physical groups */
+ for_each_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_group *cpu = &sched_group_phys[i];
+
+ if (i != first_cpu(cpu_domain->span))
+ continue;
+
+ cpu->cpumask = cpu_domain->span;
+ /* See SMT+NUMA setup for comment */
+ cpu->cpu_power = SCHED_LOAD_SCALE + SCHED_LOAD_SCALE*(cpus_weight(cpu->cpumask)-1) / 10;
+
+ if (!first_cpu)
+ first_cpu = cpu;
+ if (last_cpu)
+ last_cpu->next = cpu;
+ last_cpu = cpu;
+ }
+ last_cpu->next = first_cpu;
+
+ mb();
+ for_each_cpu(i) {
+ struct sched_domain *cpu_domain = cpu_sched_domain(i);
+ struct sched_domain *phys_domain = &per_cpu(phys_domains, i);
+ struct sched_group *cpu_group = &sched_group_cpus[i];
+ struct sched_group *phys_group = &sched_group_phys[first_cpu(cpu_domain->span)];
+ cpu_domain->parent = phys_domain;
+ phys_domain->groups = phys_group;
+ cpu_domain->groups = cpu_group;
+ }
+}
+#endif /* CONFIG_NUMA */
+#endif /* CONFIG_SCHED_SMT */
diff -puN include/asm-ppc64/processor.h~sched-ppc64bits include/asm-ppc64/processor.h
--- gr25_work/include/asm-ppc64/processor.h~sched-ppc64bits 2004-03-11 21:27:02.665465998 -0600
+++ gr25_work-anton/include/asm-ppc64/processor.h 2004-03-11 21:27:02.677464095 -0600
@@ -631,6 +631,11 @@ static inline void prefetchw(const void
#define spin_lock_prefetch(x) prefetchw(x)
+#ifdef CONFIG_SCHED_SMT
+#define ARCH_HAS_SCHED_DOMAIN
+#define ARCH_HAS_SCHED_WAKE_BALANCE
+#endif
+
#endif /* ASSEMBLY */
#endif /* __ASM_PPC64_PROCESSOR_H */
_
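The two idioms the patch repeats, the cpu_power arithmetic and the first/last bookkeeping that closes each group list into a ring, can be sketched in user space as follows. This is only an illustration: toy_group, group_power and link_ring are hypothetical names, and TOY_LOAD_SCALE = 128 is an assumed value, since the patch only refers to the symbol SCHED_LOAD_SCALE. Unlike the patch, the sketch also tolerates an empty list.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOAD_SCALE 128UL  /* assumed value, stands in for SCHED_LOAD_SCALE */

struct toy_group {
    unsigned long cpu_power;
    struct toy_group *next;
};

/* Base power plus 10% of the base for each sibling beyond the first. */
static unsigned long group_power(unsigned long scale, unsigned int nr_siblings)
{
    assert(nr_siblings >= 1);
    return scale + scale * (nr_siblings - 1) / 10;
}

/*
 * Link groups[0..n-1] into a circular singly linked list using the same
 * first/last bookkeeping as the patch, and return the head (NULL if n == 0).
 */
static struct toy_group *link_ring(struct toy_group *groups, size_t n)
{
    struct toy_group *first = NULL, *last = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        struct toy_group *g = &groups[i];

        if (!first)
            first = g;
        if (last)
            last->next = g;
        last = g;
    }
    if (last)
        last->next = first;  /* close the ring */
    return first;
}
```

With integer division, a two-sibling group at a scale of 128 gets 128 + 12 = 140, so a physical group is rated only slightly above a single thread rather than as two full CPUs.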
On Thursday March 11, [email protected] wrote:
>
> Tried adding earlyprintk=vga?
>
> If that works, judicious addition of printks will narrow it down.
It doesn't.
I've tried compiling with SMP - no go.
I've tried with gcc-2.95 (instead of 3.3.2). Still no go.
I thought I might try selectively removing patches, but it isn't clear
what order the broken-out patches were applied in.
If you have an ordered list, I can try a binary search.
Or if you can suggest some patches that I can try backing out....
NeilBrown
Neil Brown <[email protected]> wrote:
>
> On Thursday March 11, [email protected] wrote:
> >
> > Tried adding earlyprintk=vga?
> >
> > If that works, judicious addition of printks will narrow it down.
>
> It doesn't.
>
> I've tried compiling with SMP - no go.
> I've tried with gcc-2.95 (instead of 3.3.2). Still no go.
Your .config works happily here.
> I thought I might try selectively removing patches, but it isn't clear
> what order the broken-out patches were applied in.
> If you have an ordered list, I can try a binary search.
See the `series' file in the broken-out directory.
> Or if you can suggest some patches that I can try backing out....
Maybe turn off -mregparm? Or back off the 4g/4g patches? Maybe they broke
non-4:4 code somehow.
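The binary search over an ordered patch list can be sketched as below. This is an illustration only: first_bad_patch, probe_fn and culprit_probe are hypothetical names, and the sketch assumes the failure is monotonic, meaning that once the offending patch is applied every longer prefix of the series also fails to boot.

```c
#include <assert.h>
#include <stddef.h>

/*
 * is_bad(k, ctx) reports whether a kernel built with the first k patches
 * of the series applied fails. In real use this would be a build-and-boot
 * cycle answered by hand.
 */
typedef int (*probe_fn)(size_t k, void *ctx);

/*
 * Return the 1-based index of the first patch that turns a good prefix
 * into a bad one. Preconditions: n >= 1, zero patches is good, all n
 * patches is bad.
 */
static size_t first_bad_patch(size_t n, probe_fn is_bad, void *ctx)
{
    size_t good = 0, bad = n;  /* invariant: prefix good works, prefix bad fails */

    assert(n >= 1);
    while (bad - good > 1) {
        size_t mid = good + (bad - good) / 2;

        if (is_bad(mid, ctx))
            bad = mid;
        else
            good = mid;
    }
    return bad;
}

/* Stand-in probe for experiments: ctx points at the culprit's index. */
static int culprit_probe(size_t k, void *ctx)
{
    return k >= *(size_t *)ctx;
}
```

For a series of a few hundred patches this needs roughly eight or nine build-and-boot cycles instead of one per patch.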
On Fri, Mar 12, 2004 at 12:37:20AM +0100, Andi Kleen wrote:
> Are DM_NAME_LEN and DM_UUID_LEN not both a multiple of 8?
name len == 128, uuid_len == 129, so is the uuid_len being rounded up
to the nearest 64bit boundary on x86-64 and only 32bit boundary on
x86-32 ? (Sounds likely)
> There are more structures here, right?
Not that affect this problem, we're getting unknown ioctl, not an oops.
- Joe
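The rounding described above can be worked through with a small sketch. The helper names padded_size and toy_struct_size are hypothetical, and the 48-byte header used in the example is an assumption chosen for illustration, not a figure taken from this message. The alignment values are the usual ABI ones: a 64-bit integer member is aligned to 4 bytes on i386 and to 8 bytes on x86-64.

```c
#include <assert.h>
#include <stddef.h>

/* Round raw up to the next multiple of align (align must be a power of two). */
static size_t padded_size(size_t raw, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (raw + align - 1) & ~(align - 1);
}

/*
 * Hypothetical layout: header bytes of fixed-width fields whose strictest
 * member is a 64-bit integer, followed by the two char arrays with the
 * lengths quoted in the thread. The struct's total size is rounded up to
 * the alignment of that strictest member.
 */
static size_t toy_struct_size(size_t header, size_t u64_align)
{
    size_t raw = header + 128 /* name */ + 129 /* uuid */;

    return padded_size(raw, u64_align);
}
```

With a 48-byte header the raw size is 305 bytes, which rounds to 308 with 4-byte alignment and to 312 with 8-byte alignment, so the same source yields two different sizeof values.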
On Thursday 11 March 2004 21:23, Adrian Bunk wrote:
Hi Adrian,
> On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
> >...
> > ext3-journalled-quotas-2.patch
> > ext3: journalled quota
> >...
> This patch broke modular quota:
> WARNING: /lib/modules/2.6.4-mm1/kernel/fs/quota_v2.ko needs unknown
> symbol mark_info_dirty
Patch attached (again) ;)
ciao, Marc
It seems to work fine for desktop use.
I compiled with regparm, 4k stacks, and checking for stack overflow.
Helge Hafting
On Thursday 11 March 2004 21:31, you wrote:
> This causes the following unknown symbols in modules on i386:
Sorry, that could not work. This patch reverts my changes to loadable
device drivers. As Arjan van de Ven already noted, they have to
be converted to request_firmware() anyway.
Arnd <><
drivers/media/dvb/frontends/alps_tdlb7.c | 12 ++++++++----
drivers/media/dvb/frontends/sp887x.c | 11 +++++++----
drivers/media/dvb/frontends/tda1004x.c | 10 ++++++----
sound/isa/wavefront/wavefront_synth.c | 12 ++++++------
sound/oss/wavfront.c | 12 +++++++-----
5 files changed, 34 insertions(+), 23 deletions(-)
diff -u -r linux-2.6.4-mm1/drivers/media/dvb/frontends/alps_tdlb7.c linux-2.6.4-mm1-patched/drivers/media/dvb/frontends/alps_tdlb7.c
--- linux-2.6.4-mm1/drivers/media/dvb/frontends/alps_tdlb7.c 2004-03-12 10:03:46.000000000 +0100
+++ linux-2.6.4-mm1-patched/drivers/media/dvb/frontends/alps_tdlb7.c 2004-03-12 10:07:51.000000000 +0100
@@ -29,6 +29,8 @@
*/
+
+#define __KERNEL_SYSCALLS__
#include <linux/module.h>
#include <linux/init.h>
#include <linux/vmalloc.h>
@@ -56,6 +58,8 @@
#define SP8870_FIRMWARE_OFFSET 0x0A
+static int errno;
+
static struct dvb_frontend_info tdlb7_info = {
.name = "Alps TDLB7",
.type = FE_OFDM,
@@ -170,13 +174,13 @@
loff_t filesize;
char *dp;
- fd = sys_open(fn, 0, 0);
+ fd = open(fn, 0, 0);
if (fd == -1) {
printk("%s: unable to open '%s'.\n", __FUNCTION__, fn);
return -EIO;
}
- filesize = sys_lseek(fd, 0L, 2);
+ filesize = lseek(fd, 0L, 2);
if (filesize <= 0 || filesize < SP8870_FIRMWARE_OFFSET + SP8870_FIRMWARE_SIZE) {
printk("%s: firmware filesize to small '%s'\n", __FUNCTION__, fn);
sys_close(fd);
@@ -190,8 +194,8 @@
return -EIO;
}
- sys_lseek(fd, SP8870_FIRMWARE_OFFSET, 0);
- if (sys_read(fd, dp, SP8870_FIRMWARE_SIZE) != SP8870_FIRMWARE_SIZE) {
+ lseek(fd, SP8870_FIRMWARE_OFFSET, 0);
+ if (read(fd, dp, SP8870_FIRMWARE_SIZE) != SP8870_FIRMWARE_SIZE) {
printk("%s: failed to read '%s'.\n",__FUNCTION__, fn);
vfree(dp);
sys_close(fd);
diff -u -r linux-2.6.4-mm1/drivers/media/dvb/frontends/sp887x.c linux-2.6.4-mm1-patched/drivers/media/dvb/frontends/sp887x.c
--- linux-2.6.4-mm1/drivers/media/dvb/frontends/sp887x.c 2004-03-12 10:03:46.000000000 +0100
+++ linux-2.6.4-mm1-patched/drivers/media/dvb/frontends/sp887x.c 2004-03-12 10:07:51.000000000 +0100
@@ -12,6 +12,7 @@
next 0x4000 loaded. This may change in future versions.
*/
+#define __KERNEL_SYSCALLS__
#include <linux/kernel.h>
#include <linux/vmalloc.h>
#include <linux/module.h>
@@ -67,6 +68,8 @@
FE_CAN_QPSK | FE_CAN_QAM_16 | FE_CAN_QAM_64 | FE_CAN_RECOVER
};
+static int errno;
+
static
int i2c_writebytes (struct dvb_frontend *fe, u8 addr, u8 *buf, u8 len)
{
@@ -213,13 +216,13 @@
// Load the firmware
set_fs(get_ds());
- fd = sys_open(sp887x_firmware, 0, 0);
+ fd = open(sp887x_firmware, 0, 0);
if (fd < 0) {
printk(KERN_WARNING "%s: Unable to open firmware %s\n", __FUNCTION__,
sp887x_firmware);
return -EIO;
}
- filesize = sys_lseek(fd, 0L, 2);
+ filesize = lseek(fd, 0L, 2);
if (filesize <= 0) {
printk(KERN_WARNING "%s: Firmware %s is empty\n", __FUNCTION__,
sp887x_firmware);
@@ -241,8 +244,8 @@
// read it!
// read the first 16384 bytes from the file
// ignore the first 10 bytes
- sys_lseek(fd, 10, 0);
- if (sys_read(fd, firmware, fw_size) != fw_size) {
+ lseek(fd, 10, 0);
+ if (read(fd, firmware, fw_size) != fw_size) {
printk(KERN_WARNING "%s: Failed to read firmware\n", __FUNCTION__);
vfree(firmware);
sys_close(fd);
diff -u -r linux-2.6.4-mm1/drivers/media/dvb/frontends/tda1004x.c linux-2.6.4-mm1-patched/drivers/media/dvb/frontends/tda1004x.c
--- linux-2.6.4-mm1/drivers/media/dvb/frontends/tda1004x.c 2004-03-12 10:03:46.000000000 +0100
+++ linux-2.6.4-mm1-patched/drivers/media/dvb/frontends/tda1004x.c 2004-03-12 10:07:51.000000000 +0100
@@ -32,6 +32,7 @@
*/
+#define __KERNEL_SYSCALLS__
#include <linux/kernel.h>
#include <linux/vmalloc.h>
#include <linux/module.h>
@@ -40,6 +41,7 @@
#include <linux/slab.h>
#include <linux/syscalls.h>
#include <linux/fs.h>
+#include <linux/unistd.h>
#include <linux/fcntl.h>
#include <linux/errno.h>
#include "dvb_frontend.h"
@@ -397,13 +399,13 @@
// Load the firmware
set_fs(get_ds());
- fd = sys_open(tda1004x_firmware, 0, 0);
+ fd = open(tda1004x_firmware, 0, 0);
if (fd < 0) {
printk("%s: Unable to open firmware %s\n", __FUNCTION__,
tda1004x_firmware);
return -EIO;
}
- filesize = sys_lseek(fd, 0L, 2);
+ filesize = lseek(fd, 0L, 2);
if (filesize <= 0) {
printk("%s: Firmware %s is empty\n", __FUNCTION__,
tda1004x_firmware);
@@ -434,8 +436,8 @@
}
// read it!
- sys_lseek(fd, fw_offset, 0);
- if (sys_read(fd, firmware, fw_size) != fw_size) {
+ lseek(fd, fw_offset, 0);
+ if (read(fd, firmware, fw_size) != fw_size) {
printk("%s: Failed to read firmware\n", __FUNCTION__);
vfree(firmware);
sys_close(fd);
diff -u -r linux-2.6.4-mm1/sound/isa/wavefront/wavefront_synth.c linux-2.6.4-mm1-patched/sound/isa/wavefront/wavefront_synth.c
--- linux-2.6.4-mm1/sound/isa/wavefront/wavefront_synth.c 2004-03-12 10:03:50.000000000 +0100
+++ linux-2.6.4-mm1-patched/sound/isa/wavefront/wavefront_synth.c 2004-03-12 10:07:51.000000000 +0100
@@ -1913,11 +1913,11 @@
return (1);
}
+#define __KERNEL_SYSCALLS__
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/unistd.h>
-#include <linux/syscalls.h>
#include <asm/uaccess.h>
static int errno;
@@ -1947,7 +1947,7 @@
fs = get_fs();
set_fs (get_ds());
- if ((fd = sys_open (path, 0, 0)) < 0) {
+ if ((fd = open (path, 0, 0)) < 0) {
snd_printk ("Unable to load \"%s\".\n",
path);
return 1;
@@ -1956,7 +1956,7 @@
while (1) {
int x;
- if ((x = sys_read (fd, &section_length, sizeof (section_length))) !=
+ if ((x = read (fd, &section_length, sizeof (section_length))) !=
sizeof (section_length)) {
snd_printk ("firmware read error.\n");
goto failure;
@@ -1966,7 +1966,7 @@
break;
}
- if (sys_read (fd, section, section_length) != section_length) {
+ if (read (fd, section, section_length) != section_length) {
snd_printk ("firmware section "
"read error.\n");
goto failure;
@@ -2005,12 +2005,12 @@
}
- sys_close (fd);
+ close (fd);
set_fs (fs);
return 0;
failure:
- sys_close (fd);
+ close (fd);
set_fs (fs);
snd_printk ("firmware download failed!!!\n");
return 1;
diff -u -r linux-2.6.4-mm1/sound/oss/wavfront.c linux-2.6.4-mm1-patched/sound/oss/wavfront.c
--- linux-2.6.4-mm1/sound/oss/wavfront.c 2004-03-12 10:03:50.000000000 +0100
+++ linux-2.6.4-mm1-patched/sound/oss/wavfront.c 2004-03-12 10:07:51.000000000 +0100
@@ -2490,9 +2490,11 @@
}
#include "os.h"
+#define __KERNEL_SYSCALLS__
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/slab.h>
+#include <linux/unistd.h>
#include <asm/uaccess.h>
static int errno;
@@ -2522,7 +2524,7 @@
fs = get_fs();
set_fs (get_ds());
- if ((fd = sys_open (path, 0, 0)) < 0) {
+ if ((fd = open (path, 0, 0)) < 0) {
printk (KERN_WARNING LOGNAME "Unable to load \"%s\".\n",
path);
return 1;
@@ -2531,7 +2533,7 @@
while (1) {
int x;
- if ((x = sys_read (fd, &section_length, sizeof (section_length))) !=
+ if ((x = read (fd, &section_length, sizeof (section_length))) !=
sizeof (section_length)) {
printk (KERN_ERR LOGNAME "firmware read error.\n");
goto failure;
@@ -2541,7 +2543,7 @@
break;
}
- if (sys_read (fd, section, section_length) != section_length) {
+ if (read (fd, section, section_length) != section_length) {
printk (KERN_ERR LOGNAME "firmware section "
"read error.\n");
goto failure;
@@ -2580,12 +2582,12 @@
}
- sys_close (fd);
+ close (fd);
set_fs (fs);
return 0;
failure:
- sys_close (fd);
+ close (fd);
set_fs (fs);
printk (KERN_ERR "\nWaveFront: firmware download failed!!!\n");
return 1;
Arnd Bergmann <[email protected]> wrote:
>
> On Thursday 11 March 2004 21:31, you wrote:
> > This causes the following unknown symbols in modules on i386:
>
> Sorry, that could not work. This patch reverts my changes to loadable
> device drivers. As Arjan van de Ven already noted, they have to
> be converted to request_firmware() anyway.
I just did an EXPORT_SYMBOL_GPL of the three symbols and added a suitably
rude changelog. Is that inadequate?
On Friday 12 March 2004 10:29, you wrote:
> I just did an EXPORT_SYMBOL_GPL of the three symbols and added a suitably
> rude changelog. Is that inadequate?
The symbols are already exported on alpha, arm, parisc, um and x86_64,
but I'd rather not have them available to modules at all in order to
prevent driver writers from (ab)using them after KERNEL_SYSCALLS have been
eliminated.
Arnd <><
Arnd Bergmann <[email protected]> wrote:
>
> On Friday 12 March 2004 10:29, you wrote:
> > I just did an EXPORT_SYMBOL_GPL of the three symbols and added a suitably
> > rude changelog. Is that inadequate?
>
> The symbols are already exported on alpha, arm, parisc, um and x86_64,
> but I'd rather not have them available to modules at all in order to
> prevent driver writers from (ab)using them after KERNEL_SYSCALLS have been
> eliminated.
>
But then the removal of KERNEL_SYSCALLS becomes hostage to those drivers,
and nobody is working on them. It'll never happen.
On Fri, Mar 12, 2004 at 08:22:14AM +0000, Joe Thornber wrote:
> name len == 128, uuid_len == 129, so is the uuid_len being rounded up
> to the nearest 64bit boundary on x86-64 and only 32bit boundary on
> x86-32 ? (Sounds likely)
In which case the following ugly patch should fix things. Mickael,
any chance you could test this please ?
- Joe
Fix ioctl breakage on x86-64.
--- diff/include/linux/dm-ioctl.h 2004-03-11 10:20:28.000000000 +0000
+++ source/include/linux/dm-ioctl.h 2004-03-12 09:44:58.000000000 +0000
@@ -187,23 +187,37 @@ enum {
DM_TABLE_STATUS_CMD,
};
+/*
+ * The dm_ioctl struct passed into the ioctl is just the header
+ * on a larger chunk of memory. On x86-64 the dm-ioctl struct
+ * will be padded to an 8 byte boundary so the size will be
+ * different, which would change the ioctl code - yes I really
+ * messed up. This hack forces x86-64 to have the correct ioctl
+ * code.
+ */
+#ifdef CONFIG_X86_64
+typedef char ioctl_struct[308];
+#else
+typedef struct dm_ioctl ioctl_struct;
+#endif
+
#define DM_IOCTL 0xfd
-#define DM_VERSION _IOWR(DM_IOCTL, DM_VERSION_CMD, struct dm_ioctl)
-#define DM_REMOVE_ALL _IOWR(DM_IOCTL, DM_REMOVE_ALL_CMD, struct dm_ioctl)
-#define DM_LIST_DEVICES _IOWR(DM_IOCTL, DM_LIST_DEVICES_CMD, struct dm_ioctl)
-
-#define DM_DEV_CREATE _IOWR(DM_IOCTL, DM_DEV_CREATE_CMD, struct dm_ioctl)
-#define DM_DEV_REMOVE _IOWR(DM_IOCTL, DM_DEV_REMOVE_CMD, struct dm_ioctl)
-#define DM_DEV_RENAME _IOWR(DM_IOCTL, DM_DEV_RENAME_CMD, struct dm_ioctl)
-#define DM_DEV_SUSPEND _IOWR(DM_IOCTL, DM_DEV_SUSPEND_CMD, struct dm_ioctl)
-#define DM_DEV_STATUS _IOWR(DM_IOCTL, DM_DEV_STATUS_CMD, struct dm_ioctl)
-#define DM_DEV_WAIT _IOWR(DM_IOCTL, DM_DEV_WAIT_CMD, struct dm_ioctl)
-
-#define DM_TABLE_LOAD _IOWR(DM_IOCTL, DM_TABLE_LOAD_CMD, struct dm_ioctl)
-#define DM_TABLE_CLEAR _IOWR(DM_IOCTL, DM_TABLE_CLEAR_CMD, struct dm_ioctl)
-#define DM_TABLE_DEPS _IOWR(DM_IOCTL, DM_TABLE_DEPS_CMD, struct dm_ioctl)
-#define DM_TABLE_STATUS _IOWR(DM_IOCTL, DM_TABLE_STATUS_CMD, struct dm_ioctl)
+#define DM_VERSION _IOWR(DM_IOCTL, DM_VERSION_CMD, ioctl_struct)
+#define DM_REMOVE_ALL _IOWR(DM_IOCTL, DM_REMOVE_ALL_CMD, ioctl_struct)
+#define DM_LIST_DEVICES _IOWR(DM_IOCTL, DM_LIST_DEVICES_CMD, ioctl_struct)
+
+#define DM_DEV_CREATE _IOWR(DM_IOCTL, DM_DEV_CREATE_CMD, ioctl_struct)
+#define DM_DEV_REMOVE _IOWR(DM_IOCTL, DM_DEV_REMOVE_CMD, ioctl_struct)
+#define DM_DEV_RENAME _IOWR(DM_IOCTL, DM_DEV_RENAME_CMD, ioctl_struct)
+#define DM_DEV_SUSPEND _IOWR(DM_IOCTL, DM_DEV_SUSPEND_CMD, ioctl_struct)
+#define DM_DEV_STATUS _IOWR(DM_IOCTL, DM_DEV_STATUS_CMD, ioctl_struct)
+#define DM_DEV_WAIT _IOWR(DM_IOCTL, DM_DEV_WAIT_CMD, ioctl_struct)
+
+#define DM_TABLE_LOAD _IOWR(DM_IOCTL, DM_TABLE_LOAD_CMD, ioctl_struct)
+#define DM_TABLE_CLEAR _IOWR(DM_IOCTL, DM_TABLE_CLEAR_CMD, ioctl_struct)
+#define DM_TABLE_DEPS _IOWR(DM_IOCTL, DM_TABLE_DEPS_CMD, ioctl_struct)
+#define DM_TABLE_STATUS _IOWR(DM_IOCTL, DM_TABLE_STATUS_CMD, ioctl_struct)
#define DM_VERSION_MAJOR 4
#define DM_VERSION_MINOR 0
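Why a different struct size produces an "unknown ioctl" can be seen from how the command number is built. The sketch below mirrors the generic ioctl number layout used on x86 (command number in bits 0-7, type in bits 8-15, argument size in bits 16-29, direction in bits 30-31). toy_iowr is a hypothetical stand-in for the _IOWR macro, written as a function so it can be run in user space.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build a read/write ioctl number from its parts. The argument size is
 * part of the number, so two builds that disagree on sizeof(struct)
 * also disagree on every command code derived from it.
 */
static uint32_t toy_iowr(uint32_t type, uint32_t nr, uint32_t size)
{
    const uint32_t dir_read_write = 3;  /* read bit (2) | write bit (1) */

    assert(size < (1u << 14));          /* size field is 14 bits wide */
    return (dir_read_write << 30) |
           (size << 16) |
           ((type & 0xffu) << 8) |
           (nr & 0xffu);
}
```

For type 0xfd and command 0, a 308-byte argument gives 0xC134FD00 while a 312-byte argument gives 0xC138FD00, so a kernel switching on one value does not recognise a caller that sends the other.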
On Fri, Mar 12, 2004 at 01:48:09AM -0800, Andrew Morton wrote:
> > The symbols are already exported on alpha, arm, parisc, um and x86_64,
> > but I'd rather not have them available to modules at all in order to
> > prevent driver writers from (ab)using them after KERNEL_SYSCALLS have been
> > eliminated.
>
> But then the removal of KERNEL_SYSCALLS becomes hostage to those drivers,
> and nobody is working on them. It'll never happen.
The DVB folks claimed to be working on fixing this up a few weeks back,
still not seen any patches though.
Dave
On Fri, 2004-03-12 at 10:48, Andrew Morton wrote:
> Arnd Bergmann <[email protected]> wrote:
> But then the removal of KERNEL_SYSCALLS becomes hostage to those drivers,
> and nobody is working on them. It'll never happen.
CONFIG_BROKEN ??
Hi,
just tested, it works just fine :)
no more errors,
dmsetup version and dmsetup ls work nicely
I will try to make evms work now but I guess that should be okay.
I will report if I have other troubles.
good candidate for next mm ?
thanks Joe,
Mik
On Friday 12 March 2004 10:49, Joe Thornber wrote:
> On Fri, Mar 12, 2004 at 08:22:14AM +0000, Joe Thornber wrote:
> > name len == 128, uuid_len == 129, so is the uuid_len being rounded up
> > to the nearest 64bit boundary on x86-64 and only 32bit boundary on
> > x86-32 ? (Sounds likely)
>
> In which case the following ugly patch should fix things. Mickael,
> any chance you could test this please ?
>
> - Joe
>
>
> Fix ioctl breakage on x86-64.
> --- diff/include/linux/dm-ioctl.h 2004-03-11 10:20:28.000000000 +0000
> +++ source/include/linux/dm-ioctl.h 2004-03-12 09:44:58.000000000 +0000
> @@ -187,23 +187,37 @@ enum {
> DM_TABLE_STATUS_CMD,
> };
>
> +/*
> + * The dm_ioctl struct passed into the ioctl is just the header
> + * on a larger chunk of memory. On x86-64 the dm-ioctl struct
> + * will be padded to an 8 byte boundary so the size will be
> + * different, which would change the ioctl code - yes I really
> + * messed up. This hack forces x86-64 to have the correct ioctl
> + * code.
> + */
> +#ifdef CONFIG_X86_64
> +typedef char ioctl_struct[308];
> +#else
> +typedef struct dm_ioctl ioctl_struct;
> +#endif
> +
> #define DM_IOCTL 0xfd
>
> -#define DM_VERSION _IOWR(DM_IOCTL, DM_VERSION_CMD, struct dm_ioctl)
> -#define DM_REMOVE_ALL _IOWR(DM_IOCTL, DM_REMOVE_ALL_CMD, struct dm_ioctl)
> -#define DM_LIST_DEVICES _IOWR(DM_IOCTL, DM_LIST_DEVICES_CMD, struct dm_ioctl)
> -
> -#define DM_DEV_CREATE _IOWR(DM_IOCTL, DM_DEV_CREATE_CMD, struct dm_ioctl)
> -#define DM_DEV_REMOVE _IOWR(DM_IOCTL, DM_DEV_REMOVE_CMD, struct dm_ioctl)
> -#define DM_DEV_RENAME _IOWR(DM_IOCTL, DM_DEV_RENAME_CMD, struct dm_ioctl)
> -#define DM_DEV_SUSPEND _IOWR(DM_IOCTL, DM_DEV_SUSPEND_CMD, struct dm_ioctl)
> -#define DM_DEV_STATUS _IOWR(DM_IOCTL, DM_DEV_STATUS_CMD, struct dm_ioctl)
> -#define DM_DEV_WAIT _IOWR(DM_IOCTL, DM_DEV_WAIT_CMD, struct dm_ioctl)
> -
> -#define DM_TABLE_LOAD _IOWR(DM_IOCTL, DM_TABLE_LOAD_CMD, struct dm_ioctl)
> -#define DM_TABLE_CLEAR _IOWR(DM_IOCTL, DM_TABLE_CLEAR_CMD, struct dm_ioctl)
> -#define DM_TABLE_DEPS _IOWR(DM_IOCTL, DM_TABLE_DEPS_CMD, struct dm_ioctl)
> -#define DM_TABLE_STATUS _IOWR(DM_IOCTL, DM_TABLE_STATUS_CMD, struct dm_ioctl)
> +#define DM_VERSION _IOWR(DM_IOCTL, DM_VERSION_CMD, ioctl_struct)
> +#define DM_REMOVE_ALL _IOWR(DM_IOCTL, DM_REMOVE_ALL_CMD, ioctl_struct)
> +#define DM_LIST_DEVICES _IOWR(DM_IOCTL, DM_LIST_DEVICES_CMD, ioctl_struct)
> +
> +#define DM_DEV_CREATE _IOWR(DM_IOCTL, DM_DEV_CREATE_CMD, ioctl_struct)
> +#define DM_DEV_REMOVE _IOWR(DM_IOCTL, DM_DEV_REMOVE_CMD, ioctl_struct)
> +#define DM_DEV_RENAME _IOWR(DM_IOCTL, DM_DEV_RENAME_CMD, ioctl_struct)
> +#define DM_DEV_SUSPEND _IOWR(DM_IOCTL, DM_DEV_SUSPEND_CMD, ioctl_struct)
> +#define DM_DEV_STATUS _IOWR(DM_IOCTL, DM_DEV_STATUS_CMD, ioctl_struct)
> +#define DM_DEV_WAIT _IOWR(DM_IOCTL, DM_DEV_WAIT_CMD, ioctl_struct)
> +
> +#define DM_TABLE_LOAD _IOWR(DM_IOCTL, DM_TABLE_LOAD_CMD, ioctl_struct)
> +#define DM_TABLE_CLEAR _IOWR(DM_IOCTL, DM_TABLE_CLEAR_CMD, ioctl_struct)
> +#define DM_TABLE_DEPS _IOWR(DM_IOCTL, DM_TABLE_DEPS_CMD, ioctl_struct)
> +#define DM_TABLE_STATUS _IOWR(DM_IOCTL, DM_TABLE_STATUS_CMD, ioctl_struct)
>
> #define DM_VERSION_MAJOR 4
> #define DM_VERSION_MINOR 0
On Fri, Mar 12, 2004 at 01:11:46PM +0100, Mickael Marchand wrote:
> Hi,
>
> just tested, it works just fine :)
Great, I'm glad it works. Sorry I was so slow about realising what was
wrong.
> good candidate for next mm ?
Yep, I'll forward a patch to akpm now.
- Joe
Joe Thornber <[email protected]> writes:
> Fix ioctl breakage on x86-64.
> --- diff/include/linux/dm-ioctl.h 2004-03-11 10:20:28.000000000 +0000
> +++ source/include/linux/dm-ioctl.h 2004-03-12 09:44:58.000000000 +0000
> @@ -187,23 +187,37 @@ enum {
> DM_TABLE_STATUS_CMD,
> };
>
> +/*
> + * The dm_ioctl struct passed into the ioctl is just the header
> + * on a larger chunk of memory. On x86-64 the dm-ioctl struct
> + * will be padded to an 8 byte boundary so the size will be
> + * different, which would change the ioctl code - yes I really
> + * messed up. This hack forces x86-64 to have the correct ioctl
> + * code.
> + */
> +#ifdef CONFIG_X86_64
> +typedef char ioctl_struct[308];
> +#else
> +typedef struct dm_ioctl ioctl_struct;
> +#endif
That's bad because it will break binary compatibility for existing
x86-64 systems. Don't add that please. Either emulate it properly
or I will just declare the 32bit DM emulation broken and users will
have to live with that.
-Andi
Joe Thornber <[email protected]> writes:
>
>> good candidate for next mm ?
>
> Yep, I'll forward a patch to akpm now.
Please don't do that. It will break all 64bit userland.
-Andi
On Fri, Mar 19, 2004 at 06:58:26AM +0100, Andi Kleen wrote:
> That's bad because it will break binary compatibility for existing
> x86-64 systems. Don't add that please. Either emulate it properly
> or I will just declare the 32bit DM emulation broken and users will
> have to live with that.
So you want me to put in a separate set of ioctl codes for
compatibility?
- Joe
On Fri, Mar 12, 2004 at 01:49:43PM +0000, Joe Thornber wrote:
> On Fri, Mar 19, 2004 at 06:58:26AM +0100, Andi Kleen wrote:
> > That's bad because it will break binary compatibility for existing
> > x86-64 systems. Don't add that please. Either emulate it properly
> > or I will just declare the 32bit DM emulation broken and users will
> > have to live with that.
>
> So you want me to put in a separate set of ioctl codes for
> compatibility.
Breaking the 64bit ABI is not acceptable at least. There are distributions
shipping that use it.
For 32bit emulation you can use what you think is easiest or just
ignore it if it's too hard (64bit is more important than 32bit)
-Andi
On Fri, Mar 12, 2004 at 03:24:43PM +1100, Nick Piggin wrote:
>
>
> Andi Kleen wrote:
>
> >On Thu, Mar 11, 2004 at 07:04:50PM -0800, Nakajima, Jun wrote:
> >
> >>As we can have more complex architectures in the future, the scheduler
> >>is flexible enough to represent various scheduling domains effectively,
> >>and yet keeps the common scheduler code simple.
> >>
> >
> >I think for SMT alone it's too complex and for NUMA it doesn't do
> >the right thing for "modern NUMAs" (where NUMA factor is very low
> >and you have a small number of CPUs for each node).
> >
> >
>
> For SMT it is less complex than shared runqueues; it is actually
> fewer lines of code and a smaller object size.
By moving all the complexity into arch/* ?
>
> It is also more flexible than shared runqueues in that you can still
> have control over each sibling's runqueue. Con's SMT nice patch for
> example would probably be more difficult to do with shared runqueues.
> Shared runqueues also gives zero affinity to siblings. While current
> implementations may not (do they?) care, future ones might.
>
> For Opteron type NUMA, it actually balances much more aggressively
> than the default NUMA scheduler, especially when a CPU is idle. I
> don't doubt you aren't seeing great performance, but it should be
> able to be fixed.
>
> The problem is just presumably your lack of time to investigate
> further, and my lack of problem descriptions or Opterons.
I didn't investigate further on your scheduler because I have my
doubts about it being the right approach and it seems to have
some obvious design bugs (like the racy SMT setup).
The problem description is still the same as it was in the past.
Basically it is: schedule as on SMP, but avoid local affinity for newly
created tasks and balance early. Allow all old-style NUMA
heuristics to be disabled.
Longer term some homenode scheduling affinity may be still useful,
but I tried to get that to work on 2.4 and failed, so I'm not sure
it can be done. The right way may be to keep track of how much memory
each thread has allocated on each node and preferably schedule on
the node with the most memory. But that's future work.
>
> One thing you definitely want is a sched_balance_fork, is that right?
> Have you been able to do any benchmarks on recent -mm kernels?
I sent the last benchmarks I did to you (including the tweaks you
suggested). All did worse than the standard scheduler. Did you
change anything significant that makes rebenchmarking useful?
-Andi
Andi Kleen wrote:
>On Fri, Mar 12, 2004 at 03:24:43PM +1100, Nick Piggin wrote:
>
>>
>>Andi Kleen wrote:
>>
>>
>>>On Thu, Mar 11, 2004 at 07:04:50PM -0800, Nakajima, Jun wrote:
>>>
>>>
>>>>As we can have more complex architectures in the future, the scheduler
>>>>is flexible enough to represent various scheduling domains effectively,
>>>>and yet keeps the common scheduler code simple.
>>>>
>>>>
>>>I think for SMT alone it's too complex and for NUMA it doesn't do
>>>the right thing for "modern NUMAs" (where NUMA factor is very low
>>>and you have a small number of CPUs for each node).
>>>
>>>
>>>
>>For SMT it is less complex than shared runqueues; it is actually
>>fewer lines of code and a smaller object size.
>>
>
>By moving all the complexity into arch/* ?
>
>
Well you have a point in a way. At least it is configurable, per
arch, and done in setup __init code. The whole point really was
to move the complexity to arch/* (or they can just use the default
setup, obviously).
>>It is also more flexible than shared runqueues in that you can still
>>have control over each sibling's runqueue. Con's SMT nice patch for
>>example would probably be more difficult to do with shared runqueues.
>>Shared runqueues also gives zero affinity to siblings. While current
>>implementations may not (do they?) care, future ones might.
>>
>>For Opteron type NUMA, it actually balances much more aggressively
>>than the default NUMA scheduler, especially when a CPU is idle. I
>>don't doubt you aren't seeing great performance, but it should be
>>able to be fixed.
>>
>>The problem is just presumably your lack of time to investigate
>>further, and my lack of problem descriptions or Opterons.
>>
>
>I didn't investigate further on your scheduler because I have my
>doubts about it being the right approach and it seems to have
>some obvious design bugs (like the racy SMT setup)
>
>
If you have any ideas about other approaches I would be interested
to hear them...
Setup needs some work, yes. It isn't a fundamental problem.
>The problem description is still the same as it was in the past.
>
>Basically it is: schedule as on SMP, but avoid local affinity for newly
>created tasks and balance early. Allow all old-style NUMA
>heuristics to be disabled.
>
>
That is pretty much what it does now. Apart from moving newly created
tasks. I think you're pretty brave for wanting to move new *threads*
off node. If anything, they are the tasks most likely to
share memory. But I could add a sched_balance_fork which you can turn
on if you like.
>Longer term some homenode scheduling affinity may be still useful,
>but I tried to get that to work on 2.4 and failed, so I'm not sure
>it can be done. The right way may be to keep track how much memory
>each thread allocated on each node and preferably schedule on
>the node with the most memory. But that's future work.
>
>
Yeah. There is no reason why the scheduler should perform worse than
2.4 for you. We have to get to the bottom of it.
>>One thing you definitely want is a sched_balance_fork, is that right?
>>Have you been able to do any benchmarks on recent -mm kernels?
>>
>
>I sent the last benchmarks I did to you (including the tweaks you
>suggested). All did worse than the standard scheduler. Did you
>change anything significant that makes rebenchmarking useful?
>
>
Yeah thanks for those. There have been quite a few changes and fixes
to the scheduler since then, so I think it would be worth re-testing.
On Thu, 2004-03-11 at 20:03, Neil Brown wrote:
> I've seen this before when trying to boot a P4 kernel on a P-classic
> etc, so I tried compiling with CONFIG_M386, and got lots of compile
> errors:
>
> include/asm/acpi.h: In function `__acpi_acquire_global_lock':
> include/asm/acpi.h:74: warning: implicit declaration of function
> `cmpxchg'
>
Fixed in the latest ACPI patch.
> So I tried the default (CONFIG_M686) and it still doesn't work.
>
> So: where do I look next?
Did you try "acpi=off"?
If the system crashed during ACPI table parsing, that would happen
before console output, and acpi=off would skip table parsing.
thanks,
-Len
On Fri, Mar 12, 2004 at 11:11:50AM +0100, Arjan van de Ven wrote:
> On Fri, 2004-03-12 at 10:48, Andrew Morton wrote:
> > Arnd Bergmann <[email protected]> wrote:
>
> > But then the removal of KERNEL_SYSCALLS becomes hostage to those drivers,
> > and nobody is working on them. It'll never happen.
>
> CONFIG_BROKEN ??
These are working drivers, and it's a stable kernel series...
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
On Fri, Mar 12, 2004 at 08:42:36PM +0100, Adrian Bunk wrote:
> On Fri, Mar 12, 2004 at 11:11:50AM +0100, Arjan van de Ven wrote:
> > On Fri, 2004-03-12 at 10:48, Andrew Morton wrote:
> > > Arnd Bergmann <[email protected]> wrote:
> >
> > > But then the removal of KERNEL_SYSCALLS becomes hostage to those drivers,
> > > and nobody is working on them. It'll never happen.
> >
> > CONFIG_BROKEN ??
>
> These are working drivers, and it's a stable kernel series...
for some value of working...
I mean, I'm not convinced the code is actually secure
(what if the fd they use is shared via a thread and userland is mucking
about with that fd to do funky stuff ???)
Hi!
> IBM Thinkpad T30, current bios
>
> On a clean boot (not resume - I've not gotten that working):
> resuming from /dev/hda8
> Resuming from device hda8
> bad: scheduling while atomic!
Boot with "noresume", then mkswap /dev/hda8.
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms
On Fri, Mar 12, 2004 at 01:11:46PM +0100, Mickael Marchand wrote:
> just tested, it works just fine :)
> no more errors,
> dmsetup version and dmsetup ls work nicely
I concur -- this seems to be working now. Many thanks.
> > In which case the following ugly patch should fix things. Mickael,
> > any chance you could test this please ?
> >
> > - Joe
Eww... that's a really horrid-looking patch. I'm not complaining
right now, though. :)
Hugo.
--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 1C335860 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Hey, Virtual Memory! Now I can have a *really big* ramdisk! ---
Dave Jones <[email protected]> :
[...]
> The DVB folks claimed to be working on fixing this up a few weeks back,
> still not seen any patches though.
Which drivers can be considered to be good examples for the firmware
framework ?
--
Ueimor
> On Thursday 11 March 2004 21:23, Adrian Bunk wrote:
>
> Hi Adrian,
>
> > On Wed, Mar 10, 2004 at 11:31:40PM -0800, Andrew Morton wrote:
> > >...
> > > ext3-journalled-quotas-2.patch
> > > ext3: journalled quota
> > >...
>
> > This patch broke modular quota:
> > WARNING: /lib/modules/2.6.4-mm1/kernel/fs/quota_v2.ko needs unknown
> > symbol mark_info_dirty
>
> Patch attached (again) ;)
Yes, the patch is right... I tested modular filesystem but forgot
about modular quota formats ;(.
Honza
> --- old/fs/dquot.c 2004-03-08 23:49:35.000000000 +0100
> +++ new/fs/dquot.c 2004-03-08 23:51:02.000000000 +0100
> @@ -1733,3 +1733,4 @@ EXPORT_SYMBOL(dquot_alloc_inode);
> EXPORT_SYMBOL(dquot_free_space);
> EXPORT_SYMBOL(dquot_free_inode);
> EXPORT_SYMBOL(dquot_transfer);
> +EXPORT_SYMBOL(mark_info_dirty);
--
Jan Kara <[email protected]>
SuSE CR Labs