ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
- Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
  It seems that nothing else is going to come along and this is completely
  encapsulated.

- Various other stuff. If anyone has a patch in here which they think
  should be in 2.6.11, please let me know. I'm intending to merge the
  following into 2.6.11:

    alpha-add-missing-dma_mapping_error.patch
    fix-compat-shmget-overflow.patch
    fix-shmget-for-ppc64-s390-64-sparc64.patch
    binfmt_elf-clearing-bss-may-fail.patch
    qlogic-warning-fixes.patch
    oprofile-exittext-referenced-in-inittext.patch
    force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
    oprofile-arm-xscale1-pmu-support-fix.patch

Changes since 2.6.11-rc3-mm1:
linus.patch
bk-agpgart.patch
bk-alsa.patch
bk-arm.patch
bk-cifs.patch
bk-cpufreq.patch
bk-drm-via.patch
bk-i2c.patch
bk-ieee1394.patch
bk-input.patch
bk-dtor-input.patch
bk-jfs.patch
bk-kbuild.patch
bk-kconfig.patch
bk-netdev.patch
bk-ntfs.patch
bk-scsi.patch
bk-scsi-rc-fixes.patch
bk-serial.patch
bk-usb.patch
bk-watchdog.patch
bk-xfs.patch
External bk trees.
-fix-an-error-in-proc-slabinfo-print.patch
-ibmveth-inlining-failure.patch
-fix-devfs-name-for-the-hvcs-driver.patch
-uml-compile-fixes.patch
-include-jiffies-fix-usecs_to_jiffies-jiffies_to_usecs-math.patch
-credits-update.patch
-nfsd-needs-exportfs.patch
-input-make-mousedevc-report-all-events-to-user-space-immediately.patch
-input-enable-hardware-tapping-for-alps-touchpads.patch
-input-fix-pointer-jumps-to-corner-of-screen-problem-on-alps-glidepoint-touchpads.patch
-input-add-support-for-synaptics-touchpad-scroll-wheels.patch
-driver-model-fix-types-in-usb.patch
-kswapd-throttling-fix.patch
-task_size-is-variable.patch
-use-mm_vm_size-in-exit_mmap.patch
-ppc64-correct-return-code-in-syscall-auditing.patch
-ppc64-show-1-for-physical_id-of-non-present-cpus.patch
-ppc64-replace-last-usage-of-vio-dma-mapping-routines.patch
-speedstep-libc-fix-frequency-multiplier-for-pentium4.patch
-x86_64-parse-noexec=.patch
-force-feedback-support-for-uinput.patch
-pcmcia-dc-initialisation-fix.patch
-scsi-megaraid_mmc-make-some-code-static.patch
-add-map_populate-sys_remap_file_pages-support-to-xfs.patch
-acpi-call-acpi_leave_sleep_state-before-resuming-devices.patch
-small-partitions-msdos-cleanups.patch
-scsi-sim710c-make-some-code-static.patch
Merged
+alpha-add-missing-dma_mapping_error.patch
Alpha build fix
+fix-compat-shmget-overflow.patch
+fix-shmget-for-ppc64-s390-64-sparc64.patch
Various compat code sign extension fixes
+binfmt_elf-clearing-bss-may-fail.patch
Fix the weird elf loading failure
+qlogic-warning-fixes.patch
smp_processor_id() warnings
+oprofile-exittext-referenced-in-inittext.patch
oprofile section fix
+force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
Partly fix up the x86_64 noexec mapping problem.
+oprofile-arm-xscale1-pmu-support-fix.patch
oprofile fix
+nfsd--sgi-921857-find-broken-with-nohide-on-nfsv3.patch
+nfsd--exportfs-reduce-stack-usage.patch
+nfsd--svcrpc-add-a-per-flavor-set_client-method.patch
+nfsd--svcrpc-rename-pg_authenticate.patch
+nfsd--svcrpc-move-export-table-checks-to-a-per-program-pg_add_client-method.patch
+nfsd--nfs4-use-new-pg_set_client-method-to-simplify-nfs4-callback-authentication.patch
+nfsd--lockd-dont-try-to-match-callback-requests-against-export-table.patch
+nfsd--nfsd-remove-pg_authenticate-field.patch
+nfsd--global-static-cleanups-for-nfsd.patch
+nfsd--change-nfsd-reply-cache-to-use-listh-lists.patch
nfsd update
+acpi-fix-containers-notify-handler-to-handle-proper-cases-properly.patch
+acpi_power_off-bug-fix.patch
ACPI fixes
+update-to-ipmi-driver-to-support-old-dmi-spec.patch
+add-the-ipmi-smbus-driver-fix-fix.patch
IPMI fixes
+ohci1394-dma_pool_destroy-while-in_atomic-irqs_disabled.patch
1394 fix
+sbp2-fix-hang-on-unload.patch
Another 1394 fix
+serio-warning-fix.patch
+twidjoy-build-fix.patch
input code fixes
+compat-ioctl-for-submiting-urb.patch
+compat-ioctl-for-submiting-urb-fix.patch
Add compatibility emulation for the USB URB-direct-submission code.
+swapspace-layout-improvements-fix.patch
Fix swapspace-layout-improvements.patch
+fix-small-vmalloc-per-allocation-limit.patch
Make monster vmalloc()s work
+randomisation-global-sysctl-fix.patch
Fix randomisation-global-sysctl.patch
+fix-compilation-of-uml-after-the-stack-randomization-patches.patch
Fix randomisation-infrastructure.patch
+invalidate-range-of-pages-after-direct-io-write-fix-fix.patch
+write-and-wait-on-range-before-direct-io-read.patch
+only-unmap-what-intersects-a-direct_io-op.patch
Various optimisations for page unmapping, mainly related to direct-IO.
+net-s2io-replace-schedule_timeout-with-msleep.patch
cleanup
+ppc-ppc64-abstract-cpu_feature-checks.patch
+ppc32-dont-create-tmp_gas_check.patch
+ppc32-fix-mv64x60-register-relocation-bug-in-bootwrapper.patch
ppc32 updates
+ppc64-remove-unneeded-includes-from-pseries_nvramc.patch
+ppc64-collect-and-export-low-level-cpu-usage-statistics.patch
+ppc64-defconfig-updates.patch
+ppc64-distribute-export_symbols.patch
+ppc64-disable-hmt-for-rs64-cpus.patch
+use-vmlinux-during-make-install-on-ppc64.patch
+ppc64-functions-to-reserve-performance-monitor-hardware.patch
ppc64 updates
+mips-add-tanbac-tb0219-base-board-driver.patch
mips board driver
+refactor-i386-memory-setup.patch
+consolidate-set_max_mapnr_init-implementations.patch
+remove-free_all_bootmem-define.patch
x86 mm cleanups
+out-of-line-x86-put_user-implementation.patch
move x86 put_user() out of line.
+x86_64-hugetlb-fix.patch
hugepage fix
+swsusp-do-not-use-higher-order-memory-allocations-on-suspend-fix.patch
+swsusp-do-not-use-higher-order-memory-allocations-on-suspend-fix-fix.patch
Fix swsusp-do-not-use-higher-order-memory-allocations-on-suspend.patch
+fix-partial-sysrq-setting.patch
Fix a -mm-only sysrq patch
+touch_softlockup_watchdog.patch
+fix-softlockup-warning-in-swsuspend-resume.patch
Enhancements to detect-soft-lockups.patch
+add-struct-request-end_io-callback-fix.patch
Fix add-struct-request-end_io-callback.patch
+add-compiler-gcc4h.patch
cleanup
+rt-lsm.patch
realtime+mlock LSM
+convert-proc-driver-rtc-to-seq_file.patch
RTC driver /proc seqfile conversion
+drivers-char-lpc-race-fix.patch
Fix obscure line printer driver race
+clean-up-and-unify-asm-resourceh-files.patch
Code cleanup
-base-small-shrink-major_names-hash.patch
This broke stuff
+sort-fix.patch
Make the new sort function place things in sorted order.
-inotify.patch
-inotify-fix_find_inode.patch
I think my version is old, and it oopses.
+pcmcia-add-support-ti-pci4510-cardbus-bridge.patch
+pcmcia-update-vrc4171_card.patch
PCMCIA device updates
+nfsv4-deamon-always-supports-acls.patch
NFS ACL Kconfig fix
+add-do_proc_doulonglongvec_minmax-to-sysctl-functions.patch
+add-sysctl-interface-to-sched_domain-parameters.patch
We found the cause of the weird ia64 oops (extable sorting was wrong), so
bring these back.
+crashdump-routines-for-copying-dump-pages-fixes.patch
+crashdump-linear-raw-format-dump-file-access-coding-style.patch
crashdump coding cleanups
+radeonfb-fix-spurious-error-return-in-fbio_radeon_set_mirror.patch
+w100fb-make-blanking-function-interrupt-safe.patch
+kyrofb-copy__user-return-value-checks-added-to-kyro-fb.patch
+skeletonfb-documentation-fixes.patch
+intelfb-add-partial-support-915g-chipset.patch
+sisfb_compat_ioctl-warning-fix.patch
+sis-warning-fix.patch
+tridentfb-warning-fix.patch
fbdev updates and fixes
-raid5-overlapping-read-hack.patch
This got fixed for real
+md-fix-multipath-assembly-bug.patch
+md-raid-kconfig-cleanups-remove-experimental-tag-from-raid-6.patch
MD updates
+device-mapper-store-name-directly-against-device.patch
+device-mapper-record-restore-bio-state.patch
+device-mapper-export-map_info.patch
DM updates
+update-documentation-filesystems-locking.patch
Update the VFS locking doc for quotas
+hpet-setup-comment-fix.patch
+fs-ncpfs-ncplib_kernelc-make-a-function-static.patch
+kill-iphase5526.patch
+fs-nfs-make-some-code-static.patch
+i386-x86_64-acpi-sleepc-kill-unused-acpi_save_state_disk.patch
+smpbootc-cleanups.patch
+i386-kernel-i387c-misc-cleanups.patch
+i386-x86_64-i8259c-make-mask_and_ack_8259a-static.patch
+scsi-sym53c416c-make-a-function-static.patch
+scsi-ultrastorc-make-a-variable-static.patch
+tridentfbc-make-some-code-static.patch
+kernel-intermodulec-make-inter_module_get-static.patch
Various nanofixes
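
The list above appears to follow a simple convention: a leading "-" marks a patch that left the tree since 2.6.11-rc3-mm1 (merged or dropped), a leading "+" marks one that was added, and any other line is a comment on the group of patches before it. A minimal sketch of splitting such lines apart, assuming that reading of the markers is right (classify_changes is a hypothetical helper, not part of any -mm tooling):

```python
def classify_changes(lines):
    """Split changelog lines into added patches, removed patches, and notes.

    Assumption: '+name.patch' means added and '-name.patch' means removed;
    everything else is treated as a free-form comment.
    """
    added, removed, notes = [], [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("+") and line.endswith(".patch"):
            added.append(line[1:])
        elif line.startswith("-") and line.endswith(".patch"):
            removed.append(line[1:])
        else:
            notes.append(line)
    return added, removed, notes


# A few lines copied from the list above.
sample = [
    "-inotify.patch",
    "-inotify-fix_find_inode.patch",
    "I think my version is old, and it oopses.",
    "+sort-fix.patch",
    "Make the new sort function place things in sorted order.",
]
added, removed, notes = classify_changes(sample)
print("added:", added)
print("removed:", removed)
```

Requiring the ".patch" suffix keeps ordinary bullet text that starts with "- " from being mistaken for a removed patch.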
number of patches in -mm: 585
number of changesets in external trees: 634
number of patches in -mm only: 564
total patches: 1198
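
The totals above can be cross-checked with a little arithmetic. The sketch below assumes, since the announcement does not say so explicitly, that "total patches" is the -mm-only count plus the external changesets, and that the gap between the two -mm counts corresponds to series entries that only pull in external trees:

```python
# Figures copied from the summary above.
patches_in_mm = 585
external_changesets = 634
patches_in_mm_only = 564

# Assumed derivation of the total: -mm-only patches plus external changesets.
total = patches_in_mm_only + external_changesets

# Entries in the series that are not -mm-only patches.
external_entries = patches_in_mm - patches_in_mm_only

print("total patches:", total)
print("entries for external trees:", external_entries)
```

Under these assumptions the sum reproduces the stated total, and the difference matches the 21 bk-*.patch entries listed under "External bk trees".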
All 585 patches:
linus.patch
alpha-add-missing-dma_mapping_error.patch
alpha: add missing dma_mapping_error
fix-compat-shmget-overflow.patch
Fix compat shmget overflow
fix-shmget-for-ppc64-s390-64-sparc64.patch
Fix shmget for ppc64, s390-64 & sparc64.
binfmt_elf-clearing-bss-may-fail.patch
binfmt_elf: clearing bss may fail
qlogic-warning-fixes.patch
more qlogic smp_processor_id() warning fixes
oprofile-exittext-referenced-in-inittext.patch
OProfile: exit.text referenced in init.text
force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
Force read implies exec for all 32bit processes in x86-64
oprofile-arm-xscale1-pmu-support-fix.patch
OProfile: ARM/XScale1 PMU support fix
nfsd--sgi-921857-find-broken-with-nohide-on-nfsv3.patch
SGI 921857: find broken with nohide on NFSv3
nfsd--exportfs-reduce-stack-usage.patch
nfsd: exportfs: reduce stack usage
nfsd--svcrpc-add-a-per-flavor-set_client-method.patch
nfsd: svcrpc: add a per-flavor set_client method
nfsd--svcrpc-rename-pg_authenticate.patch
nfsd: svcrpc: rename pg_authenticate
nfsd--svcrpc-move-export-table-checks-to-a-per-program-pg_add_client-method.patch
nfsd: svcrpc: move export table checks to a per-program pg_add_client method
nfsd--nfs4-use-new-pg_set_client-method-to-simplify-nfs4-callback-authentication.patch
nfsd: nfs4: use new pg_set_client method to simplify nfs4 callback authentication
nfsd--lockd-dont-try-to-match-callback-requests-against-export-table.patch
nfsd: lockd: don't try to match callback requests against export table
nfsd--nfsd-remove-pg_authenticate-field.patch
nfsd: nfsd: remove pg_authenticate field
nfsd--global-static-cleanups-for-nfsd.patch
nfsd: global/static cleanups for nfsd
nfsd--change-nfsd-reply-cache-to-use-listh-lists.patch
nfsd: change nfsd reply cache to use list.h lists
ia64-config_apci_numa-fix.patch
ia64 CONFIG_APCI_NUMA fix
ia64-acpi-build-fix.patch
ia64 acpi build fix
add-try_acquire_console_sem.patch
Add try_acquire_console_sem
update-aty128fb-sleep-wakeup-code-for-new-powermac-changes.patch
update aty128fb sleep/wakeup code for new powermac changes
radeonfb-update.patch
radeonfb update
radeonfb-build-fix.patch
radeonfb-build-fix
acpi-sleep-while-atomic-during-s3-resume-from-ram.patch
acpi: sleep-while-atomic during S3 resume from ram
acpi-report-errors-in-fanc.patch
ACPI: report errors in fan.c
acpi-flush-tlb-when-pagetable-changed.patch
acpi: flush TLB when pagetable changed
fix-an-issue-in-acpi-processor-and-container-drivers-related-with-kobject_hotplug.patch
Fix an issue in ACPI processor and container drivers related with kobject_hotplug()
acpi-fix-containers-notify-handler-to-handle-proper-cases-properly.patch
acpi: fix container's notify handler to handle proper cases properly
acpi_power_off-bug-fix.patch
acpi_power_off bug fix
bk-agpgart.patch
bk-alsa.patch
fix-32-bit-calls-to-snd_pcm_channel_info.patch
Fix 32-bit calls to snd_pcm_channel_info()
bk-arm.patch
bk-cifs.patch
bk-cpufreq.patch
cpufreq-core-reduce-warning-messages.patch
cpufreq-core: reduce warning messages
bk-drm-via.patch
bk-i2c.patch
changes-to-the-i2c-driver-to-support-a-non-blocking-interface.patch
Changes to the I2C driver to support a non-blocking interface
minor-ipmi-enhancements.patch
Minor IPMI enhancements
update-to-ipmi-driver-to-support-old-dmi-spec.patch
Update to IPMI driver to support old DMI spec
modify-the-i801-i2c-driver-to-use-the-non-blocking-interface.patch
Modify the i801 I2C driver to use the non-blocking interface.
add-the-ipmi-smbus-driver.patch
Add the IPMI SMBus driver
add-the-ipmi-smbus-driver-fix.patch
ipmi-build-fix-42
add-the-ipmi-smbus-driver-fix-fix.patch
add-the-ipmi-smbus-driver-fix fix
bk-ieee1394.patch
ohci1394-dma_pool_destroy-while-in_atomic-irqs_disabled.patch
ohci1394: dma_pool_destroy while in_atomic() && irqs_disabled()
ohci1394-dma_pool_destroy-while-in_atomic-irqs_disabled-tidy
ohci1394-dma_pool_destroy-while-in_atomic-irqs_disabled-simplification
sbp2-fix-hang-on-unload.patch
sbp2: fix hang on unload
bk-input.patch
bk-dtor-input.patch
serio-warning-fix.patch
serio warning fix
twidjoy-build-fix.patch
twidjoy-build-fix
bk-jfs.patch
bk-kbuild.patch
bk-kconfig.patch
bk-netdev.patch
bk-ntfs.patch
bk-scsi.patch
bk-scsi-rc-fixes.patch
bk-serial.patch
bk-usb.patch
compat-ioctl-for-submiting-urb.patch
compat ioctl for submitting URB
compat-ioctl-for-submiting-urb-fix.patch
compat-ioctl-for-submiting-urb-fix
bk-watchdog.patch
bk-xfs.patch
mm.patch
add -mmN to EXTRAVERSION
vm-pageout-throttling.patch
vm: pageout throttling
orphaned-pagecache-memleak-fix.patch
orphaned pagecache memleak fix
swapspace-layout-improvements.patch
swapspace-layout-improvements
swapspace-layout-improvements-fix.patch
/proc/swaps negative Used
simpler-topdown-mmap-layout-allocator.patch
simpler topdown mmap layout allocator
vmscan-reclaim-swap_cluster_max-pages-in-a-single-pass.patch
vmscan: reclaim SWAP_CLUSTER_MAX pages in a single pass
fix-mincore-cornercases-overflow-caused-by-large-len.patch
Fix mincore cornercases: overflow caused by large "len"
fix-small-vmalloc-per-allocation-limit.patch
Fix small vmalloc per allocation limit
randomisation-global-sysctl.patch
Randomisation: global sysctl
randomisation-global-sysctl-fix.patch
randomisation-global-sysctl-fix
randomisation-infrastructure.patch
Randomisation: infrastructure
fix-compilation-of-uml-after-the-stack-randomization-patches.patch
Fix compilation of UML after the stack-randomization patches
randomisation-add-pf_randomize.patch
Randomisation: add PF_RANDOMIZE
randomisation-stack-randomisation.patch
Randomisation: stack randomisation
randomisation-mmap-randomisation.patch
Randomisation: mmap randomisation
randomisation-enable-by-default.patch
Randomisation: enable by default
randomisation-addr_no_randomize-personality.patch
Randomisation: add ADDR_NO_RANDOMIZE personality
randomisation-top-of-stack-randomization.patch
Randomisation: top-of-stack randomization
move-accounting-function-calls-out-of-critical-vm-code-pathspatch.patch
Move accounting function calls out of critical vm code paths
invalidate-range-of-pages-after-direct-io-write.patch
invalidate range of pages after direct IO write
invalidate-range-of-pages-after-direct-io-write-fix.patch
invalidate-range-of-pages-after-direct-io-write-fix
invalidate-range-of-pages-after-direct-io-write-fix-fix.patch
invalidate-range-of-pages-after-direct-io-write-fix-fix
write-and-wait-on-range-before-direct-io-read.patch
write and wait on range before direct io read
only-unmap-what-intersects-a-direct_io-op.patch
only unmap what intersects a direct_IO op
make-tree_lock-an-rwlock.patch
make mapping->tree_lock an rwlock
must-fix.patch
must fix lists update
must fix list update
mustfix update
must-fix update
mustfix lists
b44-bounce-buffer-fix.patch
b44 bounce buffering fix
net-s2io-replace-schedule_timeout-with-msleep.patch
net/s2io: replace schedule_timeout() with msleep()
ppc-ppc64-abstract-cpu_feature-checks.patch
PPC/PPC64: Abstract cpu_feature checks.
ppc32-dont-create-tmp_gas_check.patch
ppc32: Don't create .tmp_gas_check
ppc32-fix-mv64x60-register-relocation-bug-in-bootwrapper.patch
ppc32: fix mv64x60 register relocation bug in bootwrapper
ppc64-remove-unneeded-includes-from-pseries_nvramc.patch
remove unneeded includes from pSeries_nvram.c
ppc64-collect-and-export-low-level-cpu-usage-statistics.patch
ppc64: collect and export low-level cpu usage statistics
ppc64-move-systemcfg-out-of-heads.patch
ppc64: Move systemcfg out of head.S
ppc64-defconfig-updates.patch
ppc64: defconfig updates
ppc64-distribute-export_symbols.patch
ppc64: distribute EXPORT_SYMBOLs
ppc64-implement-a-vdso-and-use-it-for-signal-trampoline.patch
ppc64: Implement a vDSO and use it for signal trampoline
ppc64-generic-hotplug-cpu-support.patch
ppc64: generic hotplug cpu support
ppc64-disable-hmt-for-rs64-cpus.patch
ppc64: disable HMT for RS64 cpus
use-vmlinux-during-make-install-on-ppc64.patch
ppc64: use vmlinux during make install on ppc64
ppc64-functions-to-reserve-performance-monitor-hardware.patch
ppc64: functions to reserve performance monitor hardware
ppc64-reloc_hide.patch
agpgart-allow-multiple-backends-to-be-initialized.patch
agpgart: allow multiple backends to be initialized
agpgart-allow-multiple-backends-to-be-initialized fix
agpgart: add bridge assignment missed in agp_allocate_memory
x86_64 agp failure fix
agpgart-allow-multiple-backends-to-be-initialized-fix.patch
agpgart-allow-multiple-backends-to-be-initialized-fix
agpgart-add-agp_find_bridge-function.patch
agpgart: add agp_find_bridge function
agpgart-allow-drivers-to-allocate-memory-local-to.patch
agpgart: allow drivers to allocate memory local to the bridge
drm-add-support-for-new-multiple-agp-bridge-agpgart-api.patch
drm: add support for new multiple agp bridge agpgart api
fb-add-support-for-new-multiple-agp-bridge-agpgart-api.patch
fb: add support for new multiple agp bridge agpgart api
agpgart-add-bridge-parameter-to-driver-functions.patch
agpgart: add bridge parameter to driver functions
mips-add-tanbac-tb0219-base-board-driver.patch
mips: add TANBAC TB0219 base board driver
allow-hot-add-enabled-i386-numa-box-to-boot.patch
Allow hot-add enabled i386 NUMA box to boot
refactor-i386-memory-setup.patch
x86: refactor memory setup
consolidate-set_max_mapnr_init-implementations.patch
x86: consolidate set_max_mapnr_init() implementations
remove-free_all_bootmem-define.patch
x86: remove free_all_bootmem() #define
out-of-line-x86-put_user-implementation.patch
out-of-line x86 "put_user()" implementation
x86_64-hugetlb-fix.patch
x86_64: hugetlb fix
xen-vmm-4-add-ptep_establish_new-to-make-va-available.patch
Xen VMM #4: add ptep_establish_new to make va available
xen-vmm-4-return-code-for-arch_free_page.patch
Xen VMM #4: return code for arch_free_page
xen-vmm-4-return-code-for-arch_free_page-fix.patch
Get rid of arch_free_page() warning
xen-vmm-4-runtime-disable-of-vt-console.patch
Xen VMM #4: runtime disable of VT console
xen-vmm-4-has_arch_dev_mem.patch
Xen VMM #4: HAS_ARCH_DEV_MEM
xen-vmm-4-split-free_irq-into-teardown_irq.patch
Xen VMM #4: split free_irq into teardown_irq
swsusp-do-not-use-higher-order-memory-allocations-on-suspend.patch
swsusp: do not use higher order memory allocations on suspend
swsusp-do-not-use-higher-order-memory-allocations-on-suspend-fix.patch
swsusp-do-not-use-higher-order-memory-allocations-on-suspend fix
swsusp-do-not-use-higher-order-memory-allocations-on-suspend-fix-fix.patch
swsusp-do-not-use-higher-order-memory-allocations-on-suspend fix fix
make-sysrq-f-call-oom_kill.patch
make sysrq-F call oom_kill()
allow-admin-to-enable-only-some-of-the-magic-sysrq-functions.patch
Allow admin to enable only some of the Magic-Sysrq functions
fix-partial-sysrq-setting.patch
Fix partial sysrq setting
sort-out-pci_rom_address_enable-vs-ioresource_rom_enable.patch
Sort out PCI_ROM_ADDRESS_ENABLE vs IORESOURCE_ROM_ENABLE
irqpoll.patch
irqpoll
poll-mini-optimisations.patch
poll: mini optimisations
mtrr-size-and-base-debug.patch
mtrr size-and-base debugging
cleanup-vc-array-access.patch
cleanup vc array access
remove-console_macrosh.patch
remove console_macros.h
merge-vt_struct-into-vc_data.patch
merge vt_struct into vc_data
merge-vt_struct-into-vc_data-fix.patch
merge-vt_struct-into-vc_data fix
jbd-journal-overflow-fix-2.patch
jbd: journal overflow fix #2
jbd-fix-against-journal-overflow.patch
JBD: reduce stack and number of journal descriptors
jbd-fix-against-journal-overflow-tidies.patch
jbd-fix-against-journal-overflow-tidies
jbd-log-space-management-optimization.patch
JBD: log space management optimization
factor-out-phase-6-of-journal_commit_transaction.patch
Factor out phase 6 of journal_commit_transaction
ext3-cleanup-1.patch
ext3 cleanup 1
ext3-free-block-accounting-fix.patch
ext3: free block accounting fix
ext3_test_root-speedup.patch
ext3_test_root() speedup
i4l-new-hfc_usb-driver-version.patch
i4l: new hfc_usb driver version
i4l-hfc-4s-and-hfc-8s-driver.patch
i4l: HFC-4S and HFC-8S driver
fix-race-between-the-nmi-code-and-the-cmos-clock.patch
Fix race between the NMI code and the CMOS clock
cant-unmount-bad-inode.patch
Can't unmount bad inode
iounmap-debugging.patch
iounmap debugging
fix-put_user-under-mmap_sem-in-sys_get_mempolicy.patch
fix put_user under mmap_sem in sys_get_mempolicy()
oss-support-for-ac97-low-power-codecs.patch
OSS Support for AC97 low power codecs
fix-kallsyms-insmod-rmmod-race.patch
Fix kallsyms/insmod/rmmod race
fix-kallsyms-insmod-rmmod-race-fix.patch
fix-kallsyms-insmod-rmmod-race fix
fix-kallsyms-insmod-rmmod-race-fix-fix.patch
fix-kallsyms-insmod-rmmod-race-fix-fix
d_drop-should-use-per-dentry-lock.patch
d_drop should use per dentry lock
detect-soft-lockups.patch
detect soft lockups
touch_softlockup_watchdog.patch
touch_softlockup_watchdog()
fix-softlockup-warning-in-swsuspend-resume.patch
fix softlockup warning in swsuspend resume
serialize-access-to-ide-devices.patch
serialize access to ide devices
add-struct-request-end_io-callback.patch
Add struct request end_io callback
add-struct-request-end_io-callback-fix.patch
add-struct-request-end_io-callback fix
rework-core-barrier-support.patch
rework core barrier support
scsi_io_completion-sense-copy.patch
scsi_io_completion sense copy
blk_execute_rq-oops-on-fast-completion.patch
blk_execute_rq() oops on fast completion
nls_cp936c-is-not-synchronized-with-ms-translation-table.patch
nls_cp936.c is not synchronized with M$'s translation table
annotate-proc-pid-maps-with--markers.patch
annotate /proc/<PID>/maps with [heap]/[stack]/[vdso] markers
serial-add-nec-vr4100-series-serial-support.patch
serial: add NEC VR4100 series serial support
sys_setpriority-euid-semantics-fix.patch
sys_setpriority() euid semantics fix
add-tcsbrkp-to-compat_ioctlh.patch
add TCSBRKP to compat_ioctl.h
areca-raid-linux-scsi-driver.patch
ARECA RAID Linux scsi driver
add-local-bio-pool-support-and-modify-dm.patch
add local bio pool support and modify dm
add-local-bio-pool-support-and-modify-dm-uninline-zero_fill_bio.patch
uninline-zero_fill_bio
minor-conceptual-fix-for-proc-kcore-header-size.patch
minor conceptual fix for /proc/kcore header size
floppy-add-sysfs-symlink.patch
floppy.c: add sysfs symlink
add-compiler-gcc4h.patch
add compiler-gcc4.h
rt-lsm.patch
RT-LSM
convert-proc-driver-rtc-to-seq_file.patch
convert /proc/driver/rtc to seq_file.
drivers-char-lpc-race-fix.patch
drivers/char/lp.c race fix
clean-up-and-unify-asm-resourceh-files.patch
clean up and unify asm-*/resource.h files
base-small-introduce-the-config_base_small-flag.patch
base-small: introduce the CONFIG_BASE_SMALL flag
base-small-shrink-chrdevs-hash.patch
base-small: shrink chrdevs hash
base-small-shrink-pid-tables.patch
base-small: shrink PID tables
base-small-shrink-uid-hash.patch
base-small: shrink UID hash
base-small-shrink-futex-queues.patch
base-small: shrink futex queues
base-small-shrink-timer-hashes.patch
base-small: shrink timer hashes
base-small-shrink-console-buffer.patch
base-small: shrink console buffer
lib-sort-heapsort-implementation-of-sort.patch
lib/sort: Heapsort implementation of sort()
sort-fix.patch
sort fix
sort-export.patch
sort export
sort-build-fix.patch
sort build fix
lib-sort-turn-off-self-test.patch
lib/sort: turn off self-test
lib-sort-replace-qsort-in-xfs.patch
lib/sort: Replace qsort in XFS
lib-sort-replace-insertion-sort-in-exception-tables.patch
lib/sort: Replace insertion sort in exception tables
lib-sort-replace-insertion-sort-in-ia64-exception-tables.patch
lib/sort: Replace insertion sort in IA64 exception tables
lib-sort-use-generic-sort-on-x86_64.patch
lib/sort: Use generic sort on x86_64
random-pt2-cleanup-waitqueue-logic-fix-missed-wakeup.patch
random: cleanup waitqueue logic, fix missed wakeup
random-pt2-kill-pool-clearing.patch
random: kill pool clearing
random-pt2-combine-legacy-ioctls.patch
random: combine legacy ioctls
random-pt2-re-init-all-pools-on-zero.patch
random: re-init all pools on zero
random-pt2-simplify-initialization.patch
random: simplify initialization
random-pt2-kill-memsets-of-static-data.patch
random: kill memsets of static data
random-pt2-kill-dead-extract_state-struct.patch
random: kill dead extract_state struct
random-pt2-kill-22-compat-waitqueue-defs.patch
random: kill 2.2 compat waitqueue defs
random-pt2-kill-redundant-rotate_left-definitions.patch
random: kill redundant rotate_left definitions
random-pt2-kill-redundant-rotate_left-definitions-fix.patch
rol32 thinko
random-pt2-kill-misnamed-log2.patch
random: kill misnamed log2
random-pt3-more-meaningful-pool-names.patch
random: More meaningful pool names
random-pt3-static-allocation-of-pools.patch
random: Static allocation of pools
random-pt3-static-sysctl-bits.patch
random: Static sysctl bits
random-pt3-catastrophic-reseed-checks.patch
random: Catastrophic reseed checks
random-pt3-entropy-reservation-accounting.patch
random: Entropy reservation accounting
random-pt3-reservation-flag-in-pool-struct.patch
random: Reservation flag in pool struct
random-pt3-reseed-pointer-in-pool-struct.patch
random: Reseed pointer in pool struct
random-pt3-break-up-extract_user.patch
random: Break up extract_user
random-pt3-remove-dead-md5-copy.patch
random: Remove dead MD5 copy
random-pt3-simplify-hash-folding.patch
random: Simplify hash folding
random-pt3-clean-up-hash-buffering.patch
random: Clean up hash buffering
random-pt3-remove-entropy-batching.patch
random: Remove entropy batching
random-pt4-create-new-rol32-ror32-bitops.patch
random: Create new rol32/ror32 bitops
random-pt4-use-them-throughout-the-tree.patch
random: Use them throughout the tree
random-pt4-kill-the-sha-variants.patch
random: Kill the SHA variants
random-pt4-cleanup-sha-interface.patch
random: Cleanup SHA interface
random-pt4-move-sha-code-to-lib.patch
random: Move SHA code to lib/
random-pt4-replace-sha-with-faster-version.patch
random: Replace SHA with faster version
random-pt4-replace-sha-with-faster-version-fix.patch
random-pt4-replace-sha-with-faster-version-fix
random-pt4-replace-sha-with-faster-version-fix-fix.patch
SHA1 clarify kerneldoc
random-pt4-replace-sha-with-faster-version-fix-fix-fix.patch
random-pt4-cleanup-sha-interface fix
random-pt4-update-cryptolib-to-use-sha-fro-lib.patch
random: Update cryptolib to use SHA from lib
random-pt4-move-halfmd4-to-lib.patch
random: Move halfmd4 to lib
random-pt4-kill-duplicate-halfmd4-in-ext3-htree.patch
random: Kill duplicate halfmd4 in ext3 htree
random-pt4-kill-duplicate-halfmd4-in-ext3-htree-fix.patch
random-pt4-kill-duplicate-halfmd4-in-ext3-htree-fix
random-pt4-simplify-and-shrink-syncookie-code.patch
random: Simplify and shrink syncookie code
random-pt4-move-syncookies-to-net.patch
random: Move syncookies to net/
speedup-proc-pid-maps.patch
Speed up /proc/pid/maps
speedup-proc-pid-maps-fix.patch
Speed up /proc/pid/maps fix
speedup-proc-pid-maps-fix-fix.patch
speedup-proc-pid-maps fix fix
speedup-proc-pid-maps-fix-fix-fix.patch
speedup /proc/<pid>/maps(4th version)
fix-loss-of-records-on-size-4096-in-proc-pid-maps.patch
fix loss of records on size > 4096 in /proc/<pid>/maps
speedup-proc-pid-maps-fix-fix-fix-fix.patch
speedup-proc-pid-maps-fix-fix-fix fix
posix-timers-tidy-up-clock-interfaces-and-consolidate-dispatch-logic.patch
posix-timers: tidy up clock interfaces and consolidate dispatch logic
posix-timers-high-resolution-cpu-clocks-for-posix-clock_-syscalls.patch
posix-timers: high-resolution CPU clocks for POSIX clock_* syscalls
posix-timers-tidy-up-clock-interfaces-and-consolidate-dispatch-logic-cleanup.patch
posix-timers: tidy up clock interfaces and consolidate dispatch logic cleanup
posix-timers-fix-posix-timers-signals-lock-order.patch
posix-timers: fix posix-timers signals lock order
posix-timers-cpu-clock-support-for-posix-timers.patch
posix-timers: CPU clock support for POSIX timers
posix-timers-cpu-clock-support-for-posix-timers-fix.patch
posix-timers: CPU clock support for POSIX timers (fix)
panic-in-check_process_timers.patch
PANIC in check_process_timers()
make-itimer_real-per-process.patch
make ITIMER_REAL per-process
make-itimer_prof-itimer_virtual-per-process.patch
make ITIMER_PROF, ITIMER_VIRTUAL per-process
make-rlimit_cpu-sigxcpu-per-process.patch
make RLIMIT_CPU/SIGXCPU per-process
pcmcia-add-support-ti-pci4510-cardbus-bridge.patch
pcmcia: Add support TI PCI4510 CardBus bridge
pcmcia-update-vrc4171_card.patch
pcmcia: update vrc4171_card
nfs-fix_vfsflock.patch
VFS: Fix structure initialization in locks_remove_flock()
nfs-flock.patch
NFS: Add emulation of BSD flock() in terms of POSIX locks on the server
nfsacl-return-enosys-for-rpc-programs-that-are-unavailable.patch
nfsacl: return -ENOSYS for RPC programs that are unavailable
nfsacl-add-missing-eopnotsupp-=-nfs3err_notsupp-mapping-in-nfsd.patch
nfsacl: add missing -EOPNOTSUPP => NFS3ERR_NOTSUPP mapping in nfsd
nfsacl-allow-multiple-programs-to-listen-on-the-same-port.patch
nfsacl: allow multiple programs to listen on the same port
nfsacl-allow-multiple-programs-to-share-the-same-transport.patch
nfsacl: allow multiple programs to share the same transport
nfsacl-lazy-rpc-receive-buffer-allocation.patch
nfsacl: lazy RPC receive buffer allocation
nfsacl-encode-and-decode-arbitrary-xdr-arrays.patch
nfsacl: encode and decode arbitrary XDR arrays
nfsacl-encode-and-decode-arbitrary-xdr-arrays-fix.patch
nfsacl-encode-and-decode-arbitrary-xdr-arrays-fix
nfsacl-add-noacl-nfs-mount-option.patch
nfsacl: add noacl nfs mount option
nfsacl-infrastructure-and-server-side-of-nfsacl.patch
nfsacl: infrastructure and server side of nfsacl
nfsv4-deamon-always-supports-acls.patch
NFSv4 daemon always supports ACLs
lib-sort-replace-qsort-in-nfs-acl-code.patch
lib/sort: Replace qsort in NFS ACL code
nfsacl-infrastructure-and-server-side-of-nfsacl-fix.patch
nfsacl-infrastructure-and-server-side-of-nfsacl fix
nfsacl-solaris-nfsacl-workaround.patch
nfsacl: solaris nfsacl workaround
nfsacl-client-side-of-nfsacl.patch
nfsacl: client side of nfsacl
nfsacl-client-side-of-nfsacl-fix.patch
nfsacl: Must not initialize inode->i_op to NULL
nfsacl-acl-umask-handling-workaround-in-nfs-client.patch
nfsacl: ACL umask handling workaround in nfs client
nfsacl-acl-umask-handling-workaround-in-nfs-client-fix.patch
ACL umask handling workaround in nfs client fix
nfsacl-cache-acls-on-the-nfs-client-side.patch
nfsacl: cache acls on the nfs client side
nfs-acl-build-fix-posix-acl-config-tidy.patch
NFS ACL build fix, POSIX ACL config tidy
kgdb-ga.patch
kgdb stub for ia32 (George Anzinger's one)
kgdb: warning fix
kgdb buffer overflow fix
kgdb: warning fix
kgdb: CONFIG_DEBUG_INFO fix
x86_64 fixes
correct kgdb.txt Documentation link (against 2.6.1-rc1-mm2)
kgdb: fix for recent gcc
kgdb warning fixes
THREAD_SIZE fixes for kgdb
Fix stack overflow test for non-8k stacks
kgdb-ga.patch fix for i386 single-step into sysenter
fix TRAP_BAD_SYSCALL_EXITS on i386
add TRAP_BAD_SYSCALL_EXITS config for i386
kgdb-is-incompatible-with-kprobes
kgdb-ga-build-fix
kgdb-ga-fixes
kgdb: kill off highmem_start_page
kgdboe-netpoll.patch
kgdb-over-ethernet via netpoll
kgdboe: fix configuration of MAC address
kgdb-x86_64-support.patch
kgdb-x86_64-support.patch for 2.6.2-rc1-mm3
kgdb-x86_64-warning-fixes
kgdb-x86_64-fix
kgdb-x86_64-serial-fix
kprobes exception notifier fix
dev-mem-restriction-patch.patch
/dev/mem restriction patch
dev-mem-restriction-patch-allow-reads.patch
dev-mem-restriction-patch: allow reads
journal_add_journal_head-debug.patch
journal_add_journal_head-debug
list_del-debug.patch
list_del debug check
page-owner-tracking-leak-detector.patch
Page owner tracking leak detector
make-page_owner-handle-non-contiguous-page-ranges.patch
make page_owner handle non-contiguous page ranges
unplug-can-sleep.patch
unplug functions can sleep
firestream-warnings.patch
firestream warnings
perfctr-core.patch
perfctr: core
perfctr: remove bogus perfctr_sample_thread() calls
perfctr-i386.patch
perfctr: i386
perfctr-x86-core-updates.patch
perfctr x86 core updates
perfctr-x86-driver-updates.patch
perfctr x86 driver updates
perfctr-x86-driver-cleanup.patch
perfctr: x86 driver cleanup
perfctr-prescott-fix.patch
Prescott fix for perfctr
perfctr-x86-update-2.patch
perfctr x86 update 2
perfctr-x86_64.patch
perfctr: x86_64
perfctr-x86_64-core-updates.patch
perfctr x86_64 core updates
perfctr-ppc.patch
perfctr: PowerPC
perfctr-ppc32-driver-update.patch
perfctr: ppc32 driver update
perfctr-ppc32-mmcr0-handling-fixes.patch
perfctr ppc32 MMCR0 handling fixes
perfctr-ppc32-update.patch
perfctr ppc32 update
perfctr-ppc32-update-2.patch
perfctr ppc32 update
perfctr-virtualised-counters.patch
perfctr: virtualised counters
perfctr-remap_page_range-fix.patch
virtual-perfctr-illegal-sleep.patch
virtual perfctr illegal sleep
make-perfctr_virtual-default-in-kconfig-match-recommendation.patch
Make PERFCTR_VIRTUAL default in Kconfig match recommendation in help text
perfctr-ifdef-cleanup.patch
perfctr ifdef cleanup
perfctr-update-2-6-kconfig-related-updates.patch
perfctr: Kconfig-related updates
perfctr-virtual-updates.patch
perfctr virtual updates
perfctr-virtual-cleanup.patch
perfctr: virtual cleanup
perfctr-ppc32-preliminary-interrupt-support.patch
perfctr ppc32 preliminary interrupt support
perfctr-update-5-6-reduce-stack-usage.patch
perfctr: reduce stack usage
perfctr-interrupt-support-kconfig-fix.patch
perfctr interrupt_support Kconfig fix
perfctr-low-level-documentation.patch
perfctr low-level documentation
perfctr-inheritance-1-3-driver-updates.patch
perfctr inheritance: driver updates
perfctr-inheritance-2-3-kernel-updates.patch
perfctr inheritance: kernel updates
perfctr-inheritance-3-3-documentation-updates.patch
perfctr inheritance: documentation updates
perfctr-inheritance-locking-fix.patch
perfctr inheritance locking fix
perfctr-api-changes-first-step.patch
perfctr API changes: first step
perfctr-virtual-update.patch
perfctr virtual update
perfctr-x86-64-ia32-emulation-fix.patch
perfctr x86-64 ia32 emulation fix
perfctr-sysfs-update-1-4-core.patch
perfctr sysfs update: core
perfctr-sysfs-update.patch
Perfctr sysfs update
perfctr-sysfs-update-2-4-x86.patch
perfctr sysfs update: x86
perfctr-sysfs-update-3-4-x86-64.patch
perfctr sysfs update: x86-64
perfctr: syscall numbers in x86-64 ia32-emulation
perfctr x86_64 native syscall numbers fix
perfctr-sysfs-update-4-4-ppc32.patch
perfctr sysfs update: ppc32
add-do_proc_doulonglongvec_minmax-to-sysctl-functions.patch
Add do_proc_doulonglongvec_minmax to sysctl functions
add-do_proc_doulonglongvec_minmax-to-sysctl-functions-fix
add-do_proc_doulonglongvec_minmax-to-sysctl-functions fix 2
add-sysctl-interface-to-sched_domain-parameters.patch
Add sysctl interface to sched_domain parameters
allow-modular-ide-pnp.patch
allow modular ide-pnp
allow-x86_64-to-reenable-interrupts-on-contention.patch
Allow x86_64 to reenable interrupts on contention
i386-cpu-hotplug-updated-for-mm.patch
i386 CPU hotplug updated for -mm
ppc64-fix-cpu-hotplug.patch
ppc64: fix hotplug cpu
disable-atykb-warning.patch
disable atykb "too many keys pressed" warning
export-file_ra_state_init-again.patch
Export file_ra_state_init() again
cachefs-filesystem.patch
CacheFS filesystem
numa-policies-for-file-mappings-mpol_mf_move-cachefs.patch
numa-policies-for-file-mappings-mpol_mf_move for cachefs
cachefs-release-search-records-lest-they-return-to-haunt-us.patch
CacheFS: release search records lest they return to haunt us
fix-64-bit-problems-in-cachefs.patch
Fix 64-bit problems in cachefs
cachefs-fixed-typos-that-cause-wrong-pointer-to-be-kunmapped.patch
cachefs: fixed typos that cause wrong pointer to be kunmapped
cachefs-return-the-right-error-upon-invalid-mount.patch
CacheFS: return the right error upon invalid mount
fix-cachefs-barrier-handling-and-other-kernel-discrepancies.patch
Fix CacheFS barrier handling and other kernel discrepancies
remove-error-from-linux-cachefsh.patch
Remove #error from linux/cachefs.h
cachefs-warning-fix-2.patch
cachefs warning fix 2
cachefs-linkage-fix-2.patch
cachefs linkage fix
cachefs-build-fix.patch
cachefs build fix
cachefs-documentation.patch
CacheFS documentation
add-page-becoming-writable-notification.patch
Add page becoming writable notification
add-page-becoming-writable-notification-fix.patch
do_wp_page_mk_pte_writable() fix
add-page-becoming-writable-notification-build-fix.patch
add-page-becoming-writable-notification build fix
provide-a-filesystem-specific-syncable-page-bit.patch
Provide a filesystem-specific sync'able page bit
provide-a-filesystem-specific-syncable-page-bit-fix.patch
provide-a-filesystem-specific-syncable-page-bit-fix
provide-a-filesystem-specific-syncable-page-bit-fix-2.patch
provide-a-filesystem-specific-syncable-page-bit-fix-2
make-afs-use-cachefs.patch
Make AFS use CacheFS
afs-cachefs-dependency-fix.patch
afs-cachefs-dependency-fix
split-general-cache-manager-from-cachefs.patch
Split general cache manager from CacheFS
turn-cachefs-into-a-cache-backend.patch
Turn CacheFS into a cache backend
rework-the-cachefs-documentation-to-reflect-fs-cache-split.patch
Rework the CacheFS documentation to reflect FS-Cache split
update-afs-client-to-reflect-cachefs-split.patch
Update AFS client to reflect CacheFS split
x86-rename-apic_mode_exint.patch
kexec: x86: rename APIC_MODE_EXINT
x86-local-apic-fix.patch
kexec: x86: local apic fix
x86_64-e820-64bit.patch
kexec: x86_64: e820 64bit fix
x86-i8259-shutdown.patch
kexec: x86: i8259 shutdown: disable interrupts
x86_64-i8259-shutdown.patch
kexec: x86_64: add i8259 shutdown method
x86-apic-virtwire-on-shutdown.patch
kexec: x86: restore apic virtual wire mode on shutdown
x86_64-apic-virtwire-on-shutdown.patch
kexec: x86_64: restore apic virtual wire mode on shutdown
vmlinux-fix-physical-addrs.patch
kexec: vmlinux: fix physical addresses
x86-vmlinux-fix-physical-addrs.patch
kexec: x86: vmlinux: fix physical addresses
x86_64-vmlinux-fix-physical-addrs.patch
kexec: x86_64: vmlinux: fix physical addresses
x86_64-entry64.patch
kexec: x86_64: add 64-bit entry
x86-config-kernel-start.patch
kexec: x86: add CONFIG_PHYSICAL_START
x86_64-config-kernel-start.patch
kexec: x86_64: add CONFIG_PHYSICAL_START
kexec-kexec-generic.patch
kexec: add kexec syscalls
kexec-kexec-generic-kexec-use-unsigned-bitfield.patch
kexec: use unsigned bitfield
x86-machine_shutdown.patch
kexec: x86: factor out apic shutdown code
x86-kexec.patch
kexec: x86 kexec core
x86-crashkernel.patch
crashdump: x86 crashkernel option
x86_64-machine_shutdown.patch
kexec: x86_64: factor out apic shutdown code
x86_64-kexec.patch
kexec: x86_64 kexec implementation
x86_64-crashkernel.patch
crashdump: x86_64: crashkernel option
kexec-ppc-support.patch
kexec: kexec ppc support
x86-crash_shutdown-nmi-shootdown.patch
crashdump: x86: add NMI handler to capture other CPUs
x86-crash_shutdown-snapshot-registers.patch
kexec: x86: snapshot registers during crash shutdown
x86-crash_shutdown-apic-shutdown.patch
kexec: x86 shutdown APICs during crash_shutdown
crashdump-documentation.patch
crashdump: documentation
crashdump-memory-preserving-reboot-using-kexec.patch
crashdump: memory preserving reboot using kexec
crashdump-routines-for-copying-dump-pages.patch
crashdump: routines for copying dump pages
crashdump-routines-for-copying-dump-pages-fixes.patch
crashdump-routines-for-copying-dump-pages-fixes
crashdump-elf-format-dump-file-access.patch
crashdump: elf format dump file access
crashdump-linear-raw-format-dump-file-access.patch
crashdump: linear raw format dump file access
crashdump-linear-raw-format-dump-file-access-coding-style.patch
crashdump-linear-raw-format-dump-file-access-coding-style
new-bitmap-list-format-for-cpusets.patch
new bitmap list format (for cpusets)
cpusets-big-numa-cpu-and-memory-placement.patch
cpusets - big numa cpu and memory placement
cpusets-config_cpusets-depends-on-smp.patch
Cpusets: CONFIG_CPUSETS depends on SMP
cpusets-move-cpusets-above-embedded.patch
move CPUSETS above EMBEDDED
cpusets-fix-cpuset_get_dentry.patch
cpusets: fix cpuset_get_dentry()
cpusets-fix-race-in-cpuset_add_file.patch
cpusets: fix race in cpuset_add_file()
cpusets-remove-more-casts.patch
cpusets: remove more casts
cpusets-make-config_cpusets-the-default-in-sn2_defconfig.patch
cpusets: make CONFIG_CPUSETS the default in sn2_defconfig
cpusets-document-proc-status-allowed-fields.patch
cpusets: document proc status allowed fields
cpusets-dont-export-proc_cpuset_operations.patch
Cpusets - Dont export proc_cpuset_operations
cpusets-display-allowed-masks-in-proc-status.patch
cpusets: display allowed masks in proc status
cpusets-simplify-cpus_allowed-setting-in-attach.patch
cpusets: simplify cpus_allowed setting in attach
cpusets-remove-useless-validation-check.patch
cpusets: remove useless validation check
cpusets-tasks-file-simplify-format-fixes.patch
Cpusets tasks file: simplify format, fixes
lib-sort-replace-open-coded-opids2-bubblesort-in-cpusets.patch
lib/sort: Replace open-coded O(pids**2) bubblesort in cpusets
cpusets-simplify-memory-generation.patch
Cpusets: simplify memory generation
cpusets-interoperate-with-hotplug-online-maps.patch
cpusets: interoperate with hotplug online maps
cpusets-alternative-fix-for-possible-race-in.patch
cpusets: alternative fix for possible race in cpuset_tasks_read()
cpusets-remove-casts.patch
cpusets: remove void* typecasts
reiser4-sb_sync_inodes.patch
reiser4: vfs: add super_operations.sync_inodes()
reiser4-allow-drop_inode-implementation.patch
reiser4: export vfs inode.c symbols
reiser4-truncate_inode_pages_range.patch
reiser4: vfs: add truncate_inode_pages_range()
reiser4-export-remove_from_page_cache.patch
reiser4: export pagecache add/remove functions to modules
reiser4-export-page_cache_readahead.patch
reiser4: export page_cache_readahead to modules
reiser4-reget-page-mapping.patch
reiser4: vfs: re-check page->mapping after calling try_to_release_page()
reiser4-rcu-barrier.patch
reiser4: add rcu_barrier() synchronization point
reiser4-export-inode_lock.patch
reiser4: export inode_lock to modules
reiser4-export-pagevec-funcs.patch
reiser4: export pagevec functions to modules
reiser4-export-radix_tree_preload.patch
reiser4: export radix_tree_preload() to modules
reiser4-export-find_get_pages.patch
reiser4-radix-tree-tag.patch
reiser4: add new radix tree tag
reiser4-radix_tree_lookup_slot.patch
reiser4: add radix_tree_lookup_slot()
reiser4-perthread-pages.patch
reiser4: per-thread page pools
reiser4-include-reiser4.patch
reiser4: add to build system
reiser4-doc.patch
reiser4: documentation
reiser4-only.patch
reiser4: main fs
reiser4-recover-read-performance.patch
reiser4: recover read performance
reiser4-export-find_get_pages_tag.patch
reiser4-export-find_get_pages_tag
reiser4-add-missing-context.patch
add-acpi-based-floppy-controller-enumeration.patch
Add ACPI-based floppy controller enumeration.
possible-dcache-bug-debugging-patch.patch
Possible dcache BUG: debugging patch
serial-add-support-for-non-standard-xtals-to-16c950-driver.patch
serial: add support for non-standard XTALs to 16c950 driver
add-support-for-possio-gcc-aka-pcmcia-siemens-mc45.patch
Add support for Possio GCC AKA PCMCIA Siemens MC45
generic-serial-cli-conversion.patch
generic-serial cli() conversion
specialix-io8-cli-conversion.patch
Specialix/IO8 cli() conversion
sx-cli-conversion.patch
SX cli() conversion
revert-allow-oem-written-modules-to-make-calls-to-ia64-oem-sal-functions.patch
revert "allow OEM written modules to make calls to ia64 OEM SAL functions"
md-add-interface-for-userspace-monitoring-of-events.patch
md: add interface for userspace monitoring of events.
make-acpi_bus_register_driver-consistent-with-pci_register_driver-again.patch
make acpi_bus_register_driver() consistent with pci_register_driver()
remove-lock_section-from-x86_64-spin_lock-asm.patch
remove LOCK_SECTION from x86_64 spin_lock asm
kfree_skb-dump_stack.patch
kfree_skb-dump_stack
cancel_rearming_delayed_work.patch
cancel_rearming_delayed_work()
ipvs-deadlock-fix.patch
ipvs deadlock fix
minimal-ide-disk-updates.patch
Minimal ide-disk updates
use-find_trylock_page-in-free_swap_and_cache-instead-of-hand-coding.patch
use find_trylock_page in free_swap_and_cache instead of hand coding
radeonfb-fix-spurious-error-return-in-fbio_radeon_set_mirror.patch
radeonfb: Fix spurious error return in FBIO_RADEON_SET_MIRROR
w100fb-make-blanking-function-interrupt-safe.patch
w100fb: Make blanking function interrupt safe
kyrofb-copy__user-return-value-checks-added-to-kyro-fb.patch
kyrofb: copy_*_user return value checks added to kyro fb
skeletonfb-documentation-fixes.patch
skeletonfb: Documentation fixes
intelfb-add-partial-support-915g-chipset.patch
intelfb: Add partial support 915G chipset
sisfb_compat_ioctl-warning-fix.patch
fbdev compat_ioctl warning fix
sis-warning-fix.patch
sis warning fix
tridentfbc-make-some-code-static.patch
tridentfb.c: make some code static
tridentfb-warning-fix.patch
tridentfb warning fix
md-fix-multipath-assembly-bug.patch
md: fix multipath assembly bug
md-raid-kconfig-cleanups-remove-experimental-tag-from-raid-6.patch
md: RAID Kconfig cleanups, remove experimental tag from RAID-6
device-mapper-store-name-directly-against-device.patch
device-mapper: Store name directly against device
device-mapper-record-restore-bio-state.patch
device-mapper: Record & restore bio state.
device-mapper-export-map_info.patch
device-mapper: Export map_info
figure-out-who-is-inserting-bogus-modules.patch
Figure out who is inserting bogus modules
detect-atomic-counter-underflows.patch
detect atomic counter underflows
update-documentation-filesystems-locking.patch
Update Documentation/filesystems/Locking
post-halloween-doc.patch
post halloween doc
periodically-scan-redzone-entries-and-slab-control-structures.patch
periodically scan redzone entries and slab control structures
fuse-maintainers-kconfig-and-makefile-changes.patch
Subject: [PATCH 1/11] FUSE - MAINTAINERS, Kconfig and Makefile changes
fuse-core.patch
Subject: [PATCH 2/11] FUSE - core
fuse-device-functions.patch
Subject: [PATCH 3/11] FUSE - device functions
fuse-device-functions-fix-race-in-interrupted-request.patch
fuse: fix race in interrupted request
fuse-device-functions-fix.patch
fuse: better error reporting in fuse_fill_super
fuse-fix-llseek-on-device.patch
FUSE: fix llseek on device
fuse-make-two-functions-static.patch
fuse: make two functions static
fuse-fix-variable-with-confusing-name.patch
fuse: fix variable with confusing name
fuse-read-only-operations.patch
Subject: [PATCH 4/11] FUSE - read-only operations
fuse-read-write-operations.patch
Subject: [PATCH 5/11] FUSE - read-write operations
fuse-read-write-operations-fix.patch
fuse: fix hard link operation
fuse-file-operations.patch
Subject: [PATCH 6/11] FUSE - file operations
fuse-mount-options.patch
Subject: [PATCH 7/11] FUSE - mount options
fuse-dont-check-against-zero-fsuid.patch
fuse: don't check against zero fsuid
fuse-remove-mount_max-and-user_allow_other-module-parameters.patch
fuse: remove mount_max and user_allow_other module parameters
fuse-extended-attribute-operations.patch
Subject: [PATCH 8/11] FUSE - extended attribute operations
fuse-readpages-operation.patch
Subject: [PATCH 9/11] FUSE - readpages operation
fuse-nfs-export.patch
Subject: [PATCH 10/11] FUSE - NFS export
fuse-direct-i-o.patch
Subject: [PATCH 11/11] FUSE - direct I/O
fuse-transfer-readdir-data-through-device.patch
fuse: transfer readdir data through device
cryptoapi-prepare-for-processing-multiple-buffers-at.patch
CryptoAPI: prepare for processing multiple buffers at a time
cryptoapi-update-padlock-to-process-multiple-blocks-at.patch
CryptoAPI: Update PadLock to process multiple blocks at once
update-email-address-of-andrea-arcangeli.patch
update email address of Andrea Arcangeli
compile-error-blackbird_load_firmware.patch
blackbird_load_firmware compile fix
i386-x86_64-apicc-make-two-functions-static.patch
i386/x86_64 apic.c: make two functions static
i386-cyrixc-make-a-function-static.patch
i386 cyrix.c: make a function static
mtrr-some-cleanups.patch
mtrr: some cleanups
i386-cpu-commonc-some-cleanups.patch
i386 cpu/common.c: some cleanups
i386-cpuidc-make-two-functions-static.patch
i386 cpuid.c: make two functions static
i386-efic-make-some-code-static.patch
i386 efi.c: make some code static
i386-x86_64-io_apicc-misc-cleanups.patch
i386/x86_64 io_apic.c: misc cleanups
i386-mpparsec-make-mp_processor_info-static.patch
i386 mpparse.c: make MP_processor_info static
i386-x86_64-msrc-make-two-functions-static.patch
i386/x86_64 msr.c: make two functions static
3w-abcdh-tw_device_extension-remove-an-unused-filed.patch
3w-abcd.h: TW_Device_Extension: remove an unused field
hpet-make-some-code-static.patch
hpet: make some code static
26-patch-i386-trapsc-make-a-function-static.patch
i386 traps.c: make a function static
i386-semaphorec-make-4-functions-static.patch
i386 semaphore.c: make 4 functions static
kill-aux_device_present.patch
kill aux_device_present
i386-setupc-make-4-variables-static.patch
i386 setup.c: make 4 variables static
mostly-i386-mm-cleanup.patch
(mostly i386) mm cleanup
update-email-address-of-benjamin-lahaise.patch
Update email address of Benjamin LaHaise
update-email-address-of-philip-blundell.patch
Update email address of Philip Blundell
kernel-acctc-make-a-function-static.patch
kernel/acct.c: make a function static
kernel-auditc-make-some-functions-static.patch
kernel/audit.c: make some functions static
kernel-capabilityc-make-a-spinlock-static.patch
kernel/capability.c: make a spinlock static
mm-thrashc-make-a-variable-static.patch
mm/thrash.c: make a variable static
lib-kernel_lockc-make-kernel_sem-static.patch
lib/kernel_lock.c: make kernel_sem static
saa7146_vv_ksymsc-remove-two-unused-export_symbol_gpls.patch
saa7146_vv_ksyms.c: remove two unused EXPORT_SYMBOL_GPL's
fix-placement-of-static-inline-in-nfsdh.patch
fix placement of static inline in nfsd.h
drivers-block-umemc-make-two-functions-static.patch
drivers/block/umem.c: make two functions static
drivers-block-xdc-make-a-variable-static.patch
drivers/block/xd.c: make a variable static
kernel-forkc-make-mm_cachep-static.patch
kernel/fork.c: make mm_cachep static
kernel-forkc-make-mm_cachep-static-fix.patch
kernel-forkc-make-mm_cachep-static fix
mm-page-writebackc-remove-an-unused-function.patch
mm/page-writeback.c: remove an unused function
mm-shmemc-make-a-struct-static.patch
mm/shmem.c: make a struct static
misc-isapnp-cleanups.patch
misc ISAPNP cleanups
some-pnp-cleanups.patch
some PNP cleanups
if-0-cx88_risc_disasm.patch
#if 0 cx88_risc_disasm
make-loglevels-in-init-mainc-a-little-more-sane.patch
Make loglevels in init/main.c a little more sane.
isicom-use-null-for-pointer.patch
sparse: use NULL for pointer
remove-bouncing-email-address-of-hennus-bergman.patch
remove bouncing email address of Hennus Bergman
cirrusfbc-make-some-code-static.patch
cirrusfb.c: make some code static
matroxfb_basec-make-some-code-static.patch
matroxfb_base.c: make some code static
matroxfb_basec-make-some-code-static-fix.patch
matroxfb_basec-make-some-code-static fix
asiliantfbc-make-some-code-static.patch
asiliantfb.c: make some code static
i386-apic-kconfig-cleanups.patch
i386 APIC Kconfig cleanups
security-seclvlc-make-some-code-static.patch
security/seclvl.c: make some code static
drivers-block-elevatorc-make-two-functions-static.patch
drivers/block/elevator.c: make two functions static
drivers-block-rdc-make-two-variables-static.patch
drivers/block/rd.c: make two variables static
loopc-make-two-functions-static.patch
loop.c: make two functions static
remove-bouncing-email-address-of-thomas-hood.patch
remove bouncing email address of Thomas Hood
fs-adfs-dir_fc-remove-an-unused-function.patch
fs/adfs/dir_f.c: remove an unused function
drivers-char-moxac-if-0-an-unused-function.patch
drivers/char/moxa.c: #if 0 an unused function
fs-lockd-clntprocc-make-2-functions-static.patch
fs/lockd/clntproc.c: make 2 functions static
oss-sb_cardc-no-need-to-include-mcah.patch
OSS sb_card.c: no need to include mca.h
ioschedc-use-proper-documentation-path.patch
*-iosched.c: Use proper documentation path
kernel-resourcec-make-resource_op-static.patch
kernel/resource.c: make resource_op static
kernel-power-mainc-make-pm_states-static.patch
kernel/power/main.c: make pm_states static
kernel-sysc-make-some-code-static.patch
kernel/sys.c: make some code static
scsi-ipsc-make-some-code-static.patch
SCSI ips.c: make some code static
scsi-psi240ic-make-4-functions-static.patch
SCSI psi240i.c: make 4 functions static
scsi-src-make-a-struct-static.patch
SCSI sr.c: make a struct static
small-drivers-video-kyro-cleanups.patch
small drivers/video/kyro/ cleanups
drivers-video-i810-make-some-code-static.patch
drivers/video/i810/: make some code static
floppyc-make-some-code-static.patch
floppy.c: make some code static
drivers-block-nbdc-make-3-functions-static.patch
drivers/block/nbd.c: make 3 functions static
drivers-block-cpqarrayc-small-cleanups.patch
drivers/block/cpqarray.c: small cleanups
pcxx-remove-obsolete-driver.patch
pcxx: Remove obsolete driver
pty-oops-fix.patch
pty oops fix
mark-the-mcd-cdrom-driver-as-broken.patch
mark the mcd cdrom driver as BROKEN
warning-fix-in-drivers-cdrom-mcdc.patch
warning fix in drivers/cdrom/mcd.c
wavefront-reduce-stack-usage.patch
wavefront: reduce stack usage
mm-page-writebackc-remove-an-unused-function-2.patch
mm/page-writeback.c: remove an unused function #2
generic_serialh-kill-incorrect-gs_debug-reference.patch
generic_serial.h: kill incorrect gs_debug reference
kernel-timerc-make-two-variables-static.patch
kernel/timer.c: make two variables static
remove-the-unused-oss-maestro_tablesh.patch
remove the unused OSS maestro_tables.h
fs-hfs-misc-cleanups.patch
fs/hfs/: misc cleanups
fs-hpfs-make-some-code-static.patch
fs/hpfs/: make some code static
fs-hfsplus-misc-cleanups.patch
fs/hfsplus/: misc cleanups
i386-x86_64-processc-make-hlt_counter-static.patch
i386/x86_64 process.c: make hlt_counter static
i386-mach-default-topologyc-make-cpu_devices-static.patch
i386/mach-default/topology.c: make cpu_devices static
i386-math-emu-misc-cleanups.patch
i386/math-emu/: misc cleanups
non-pc-parport-config-change.patch
non-PC parport config change
prism54-misc-cleanups.patch
prism54: misc cleanups
scsi-qlogicfcc-some-cleanups.patch
SCSI qlogicfc.c: some cleanups
scsi-qlogicispc-some-cleanups.patch
SCSI qlogicisp.c: some cleanups
savagefbc-make-some-code-static.patch
savagefb.c: make some code static
hpet-setup-comment-fix.patch
hpet setup comment fix
fs-ncpfs-ncplib_kernelc-make-a-function-static.patch
fs/ncpfs/ncplib_kernel.c: make a function static
kill-iphase5526.patch
kill IPHASE5526
fs-nfs-make-some-code-static.patch
fs/nfs/: make some code static
i386-x86_64-acpi-sleepc-kill-unused-acpi_save_state_disk.patch
i386/x86_64: acpi/sleep.c: kill unused acpi_save_state_disk
smpbootc-cleanups.patch
smp{,boot}.c cleanups
i386-kernel-i387c-misc-cleanups.patch
i386/kernel/i387.c: misc cleanups
i386-x86_64-i8259c-make-mask_and_ack_8259a-static.patch
i386/x86_64 i8259.c: make mask_and_ack_8259A static
scsi-sym53c416c-make-a-function-static.patch
SCSI sym53c416.c: make a function static
scsi-ultrastorc-make-a-variable-static.patch
SCSI ultrastor.c: make a variable static
kernel-intermodulec-make-inter_module_get-static.patch
kernel/intermodule.c: make inter_module_get static
On Thu, Feb 10, 2005 at 02:35:08AM -0800, Andrew Morton wrote:
>
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
>
>
> - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> It seems that nothing else is going to come along and this is completely
> encapsulated.
Even if we accept a module that grants capabilities to groups, this isn't fine
yet: it only supports two specific capabilities (and even those two in
different ways!) instead of adding generic support for binding capabilities to
groups.
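One way to picture the generic support asked for here is a table that binds a group ID to a capability mask, so that no capability needs special-case code. The following is only a user-space sketch under stated assumptions: struct group_cap, caps_for_groups() and the EX_* names are invented for illustration and are not part of the posted module or of the kernel. The bit numbers follow linux/capability.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative capability bits; numbering as in linux/capability.h,
 * but the EX_ names exist only in this sketch. */
#define EX_CAP_IPC_LOCK 14	/* mlock() and friends */
#define EX_CAP_SYS_NICE 23	/* non-SCHED_OTHER scheduling */
#define EX_CAP_BIT(n)   (UINT32_C(1) << (n))

/* Hypothetical binding: members of group 'gid' are granted 'caps'. */
struct group_cap {
	unsigned int gid;
	uint32_t caps;
};

/*
 * OR together the masks of every binding whose gid appears in the
 * caller's group list. Every capability is handled the same way.
 */
static uint32_t caps_for_groups(const struct group_cap *tbl, size_t ntbl,
				const unsigned int *groups, size_t ngroups)
{
	uint32_t caps = 0;
	size_t i, j;

	assert(tbl != NULL || ntbl == 0);
	assert(groups != NULL || ngroups == 0);

	for (i = 0; i < ntbl; i++)
		for (j = 0; j < ngroups; j++)
			if (tbl[i].gid == groups[j])
				caps |= tbl[i].caps;
	return caps;
}
```

With such a table, granting mlock and realtime scheduling to one group is a single entry, and adding a third capability later needs no new code path.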
More comments on the actual code:
+#include <linux/module.h>
+#include <linux/security.h>
+
+#define RT_LSM "Realtime LSM " /* syslog module name prefix */
+#define RT_ERR "Realtime: " /* syslog error message prefix */
+#include <linux/vermagic.h>
+MODULE_INFO(vermagic,VERMAGIC_STRING);
This doesn't belong in a module.
+#define MY_NAME __stringify(KBUILD_MODNAME)
Please use a normal prefix. A module shouldn't behave differently depending on
what name you compile it as.
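To make the objection concrete, here is a small user-space sketch of the two approaches. It assumes the build system passes the module name as an unquoted token, which is why the quoted line needs __stringify(). EX_STRINGIFY, EX_MODNAME and the placeholder name "realtime" are stand-ins invented for this example, not taken from the module.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringify, the same shape as the kernel's __stringify(). */
#define EX_STRINGIFY_1(x) #x
#define EX_STRINGIFY(x)   EX_STRINGIFY_1(x)

/* Stand-in for the name the build passes in, e.g. -DEX_MODNAME=foo. */
#ifndef EX_MODNAME
#define EX_MODNAME realtime
#endif

/* Prefix derived from the build name: changes when the file is renamed. */
#define NAME_FROM_BUILD EX_STRINGIFY(EX_MODNAME)

/* Fixed prefix: the same string in every build. */
#define NAME_FIXED      "realtime"

/* Returns 1 when the two prefixes agree in this particular build. */
static int build_name_matches_fixed(void)
{
	assert(NAME_FIXED[0] != '\0');
	return strcmp(NAME_FROM_BUILD, NAME_FIXED) == 0;
}
```

Compiling the same source with -DEX_MODNAME=foo turns NAME_FROM_BUILD into "foo" while NAME_FIXED stays put, so anything keyed on the derived name (log prefixes, parameter names) silently changes with the file name.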
On Thu, 2005-02-10 at 02:35 -0800, Andrew Morton wrote:
> -inotify.patch
> -inotify-fix_find_inode.patch
>
> I think my version is old, and it oopses.
It is old. I have sent you multiple updates. ;-)
Attached, find a patch against 2.6.11-rc3-mm2 of the latest inotify.
This version has numerous optimizations, bug fixes, and cleanups. It
introduces a generic notification layer to cleanly wrap both dnotify and
inotify hooks in fs/.
Pending is a data structure reorganization, to untangle some of the
locking.
Andrew, please apply!
Robert Love
inotify!
inotify is intended to correct the deficiencies of dnotify, particularly
its inability to scale and its terrible user interface:
* dnotify requires opening one fd for each directory
that you intend to watch. This quickly results in too many
open files and pins removable media, preventing unmount.
* dnotify is directory-based. You only learn about changes to
directories. Sure, a change to a file in a directory affects
the directory, but you are then forced to keep a cache of
stat structures.
* dnotify's interface to user-space is awful. Signals?
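For readers who have not used it, the dnotify interface criticized above looks roughly like this from user space. This is a minimal sketch, assuming Linux with glibc: F_NOTIFY and the DN_* flags are the existing fcntl(2) dnotify interface, while dnotify_install_handler(), dnotify_watch_dir() and dnotify_events are names made up for this example.

```c
#define _GNU_SOURCE		/* F_NOTIFY and DN_* are Linux-specific */
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Bumped by the handler on every dnotify signal. */
static volatile sig_atomic_t dnotify_events;

static void on_dnotify(int sig)
{
	(void)sig;
	dnotify_events++;
}

/* Install a handler for SIGIO, the signal dnotify raises by default. */
static int dnotify_install_handler(void)
{
	struct sigaction sa;

	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	sa.sa_handler = on_dnotify;
	return sigaction(SIGIO, &sa, NULL);
}

/*
 * Watch one directory. Returns an fd that must stay open for as long
 * as the watch is wanted, or -1 on failure.
 */
static int dnotify_watch_dir(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -1;
	if (fcntl(fd, F_NOTIFY,
		  DN_CREATE | DN_MODIFY | DN_DELETE | DN_MULTISHOT) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Each watched directory costs one open fd, and in this sketch the handler learns only that some watched directory changed, not which file did. Those are the scaling and usability problems the list above describes.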
inotify provides a more usable, simple, powerful solution to file change
notification:
* inotify's interface is a device node, not SIGIO. You open a
single fd to the device node, which is select()-able.
* inotify has an event that says "the filesystem that the item
you were watching is on was unmounted."
* inotify can watch directories or files.
Inotify is currently used by Beagle (a desktop search infrastructure)
and Gamin (a FAM replacement).
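The "single select()-able fd" model described above can be sketched as follows. This assumes only POSIX select(); wait_readable() is a helper name invented here. The watch-management ioctls live in include/linux/inotify.h from the patch and are deliberately not shown, so any readable fd (a pipe, for instance) can stand in for the one opened on /dev/inotify.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error (including EINTR).
 */
static int wait_readable(int fd, int timeout_ms)
{
	fd_set rfds;
	struct timeval tv;
	int ret;

	assert(fd >= 0 && fd < FD_SETSIZE);
	assert(timeout_ms >= 0);

	FD_ZERO(&rfds);
	FD_SET(fd, &rfds);
	tv.tv_sec = timeout_ms / 1000;
	tv.tv_usec = (timeout_ms % 1000) * 1000;

	ret = select(fd + 1, &rfds, NULL, NULL, &tv);
	if (ret < 0)
		return -1;
	return ret > 0 && FD_ISSET(fd, &rfds);
}
```

Because every watch reports through the same fd, an application can fold file change notification into its existing select() loop instead of handling signals.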
Signed-off-by: Robert Love <[email protected]>
arch/sparc64/Kconfig | 13
drivers/char/Kconfig | 13
drivers/char/Makefile | 2
drivers/char/inotify.c | 1053 +++++++++++++++++++++++++++++++++++++++++++++
fs/attr.c | 33 -
fs/compat.c | 14
fs/file_table.c | 4
fs/inode.c | 3
fs/namei.c | 38 -
fs/open.c | 9
fs/read_write.c | 24 -
fs/super.c | 3
include/linux/fs.h | 7
include/linux/fsnotify.h | 235 ++++++++++
include/linux/inotify.h | 118 +++++
include/linux/miscdevice.h | 1
include/linux/sched.h | 2
kernel/user.c | 2
18 files changed, 1511 insertions(+), 63 deletions(-)
diff -urN linux-2.6.11-rc3-mm2/arch/sparc64/Kconfig linux-mm-inotify/arch/sparc64/Kconfig
--- linux-2.6.11-rc3-mm2/arch/sparc64/Kconfig 2005-02-10 13:17:32.212175080 -0500
+++ linux-mm-inotify/arch/sparc64/Kconfig 2005-02-10 13:18:40.358815216 -0500
@@ -88,6 +88,19 @@
bool
default y
+config INOTIFY
+ bool "Inotify file change notification support"
+ default y
+ ---help---
+ Say Y here to enable inotify support and the /dev/inotify character
+ device. Inotify is a file change notification system and a
+ replacement for dnotify. Inotify fixes numerous shortcomings in
+ dnotify and introduces several new features. It allows monitoring
+ of both files and directories via a single open fd. Multiple file
+ events are supported.
+
+ If unsure, say Y.
+
config SMP
bool "Symmetric multi-processing support"
---help---
diff -urN linux-2.6.11-rc3-mm2/drivers/char/inotify.c linux-mm-inotify/drivers/char/inotify.c
--- linux-2.6.11-rc3-mm2/drivers/char/inotify.c 1969-12-31 19:00:00.000000000 -0500
+++ linux-mm-inotify/drivers/char/inotify.c 2005-02-10 13:18:40.360814912 -0500
@@ -0,0 +1,1053 @@
+/*
+ * drivers/char/inotify.c - inode-based file event notifications
+ *
+ * Authors:
+ * John McCutchan <[email protected]>
+ * Robert Love <[email protected]>
+ *
+ * Copyright (C) 2005 John McCutchan
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/spinlock.h>
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/poll.h>
+#include <linux/device.h>
+#include <linux/miscdevice.h>
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/writeback.h>
+#include <linux/inotify.h>
+
+#include <asm/ioctls.h>
+
+static atomic_t inotify_cookie;
+static kmem_cache_t *watch_cachep;
+static kmem_cache_t *event_cachep;
+static kmem_cache_t *inode_data_cachep;
+
+static int sysfs_attrib_max_user_devices;
+static int sysfs_attrib_max_user_watches;
+static unsigned int sysfs_attrib_max_queued_events;
+
+/*
+ * struct inotify_device - represents an open instance of an inotify device
+ *
+ * For each inotify device, we need to keep track of the events queued on it,
+ * a list of the inodes that we are watching, and so on.
+ *
+ * This structure is protected by 'lock'. Lock ordering:
+ *
+ * dev->lock (protects dev)
+ * inode_lock (used to safely walk inode_in_use list)
+ * inode->i_lock (only needed for getting ref on inode_data)
+ */
+struct inotify_device {
+ wait_queue_head_t wait;
+ struct idr idr;
+ struct list_head events;
+ struct list_head watches;
+ spinlock_t lock;
+ unsigned int queue_size;
+ unsigned int event_count;
+ unsigned int max_events;
+ struct user_struct *user;
+};
+
+struct inotify_watch {
+ s32 wd; /* watch descriptor */
+ u32 mask; /* event mask for this watch */
+ struct inode *inode; /* associated inode */
+ struct inotify_device *dev; /* associated device */
+ struct list_head d_list; /* entry in device's list */
+ struct list_head i_list; /* entry in inotify_data's list */
+};
+
+/*
+ * A list of these is attached to each instance of the driver. In read(), this
+ * list is walked and all events that can fit in the buffer are returned.
+ */
+struct inotify_kernel_event {
+ struct inotify_event event;
+ struct list_head list;
+ char *filename;
+};
+
+static ssize_t show_max_queued_events(struct class_device *class, char *buf)
+{
+ return sprintf(buf, "%u\n", sysfs_attrib_max_queued_events);
+}
+
+static ssize_t store_max_queued_events(struct class_device *class,
+ const char *buf, size_t count)
+{
+ unsigned int max;
+
+ if (sscanf(buf, "%u", &max) > 0 && max > 0) {
+ sysfs_attrib_max_queued_events = max;
+ return strlen(buf);
+ }
+ return -EINVAL;
+}
+
+static ssize_t show_max_user_devices(struct class_device *class, char *buf)
+{
+ return sprintf(buf, "%d\n", sysfs_attrib_max_user_devices);
+}
+
+static ssize_t store_max_user_devices(struct class_device *class,
+ const char *buf, size_t count)
+{
+ int max;
+
+ if (sscanf(buf, "%d", &max) > 0 && max > 0) {
+ sysfs_attrib_max_user_devices = max;
+ return strlen(buf);
+ }
+ return -EINVAL;
+}
+
+static ssize_t show_max_user_watches(struct class_device *class, char *buf)
+{
+ return sprintf(buf, "%d\n", sysfs_attrib_max_user_watches);
+}
+
+static ssize_t store_max_user_watches(struct class_device *class,
+ const char *buf, size_t count)
+{
+ int max;
+
+ if (sscanf(buf, "%d", &max) > 0 && max > 0) {
+ sysfs_attrib_max_user_watches = max;
+ return strlen(buf);
+ }
+ return -EINVAL;
+}
+
+static CLASS_DEVICE_ATTR(max_queued_events, S_IRUGO | S_IWUSR,
+ show_max_queued_events, store_max_queued_events);
+static CLASS_DEVICE_ATTR(max_user_devices, S_IRUGO | S_IWUSR,
+ show_max_user_devices, store_max_user_devices);
+static CLASS_DEVICE_ATTR(max_user_watches, S_IRUGO | S_IWUSR,
+ show_max_user_watches, store_max_user_watches);
+
+static inline void __get_inode_data(struct inotify_inode_data *data)
+{
+ atomic_inc(&data->count);
+}
+
+/*
+ * get_inode_data - pin an inotify_inode_data structure. Returns the structure
+ * if successful and NULL on failure, which can only occur if inotify_data is
+ * not yet allocated. The inode must be pinned prior to invocation.
+ */
+static inline struct inotify_inode_data * get_inode_data(struct inode *inode)
+{
+ struct inotify_inode_data *data;
+
+ spin_lock(&inode->i_lock);
+ data = inode->inotify_data;
+ if (data)
+ __get_inode_data(data);
+ spin_unlock(&inode->i_lock);
+
+ return data;
+}
+
+/*
+ * put_inode_data - drop our reference on an inotify_inode_data and the
+ * inode structure in which it lives. If the reference count on inotify_data
+ * reaches zero, free it.
+ */
+static inline void put_inode_data(struct inode *inode)
+{
+ //spin_lock(&inode->i_lock);
+ if (atomic_dec_and_test(&inode->inotify_data->count)) {
+ kmem_cache_free(inode_data_cachep, inode->inotify_data);
+ inode->inotify_data = NULL;
+ }
+ //spin_unlock(&inode->i_lock);
+}
+
+/*
+ * find_inode - resolve a user-given path to a specific inode and return a nd
+ */
+static int find_inode(const char __user *dirname, struct nameidata *nd)
+{
+ int error;
+
+ error = __user_walk(dirname, LOOKUP_FOLLOW, nd);
+ if (error)
+ return error;
+
+ /* you can only watch an inode if you have read permissions on it */
+ return permission(nd->dentry->d_inode, MAY_READ, NULL);
+}
+
+static struct inotify_kernel_event * kernel_event(s32 wd, u32 mask, u32 cookie,
+ const char *filename)
+{
+ struct inotify_kernel_event *kevent;
+
+ kevent = kmem_cache_alloc(event_cachep, GFP_ATOMIC);
+ if (!kevent)
+ return NULL;
+
+ /* we hand this out to user-space, so zero it just in case */
+ memset(&kevent->event, 0, sizeof(struct inotify_event));
+
+ kevent->event.wd = wd;
+ kevent->event.mask = mask;
+ kevent->event.cookie = cookie;
+ INIT_LIST_HEAD(&kevent->list);
+
+ if (filename) {
+ size_t len, rem, event_size = sizeof(struct inotify_event);
+
+ /*
+ * We need to pad the filename so as to properly align an
+ * array of inotify_event structures. Because the structure is
+ * small and the common case is a small filename, we just round
+ * up to the next multiple of the structure's sizeof. This is
+ * simple and safe for all architectures.
+ */
+ len = strlen(filename) + 1;
+ rem = event_size - len;
+ if (len > event_size) {
+ rem = event_size - (len % event_size);
+ if (len % event_size == 0)
+ rem = 0;
+ }
+ len += rem;
+
+ kevent->filename = kmalloc(len, GFP_ATOMIC);
+ if (!kevent->filename) {
+ kmem_cache_free(event_cachep, kevent);
+ return NULL;
+ }
+ memset(kevent->filename, 0, len);
+ strncpy(kevent->filename, filename, strlen(filename));
+ kevent->event.len = len;
+ } else {
+ kevent->event.len = 0;
+ kevent->filename = NULL;
+ }
+
+ return kevent;
+}
+
+#define list_to_inotify_kernel_event(pos) \
+ list_entry((pos), struct inotify_kernel_event, list)
+
+#define inotify_dev_get_event(dev) \
+ (list_to_inotify_kernel_event(dev->events.next))
+
+/*
+ * inotify_dev_queue_event - add a new event to the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static void inotify_dev_queue_event(struct inotify_device *dev,
+ struct inotify_watch *watch, u32 mask,
+ u32 cookie, const char *filename)
+{
+ struct inotify_kernel_event *kevent, *last;
+
+ /* drop this event if it is a dupe of the previous */
+ last = list_to_inotify_kernel_event(dev->events.prev);
+ if (dev->event_count && last->event.mask == mask &&
+ last->event.wd == watch->wd) {
+ const char *lastname = last->filename;
+
+ if (!filename && !lastname)
+ return;
+ if (filename && lastname && !strcmp(lastname, filename))
+ return;
+ }
+
+ /*
+ * the queue has already overflowed and we have already sent the
+ * Q_OVERFLOW event
+ */
+ if (dev->event_count > dev->max_events)
+ return;
+
+ /* the queue has just overflowed and we need to notify user space */
+ if (dev->event_count == dev->max_events) {
+ kevent = kernel_event(-1, IN_Q_OVERFLOW, cookie, NULL);
+ goto add_event_to_queue;
+ }
+
+ kevent = kernel_event(watch->wd, mask, cookie, filename);
+
+add_event_to_queue:
+ if (!kevent)
+ return;
+
+ /* queue the event and wake up anyone waiting */
+ dev->event_count++;
+ dev->queue_size += sizeof(struct inotify_event) + kevent->event.len;
+ list_add_tail(&kevent->list, &dev->events);
+ wake_up_interruptible(&dev->wait);
+}
+
+static inline int inotify_dev_has_events(struct inotify_device *dev)
+{
+ return !list_empty(&dev->events);
+}
+
+/*
+ * inotify_dev_event_dequeue - destroy an event on the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static void inotify_dev_event_dequeue(struct inotify_device *dev)
+{
+ struct inotify_kernel_event *kevent;
+
+ if (!inotify_dev_has_events(dev))
+ return;
+
+ kevent = inotify_dev_get_event(dev);
+ list_del_init(&kevent->list);
+ if (kevent->filename)
+ kfree(kevent->filename);
+
+ dev->event_count--;
+ dev->queue_size -= sizeof(struct inotify_event) + kevent->event.len;
+
+ kmem_cache_free(event_cachep, kevent);
+}
+
+/*
+ * inotify_dev_get_wd - returns the next WD for use by the given dev
+ *
+ * This function can sleep.
+ */
+static int inotify_dev_get_wd(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ int ret;
+
+ if (atomic_read(&dev->user->inotify_watches) >=
+ sysfs_attrib_max_user_watches)
+ return -ENOSPC;
+
+repeat:
+ if (!idr_pre_get(&dev->idr, GFP_KERNEL))
+ return -ENOSPC;
+ spin_lock(&dev->lock);
+ ret = idr_get_new(&dev->idr, watch, &watch->wd);
+ spin_unlock(&dev->lock);
+ if (ret == -EAGAIN) /* more memory is required, try again */
+ goto repeat;
+ else if (ret) /* the idr is full! */
+ return -ENOSPC;
+
+ atomic_inc(&dev->user->inotify_watches);
+
+ return 0;
+}
+
+/*
+ * inotify_dev_put_wd - release the given WD on the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static int inotify_dev_put_wd(struct inotify_device *dev, s32 wd)
+{
+ if (!dev || wd < 0)
+ return -1;
+
+ atomic_dec(&dev->user->inotify_watches);
+ idr_remove(&dev->idr, wd);
+
+ return 0;
+}
+
+/*
+ * create_watch - creates a watch on the given device.
+ *
+ * Grabs dev->lock, so the caller must not hold it.
+ */
+static struct inotify_watch *create_watch(struct inotify_device *dev,
+ u32 mask, struct inode *inode)
+{
+ struct inotify_watch *watch;
+
+ watch = kmem_cache_alloc(watch_cachep, GFP_KERNEL);
+ if (!watch)
+ return NULL;
+
+ watch->mask = mask;
+ watch->inode = inode;
+ watch->dev = dev;
+ INIT_LIST_HEAD(&watch->d_list);
+ INIT_LIST_HEAD(&watch->i_list);
+
+ if (inotify_dev_get_wd(dev, watch)) {
+ kmem_cache_free(watch_cachep, watch);
+ return NULL;
+ }
+
+ return watch;
+}
+
+/*
+ * delete_watch - removes the given 'watch' from the given 'dev'
+ *
+ * Caller must hold dev->lock.
+ */
+static void delete_watch(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ inotify_dev_put_wd(dev, watch->wd);
+ kmem_cache_free(watch_cachep, watch);
+}
+
+/*
+ * inode_find_dev - find the watch associated with the given inode and dev
+ *
+ * Caller must hold dev->lock.
+ * FIXME: Needs inotify_data->lock too. Don't need dev->lock, just pin it.
+ */
+static struct inotify_watch *inode_find_dev(struct inode *inode,
+ struct inotify_device *dev)
+{
+ struct inotify_watch *watch;
+
+ if (!inode->inotify_data)
+ return NULL;
+
+ list_for_each_entry(watch, &inode->inotify_data->watches, i_list) {
+ if (watch->dev == dev)
+ return watch;
+ }
+
+ return NULL;
+}
+
+/*
+ * dev_find_wd - given a (dev,wd) pair, returns the matching inotify_watch
+ *
+ * Returns the results of looking up (dev,wd) in the idr layer. NULL is
+ * returned on error.
+ *
+ * The caller must hold dev->lock.
+ */
+static inline struct inotify_watch *dev_find_wd(struct inotify_device *dev,
+ u32 wd)
+{
+ return idr_find(&dev->idr, wd);
+}
+
+static int inotify_dev_is_watching_inode(struct inotify_device *dev,
+ struct inode *inode)
+{
+ struct inotify_watch *watch;
+
+ list_for_each_entry(watch, &dev->watches, d_list) {
+ if (watch->inode == inode)
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
+ * inotify_dev_add_watch - add the given watch to the given device instance
+ *
+ * Caller must hold dev->lock.
+ */
+static int inotify_dev_add_watch(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ if (!dev || !watch)
+ return -EINVAL;
+
+ list_add(&watch->d_list, &dev->watches);
+ return 0;
+}
+
+/*
+ * inotify_dev_rm_watch - remove the given watch from the given device
+ *
+ * Caller must hold dev->lock because we call inotify_dev_queue_event().
+ */
+static int inotify_dev_rm_watch(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ if (!watch)
+ return -EINVAL;
+
+ inotify_dev_queue_event(dev, watch, IN_IGNORED, 0, NULL);
+ list_del_init(&watch->d_list);
+
+ return 0;
+}
+
+/*
+ * inode_add_watch - add a watch to the given inode
+ *
+ * Callers must hold dev->lock, because we call inode_find_dev().
+ */
+static int inode_add_watch(struct inode *inode, struct inotify_watch *watch)
+{
+ int ret;
+
+ if (!inode || !watch)
+ return -EINVAL;
+
+ spin_lock(&inode->i_lock);
+ if (!inode->inotify_data) {
+ /* inotify_data is not attached to the inode, so add it */
+ inode->inotify_data = kmem_cache_alloc(inode_data_cachep,
+ GFP_ATOMIC);
+ if (!inode->inotify_data) {
+ ret = -ENOMEM;
+ goto out_lock;
+ }
+
+ atomic_set(&inode->inotify_data->count, 0);
+ INIT_LIST_HEAD(&inode->inotify_data->watches);
+ spin_lock_init(&inode->inotify_data->lock);
+ } else if (inode_find_dev(inode, watch->dev)) {
+ /* a watch is already associated with this (inode,dev) pair */
+ ret = -EINVAL;
+ goto out_lock;
+ }
+ __get_inode_data(inode->inotify_data);
+ spin_unlock(&inode->i_lock);
+
+ list_add(&watch->i_list, &inode->inotify_data->watches);
+
+ return 0;
+out_lock:
+ spin_unlock(&inode->i_lock);
+ return ret;
+}
+
+static int inode_rm_watch(struct inode *inode,
+ struct inotify_watch *watch)
+{
+ if (!inode || !watch || !inode->inotify_data)
+ return -EINVAL;
+
+ list_del_init(&watch->i_list);
+
+ /* clean up inode->inotify_data */
+ put_inode_data(inode);
+
+ return 0;
+}
+
+/* Kernel API */
+
+/*
+ * inotify_inode_queue_event - queue an event with the given mask, cookie, and
+ * filename to any watches associated with the given inode.
+ *
+ * inode must be pinned prior to calling.
+ */
+void inotify_inode_queue_event(struct inode *inode, u32 mask, u32 cookie,
+ const char *name)
+{
+ struct inotify_watch *watch;
+
+ if (!inode->inotify_data)
+ return;
+
+ list_for_each_entry(watch, &inode->inotify_data->watches, i_list) {
+ if (watch->mask & mask) {
+ struct inotify_device *dev = watch->dev;
+ spin_lock(&dev->lock);
+ inotify_dev_queue_event(dev, watch, mask, cookie, name);
+ spin_unlock(&dev->lock);
+ }
+ }
+}
+EXPORT_SYMBOL_GPL(inotify_inode_queue_event);
+
+void inotify_dentry_parent_queue_event(struct dentry *dentry, u32 mask,
+ u32 cookie, const char *filename)
+{
+ struct dentry *parent;
+ struct inode *inode;
+
+ spin_lock(&dentry->d_lock);
+ parent = dentry->d_parent;
+ inode = parent->d_inode;
+ if (inode->inotify_data) {
+ dget(parent);
+ spin_unlock(&dentry->d_lock);
+ inotify_inode_queue_event(inode, mask, cookie, filename);
+ dput(parent);
+ } else
+ spin_unlock(&dentry->d_lock);
+}
+EXPORT_SYMBOL_GPL(inotify_dentry_parent_queue_event);
+
+u32 inotify_get_cookie(void)
+{
+ /* one atomic operation, so concurrent callers never share a cookie */
+ return atomic_inc_return(&inotify_cookie);
+}
+EXPORT_SYMBOL_GPL(inotify_get_cookie);
+
+/*
+ * Caller must hold dev->lock.
+ */
+static void __remove_watch(struct inotify_watch *watch,
+ struct inotify_device *dev)
+{
+ struct inode *inode;
+
+ inode = watch->inode;
+
+ inode_rm_watch(inode, watch);
+ inotify_dev_rm_watch(dev, watch);
+ delete_watch(dev, watch);
+
+ iput(inode);
+}
+
+/*
+ * remove_watch - remove a watch from both the device and the inode.
+ *
+ * watch->inode must be pinned. We drop a reference before returning. Grabs
+ * dev->lock.
+ */
+static void remove_watch(struct inotify_watch *watch)
+{
+ struct inotify_device *dev = watch->dev;
+
+ spin_lock(&dev->lock);
+ __remove_watch(watch, dev);
+ spin_unlock(&dev->lock);
+}
+
+void inotify_super_block_umount(struct super_block *sb)
+{
+ struct inode *inode;
+
+ spin_lock(&inode_lock);
+
+ /*
+ * We hold the inode_lock, so the inodes are not going anywhere, and
+ * we grab a reference on inotify_data before walking its list of
+ * watches.
+ */
+ list_for_each_entry(inode, &inode_in_use, i_list) {
+ struct inotify_inode_data *inode_data;
+ struct inotify_watch *watch, *next;
+
+ if (inode->i_sb != sb)
+ continue;
+
+ inode_data = get_inode_data(inode);
+ if (!inode_data)
+ continue;
+
+ list_for_each_entry_safe(watch, next, &inode_data->watches, i_list) {
+ struct inotify_device *dev = watch->dev;
+ spin_lock(&dev->lock);
+ inotify_dev_queue_event(dev, watch, IN_UNMOUNT, 0,
+ NULL);
+ __remove_watch(watch, dev);
+ spin_unlock(&dev->lock);
+ }
+ put_inode_data(inode);
+ }
+
+ spin_unlock(&inode_lock);
+}
+EXPORT_SYMBOL_GPL(inotify_super_block_umount);
+
+/*
+ * inotify_inode_is_dead - an inode has been deleted, cleanup any watches
+ */
+void inotify_inode_is_dead(struct inode *inode)
+{
+ struct inotify_watch *watch, *next;
+ struct inotify_inode_data *data;
+
+ data = get_inode_data(inode);
+ if (!data)
+ return;
+ list_for_each_entry_safe(watch, next, &data->watches, i_list)
+ remove_watch(watch);
+ put_inode_data(inode);
+}
+EXPORT_SYMBOL_GPL(inotify_inode_is_dead);
+
+/* The driver interface is implemented below */
+
+static unsigned int inotify_poll(struct file *file, poll_table *wait)
+{
+ struct inotify_device *dev;
+
+ dev = file->private_data;
+
+ poll_wait(file, &dev->wait, wait);
+
+ if (inotify_dev_has_events(dev))
+ return POLLIN | POLLRDNORM;
+
+ return 0;
+}
+
+static ssize_t inotify_read(struct file *file, char __user *buf,
+ size_t count, loff_t *pos)
+{
+ size_t event_size;
+ struct inotify_device *dev;
+ char __user *start;
+ DECLARE_WAITQUEUE(wait, current);
+
+ start = buf;
+ dev = file->private_data;
+
+ /* We only hand out full inotify events */
+ event_size = sizeof(struct inotify_event);
+ if (count < event_size)
+ return 0;
+
+ while (1) {
+ int has_events;
+
+ spin_lock(&dev->lock);
+ has_events = inotify_dev_has_events(dev);
+ spin_unlock(&dev->lock);
+ if (has_events)
+ break;
+
+ if (file->f_flags & O_NONBLOCK)
+ return -EAGAIN;
+
+ if (signal_pending(current))
+ return -EINTR;
+
+ add_wait_queue(&dev->wait, &wait);
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ schedule();
+
+ set_current_state(TASK_RUNNING);
+ remove_wait_queue(&dev->wait, &wait);
+ }
+
+ while (count >= event_size) {
+ struct inotify_kernel_event *kevent;
+
+ spin_lock(&dev->lock);
+ if (!inotify_dev_has_events(dev)) {
+ spin_unlock(&dev->lock);
+ break;
+ }
+ kevent = inotify_dev_get_event(dev);
+ spin_unlock(&dev->lock);
+
+ /* We can't send this event, not enough space in the buffer */
+ if (event_size + kevent->event.len > count)
+ break;
+
+ /* Copy the entire event except the string to user space */
+ if (copy_to_user(buf, &kevent->event, event_size))
+ return -EFAULT;
+
+ buf += event_size;
+ count -= event_size;
+
+ /* Copy the filename to user space */
+ if (kevent->filename) {
+ if (copy_to_user(buf, kevent->filename,
+ kevent->event.len))
+ return -EFAULT;
+ buf += kevent->event.len;
+ count -= kevent->event.len;
+ }
+
+ spin_lock(&dev->lock);
+ inotify_dev_event_dequeue(dev);
+ spin_unlock(&dev->lock);
+ }
+
+ return buf - start;
+}
+
+static int inotify_open(struct inode *inode, struct file *file)
+{
+ struct inotify_device *dev;
+ struct user_struct *user;
+ int ret;
+
+ user = get_uid(current->user);
+
+ if (atomic_read(&user->inotify_devs) >= sysfs_attrib_max_user_devices) {
+ ret = -EMFILE;
+ goto out_err;
+ }
+
+ dev = kmalloc(sizeof(struct inotify_device), GFP_KERNEL);
+ if (!dev) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ atomic_inc(&current->user->inotify_devs);
+
+ idr_init(&dev->idr);
+
+ INIT_LIST_HEAD(&dev->events);
+ INIT_LIST_HEAD(&dev->watches);
+ init_waitqueue_head(&dev->wait);
+
+ dev->event_count = 0;
+ dev->queue_size = 0;
+ dev->max_events = sysfs_attrib_max_queued_events;
+ dev->user = user;
+ spin_lock_init(&dev->lock);
+
+ file->private_data = dev;
+
+ return 0;
+out_err:
+ free_uid(current->user);
+ return ret;
+}
+
+/*
+ * inotify_release_all_watches - destroy all watches on a given device
+ *
+ * FIXME: We need a lock on the watch here.
+ */
+static void inotify_release_all_watches(struct inotify_device *dev)
+{
+ struct inotify_watch *watch, *next;
+
+ list_for_each_entry_safe(watch, next, &dev->watches, d_list)
+ remove_watch(watch);
+}
+
+/*
+ * inotify_release_all_events - destroy all of the events on a given device
+ */
+static void inotify_release_all_events(struct inotify_device *dev)
+{
+ spin_lock(&dev->lock);
+ while (inotify_dev_has_events(dev))
+ inotify_dev_event_dequeue(dev);
+ spin_unlock(&dev->lock);
+}
+
+static int inotify_release(struct inode *inode, struct file *file)
+{
+ struct inotify_device *dev;
+
+ dev = file->private_data;
+
+ inotify_release_all_watches(dev);
+ inotify_release_all_events(dev);
+
+ atomic_dec(&dev->user->inotify_devs);
+ free_uid(dev->user);
+
+ kfree(dev);
+
+ return 0;
+}
+
+static int inotify_add_watch(struct inotify_device *dev,
+ struct inotify_watch_request *request)
+{
+ struct inode *inode;
+ struct inotify_watch *watch;
+ struct nameidata nd;
+ int ret;
+
+ ret = find_inode((const char __user*) request->name, &nd);
+ if (ret)
+ return ret;
+
+ /* held in place by references in nd */
+ inode = nd.dentry->d_inode;
+
+ spin_lock(&dev->lock);
+
+ /*
+ * This handles the case of re-adding a directory we are already
+ * watching: we just update the mask and return the existing wd.
+ */
+ if (inotify_dev_is_watching_inode(dev, inode)) {
+ struct inotify_watch *owatch; /* the old watch */
+
+ owatch = inode_find_dev(inode, dev);
+ owatch->mask = request->mask;
+ spin_unlock(&dev->lock);
+ path_release(&nd);
+
+ return owatch->wd;
+ }
+
+ spin_unlock(&dev->lock);
+
+ watch = create_watch(dev, request->mask, inode);
+ if (!watch) {
+ path_release(&nd);
+ return -ENOSPC;
+ }
+
+ spin_lock(&dev->lock);
+
+ /* We can't add any more watches to this device */
+ if (inotify_dev_add_watch(dev, watch)) {
+ delete_watch(dev, watch);
+ spin_unlock(&dev->lock);
+ path_release(&nd);
+ return -EINVAL;
+ }
+
+ ret = inode_add_watch(inode, watch);
+ if (ret < 0) {
+ list_del_init(&watch->d_list);
+ delete_watch(dev, watch);
+ spin_unlock(&dev->lock);
+ path_release(&nd);
+ return ret;
+ }
+
+ spin_unlock(&dev->lock);
+
+ /*
+ * Demote the reference to nameidata to a reference to the inode held
+ * by the watch.
+ */
+ spin_lock(&inode_lock);
+ __iget(inode);
+ spin_unlock(&inode_lock);
+ path_release(&nd);
+
+ return watch->wd;
+}
+
+static int inotify_ignore(struct inotify_device *dev, s32 wd)
+{
+ struct inotify_watch *watch;
+ int ret = 0;
+
+ /* hold dev->lock across the lookup and the removal; 'out' drops it */
+ spin_lock(&dev->lock);
+ watch = dev_find_wd(dev, wd);
+ if (!watch) {
+ ret = -EINVAL;
+ goto out;
+ }
+ __remove_watch(watch, dev);
+
+out:
+ spin_unlock(&dev->lock);
+ return ret;
+}
+
+/*
+ * inotify_ioctl() - our device file's ioctl method
+ *
+ * The VFS serializes all of our calls via the BKL and we rely on that. We
+ * could, alternatively, grab dev->lock. Right now lower levels grab that
+ * where needed.
+ */
+static int inotify_ioctl(struct inode *ip, struct file *fp,
+ unsigned int cmd, unsigned long arg)
+{
+ struct inotify_device *dev;
+ struct inotify_watch_request request;
+ void __user *p;
+ s32 wd;
+
+ dev = fp->private_data;
+ p = (void __user *) arg;
+
+ switch (cmd) {
+ case INOTIFY_WATCH:
+ if (copy_from_user(&request, p, sizeof (request)))
+ return -EFAULT;
+ return inotify_add_watch(dev, &request);
+ case INOTIFY_IGNORE:
+ if (copy_from_user(&wd, p, sizeof (wd)))
+ return -EFAULT;
+ return inotify_ignore(dev, wd);
+ case FIONREAD:
+ return put_user(dev->queue_size, (int __user *) p);
+ default:
+ return -ENOTTY;
+ }
+}
+
+static struct file_operations inotify_fops = {
+ .owner = THIS_MODULE,
+ .poll = inotify_poll,
+ .read = inotify_read,
+ .open = inotify_open,
+ .release = inotify_release,
+ .ioctl = inotify_ioctl,
+};
+
+static struct miscdevice inotify_device = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "inotify",
+ .fops = &inotify_fops,
+};
+
+static int __init inotify_init(void)
+{
+ struct class_device *class;
+ int ret;
+
+ ret = misc_register(&inotify_device);
+ if (ret)
+ return ret;
+
+ sysfs_attrib_max_queued_events = 512;
+ sysfs_attrib_max_user_devices = 64;
+ sysfs_attrib_max_user_watches = 16384;
+
+ class = inotify_device.class;
+ class_device_create_file(class, &class_device_attr_max_queued_events);
+ class_device_create_file(class, &class_device_attr_max_user_devices);
+ class_device_create_file(class, &class_device_attr_max_user_watches);
+
+ atomic_set(&inotify_cookie, 0);
+
+ watch_cachep = kmem_cache_create("inotify_watch_cache",
+ sizeof(struct inotify_watch), 0, SLAB_PANIC,
+ NULL, NULL);
+
+ event_cachep = kmem_cache_create("inotify_event_cache",
+ sizeof(struct inotify_kernel_event), 0,
+ SLAB_PANIC, NULL, NULL);
+
+ inode_data_cachep = kmem_cache_create("inotify_inode_data_cache",
+ sizeof(struct inotify_inode_data), 0, SLAB_PANIC,
+ NULL, NULL);
+
+ printk(KERN_INFO "inotify device minor=%d\n", inotify_device.minor);
+
+ return 0;
+}
+
+module_init(inotify_init);
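
The filename padding in kernel_event() above rounds the name length (including the terminating NUL) up to the next multiple of sizeof(struct inotify_event). A minimal user-space sketch of that arithmetic follows. EVENT_SIZE is an assumed stand-in value, not taken from inotify.h, and both function names are hypothetical. The first function follows the same cases as the patch; the second is the conventional round-up idiom, which should give identical results.

```c
#include <stddef.h>

/* Assumed stand-in for sizeof(struct inotify_event); not the real value. */
#define EVENT_SIZE ((size_t)16)

/* Follows the same three cases as the padding code in kernel_event(). */
static size_t padded_len_as_in_patch(size_t name_len)
{
	size_t len = name_len + 1;	/* room for the trailing NUL */
	size_t rem;

	if (len <= EVENT_SIZE)
		rem = EVENT_SIZE - len;			/* short name: fill one slot */
	else if (len % EVENT_SIZE)
		rem = EVENT_SIZE - (len % EVENT_SIZE);	/* pad to the next multiple */
	else
		rem = 0;				/* already a multiple */

	return len + rem;
}

/* The usual round-up-to-multiple idiom, for comparison. */
static size_t padded_len_round_up(size_t name_len)
{
	size_t len = name_len + 1;

	return (len + EVENT_SIZE - 1) / EVENT_SIZE * EVENT_SIZE;
}
```

With the assumed size of 16, a 3-character name occupies 16 bytes and a 16-character name occupies 32, because the NUL pushes it past the first slot.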
diff -urN linux-2.6.11-rc3-mm2/drivers/char/Kconfig linux-mm-inotify/drivers/char/Kconfig
--- linux-2.6.11-rc3-mm2/drivers/char/Kconfig 2005-02-10 13:17:46.349025952 -0500
+++ linux-mm-inotify/drivers/char/Kconfig 2005-02-10 13:18:40.361814760 -0500
@@ -62,6 +62,19 @@
depends on VT && !S390 && !USERMODE
default y
+config INOTIFY
+ bool "Inotify file change notification support"
+ default y
+ ---help---
+ Say Y here to enable inotify support and the /dev/inotify character
+ device. Inotify is a file change notification system and a
+ replacement for dnotify. Inotify fixes numerous shortcomings in
+ dnotify and introduces several new features. It allows monitoring
+ of both files and directories via a single open fd. Multiple file
+ events are supported.
+
+ If unsure, say Y.
+
config SERIAL_NONSTANDARD
bool "Non-standard serial port support"
---help---
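
Because a single read() on the device can return several events, each a fixed header followed by `len` bytes of NUL-padded filename (see inotify_read() in this patch), user space has to walk the buffer record by record. A minimal sketch of such a walker follows. struct ev_header is a hypothetical mirror of struct inotify_event with an assumed field order (check inotify.h in this patch for the real layout), and next_event() is an illustrative name, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of struct inotify_event; the field order is assumed. */
struct ev_header {
	int32_t  wd;		/* watch descriptor */
	uint32_t mask;		/* event mask */
	uint32_t cookie;	/* ties related events together */
	uint32_t len;		/* bytes of padded filename after the header */
};

/*
 * Decode the record at *off in a buffer of n bytes filled by read().
 * Returns 1 and advances *off past the record on success, 0 at the end
 * of the buffer, and -1 if the remaining bytes do not hold a full record
 * (in which case *off should not be reused).
 */
static int next_event(const char *buf, size_t n, size_t *off,
		      struct ev_header *ev, const char **name)
{
	if (*off >= n)
		return 0;
	if (n - *off < sizeof(*ev))
		return -1;

	/* memcpy avoids assuming the buffer is suitably aligned */
	memcpy(ev, buf + *off, sizeof(*ev));
	*off += sizeof(*ev);

	if (ev->len > n - *off)
		return -1;

	/* events without a filename carry len == 0 */
	*name = ev->len ? buf + *off : NULL;
	*off += ev->len;
	return 1;
}
```

A caller would loop on next_event() until it returns 0, treating -1 as a sign that the buffer passed to read() was handled incorrectly.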
diff -urN linux-2.6.11-rc3-mm2/drivers/char/Makefile linux-mm-inotify/drivers/char/Makefile
--- linux-2.6.11-rc3-mm2/drivers/char/Makefile 2005-02-10 13:17:46.352025496 -0500
+++ linux-mm-inotify/drivers/char/Makefile 2005-02-10 13:18:40.362814608 -0500
@@ -9,6 +9,8 @@
obj-y += mem.o random.o tty_io.o n_tty.o tty_ioctl.o
+
+obj-$(CONFIG_INOTIFY) += inotify.o
obj-$(CONFIG_LEGACY_PTYS) += pty.o
obj-$(CONFIG_UNIX98_PTYS) += pty.o
obj-y += misc.o
diff -urN linux-2.6.11-rc3-mm2/fs/attr.c linux-mm-inotify/fs/attr.c
--- linux-2.6.11-rc3-mm2/fs/attr.c 2005-02-10 13:17:35.850621952 -0500
+++ linux-mm-inotify/fs/attr.c 2005-02-10 13:19:39.655800704 -0500
@@ -10,7 +10,7 @@
#include <linux/mm.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/fcntl.h>
#include <linux/quotaops.h>
#include <linux/security.h>
@@ -107,31 +107,8 @@
out:
return error;
}
-
EXPORT_SYMBOL(inode_setattr);
-int setattr_mask(unsigned int ia_valid)
-{
- unsigned long dn_mask = 0;
-
- if (ia_valid & ATTR_UID)
- dn_mask |= DN_ATTRIB;
- if (ia_valid & ATTR_GID)
- dn_mask |= DN_ATTRIB;
- if (ia_valid & ATTR_SIZE)
- dn_mask |= DN_MODIFY;
- /* both times implies a utime(s) call */
- if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME))
- dn_mask |= DN_ATTRIB;
- else if (ia_valid & ATTR_ATIME)
- dn_mask |= DN_ACCESS;
- else if (ia_valid & ATTR_MTIME)
- dn_mask |= DN_MODIFY;
- if (ia_valid & ATTR_MODE)
- dn_mask |= DN_ATTRIB;
- return dn_mask;
-}
-
int notify_change(struct dentry * dentry, struct iattr * attr)
{
struct inode *inode = dentry->d_inode;
@@ -194,11 +171,9 @@
if (ia_valid & ATTR_SIZE)
up_write(&dentry->d_inode->i_alloc_sem);
- if (!error) {
- unsigned long dn_mask = setattr_mask(ia_valid);
- if (dn_mask)
- dnotify_parent(dentry, dn_mask);
- }
+ if (!error)
+ fsnotify_change(dentry, ia_valid);
+
return error;
}
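
The hunk above replaces the open-coded setattr_mask()/dnotify_parent() pair with a single call to fsnotify_change(). For reference, the dnotify mapping that the removed setattr_mask() expressed can be exercised in user space as sketched below. The EX_* constants are illustrative stand-ins with assumed values, not the kernel headers' definitions, and ex_setattr_mask() is a hypothetical name.

```c
/* Illustrative stand-ins for the kernel's ATTR_* flags (assumed values). */
enum {
	EX_ATTR_MODE  = 1 << 0,
	EX_ATTR_UID   = 1 << 1,
	EX_ATTR_GID   = 1 << 2,
	EX_ATTR_SIZE  = 1 << 3,
	EX_ATTR_ATIME = 1 << 4,
	EX_ATTR_MTIME = 1 << 5
};

/* Illustrative stand-ins for the dnotify DN_* flags (assumed values). */
enum {
	EX_DN_ACCESS = 1 << 0,
	EX_DN_MODIFY = 1 << 1,
	EX_DN_ATTRIB = 1 << 5
};

/* Same decision structure as the removed setattr_mask(). */
static unsigned long ex_setattr_mask(unsigned int ia_valid)
{
	const unsigned int both = EX_ATTR_ATIME | EX_ATTR_MTIME;
	unsigned long dn_mask = 0;

	/* ownership and mode changes are attribute events */
	if (ia_valid & (EX_ATTR_UID | EX_ATTR_GID | EX_ATTR_MODE))
		dn_mask |= EX_DN_ATTRIB;
	/* a size change modifies the file contents */
	if (ia_valid & EX_ATTR_SIZE)
		dn_mask |= EX_DN_MODIFY;
	/* both times set together implies a utime(s) call */
	if ((ia_valid & both) == both)
		dn_mask |= EX_DN_ATTRIB;
	else if (ia_valid & EX_ATTR_ATIME)
		dn_mask |= EX_DN_ACCESS;
	else if (ia_valid & EX_ATTR_MTIME)
		dn_mask |= EX_DN_MODIFY;

	return dn_mask;
}
```

Note the asymmetry: setting only the access time counts as an access, setting only the modification time counts as a modification, but setting both is reported as an attribute change.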
diff -urN linux-2.6.11-rc3-mm2/fs/compat.c linux-mm-inotify/fs/compat.c
--- linux-2.6.11-rc3-mm2/fs/compat.c 2005-02-10 13:17:35.890615872 -0500
+++ linux-mm-inotify/fs/compat.c 2005-02-10 13:18:45.405048072 -0500
@@ -36,7 +36,7 @@
#include <linux/ctype.h>
#include <linux/module.h>
#include <linux/dirent.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/highuid.h>
#include <linux/sunrpc/svc.h>
#include <linux/nfsd/nfsd.h>
@@ -1233,9 +1233,15 @@
out:
if (iov != iovstack)
kfree(iov);
- if ((ret + (type == READ)) > 0)
- dnotify_parent(file->f_dentry,
- (type == READ) ? DN_ACCESS : DN_MODIFY);
+ if ((ret + (type == READ)) > 0) {
+ struct dentry *dentry = file->f_dentry;
+ if (type == READ)
+ fsnotify_access(dentry, dentry->d_inode,
+ dentry->d_name.name);
+ else
+ fsnotify_modify(dentry, dentry->d_inode,
+ dentry->d_name.name);
+ }
return ret;
}
diff -urN linux-2.6.11-rc3-mm2/fs/file_table.c linux-mm-inotify/fs/file_table.c
--- linux-2.6.11-rc3-mm2/fs/file_table.c 2005-02-10 13:17:35.967604168 -0500
+++ linux-mm-inotify/fs/file_table.c 2005-02-10 13:18:45.406047920 -0500
@@ -16,6 +16,7 @@
#include <linux/eventpoll.h>
#include <linux/mount.h>
#include <linux/cdev.h>
+#include <linux/fsnotify.h>
/* sysctl tunables... */
struct files_stat_struct files_stat = {
@@ -122,6 +123,9 @@
struct vfsmount *mnt = file->f_vfsmnt;
struct inode *inode = dentry->d_inode;
+
+ fsnotify_close(dentry, inode, file->f_mode, dentry->d_name.name);
+
might_sleep();
/*
* The function eventpoll_release() should be the first called
diff -urN linux-2.6.11-rc3-mm2/fs/inode.c linux-mm-inotify/fs/inode.c
--- linux-2.6.11-rc3-mm2/fs/inode.c 2005-02-10 13:17:47.899790200 -0500
+++ linux-mm-inotify/fs/inode.c 2005-02-10 13:18:45.407047768 -0500
@@ -132,6 +132,9 @@
#ifdef CONFIG_QUOTA
memset(&inode->i_dquot, 0, sizeof(inode->i_dquot));
#endif
+#ifdef CONFIG_INOTIFY
+ inode->inotify_data = NULL;
+#endif
inode->i_pipe = NULL;
inode->i_bdev = NULL;
inode->i_cdev = NULL;
diff -urN linux-2.6.11-rc3-mm2/fs/namei.c linux-mm-inotify/fs/namei.c
--- linux-2.6.11-rc3-mm2/fs/namei.c 2005-02-10 13:17:47.918787312 -0500
+++ linux-mm-inotify/fs/namei.c 2005-02-10 13:18:45.409047464 -0500
@@ -21,7 +21,7 @@
#include <linux/namei.h>
#include <linux/quotaops.h>
#include <linux/pagemap.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/smp_lock.h>
#include <linux/personality.h>
#include <linux/security.h>
@@ -1252,7 +1252,7 @@
DQUOT_INIT(dir);
error = dir->i_op->create(dir, dentry, mode, nd);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, dentry->d_name.name);
security_inode_post_create(dir, dentry, mode);
}
return error;
@@ -1557,7 +1557,7 @@
DQUOT_INIT(dir);
error = dir->i_op->mknod(dir, dentry, mode, dev);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, dentry->d_name.name);
security_inode_post_mknod(dir, dentry, mode, dev);
}
return error;
@@ -1630,7 +1630,7 @@
DQUOT_INIT(dir);
error = dir->i_op->mkdir(dir, dentry, mode);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_mkdir(dir, dentry->d_name.name);
security_inode_post_mkdir(dir,dentry, mode);
}
return error;
@@ -1720,10 +1720,8 @@
}
}
up(&dentry->d_inode->i_sem);
- if (!error) {
- inode_dir_notify(dir, DN_DELETE);
- d_delete(dentry);
- }
+ if (!error)
+ fsnotify_rmdir(dentry, dentry->d_inode, dir);
dput(dentry);
return error;
@@ -1793,10 +1791,9 @@
up(&dentry->d_inode->i_sem);
/* We don't d_delete() NFS sillyrenamed files--they still exist. */
- if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
- d_delete(dentry);
- inode_dir_notify(dir, DN_DELETE);
- }
+ if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED))
+ fsnotify_unlink(dentry->d_inode, dir, dentry);
+
return error;
}
@@ -1870,7 +1867,7 @@
DQUOT_INIT(dir);
error = dir->i_op->symlink(dir, dentry, oldname);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, dentry->d_name.name);
security_inode_post_symlink(dir, dentry, oldname);
}
return error;
@@ -1943,7 +1940,7 @@
error = dir->i_op->link(old_dentry, dir, new_dentry);
up(&old_dentry->d_inode->i_sem);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, new_dentry->d_name.name);
security_inode_post_link(old_dentry, dir, new_dentry);
}
return error;
@@ -2107,6 +2104,7 @@
{
int error;
int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
+ char *old_name;
if (old_dentry->d_inode == new_dentry->d_inode)
return 0;
@@ -2128,18 +2126,18 @@
DQUOT_INIT(old_dir);
DQUOT_INIT(new_dir);
+ old_name = fsnotify_oldname_init(old_dentry);
+
if (is_dir)
error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry);
else
error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry);
if (!error) {
- if (old_dir == new_dir)
- inode_dir_notify(old_dir, DN_RENAME);
- else {
- inode_dir_notify(old_dir, DN_DELETE);
- inode_dir_notify(new_dir, DN_CREATE);
- }
+ const char *new_name = old_dentry->d_name.name;
+ fsnotify_move(old_dir, new_dir, old_name, new_name);
}
+ fsnotify_oldname_free(old_name);
+
return error;
}
diff -urN linux-2.6.11-rc3-mm2/fs/open.c linux-mm-inotify/fs/open.c
--- linux-2.6.11-rc3-mm2/fs/open.c 2005-02-10 13:17:36.101583800 -0500
+++ linux-mm-inotify/fs/open.c 2005-02-10 13:18:45.410047312 -0500
@@ -10,7 +10,7 @@
#include <linux/file.h>
#include <linux/smp_lock.h>
#include <linux/quotaops.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/tty.h>
@@ -944,9 +944,14 @@
fd = get_unused_fd();
if (fd >= 0) {
struct file *f = filp_open(tmp, flags, mode);
+ struct dentry *dentry;
+
error = PTR_ERR(f);
if (IS_ERR(f))
goto out_error;
+ dentry = f->f_dentry;
+ fsnotify_open(dentry, dentry->d_inode,
+ dentry->d_name.name);
fd_install(fd, f);
}
out:
@@ -998,7 +1003,7 @@
retval = err;
}
- dnotify_flush(filp, id);
+ fsnotify_flush(filp, id);
locks_remove_posix(filp, id);
fput(filp);
return retval;
diff -urN linux-2.6.11-rc3-mm2/fs/read_write.c linux-mm-inotify/fs/read_write.c
--- linux-2.6.11-rc3-mm2/fs/read_write.c 2005-02-10 13:17:48.151751896 -0500
+++ linux-mm-inotify/fs/read_write.c 2005-02-10 13:31:10.191823272 -0500
@@ -10,7 +10,7 @@
#include <linux/file.h>
#include <linux/uio.h>
#include <linux/smp_lock.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/security.h>
#include <linux/module.h>
#include <linux/syscalls.h>
@@ -239,7 +239,10 @@
else
ret = do_sync_read(file, buf, count, pos);
if (ret > 0) {
- dnotify_parent(file->f_dentry, DN_ACCESS);
+ struct dentry *dentry = file->f_dentry;
+ struct inode *inode = dentry->d_inode;
+ fsnotify_access(dentry, inode,
+ dentry->d_name.name);
current->rchar += ret;
}
current->syscr++;
@@ -287,7 +290,10 @@
else
ret = do_sync_write(file, buf, count, pos);
if (ret > 0) {
- dnotify_parent(file->f_dentry, DN_MODIFY);
+ struct dentry *dentry = file->f_dentry;
+ struct inode *inode = dentry->d_inode;
+ fsnotify_modify(dentry, inode,
+ dentry->d_name.name);
current->wchar += ret;
}
current->syscw++;
@@ -523,9 +529,15 @@
out:
if (iov != iovstack)
kfree(iov);
- if ((ret + (type == READ)) > 0)
- dnotify_parent(file->f_dentry,
- (type == READ) ? DN_ACCESS : DN_MODIFY);
+ if ((ret + (type == READ)) > 0) {
+ struct dentry *dentry = file->f_dentry;
+ struct inode *inode = dentry->d_inode;
+
+ if (type == READ)
+ fsnotify_access(dentry, inode, dentry->d_name.name);
+ else
+ fsnotify_modify(dentry, inode, dentry->d_name.name);
+ }
return ret;
Efault:
ret = -EFAULT;
diff -urN linux-2.6.11-rc3-mm2/fs/super.c linux-mm-inotify/fs/super.c
--- linux-2.6.11-rc3-mm2/fs/super.c 2005-02-10 13:17:36.202568448 -0500
+++ linux-mm-inotify/fs/super.c 2005-02-10 13:18:45.413046856 -0500
@@ -37,9 +37,9 @@
#include <linux/writeback.h> /* for the emergency remount stuff */
#include <linux/idr.h>
#include <linux/kobject.h>
+#include <linux/fsnotify.h>
#include <asm/uaccess.h>
-
void get_filesystem(struct file_system_type *fs);
void put_filesystem(struct file_system_type *fs);
struct file_system_type *get_fs_type(const char *name);
@@ -229,6 +229,7 @@
if (root) {
sb->s_root = NULL;
+ fsnotify_sb_umount(sb);
shrink_dcache_parent(root);
shrink_dcache_anon(&sb->s_anon);
dput(root);
diff -urN linux-2.6.11-rc3-mm2/include/linux/fs.h linux-mm-inotify/include/linux/fs.h
--- linux-2.6.11-rc3-mm2/include/linux/fs.h 2005-02-10 13:17:49.275581048 -0500
+++ linux-mm-inotify/include/linux/fs.h 2005-02-10 13:18:45.415046552 -0500
@@ -26,6 +26,7 @@
struct kstatfs;
struct vm_area_struct;
struct vfsmount;
+struct inotify_inode_data;
/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -474,6 +475,10 @@
struct dnotify_struct *i_dnotify; /* for directory notifications */
#endif
+#ifdef CONFIG_INOTIFY
+ struct inotify_inode_data *inotify_data;
+#endif
+
unsigned long i_state;
unsigned long dirtied_when; /* jiffies of first dirtying */
@@ -1374,7 +1379,7 @@
extern int do_remount_sb(struct super_block *sb, int flags,
void *data, int force);
extern sector_t bmap(struct inode *, sector_t);
-extern int setattr_mask(unsigned int);
+extern void setattr_mask(unsigned int, int *, u32 *);
extern int notify_change(struct dentry *, struct iattr *);
extern int permission(struct inode *, int, struct nameidata *);
extern int generic_permission(struct inode *, int,
diff -urN linux-2.6.11-rc3-mm2/include/linux/fsnotify.h linux-mm-inotify/include/linux/fsnotify.h
--- linux-2.6.11-rc3-mm2/include/linux/fsnotify.h 1969-12-31 19:00:00.000000000 -0500
+++ linux-mm-inotify/include/linux/fsnotify.h 2005-02-10 13:18:45.416046400 -0500
@@ -0,0 +1,235 @@
+#ifndef _LINUX_FS_NOTIFY_H
+#define _LINUX_FS_NOTIFY_H
+
+/*
+ * include/linux/fsnotify.h - generic hooks for filesystem notification, to
+ * reduce in-source duplication from both dnotify and inotify.
+ *
+ * We don't compile any of this away in some complicated menagerie of ifdefs.
+ * Instead, we rely on the code inside to optimize away as needed.
+ *
+ * (C) Copyright 2005 Robert Love
+ */
+
+#ifdef __KERNEL__
+
+#include <linux/dnotify.h>
+#include <linux/inotify.h>
+
+/*
+ * fsnotify_move - file old_name at old_dir was moved to new_name at new_dir
+ */
+static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
+ const char *old_name, const char *new_name)
+{
+ u32 cookie;
+
+ if (old_dir == new_dir)
+ inode_dir_notify(old_dir, DN_RENAME);
+ else {
+ inode_dir_notify(old_dir, DN_DELETE);
+ inode_dir_notify(new_dir, DN_CREATE);
+ }
+
+ cookie = inotify_get_cookie();
+
+ inotify_inode_queue_event(old_dir, IN_MOVED_FROM, cookie, old_name);
+ inotify_inode_queue_event(new_dir, IN_MOVED_TO, cookie, new_name);
+}
+
+/*
+ * fsnotify_unlink - file was unlinked
+ */
+static inline void fsnotify_unlink(struct inode *inode, struct inode *dir,
+ struct dentry *dentry)
+{
+ inode_dir_notify(dir, DN_DELETE);
+ inotify_inode_queue_event(dir, IN_DELETE_FILE, 0, dentry->d_name.name);
+ inotify_inode_queue_event(inode, IN_DELETE_SELF, 0, NULL);
+
+ inotify_inode_is_dead(inode);
+ d_delete(dentry);
+}
+
+/*
+ * fsnotify_rmdir - directory was removed
+ */
+static inline void fsnotify_rmdir(struct dentry *dentry, struct inode *inode,
+ struct inode *dir)
+{
+ inode_dir_notify(dir, DN_DELETE);
+ inotify_inode_queue_event(dir, IN_DELETE_SUBDIR,0,dentry->d_name.name);
+ inotify_inode_queue_event(inode, IN_DELETE_SELF, 0, NULL);
+
+ inotify_inode_is_dead(inode);
+ d_delete(dentry);
+}
+
+/*
+ * fsnotify_create - filename was linked in
+ */
+static inline void fsnotify_create(struct inode *inode, const char *filename)
+{
+ inode_dir_notify(inode, DN_CREATE);
+ inotify_inode_queue_event(inode, IN_CREATE_FILE, 0, filename);
+}
+
+/*
+ * fsnotify_mkdir - directory 'name' was created
+ */
+static inline void fsnotify_mkdir(struct inode *inode, const char *name)
+{
+ inode_dir_notify(inode, DN_CREATE);
+ inotify_inode_queue_event(inode, IN_CREATE_SUBDIR, 0, name);
+}
+
+/*
+ * fsnotify_access - file was read
+ */
+static inline void fsnotify_access(struct dentry *dentry, struct inode *inode,
+ const char *filename)
+{
+ dnotify_parent(dentry, DN_ACCESS);
+ inotify_dentry_parent_queue_event(dentry, IN_ACCESS, 0,
+ dentry->d_name.name);
+ inotify_inode_queue_event(inode, IN_ACCESS, 0, NULL);
+}
+
+/*
+ * fsnotify_modify - file was modified
+ */
+static inline void fsnotify_modify(struct dentry *dentry, struct inode *inode,
+ const char *filename)
+{
+ dnotify_parent(dentry, DN_MODIFY);
+ inotify_dentry_parent_queue_event(dentry, IN_MODIFY, 0, filename);
+ inotify_inode_queue_event(inode, IN_MODIFY, 0, NULL);
+}
+
+/*
+ * fsnotify_open - file was opened
+ */
+static inline void fsnotify_open(struct dentry *dentry, struct inode *inode,
+ const char *filename)
+{
+ inotify_inode_queue_event(inode, IN_OPEN, 0, NULL);
+ inotify_dentry_parent_queue_event(dentry, IN_OPEN, 0, filename);
+}
+
+/*
+ * fsnotify_close - file was closed
+ */
+static inline void fsnotify_close(struct dentry *dentry, struct inode *inode,
+ mode_t mode, const char *filename)
+{
+ u32 mask;
+
+ mask = (mode & FMODE_WRITE) ? IN_CLOSE_WRITE : IN_CLOSE_NOWRITE;
+ inotify_dentry_parent_queue_event(dentry, mask, 0, filename);
+ inotify_inode_queue_event(inode, mask, 0, NULL);
+}
+
+/*
+ * fsnotify_change - notify_change event. file was modified and/or metadata
+ * was changed.
+ */
+static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
+{
+ int dn_mask = 0;
+ u32 in_mask = 0;
+
+ if (ia_valid & ATTR_UID) {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ }
+ if (ia_valid & ATTR_GID) {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ }
+ if (ia_valid & ATTR_SIZE) {
+ in_mask |= IN_MODIFY;
+ dn_mask |= DN_MODIFY;
+ }
+ /* both times implies a utime(s) call */
+ if ((ia_valid & (ATTR_ATIME | ATTR_MTIME)) == (ATTR_ATIME | ATTR_MTIME))
+ {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ } else if (ia_valid & ATTR_ATIME) {
+ in_mask |= IN_ACCESS;
+ dn_mask |= DN_ACCESS;
+ } else if (ia_valid & ATTR_MTIME) {
+ in_mask |= IN_MODIFY;
+ dn_mask |= DN_MODIFY;
+ }
+ if (ia_valid & ATTR_MODE) {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ }
+
+ if (dn_mask)
+ dnotify_parent(dentry, dn_mask);
+ if (in_mask) {
+ inotify_inode_queue_event(dentry->d_inode, in_mask, 0, NULL);
+ inotify_dentry_parent_queue_event(dentry, in_mask, 0,
+ dentry->d_name.name);
+ }
+}
+
+/*
+ * fsnotify_sb_umount - filesystem unmount
+ */
+static inline void fsnotify_sb_umount(struct super_block *sb)
+{
+ inotify_super_block_umount(sb);
+}
+
+/*
+ * fsnotify_flush - flush time!
+ */
+static inline void fsnotify_flush(struct file *filp, fl_owner_t id)
+{
+ dnotify_flush(filp, id);
+}
+
+#ifdef CONFIG_INOTIFY /* inotify helpers */
+
+/*
+ * fsnotify_oldname_init - save off the old filename before we change it
+ *
+ * this could be kstrdup if only we could add that to lib/string.c
+ */
+static inline char *fsnotify_oldname_init(struct dentry *old_dentry)
+{
+ char *old_name;
+
+ old_name = kmalloc(strlen(old_dentry->d_name.name) + 1, GFP_KERNEL);
+ if (old_name)
+ strcpy(old_name, old_dentry->d_name.name);
+ return old_name;
+}
+
+/*
+ * fsnotify_oldname_free - free the name we got from fsnotify_oldname_init
+ */
+static inline void fsnotify_oldname_free(const char *old_name)
+{
+ kfree(old_name);
+}
+
+#else /* CONFIG_INOTIFY */
+
+static inline char *fsnotify_oldname_init(struct dentry *old_dentry)
+{
+ return NULL;
+}
+
+static inline void fsnotify_oldname_free(const char *old_name)
+{
+}
+
+#endif /* ! CONFIG_INOTIFY */
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_FS_NOTIFY_H */
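
The attribute-to-event mapping in fsnotify_change() above can be exercised outside the kernel. The following is a minimal userspace sketch, not the patch's code: the `SK_ATTR_*` flags are illustrative stand-ins for the kernel's ATTR_* bits from <linux/fs.h>, the `sketch_` function name is hypothetical, and only the inotify half of the mapping is reproduced (the dnotify mask follows the same branches).

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for the kernel's ATTR_* flags. */
enum {
	SK_ATTR_MODE  = 1 << 0,
	SK_ATTR_UID   = 1 << 1,
	SK_ATTR_GID   = 1 << 2,
	SK_ATTR_SIZE  = 1 << 3,
	SK_ATTR_ATIME = 1 << 4,
	SK_ATTR_MTIME = 1 << 5,
};

/* Event bits copied from the inotify.h hunk of this patch. */
#define IN_ACCESS 0x00000001u
#define IN_MODIFY 0x00000002u
#define IN_ATTRIB 0x00000004u

/* Same branch structure as fsnotify_change(), inotify mask only. */
uint32_t sketch_change_mask(unsigned int ia_valid)
{
	const unsigned int both = SK_ATTR_ATIME | SK_ATTR_MTIME;
	uint32_t mask = 0;

	if (ia_valid & (SK_ATTR_UID | SK_ATTR_GID | SK_ATTR_MODE))
		mask |= IN_ATTRIB;
	if (ia_valid & SK_ATTR_SIZE)
		mask |= IN_MODIFY;

	/* Both timestamps together imply a utime(s) call. */
	if ((ia_valid & both) == both)
		mask |= IN_ATTRIB;
	else if (ia_valid & SK_ATTR_ATIME)
		mask |= IN_ACCESS;
	else if (ia_valid & SK_ATTR_MTIME)
		mask |= IN_MODIFY;

	return mask;
}
```

Note the asymmetry this makes visible: setting both timestamps reports an attribute change, while setting only one of them reports an access or a modification.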
diff -urN linux-2.6.11-rc3-mm2/include/linux/inotify.h linux-mm-inotify/include/linux/inotify.h
--- linux-2.6.11-rc3-mm2/include/linux/inotify.h 1969-12-31 19:00:00.000000000 -0500
+++ linux-mm-inotify/include/linux/inotify.h 2005-02-10 13:18:45.417046248 -0500
@@ -0,0 +1,118 @@
+/*
+ * Inode based directory notification for Linux
+ *
+ * Copyright (C) 2005 John McCutchan
+ */
+
+#ifndef _LINUX_INOTIFY_H
+#define _LINUX_INOTIFY_H
+
+#include <linux/types.h>
+#include <linux/limits.h>
+
+/*
+ * struct inotify_event - structure read from the inotify device for each event
+ *
+ * When you are watching a directory, you will receive the filename for events
+ * such as IN_CREATE, IN_DELETE, IN_OPEN, IN_CLOSE, ..., relative to the wd.
+ */
+struct inotify_event {
+ __s32 wd; /* watch descriptor */
+ __u32 mask; /* watch mask */
+ __u32 cookie; /* cookie used for synchronizing two events */
+ size_t len; /* length (including nulls) of name */
+ char name[0]; /* stub for possible name */
+};
+
+/*
+ * struct inotify_watch_request - represents a watch request
+ *
+ * Pass to the inotify device via the INOTIFY_WATCH ioctl
+ */
+struct inotify_watch_request {
+ char *name; /* directory name */
+ __u32 mask; /* event mask */
+};
+
+/* the following are legal, implemented events */
+#define IN_ACCESS 0x00000001 /* File was accessed */
+#define IN_MODIFY 0x00000002 /* File was modified */
+#define IN_ATTRIB 0x00000004 /* File changed attributes */
+#define IN_CLOSE_WRITE 0x00000008 /* Writable file was closed */
+#define IN_CLOSE_NOWRITE 0x00000010 /* Unwritable file closed */
+#define IN_OPEN 0x00000020 /* File was opened */
+#define IN_MOVED_FROM 0x00000040 /* File was moved from X */
+#define IN_MOVED_TO 0x00000080 /* File was moved to Y */
+#define IN_DELETE_SUBDIR 0x00000100 /* Subdir was deleted */
+#define IN_DELETE_FILE 0x00000200 /* Subfile was deleted */
+#define IN_CREATE_SUBDIR 0x00000400 /* Subdir was created */
+#define IN_CREATE_FILE 0x00000800 /* Subfile was created */
+#define IN_DELETE_SELF 0x00001000 /* Self was deleted */
+#define IN_UNMOUNT 0x00002000 /* Backing fs was unmounted */
+#define IN_Q_OVERFLOW 0x00004000 /* Event queue overflowed */
+#define IN_IGNORED 0x00008000 /* File was ignored */
+
+/* special flags */
+#define IN_ALL_EVENTS 0xffffffff /* All the events */
+#define IN_CLOSE (IN_CLOSE_WRITE | IN_CLOSE_NOWRITE)
+
+#define INOTIFY_IOCTL_MAGIC 'Q'
+#define INOTIFY_IOCTL_MAXNR 2
+
+#define INOTIFY_WATCH _IOR(INOTIFY_IOCTL_MAGIC, 1, struct inotify_watch_request)
+#define INOTIFY_IGNORE _IOR(INOTIFY_IOCTL_MAGIC, 2, int)
+
+#ifdef __KERNEL__
+
+#include <linux/dcache.h>
+#include <linux/fs.h>
+#include <linux/config.h>
+
+struct inotify_inode_data {
+ struct list_head watches; /* list of watches on this inode */
+ spinlock_t lock; /* lock protecting the struct */
+ atomic_t count; /* ref count */
+};
+
+#ifdef CONFIG_INOTIFY
+
+extern void inotify_inode_queue_event(struct inode *, __u32, __u32,
+ const char *);
+extern void inotify_dentry_parent_queue_event(struct dentry *, __u32, __u32,
+ const char *);
+extern void inotify_super_block_umount(struct super_block *);
+extern void inotify_inode_is_dead(struct inode *);
+extern __u32 inotify_get_cookie(void);
+
+#else
+
+static inline void inotify_inode_queue_event(struct inode *inode,
+ __u32 mask, __u32 cookie,
+ const char *filename)
+{
+}
+
+static inline void inotify_dentry_parent_queue_event(struct dentry *dentry,
+ __u32 mask, __u32 cookie,
+ const char *filename)
+{
+}
+
+static inline void inotify_super_block_umount(struct super_block *sb)
+{
+}
+
+static inline void inotify_inode_is_dead(struct inode *inode)
+{
+}
+
+static inline __u32 inotify_get_cookie(void)
+{
+ return 0;
+}
+
+#endif /* CONFIG_INOTIFY */
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_INOTIFY_H */
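
Because struct inotify_event ends in a variable-length name, a reader has to step through the buffer record by record, advancing by the header size plus `len` each time. Below is a minimal userspace sketch of that walk. It assumes the layout declared in the header above (including the `size_t len` field, so the record size depends on the ABI); the `sketch_` names and the in-memory buffer are hypothetical and stand in for data read from the inotify device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Event bits copied from the inotify.h hunk above. */
#define IN_OPEN       0x00000020u
#define IN_MOVED_FROM 0x00000040u
#define IN_MOVED_TO   0x00000080u

/* Userspace mirror of struct inotify_event as declared in this patch. */
struct sketch_inotify_event {
	int32_t  wd;      /* watch descriptor */
	uint32_t mask;    /* event mask */
	uint32_t cookie;  /* ties IN_MOVED_FROM to IN_MOVED_TO */
	size_t   len;     /* length of name, including the NUL */
	char     name[];  /* optional name */
};

/* Append one record at offset off; returns the new offset, or 0 if it does not fit. */
size_t sketch_put_event(char *buf, size_t cap, size_t off, int32_t wd,
			uint32_t mask, uint32_t cookie, const char *name)
{
	struct sketch_inotify_event hdr;

	memset(&hdr, 0, sizeof(hdr));
	hdr.wd = wd;
	hdr.mask = mask;
	hdr.cookie = cookie;
	hdr.len = name ? strlen(name) + 1 : 0;

	if (off > cap || cap - off < sizeof(hdr) + hdr.len)
		return 0;
	memcpy(buf + off, &hdr, sizeof(hdr));
	if (hdr.len)
		memcpy(buf + off + sizeof(hdr), name, hdr.len);
	return off + sizeof(hdr) + hdr.len;
}

/* Count records whose mask intersects want; returns -1 on a truncated record. */
int sketch_count_events(const char *buf, size_t len, uint32_t want)
{
	size_t off = 0;
	int n = 0;

	while (off < len) {
		struct sketch_inotify_event hdr;

		if (len - off < sizeof(hdr))
			return -1;
		/* Copy the header out so unaligned offsets are safe. */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (len - off - sizeof(hdr) < hdr.len)
			return -1;
		if (hdr.mask & want)
			n++;
		off += sizeof(hdr) + hdr.len;
	}
	return n;
}

/* Builds a rename pair sharing one cookie plus an unrelated open, then counts the moves. */
int sketch_demo(void)
{
	char buf[256];
	size_t off = 0;

	off = sketch_put_event(buf, sizeof(buf), off, 1, IN_MOVED_FROM, 7, "old");
	assert(off != 0);
	off = sketch_put_event(buf, sizeof(buf), off, 1, IN_MOVED_TO, 7, "new");
	assert(off != 0);
	off = sketch_put_event(buf, sizeof(buf), off, 1, IN_OPEN, 0, NULL);
	assert(off != 0);

	return sketch_count_events(buf, off, IN_MOVED_FROM | IN_MOVED_TO);
}
```

A real consumer would match the two halves of a rename by comparing the `cookie` fields of the IN_MOVED_FROM and IN_MOVED_TO records, which is what the shared cookie from fsnotify_move() is for.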
diff -urN linux-2.6.11-rc3-mm2/include/linux/miscdevice.h linux-mm-inotify/include/linux/miscdevice.h
--- linux-2.6.11-rc3-mm2/include/linux/miscdevice.h 2005-02-10 13:17:37.578359296 -0500
+++ linux-mm-inotify/include/linux/miscdevice.h 2005-02-10 13:18:45.418046096 -0500
@@ -2,6 +2,7 @@
#define _LINUX_MISCDEVICE_H
#include <linux/module.h>
#include <linux/major.h>
+#include <linux/device.h>
#define PSMOUSE_MINOR 1
#define MS_BUSMOUSE_MINOR 2
diff -urN linux-2.6.11-rc3-mm2/include/linux/sched.h linux-mm-inotify/include/linux/sched.h
--- linux-2.6.11-rc3-mm2/include/linux/sched.h 2005-02-10 13:17:49.435556728 -0500
+++ linux-mm-inotify/include/linux/sched.h 2005-02-10 13:18:45.419045944 -0500
@@ -403,6 +403,8 @@
atomic_t processes; /* How many processes does this user have? */
atomic_t files; /* How many open files does this user have? */
atomic_t sigpending; /* How many pending signals does this user have? */
+ atomic_t inotify_watches; /* How many inotify watches does this user have? */
+ atomic_t inotify_devs; /* How many inotify devs does this user have opened? */
/* protected by mq_lock */
unsigned long mq_bytes; /* How many bytes can be allocated to mqueue? */
unsigned long locked_shm; /* How many pages of mlocked shm ? */
diff -urN linux-2.6.11-rc3-mm2/kernel/user.c linux-mm-inotify/kernel/user.c
--- linux-2.6.11-rc3-mm2/kernel/user.c 2005-02-10 13:17:49.580534688 -0500
+++ linux-mm-inotify/kernel/user.c 2005-02-10 13:18:45.420045792 -0500
@@ -120,6 +120,8 @@
atomic_set(&new->processes, 0);
atomic_set(&new->files, 0);
atomic_set(&new->sigpending, 0);
+ atomic_set(&new->inotify_watches, 0);
+ atomic_set(&new->inotify_devs, 0);
new->mq_bytes = 0;
new->locked_shm = 0;
The following warning comes from Linus' tree:
<-- snip -->
...
CC drivers/char/mxser.o
drivers/char/mxser.c: In function `mxser_initbrd':
drivers/char/mxser.c:551: warning: unused variable `flags'
...
<-- snip -->
The fix is simple:
Signed-off-by: Adrian Bunk <[email protected]>
--- linux-2.6.11-rc3-mm2-full/drivers/char/mxser.c.old 2005-02-10 19:58:36.000000000 +0100
+++ linux-2.6.11-rc3-mm2-full/drivers/char/mxser.c 2005-02-10 19:58:56.000000000 +0100
@@ -548,7 +548,6 @@
static int mxser_initbrd(int board, struct mxser_hwconf *hwconf)
{
struct mxser_struct *info;
- unsigned long flags;
int retval;
int i, n;
Christoph Hellwig <[email protected]> wrote:
>
> On Thu, Feb 10, 2005 at 02:35:08AM -0800, Andrew Morton wrote:
> >
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
> >
> >
> > - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> > It seems that nothing else is going to come along and this is completely
> > encapsulated.
>
> Even if we accept a module that grants capabilities to groups this isn't fine
> yet because it only supports two specific capabilities (and even those two in
> different ways!) instead of adding generic support to bind capabilities to
> groups.
I'm sure that got discussed somewhere in the 1000 emails which flew past
last time. Jack?
[direct reply bounced, resending via gmail]
Andrew Morton <[email protected]> writes:
> Christoph Hellwig <[email protected]> wrote:
> >
> > On Thu, Feb 10, 2005 at 02:35:08AM -0800, Andrew Morton wrote:
> > >
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
> > >
> > >
> > > - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> > > It seems that nothing else is going to come along and this is completely
> > > encapsulated.
> >
> > Even if we accept a module that grants capabilities to groups this
> > isn't fine yet because it only supports two specific capabilities
> > (and even those two in different ways!) instead of adding generic
> > support to bind capabilities to groups.
>
> I'm sure that got discussed somewhere in the 1000 emails which flew past
> last time. Jack?
[adding cc: for the main discussion participants]
Most people felt that a more general capabilities module would be nice
to have. But, no one offered any code, or volunteered to work on it.
I have no objection to that approach, but am not willing or able to do
it myself. My opinion is that expanding the scope of the LSM would
significantly increase its security risk. That job needs to be done
very carefully, by someone with a deep understanding of the kernel's
internal use of capabilities.
Perhaps, Christoph's suggestion could become part of a more general
module, which might replace the RT-LSM in the 2.8 timeframe. Our LSM
is a modest solution aimed at solving the immediate needs of audio
developers and users with minimal impact on kernel security or
correctness.
Andrew Morton wrote:
>ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
>
>
>- Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> It seems that nothing else is going to come along and this is completely
> encapsulated.
>
>- Various other stuff. If anyone has a patch in here which they think
> should be in 2.6.11, please let me know. I'm intending to merge the
> following into 2.6.11:
>
> alpha-add-missing-dma_mapping_error.patch
> fix-compat-shmget-overflow.patch
> fix-shmget-for-ppc64-s390-64-sparc64.patch
> binfmt_elf-clearing-bss-may-fail.patch
> qlogic-warning-fixes.patch
> oprofile-exittext-referenced-in-inittext.patch
> force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
> oprofile-arm-xscale1-pmu-support-fix.patch
>
>
>
>
The following one should probably go in:
>+update-to-ipmi-driver-to-support-old-dmi-spec.patch
>
>
Systems with old data will not work correctly without it. There seem
to be a few of them out there.
-Corey
On Thu, 2005-02-10 at 02:35 -0800, Andrew Morton wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
>
>
> - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> It seems that nothing else is going to come along and this is completely
> encapsulated.
>
> - Various other stuff. If anyone has a patch in here which they think
> should be in 2.6.11, please let me know. I'm intending to merge the
> following into 2.6.11:
>
> alpha-add-missing-dma_mapping_error.patch
> fix-compat-shmget-overflow.patch
> fix-shmget-for-ppc64-s390-64-sparc64.patch
> binfmt_elf-clearing-bss-may-fail.patch
> qlogic-warning-fixes.patch
> oprofile-exittext-referenced-in-inittext.patch
> force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
> oprofile-arm-xscale1-pmu-support-fix.patch
Without the aty128fb and radeonfb updates, current 2.6.11 is a
regression on pmac as it breaks sleep support on previously working
laptops. If you don't intend to get at least
try_to_acquire_console_sem() and aty128fb fix in, in which case i can
send you a minimal radeonfb patch, then I'll have to make another patch
for 2.6.11 that reverts some of the arch changes to re-enable sleep on
those machines.
Ben.
Benjamin Herrenschmidt <[email protected]> wrote:
>
> On Thu, 2005-02-10 at 02:35 -0800, Andrew Morton wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
> >
> >
> > - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> > It seems that nothing else is going to come along and this is completely
> > encapsulated.
> >
> > - Various other stuff. If anyone has a patch in here which they think
> > should be in 2.6.11, please let me know. I'm intending to merge the
> > following into 2.6.11:
> >
> > alpha-add-missing-dma_mapping_error.patch
> > fix-compat-shmget-overflow.patch
> > fix-shmget-for-ppc64-s390-64-sparc64.patch
> > binfmt_elf-clearing-bss-may-fail.patch
> > qlogic-warning-fixes.patch
> > oprofile-exittext-referenced-in-inittext.patch
> > force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
> > oprofile-arm-xscale1-pmu-support-fix.patch
>
> Without the aty128fb and radeonfb updates, current 2.6.11 is a
> regression on pmac as it breaks sleep support on previously working
> laptops.
Is that worse than the risk of the large patch?
> If you don't intend to get at least
> try_to_acquire_console_sem() and aty128fb fix in, in which case i can
> send you a minimal radeonfb patch, then I'll have to make another patch
> for 2.6.11 that reverts some of the arch changes to re-enable sleep on
> those machines.
Ho hum. PM and fbdev are regularly broken anyway. Please always identify
the patches by name - it helps avoid mistakes.
These?
add-try_acquire_console_sem.patch
update-aty128fb-sleep-wakeup-code-for-new-powermac-changes.patch
radeonfb-update.patch
radeonfb-build-fix.patch
On Thu, Feb 10, 2005 at 02:35:08AM -0800, Andrew Morton wrote:
>...
> - Various other stuff. If anyone has a patch in here which they think
> should be in 2.6.11, please let me know. I'm intending to merge the
> following into 2.6.11:
>
> alpha-add-missing-dma_mapping_error.patch
> fix-compat-shmget-overflow.patch
> fix-shmget-for-ppc64-s390-64-sparc64.patch
> binfmt_elf-clearing-bss-may-fail.patch
> qlogic-warning-fixes.patch
> oprofile-exittext-referenced-in-inittext.patch
> force-read-implies-exec-for-all-32bit-processes-in-x86-64.patch
> oprofile-arm-xscale1-pmu-support-fix.patch
>...
As described in the patch description, I'd like to see
mark-the-mcd-cdrom-driver-as-broken.patch in 2.6.11 .
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
> > Without the aty128fb and radeonfb updates, current 2.6.11 is a
> > regression on pmac as it breaks sleep support on previously working
> > laptops.
>
> Is that worse than the risk of the large patch?
Well, it used to work upstream fine for some time now... The large patch
isn't risky imho, at least in the latest version I sent you. The bulk of
the changes are just code to re-initialize new chip that isn't executed
at all on earlier models. The main radeonfb code changes very little. I
haven't had a failure report with the latest patch yet.
> > If you don't intend to get at least
> > try_to_acquire_console_sem() and aty128fb fix in, in which case i can
> > send you a minimal radeonfb patch, then I'll have to make another patch
> > for 2.6.11 that reverts some of the arch changes to re-enable sleep on
> > those machines.
>
> Ho hum. PM and fbdev are regularly broken anyway. Please always identify
> the patches by name - it helps avoid mistakes.
Ahem ... not that badly broken on releases, I've been careful enough
that at least, powerbook sleep worked fine for some time now.
> These?
>
> add-try_acquire_console_sem.patch
> update-aty128fb-sleep-wakeup-code-for-new-powermac-changes.patch
Those 2 first at least yes
> radeonfb-update.patch
> radeonfb-build-fix.patch
And either the above, or I can do a minimal patch on radeonfb just
restoring sleep on earlier models (adding the pmac_feature call to
notify the arch code that we can wakeup the chip) if you don't want to
merge the bigger update.
Ben.
On Thu, Feb 10, 2005 at 02:51:44PM -0600, Jack O'Quin wrote:
> [direct reply bounced, resending via gmail]
>
> Andrew Morton <[email protected]> writes:
>
> > Christoph Hellwig <[email protected]> wrote:
> > >
> > > On Thu, Feb 10, 2005 at 02:35:08AM -0800, Andrew Morton wrote:
> > > >
> > > >
> > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
> > > >
> > > >
> > > > - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
> > > > It seems that nothing else is going to come along and this is completely
> > > > encapsulated.
> > >
> > > Even if we accept a module that grants capabilities to groups this
> > > isn't fine yet because it only supports two specific capabilities
> > > (and even those two in different ways!) instead of adding generic
> > > support to bind capabilities to groups.
> >
> > I'm sure that got discussed somewhere in the 1000 emails which flew past
> > last time. Jack?
>
> [adding cc: for the main discussion participants]
>
> Most people felt that a more general capabilities module would be nice
> to have. But, no one offered any code, or volunteered to work on it.
What happened to the RT rlimit code from Chris?
--
Mathematics is the supreme nostalgia of our time.
* Matt Mackall ([email protected]) wrote:
> What happened to the RT rlimit code from Chris?
I still have it, but I had the impression Ingo didn't like it as a long
term solution/hack (albeit small) to the scheduler. Whereas the rt-lsm
patch is wholly self-contained.
thanks,
-chris
--
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net
On Thu, Feb 10, 2005 at 04:47:27PM -0800, Chris Wright wrote:
> * Matt Mackall ([email protected]) wrote:
> > What happened to the RT rlimit code from Chris?
>
> I still have it, but I had the impression Ingo didn't like it as a long
> term solution/hack (albeit small) to the scheduler. Whereas the rt-lsm
> patch is wholly self-contained.
I think it's important to recognize that we're trying to address an
issue that has a much wider potential audience than pro audio users,
and not very far off - what is high end audio performance today will be
expected desktop performance next year.
So I think it's critical that we find a solution that's appropriate for
_every single box_, because realistically vendors are going to ship
with this "wholly self-contained" feature turned on by default next
year, at which point the "containment" will be nil and whatever warts
it has will be with us forever.
The rlimit stuff is not perfect, but it's a much better fit for the
UNIX model generally, which is a fairly big win. Having it in the
system unconditionally doesn't trigger the gag reflex in quite the
same way as the LSM approach.
--
Mathematics is the supreme nostalgia of our time.
On Thu, 2005-02-10 at 18:09 -0800, Matt Mackall wrote:
> On Thu, Feb 10, 2005 at 04:47:27PM -0800, Chris Wright wrote:
> > * Matt Mackall ([email protected]) wrote:
> > > What happened to the RT rlimit code from Chris?
> >
> > I still have it, but I had the impression Ingo didn't like it as a long
> > term solution/hack (albeit small) to the scheduler. Whereas the rt-lsm
> > patch is wholly self-contained.
>
> I think it's important to recognize that we're trying to address an
> issue that has a much wider potential audience than pro audio users,
> and not very far off - what is high end audio performance today will be
> expected desktop performance next year.
>
> So I think it's critical that we find a solution that's appropriate for
> _every single box_, because realistically vendors are going to ship
> with this "wholly self-contained" feature turned on by default next
> year, at which point the "containment" will be nil and whatever warts
> it has will be with us forever.
>
> The rlimit stuff is not perfect, but it's a much better fit for the
> UNIX model generally, which is a fairly big win. Having it in the
> system unconditionally doesn't trigger the gag reflex in quite the
> same way as the LSM approach.
>
Without considering the userspace aspect, RT rlimits is the best
implementation I have seen. All others either break RT scheduling
semantics, or don't allow any way for root to maintain control of
the system after giving out RT privileges.
Nick Piggin wrote:
> On Thu, 2005-02-10 at 18:09 -0800, Matt Mackall wrote:
>
>>On Thu, Feb 10, 2005 at 04:47:27PM -0800, Chris Wright wrote:
>>
>>>* Matt Mackall ([email protected]) wrote:
>>>
>>>>What happened to the RT rlimit code from Chris?
>>>
>>>I still have it, but I had the impression Ingo didn't like it as a long
>>>term solution/hack (albeit small) to the scheduler. Whereas the rt-lsm
>>>patch is wholly self-contained.
>>
>>I think it's important to recognize that we're trying to address an
>>issue that has a much wider potential audience than pro audio users,
>>and not very far off - what is high end audio performance today will be
>>expected desktop performance next year.
>>
>>So I think it's critical that we find a solution that's appropriate for
>>_every single box_, because realistically vendors are going to ship
>>with this "wholly self-contained" feature turned on by default next
>>year, at which point the "containment" will be nil and whatever warts
>>it has will be with us forever.
>>
>>The rlimit stuff is not perfect, but it's a much better fit for the
>>UNIX model generally, which is a fairly big win. Having it in the
>>system unconditionally doesn't trigger the gag reflex in quite the
>>same way as the LSM approach.
>>
>
>
> Without considering the userspace aspect, RT rlimits is the best
> implementation I have seen. All others either break RT scheduling
> semantics, or don't allow any way for root to maintain control of
> the system after giving out RT privileges.
Personally, I think that the best approach to solving this problem is
from the privileges aspect. The ability to grant privileges to only set
RT policy is just an example of a general need for granting limited
privileges to a program and/or a user. So a solution that involved a
mechanism for granting a specified subset of root privileges to
specified users when running specified programs would have wider
application.
My limited understanding of SELinux (which may be mistaken) is that it
provides a basic framework for this level of privilege control and
perhaps the solution lies there.
Peter
--
Peter Williams [email protected]
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
[ the best solution is .... ]
[ my preferred solution is ... ]
[ it would be better if ... ]
[ this is a kludge and it should be done instead like ... ]
did nobody read what andrew wrote and what JOQ pointed out?
after weeks of debating this, no other conceptual solution emerged
that did not have at least as many problems as the RT LSM module, and
all other proposed solutions were also more invasive of other aspects
of kernel design and operations than RT LSM is.
--p
On Thu, 2005-02-10 at 22:41 -0500, Paul Davis wrote:
> [ the best solution is .... ]
>
> [ my preferred solution is ... ]
>
> [ it would be better if ... ]
>
> [ this is a kludge and it should be done instead like ... ]
>
> did nobody read what andrew wrote and what JOQ pointed out?
>
> after weeks of debating this, no other conceptual solution emerged
> that did not have at least as many problems as the RT LSM module, and
> all other proposed solutions were also more invasive of other aspects
> of kernel design and operations than RT LSM is.
>
Sure, it is quick and easy. Suits some. At least I do prefer
this to altering the semantics of realtime scheduling.
I can't say much about it because I'm not putting my hand up to
do anything. Just mentioning that rlimit would be better if not
for the userspace side of the equation. I think most were already
agreed on that point anyway though.
Nick
Paul Davis wrote:
> [ the best solution is .... ]
>
> [ my preferred solution is ... ]
>
> [ it would be better if ... ]
>
> [ this is a kludge and it should be done instead like ... ]
>
> did nobody read what andrew wrote and what JOQ pointed out?
>
> after weeks of debating this, no other conceptual solution emerged
> that did not have at least as many problems as the RT LSM module, and
> all other proposed solutions were also more invasive of other aspects
> of kernel design and operations than RT LSM is.
As I see it, what I said was in support of RT LSM (or at least the
approach that RT LSM is taking) so why are you attacking me? I'm on
your side :-)
Peter
PS I'm withdrawing the "unprivileged real time" feature from the
spa_no_frills and zaphod schedulers in the PlugSched patch as a result
of the discussions on SCHED_ISO and RT rlimits because the discussion
convinced me that it's the wrong way to go.
--
Peter Williams [email protected]
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
Linux 2.6 (mm tree) Compile Statistics (gcc 3.4.1)
Web page with links to complete details:
http://developer.osdl.org/cherry/compile/
Kernel bzImage bzImage bzImage modules bzImage modules
(defconfig) (allno) (allyes) (allyes) (allmod) (allmod)
--------------- ---------- -------- -------- -------- -------- --------
2.6.11-rc3-mm2 14w/0e 0w/0e 192w/0e 6w/0e 19w/0e 172w/0e
2.6.11-rc3-mm1 13w/10e 0w/7e 196w/12e 6w/0e 18w/12e 177w/0e
2.6.11-rc2-mm2 15w/0e 0w/0e 201w/0e 6w/0e 18w/0e 182w/0e
2.6.11-rc2-mm1 15w/0e 0w/0e 306w/14e 6w/0e 18w/0e 294w/0e
2.6.11-rc1-mm2 21w/0e 0w/0e 316w/9e 6w/0e 22w/0e 294w/0e
2.6.11-rc1-mm1 21w/0e 0w/0e 319w/0e 6w/0e 23w/0e 298w/0e
2.6.10-mm3 21w/0e 0w/0e 320w/0e 6w/0e 23w/0e 299w/0e
2.6.10-mm2 21w/0e 0w/0e 440w/0e 6w/0e 23w/0e 420w/0e
2.6.10-mm1 12w/0e 0w/0e 414w/0e 6w/0e 17w/0e 399w/0e
2.6.10-rc3-mm1 12w/0e 0w/0e 414w/0e 6w/0e 16w/0e 401w/0e
2.6.10-rc2-mm4 15w/0e 1w/7e 421w/0e 6w/0e 16w/0e 408w/0e
2.6.10-rc2-mm3 15w/0e 0w/0e 1255w/12e 66w/0e 16w/0e 1507w/0e
2.6.10-rc2-mm2 15w/0e 0w/0e 1362w/15e 65w/0e 16w/0e 1612w/2e
2.6.10-rc2-mm1 15w/0e 0w/0e 1405w/11e 65w/0e 16w/0e 1652w/0e
2.6.10-rc1-mm5 16w/0e 0w/0e 1587w/0e 65w/0e 20w/0e 1834w/0e
2.6.10-rc1-mm4 16w/0e 0w/0e 1485w/9e 65w/0e 20w/0e 1732w/0e
(Compiles with gcc 3.2.2)
2.6.10-rc1-mm3 7w/31e 0w/9e 496w/141e 4w/0e 4w/50e 693w/83e
2.6.10-rc1-mm2 16w/1e 1w/1e 529w/1e 4w/0e 12w/1e 729w/0e
2.6.10-rc1-mm1 16w/1e 1w/1e 592w/1e 4w/0e 13w/1e 857w/0e
2.6.9-mm1 6w/1e 1w/1e 1761w/15e 65w/0e 9w/0e 2086w/0e
2.6.9-rc4-mm1 5w/0e 0w/0e 1766w/11e 43w/0e 6w/0e 1798w/0e
2.6.9-rc3-mm3 5w/0e 0w/0e 1756w/11e 43w/0e 4w/0e 1786w/0e
2.6.9-rc3-mm2 10w/0e 4w/9e 1754w/14e 43w/0e 4w/0e 1782w/1e
2.6.9-rc3-mm1 10w/0e 4w/10e 1768w/0e 43w/0e 4w/0e 1796w/0e
2.6.9-rc2-mm4 10w/0e 5w/0e 2573w/0e 41w/0e 4w/0e 2600w/0e
2.6.9-rc2-mm3 10w/0e 5w/0e 2400w/0e 41w/0e 4w/0e 2435w/0e
2.6.9-rc2-mm2 10w/0e 5w/0e 2919w/0e 41w/0e 4w/0e 2954w/0e
2.6.9-rc2-mm1 0w/0e 2w/0e 3541w/9e 41w/0e 3w/9e 3567w/0e
2.6.9-rc1-mm4 0w/0e 1w/0e 55w/0e 3w/0e 2w/0e 48w/0e
2.6.9-rc1-mm3 0w/0e 0w/0e 55w/13e 3w/0e 1w/0e 49w/1e
2.6.9-rc1-mm2 0w/0e 0w/0e 53w/11e 3w/0e 1w/0e 47w/0e
2.6.9-rc1-mm1 0w/0e 0w/0e 80w/0e 4w/0e 1w/0e 74w/0e
2.6.8.1-mm4 0w/0e 0w/0e 78w/0e 4w/0e 1w/0e 73w/0e
2.6.8.1-mm3 0w/96e 0w/0e 78w/97e 4w/0e 1w/0e 74w/89e
2.6.8.1-mm2 0w/96e 0w/0e 78w/97e 4w/0e 1w/0e 74w/89e
2.6.8.1-mm1 0w/0e 0w/0e 78w/0e 4w/0e 1w/0e 74w/0e
2.6.8-rc4-mm1 0w/0e 0w/5e 81w/0e 4w/0e 1w/0e 75w/0e
2.6.8-rc3-mm2 1w/7e 0w/5e 82w/8e 4w/0e 2w/8e 75w/0e
2.6.8-rc3-mm1 0w/0e 1w/5e 81w/9e 4w/0e 1w/0e 75w/0e
2.6.8-rc2-mm2 0w/0e 4w/5e 87w/9e 4w/0e 1w/0e 80w/0e
2.6.8-rc2-mm1 0w/0e 0w/0e 83w/9e 3w/0e 1w/0e 81w/0e
2.6.8-rc1-mm1 0w/0e 0w/0e 88w/9e 5w/0e 1w/0e 87w/0e
2.6.7-mm7 0w/0e 0w/0e 89w/9e 5w/0e 1w/0e 84w/0e
2.6.7-mm6 0w/0e 0w/0e 85w/9e 5w/0e 1w/0e 80w/0e
2.6.7-mm5 0w/0e 0w/0e 92w/0e 5w/0e 1w/0e 87w/0e
2.6.7-mm4 0w/0e 0w/0e 94w/0e 5w/0e 1w/0e 89w/0e
2.6.7-mm3 0w/0e 0w/0e 90w/6e 5w/0e 1w/0e 86w/0e
2.6.7-mm2 0w/0e 0w/0e 109w/0e 7w/0e 1w/0e 106w/0e
2.6.7-mm1 0w/0e 5w/0e 108w/0e 5w/0e 1w/0e 104w/0e
2.6.7-rc3-mm2 0w/0e 5w/0e 105w/10e 5w/0e 2w/0e 100w/2e
2.6.7-rc3-mm1 0w/0e 5w/0e 104w/10e 5w/0e 2w/0e 100w/2e
2.6.7-rc2-mm2 0w/0e 5w/0e 109w/10e 5w/0e 2w/0e 105w/2e
2.6.7-rc2-mm1 0w/0e 12w/0e 158w/13e 5w/0e 3w/0e 153w/4e
2.6.7-rc1-mm1 0w/0e 6w/0e 108w/0e 5w/0e 2w/0e 104w/0e
2.6.6-mm5 0w/0e 0w/0e 109w/5e 5w/0e 2w/0e 110w/0e
2.6.6-mm4 0w/0e 0w/0e 112w/9e 5w/0e 2w/5e 106w/1e
2.6.6-mm3 3w/9e 0w/0e 120w/26e 5w/0e 2w/0e 114w/10e
2.6.6-mm2 4w/11e 0w/0e 120w/24e 6w/0e 2w/0e 118w/9e
2.6.6-mm1 1w/0e 0w/0e 118w/25e 6w/0e 2w/0e 114w/10e
2.6.6-rc3-mm2 0w/0e 0w/0e 117w/ 0e 8w/0e 2w/0e 116w/0e
2.6.6-rc3-mm1 0w/0e 0w/0e 120w/10e 8w/0e 2w/0e 152w/2e
2.6.6-rc2-mm2 0w/0e 1w/5e 118w/ 0e 8w/0e 3w/0e 118w/0e
2.6.6-rc2-mm1 0w/0e 0w/0e 115w/ 0e 7w/0e 3w/0e 116w/0e
2.6.6-rc1-mm1 0w/0e 0w/7e 122w/ 0e 7w/0e 4w/0e 122w/0e
2.6.5-mm6 0w/0e 0w/0e 123w/ 0e 7w/0e 4w/0e 124w/0e
2.6.5-mm5 0w/0e 0w/0e 119w/ 0e 7w/0e 4w/0e 120w/0e
2.6.5-mm4 0w/0e 0w/0e 120w/ 0e 7w/0e 4w/0e 121w/0e
2.6.5-mm3 0w/0e 1w/0e 121w/12e 7w/0e 3w/0e 123w/0e
2.6.5-mm2 0w/0e 0w/0e 128w/12e 7w/0e 3w/0e 134w/0e
2.6.5-mm1 0w/0e 5w/0e 122w/ 0e 7w/0e 3w/0e 124w/0e
2.6.5-rc3-mm4 0w/0e 0w/0e 124w/ 0e 8w/0e 4w/0e 126w/0e
2.6.5-rc3-mm3 0w/0e 5w/0e 129w/14e 8w/0e 4w/0e 129w/6e
2.6.5-rc3-mm2 0w/0e 5w/0e 130w/14e 8w/0e 4w/0e 129w/6e
2.6.5-rc3-mm1 0w/0e 5w/0e 129w/ 0e 8w/0e 4w/0e 129w/0e
2.6.5-rc2-mm5 0w/0e 5w/0e 130w/ 0e 8w/0e 4w/0e 129w/0e
2.6.5-rc2-mm4 0w/0e 5w/0e 134w/ 0e 8w/0e 3w/0e 133w/0e
2.6.5-rc2-mm3 0w/0e 5w/0e 134w/ 0e 8w/0e 3w/0e 133w/0e
2.6.5-rc2-mm2 0w/0e 5w/0e 137w/ 0e 8w/0e 3w/0e 134w/0e
2.6.5-rc2-mm1 0w/0e 5w/0e 136w/ 0e 8w/0e 3w/0e 134w/0e
2.6.5-rc1-mm2 0w/0e 5w/0e 135w/ 5e 8w/0e 3w/0e 133w/0e
2.6.5-rc1-mm1 0w/0e 5w/0e 135w/ 5e 8w/0e 3w/0e 133w/0e
2.6.4-mm2 1w/2e 5w/2e 144w/10e 8w/0e 3w/2e 144w/0e
2.6.4-mm1 1w/0e 5w/0e 146w/ 5e 8w/0e 3w/0e 144w/0e
2.6.4-rc2-mm1 1w/0e 5w/0e 146w/12e 11w/0e 3w/0e 147w/2e
2.6.4-rc1-mm2 1w/0e 5w/0e 144w/ 0e 11w/0e 3w/0e 145w/0e
2.6.4-rc1-mm1 1w/0e 5w/0e 147w/ 5e 11w/0e 3w/0e 147w/0e
2.6.3-mm4 1w/0e 5w/0e 146w/ 0e 7w/0e 3w/0e 142w/0e
2.6.3-mm3 1w/2e 5w/2e 146w/15e 7w/0e 3w/2e 144w/5e
2.6.3-mm2 1w/8e 5w/0e 140w/ 0e 7w/0e 3w/0e 138w/0e
2.6.3-mm1 1w/0e 5w/0e 143w/ 5e 7w/0e 3w/0e 141w/0e
2.6.3-rc3-mm1 1w/0e 0w/0e 144w/13e 7w/0e 3w/0e 142w/3e
2.6.3-rc2-mm1 1w/0e 0w/265e 144w/ 5e 7w/0e 3w/0e 145w/0e
2.6.3-rc1-mm1 1w/0e 0w/265e 141w/ 5e 7w/0e 3w/0e 143w/0e
2.6.2-mm1 2w/0e 0w/264e 147w/ 5e 7w/0e 3w/0e 173w/0e
2.6.2-rc3-mm1 2w/0e 0w/265e 146w/ 5e 7w/0e 3w/0e 172w/0e
2.6.2-rc2-mm2 0w/0e 0w/264e 145w/ 5e 7w/0e 3w/0e 171w/0e
2.6.2-rc2-mm1 0w/0e 0w/264e 146w/ 5e 7w/0e 3w/0e 172w/0e
2.6.2-rc1-mm3 0w/0e 0w/265e 144w/ 8e 7w/0e 3w/0e 169w/0e
2.6.2-rc1-mm2 0w/0e 0w/264e 144w/ 5e 10w/0e 3w/0e 171w/0e
2.6.2-rc1-mm1 0w/0e 0w/264e 144w/ 5e 10w/0e 3w/0e 171w/0e
2.6.1-mm5 2w/5e 0w/264e 153w/11e 10w/0e 3w/0e 180w/0e
2.6.1-mm4 0w/821e 0w/264e 154w/ 5e 8w/1e 5w/0e 179w/0e
2.6.1-mm3 0w/0e 0w/0e 151w/ 5e 10w/0e 3w/0e 177w/0e
2.6.1-mm2 0w/0e 0w/0e 143w/ 5e 12w/0e 3w/0e 171w/0e
2.6.1-mm1 0w/0e 0w/0e 146w/ 9e 12w/0e 6w/0e 171w/0e
2.6.1-rc2-mm1 0w/0e 0w/0e 149w/ 0e 12w/0e 6w/0e 171w/4e
2.6.1-rc1-mm2 0w/0e 0w/0e 157w/15e 12w/0e 3w/0e 185w/4e
2.6.1-rc1-mm1 0w/0e 0w/0e 156w/10e 12w/0e 3w/0e 184w/2e
2.6.0-mm2 0w/0e 0w/0e 161w/ 0e 12w/0e 3w/0e 189w/0e
2.6.0-mm1 0w/0e 0w/0e 173w/ 0e 12w/0e 3w/0e 212w/0e
John
Nick Piggin wrote:
> On Thu, 2005-02-10 at 22:41 -0500, Paul Davis wrote:
>
>> [ the best solution is .... ]
>>
>> [ my preferred solution is ... ]
>>
>> [ it would be better if ... ]
>>
>> [ this is a kludge and it should be done instead like ... ]
>>
>>did nobody read what andrew wrote and what JOQ pointed out?
>>
>>after weeks of debating this, no other conceptual solution emerged
>>that did not have at least as many problems as the RT LSM module, and
>>all other proposed solutions were also more invasive of other aspects
>>of kernel design and operations than RT LSM is.
>>
>
>
> Sure, it is quick and easy. Suits some. At least I do prefer
> this to altering the semantics of realtime scheduling.
>
> I can't say much about it because I'm not putting my hand up to
> do anything. Just mentioning that rlimit would be better if not
> for the userspace side of the equation. I think most were already
> agreed on that point anyway though.
I think that the rlimits are a good idea in themselves but not as a
solution to this problem. I.e. having a RT CPU rate rlimit should not
be a sufficient (or necessary for that matter) condition to change
policy to SCHED_OTHER or SCHED_RR but could still be used to limit the
possibility of lock out. (But I guess even that is a violation of RT
semantics?)
Peter
PS Zaphod's per task hard/soft CPU rate caps (which are the equivalent
of an rlimit on CPU usage rate) are only enforced for SCHED_NORMAL tasks
and should not (therefore) affect RT semantics.
--
Peter Williams [email protected]
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
On Fri, 2005-02-11 at 17:34 +1100, Peter Williams wrote:
> Nick Piggin wrote:
> > I can't say much about it because I'm not putting my hand up to
> > do anything. Just mentioning that rlimit would be better if not
> > for the userspace side of the equation. I think most were already
> > agreed on that point anyway though.
>
> I think that the rlimits are a good idea in themselves but not as a
> solution to this problem. I.e. having a RT CPU rate rlimit should not
> be a sufficient (or necessary for that matter) condition to change
> policy to SCHED_OTHER or SCHED_RR but could still be used to limit the
> possibility of lock out.
Ah well that may be a good way to do it indeed. As I said, I
don't know much about privileges etc.
But I just want to be clear that I'm not trying to stop RT-LSM
going in (if only because I don't care one way or the other
about it).
> (But I guess even that is a violation of RT
> semantics?)
>
I'd have to re-read the standard, but it may not be. For example,
a compliant system advertises the minimum and maximum priority
levels available - you may be able to adjust these based on what
the rlimit is set to. On the other hand, yes it may violate the
standards.
Nick
On Thu, Feb 10, 2005 at 10:41:28PM -0500, Paul Davis wrote:
> [ the best solution is .... ]
>
> [ my preferred solution is ... ]
>
> [ it would be better if ... ]
>
> [ this is a kludge and it should be done instead like ... ]
>
> did nobody read what andrew wrote and what JOQ pointed out?
>
> after weeks of debating this, no other conceptual solution emerged
> that did not have at least as many problems as the RT LSM module, and
> all other proposed solutions were also more invasive of other aspects
> of kernel design and operations than RT LSM is.
Eh? Chris Wright's original rlimits patch was very straightforward
(unlike some of the other rlimit-like patches that followed).
I haven't heard the downsides of it yet.
simple rlimits:
logical extension of standard, flexible interface
fine-grained per-process access to nice levels and priorities
managed with standard tools
fairly broad possible applications
clean enough to be added unconditionally
already doing mlock this way!
RT LSM:
new, narrow magic group interface (module parameters!)
boolean granularity of access to all RT levels and maybe mlock
potential interesting interaction with other LSMs
not orthogonal to mlock
not appropriate for every box out there
requires lsm and (sysfs or modprobe)
--
Mathematics is the supreme nostalgia of our time.
* Matt Mackall <[email protected]> wrote:
> Eh? Chris Wright's original rlimits patch was very straightforward
> [...]
the problem is that it didnt solve the problem (unprivileged user can
lock up the system) in any way. So after it became visible that all the
existing 'dont allow users to lock up' solutions are too invasive, we
went to recommend the solution that introduces the least architectural
problems: RT-LSM.
Ingo
* Matt Mackall <[email protected]> wrote:
> > > What happened to the RT rlimit code from Chris?
> >
> > I still have it, but I had the impression Ingo didn't like it as a long
> > term solution/hack (albeit small) to the scheduler. Whereas the rt-lsm
> > patch is wholly self-contained.
>
> I think it's important to recognize that we're trying to address an
> issue that has a much wider potential audience than pro audio users,
> and not very far off - what is high end audio performance today will
> be expected desktop performance next year.
i disagree that desktop performance tomorrow will necessarily have to
utilize SCHED_FIFO. Today's desktop audio applications perform quite
good at SCHED_NORMAL priorities [with the 2.6.11 kernel that has more
interactivity/latency fixes such as PREEMPT_BKL].
I agree (and hope) that tomorrow's "stock" desktop will be based on
today's pro audio architectures, but tomorrow's CPUs will be much faster
and tomorrow's desktop apps dont want to spend 30%+ CPU time on creating
audio.
the pro applications will always want to have a 100% guarantee (it
really sucks to generate a nasty audio click during a live performance)
and want to utilize as much CPU time for audio as needed. They are also
clearly the most complex creators of audio so they go far above the
normal (and reasonable) CPU-use/latency expectations and tradeoffs of
the stock scheduler.
> So I think it's critical that we find solution that's appropriate for
> _every single box_, because realistically vendors are going to ship
> with this "wholly self-contained" feature turned on by default next
> year, at which point the "containment" will be nil and whatever warts
> it has will be with us forever.
an "RT priorities rlimit" is still not adequate as a desktop solution,
because it still allows the box to be locked up. Also, if it turns out
to be a mistake then it's already codified into the ABI, while RT-LSM is
much less 'persistent' and could be replaced much easier. RT-LSM is also
more flexible and more practical. (an rlimit needs changes across a
number of userspace components, delaying its adoption.)
> The rlimit stuff is not perfect, but it's a much better fit for the
> UNIX model generally, which is a fairly big win. [...]
a 'locked up box' is as far away from the UNIX model as it gets.
perhaps, if the need arises, we can add the RT-throttling sysctl (which
still wont give RT priorities to unprivileged users and would serve as a
way to throttle privileged RT tasks), which could thus make the RT-LSM
solution pretty safe. Right now Jack has its own watchdog thread which
should solve most of the lockup situations. Lets not overdesign the
solution, especially when we dont yet know how the problem really looks
like.
or an even simpler solution for the lockup problem would be a
kernel-based RT watchdog. In fact 2.6.11-rc3-mm2 already includes such a
watchdog (written by yours truly):
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/broken-out/detect-soft-lockups.patch
right now softlockup-detect runs at SCHED_FIFO prio 99 and only prints a
warning - but it could easily run at SCHED_FIFO prio 1 [to detect
lockups generated by all RT tasks] and it could actively try to renice
(or kill) tasks that run for too long. So very likely there will be an
easy upstream mechanism for any problem that could arise out of RT-LSM.
Ingo
On Fri, Feb 11, 2005 at 09:14:22AM +0100, Ingo Molnar wrote:
> an "RT priorities rlimit" is still not adequate as a desktop solution,
> because it still allows the box to be locked up. Also, if it turns out
> to be a mistake then it's already codified into the ABI, while RT-LSM is
> much less 'persistent' and could be replaced much easier. RT-LSM is also
> more flexible and more practical. (an rlimit needs changes across a
> number of userspace components, delaying its adoption.)
Putting it into the tree means a guarantee we'll keep it going. It'd
probably be much better if Jack just keeps it separately. Especially as
his lack of even making it generic shows that he's unwilling to invest
work into it that doesn't benefit him personally.
On Fri, Feb 11, 2005 at 08:54:17AM +0100, Ingo Molnar wrote:
>
> * Matt Mackall <[email protected]> wrote:
>
> > Eh? Chris Wright's original rlimits patch was very straightforward
> > [...]
>
> the problem is that it didnt solve the problem (unprivileged user can
> lock up the system) in any way.
There are two separate but related problems:
a) need a way to give non-root access to SCHED_FIFO without other
privileges
b) would like a way to have RT-like capabilities without risk of DoS
The original rlimits patch solves (a), which is the pressing concern.
The existence of a satisfactory solution to related problem (b) has
yet to be demonstrated. And even if a solution for (b) is found that
is satisfactory for, say, high end audio users, it may not necessarily
be sufficient for everyone who might have wanted SCHED_FIFO for
non-root processes. So we still need a solution for (a).
> So after it became visible that all the
> existing 'dont allow users to lock up' solutions are too invasive, we
> went to recommend the solution that introduces the least architectural
> problems: RT-LSM.
RT-LSM introduces architectural problems in the form of bogus API. And
I claim that if RT-LSM becomes part of the mainline kernel, it -will-
become a default feature on the desktop in short order. The fact that
it's implemented as an LSM is meaningless if Redhat and SuSE ship it
on by default.
So the comparison boils down to putting a magic gid in a sysfs
file/module parameter or setting an rlimit with standard tools (PAM,
etc). I'm really boggled that anyone could prefer the former,
especially since we had almost this exact debate over what became the
mlock rlimit!
Here's Chris' patch for reference:
http://groups-beta.google.com/group/linux.kernel/msg/6408569e13ed6e80
--
Mathematics is the supreme nostalgia of our time.
On Fri, Feb 11, 2005 at 09:14:22AM +0100, Ingo Molnar wrote:
>
> > I think it's important to recognize that we're trying to address an
> > issue that has a much wider potential audience than pro audio users,
> > and not very far off - what is high end audio performance today will
> > be expected desktop performance next year.
>
> i disagree that desktop performance tomorrow will necessarily have to
> utilize SCHED_FIFO. Today's desktop audio applications perform quite
> good at SCHED_NORMAL priorities [with the 2.6.11 kernel that has more
> interactivity/latency fixes such as PREEMPT_BKL].
Desktop performance tomorrow will want realtime audio AND video.
Think simultaneous record and playback of multiple high-definition
video streams. There's a demand for this; my company already sells it.
> the pro applications will always want to have a 100% guarantee (it
> really sucks to generate a nasty audio click during a live performance)
> and want to utilize as much CPU time for audio as needed. They are also
> clearly the most complex creators of audio so they go far above the
> normal (and reasonable) CPU-use/latency expectations and tradeoffs of
> the stock scheduler.
The pro will want to do his work on a stock desktop system. More
importantly, the hobbyist will want to do exactly what the pro is
doing on the same system.
> > So I think it's critical that we find solution that's appropriate for
> > _every single box_, because realistically vendors are going to ship
> > with this "wholly self-contained" feature turned on by default next
> > year, at which point the "containment" will be nil and whatever warts
> > it has will be with us forever.
>
> an "RT priorities rlimit" is still not adequate as a desktop solution,
> because it still allows the box to be locked up. Also, if it turns out
> to be a mistake then it's already codified into the ABI, while RT-LSM is
> much less 'persistent' and could be replaced much easier. RT-LSM is also
> more flexible and more practical. (an rlimit needs changes across a
> number of userspace components, delaying its adoption.)
I'm very suspicious about being able to rip out RT-LSM once it's
introduced. See devfs. And I think the adoption barrier thing is a red
herring as well: the current users are by and large compiling their
own RT-tuned kernels.
> > The rlimit stuff is not perfect, but it's a much better fit for the
> > UNIX model generally, which is a fairly big win. [...]
>
> a 'locked up box' is as far away from the UNIX model as it gets.
Rlimits are already the favored tool for dealing with the classic UNIX DoS:
the fork bomb. Turn off process limits, tada, locked up box.
--
Mathematics is the supreme nostalgia of our time.
* Matt Mackall <[email protected]> wrote:
> Here's Chris' patch for reference:
>
> http://groups-beta.google.com/group/linux.kernel/msg/6408569e13ed6e80
how does this patch solve the separation of 'negative nice values' and
'RT priority rlimits'? In one piece of code it handles the rlimit value
as a 0-39 nice value, in another place it handles it as a limit for a
1-100 RT priority range. The two ranges overlap and have nothing to do
with each other. [*]
anyway, as long as it doesnt touch the scheduler runtime code (and it
doesnt), both types of solutions are fine to me - it's basically the
security-subsystem people's call.
if the patch solves the negative-nice-value and the RT-priority issues
at once, then it indeed looks more flexible (and more generic) than the
LSM solution. [**]
Ingo
[*] one acceptable way to 'merge' the two priority ranges would be to
introduce a unified priority range of 0-139: 0-39 would be for nice
values while 40-139 would be for RT priorities 1-99. NOTE: due to
rlimit semantics (users can always lower them without any security
checks), value 39 _must_ denote nice -20 and value 0 must denote
nice +19. I.e. it must be strictly in increasing priority order.
[**] in fact, the 'Gnome problem' wrt. suid/gid binaries would be solved
via the rlimit too.
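
[Editor's sketch, not part of the original message.] The unified range
described in footnote [*] can be illustrated with a small mapping. This is
only a sketch under stated assumptions: the function names are hypothetical,
and because 40-139 holds 100 slots while RT priorities 1-99 need only 99,
the code assumes RT priority 1 maps to 40, which leaves 139 unused.

```python
# Hypothetical illustration of the unified 0-139 rlimit range from footnote [*].
NICE_MIN, NICE_MAX = -20, 19
RT_MIN, RT_MAX = 1, 99
RT_BASE = 40  # assumption: RT priority 1 maps to unified value 40


def nice_to_unified(nice):
    """Map a nice value to 0-39 so that a larger value means higher priority."""
    if not NICE_MIN <= nice <= NICE_MAX:
        raise ValueError("nice value out of range")
    return NICE_MAX - nice  # nice +19 -> 0, nice -20 -> 39


def rtprio_to_unified(rtprio):
    """Map an RT priority (1-99) above the whole nice range."""
    if not RT_MIN <= rtprio <= RT_MAX:
        raise ValueError("RT priority out of range")
    return RT_BASE + rtprio - 1  # RT 1 -> 40, RT 99 -> 138


def allowed(requested_unified, rlimit):
    """A request is permitted when it does not exceed the limit."""
    return requested_unified <= rlimit
```

Because the values increase strictly with priority, a user lowering the
limit can only ever give up priority, which is the property the footnote
insists on.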
On Fri, Feb 11, 2005 at 09:48:43AM +0100, Ingo Molnar wrote:
>
> * Matt Mackall <[email protected]> wrote:
>
> > Here's Chris' patch for reference:
> >
> > http://groups-beta.google.com/group/linux.kernel/msg/6408569e13ed6e80
>
> how does this patch solve the separation of 'negative nice values' and
> 'RT priority rlimits'? In one piece of code it handles the rlimit value
> as a 0-39 nice value, in another place it handles it as a limit for a
> 1-100 RT priority range. The two ranges overlap and have nothing to do
> with each other. [*]
Read more closely: there are two independent limits in the patch,
RLIMIT_NICE and RLIMIT_RTPRIO. This lets us grant elevated nice
without SCHED_FIFO.
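
[Editor's sketch, not part of the original message.] The point about two
independent limits can be modelled as two separate checks. This is not the
code from Chris' patch: the class and function names are hypothetical, and
the nice encoding (a request for nice value n needs a limit of at least
20 - n) is assumed from the RLIMIT_NICE description in getrlimit(2); the
patch under discussion may encode it differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    """Hypothetical per-process limits; 0 grants nothing beyond the default."""
    nice: int = 0    # RLIMIT_NICE-style ceiling (assumed encoding: 20 - nice)
    rtprio: int = 0  # RLIMIT_RTPRIO-style ceiling on the RT priority


def may_lower_nice_to(limits, nice):
    """May an unprivileged task lower its nice value to `nice`?"""
    return 20 - nice <= limits.nice


def may_set_rtprio(limits, prio):
    """May an unprivileged task request RT priority `prio`?"""
    return 1 <= prio <= limits.rtprio
```

With `Limits(nice=30, rtprio=0)` a task can reach nice -10 but gets no RT
priority at all, which is the "elevated nice without SCHED_FIFO" case.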
--
Mathematics is the supreme nostalgia of our time.
* Matt Mackall <[email protected]> wrote:
> > i disagree that desktop performance tomorrow will necessarily have to
> > utilize SCHED_FIFO. Today's desktop audio applications perform quite
> > good at SCHED_NORMAL priorities [with the 2.6.11 kernel that has more
> > interactivity/latency fixes such as PREEMPT_BKL].
>
> Desktop performance tomorrow will want realtime audio AND video.
> Think simultaneous record and playback of multiple high-definition
> video streams. There's a demand for this; my company already sells it.
Tomorrow's hardware will have enough buffering as today's hardware has
for simpler tasks. Repeat after me: it likely _wont_ _need_ SCHED_FIFO.
Running tomorrow's hardware on today's boxes indeed pushes the system to
its limits, but tomorrow's hardware will be well-balanced just as much as
today's is - if nothing else then due to kernel drivers providing a
buffering guarantee.
think of SCHED_FIFO on the desktop as an ugly wart, a hammer, that
destroys the careful balance of priorities of SCHED_OTHER tasks. Yes, it
can be useful if you _need_ a scheduling guarantee due to physical
constraints, and it can be useful if the hardware (or the kernel) cannot
buffer enough, but otherwise, it only causes problems.
> I'm very suspicious about being able to rip out RT-LSM once it's
> introduced. [...]
yeah, i somewhat share that view. (despite all the promises from the
audio folks - if they are just half as aggressive resisting removal as
they were pushing integration then it will never be removed ;-)
but i'm not sure how rlimits will contain the whole problem - can
rlimits be restricted to a single app (jackd)? The most canonical use of
rlimits is per-user (per-group), so the rlimit could end up _widening_
the effects of the hack ...
> > > The rlimit stuff is not perfect, but it's a much better fit for the
> > > UNIX model generally, which is a fairly big win. [...]
> >
> > a 'locked up box' is as far away from the UNIX model as it gets.
>
> Rlimits are already the favored tool for dealing with the classic UNIX
> DoS: the fork bomb. Turn off process limits, tada, locked up box.
the big difference is that process limits are finegrained and it has a
single value (unlimited) that allows the DoS - while the RT-rlimits have
_one_ value that is safe, all the other values are unsafe!
Ingo
* Matt Mackall <[email protected]> wrote:
> Read more closely: there are two independent limits in the patch,
> RLIMIT_NICE and RLIMIT_RTPRIO. This lets us grant elevated nice
> without SCHED_FIFO.
ok, indeed.
Ingo
* Matt Mackall <[email protected]> wrote:
> So the comparison boils down to putting a magic gid in a sysfs
> file/module parameter or setting an rlimit with standard tools (PAM,
> etc). I'm really boggled that anyone could prefer the former,
> especially since we had almost this exact debate over what became the
> mlock rlimit!
the big difference to mlock is that for mlock there _is_ a _limit_. For
RT scheduling the priority is _NOT_ a _limit_. Okay? So you give the
false pretense of this being some kind of resource 'limit', while in
fact allowing SCHED_FIFO prio 1 alone enables unprivileged users to lock
up the system.
so i could agree with RLIMIT_NICE (which _is_ a limit), but
RLIMIT_RTPRIO sends the wrong message. The proper rlimit would be
RLIMIT_RT_CPU (the patch i did).
Ingo
On Fri, Feb 11, 2005 at 10:04:19AM +0100, Ingo Molnar wrote:
>
> * Matt Mackall <[email protected]> wrote:
>
> > So the comparison boils down to putting a magic gid in a sysfs
> > file/module parameter or setting an rlimit with standard tools (PAM,
> > etc). I'm really boggled that anyone could prefer the former,
> > especially since we had almost this exact debate over what became the
> > mlock rlimit!
>
> the big difference to mlock is that for mlock there _is_ a _limit_. For
> RT scheduling the priority is _NOT_ a _limit_. Okay? So you give the
> false pretense of this being some kind of resource 'limit', while in
> fact allowing SCHED_FIFO prio 1 alone enables unprivileged users to lock
> up the system.
>
> so i could agree with RLIMIT_NICE (which _is_ a limit), but
> RLIMIT_RTPRIO sends the wrong message. The proper rlimit would be
> RLIMIT_RT_CPU (the patch i did).
It's not a perfect fit, I'll readily agree.
But consider this: with RLIMIT_RTPRIO, I can restrict a user to the
lowest N RT priorities. Then at N+1, I have an RT watchdog taking care
of runaways, tickled by a SCHED_NORMAL task. So it can still be looked
at as a meaningful limit, just a bit different from the others.
The RT LSM gives full CAP_SYS_NICE out, so there's no way to guarantee
that the watchdog has higher priority.
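
[Editor's sketch, not part of the original message.] The watchdog
arrangement above can be modelled with a toy single-CPU simulation. It
assumes strict priority scheduling, a watchdog that wakes every `period`
ticks, and a SCHED_NORMAL tickler that runs only when no RT task is
spinning; all names and the demotion action are hypothetical.

```python
def simulate(watchdog_prio, hog_prio, hog_spins=True, ticks=20, period=5):
    """Return (events, hog_still_spinning) after `ticks` scheduler ticks."""
    hog_active = hog_spins
    tickled = False
    events = []
    for t in range(1, ticks + 1):
        watchdog_due = (t % period == 0)
        # The watchdog gets the CPU only if nothing of equal or higher
        # RT priority is spinning.
        if watchdog_due and (not hog_active or watchdog_prio > hog_prio):
            if hog_active and not tickled:
                hog_active = False  # runaway detected: demote the hog
                events.append((t, "hog demoted"))
            tickled = False  # start a new observation window
        elif not hog_active:
            tickled = True  # the SCHED_NORMAL tickler gets to run
    return events, hog_active
```

With the user limited to priority 10 and the watchdog at 11, a spinning
task is caught at the first check. If the task can pick priority 99, as
full CAP_SYS_NICE would allow, the watchdog at 11 never runs.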
--
Mathematics is the supreme nostalgia of our time.
On Fri, Feb 11, 2005 at 09:59:42AM +0100, Ingo Molnar wrote:
>
> think of SCHED_FIFO on the desktop as an ugly wart, a hammer, that
> destroys the careful balance of priorities of SCHED_OTHER tasks. Yes, it
> can be useful if you _need_ a scheduling guarantee due to physical
> constraints, and it can be useful if the hardware (or the kernel) cannot
> buffer enough, but otherwise, it only causes problems.
Agreed. I think something short of full SCHED_FIFO will make most
desktop folks happy. But a) we still have to figure out exactly how to
do that and b) we still have to make everyone else happy. The embedded
folks (me included) would prefer to not run our realtime bits as root
too..
> but i'm not sure how rlimits will contain the whole problem - can
> rlimits be restricted to a single app (jackd)?
Yes. There's also the whole soft limit thing.
--
Mathematics is the supreme nostalgia of our time.
* Matt Mackall <[email protected]> wrote:
> On Fri, Feb 11, 2005 at 09:59:42AM +0100, Ingo Molnar wrote:
> >
> > think of SCHED_FIFO on the desktop as an ugly wart, a hammer, that
> > destroys the careful balance of priorities of SCHED_OTHER tasks. Yes, it
> > can be useful if you _need_ a scheduling guarantee due to physical
> > constraints, and it can be useful if the hardware (or the kernel) cannot
> > buffer enough, but otherwise, it only causes problems.
>
> Agreed. I think something short of full SCHED_FIFO will make most
> desktop folks happy. [...]
ah, but it's not the desktop folks who have to be happy but users :-)
Really, if you ask any app designer then obviously 'the more CPU time we
get for sure, the better our app behaves'. So in that sense SCHED_OTHER
is a fair playground: if you behave nicely you'll have higher priority
and shorter latencies.
(there are things like SCHED_ISO but how good of a solution they are is
not yet clear.)
> [...] But a) we still have to figure out exactly how to do that and b)
> we still have to make everyone else happy. The embedded folks (me
> included) would prefer to not run our realtime bits as root too..
you dont have to - you can drop root after startup.
> > but i'm not sure how rlimits will contain the whole problem - can
> > rlimits be restricted to a single app (jackd)?
>
> Yes. There's also the whole soft limit thing.
i'm curious, how does this 'per-app' rlimit thing work? If a user has
jackd installed and runs it from X unprivileged, how does it get the
elevated rlimit? (while the rest of his desktop still runs with a safe
rlimit.) SELinux/RT-LSM could do this, but i'm not sure about how
rlimits give this to you.
Ingo
In fs/Kconfig,
"See Documentation/filesystems/fscache.txt for more information." and
"See Documentation/filesystems/cachefs.txt for more information."
Should be changed to:
"See Documentation/filesystems/caching/fscache.txt for more
information." and "See Documentation/filesystems/caching/cachefs.txt for
more information."
Thanks,
Yuval
Andrew Morton wrote:
>cachefs-filesystem.patch
> CacheFS filesystem
>
>
>introduced. See devfs. And I think the adoption barrier thing is a red
>herring as well: the current users are by and large compiling their
>own RT-tuned kernels.
not true. most people are using kernels built for specialized distros
or addons, such as CCRMA, Demudi, Ubuntu, or dyne:bolic.
--p
On Fri, Feb 11, 2005 at 10:53:27AM +0100, Ingo Molnar wrote:
>
> * Matt Mackall <[email protected]> wrote:
>
> > On Fri, Feb 11, 2005 at 09:59:42AM +0100, Ingo Molnar wrote:
> > >
> > > think of SCHED_FIFO on the desktop as an ugly wart, a hammer, that
> > > destroys the careful balance of priorities of SCHED_OTHER tasks. Yes, it
> > > can be useful if you _need_ a scheduling guarantee due to physical
> > > constraints, and it can be useful if the hardware (or the kernel) cannot
> > > buffer enough, but otherwise, it only causes problems.
> >
> > Agreed. I think something short of full SCHED_FIFO will make most
> > desktop folks happy. [...]
>
> ah, but it's not the desktop folks who have to be happy but users :-)
> Really, if you ask any app designer then obviously 'the more CPU time we
> get for sure, the better our app behaves'. So in that sense SCHED_OTHER
> is a fair playground: if you behave nicely you'll have higher priority
> and shorter latencies.
>
> (there are things like SCHED_ISO but how good of a solution they are is
> not yet clear.)
>
> > [...] But a) we still have to figure out exactly how to do that and b)
> > we still have to make everyone else happy. The embedded folks (me
> > included) would prefer to not run our realtime bits as root too..
>
> you dont have to - you can drop root after startup.
>
> > > but i'm not sure how rlimits will contain the whole problem - can
> > > rlimits be restricted to a single app (jackd)?
> >
> > Yes. There's also the whole soft limit thing.
>
> i'm curious, how does this 'per-app' rlimit thing work? If a user has
> jackd installed and runs it from X unprivileged, how does it get the
> elevated rlimit?
It needs a setuid launcher. It would be nice to be able to elevate the
rlimits of running processes but the API doesn't exist yet.
From the POV of accidental elevation to RT, soft limits are
sufficient. But we can't stop a user from exploiting an app they own
with RT privileges from elevating other apps via ptrace+exec or
whatever. Nor with RT-LSM.
> (while the rest of his desktop still runs with a safe
> rlimit.) SELinux/RT-LSM could do this, but i'm not sure about how
> rlimits give this to you.
How does it get done with RT-LSM? Setgid binaries? It only
discriminates on a group granularity. Or are you saying "and SELinux"
rather than "or SELinux"?
--
Mathematics is the supreme nostalgia of our time.
>RT-LSM introduces architectural problems in the form of bogus API. And
that may be true of LSM, but not RT-LSM in particular. RT-LSM doesn't
introduce *any* API whatsoever - it simply allows software to call
various existing APIs (mostly from POSIX) and have them not fail as
result of not being root and/or not running on a capabilities-enabled
kernel without the required caps.
No audio apps "use" RT-LSM in any way - it just lets them do things
they otherwise could not do. And all the alternatives to RT-LSM have
this feature as well - controlling rlimits won't be done by the audio
apps, but by some part of the security infrastructure.
>it's implemented as an LSM is meaningless if Redhat and SuSE ship it
>on by default.
We haven't encouraged anyone to ship anything with it on by default:
the idea is for the module to be present and usable, not turned on.
--p
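To illustrate the point that no new application API is involved, here is a minimal sketch, using only existing POSIX calls, of what an audio application attempts at startup. Without root, the relevant capabilities (CAP_SYS_NICE, CAP_IPC_LOCK), or a mechanism such as RT-LSM, these calls are expected to fail, typically with EPERM; with such a mechanism the very same code simply succeeds. The names `probe_rt_calls` and `rt_verdict` are hypothetical and exist only for this illustration.

```c
#define _GNU_SOURCE             /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>

/* errno from each attempt; 0 means the call was permitted. */
struct rt_probe {
    int sched_errno;
    int mlock_errno;
};

/*
 * Ask for SCHED_FIFO at the lowest real-time priority and for locked
 * memory, record what happened, then undo whatever succeeded.
 */
struct rt_probe probe_rt_calls(void)
{
    struct rt_probe r = { 0, 0 };
    struct sched_param sp, old;
    int old_policy = sched_getscheduler(0);

    memset(&sp, 0, sizeof sp);
    memset(&old, 0, sizeof old);
    sched_getparam(0, &old);

    sp.sched_priority = sched_get_priority_min(SCHED_FIFO);
    if (sp.sched_priority < 0 ||
        sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        r.sched_errno = errno;
    } else if (old_policy >= 0) {
        /* put the previous policy back so the probe has no lasting effect */
        sched_setscheduler(0, old_policy, &old);
    }

    if (mlockall(MCL_CURRENT) != 0)
        r.mlock_errno = errno;
    else
        munlockall();           /* note: also drops any locks held before the probe */

    return r;
}

/* Human-readable summary of one probe result. */
const char *rt_verdict(int err)
{
    assert(err >= 0);
    if (err == 0)
        return "permitted";
    if (err == EPERM)
        return "denied (EPERM)";
    return strerror(err);
}
```

The application code is identical whichever mechanism grants the privilege; only the outcome of the two calls changes.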
* Matt Mackall <[email protected]> wrote:
> > > Yes. There's also the whole soft limit thing.
> >
> > i'm curious, how does this 'per-app' rlimit thing work? If a user has
> > jackd installed and runs it from X unprivileged, how does it get the
> > elevated rlimit?
>
> It needs a setuid launcher. It would be nice to be able to elevate the
> rlimits of running processes but the API doesn't exist yet.
With a setuid launcher you need _zero_ kernel help to get SCHED_FIFO: if
you have a launcher then already today it can just give SCHED_FIFO to
jackd and be done with it!
Ingo
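A minimal sketch of the launcher pattern described above, assuming a small setuid-root helper binary. The names `rt_launch` and `drop_privileges` are hypothetical; only standard POSIX calls are used. The scheduling policy is set while the helper is still privileged, root is then given up for good, and the policy is preserved across the exec:

```c
#define _GNU_SOURCE             /* expose POSIX declarations under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Permanently return to the invoking user's real gid and uid. */
int drop_privileges(void)
{
    uid_t ruid = getuid();
    gid_t rgid = getgid();

    if (setgid(rgid) != 0)      /* group first, while still privileged */
        return -errno;
    if (setuid(ruid) != 0)
        return -errno;
    if (ruid != 0 && setuid(0) == 0)
        return -EPERM;          /* root came back: the drop did not stick */
    return 0;
}

/*
 * Give the current process SCHED_FIFO at priority 'prio', drop root,
 * then exec argv[0]. Returns a negative errno on failure and does not
 * return on success.
 */
int rt_launch(int prio, char *const argv[])
{
    struct sched_param sp;
    int rc;

    assert(argv != NULL && argv[0] != NULL);

    memset(&sp, 0, sizeof sp);
    sp.sched_priority = prio;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -errno;          /* needs root or equivalent privilege */

    rc = drop_privileges();
    if (rc != 0)
        return rc;

    execvp(argv[0], argv);
    return -errno;              /* only reached if the exec failed */
}
```

A helper's main() would call something like rt_launch(10, argv + 1), where the priority 10 is an arbitrary choice for illustration.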
On Fri, Feb 11, 2005 at 12:49:04PM -0500, Paul Davis wrote:
> >RT-LSM introduces architectural problems in the form of bogus API. And
>
> that may be true of LSM, but not RT-LSM in particular. RT-LSM doesn't
> introduce *any* API whatsoever - it simply allows software to call
> various existing APIs (mostly from POSIX) and have them not fail as
> result of not being root and/or not running on a capabilities-enabled
> kernel without the required caps.
The API is the parameters to modprobe or sysfs.
> >it's implemented as an LSM is meaningless if Redhat and SuSE ship it
> >on by default.
>
> We haven't encouraged anyone to ship anything with it on by default:
> the idea is for the module to be present and usable, not turned on.
On as in turned on for build in the kernel config and shipped. But I
expect people will eventually actually ship it _on_ with a group
called 'rt' and possibly even put the primary user in there on install
unless you start slapping some big fat warnings on it. (I just noticed
the new Debian installer is putting the primary user in audio, cdrom,
video, etc.)
--
Mathematics is the supreme nostalgia of our time.
On Fri, 2005-02-11 at 11:42 -0800, Matt Mackall wrote:
> On Fri, Feb 11, 2005 at 12:49:04PM -0500, Paul Davis wrote:
> > >RT-LSM introduces architectural problems in the form of bogus API. And
> >
> > that may be true of LSM, but not RT-LSM in particular. RT-LSM doesn't
> > introduce *any* API whatsoever - it simply allows software to call
> > various existing APIs (mostly from POSIX) and have them not fail as
> > result of not being root and/or not running on a capabilities-enabled
> > kernel without the required caps.
>
> The API is the parameters to modprobe or sysfs.
>
I think you are talking about the API for root to administer it vs. the
(lack of) API for apps to use the RT capabilities. I think Paul's point
is that we can transparently replace it with something better (IMO the
RT rlimit is better) in the future, and the apps don't have to know
about it at all. Comparing it to devfs/udev is bogus because those are
way, way more complicated.
> > >it's implemented as an LSM is meaningless if Redhat and SuSE ship it
> > >on by default.
> >
> > We haven't encouraged anyone to ship anything with it on by default:
> > the idea is for the module to be present and usable, not turned on.
>
> On as in turned on for build in the kernel config and shipped. But I
> expect people will eventually actually ship it _on_ with a group
> called 'rt' and possibly even put the primary user in there on install
> unless you start slapping some big fat warnings on it. (I just noticed
> the new Debian installer is putting the primary user in audio, cdrom,
> video, etc.)
>
Sorry, if the distros are so dumb they need a big fat warning to know
that this is not a safe thing to enable by default, at least on anything
you would ever consider a multiuser system, then they get what they
deserve. If they have half a brain they will use the setgid approach
that Ingo suggested, and only enable this for apps like JACK and
cdrecord that have been fairly well audited and can be trusted to use
this feature (for example JACK has the internal watchdog to keep a bad
client from locking the system). Really it only makes sense for a
distro to enable this if the user selects the "low latency desktop" or
"multimedia desktop" or whatever install option and makes clear that
this profile is NOT suitable for a multiuser system.
Lee
On Fri, Feb 11, 2005 at 06:49:05PM +0100, Ingo Molnar wrote:
>
> * Matt Mackall <[email protected]> wrote:
>
> > > > Yes. There's also the whole soft limit thing.
> > >
> > > i'm curious, how does this 'per-app' rlimit thing work? If a user has
> > > jackd installed and runs it from X unprivileged, how does it get the
> > > elevated rlimit?
> >
> > It needs a setuid launcher. It would be nice to be able to elevate the
> > rlimits of running processes but the API doesn't exist yet.
>
> With a setuid launcher you need _zero_ kernel help to get SCHED_FIFO: if
> you have a launcher then already today it can just give SCHED_FIFO to
> jackd and be done with it!
I'm sure you know all this already but I'll spell it out so we're all
clear:
a) rlimits are tracked per-process so they're fundamentally
per-process
b) there are hard and soft limits, with soft always <= hard
c) only root can raise hard rlimits, but normal users can lower them
d) if a user owns a process, he can gain the privileges of that process
by various means, so in the strict sense per-process privileges are
meaningless - all privileges are per-uid
e) so we either need to segregate all privileged processes into
separate uid domains
f) or we're assuming non-malicious users and soft limits are
sufficient.
Now I suspect we don't want to insist people do (e) (though I'd
certainly encourage them to try).
Don't forget that the rlimits approach allows us to reserve the
highest priorities for root. I'm pretty sure an effective watchdog
policy can thus be implemented in userspace, which RT-LSM can't really
offer.
--
Mathematics is the supreme nostalgia of our time.
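Points (b) and (c) above can be sketched with the existing getrlimit()/setrlimit() calls. This is only an illustration under stated assumptions: `clamp_rlimit` and `soft_above_hard_errno` are hypothetical helper names, and the resource is a parameter because the RT-priority rlimit discussed in this thread is not assumed to exist; RLIMIT_CORE is a harmless one to experiment with.

```c
#define _GNU_SOURCE             /* expose setrlimit() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>

/*
 * Lower both limits of 'resource' to at most 'cap'. Any user may do
 * this; for an unprivileged process, lowering the hard limit cannot
 * be undone afterwards.
 */
int clamp_rlimit(int resource, rlim_t cap)
{
    struct rlimit rl;

    assert(cap != RLIM_INFINITY);
    if (getrlimit(resource, &rl) != 0)
        return -1;
    if (rl.rlim_max == RLIM_INFINITY || rl.rlim_max > cap)
        rl.rlim_max = cap;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > rl.rlim_max)
        rl.rlim_cur = rl.rlim_max;      /* keep soft <= hard */
    return setrlimit(resource, &rl);
}

/*
 * Try to set the soft limit one above the hard limit.
 * Returns 0 if the kernel accepted it, otherwise the errno.
 */
int soft_above_hard_errno(int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return errno;
    if (rl.rlim_max == RLIM_INFINITY)
        return 0;                       /* no finite hard limit to exceed */
    rl.rlim_cur = rl.rlim_max + 1;
    return setrlimit(resource, &rl) == 0 ? 0 : errno;
}
```

A launcher could call clamp_rlimit() before exec so that the target starts with a low soft limit it can raise again only up to the hard limit.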
Hi,
Yuval Tanny wrote:
> Andrew Morton wrote:
>>cachefs-filesystem.patch
>> CacheFS filesystem
> ...
since you mention cachefs: do you know the status of its NFS support?
Or is the project as dead as the mailing list?
Is there an all-in-one patch against the vanilla sources,
ideally including NFS support?
Thanks in advance,
Henning Rohde
Christoph Hellwig <[email protected]> writes:
> On Thu, Feb 10, 2005 at 02:35:08AM -0800, Andrew Morton wrote:
>>
>> - Added the mlock and !SCHED_OTHER Linux Security Module for the audio guys.
>> It seems that nothing else is going to come along and this is completely
>> encapsulated.
>
> Even if we accept a module that grants capabilities to groups this isn't fine
> yet because it only supports two specific capabilities (and even those two in
> different ways!) instead of adding generic support to bind capabilities to
> groups.
Unless I misunderstood the code, this has been available for
quite some time: <http://www.olafdietsche.de/linux/accessfs/>
or a newer, self-contained version <http://lkml.org/lkml/2005/1/11/221>
Or you could use a real solution - filesystem capabilities:
<http://www.olafdietsche.de/linux/capability/> and if you don't like
this one :-), there's also an alternative existing here:
<http://www.stanford.edu/~luto/linux-fscap/>
Regards, Olaf.
Ingo Molnar wrote:
> the pro applications will always want to have a 100% guarantee (it
> really sucks to generate a nasty audio click during a live performance)
... and the "generic kernels" distributions use will follow just
as swiftly, as soon as the feature appears stable enough. It even
makes sense: no need to switch kernels if "pro audio" applications
(or whatever else may end up wanting this) are added to the mix,
and fewer configurations to test.
You can run, but you cannot hide :-)
- Werner
--
_________________________________________________________________________
/ Werner Almesberger, Buenos Aires, Argentina [email protected] /
/_http://www.almesberger.net/____________________________________________/
On Thursday 10 February 2005 at 11:35, Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc3/2.6.11-rc3-mm2/
I was trying to use the skge module for my Intel 3c940 card, in place of the
(working) sk98lin.
It gives the following:
Feb 14 14:16:35 nbsteu kernel: kobject_register failed for skge (-17)
Feb 14 14:16:35 nbsteu kernel: [kobject_register+81/96] kobject_register+0x51/0x60
Feb 14 14:16:35 nbsteu kernel: [bus_add_driver+82/192] bus_add_driver+0x52/0xc0
Feb 14 14:16:35 nbsteu kernel: [driver_register+40/48] driver_register+0x28/0x30
Feb 14 14:16:35 nbsteu kernel: [pci_register_driver+94/128] pci_register_driver+0x5e/0x80
Feb 14 14:16:35 nbsteu kernel: [sys_init_module+313/480] sys_init_module+0x139/0x1e0
Feb 14 14:16:35 nbsteu kernel: [sysenter_past_esp+82/117] sysenter_past_esp+0x52/0x75
Attached, .config and lspci -v
--
Stefano Rivoir
On Thu, 2005-02-10 at 13:47 -0500, Robert Love wrote:
> Attached, find a patch against 2.6.11-rc3-mm2 of the latest inotify.
Updated patch, fixes a bug.
Robert Love
inotify, bitches
Signed-off-by: Robert Love <[email protected]>
arch/sparc64/Kconfig | 13
drivers/char/Kconfig | 13
drivers/char/Makefile | 2
drivers/char/inotify.c | 1053 +++++++++++++++++++++++++++++++++++++++++++++
drivers/char/misc.c | 14
fs/attr.c | 34 -
fs/compat.c | 14
fs/file_table.c | 4
fs/inode.c | 3
fs/namei.c | 38 -
fs/open.c | 9
fs/read_write.c | 28 -
fs/super.c | 3
include/linux/fs.h | 7
include/linux/fsnotify.h | 235 ++++++++++
include/linux/inotify.h | 118 +++++
include/linux/miscdevice.h | 5
include/linux/sched.h | 2
kernel/user.c | 2
19 files changed, 1522 insertions(+), 75 deletions(-)
diff -urN linux-2.6.10/arch/sparc64/Kconfig linux/arch/sparc64/Kconfig
--- linux-2.6.10/arch/sparc64/Kconfig 2004-12-24 16:35:25.000000000 -0500
+++ linux/arch/sparc64/Kconfig 2005-02-01 12:24:26.000000000 -0500
@@ -88,6 +88,19 @@
bool
default y
+config INOTIFY
+ bool "Inotify file change notification support"
+ default y
+ ---help---
+ Say Y here to enable inotify support and the /dev/inotify character
+ device. Inotify is a file change notification system and a
+ replacement for dnotify. Inotify fixes numerous shortcomings in
+ dnotify and introduces several new features. It allows monitoring
+ of both files and directories via a single open fd. Multiple file
+ events are supported.
+
+ If unsure, say Y.
+
config SMP
bool "Symmetric multi-processing support"
---help---
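The filename padding done by kernel_event() in the inotify.c diff that follows rounds each name (including its NUL) up to the next multiple of sizeof(struct inotify_event). A user-space sketch of the same arithmetic is given here; `padded_name_len` is a hypothetical name, and the event size is passed in as a parameter so that no particular structure layout is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bytes reserved for 'name' including its terminating NUL, rounded up
 * to the next multiple of 'event_size'.
 */
size_t padded_name_len(const char *name, size_t event_size)
{
    size_t len;

    assert(name != NULL && event_size > 0);
    len = strlen(name) + 1;     /* include the terminating NUL */
    return (len + event_size - 1) / event_size * event_size;
}
```

With a 16-byte event, a 15-character name fits exactly in 16 bytes, while a 16-character name needs 32. The single round-up expression yields the same values as the branchy rem/len computation in the patch.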
diff -urN linux-2.6.10/drivers/char/inotify.c linux/drivers/char/inotify.c
--- linux-2.6.10/drivers/char/inotify.c 1969-12-31 19:00:00.000000000 -0500
+++ linux/drivers/char/inotify.c 2005-02-09 16:05:07.959265648 -0500
@@ -0,0 +1,1053 @@
+/*
+ * drivers/char/inotify.c - inode-based file event notifications
+ *
+ * Authors:
+ * John McCutchan <[email protected]>
+ * Robert Love <[email protected]>
+ *
+ * Copyright (C) 2005 John McCutchan
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/spinlock.h>
+#include <linux/idr.h>
+#include <linux/slab.h>
+#include <linux/fs.h>
+#include <linux/namei.h>
+#include <linux/poll.h>
+#include <linux/device.h>
+#include <linux/miscdevice.h>
+#include <linux/init.h>
+#include <linux/list.h>
+#include <linux/writeback.h>
+#include <linux/inotify.h>
+
+#include <asm/ioctls.h>
+
+static atomic_t inotify_cookie;
+static kmem_cache_t *watch_cachep;
+static kmem_cache_t *event_cachep;
+static kmem_cache_t *inode_data_cachep;
+
+static int sysfs_attrib_max_user_devices;
+static int sysfs_attrib_max_user_watches;
+static unsigned int sysfs_attrib_max_queued_events;
+
+/*
+ * struct inotify_device - represents an open instance of an inotify device
+ *
+ * For each inotify device, we need to keep track of the events queued on it,
+ * a list of the inodes that we are watching, and so on.
+ *
+ * This structure is protected by 'lock'. Lock ordering:
+ *
+ * dev->lock (protects dev)
+ * inode_lock (used to safely walk inode_in_use list)
+ * inode->i_lock (only needed for getting ref on inode_data)
+ */
+struct inotify_device {
+ wait_queue_head_t wait;
+ struct idr idr;
+ struct list_head events;
+ struct list_head watches;
+ spinlock_t lock;
+ unsigned int queue_size;
+ unsigned int event_count;
+ unsigned int max_events;
+ struct user_struct *user;
+};
+
+struct inotify_watch {
+ s32 wd; /* watch descriptor */
+ u32 mask; /* event mask for this watch */
+ struct inode *inode; /* associated inode */
+ struct inotify_device *dev; /* associated device */
+ struct list_head d_list; /* entry in device's list */
+ struct list_head i_list; /* entry in inotify_data's list */
+};
+
+/*
+ * A list of these is attached to each instance of the driver. In read(), this
+ * list is walked and all events that can fit in the buffer are returned.
+ */
+struct inotify_kernel_event {
+ struct inotify_event event;
+ struct list_head list;
+ char *filename;
+};
+
+static ssize_t show_max_queued_events(struct class_device *class, char *buf)
+{
+ return sprintf(buf, "%u\n", sysfs_attrib_max_queued_events);
+}
+
+static ssize_t store_max_queued_events(struct class_device *class,
+ const char *buf, size_t count)
+{
+ unsigned int max;
+
+ if (sscanf(buf, "%u", &max) > 0 && max > 0) {
+ sysfs_attrib_max_queued_events = max;
+ return strlen(buf);
+ }
+ return -EINVAL;
+}
+
+static ssize_t show_max_user_devices(struct class_device *class, char *buf)
+{
+ return sprintf(buf, "%d\n", sysfs_attrib_max_user_devices);
+}
+
+static ssize_t store_max_user_devices(struct class_device *class,
+ const char *buf, size_t count)
+{
+ int max;
+
+ if (sscanf(buf, "%d", &max) > 0 && max > 0) {
+ sysfs_attrib_max_user_devices = max;
+ return strlen(buf);
+ }
+ return -EINVAL;
+}
+
+static ssize_t show_max_user_watches(struct class_device *class, char *buf)
+{
+ return sprintf(buf, "%d\n", sysfs_attrib_max_user_watches);
+}
+
+static ssize_t store_max_user_watches(struct class_device *class,
+ const char *buf, size_t count)
+{
+ int max;
+
+ if (sscanf(buf, "%d", &max) > 0 && max > 0) {
+ sysfs_attrib_max_user_watches = max;
+ return strlen(buf);
+ }
+ return -EINVAL;
+}
+
+static CLASS_DEVICE_ATTR(max_queued_events, S_IRUGO | S_IWUSR,
+ show_max_queued_events, store_max_queued_events);
+static CLASS_DEVICE_ATTR(max_user_devices, S_IRUGO | S_IWUSR,
+ show_max_user_devices, store_max_user_devices);
+static CLASS_DEVICE_ATTR(max_user_watches, S_IRUGO | S_IWUSR,
+ show_max_user_watches, store_max_user_watches);
+
+static inline void __get_inode_data(struct inotify_inode_data *data)
+{
+ atomic_inc(&data->count);
+}
+
+/*
+ * get_inode_data - pin an inotify_inode_data structure. Returns the structure
+ * if successful and NULL on failure, which can only occur if inotify_data is
+ * not yet allocated. The inode must be pinned prior to invocation.
+ */
+static inline struct inotify_inode_data * get_inode_data(struct inode *inode)
+{
+ struct inotify_inode_data *data;
+
+ spin_lock(&inode->i_lock);
+ data = inode->inotify_data;
+ if (data)
+ __get_inode_data(data);
+ spin_unlock(&inode->i_lock);
+
+ return data;
+}
+
+/*
+ * put_inode_data - drop our reference on an inotify_inode_data and the
+ * inode structure in which it lives. If the reference count on inotify_data
+ * reaches zero, free it.
+ */
+static inline void put_inode_data(struct inode *inode)
+{
+ //spin_lock(&inode->i_lock);
+ if (atomic_dec_and_test(&inode->inotify_data->count)) {
+ kmem_cache_free(inode_data_cachep, inode->inotify_data);
+ inode->inotify_data = NULL;
+ }
+ //spin_unlock(&inode->i_lock);
+}
+
+/*
+ * find_inode - resolve a user-given path to a specific inode and return a nd
+ */
+static int find_inode(const char __user *dirname, struct nameidata *nd)
+{
+ int error;
+
+ error = __user_walk(dirname, LOOKUP_FOLLOW, nd);
+ if (error)
+ return error;
+
+ /* you can only watch an inode if you have read permissions on it */
+ return permission(nd->dentry->d_inode, MAY_READ, NULL);
+}
+
+static struct inotify_kernel_event * kernel_event(s32 wd, u32 mask, u32 cookie,
+ const char *filename)
+{
+ struct inotify_kernel_event *kevent;
+
+ kevent = kmem_cache_alloc(event_cachep, GFP_ATOMIC);
+ if (!kevent)
+ return NULL;
+
+ /* we hand this out to user-space, so zero it just in case */
+ memset(&kevent->event, 0, sizeof(struct inotify_event));
+
+ kevent->event.wd = wd;
+ kevent->event.mask = mask;
+ kevent->event.cookie = cookie;
+ INIT_LIST_HEAD(&kevent->list);
+
+ if (filename) {
+ size_t len, rem, event_size = sizeof(struct inotify_event);
+
+ /*
+ * We need to pad the filename so as to properly align an
+ * array of inotify_event structures. Because the structure is
+ * small and the common case is a small filename, we just round
+ * up to the next multiple of the structure's sizeof. This is
+ * simple and safe for all architectures.
+ */
+ len = strlen(filename) + 1;
+ rem = event_size - len;
+ if (len > event_size) {
+ rem = event_size - (len % event_size);
+ if (len % event_size == 0)
+ rem = 0;
+ }
+ len += rem;
+
+ kevent->filename = kmalloc(len, GFP_ATOMIC);
+ if (!kevent->filename) {
+ kmem_cache_free(event_cachep, kevent);
+ return NULL;
+ }
+ memset(kevent->filename, 0, len);
+ strncpy(kevent->filename, filename, strlen(filename));
+ kevent->event.len = len;
+ } else {
+ kevent->event.len = 0;
+ kevent->filename = NULL;
+ }
+
+ return kevent;
+}
+
+#define list_to_inotify_kernel_event(pos) \
+ list_entry((pos), struct inotify_kernel_event, list)
+
+#define inotify_dev_get_event(dev) \
+ (list_to_inotify_kernel_event(dev->events.next))
+
+/*
+ * inotify_dev_queue_event - add a new event to the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static void inotify_dev_queue_event(struct inotify_device *dev,
+ struct inotify_watch *watch, u32 mask,
+ u32 cookie, const char *filename)
+{
+ struct inotify_kernel_event *kevent, *last;
+
+ /* drop this event if it is a dupe of the previous (most recently queued) one */
+ last = list_to_inotify_kernel_event(dev->events.prev);
+ if (dev->event_count && last->event.mask == mask &&
+ last->event.wd == watch->wd) {
+ const char *lastname = last->filename;
+
+ if (!filename && !lastname)
+ return;
+ if (filename && lastname && !strcmp(lastname, filename))
+ return;
+ }
+
+ /*
+ * the queue has already overflowed and we have already sent the
+ * Q_OVERFLOW event
+ */
+ if (dev->event_count > dev->max_events)
+ return;
+
+ /* the queue has just overflowed and we need to notify user space */
+ if (dev->event_count == dev->max_events) {
+ kevent = kernel_event(-1, IN_Q_OVERFLOW, cookie, NULL);
+ goto add_event_to_queue;
+ }
+
+ kevent = kernel_event(watch->wd, mask, cookie, filename);
+
+add_event_to_queue:
+ if (!kevent)
+ return;
+
+ /* queue the event and wake up anyone waiting */
+ dev->event_count++;
+ dev->queue_size += sizeof(struct inotify_event) + kevent->event.len;
+ list_add_tail(&kevent->list, &dev->events);
+ wake_up_interruptible(&dev->wait);
+}
+
+static inline int inotify_dev_has_events(struct inotify_device *dev)
+{
+ return !list_empty(&dev->events);
+}
+
+/*
+ * inotify_dev_event_dequeue - destroy an event on the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static void inotify_dev_event_dequeue(struct inotify_device *dev)
+{
+ struct inotify_kernel_event *kevent;
+
+ if (!inotify_dev_has_events(dev))
+ return;
+
+ kevent = inotify_dev_get_event(dev);
+ list_del_init(&kevent->list);
+ if (kevent->filename)
+ kfree(kevent->filename);
+
+ dev->event_count--;
+ dev->queue_size -= sizeof(struct inotify_event) + kevent->event.len;
+
+ kmem_cache_free(event_cachep, kevent);
+}
+
+/*
+ * inotify_dev_get_wd - returns the next WD for use by the given dev
+ *
+ * This function can sleep.
+ */
+static int inotify_dev_get_wd(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ int ret;
+
+ if (atomic_read(&dev->user->inotify_watches) >=
+ sysfs_attrib_max_user_watches)
+ return -ENOSPC;
+
+repeat:
+ if (!idr_pre_get(&dev->idr, GFP_KERNEL))
+ return -ENOSPC;
+ spin_lock(&dev->lock);
+ ret = idr_get_new(&dev->idr, watch, &watch->wd);
+ spin_unlock(&dev->lock);
+ if (ret == -EAGAIN) /* more memory is required, try again */
+ goto repeat;
+ else if (ret) /* the idr is full! */
+ return -ENOSPC;
+
+ atomic_inc(&dev->user->inotify_watches);
+
+ return 0;
+}
+
+/*
+ * inotify_dev_put_wd - release the given WD on the given device
+ *
+ * Caller must hold dev->lock.
+ */
+static int inotify_dev_put_wd(struct inotify_device *dev, s32 wd)
+{
+ if (!dev || wd < 0)
+ return -1;
+
+ atomic_dec(&dev->user->inotify_watches);
+ idr_remove(&dev->idr, wd);
+
+ return 0;
+}
+
+/*
+ * create_watch - creates a watch on the given device.
+ *
+ * Grabs dev->lock, so the caller must not hold it.
+ */
+static struct inotify_watch *create_watch(struct inotify_device *dev,
+ u32 mask, struct inode *inode)
+{
+ struct inotify_watch *watch;
+
+ watch = kmem_cache_alloc(watch_cachep, GFP_KERNEL);
+ if (!watch)
+ return NULL;
+
+ watch->mask = mask;
+ watch->inode = inode;
+ watch->dev = dev;
+ INIT_LIST_HEAD(&watch->d_list);
+ INIT_LIST_HEAD(&watch->i_list);
+
+ if (inotify_dev_get_wd(dev, watch)) {
+ kmem_cache_free(watch_cachep, watch);
+ return NULL;
+ }
+
+ return watch;
+}
+
+/*
+ * delete_watch - removes the given 'watch' from the given 'dev'
+ *
+ * Caller must hold dev->lock.
+ */
+static void delete_watch(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ inotify_dev_put_wd(dev, watch->wd);
+ kmem_cache_free(watch_cachep, watch);
+}
+
+/*
+ * inode_find_dev - find the watch associated with the given inode and dev
+ *
+ * Caller must hold dev->lock.
+ * FIXME: Needs inotify_data->lock too. Don't need dev->lock, just pin it.
+ */
+static struct inotify_watch *inode_find_dev(struct inode *inode,
+ struct inotify_device *dev)
+{
+ struct inotify_watch *watch;
+
+ if (!inode->inotify_data)
+ return NULL;
+
+ list_for_each_entry(watch, &inode->inotify_data->watches, i_list) {
+ if (watch->dev == dev)
+ return watch;
+ }
+
+ return NULL;
+}
+
+/*
+ * dev_find_wd - given a (dev,wd) pair, returns the matching inotify_watch
+ *
+ * Returns the results of looking up (dev,wd) in the idr layer. NULL is
+ * returned on error.
+ *
+ * The caller must hold dev->lock.
+ */
+static inline struct inotify_watch *dev_find_wd(struct inotify_device *dev,
+ u32 wd)
+{
+ return idr_find(&dev->idr, wd);
+}
+
+static int inotify_dev_is_watching_inode(struct inotify_device *dev,
+ struct inode *inode)
+{
+ struct inotify_watch *watch;
+
+ list_for_each_entry(watch, &dev->watches, d_list) {
+ if (watch->inode == inode)
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
+ * inotify_dev_add_watch - add the given watch to the given device instance
+ *
+ * Caller must hold dev->lock.
+ */
+static int inotify_dev_add_watch(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ if (!dev || !watch)
+ return -EINVAL;
+
+ list_add(&watch->d_list, &dev->watches);
+ return 0;
+}
+
+/*
+ * inotify_dev_rm_watch - remove the given watch from the given device
+ *
+ * Caller must hold dev->lock because we call inotify_dev_queue_event().
+ */
+static int inotify_dev_rm_watch(struct inotify_device *dev,
+ struct inotify_watch *watch)
+{
+ if (!watch)
+ return -EINVAL;
+
+ inotify_dev_queue_event(dev, watch, IN_IGNORED, 0, NULL);
+ list_del_init(&watch->d_list);
+
+ return 0;
+}
+
+/*
+ * inode_add_watch - add a watch to the given inode
+ *
+ * Callers must hold dev->lock, because we call inode_find_dev().
+ */
+static int inode_add_watch(struct inode *inode, struct inotify_watch *watch)
+{
+ int ret;
+
+ if (!inode || !watch)
+ return -EINVAL;
+
+ spin_lock(&inode->i_lock);
+ if (!inode->inotify_data) {
+ /* inotify_data is not attached to the inode, so add it */
+ inode->inotify_data = kmem_cache_alloc(inode_data_cachep,
+ GFP_ATOMIC);
+ if (!inode->inotify_data) {
+ ret = -ENOMEM;
+ goto out_lock;
+ }
+
+ atomic_set(&inode->inotify_data->count, 0);
+ INIT_LIST_HEAD(&inode->inotify_data->watches);
+ spin_lock_init(&inode->inotify_data->lock);
+ } else if (inode_find_dev(inode, watch->dev)) {
+ /* a watch is already associated with this (inode,dev) pair */
+ ret = -EINVAL;
+ goto out_lock;
+ }
+ __get_inode_data(inode->inotify_data);
+ spin_unlock(&inode->i_lock);
+
+ list_add(&watch->i_list, &inode->inotify_data->watches);
+
+ return 0;
+out_lock:
+ spin_unlock(&inode->i_lock);
+ return ret;
+}
+
+static int inode_rm_watch(struct inode *inode,
+ struct inotify_watch *watch)
+{
+ if (!inode || !watch || !inode->inotify_data)
+ return -EINVAL;
+
+ list_del_init(&watch->i_list);
+
+ /* clean up inode->inotify_data */
+ put_inode_data(inode);
+
+ return 0;
+}
+
+/* Kernel API */
+
+/*
+ * inotify_inode_queue_event - queue an event with the given mask, cookie, and
+ * filename to any watches associated with the given inode.
+ *
+ * inode must be pinned prior to calling.
+ */
+void inotify_inode_queue_event(struct inode *inode, u32 mask, u32 cookie,
+ const char *name)
+{
+ struct inotify_watch *watch;
+
+ if (!inode->inotify_data)
+ return;
+
+ list_for_each_entry(watch, &inode->inotify_data->watches, i_list) {
+ if (watch->mask & mask) {
+ struct inotify_device *dev = watch->dev;
+ spin_lock(&dev->lock);
+ inotify_dev_queue_event(dev, watch, mask, cookie, name);
+ spin_unlock(&dev->lock);
+ }
+ }
+}
+EXPORT_SYMBOL_GPL(inotify_inode_queue_event);
+
+void inotify_dentry_parent_queue_event(struct dentry *dentry, u32 mask,
+ u32 cookie, const char *filename)
+{
+ struct dentry *parent;
+ struct inode *inode;
+
+ spin_lock(&dentry->d_lock);
+ parent = dentry->d_parent;
+ inode = parent->d_inode;
+ if (inode->inotify_data) {
+ dget(parent);
+ spin_unlock(&dentry->d_lock);
+ inotify_inode_queue_event(inode, mask, cookie, filename);
+ dput(parent);
+ } else
+ spin_unlock(&dentry->d_lock);
+}
+EXPORT_SYMBOL_GPL(inotify_dentry_parent_queue_event);
+
+u32 inotify_get_cookie(void)
+{
+ /* increment and fetch in one atomic step so concurrent callers get distinct cookies */
+ return atomic_inc_return(&inotify_cookie);
+}
+EXPORT_SYMBOL_GPL(inotify_get_cookie);
+
+/*
+ * Caller must hold dev->lock.
+ */
+static void __remove_watch(struct inotify_watch *watch,
+ struct inotify_device *dev)
+{
+ struct inode *inode;
+
+ inode = watch->inode;
+
+ inode_rm_watch(inode, watch);
+ inotify_dev_rm_watch(dev, watch);
+ delete_watch(dev, watch);
+
+ iput(inode);
+}
+
+/*
+ * remove_watch - remove a watch from both the device and the inode.
+ *
+ * watch->inode must be pinned. We drop a reference before returning. Grabs
+ * dev->lock.
+ */
+static void remove_watch(struct inotify_watch *watch)
+{
+ struct inotify_device *dev = watch->dev;
+
+ spin_lock(&dev->lock);
+ __remove_watch(watch, dev);
+ spin_unlock(&dev->lock);
+}
+
+void inotify_super_block_umount(struct super_block *sb)
+{
+ struct inode *inode;
+
+ spin_lock(&inode_lock);
+
+ /*
+ * We hold the inode_lock, so the inodes are not going anywhere, and
+ * we grab a reference on inotify_data before walking its list of
+ * watches.
+ */
+ list_for_each_entry(inode, &inode_in_use, i_list) {
+ struct inotify_inode_data *inode_data;
+ struct inotify_watch *watch, *next;
+
+ if (inode->i_sb != sb)
+ continue;
+
+ inode_data = get_inode_data(inode);
+ if (!inode_data)
+ continue;
+
+ list_for_each_entry_safe(watch, next, &inode_data->watches, i_list) {
+ struct inotify_device *dev = watch->dev;
+ spin_lock(&dev->lock);
+ inotify_dev_queue_event(dev, watch, IN_UNMOUNT, 0,
+ NULL);
+ __remove_watch(watch, dev);
+ spin_unlock(&dev->lock);
+ }
+ put_inode_data(inode);
+ }
+
+ spin_unlock(&inode_lock);
+}
+EXPORT_SYMBOL_GPL(inotify_super_block_umount);
+
+/*
+ * inotify_inode_is_dead - an inode has been deleted, cleanup any watches
+ */
+void inotify_inode_is_dead(struct inode *inode)
+{
+ struct inotify_watch *watch, *next;
+ struct inotify_inode_data *data;
+
+ data = get_inode_data(inode);
+ if (!data)
+ return;
+ list_for_each_entry_safe(watch, next, &data->watches, i_list)
+ remove_watch(watch);
+ put_inode_data(inode);
+}
+EXPORT_SYMBOL_GPL(inotify_inode_is_dead);
+
+/* The driver interface is implemented below */
+
+static unsigned int inotify_poll(struct file *file, poll_table *wait)
+{
+ struct inotify_device *dev;
+
+ dev = file->private_data;
+
+ poll_wait(file, &dev->wait, wait);
+
+ if (inotify_dev_has_events(dev))
+ return POLLIN | POLLRDNORM;
+
+ return 0;
+}
+
+static ssize_t inotify_read(struct file *file, char __user *buf,
+ size_t count, loff_t *pos)
+{
+ size_t event_size;
+ struct inotify_device *dev;
+ char __user *start;
+ DECLARE_WAITQUEUE(wait, current);
+
+ start = buf;
+ dev = file->private_data;
+
+ /* We only hand out full inotify events */
+ event_size = sizeof(struct inotify_event);
+ if (count < event_size)
+ return 0;
+
+ while (1) {
+ int has_events;
+
+ spin_lock(&dev->lock);
+ has_events = inotify_dev_has_events(dev);
+ spin_unlock(&dev->lock);
+ if (has_events)
+ break;
+
+ if (file->f_flags & O_NONBLOCK)
+ return -EAGAIN;
+
+ if (signal_pending(current))
+ return -EINTR;
+
+ add_wait_queue(&dev->wait, &wait);
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ schedule();
+
+ set_current_state(TASK_RUNNING);
+ remove_wait_queue(&dev->wait, &wait);
+ }
+
+ while (count >= event_size) {
+ struct inotify_kernel_event *kevent;
+
+ spin_lock(&dev->lock);
+ if (!inotify_dev_has_events(dev)) {
+ spin_unlock(&dev->lock);
+ break;
+ }
+ kevent = inotify_dev_get_event(dev);
+ spin_unlock(&dev->lock);
+
+ /* We can't send this event, not enough space in the buffer */
+ if (event_size + kevent->event.len > count)
+ break;
+
+ /* Copy the entire event except the string to user space */
+ if (copy_to_user(buf, &kevent->event, event_size))
+ return -EFAULT;
+
+ buf += event_size;
+ count -= event_size;
+
+ /* Copy the filename to user space */
+ if (kevent->filename) {
+ if (copy_to_user(buf, kevent->filename,
+ kevent->event.len))
+ return -EFAULT;
+ buf += kevent->event.len;
+ count -= kevent->event.len;
+ }
+
+ spin_lock(&dev->lock);
+ inotify_dev_event_dequeue(dev);
+ spin_unlock(&dev->lock);
+ }
+
+ return buf - start;
+}
+
+static int inotify_open(struct inode *inode, struct file *file)
+{
+ struct inotify_device *dev;
+ struct user_struct *user;
+ int ret;
+
+ user = get_uid(current->user);
+
+ if (atomic_read(&user->inotify_devs) >= sysfs_attrib_max_user_devices) {
+ ret = -EMFILE;
+ goto out_err;
+ }
+
+ dev = kmalloc(sizeof(struct inotify_device), GFP_KERNEL);
+ if (!dev) {
+ ret = -ENOMEM;
+ goto out_err;
+ }
+
+ atomic_inc(&current->user->inotify_devs);
+
+ idr_init(&dev->idr);
+
+ INIT_LIST_HEAD(&dev->events);
+ INIT_LIST_HEAD(&dev->watches);
+ init_waitqueue_head(&dev->wait);
+
+ dev->event_count = 0;
+ dev->queue_size = 0;
+ dev->max_events = sysfs_attrib_max_queued_events;
+ dev->user = user;
+ spin_lock_init(&dev->lock);
+
+ file->private_data = dev;
+
+ return 0;
+out_err:
+ free_uid(current->user);
+ return ret;
+}
+
+/*
+ * inotify_release_all_watches - destroy all watches on a given device
+ *
+ * FIXME: We need a lock on the watch here.
+ */
+static void inotify_release_all_watches(struct inotify_device *dev)
+{
+ struct inotify_watch *watch, *next;
+
+ list_for_each_entry_safe(watch, next, &dev->watches, d_list)
+ remove_watch(watch);
+}
+
+/*
+ * inotify_release_all_events - destroy all of the events on a given device
+ */
+static void inotify_release_all_events(struct inotify_device *dev)
+{
+ spin_lock(&dev->lock);
+ while (inotify_dev_has_events(dev))
+ inotify_dev_event_dequeue(dev);
+ spin_unlock(&dev->lock);
+}
+
+static int inotify_release(struct inode *inode, struct file *file)
+{
+ struct inotify_device *dev;
+
+ dev = file->private_data;
+
+ inotify_release_all_watches(dev);
+ inotify_release_all_events(dev);
+
+ atomic_dec(&dev->user->inotify_devs);
+ free_uid(dev->user);
+
+ kfree(dev);
+
+ return 0;
+}
+
+static int inotify_add_watch(struct inotify_device *dev,
+ struct inotify_watch_request *request)
+{
+ struct inode *inode;
+ struct inotify_watch *watch;
+ struct nameidata nd;
+ int ret;
+
+ ret = find_inode((const char __user*) request->name, &nd);
+ if (ret)
+ return ret;
+
+ /* held in place by references in nd */
+ inode = nd.dentry->d_inode;
+
+ spin_lock(&dev->lock);
+
+ /*
+ * This handles the case of re-adding a directory we are already
+ * watching: we just update the mask and return the existing wd.
+ */
+ if (inotify_dev_is_watching_inode(dev, inode)) {
+ struct inotify_watch *owatch; /* the old watch */
+
+ owatch = inode_find_dev(inode, dev);
+ owatch->mask = request->mask;
+ spin_unlock(&dev->lock);
+ path_release(&nd);
+
+ return owatch->wd;
+ }
+
+ spin_unlock(&dev->lock);
+
+ watch = create_watch(dev, request->mask, inode);
+ if (!watch) {
+ path_release(&nd);
+ return -ENOSPC;
+ }
+
+ spin_lock(&dev->lock);
+
+ /* We can't add any more watches to this device */
+ if (inotify_dev_add_watch(dev, watch)) {
+ delete_watch(dev, watch);
+ spin_unlock(&dev->lock);
+ path_release(&nd);
+ return -EINVAL;
+ }
+
+ ret = inode_add_watch(inode, watch);
+ if (ret < 0) {
+ list_del_init(&watch->d_list);
+ delete_watch(dev, watch);
+ spin_unlock(&dev->lock);
+ path_release(&nd);
+ return ret;
+ }
+
+ spin_unlock(&dev->lock);
+
+ /*
+ * Demote the reference to nameidata to a reference to the inode held
+ * by the watch.
+ */
+ spin_lock(&inode_lock);
+ __iget(inode);
+ spin_unlock(&inode_lock);
+ path_release(&nd);
+
+ return watch->wd;
+}
+
+static int inotify_ignore(struct inotify_device *dev, s32 wd)
+{
+ struct inotify_watch *watch;
+ int ret = 0;
+
+ spin_lock(&dev->lock);
+ watch = dev_find_wd(dev, wd);
+ //spin_unlock(&dev->lock);
+ if (!watch) {
+ ret = -EINVAL;
+ goto out;
+ }
+ __remove_watch(watch, dev);
+
+out:
+ spin_unlock(&dev->lock);
+ return ret;
+}
+
+/*
+ * inotify_ioctl() - our device file's ioctl method
+ *
+ * The VFS serializes all of our calls via the BKL and we rely on that. We
+ * could, alternatively, grab dev->lock. Right now lower levels grab that
+ * where needed.
+ */
+static int inotify_ioctl(struct inode *ip, struct file *fp,
+ unsigned int cmd, unsigned long arg)
+{
+ struct inotify_device *dev;
+ struct inotify_watch_request request;
+ void __user *p;
+ s32 wd;
+
+ dev = fp->private_data;
+ p = (void __user *) arg;
+
+ switch (cmd) {
+ case INOTIFY_WATCH:
+ if (copy_from_user(&request, p, sizeof (request)))
+ return -EFAULT;
+ return inotify_add_watch(dev, &request);
+ case INOTIFY_IGNORE:
+ if (copy_from_user(&wd, p, sizeof (wd)))
+ return -EFAULT;
+ return inotify_ignore(dev, wd);
+ case FIONREAD:
+ return put_user(dev->queue_size, (int __user *) p);
+ default:
+ return -ENOTTY;
+ }
+}
+
+static struct file_operations inotify_fops = {
+ .owner = THIS_MODULE,
+ .poll = inotify_poll,
+ .read = inotify_read,
+ .open = inotify_open,
+ .release = inotify_release,
+ .ioctl = inotify_ioctl,
+};
+
+static struct miscdevice inotify_device = {
+ .minor = MISC_DYNAMIC_MINOR,
+ .name = "inotify",
+ .fops = &inotify_fops,
+};
+
+static int __init inotify_init(void)
+{
+ struct class_device *class;
+ int ret;
+
+ ret = misc_register(&inotify_device);
+ if (ret)
+ return ret;
+
+ sysfs_attrib_max_queued_events = 512;
+ sysfs_attrib_max_user_devices = 64;
+ sysfs_attrib_max_user_watches = 16384;
+
+ class = inotify_device.class;
+ class_device_create_file(class, &class_device_attr_max_queued_events);
+ class_device_create_file(class, &class_device_attr_max_user_devices);
+ class_device_create_file(class, &class_device_attr_max_user_watches);
+
+ atomic_set(&inotify_cookie, 0);
+
+ watch_cachep = kmem_cache_create("inotify_watch_cache",
+ sizeof(struct inotify_watch), 0, SLAB_PANIC,
+ NULL, NULL);
+
+ event_cachep = kmem_cache_create("inotify_event_cache",
+ sizeof(struct inotify_kernel_event), 0,
+ SLAB_PANIC, NULL, NULL);
+
+ inode_data_cachep = kmem_cache_create("inotify_inode_data_cache",
+ sizeof(struct inotify_inode_data), 0, SLAB_PANIC,
+ NULL, NULL);
+
+ printk(KERN_INFO "inotify device minor=%d\n", inotify_device.minor);
+
+ return 0;
+}
+
+module_init(inotify_init);
diff -urN linux-2.6.10/drivers/char/Kconfig linux/drivers/char/Kconfig
--- linux-2.6.10/drivers/char/Kconfig 2004-12-24 16:33:49.000000000 -0500
+++ linux/drivers/char/Kconfig 2005-01-18 16:11:08.000000000 -0500
@@ -62,6 +62,19 @@
depends on VT && !S390 && !USERMODE
default y
+config INOTIFY
+ bool "Inotify file change notification support"
+ default y
+ ---help---
+ Say Y here to enable inotify support and the /dev/inotify character
+ device. Inotify is a file change notification system and a
+ replacement for dnotify. Inotify fixes numerous shortcomings in
+ dnotify and introduces several new features. It allows monitoring
+ of both files and directories via a single open fd. Multiple file
+ events are supported.
+
+ If unsure, say Y.
+
config SERIAL_NONSTANDARD
bool "Non-standard serial port support"
---help---
diff -urN linux-2.6.10/drivers/char/Makefile linux/drivers/char/Makefile
--- linux-2.6.10/drivers/char/Makefile 2004-12-24 16:35:29.000000000 -0500
+++ linux/drivers/char/Makefile 2005-01-18 16:11:08.000000000 -0500
@@ -9,6 +9,8 @@
obj-y += mem.o random.o tty_io.o n_tty.o tty_ioctl.o
+
+obj-$(CONFIG_INOTIFY) += inotify.o
obj-$(CONFIG_LEGACY_PTYS) += pty.o
obj-$(CONFIG_UNIX98_PTYS) += pty.o
obj-y += misc.o
diff -urN linux-2.6.10/drivers/char/misc.c linux/drivers/char/misc.c
--- linux-2.6.10/drivers/char/misc.c 2004-12-24 16:35:28.000000000 -0500
+++ linux/drivers/char/misc.c 2005-01-18 16:11:08.000000000 -0500
@@ -207,10 +207,9 @@
int misc_register(struct miscdevice * misc)
{
struct miscdevice *c;
- struct class_device *class;
dev_t dev;
int err;
-
+
down(&misc_sem);
list_for_each_entry(c, &misc_list, list) {
if (c->minor == misc->minor) {
@@ -224,8 +223,7 @@
while (--i >= 0)
if ( (misc_minors[i>>3] & (1 << (i&7))) == 0)
break;
- if (i<0)
- {
+ if (i<0) {
up(&misc_sem);
return -EBUSY;
}
@@ -240,10 +238,10 @@
}
dev = MKDEV(MISC_MAJOR, misc->minor);
- class = class_simple_device_add(misc_class, dev,
- misc->dev, misc->name);
- if (IS_ERR(class)) {
- err = PTR_ERR(class);
+ misc->class = class_simple_device_add(misc_class, dev,
+ misc->dev, misc->name);
+ if (IS_ERR(misc->class)) {
+ err = PTR_ERR(misc->class);
goto out;
}
diff -urN linux-2.6.10/fs/attr.c linux/fs/attr.c
--- linux-2.6.10/fs/attr.c 2004-12-24 16:34:00.000000000 -0500
+++ linux/fs/attr.c 2005-01-31 15:52:37.000000000 -0500
@@ -10,7 +10,7 @@
#include <linux/mm.h>
#include <linux/string.h>
#include <linux/smp_lock.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/fcntl.h>
#include <linux/quotaops.h>
#include <linux/security.h>
@@ -103,31 +103,8 @@
out:
return error;
}
-
EXPORT_SYMBOL(inode_setattr);
-int setattr_mask(unsigned int ia_valid)
-{
- unsigned long dn_mask = 0;
-
- if (ia_valid & ATTR_UID)
- dn_mask |= DN_ATTRIB;
- if (ia_valid & ATTR_GID)
- dn_mask |= DN_ATTRIB;
- if (ia_valid & ATTR_SIZE)
- dn_mask |= DN_MODIFY;
- /* both times implies a utime(s) call */
- if ((ia_valid & (ATTR_ATIME|ATTR_MTIME)) == (ATTR_ATIME|ATTR_MTIME))
- dn_mask |= DN_ATTRIB;
- else if (ia_valid & ATTR_ATIME)
- dn_mask |= DN_ACCESS;
- else if (ia_valid & ATTR_MTIME)
- dn_mask |= DN_MODIFY;
- if (ia_valid & ATTR_MODE)
- dn_mask |= DN_ATTRIB;
- return dn_mask;
-}
-
int notify_change(struct dentry * dentry, struct iattr * attr)
{
struct inode *inode = dentry->d_inode;
@@ -183,11 +160,10 @@
error = inode_setattr(inode, attr);
}
}
- if (!error) {
- unsigned long dn_mask = setattr_mask(ia_valid);
- if (dn_mask)
- dnotify_parent(dentry, dn_mask);
- }
+
+ if (!error)
+ fsnotify_change(dentry, ia_valid);
+
return error;
}
diff -urN linux-2.6.10/fs/compat.c linux/fs/compat.c
--- linux-2.6.10/fs/compat.c 2004-12-24 16:34:44.000000000 -0500
+++ linux/fs/compat.c 2005-02-04 12:07:47.000000000 -0500
@@ -35,7 +35,7 @@
#include <linux/ctype.h>
#include <linux/module.h>
#include <linux/dirent.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/highuid.h>
#include <linux/sunrpc/svc.h>
#include <linux/nfsd/nfsd.h>
@@ -1192,9 +1192,15 @@
out:
if (iov != iovstack)
kfree(iov);
- if ((ret + (type == READ)) > 0)
- dnotify_parent(file->f_dentry,
- (type == READ) ? DN_ACCESS : DN_MODIFY);
+ if ((ret + (type == READ)) > 0) {
+ struct dentry *dentry = file->f_dentry;
+ if (type == READ)
+ fsnotify_access(dentry, dentry->d_inode,
+ dentry->d_name.name);
+ else
+ fsnotify_modify(dentry, dentry->d_inode,
+ dentry->d_name.name);
+ }
return ret;
}
diff -urN linux-2.6.10/fs/file_table.c linux/fs/file_table.c
--- linux-2.6.10/fs/file_table.c 2004-12-24 16:33:50.000000000 -0500
+++ linux/fs/file_table.c 2005-01-31 15:46:49.000000000 -0500
@@ -16,6 +16,7 @@
#include <linux/eventpoll.h>
#include <linux/mount.h>
#include <linux/cdev.h>
+#include <linux/fsnotify.h>
/* sysctl tunables... */
struct files_stat_struct files_stat = {
@@ -121,6 +122,9 @@
struct vfsmount *mnt = file->f_vfsmnt;
struct inode *inode = dentry->d_inode;
+
+ fsnotify_close(dentry, inode, file->f_mode, dentry->d_name.name);
+
might_sleep();
/*
* The function eventpoll_release() should be the first called
diff -urN linux-2.6.10/fs/inode.c linux/fs/inode.c
--- linux-2.6.10/fs/inode.c 2004-12-24 16:35:40.000000000 -0500
+++ linux/fs/inode.c 2005-01-18 16:11:08.000000000 -0500
@@ -130,6 +130,9 @@
#ifdef CONFIG_QUOTA
memset(&inode->i_dquot, 0, sizeof(inode->i_dquot));
#endif
+#ifdef CONFIG_INOTIFY
+ inode->inotify_data = NULL;
+#endif
inode->i_pipe = NULL;
inode->i_bdev = NULL;
inode->i_cdev = NULL;
diff -urN linux-2.6.10/fs/namei.c linux/fs/namei.c
--- linux-2.6.10/fs/namei.c 2004-12-24 16:34:30.000000000 -0500
+++ linux/fs/namei.c 2005-01-31 17:24:21.000000000 -0500
@@ -21,7 +21,7 @@
#include <linux/namei.h>
#include <linux/quotaops.h>
#include <linux/pagemap.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/smp_lock.h>
#include <linux/personality.h>
#include <linux/security.h>
@@ -1241,7 +1241,7 @@
DQUOT_INIT(dir);
error = dir->i_op->create(dir, dentry, mode, nd);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, dentry->d_name.name);
security_inode_post_create(dir, dentry, mode);
}
return error;
@@ -1555,7 +1555,7 @@
DQUOT_INIT(dir);
error = dir->i_op->mknod(dir, dentry, mode, dev);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, dentry->d_name.name);
security_inode_post_mknod(dir, dentry, mode, dev);
}
return error;
@@ -1628,7 +1628,7 @@
DQUOT_INIT(dir);
error = dir->i_op->mkdir(dir, dentry, mode);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_mkdir(dir, dentry->d_name.name);
security_inode_post_mkdir(dir,dentry, mode);
}
return error;
@@ -1722,10 +1722,8 @@
}
}
up(&dentry->d_inode->i_sem);
- if (!error) {
- inode_dir_notify(dir, DN_DELETE);
- d_delete(dentry);
- }
+ if (!error)
+ fsnotify_rmdir(dentry, dentry->d_inode, dir);
dput(dentry);
return error;
@@ -1795,10 +1793,9 @@
up(&dentry->d_inode->i_sem);
/* We don't d_delete() NFS sillyrenamed files--they still exist. */
- if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED)) {
- d_delete(dentry);
- inode_dir_notify(dir, DN_DELETE);
- }
+ if (!error && !(dentry->d_flags & DCACHE_NFSFS_RENAMED))
+ fsnotify_unlink(dentry->d_inode, dir, dentry);
+
return error;
}
@@ -1872,7 +1869,7 @@
DQUOT_INIT(dir);
error = dir->i_op->symlink(dir, dentry, oldname);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, dentry->d_name.name);
security_inode_post_symlink(dir, dentry, oldname);
}
return error;
@@ -1945,7 +1942,7 @@
error = dir->i_op->link(old_dentry, dir, new_dentry);
up(&old_dentry->d_inode->i_sem);
if (!error) {
- inode_dir_notify(dir, DN_CREATE);
+ fsnotify_create(dir, new_dentry->d_name.name);
security_inode_post_link(old_dentry, dir, new_dentry);
}
return error;
@@ -2109,6 +2106,7 @@
{
int error;
int is_dir = S_ISDIR(old_dentry->d_inode->i_mode);
+ char *old_name;
if (old_dentry->d_inode == new_dentry->d_inode)
return 0;
@@ -2130,18 +2128,18 @@
DQUOT_INIT(old_dir);
DQUOT_INIT(new_dir);
+ old_name = fsnotify_oldname_init(old_dentry);
+
if (is_dir)
error = vfs_rename_dir(old_dir,old_dentry,new_dir,new_dentry);
else
error = vfs_rename_other(old_dir,old_dentry,new_dir,new_dentry);
if (!error) {
- if (old_dir == new_dir)
- inode_dir_notify(old_dir, DN_RENAME);
- else {
- inode_dir_notify(old_dir, DN_DELETE);
- inode_dir_notify(new_dir, DN_CREATE);
- }
+ const char *new_name = old_dentry->d_name.name;
+ fsnotify_move(old_dir, new_dir, old_name, new_name);
}
+ fsnotify_oldname_free(old_name);
+
return error;
}
diff -urN linux-2.6.10/fs/open.c linux/fs/open.c
--- linux-2.6.10/fs/open.c 2004-12-24 16:33:50.000000000 -0500
+++ linux/fs/open.c 2005-02-02 11:26:06.000000000 -0500
@@ -10,7 +10,7 @@
#include <linux/file.h>
#include <linux/smp_lock.h>
#include <linux/quotaops.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/tty.h>
@@ -953,9 +953,14 @@
fd = get_unused_fd();
if (fd >= 0) {
struct file *f = filp_open(tmp, flags, mode);
+ struct dentry *dentry;
+
error = PTR_ERR(f);
if (IS_ERR(f))
goto out_error;
+ dentry = f->f_dentry;
+ fsnotify_open(dentry, dentry->d_inode,
+ dentry->d_name.name);
fd_install(fd, f);
}
out:
@@ -1007,7 +1012,7 @@
retval = err;
}
- dnotify_flush(filp, id);
+ fsnotify_flush(filp, id);
locks_remove_posix(filp, id);
fput(filp);
return retval;
diff -urN linux-2.6.10/fs/read_write.c linux/fs/read_write.c
--- linux-2.6.10/fs/read_write.c 2004-12-24 16:35:00.000000000 -0500
+++ linux/fs/read_write.c 2005-01-31 16:35:05.000000000 -0500
@@ -10,7 +10,7 @@
#include <linux/file.h>
#include <linux/uio.h>
#include <linux/smp_lock.h>
-#include <linux/dnotify.h>
+#include <linux/fsnotify.h>
#include <linux/security.h>
#include <linux/module.h>
#include <linux/syscalls.h>
@@ -216,8 +216,11 @@
ret = file->f_op->read(file, buf, count, pos);
else
ret = do_sync_read(file, buf, count, pos);
- if (ret > 0)
- dnotify_parent(file->f_dentry, DN_ACCESS);
+ if (ret > 0) {
+ struct dentry *dentry = file->f_dentry;
+ fsnotify_access(dentry, inode,
+ dentry->d_name.name);
+ }
}
}
@@ -260,8 +263,11 @@
ret = file->f_op->write(file, buf, count, pos);
else
ret = do_sync_write(file, buf, count, pos);
- if (ret > 0)
- dnotify_parent(file->f_dentry, DN_MODIFY);
+ if (ret > 0) {
+ struct dentry *dentry = file->f_dentry;
+ fsnotify_modify(dentry, inode,
+ dentry->d_name.name);
+ }
}
}
@@ -493,9 +499,15 @@
out:
if (iov != iovstack)
kfree(iov);
- if ((ret + (type == READ)) > 0)
- dnotify_parent(file->f_dentry,
- (type == READ) ? DN_ACCESS : DN_MODIFY);
+ if ((ret + (type == READ)) > 0) {
+ struct dentry *dentry = file->f_dentry;
+ struct inode *inode = dentry->d_inode;
+
+ if (type == READ)
+ fsnotify_access(dentry, inode, dentry->d_name.name);
+ else
+ fsnotify_modify(dentry, inode, dentry->d_name.name);
+ }
return ret;
}
diff -urN linux-2.6.10/fs/super.c linux/fs/super.c
--- linux-2.6.10/fs/super.c 2004-12-24 16:34:33.000000000 -0500
+++ linux/fs/super.c 2005-01-31 14:53:38.000000000 -0500
@@ -37,9 +37,9 @@
#include <linux/writeback.h> /* for the emergency remount stuff */
#include <linux/idr.h>
#include <linux/kobject.h>
+#include <linux/fsnotify.h>
#include <asm/uaccess.h>
-
void get_filesystem(struct file_system_type *fs);
void put_filesystem(struct file_system_type *fs);
struct file_system_type *get_fs_type(const char *name);
@@ -227,6 +227,7 @@
if (root) {
sb->s_root = NULL;
+ fsnotify_sb_umount(sb);
shrink_dcache_parent(root);
shrink_dcache_anon(&sb->s_anon);
dput(root);
diff -urN linux-2.6.10/include/linux/fs.h linux/include/linux/fs.h
--- linux-2.6.10/include/linux/fs.h 2004-12-24 16:34:27.000000000 -0500
+++ linux/include/linux/fs.h 2005-01-18 16:11:08.000000000 -0500
@@ -27,6 +27,7 @@
struct kstatfs;
struct vm_area_struct;
struct vfsmount;
+struct inotify_inode_data;
/*
* It's silly to have NR_OPEN bigger than NR_FILE, but you can change
@@ -473,6 +474,10 @@
struct dnotify_struct *i_dnotify; /* for directory notifications */
#endif
+#ifdef CONFIG_INOTIFY
+ struct inotify_inode_data *inotify_data;
+#endif
+
unsigned long i_state;
unsigned long dirtied_when; /* jiffies of first dirtying */
@@ -1353,7 +1358,7 @@
extern int do_remount_sb(struct super_block *sb, int flags,
void *data, int force);
extern sector_t bmap(struct inode *, sector_t);
-extern int setattr_mask(unsigned int);
+extern void setattr_mask(unsigned int, int *, u32 *);
extern int notify_change(struct dentry *, struct iattr *);
extern int permission(struct inode *, int, struct nameidata *);
extern int generic_permission(struct inode *, int,
diff -urN linux-2.6.10/include/linux/fsnotify.h linux/include/linux/fsnotify.h
--- linux-2.6.10/include/linux/fsnotify.h 1969-12-31 19:00:00.000000000 -0500
+++ linux/include/linux/fsnotify.h 2005-02-04 12:09:48.000000000 -0500
@@ -0,0 +1,235 @@
+#ifndef _LINUX_FS_NOTIFY_H
+#define _LINUX_FS_NOTIFY_H
+
+/*
+ * include/linux/fsnotify.h - generic hooks for filesystem notification, to
+ * reduce in-source duplication from both dnotify and inotify.
+ *
+ * We don't compile any of this away in some complicated menagerie of ifdefs.
+ * Instead, we rely on the code inside to optimize away as needed.
+ *
+ * (C) Copyright 2005 Robert Love
+ */
+
+#ifdef __KERNEL__
+
+#include <linux/dnotify.h>
+#include <linux/inotify.h>
+
+/*
+ * fsnotify_move - file old_name at old_dir was moved to new_name at new_dir
+ */
+static inline void fsnotify_move(struct inode *old_dir, struct inode *new_dir,
+ const char *old_name, const char *new_name)
+{
+ u32 cookie;
+
+ if (old_dir == new_dir)
+ inode_dir_notify(old_dir, DN_RENAME);
+ else {
+ inode_dir_notify(old_dir, DN_DELETE);
+ inode_dir_notify(new_dir, DN_CREATE);
+ }
+
+ cookie = inotify_get_cookie();
+
+ inotify_inode_queue_event(old_dir, IN_MOVED_FROM, cookie, old_name);
+ inotify_inode_queue_event(new_dir, IN_MOVED_TO, cookie, new_name);
+}
+
+/*
+ * fsnotify_unlink - file was unlinked
+ */
+static inline void fsnotify_unlink(struct inode *inode, struct inode *dir,
+ struct dentry *dentry)
+{
+ inode_dir_notify(dir, DN_DELETE);
+ inotify_inode_queue_event(dir, IN_DELETE_FILE, 0, dentry->d_name.name);
+ inotify_inode_queue_event(inode, IN_DELETE_SELF, 0, NULL);
+
+ inotify_inode_is_dead(inode);
+ d_delete(dentry);
+}
+
+/*
+ * fsnotify_rmdir - directory was removed
+ */
+static inline void fsnotify_rmdir(struct dentry *dentry, struct inode *inode,
+ struct inode *dir)
+{
+ inode_dir_notify(dir, DN_DELETE);
+ inotify_inode_queue_event(dir, IN_DELETE_SUBDIR,0,dentry->d_name.name);
+ inotify_inode_queue_event(inode, IN_DELETE_SELF, 0, NULL);
+
+ inotify_inode_is_dead(inode);
+ d_delete(dentry);
+}
+
+/*
+ * fsnotify_create - filename was linked in
+ */
+static inline void fsnotify_create(struct inode *inode, const char *filename)
+{
+ inode_dir_notify(inode, DN_CREATE);
+ inotify_inode_queue_event(inode, IN_CREATE_FILE, 0, filename);
+}
+
+/*
+ * fsnotify_mkdir - directory 'name' was created
+ */
+static inline void fsnotify_mkdir(struct inode *inode, const char *name)
+{
+ inode_dir_notify(inode, DN_CREATE);
+ inotify_inode_queue_event(inode, IN_CREATE_SUBDIR, 0, name);
+}
+
+/*
+ * fsnotify_access - file was read
+ */
+static inline void fsnotify_access(struct dentry *dentry, struct inode *inode,
+ const char *filename)
+{
+ dnotify_parent(dentry, DN_ACCESS);
+ inotify_dentry_parent_queue_event(dentry, IN_ACCESS, 0,
+ dentry->d_name.name);
+ inotify_inode_queue_event(inode, IN_ACCESS, 0, NULL);
+}
+
+/*
+ * fsnotify_modify - file was modified
+ */
+static inline void fsnotify_modify(struct dentry *dentry, struct inode *inode,
+ const char *filename)
+{
+ dnotify_parent(dentry, DN_MODIFY);
+ inotify_dentry_parent_queue_event(dentry, IN_MODIFY, 0, filename);
+ inotify_inode_queue_event(inode, IN_MODIFY, 0, NULL);
+}
+
+/*
+ * fsnotify_open - file was opened
+ */
+static inline void fsnotify_open(struct dentry *dentry, struct inode *inode,
+ const char *filename)
+{
+ inotify_inode_queue_event(inode, IN_OPEN, 0, NULL);
+ inotify_dentry_parent_queue_event(dentry, IN_OPEN, 0, filename);
+}
+
+/*
+ * fsnotify_close - file was closed
+ */
+static inline void fsnotify_close(struct dentry *dentry, struct inode *inode,
+ mode_t mode, const char *filename)
+{
+ u32 mask;
+
+ mask = (mode & FMODE_WRITE) ? IN_CLOSE_WRITE : IN_CLOSE_NOWRITE;
+ inotify_dentry_parent_queue_event(dentry, mask, 0, filename);
+ inotify_inode_queue_event(inode, mask, 0, NULL);
+}
+
+/*
+ * fsnotify_change - notify_change event. file was modified and/or metadata
+ * was changed.
+ */
+static inline void fsnotify_change(struct dentry *dentry, unsigned int ia_valid)
+{
+ int dn_mask = 0;
+ u32 in_mask = 0;
+
+ if (ia_valid & ATTR_UID) {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ }
+ if (ia_valid & ATTR_GID) {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ }
+ if (ia_valid & ATTR_SIZE) {
+ in_mask |= IN_MODIFY;
+ dn_mask |= DN_MODIFY;
+ }
+ /* both times implies a utime(s) call */
+ if ((ia_valid & (ATTR_ATIME | ATTR_MTIME)) == (ATTR_ATIME | ATTR_MTIME))
+ {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ } else if (ia_valid & ATTR_ATIME) {
+ in_mask |= IN_ACCESS;
+ dn_mask |= DN_ACCESS;
+ } else if (ia_valid & ATTR_MTIME) {
+ in_mask |= IN_MODIFY;
+ dn_mask |= DN_MODIFY;
+ }
+ if (ia_valid & ATTR_MODE) {
+ in_mask |= IN_ATTRIB;
+ dn_mask |= DN_ATTRIB;
+ }
+
+ if (dn_mask)
+ dnotify_parent(dentry, dn_mask);
+ if (in_mask) {
+ inotify_inode_queue_event(dentry->d_inode, in_mask, 0, NULL);
+ inotify_dentry_parent_queue_event(dentry, in_mask, 0,
+ dentry->d_name.name);
+ }
+}
+
+/*
+ * fsnotify_sb_umount - filesystem unmount
+ */
+static inline void fsnotify_sb_umount(struct super_block *sb)
+{
+ inotify_super_block_umount(sb);
+}
+
+/*
+ * fsnotify_flush - flush time!
+ */
+static inline void fsnotify_flush(struct file *filp, fl_owner_t id)
+{
+ dnotify_flush(filp, id);
+}
+
+#ifdef CONFIG_INOTIFY /* inotify helpers */
+
+/*
+ * fsnotify_oldname_init - save off the old filename before we change it
+ *
+ * this could be kstrdup if only we could add that to lib/string.c
+ */
+static inline char *fsnotify_oldname_init(struct dentry *old_dentry)
+{
+ char *old_name;
+
+ old_name = kmalloc(strlen(old_dentry->d_name.name) + 1, GFP_KERNEL);
+ if (old_name)
+ strcpy(old_name, old_dentry->d_name.name);
+ return old_name;
+}
+
+/*
+ * fsnotify_oldname_free - free the name we got from fsnotify_oldname_init
+ */
+static inline void fsnotify_oldname_free(const char *old_name)
+{
+ kfree(old_name);
+}
+
+#else /* CONFIG_INOTIFY */
+
+static inline char *fsnotify_oldname_init(struct dentry *old_dentry)
+{
+ return NULL;
+}
+
+static inline void fsnotify_oldname_free(const char *old_name)
+{
+}
+
+#endif /* ! CONFIG_INOTIFY */
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_FS_NOTIFY_H */
diff -urN linux-2.6.10/include/linux/inotify.h linux/include/linux/inotify.h
--- linux-2.6.10/include/linux/inotify.h 1969-12-31 19:00:00.000000000 -0500
+++ linux/include/linux/inotify.h 2005-02-09 16:02:58.291978072 -0500
@@ -0,0 +1,118 @@
+/*
+ * Inode based directory notification for Linux
+ *
+ * Copyright (C) 2005 John McCutchan
+ */
+
+#ifndef _LINUX_INOTIFY_H
+#define _LINUX_INOTIFY_H
+
+#include <linux/types.h>
+#include <linux/limits.h>
+
+/*
+ * struct inotify_event - structure read from the inotify device for each event
+ *
+ * When you are watching a directory, you will receive the filename for events
+ * such as IN_CREATE, IN_DELETE, IN_OPEN, IN_CLOSE, ..., relative to the wd.
+ */
+struct inotify_event {
+ __s32 wd; /* watch descriptor */
+ __u32 mask; /* watch mask */
+ __u32 cookie; /* cookie used for synchronizing two events */
+ size_t len; /* length (including nulls) of name */
+ char name[0]; /* stub for possible name */
+};
+
+/*
+ * struct inotify_watch_request - represents a watch request
+ *
+ * Pass to the inotify device via the INOTIFY_WATCH ioctl
+ */
+struct inotify_watch_request {
+ char *name; /* directory name */
+ __u32 mask; /* event mask */
+};
+
+/* the following are legal, implemented events */
+#define IN_ACCESS 0x00000001 /* File was accessed */
+#define IN_MODIFY 0x00000002 /* File was modified */
+#define IN_ATTRIB 0x00000004 /* File changed attributes */
+#define IN_CLOSE_WRITE 0x00000008 /* Writable file was closed */
+#define IN_CLOSE_NOWRITE 0x00000010 /* Unwritable file closed */
+#define IN_OPEN 0x00000020 /* File was opened */
+#define IN_MOVED_FROM 0x00000040 /* File was moved from X */
+#define IN_MOVED_TO 0x00000080 /* File was moved to Y */
+#define IN_DELETE_SUBDIR 0x00000100 /* Subdir was deleted */
+#define IN_DELETE_FILE 0x00000200 /* Subfile was deleted */
+#define IN_CREATE_SUBDIR 0x00000400 /* Subdir was created */
+#define IN_CREATE_FILE 0x00000800 /* Subfile was created */
+#define IN_DELETE_SELF 0x00001000 /* Self was deleted */
+#define IN_UNMOUNT 0x00002000 /* Backing fs was unmounted */
+#define IN_Q_OVERFLOW 0x00004000 /* Event queue overflowed */
+#define IN_IGNORED 0x00008000 /* File was ignored */
+
+/* special flags */
+#define IN_ALL_EVENTS 0xffffffff /* All the events */
+#define IN_CLOSE (IN_CLOSE_WRITE | IN_CLOSE_NOWRITE)
+
+#define INOTIFY_IOCTL_MAGIC 'Q'
+#define INOTIFY_IOCTL_MAXNR 2
+
+#define INOTIFY_WATCH _IOR(INOTIFY_IOCTL_MAGIC, 1, struct inotify_watch_request)
+#define INOTIFY_IGNORE _IOR(INOTIFY_IOCTL_MAGIC, 2, int)
+
+#ifdef __KERNEL__
+
+#include <linux/dcache.h>
+#include <linux/fs.h>
+#include <linux/config.h>
+
+struct inotify_inode_data {
+ struct list_head watches; /* list of watches on this inode */
+ spinlock_t lock; /* lock protecting the struct */
+ atomic_t count; /* ref count */
+};
+
+#ifdef CONFIG_INOTIFY
+
+extern void inotify_inode_queue_event(struct inode *, __u32, __u32,
+ const char *);
+extern void inotify_dentry_parent_queue_event(struct dentry *, __u32, __u32,
+ const char *);
+extern void inotify_super_block_umount(struct super_block *);
+extern void inotify_inode_is_dead(struct inode *);
+extern __u32 inotify_get_cookie(void);
+
+#else
+
+static inline void inotify_inode_queue_event(struct inode *inode,
+ __u32 mask, __u32 cookie,
+ const char *filename)
+{
+}
+
+static inline void inotify_dentry_parent_queue_event(struct dentry *dentry,
+ __u32 mask, __u32 cookie,
+ const char *filename)
+{
+}
+
+static inline void inotify_super_block_umount(struct super_block *sb)
+{
+}
+
+static inline void inotify_inode_is_dead(struct inode *inode)
+{
+}
+
+static inline __u32 inotify_get_cookie(void)
+{
+ return 0;
+}
+
+#endif /* CONFIG_INOTIFY */
+
+#endif /* __KERNEL__ */
+
+#endif /* _LINUX_INOTIFY_H */
diff -urN linux-2.6.10/include/linux/miscdevice.h linux/include/linux/miscdevice.h
--- linux-2.6.10/include/linux/miscdevice.h 2004-12-24 16:34:58.000000000 -0500
+++ linux/include/linux/miscdevice.h 2005-01-18 16:11:08.000000000 -0500
@@ -2,6 +2,7 @@
#define _LINUX_MISCDEVICE_H
#include <linux/module.h>
#include <linux/major.h>
+#include <linux/device.h>
#define PSMOUSE_MINOR 1
#define MS_BUSMOUSE_MINOR 2
@@ -32,13 +33,13 @@
struct device;
-struct miscdevice
-{
+struct miscdevice {
int minor;
const char *name;
struct file_operations *fops;
struct list_head list;
struct device *dev;
+ struct class_device *class;
char devfs_name[64];
};
diff -urN linux-2.6.10/include/linux/sched.h linux/include/linux/sched.h
--- linux-2.6.10/include/linux/sched.h 2004-12-24 16:33:59.000000000 -0500
+++ linux/include/linux/sched.h 2005-01-18 16:11:08.000000000 -0500
@@ -353,6 +353,8 @@
atomic_t processes; /* How many processes does this user have? */
atomic_t files; /* How many open files does this user have? */
atomic_t sigpending; /* How many pending signals does this user have? */
+ atomic_t inotify_watches; /* How many inotify watches does this user have? */
+ atomic_t inotify_devs; /* How many inotify devs does this user have opened? */
/* protected by mq_lock */
unsigned long mq_bytes; /* How many bytes can be allocated to mqueue? */
unsigned long locked_shm; /* How many pages of mlocked shm ? */
diff -urN linux-2.6.10/kernel/user.c linux/kernel/user.c
--- linux-2.6.10/kernel/user.c 2004-12-24 16:34:31.000000000 -0500
+++ linux/kernel/user.c 2005-01-18 16:11:08.000000000 -0500
@@ -119,6 +119,8 @@
atomic_set(&new->processes, 0);
atomic_set(&new->files, 0);
atomic_set(&new->sigpending, 0);
+ atomic_set(&new->inotify_watches, 0);
+ atomic_set(&new->inotify_devs, 0);
new->mq_bytes = 0;
new->locked_shm = 0;
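
For userspace, the read path in inotify_read() implies a simple record format: each read() on /dev/inotify returns zero or more whole records, each a fixed struct inotify_event header followed by len bytes of name. A rough sketch of walking such a buffer, assuming the header layout from include/linux/inotify.h in this patch. The names ev_header, put_event and walk_events are illustrative and not part of the patch, and the buffer is built in memory so the example runs without the device:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed mirror of the fixed part of the patch's struct inotify_event. */
struct ev_header {
    int32_t  wd;      /* watch descriptor */
    uint32_t mask;    /* event mask */
    uint32_t cookie;  /* pairs IN_MOVED_FROM with IN_MOVED_TO */
    size_t   len;     /* bytes of name that follow, including the NUL */
};

typedef void (*ev_callback)(const struct ev_header *hdr,
                            const char *name, void *ctx);

/* Test helper: append one record to dst, return the bytes written. */
static size_t put_event(char *dst, int32_t wd, uint32_t mask,
                        uint32_t cookie, const char *name)
{
    struct ev_header hdr;

    memset(&hdr, 0, sizeof(hdr));
    hdr.wd = wd;
    hdr.mask = mask;
    hdr.cookie = cookie;
    hdr.len = name ? strlen(name) + 1 : 0;

    memcpy(dst, &hdr, sizeof(hdr));
    if (name)
        memcpy(dst + sizeof(hdr), name, hdr.len);
    return sizeof(hdr) + hdr.len;
}

/*
 * Walk nbytes of records, calling cb (if non-NULL) for each complete one.
 * Returns the number of complete records seen. The header is copied out
 * with memcpy because a record that follows an odd-length name need not
 * be aligned.
 */
static size_t walk_events(const char *buf, size_t nbytes,
                          ev_callback cb, void *ctx)
{
    size_t off = 0, seen = 0;

    assert(buf != NULL);
    while (nbytes - off >= sizeof(struct ev_header)) {
        struct ev_header hdr;
        const char *name;

        memcpy(&hdr, buf + off, sizeof(hdr));
        if (hdr.len > nbytes - off - sizeof(hdr))
            break;  /* truncated record, stop here */

        name = hdr.len ? buf + off + sizeof(hdr) : NULL;
        if (cb)
            cb(&hdr, name, ctx);

        off += sizeof(hdr) + hdr.len;
        seen++;
    }
    return seen;
}
```

A real consumer would pass the buffer and byte count returned by read() straight to walk_events(); the truncation check is defensive, since the read path above only hands out whole events.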
On Fri, Feb 18, 2005 at 11:40:59AM -0500, Robert Love wrote:
> inotify, bitches
/me does "pick a random function, find a race" again.
> +/*
> + * inode_add_watch - add a watch to the given inode
> + *
> + * Callers must hold dev->lock, because we call inode_find_dev().
> + */
> +static int inode_add_watch(struct inode *inode, struct inotify_watch *watch)
[snip]
> + list_add(&watch->i_list, &inode->inotify_data->watches);
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
... and that is protected by what?
> +
> + return 0;
Fix the damn locking, already.
On Fri, 2005-02-18 at 17:24 +0000, Al Viro wrote:
> Fix the damn locking, already.
Fast as I can.
Robert Love
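
The race Al points at, as I read it: inode_add_watch() runs with dev->lock held, but inode->inotify_data->watches is shared by every device watching that inode, so two devices can run list_add() on the same list at once. A minimal userspace sketch of one possible shape for a fix, assuming the per-inode lock in struct inotify_inode_data is meant to cover that list. The names inode_data, watch and inode_data_add_watch are illustrative, a C11 atomic_flag stands in for spinlock_t, and none of this is taken from a later revision of the patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct inotify_watch. */
struct watch {
    int wd;
    struct watch *next;
};

/* Hypothetical stand-in for struct inotify_inode_data. */
struct inode_data {
    atomic_flag lock;        /* plays the role of spinlock_t lock */
    struct watch *watches;   /* per-inode watch list, shared by all devices */
    size_t nr_watches;
};

static void inode_data_init(struct inode_data *d)
{
    atomic_flag_clear(&d->lock);
    d->watches = NULL;
    d->nr_watches = 0;
}

static void data_lock(struct inode_data *d)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&d->lock, memory_order_acquire))
        ;
}

static void data_unlock(struct inode_data *d)
{
    atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/*
 * Link a watch into the per-inode list. The list is only touched with
 * the per-inode lock held, so callers coming from different devices
 * (each holding only its own device lock) cannot corrupt it.
 */
static void inode_data_add_watch(struct inode_data *d, struct watch *w)
{
    assert(d != NULL && w != NULL);

    data_lock(d);
    w->next = d->watches;
    d->watches = w;
    d->nr_watches++;
    data_unlock(d);
}
```

Every reader and remover of the list would have to take the same lock, and the lock ordering against dev->lock would have to be fixed in one direction.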