2002-10-12 04:55:34

by Linus Torvalds

Subject: Linux v2.5.42


Augh.. People have been mailbombing me apparently because a lot of people
finally decided that they really want to sync with me due to the upcoming
feature freeze, so there's a _lot_ of stuff here, all over the map.

Both the NFS client and the server are getting facelifts to support NFSv4.
And both Dave Jones and Alan Cox decided to try to merge more stuff with
me - along with the usual stream from Andrew Morton.

In addition, we have build updates, ISDN, ACPI, input layer, network
drivers and driverfs.. Along with a random collection of other stuff: USB,
s390, ppc etc.

End result: 1MB worth of compressed patches - in four days.

Linus


PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
stand. I'm not using any kind of volume management personally, so I just
don't have the background or inclination to walk through the patches and
make that kind of decision. My non-scientific opinion is that it looks
like the EVMS code is going to be merged, but ..

Alan, Jens, Christoph, others - this is going to be an area where I need
input from people I know, and preferably also help merging. I've been
happy to see the EVMS patches being discussed on linux-kernel, and I just
wanted to let people know that this needs outside help.

----

Summary of changes from v2.5.41 to v2.5.42
============================================

<[email protected]>:
o Increase the resync timeout for serial mice, and fix MZ wheel
direction

<[email protected]>:
o cpqarray compile fixes
o cpqarray SMP deadlock fix

Dave Jones <[email protected]>:
o document extra option in isapnp
o updated comments
o bad userspace dereferencing
o handle bogus zero IO-APIC addresses
o JBD documentation
o oss compile fix (missing spinlock)
o kernel parameters update
o increase PCI namespace buffer
o move apic_timer_irqs to irqstat
o Missing/Unneeded includes
o mountable futexfs
o missing files in mrproper
o P4 SPIV FOCUS bit
o APM SMP fixes
o fix leak in pcf8583
o missing sanity check in ppdev
o parport docs typo
o Updated proc docs from 2.4
o Updated submitting drivers docs
o Document randconfig
o various typo fixes
o major showstopper diff
o Updated DMA-mapping docs
o fix broken syntax in video config.in
o Document VIA C3
o vmalloc corner case
o only allow IGMP to multicast addresses
o KT266x latency fix
o intel cache parsing update
o DMI updates
o vm86 updates
o IRQ router updates
o Misc reboot.c bits
o numerous __FUNCTION__ pasting fixes
o unify slab namespace
o Don't prefetch io space
o use cpu_has macros
o tlbflush cleanups
o indentation fixes
o death of v86mode
o named initialisers for dcache
o ifdef noise cleanup
o io.h unobfuscation
o ISAPNP updates
o bluesmoke fixes
o arch fixes for make rpm
o jiffy wrap fixes
o x86 math-emu update
o Add ALI 1671 support to AGPGART
o CONFIG_NR_CPUS
o avoid trigraphs in generated pci ids
o Escape quotes in menuconfig
o increase list_del_init usage
o named struct initialisers
o cpqarray reads ->irq before pci_enable_device()
o pci_enable_device before accessing ->irq for wdt_pci
o i845 AGPGART power management
o Remove code duplication in power.c
o missing checks in acorn drivers
o sun3 ncr scsi driver update
o more devexit fixes
o More list_del_init usage increases
o Make work throttling actually work
o Remove useless mdelay wrapper in pcxx.c
o CONFIG_ISA optional on x86
o zoran named initialisers
o TTY_DO_WRITE_WAKEUP
o Missing kmalloc check in iphase driver
o random fixes for random.c
o module fixes for qtronix.c
o sanitise proc usage in zoran driver
o radio-zoltrix typo
o Various drivers using longs instead of ulongs for flags

<[email protected]>:
o driver core: add generic logging macros for devices

<[email protected]>:
o Prevent EFAULT errors when checking link status, in bonding net
driver

Jeff Dike <[email protected]>:
o xor.h was created as asm-um/xor.h rather than include/asm-um/xor.h
o A number of bug fixes from UML 2.4.19-6 -
o Removed from user_util.h the declarations that are now in
time_user.h
o Small changes to bring UML up to date with 2.5.40
o A set of small bug fixes brought over from 2.4.19-8
o A bunch of network updates from 2.4.19
o A small network bug fix from 2.4.19-7
o Updated defconfig with CONFIG_UML_NET_PCAP
o Back out a piece of the last merge which didn't apply in 2.5
o Fixed a bit of the last merge which I messed up
o Fixed a build bug with CONFIG_UML_NET_PCAP
o Changed my mind about having CONFIG_UML_NET_PCAP enabled by default
in defconfig. This would cause the default config to break on any
system without libpcap, so disabling it by default is better.
o I forgot to add include/asm-um/topology to the repo
o Updated initializers in the block driver
o Updates to make UML build as 2.5.41
o Added a missing directory to the arch/um/kernel Makefile

<[email protected]>:
o e1000 net driver minor fixes/cleanups

<[email protected]>:
o Make Logitech Desktop Pro (wireless keyboard & mouse) work with all
buttons and wheel

<[email protected]>:
o C99 designated initializers

<[email protected]>:
o ISDN: Add new Eicon driver

<[email protected]>:
o aacraid Makefile error in 2.5.41

Martin Bligh <[email protected]>:
o NUMA-Q fixes

Olaf Dietsche <olaf.dietsche#[email protected]>:
o 2.5.40: fix chmod/chown on procfs

<[email protected]>:
o ips.c remove tqueue.h

Richard Henderson <[email protected]>:
o Move syscall table out to new file. Prevent entSys constants from
being out of sync with it.
o Merge minor changes from entry_rewrite tree
o Make sysrq-b halt on SRM
o Fix missed variable rename in stxncpy glibc conversion
o Avoid oops on systems that set atkbd_reset

<[email protected]>:
o Initial check in of cifs filesystem version 0.54 for Linux 2.5 (to
clean tree as one changeset)

Thomas Molina <[email protected]>:
o remove double "lock" in v_midi.h
o missing exports

<[email protected]>:
o Several fixes in the uinput.c userspace input driver. Size of fifo,
handling of flag bits, etc.

Alan Cox <[email protected]>:
o fix cut and paste error in amd768rng help
o fix all the isdn compile mess
o cadet needless globals
o mpt fusion - remove donothing code
o un-tqueue aironet
o bring I2O roughly back into line
o 2.5 clean up of DE600
o fix ibmtr mapping bug
o 2.5 cleanup + 2.4 merge of depca
o (forwarded) Olympic fixes
o fix orinoco build
o Suppose we unload with the timer function live ?
o fix aha152x
o make dmx1391 work with new 5380
o make tcic work again
o update fdomain scsi
o fix imm compile
o make pas16 work with new NCR5380
o fix ppa
o fix t128 for new NCR5380
o first pass at seagate st-02 for 2.5
o wd7000 lock error Willy noticed
o fix telephony for tqueue
o fix gcc 3.1/2 warnings in USB
o Fix 2.5 signal handling in jffs/jffs2
o tidy for the max_thread stuff from the kernel list
o trivial sound static/cast fixes
o fix warnings in fpu code
o first pass over the in2000
o 3c501 for 2.5

Alexander Viro <[email protected]>:
o compile fixes

Alexey Kuznetsov <[email protected]>:
o net/ipv4/igmp.c: Revert PACKET_MULTICAST check

Andi Kleen <[email protected]>:
o Efficient bswab64 for i386

Andrew Morton <[email protected]>:
o fix READA in ll_rw_block()
o discontigmem compilation fix
o discontigmem fixes and cleanups
o node-local mem_map for ia32 discontigmem
o remove get_free_page()
o numa: alloc_pages_node cleanup
o free_area_init cleanup
o move library functions from ramfs into libfs
o ext3 indexed directory support
o 64-bit sector_t - various driver changes
o 64-bit sector_t - printk changes and sector_t cleanup
o 64-bit sector_t - driver changes
o 64-bit sector_t - filesystems
o 64-bit sector_t - md fixes
o 64-bit sector_t - remove udivdi3, use sector_div()
o Fix xxx_get_biosgeometry --- avoid useless 64-bit division
o Hardwire CONFIG_LBD to "on" for testing
o mremap use-after-free bugfix
o move_one_page atomicity fix
o fix the raw driver
o remove radix_tree_reserve()
o remove the sched_yield from the ext3 fsync path
o make readv/writev return 0 for 0 segments
o x86 uniproc compile fix
o various fixes

Andries E. Brouwer <[email protected]>:
o isofs fix
o keyboard repeat code fix

Andy Grover <[email protected]>:
o ACPI: Replace ACPI_DEBUG define with ACPI_DEBUG_OUTPUT (Dominik
Brodowski)
o ACPI: Fix reversed logic in blacklist code (Sergio Monteiro Basto)
o ACPI: IA64 fixes (David Mosberger)
o ACPI: Fix /proc/acpi/sleep (P. Christeas)
o ACPI: Fix thermal management and make trip points R/W (Pavel
Machek)
o ACPI: Allow handling negative celsius values (Kochi Takayoshi)
o ACPI: Add another cast to Bjoern's MADT walking fix to silence
warning
o ACPI: Initialize thermal driver's timer before it is used (Knut
Neumann)
o ACPI: Interpreter update to 20021002

Anton Altaparmakov <[email protected]>:
o Add functions for searching for an inode in icache and getting a
reference to it if present - fs/inode.c::ilookup() and ilookup5(),
mirroring the iget_locked() and iget5_locked() function pair. Also
add two internal helpers ifind_fast() and ifind() respectively
which will later be used by iget_locked() and iget5_locked() to do
the search, too.
o Cleanup: Convert fs/inode.c::iget_locked() and iget5_locked() to
use the new ifind_fast() and ifind() helpers, respectively.

Anton Blanchard <[email protected]>:
o one of these things is not like the others
o fix NLS config.in

Arnaldo Carvalho de Melo <[email protected]>:
o hid-input: fix find_next_zero_bit usage
o Appletalk: convert some spinlocks to rwlocks
o IPX: fix permission bogosity in create_proc_entry usage
o LLC: fix permission bogosity in create_proc_entry usage
o Appletalk: convert aarp_lock from spinlock to rwlock

Art Haas <[email protected]>:
o named initializers all over the place

Benjamin LaHaise <[email protected]>:
o [AIO]: First stage of AIO infrastructure for networking
o fix symbol export in fs/read_write.c
o v2.5.31-aio-nohighmem.diff
o correct return value from aio_complete on sync iocbs
o update ns83820.c to v0.19
o sync iocbs need to actually wake_up_process the waiter (as spotted
by Suparna)
o several updates for testing aio_{read,write} support for file
descriptors with only async ops in vfs_{read,write}
o several minor bugfixes for the aio core
o create support for iocb kicking, where a retry operation gets
triggered in the mm context of the submitter to allow the use of
copy_*_user.
o queue descriptor io errors instead of returning them from io_submit
o adapt aio kick changes to ingo's work queues
o buildbug.diff
o fix missing list initialization in aio context creation
o fix a bug in kick_iocb that caused it to fail for async iocbs
o update ns83820 to 0.20
o fix typo in aio.c merge
o export do_sync_{read,write} for modules
o fix compile glitch introduced by the addition of symbol exports

Brad Hards <[email protected]>:
o Better naming for USB input devices that omit the manufacturer name

Brian Gerst <[email protected]>:
o struct super_block cleanup - final

Chris Wright <[email protected]>:
o LSM: move the inode_alloc_security hook

Christoph Hellwig <[email protected]>:
o initcalls for ATM

David Brownell <[email protected]>:
o usbtest: mo'betta devices, control tests

David S. Miller <[email protected]>:
o net/ipv6/addrconf.c: Need __constant_XXX for case statements
o [ESP/QLOGICPTI]: Only set highmem_io on sparc64
o [DECNET]: Kill warnings
o [IPV4/IPV6]: Cleanup inet{,6}_protocol
o include/net/sock.h: Kill __async_lock_sock extern for now
o drivers/scsi/scsi.h: Add back sync/wide members for host drivers
o [CRIS/SPARC/SPARC64]: Init mem_map after free_area_init_node
o drivers/net/pppoe.c: Use sock_owned_by_user
o [SCTP]: Use sock_owned_by_user
o arch/sparc64/kernel/ioctl32.c: Block ioctl handling fix

[email protected] <[email protected]>:
o ips driver 1-6

Doug Ledford <[email protected]>:
o make SCSI queue depth adjustable
o Updated scsi patch
o compile fix for cpqfc driver
o atp870 driver
o redo of scsi.h changes
o Updates for the scsi.h removal of device specific data from struct
scsi_device
o tcq fixes for the issue on linux-kernel
o aic7xxx_old update and a compile warning fix in scsi.c
o Make the rest of the world happy with ips again

Geert Uytterhoeven <[email protected]>:
o Please apply this small clean up, too

Greg Kroah-Hartman <[email protected]>:
o USB: fix ctsrts handling in pl2303 driver
o LSM: added lsm documentation to the tree
o driver core: rename DEVICE to DEVPATH for /sbin/hotplug call to
prevent conflict with USB
o USB: add device speed driverfs file
o USB: removed unused DEVFS /sbin/hotplug attribute
o minor i386 timer changes for 2.5.41

Hideaki Yoshifuji <[email protected]>:
o net/ipv6/addrconf.c: Use prefix of 64 for link-local addresses

Ingo Molnar <[email protected]>:
o timer cleanups
o sched-2.5.41-A0

Ivan Kokshaysky <[email protected]>:
o alpha build fixes

Jan Harkes <[email protected]>:
o Coda FS update

Jean Tourrilhes <[email protected]>:
o irda update 1-6, big vlsi_ir driver update

Jeff Garzik <[email protected]>:
o Revert incorrect s/__exit/__devexit/ change to tmspci tokenring
drvr
o [netdrvr] Use ADVERTISE_FULL in mii lib, to clean up duplex check
o Support multiple cards in ewrk3 net driver (contributed by Adam
Kropelin)

Jens Axboe <[email protected]>:
o make bio->bi_max contain max vec entries
o ide tagged command queueing support
o Scsi sense buffer thinko
o excessive stack usage in cdrom

Johannes Erdfelt <[email protected]>:
o USB: Trivial MAINTAINERS update

John Stultz <[email protected]>:
o linux-2.5.41_timer-changes_A4 (1/3 - infrastructure)
o linux-2.5.41_timer-changes_A4 (2/3 - bulk move)
o linux-2.5.41_timer-changes_A4 (3/3 - integration)
o linux-2.5.41_cyclone-timer_B2
o linux-2.5.41_cyclone-fixes_A1

Kai Germaschewski <[email protected]>:
o kbuild: Call scripts explicitly via sh
o kbuild: Fix drivers/scsi/aacraid/Makefile
o kbuild: Remove now unnecessary usages of $(TOPDIR)
o kbuild: Buglet in Documentation/DocBook/Makefile
o kbuild: typos
o ISDN: race-free incoming call handling
o ISDN: Accept incoming calls and do callback in the state machine
o ISDN: Move generic bits from isdn_net_lib to isdn_common
o ISDN: Move binding the interface into state machine
o ISDN: ref counting for isdn_net_local / isdn_net_dev

Linus Torvalds <[email protected]>:
o Fix missing printk end-of-line
o Oops, removed one too many header includes
o Don't declare pcibios_fixup_irqs, it's static inside irq.c
o firestream compile fix
o Get PageUp handling right
o Declare set_change_info() only if CONFIG_NFSD_V3 is enabled. It
uses fields that do not exist otherwise.
o wd7000 indent pass, no code changes
o Clean up after timers - move the "timers" Makefile info into the
proper subdirectory (kernel) where it is used.

Maksim Krasnyanskiy <[email protected]>:
o Initialize Bluetooth core using subsys_initcall()
o RFCOMM core API extensions. Improved /proc/bluetooth/rfcomm format

Marcel Holtmann <[email protected]>:
o Make it possible to compile in the Bluetooth subsystem

Martin Schwidefsky <[email protected]>:
o s390 update: compile fixes
o s390 update: work queues
o s390 update: tasklets
o s390 update: linker script typo
o s390 update: superfluous memset
o s390 update: syscall tracing
o s390 update: 3270 console

Neil Brown <[email protected]>:
o kNFSd: Remove the nfs-devel list from MAINTAINERS
o kNFSd: Use correct value for max size for readlink response
o kNFSd: A couple of possible incorrect calls to dput
o kNFSd: pre-zero response for lockd _msg requests
o kNFSd: header file for NFSv4 XDR
o kNFSd: Expand nfsd filehandle to 128 bytes
o kNFSd: new routine fh_dup2()
o kNFSd: New routine exp_pseudoroot() to find 'root' filehandle for
nfsv4
o kNFSd: ensure XDR buffer is large enough for NFSv4
o kNFSd: Stub support for name lookup
o kNFSd: Giant patch importing NFSv4 server functionality
o kNFSd: Enable selection of NFSv4 server in configurator and
Makefile
o kNFSd: Tidy up the rpc authentication interface
o kNFSd: Initial caching infrastructure for RPC authentication
caches
o kNFSd: Use new cache infrastructure for auth_unix specific lookups
o kNFSd: Move auth domain lookup into svcauth
o kNFSd: exp_getclient, now just a small wrapper, goes in favour of
auth_unix_lookup
o kNFSd: Open code exp_get and exp_get_fsid in the one place they are
called
o kNFSd: Convert export-table to use new cache code
o kNFSd: Don't over-write rpc request with response
o kNFSd: decode symlink inplace to avoid modifying request
o kNFSd: Provide support for request deferral and revisit
o kNFSd: Create files: /proc/net/rpc/$CACHENAME/channel for
communicating cache updates with kernel
o kNFSd: Provide generic code for making an upcall
o kNFSd: Implement ip_map_request for upcalls
o kNFSd: Implement get_word to help in parsing cache updates
o kNFSd: get_int and get_expiry to help in parsing
o kNFSd: Implement ip_map_parse to allow filling auth.unix.ip cache
o kNFSd: upcall/update for export tables

Patrick Mochel <[email protected]>:
o driver model: add present field to struct device and implement
device_unregister()
o driver model: check return of get_device() when creating a driverfs
file
o IDE: call device_unregister() instead of put_device() in
ide-disk->cleanup()
o USB: call device_unregister() instead of put_device() when removing
devices
o IDE: only register devices that are present
o IDE: add struct device to ide_drive_t and use that for IDE drives
o IDE: register ide driver for all ide drives; not just for disk
drives
o IDE: Add generic remove() method for drives; remove reboot notifier
o IDE: make ide_drive_remove() call driver's ->cleanup()

Paul Mackerras <[email protected]>:
o PPC32: Reorganize the files for the IBM 4xx embedded PPC processors
o PPC32: Move a couple of 4xx-related files around
o PPC32: Rename sigcontext_struct to sigcontext, and use sig->siglock
o PPC32: Use prepare target to make the assembler offsets header file
o PPC32: Add the kallsyms section to arch/ppc/vmlinux.lds.S
o PPC32: fix in_atomic; PREEMPT_ACTIVE set doesn't mean atomic
o PPC32: put the _right_ asm-offsets.c in
o PPC32: fix the last sigcontext_struct, missed previously
o PPC32: Add kallsyms support in stack tracing functions
o PPC32: fix arch-level tid handling
o PPC32: Add might_sleep() calls to down and down_interruptible
o PPC32: change the pmd macros to allow us to support large TLB
entries
o add PCI device ID for Motorola MPC107
o adjust PPC sysctls

Peter Chubb <[email protected]>:
o fix crash in yenta_bh() on card insertion/removal

Randy Dunlap <[email protected]>:
o build cpia video driver

Richard Zidlicky <[email protected]>:
o Move beeping and sysrq to input layer on m68k

Robert Love <[email protected]>:
o fix preempt_count overflow with brlocks
o getpid() comment typo

Rolf Eike Beer <[email protected]>:
o improve NCR53c710 SCSI driver

Russell King <[email protected]>:
o [SERIAL] Remove old pci_board cruft from serialP.h
o [SERIAL] Fix uart_type compilation error when CONFIG_PROC_FS=n
o [SERIAL] Fix oops when removing some PCI serial boards Patch from
William Lee Irwin II.
o [SERIAL] Fix serial.h/serialP.h ordering nightmare

Sam Ravnborg <[email protected]>:
o drivers/scsi - Makefile fix

Stephen Rothwell <[email protected]>:
o fix __SI_CODE

Stephen Smalley <[email protected]>:
o Base set of LSM hooks for SysV IPC

Steven Whitehouse <[email protected]>:
o [DECNET]: New autoconfiguration code for 2.5

Tom Callaway <[email protected]>:
o arch/sparc64/solaris/misc.c: Add MODULE_LICENSE

Trond Myklebust <[email protected]>:
o NFSv4 client for 2.5.x
o Fix NFS locking over TCP
o Remove unbalanced kunmap() in NFS readdir code
o Disable Nagle algorithm for RPC over TCP

Vojtech Pavlik <[email protected]>:
o Remove several files no longer used on m68k
o Add support for PS/2 Active Multiplexing Spec, updates for PS/2
mouse and keyboard handling - proper cleanup on reboot, allow
USB-emulated AT keyboards, option to restrict PS/2 mouse to generic
mode.
o Update Wacom driver to 2.4 changes and changes from Ping Cheng of
Wacom
o Convert gameport.[ch] to use list.h for its linked lists
o Convert serio.[ch] to use list.h lists
o Cleanups and fixes for the Wacom USB driver
o Add #include <list.h> to input.h
o Use list_for_each_entry() in input.c
o Convert more of input to list.h usage
o Fixes/cleanups after converting drivers to list.h lists
o Accept 0xfa as an "OK" result code for AUX TEST cmd in i8042.c
o Add Japanese Set 3 scancodes to atkbd.c
o Fix LAlt-RAlt combination on AT keyboards (generated "unknown
scancode" message)
o Make NR_KEYS be (KEY_MAX+1) so that keybindings can be set for keys
over 128.
o psmouse.c: ignore the sync bit to make slightly non-conforming
devices work.
o Change PC-keyboard mappings to follow MS Keyboards - a de facto
standard for extended keys.
o Initialize struct input_dev in input drivers before it's passed to
input_event()
o Add German keyboard \ to the default table of atkbd.c
o Make i8042.c even less picky about detecting an AUX port because of
broken chipsets that don't support the LOOP command or report
failure
o Add Japanese bar key mapping to the default table in atkbd.c
o Fix the Shift-PgUp problem again, and hopefully for good
o Fix i8042 for Sun, recent updates broke it
o Fix oops when 'cat /dev/uinput' is done. Used
wait_event_interruptible()
o Don't try to enable extra keys on IBM/Chicony keyboards as this
upsets several notebook keyboards. Until we find a better way to
detect which keyboard we are talking to, we rely on the kernel command
line. Use atkbd_set=4 to gain access to the extra keys.
o Fix a ; in atkbd.c that somehow got into the last cset
o Fixes in i8042.c Active Multiplexing support

Zwane Mwaikambo <[email protected]>:
o Add ethtool media support to 3c509 net driver
o Add ethtool media support to smc91c92_cs net driver



2002-10-12 05:47:59

by Adrian Bunk

Subject: Re: Linux v2.5.42

On Fri, 11 Oct 2002, Linus Torvalds wrote:

>...
> Summary of changes from v2.5.41 to v2.5.42
> ============================================
>...
> John Stultz <[email protected]>:
>...
> o linux-2.5.41_timer-changes_A4 (2/3 - bulk move)
>...


This patch moved cpufreq stuff from time.c to timers/timer_tsc.c but not
the corresponding #include <linux/cpufreq.h> causing the following compile
error:

<-- snip -->

...
gcc -Wp,-MD,arch/i386/kernel/timers/.timer_tsc.o.d -D__KERNEL__
-Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer
-fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2 -march=k6
-Iarch/i386/mach-generic -nostdinc -iwithprefix include -DKBUILD_BASENAME=timer_tsc -c -o
arch/i386/kernel/timers/timer_tsc.o arch/i386/kernel/timers/timer_tsc.c
arch/i386/kernel/timers/timer_tsc.c: In function `time_cpufreq_notifier':
arch/i386/kernel/timers/timer_tsc.c:181: `CPUFREQ_PRECHANGE' undeclared
...
arch/i386/kernel/timers/timer_tsc.c:183: `CPUFREQ_ALL_CPUS' undeclared
...
arch/i386/kernel/timers/timer_tsc.c:192: `CPUFREQ_POSTCHANGE' undeclared
...
arch/i386/kernel/timers/timer_tsc.c:265: `CPUFREQ_TRANSITION_NOTIFIER' undeclared
...
make[2]: *** [arch/i386/kernel/timers/timer_tsc.o] Error 1

<-- snip -->


The fix is simple:


--- linux-2.5.42-full/arch/i386/kernel/time.c.old 2002-10-12 07:43:55.000000000 +0200
+++ linux-2.5.42-full/arch/i386/kernel/time.c 2002-10-12 07:44:05.000000000 +0200
@@ -43,7 +43,6 @@
#include <linux/smp.h>
#include <linux/module.h>
#include <linux/device.h>
-#include <linux/cpufreq.h>

#include <asm/io.h>
#include <asm/smp.h>
--- linux-2.5.42-full/arch/i386/kernel/timers/timer_tsc.c.old 2002-10-12 07:40:26.000000000 +0200
+++ linux-2.5.42-full/arch/i386/kernel/timers/timer_tsc.c 2002-10-12 07:44:25.000000000 +0200
@@ -7,6 +7,7 @@
#include <linux/init.h>
#include <linux/timex.h>
#include <linux/errno.h>
+#include <linux/cpufreq.h>

#include <asm/timer.h>
#include <asm/io.h>


cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed


2002-10-12 07:17:18

by Adrian Bunk

Subject: Re: Linux v2.5.42

On Fri, 11 Oct 2002, Linus Torvalds wrote:

>...
> Summary of changes from v2.5.41 to v2.5.42
> ============================================
>...
> <[email protected]>:
> o Initial check in of cifs filesystem version 0.54 for Linux 2.5 (to
> clean tree as one changeset)
>...


Both jfs and cifs ship a function called `dump_mem' causing the following
compile error when both are included:

<-- snip -->

ld -m elf_i386 -r -o fs/built-in.o ...
fs/jfs/built-in.o: In function `dump_mem':
fs/jfs/built-in.o(.text+0xe420): multiple definition of `dump_mem'
fs/cifs/built-in.o(.text+0x3af0): first defined here
make[1]: *** [fs/built-in.o] Error 1
make: *** [fs] Error 2

<-- snip -->
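
The link error above comes from two built-in objects both defining a global symbol named `dump_mem'. As a minimal sketch (not the fix that was applied to either filesystem), one common way to avoid such a clash is to keep the shared helper `static' and export it only under subsystem-prefixed names. The names hex_dump, jfs_dump_mem and cifs_dump_mem below, and their signatures, are hypothetical and chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Shared implementation with internal linkage: `static` keeps the symbol
 * private to this translation unit, so it can never collide at link time.
 * Writes up to `len` bytes of `buf` as lowercase hex into `out` and returns
 * the number of input bytes actually formatted.
 */
static size_t hex_dump(char *out, size_t outsz, const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i, n = 0;

	assert(out != NULL && outsz > 0);
	out[0] = '\0';
	/* Each byte needs two hex digits plus room for the trailing NUL. */
	for (i = 0; i < len && n + 3 <= outsz; i++)
		n += (size_t)snprintf(out + n, outsz - n, "%02x", p[i]);
	return i;
}

/* Hypothetical names: each subsystem exports its helper under its own prefix. */
size_t jfs_dump_mem(char *out, size_t outsz, const void *buf, size_t len)
{
	return hex_dump(out, outsz, buf, len);
}

size_t cifs_dump_mem(char *out, size_t outsz, const void *buf, size_t len)
{
	return hex_dump(out, outsz, buf, len);
}
```

With distinct external names, both objects can be linked into fs/built-in.o; if a helper is only used inside one file, marking it `static' alone is enough.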

cu
Adrian


2002-10-12 07:47:04

by Andres Salomon

Subject: Re: Linux v2.5.42

esp.c got mangled; fixes to allow compilation are attached. Please
consider LVM2 inclusion; it co-exists happily with LVM1, is a much cleaner
driver than LVM1, and is well tested (at least under 2.4).


On Fri, Oct 11, 2002 at 09:59:58PM -0700, Linus Torvalds wrote:
>
>
> Augh.. People have been mailbombing me apparently because a lot of people
> finally decided that they really want to sync with me due to the upcoming
> feature freeze, so there's a _lot_ of stuff here, all over the map.
>
> Both the NFS client and the server are getting facelifts to support NFSv4.
> And both Dave Jones and Alan Cox decided to try to merge more stuff with
> me - along with the usual stream from Andrew Morton.
>
> In addition, we have build updates, ISDN, ACPI, input layer, network
> drivers and driverfs.. Along with a random collection of other stuff: USB,
> s390, ppc etc.
>
> End result: 1MB worth of compressed patches - in four days.
>
> Linus
>
>
> PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> stand. I'm not using any kind of volume management personally, so I just
> don't have the background or inclination to walk through the patches and
> make that kind of decision. My non-scientific opinion is that it looks
> like the EVMS code is going to be merged, but ..
>
> Alan, Jens, Christoph, others - this is going to be an area where I need
> input from people I know, and preferably also help merging. I've been
> happy to see the EVMS patches being discussed on linux-kernel, and I just
> wanted to let people know that this needs outside help.
>
> ----
>
> Summary of changes from v2.5.41 to v2.5.42
> ============================================
[...]

--
It's not denial. I'm just selective about the reality I accept.
-- Bill Watterson


Attachments:
(No filename) (2.00 kB)
esp.diff (789.00 B)

2002-10-12 09:06:04

by Adrian Bunk

Subject: Re: Linux v2.5.42

On Fri, 11 Oct 2002, Linus Torvalds wrote:

>...
> Summary of changes from v2.5.41 to v2.5.42
> ============================================
>...
> Christoph Hellwig <[email protected]>:
> o initcalls for ATM
>...

This broke the compilation of drivers/atm/iphase.c:

<-- snip -->

...
gcc -Wp,-MD,drivers/atm/.iphase.o.d -D__KERNEL__ -Iinclude -Wall
-Wstrict-prototypes -Wno-trigraphs -O2 -fomit-frame-pointer
-fno-strict-aliasing -fno-common -pipe -mpreferred-stack-boundary=2
-march=k6 -Iarch/i386/mach-generic -nostdinc -iwithprefix include -g
-DKBUILD_BASENAME=iphase -c -o drivers/atm/iphase.o drivers/atm/iphase.c
drivers/atm/iphase.c: In function `rx_pkt':
drivers/atm/iphase.c:1167: warning: implicit declaration of function
`atm_pdu2truesize'
drivers/atm/iphase.c:1172: structure has no member named `rx_quota'
make[2]: *** [drivers/atm/iphase.o] Error 1

<-- snip -->


The following part of the 2.5.42 patch to iphase.c shows the cause of this
problem:


<-- snip -->

@@ -1162,10 +1157,7 @@
goto out_free_desc;
}

-#if LINUX_VERSION_CODE >= 0x20312
if (!(skb = atm_alloc_charge(vcc, len, GFP_ATOMIC))) {
-#else
- if (atm_charge(vcc, atm_pdu2truesize(len))) {
/* lets allocate an skb for now */
skb = alloc_skb(len, GFP_ATOMIC);
if (!skb)
@@ -1178,7 +1170,6 @@
}
else {
IF_EVENT(printk("IA: Rx over the rx_quota %ld\n", vcc->rx_quota);)
-#endif
if (vcc->vci < 32)
printk("Drop control packets\n");
goto out_free_desc;

<-- snip -->


Therefore the fix is simple:

--- linux-2.5.42-full/drivers/atm/iphase.c.old 2002-10-12 11:02:31.000000000 +0200
+++ linux-2.5.42-full/drivers/atm/iphase.c 2002-10-12 11:09:15.000000000 +0200
@@ -1158,18 +1158,6 @@
}

if (!(skb = atm_alloc_charge(vcc, len, GFP_ATOMIC))) {
- /* lets allocate an skb for now */
- skb = alloc_skb(len, GFP_ATOMIC);
- if (!skb)
- {
- IF_ERR(printk("can't allocate memory for recv, drop pkt!\n");)
- atomic_inc(&vcc->stats->rx_drop);
- atm_return(vcc, atm_pdu2truesize(len));
- goto out_free_desc;
- }
- }
- else {
- IF_EVENT(printk("IA: Rx over the rx_quota %ld\n", vcc->rx_quota);)
if (vcc->vci < 32)
printk("Drop control packets\n");
goto out_free_desc;
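
For readers unfamiliar with the guard that the 2.5.42 patch removed: LINUX_VERSION_CODE packs the kernel version into one integer, so `0x20312' means 2.3.18. Any 2.5 kernel therefore only ever compiled the atm_alloc_charge() branch, and the leftover body of the old `#else' branch is what broke the build. The sketch below decodes such a value; KERNEL_VERSION uses the same encoding as the macro in <linux/version.h> of this era, while struct kver and decode_version_code() are hypothetical helpers for illustration:

```c
#include <assert.h>

/* Same encoding as the kernel's KERNEL_VERSION() macro of this era. */
#define KERNEL_VERSION(a, b, c) (((a) << 16) + ((b) << 8) + (c))

/* Hypothetical helper type: one field per version component. */
struct kver {
	unsigned int major, minor, patch;
};

/* Split a LINUX_VERSION_CODE-style value into its three 8-bit fields. */
static struct kver decode_version_code(unsigned long code)
{
	struct kver v;

	assert(code <= 0xffffffUL);	/* three bytes: major, minor, patch */
	v.major = (unsigned int)((code >> 16) & 0xff);
	v.minor = (unsigned int)((code >> 8) & 0xff);
	v.patch = (unsigned int)(code & 0xff);
	return v;
}
```

Decoding 0x20312 this way gives major 2, minor 3, patch 18, and KERNEL_VERSION(2, 5, 42) compares greater than that value, which is why the `#else' code was dead on 2.5.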


cu
Adrian



2002-10-12 09:19:04

by Adrian Bunk

Subject: Re: Linux v2.5.42

On Fri, 11 Oct 2002, Linus Torvalds wrote:

>...
> Summary of changes from v2.5.41 to v2.5.42
> ============================================
>...
> Christoph Hellwig <[email protected]>:
> o initcalls for ATM
>...


This patch fixed part of the kbuild breakage in drivers/atm/Makefile, the
following patch fixes the rest:


--- linux-2.5.42-full/drivers/atm/Makefile.old 2002-10-12 11:13:48.000000000 +0200
+++ linux-2.5.42-full/drivers/atm/Makefile 2002-10-12 11:20:15.000000000 +0200
@@ -36,7 +36,7 @@
fore_200e-objs += fore200e_pca_fw.o
# guess the target endianess to choose the right PCA-200E firmware image
ifeq ($(CONFIG_ATM_FORE200E_PCA_DEFAULT_FW),y)
- CONFIG_ATM_FORE200E_PCA_FW = $(shell if test -n "`$(CC) -E -dM $(src)/../../include/asm/byteorder.h | grep ' __LITTLE_ENDIAN '`"; then echo pca200e.bin; else echo pca200e_ecd.bin2; fi)
+ CONFIG_ATM_FORE200E_PCA_FW = $(shell if test -n "`$(CC) -E -dM $(src)/../../include/asm/byteorder.h | grep ' __LITTLE_ENDIAN '`"; then echo drivers/atm/pca200e.bin; else echo drivers/atm/pca200e_ecd.bin2; fi)
endif
endif
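
The Makefile rule above guesses the target byte order by dumping the preprocessor macros (`$(CC) -E -dM') and grepping for __LITTLE_ENDIAN, then picks a firmware image accordingly. As a rough illustration of the same decision in C, the sketch below checks the byte order of the machine it runs on and cross-checks the result against the __BYTE_ORDER__ macro that GCC and Clang predefine. Note the assumption: this is a host-side runtime check, not the build-time target check the Makefile performs, and the function names are hypothetical. Only the two image names come from the Makefile:

```c
#include <assert.h>
#include <stdint.h>

/* Runtime probe: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;

	/* On little-endian machines the low-order byte is stored first. */
	return *(const unsigned char *)&probe == 1;
}

/* Hypothetical helper mirroring the Makefile's choice of firmware image. */
static const char *pca200e_firmware_name(int little_endian)
{
	return little_endian ? "pca200e.bin" : "pca200e_ecd.bin2";
}

/* Cross-check the runtime probe against the compiler's predefined macros. */
static void check_against_compiler(void)
{
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
	assert(host_is_little_endian() ==
	       (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__));
#endif
}
```

A caller would use pca200e_firmware_name(host_is_little_endian()) to get the image name. For a cross-compiled kernel the host and target can differ, which is why the Makefile asks the target compiler instead.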



Please apply
Adrian



2002-10-12 09:24:40

by David Miller

Subject: Re: Linux v2.5.42

From: Andres Salomon <[email protected]>
Date: Sat, 12 Oct 2002 03:52:55 -0400

esp.c got mangled; fixes to allow compilation are attached.

Thanks, I've applied your patch.

2002-10-12 09:38:10

by Sam Ravnborg

Subject: Re: Linux v2.5.42

On Sat, Oct 12, 2002 at 11:24:48AM +0200, Adrian Bunk wrote:
> This patch fixed part of the kbuild breakage in drivers/atm/Makefile, the
> following patch fixes the rest:
Small adjustment:

Sam

--- linux-2.5.42-full/drivers/atm/Makefile.old 2002-10-12 11:13:48.000000000 +0200
+++ linux-2.5.42-full/drivers/atm/Makefile 2002-10-12 11:20:15.000000000 +0200
@@ -36,7 +36,7 @@
fore_200e-objs += fore200e_pca_fw.o
# guess the target endianess to choose the right PCA-200E firmware image
ifeq ($(CONFIG_ATM_FORE200E_PCA_DEFAULT_FW),y)
- CONFIG_ATM_FORE200E_PCA_FW = $(shell if test -n "`$(CC) -E -dM $(src)/../../include/asm/byteorder.h | grep ' __LITTLE_ENDIAN '`"; then echo pca200e.bin; else echo pca200e_ecd.bin2; fi)
+ CONFIG_ATM_FORE200E_PCA_FW = $(shell if test -n "`$(CC) -E -dM $(src)/../../include/asm/byteorder.h | grep ' __LITTLE_ENDIAN '`"; then echo $(obj)/pca200e.bin; else echo $(obj)/pca200e_ecd.bin2; fi)
endif
endif
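
For readers following along, the selection logic in that make fragment can be sketched outside of kbuild. The Python below is only an illustration: the function name pca_firmware_path is made up, and it uses the host byte order as a stand-in for the target check that the Makefile does by preprocessing byteorder.h. It shows why prefixing with the object directory (the role $(obj) plays in the patch) gives a path that still resolves when the command runs from the top of the tree.

```python
import sys
from pathlib import PurePosixPath


def pca_firmware_path(obj_dir, byteorder=sys.byteorder):
    """Pick the PCA-200E firmware image and prefix it with the object directory.

    byteorder is "little" or "big"; defaulting to the host byte order is an
    assumption of this sketch, the Makefile inspects the target headers instead.
    """
    # Same choice as the shell snippet: little endian gets pca200e.bin,
    # anything else gets pca200e_ecd.bin2.
    name = "pca200e.bin" if byteorder == "little" else "pca200e_ecd.bin2"
    # Joining with the object directory mirrors "echo $(obj)/<name>".
    return str(PurePosixPath(obj_dir) / name)


if __name__ == "__main__":
    print(pca_firmware_path("drivers/atm"))
```

Passing the directory in, rather than hard-coding drivers/atm, is the same design choice as using $(obj) in the patch.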



2002-10-12 09:44:43

by Matthias Andree

Subject: Re: Linux v2.5.42

On Fri, 11 Oct 2002, Linus Torvalds wrote:

> PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> stand. I'm not using any kind of volume management personally, so I just
> don't have the background or inclination to walk through the patches and
> make that kind of decision. My non-scientific opinion is that it looks
> like the EVMS code is going to be merged, but ..
>
> Alan, Jens, Christoph, others - this is going to be an area where I need
> input from people I know, and preferably also help merging. I've been
> happy to see the EVMS patches being discussed on linux-kernel, and I just
> wanted to let people know that this needs outside help.

A user's input, of not nearly as much weight as of the input you
suggested, and totally unencumbered by technical details:

EVMS has been much more present to interested parties than LVM2. If --
as a user -- I was to choose either one RIGHT NOW (i. e. with a gun
against a head, a boss telling me 'I want a decision in 30 minutes', you
name it), I'd go for EVMS.

Not for the names behind, the LVM2 and the EVMS teams both have their
reputation, and from my POV, they are equally good.

Not for technical reasons either, because I just cannot judge on this
area.

But because EVMS just looks much less like a construction site than
dm2/LVM2 does.

If there was something about integrating dm2, I'd not be surprised if
EVMS used it or wrapped it up or something. It also usurps LVM1.

Just my two Euro cents.

2002-10-12 10:16:14

by Adrian Bunk

Subject: Re: Linux v2.5.42

On Sat, 12 Oct 2002, Sam Ravnborg wrote:

> On Sat, Oct 12, 2002 at 11:24:48AM +0200, Adrian Bunk wrote:
> > This patch fixed part of the kbuild breakage in drivers/atm/Makefile, the
> > following patch fixes the rest:
> Small adjustment:
>
> Sam
>
> --- linux-2.5.42-full/drivers/atm/Makefile.old 2002-10-12 11:13:48.000000000 +0200
> +++ linux-2.5.42-full/drivers/atm/Makefile 2002-10-12 11:20:15.000000000 +0200
> @@ -36,7 +36,7 @@
> fore_200e-objs += fore200e_pca_fw.o
> # guess the target endianess to choose the right PCA-200E firmware image
> ifeq ($(CONFIG_ATM_FORE200E_PCA_DEFAULT_FW),y)
> - CONFIG_ATM_FORE200E_PCA_FW = $(shell if test -n "`$(CC) -E -dM $(src)/../../include/asm/byteorder.h | grep ' __LITTLE_ENDIAN '`"; then echo pca200e.bin; else echo pca200e_ecd.bin2; fi)
> + CONFIG_ATM_FORE200E_PCA_FW = $(shell if test -n "`$(CC) -E -dM $(src)/../../include/asm/byteorder.h | grep ' __LITTLE_ENDIAN '`"; then echo $(obj)/pca200e.bin; else echo $(obj)/pca200e_ecd.bin2; fi)
> endif
> endif

Yes thanks, your patch is better than mine.

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed



2002-10-12 11:06:00

by jw schultz

Subject: Re: Linux v2.5.42

On Sat, Oct 12, 2002 at 11:50:26AM +0200, Matthias Andree wrote:
> On Fri, 11 Oct 2002, Linus Torvalds wrote:
>
> > PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> > stand. I'm not using any kind of volume management personally, so I just
> > don't have the background or inclination to walk through the patches and
> > make that kind of decision. My non-scientific opinion is that it looks
> > like the EVMS code is going to be merged, but ..
>
> A user's input, of not nearly as much weight as of the input you
> suggested, and totally unencumbered by technical details:
>
> EVMS has been much more present to interested parties than LVM2. If --
> as a user -- I was to choose either one RIGHT NOW (i. e. with a gun
> against a head, a boss telling me 'I want a decision in 30 minutes', you
> name it), I'd go for EVMS.
>
> But because EVMS just looks much less like a construction site than
> dm2/LVM2 does.
>
> Just my two Euro cents.

I'll add my $0.02US which (according to exchange rates) is
worth more though almost worthless.

Hate to say it but in this comparison LVM2 loses. Primary
reason: Backward compatibility. People are going to need to
be able to switch between kernels.

So far everything indicates that LVM2 is not compatible with
LVM. That LVM2 and LVM(1) can coexist is irrelevant if
2.5 hasn't got a working LVM(1). And that would leave us
with having to do backup+restore around the upgrade.

Any on-disk changes also need to have an in-place translator.
Just think about what it would take to do an upgrade, or
downgrade, without in-place translation.

Also 2.4 -> 2.6 should not be a feature reduction so
snapshot volumes and any other LVM features missing from
LVM2 are issues.

--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]

Remember Cernan and Schmitt

2002-10-12 11:24:07

by Andres Salomon

Subject: Re: Linux v2.5.42

Uh, what? LVM2 is perfectly backwards compatible w/ LVM1. Of course,
snapshots don't work w/ LVM2 yet, so I'm not sure how LVM2 handles LVM1
snapshot volumes. In general, volumes created w/ LVM1 tools should
work fine with device-mapper/LVM2. I've been using LVM2 for 8+ months;
when I've needed to do things that aren't yet implemented with LVM2 (for
example, pvmove'ing), I've simply downgraded to LVM1 temporarily, done
the task, and then upgraded my tools again.


On Sat, Oct 12, 2002 at 04:11:40AM -0700, jw schultz wrote:
>
> On Sat, Oct 12, 2002 at 11:50:26AM +0200, Matthias Andree wrote:
> > On Fri, 11 Oct 2002, Linus Torvalds wrote:
> >
> > > PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> > > stand. I'm not using any kind of volume management personally, so I just
> > > don't have the background or inclination to walk through the patches and
> > > make that kind of decision. My non-scientific opinion is that it looks
> > > like the EVMS code is going to be merged, but ..
> >
> > A user's input, of not nearly as much weight as of the input you
> > suggested, and totally unencumbered by technical details:
> >
> > EVMS has been much more present to interested parties than LVM2. If --
> > as a user -- I was to choose either one RIGHT NOW (i. e. with a gun
> > against a head, a boss telling me 'I want a decision in 30 minutes', you
> > name it), I'd go for EVMS.
> >
> > But because EVMS just looks much less like a construction site than
> > dm2/LVM2 does.
> >
> > Just my two Euro cents.
>
> I'll add my $0.02US which (according to exchange rates) is
> worth more though almost worthless.
>
> Hate to say it but in this comparison LVM2 loses. Primary
> reason: Backward compatibility. People are going to need to
> be able to switch between kernels.
>
> So far everything indicates that LVM2 is not compatible with
> LVM. That LVM2 and LVM(1) can coexist is irrelevant if
> 2.5 hasn't got a working LVM(1). And that would leave us
> with having to do backup+restore around the upgrade.
>
> Any on-disk changes also need to have an in-place translator.
> Just think about what it would take to do an upgrade, or
> downgrade, without in-place translation.
>
> Also 2.4 -> 2.6 should not be a feature reduction so
> snapshot volumes and any other LVM features missing from
> LVM2 are issues.
>
> --
> ________________________________________________________________
> J.W. Schultz Pegasystems Technologies
> email address: [email protected]
>
> Remember Cernan and Schmitt
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--
It's not denial. I'm just selective about the reality I accept.
-- Bill Watterson

2002-10-12 11:29:02

by Alan

Subject: Re: Linux v2.5.42

On Sat, 2002-10-12 at 12:11, jw schultz wrote:
> So far everything indicates that LVM2 is not compatible with
> LVM. That LVM2 and LVM(1) can coexist is irrelevant if
> 2.5 hasn't got a working LVM(1). And that would leave us
> with having to do backup+restore around the upgrade.

LVM2 supports LVM1 volumes. I don't know where you got the idea
otherwise.

2002-10-12 11:34:57

by jw schultz

Subject: Re: Linux v2.5.42

On Sat, Oct 12, 2002 at 12:46:37PM +0100, Alan Cox wrote:
> On Sat, 2002-10-12 at 12:11, jw schultz wrote:
> > So far everything indicates that LVM2 is not compatible with
> > LVM. That LVM2 and LVM(1) can coexist is irrelevant if
> > 2.5 hasn't got a working LVM(1). And that would leave us
> > with having to do backup+restore around the upgrade.
>
> LVM2 supports LVM1 volumes. I don't know where you got the idea
> otherwise.

Good. I'm very glad to be wrong. Then all we need care
about is project maturity and design.

--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]

Remember Cernan and Schmitt

2002-10-12 12:37:45

by Ed Tomlinson

Subject: Re: Linux v2.5.42

Linus Torvalds wrote:

> PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> stand. I'm not using any kind of volume management personally, so I just
> don't have the background or inclination to walk through the patches and
> make that kind of decision. My non-scientific opinion is that it looks
> like the EVMS code is going to be merged, but ..
>
> Alan, Jens, Christoph, others - this is going to be an area where I need
> input from people I know, and preferably also help merging. I've been
> happy to see the EVMS patches being discussed on linux-kernel, and I just
> wanted to let people know that this needs outside help.

I support a SAP system (so far on Solaris) at work. I make extensive use
of a volume manager there. On Linux I have tried both LVM1 and EVMS. I
stopped using LVM1 the third time it caused me to restore after a clean
boot... This problem may well have been fixed in LVM2 - I have not used it.
On the other hand, EVMS has never scrambled my disks. It also offers more
options, including an encapsulated LVM1. From my perspective EVMS wins.

Ed Tomlinson

2002-10-12 13:26:52

by Christoph Hellwig

Subject: Re: Linux v2.5.42

On Fri, Oct 11, 2002 at 09:59:58PM -0700, Linus Torvalds wrote:
> PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> stand. I'm not using any kind of volume management personally, so I just
> don't have the background or inclination to walk through the patches and
> make that kind of decision. My non-scientific opinion is that it looks
> like the EVMS code is going to be merged, but ..
>
> Alan, Jens, Christoph, others - this is going to be an area where I need
> input from people I know, and preferably also help merging. I've been
> happy to see the EVMS patches being discussed on linux-kernel, and I just
> wanted to let people know that this needs outside help.

I don't think the work to get EVMS in shape can be done in time (feel
free to prove me wrong..). The problem in my eyes is that large
parts of what EVMS does should be in the higher layers, i.e. the
block layer, but they implement their own new layer as the consumer of
those. I.e. instead of using the generic block layer structures to
present a volume/device they use their own, private structures that
need hacks to get the access right (pass-through ioctls) and need
constant resyncing with the native structures in the case where we
have both (the lowest layer). IMHO we should try to get a common
userspace API in first, then implement the missing functionality for
proper interaction of volume managers at the block layer. After
that EVMS would just be a set of volume management drivers + a library
of common functionality.

Doing that higher level work will take some time to get right, and the
current EVMS API seems unsuitable to me, it contains lots of very
strange APIs that need rework. Merging EVMS now for 2.6 means that
we'll have to keep those strange APIs around, and have to maintain
backwards compatibility.

I've not seen LVM2 code for 2.5 yet, but the 2.4 code looks very
promising, although it might need some work in different areas.
I'll take a look as soon as Sistina publishes patches for 2.5 instead
of just a BK repository. LVM1 is totally unusable in 2.5, I think
we should better remove the dead code now than later.

Christoph

2002-10-12 13:37:37

by Christoph Hellwig

Subject: Re: Linux v2.5.42

On Fri, Oct 11, 2002 at 09:59:58PM -0700, Linus Torvalds wrote:
>
> Augh.. People have been mailbombing me apparently because a lot of people
> finally decided that they really want to sync with me due to the upcoming
> feature freeze, so there's a _lot_ of stuff here, all over the map.

BTW, there's another infrastructure feature I forgot when you asked
what should go in before feature freeze. And IMHO it's very important
(so why did I forget it..): IBM's read copy update synchronisation
primitives. They've shown significant improvements when used for the
file tables, dcache and routing cache. They have been around since before
2.5 forked, SuSE has had them in their production kernel for a while, too,
and akpm has had them in his tree for a while.

Even if those existing users don't get in yet I don't want to miss the
infrastructure in the 2.6 series.
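
As a rough illustration of the read-copy-update idea (not the IBM patch and not a kernel API), here is a toy Python model. Readers take the current snapshot without any lock; an updater copies it, modifies the copy, and publishes it with a single reference store. The class name RcuCell and the routing-table data are hypothetical, and Python's garbage collector stands in for the grace-period handling a real implementation needs before it may free the old copy.

```python
import threading


class RcuCell:
    """Toy read-copy-update container (illustrative only, names are made up)."""

    def __init__(self, table):
        # The published snapshot. Readers must treat it as read-only.
        self._current = dict(table)
        # Writers are serialised among themselves; readers never take this lock.
        self._writer_lock = threading.Lock()

    def read(self):
        # Read side: one reference load, no locking.
        return self._current

    def update(self, mutate):
        # Update side: copy the snapshot, change the copy, then publish it.
        with self._writer_lock:
            new_table = dict(self._current)
            mutate(new_table)
            # Readers that already hold the old snapshot keep seeing it
            # unchanged; new readers see new_table from here on.
            self._current = new_table


if __name__ == "__main__":
    routes = RcuCell({"10.0.0.0/8": "eth0"})
    before = routes.read()
    routes.update(lambda t: t.update({"192.168.0.0/16": "eth1"}))
    print(before)
    print(routes.read())
```

The point of the pattern is that the read path stays cheap; all the cost is pushed to the (rarer) update path.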

2002-10-12 17:08:42

by Mark Peloquin

Subject: Re: Linux v2.5.42

On 2002-10-12 13:32:33, Christoph Hellwig wrote:
> On Fri, Oct 11, 2002 at 09:59:58PM -0700, Linus Torvalds wrote:
> > PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> > stand. I'm not using any kind of volume management personally, so I just
> > don't have the background or inclination to walk through the patches and
> > make that kind of decision. My non-scientific opinion is that it looks
> > like the EVMS code is going to be merged, but ..
> >
> > Alan, Jens, Christoph, others - this is going to be an area where I need
> > input from people I know, and preferably also help merging. I've been
> > happy to see the EVMS patches being discussed on linux-kernel, and I just
> > wanted to let people know that this needs outside help.

>I don't think the work to get EVMS in shape can be done in time (feel
>free to prove me wrong..).

Should EVMS be included, the team will make it our top priority to resolve
the disputed design issues. If the ruling should be that some of our design
decisions must change, so be it, we will comply. Certainly some changes can
not be done by the 20th or 31st, however I feel the team can handle most
changes before 2.6 ships.

>The problem in my eyes is that large
>parts of what evms does should be in the higher layers, i.e. the
>block layer, but they implement their own new layer as the consumer of
>those. i.e. instead of using the generic block layer structures to
>present a volume/device they use their own,

More accurately, we do use generic block layer structures to present
volumes that are visible to the user/system.

>private structures that
>need hacks to get the access right (pass-through ioctls) and need
>constant resyncing with the native structures in the case where we
>have both (the lowest layer).

The point of contention is that EVMS does not provide generic access (block
layer operations) to the components that make up the volume, but only to the
user/system accessible volumes themselves. EVMS consumes (primarily disk)
devices and produces volumes. The intermediate points are abstracted by the
volume manager.

>IMHO we should try to get a common
>userspace API in first, then implement the missing functionality for
>proper interaction of volume managers at the block layer. After
>that EVMS would just be a set of volume management drivers + a library
>of common functionality.

>Doing that higher level work will take some time to get right, and the
>current EVMS API seems unsuitable to me, it contains lots of very
>strange APIs that need rework. Merging EVMS now for 2.6 means that
>we'll have to keep those strange APIs around, and have to maintain
>backwards compatibility.

I guess it comes down to the point of whether the block layer should evolve
to also handle volume management generically, or whether volume management
is a separate component that utilizes and works with the block layer.

Linus, if you feel that volume management and the block layer can and should
be separate components that work together, then EVMS is ready today, and at
least functionally, could be a pretty good starting point. As a separate
component, only the EVMS tools would have to know or care about the new EVMS
APIs. The volumes EVMS produces, being standard block devices, interface,
interact, and operate as any other block device does today.

Mark


2002-10-12 17:42:10

by Jon Portnoy

Subject: Re: Linux v2.5.42

if [ -r System.map ]; then /sbin/depmod -ae -F System.map 2.5.42; fi
depmod: *** Unresolved symbols in
/lib/modules/2.5.42/kernel/fs/ext3/ext3.o
depmod: generic_file_aio_read
depmod: generic_file_aio_write
depmod: *** Unresolved symbols in /lib/modules/2.5.42/kernel/fs/nfs/nfs.o
depmod: generic_file_aio_read
depmod: generic_file_aio_write
depmod: *** Unresolved symbols in
/lib/modules/2.5.42/kernel/fs/nfsd/nfsd.o
depmod: auth_domain_find
depmod: cache_fresh
depmod: unix_domain_find
depmod: auth_domain_put
depmod: cache_flush
depmod: cache_unregister
depmod: add_hex
depmod: cache_check
depmod: svcauth_unix_purge
depmod: get_word
depmod: cache_clean
depmod: cache_register
depmod: auth_unix_lookup
depmod: auth_unix_add_addr
depmod: cache_init
depmod: auth_unix_forget_old
depmod: add_word
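
A log like this is easier to act on when grouped per module. The following is a small sketch, assuming depmod output shaped like the lines above (including the case where the module path wraps onto its own line); the function name unresolved_by_module is hypothetical.

```python
import sys
from collections import defaultdict

HEADER = "*** Unresolved symbols in"


def unresolved_by_module(lines):
    """Group 'depmod:' unresolved-symbol lines by the module they belong to."""
    result = defaultdict(list)
    current = None          # module path the following symbols belong to
    expecting_path = False  # True when the path wrapped onto the next line
    for raw in lines:
        line = raw.strip()
        if line.startswith("depmod:"):
            body = line[len("depmod:"):].strip()
            if body.startswith(HEADER):
                path = body[len(HEADER):].strip()
                current = path or None
                expecting_path = not path
            elif current is not None and body:
                # Any other depmod line after a header is a symbol name.
                result[current].append(body)
        elif expecting_path and line:
            # Wrapped header: this line holds the module path.
            current = line
            expecting_path = False
    return dict(result)


if __name__ == "__main__":
    # Example use: make modules_install 2>&1 | python this_script.py
    for module, symbols in unresolved_by_module(sys.stdin).items():
        print(module)
        for sym in symbols:
            print("    " + sym)
```

Grouped this way, the log above shows two distinct problems: the generic_file_aio_* symbols missing for ext3 and nfs, and a separate set of cache/auth symbols missing for nfsd.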

2002-10-12 18:05:19

by Rik van Riel

Subject: Re: Linux v2.5.42

On Sat, 12 Oct 2002, jw schultz wrote:

> So far everything indicates that LVM2 is not compatible with
> LVM.

On the contrary. Everything I've seen indicates that device
mapper is compatible with anything and just needs userland
helpers to set up the mapping.
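
To make "userland helpers set up the mapping" a little more concrete, here is a toy Python model of a linear remapping table: each entry says that a run of logical sectors lives at some offset on some underlying device. This is a sketch of the concept only, not device-mapper's interface; the class name, the device names and the sector numbers are invented for illustration.

```python
from bisect import bisect_right


class LinearTable:
    """Toy logical-to-physical sector map (hypothetical, for illustration)."""

    def __init__(self, segments):
        # Each segment: (logical_start, length, device, device_offset),
        # all in sectors. Segments are assumed not to overlap.
        self.segments = sorted(segments)
        self.starts = [seg[0] for seg in self.segments]

    def map_sector(self, sector):
        # Find the last segment whose logical start is <= sector.
        i = bisect_right(self.starts, sector) - 1
        if i < 0:
            raise ValueError("sector %d is not mapped" % sector)
        start, length, device, offset = self.segments[i]
        if sector >= start + length:
            raise ValueError("sector %d is not mapped" % sector)
        # Same distance into the segment, shifted to the device offset.
        return device, offset + (sector - start)


if __name__ == "__main__":
    # A 1500-sector volume concatenated from two underlying devices.
    volume = LinearTable([
        (0, 1000, "/dev/hda1", 384),
        (1000, 500, "/dev/hdb1", 0),
    ])
    print(volume.map_sector(999))
    print(volume.map_sector(1000))
```

In this picture the kernel side only needs to apply the table; deciding what goes into it (for example from on-disk volume metadata) is left to whatever tool builds it.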

Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
Current spamtrap: [email protected]

2002-10-12 19:16:47

by Alan

Subject: Re: Linux v2.5.42

On Sat, 2002-10-12 at 18:14, Mark Peloquin wrote:
> Should EVMS be included, the team will make it our top priority to resolve
> the disputed design issues. If the ruling should be that some of our design
> decisions must change, so be it, we will comply. Certainly some changes can
> not be done by the 20th or 31st, however I feel the team can handle most
> changes before 2.6 ships.

That's good to hear. Right now the debate appears to be - "users: please
add EVMS" "hackers: oh my god no" - so you got the feature set right, it
seems.

2002-10-12 19:23:56

by jbradford

Subject: Re: Linux v2.5.42

> > Should EVMS be included, the team will make it our top priority to resolve
> > the disputed design issues. If the ruling should be that some of our design
> > decisions must change, so be it, we will comply. Certainly some changes can
> > not be done by the 20th or 31st, however I feel the team can handle most
> > changes before 2.6 ships.
>
> That's good to hear. Right now the debate appears to be - "users: please
> add EVMS" "hackers: oh my god no" - so you got the feature set right it
> seems

Obvious point:

* Linus can always thaw the tree after 31st just for one addition, if
something _really_ needs to be added for 2.6

John.

2002-10-12 19:33:51

by Andres Salomon

Subject: Re: Linux v2.5.42

2.5 LVM2 patches in non-BK form:
http://people.sistina.com/~thornber/dm_2002-10-09.tar.bz2

Joe, it would be nice if you linked to that from p.s.c/~thornber/; or,
if you're going to be releasing a bunch of patches, make a dm directory
w/ directory indexing enabled.


On Sat, Oct 12, 2002 at 02:32:33PM +0100, Christoph Hellwig wrote:
>
> On Fri, Oct 11, 2002 at 09:59:58PM -0700, Linus Torvalds wrote:
> > PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> > stand. I'm not using any kind of volume management personally, so I just
> > don't have the background or inclination to walk through the patches and
> > make that kind of decision. My non-scientific opinion is that it looks
> > like the EVMS code is going to be merged, but ..
> >
> > Alan, Jens, Christoph, others - this is going to be an area where I need
> > input from people I know, and preferably also help merging. I've been
> > happy to see the EVMS patches being discussed on linux-kernel, and I just
> > wanted to let people know that this needs outside help.
>
> I don't think the work to get EVMS in shape can be done in time (feel
> free to prove me wrong..). The problem in my eyes is that large
> parts of what EVMS does should be in the higher layers, i.e. the
> block layer, but they implement their own new layer as the consumer of
> those. I.e. instead of using the generic block layer structures to
> present a volume/device they use their own, private structures that
> need hacks to get the access right (pass-through ioctls) and need
> constant resyncing with the native structures in the case where we
> have both (the lowest layer). IMHO we should try to get a common
> userspace API in first, then implement the missing functionality for
> proper interaction of volume managers at the block layer. After
> that EVMS would just be a set of volume management drivers + a library
> of common functionality.
>
> Doing that higher level work will take some time to get right, and the
> current EVMS API seems unsuitable to me, it contains lots of very
> strange APIs that need rework. Merging EVMS now for 2.6 means that
> we'll have to keep those strange APIs around, and have to maintain
> backwards compatibility.
>
> I've not seen LVM2 code for 2.5 yet, but the 2.4 code looks very
> promising, although it might need some work in different areas.
> I'll take a look as soon as Sistina publishes patches for 2.5 instead
> of just a BK repository. LVM1 is totally unusable in 2.5, I think
> we should better remove the dead code now than later.
>
> Christoph

--
It's not denial. I'm just selective about the reality I accept.
-- Bill Watterson

2002-10-12 20:14:51

by Dieter Nützel

Subject: Re: Linux v2.5.42

> > > Should EVMS be included, the team will make it our top priority to
> > > resolve the disputed design issues. If the ruling should be that some of
> > > our design decisions must change, so be it, we will comply. Certainly
> > > some changes can not be done by the 20th or 31st, however I feel
> > > the team can handle most changes before 2.6 ships.
> >
> > That's good to hear. Right now the debate appears to be - "users: please
> > add EVMS" "hackers: oh my god no" - so you got the feature set right it
> > seems
>
> Obvious point:
>
> * Linus can always thaw the tree after 31st just for one addition, if
> something _really_ needs to be added for 2.6

Beside EVMS there is another one: Reiser4
Getting such an FS "for free" is worth it.
http://www.namesys.com/v4/v4.html

Hans, can you please send a summary of the "new" FS limits?
PB/EB, etc.? ;-)

Regards,
Dieter

--
Dieter Nützel
F&E Leiter
WEAR-A-BRAIN GmbH
Email: d.nuetzel at wearabrain.de (replace at with @)

2002-10-12 22:13:54

by Hans Reiser

Subject: Re: Linux v2.5.42

Dieter Nützel wrote:

>>>>Should EVMS be included, the team will make it our top priority to
>>>>resolve the disputed design issues. If the ruling should be that some of
>>>>our design decisions must change, so be it, we will comply. Certainly
>>>>some changes can not be done by the 20th or 31st, however I feel
>>>>the team can handle most changes before 2.6 ships.
>>>>
>>>>
>>>That's good to hear. Right now the debate appears to be - "users: please
>>>add EVMS" "hackers: oh my god no" - so you got the feature set right it
>>>seems
>>>
>>>
>>Obvious point:
>>
>>* Linus can always thaw the tree after 31st just for one addition, if
>>something _really_ needs to be added for 2.6
>>
>>
>
>Beside EVMS there is another one: Reiser4
>Getting such an FS "for free" is worth it.
>http://www.namesys.com/v4/v4.html
>
>Hans, can you please send a summary of the "new" FS limits?
>PB/EB, etc.? ;-)
>
>Regards,
> Dieter
>
>
>
The new size limits are those of the Linux VFS layer (we use 64 bit
numbers most places so that if we port to another architecture, or ia64
becomes viable....). I don't think anyone will find them motivating.

Dramatic performance gains while offering transactional FS operations
(wandering logs work, woohoo!), plugins, scalability due to per node
locking, obsoleting a whole slew of traditional database tree algorithms
for better performance, those are motivating. Wait for Linux Journal to
come out, it will have the benchmarks, and you'll see what I mean by
dramatic. It will be good enough that we can focus mostly on getting
the semantics in place for the competition with OFS.

Hans


2002-10-13 11:53:13

by Luigi Genoni

Subject: Re: Linux v2.5.42

On Sat, 12 Oct 2002, jw schultz wrote:

> Date: Sat, 12 Oct 2002 04:11:40 -0700
> From: jw schultz <[email protected]>
> To: Kernel Mailing List <[email protected]>
> Subject: Re: Linux v2.5.42
>
>
> I'll add my $0.02US which (according to exchange rates) is
> worth more though almost worthless.
>
> Hate to say it but in this comparison LVM2 loses. Primary
> reason: Backward compatibility. People are going to need to
> be able to switch between kernels.
>
> So far everything indicates that LVM2 is not compatible with
> LVM. That LVM2 and LVM(1) can coexist is irrelevant if
> 2.5 hasn't got a working LVM(1). And that would leave us
> with having to do backup+restore around the upgrade.

That is, I think, the real issue for people like me. One day I will have to
upgrade my servers to kernel 2.6, and all of them are on LVM1. I need to
be able to upgrade them. With EVMS I can do so, but afterwards I will have
other problems, because the operators, now used to the LVM command line
(which is the same one they use on other Unices), will also have to learn
another command line. If I am not wrong, the LVM1-like command line is going
to disappear from EVMS, and that will be a problem for many users.
EVMS is powerful but somehow too complex for many.
So my 2 eurocents are for EVMS with "also" an LVM1-like command line.

Luigi


2002-10-13 15:57:40

by Brian Jackson

Subject: Re: Linux v2.5.42

Christoph Hellwig writes:

> On Sun, Oct 13, 2002 at 11:16:24PM +0800, Michael Clark wrote:
>> On 10/13/02 21:49, Christoph Hellwig wrote:
>> > On Sun, Oct 13, 2002 at 08:41:20PM +0800, Michael Clark wrote:
>> >
>> >>Exactly. I think Christoph is comparing it to the original md
>> >>architecture which was more of an evolutionary design on the existing
>> >>block layer
>> >
>> >
>> > No, I do not. MD is in _no_ way a volume management framework but just
>> > a few drivers that share common code. That's something entirely different.
>>
>> So why then the requirement that internal remapping layers be
>> implemented as block devices?
>
> I don't care how a single remapping layer is implemented. I want
> the common volume management API to work on public nodes.
>
>> Neither is implementing an internal logical remapping layer as a
>> block device just so you can do an ioctl directly to it.
>
> Not without hacks.
>
>> I think the point is really explaining why they _should_ be accessed.
>> If there is some valid reason other than having something you
>> can do an ioctl on.
>
> Because that
>
> a) removes hacks like the EVMS pass-through
> b) allows userspace to easily access it through read/write
>
>>
>> > argumentation tell me why you haven't submitted a patch to Linus
>> > yet to disallow direct access to block devices that are in use
>> > by a filesystem.
>>
>> I think the issue here is an md block device in use by another md block
>> device. Possibly because md's design precludes this (a design artifact)
>> (ie. md tools need access to the intermediary devices - users don't).
>
> I'm not talking about MD here. Why do you want to disallow people
> using a device just because it has another layer above it? E.g. writing a
> change to the on-disk structures (setting a flag, etc.) is most logically
> expressed by a simple O_DIRECT write to the actual device.
>
>
>> Yes, but the block device encapsulation here removes the need for plugins
>> to be implemented as block devices ie. removing complexity elsewhere.
>> I must admit to not being an expert on the block layer - but wouldn't
>> your suggested approach mean intermediary layers would each have a
>> request queue
>
> It _could_ have a request queue, yes.
>
>> and other unneeded stuff - if so, is this desirable?
>
> What unneeded stuff? block device state contains no state relevant
> to userspace access.
>
>> > This argument is NIL if the infrastructure is part of exactly that
>> > evolving block layer. You might have noticed that kernel code
>> > compatibility with other releases is not really a criterion for the
>> > linux kernel development, btw..
>>
>> I agree, maybe this would be worth doing for 2.7/2.8.
>
> Yes.
>
>> In the meantime
>> do you think this would be feasible? - you are basically suggesting
>> a complete rewrite
>
> Exactly.
>
>> (or do you think you can do the rewrite to IBM's
>> satisfaction before the freeze ie. in the eternal linux kernel way,
>> you want it you write it ;). Me, i'm happy with the current approach
>> - but of course, i'm only a user ;).
>
> _I_ don't want to get EVMS in, sorry. I _do_ want a proper volume
> management framework, but I can live with it not being in before 2.8.
>

Good for you. Most people can't/won't wait for it. They will see that linux
doesn't have a key feature for enterprises, and say that linux still isn't
mature enough for them and at best only use linux on some dinky little
webservers, like it has been used in the past. There isn't a whole lot of
that market left. If we want to move forward and offer something to a
broader base of companies, we need features like this included.

--Brian Jackson

p.s. Maybe you could keep your replies to constructive criticism, instead of
just dogging EVMS. Some people actually do want linux to improve.


2002-10-13 12:47:09

by Michael Clark

Subject: Re: Linux v2.5.42

On 10/13/02 19:58, [email protected] wrote:
> On Sat, 12 Oct 2002, jw schultz wrote:
>
> So my 2 eurocents are for EVMS with "also" an LVM1-like command line.

EVMS has this already (same syntax exactly).

eg.

monty:~# evms_lvcreate -h
Enterprise Volume Management System
International Business Machines 09/30/02
LVM Emulation Utilities 1.2.0

evms_lvcreate -- initialize a logical volume for use by EVMS

evms_lvcreate [-A|--autobackup {y|n}] [-C|--contiguous {y|n}] [-d|--debug]
[-h|--help] [-i|--stripes Stripes [-I|--stripesize StripeSize]]
{-l|--extents LogicalExtentsNumber |
-L|--size LogicalVolumeSize[kKmMgGtT]} [-n|--name LogicalVolumeName]
[-p|--permission {r|rw}] [-r|--readahead ReadAheadSectors]
[-v|--verbose] [-Z|--zero {y|n}] [--version]
VolumeGroupName [PhysicalVolumePath...]

evms_lvcreate -s|--snapshot [-c|--chunksize ChunkSize]
{-l|--extents LogicalExtentsNumber |
-L|--size LogicalVolumeSize[kKmMgGtT]}
-n|--name SnapshotLogicalVolumeName
LogicalVolume[Path] [PhysicalVolumePath...]

monty:~# /sbin/lvcreate -h
Logical Volume Manager 1.0.4
Heinz Mauelshagen, Sistina Software 02/05/2002 (IOP 10)

lvcreate -- initialize a logical volume for use by LVM

lvcreate [-A|--autobackup {y|n}] [-C|--contiguous {y|n}] [-d|--debug]
[-h|--help] [-i|--stripes Stripes [-I|--stripesize StripeSize]]
{-l|--extents LogicalExtentsNumber |
-L|--size LogicalVolumeSize[kKmMgGtT]} [-n|--name LogicalVolumeName]
[-p|--permission {r|rw}] [-r|--readahead ReadAheadSectors]
[-v|--verbose] [-Z|--zero {y|n}] [--version]
VolumeGroupName [PhysicalVolumePath...]

lvcreate -s|--snapshot [-c|--chunksize ChunkSize]
{-l|--extents LogicalExtentsNumber |
-L|--size LogicalVolumeSize[kKmMgGtT]}
-n|--name SnapshotLogicalVolumeName
LogicalVolume[Path] [PhysicalVolumePath...]

2002-10-13 13:36:07

by Christoph Hellwig

Subject: Re: Linux v2.5.42

On Sat, Oct 12, 2002 at 12:14:25PM -0500, Mark Peloquin wrote:
> I guess it comes down to the point of whether the block layer should evolve
> to also handle volume management generically, or whether volume management
> is separate component that utilizes and works with the block layer.
>
> Linus, if you feel that volume management and the block layer can and should
> be separate components that work together, then EVMS is ready today,

No, it's not. Even if this design stands, the code still has many issues.
Nevertheless, even if we don't want separate representations of intermediate
volumes and topmost volumes, the volume management should not be part of
a driver but at a higher level, i.e. separated out from the evms common library
code.

2002-10-13 16:19:48

by Arjan van de Ven

Subject: Re: Linux v2.5.42

On Sun, 2002-10-13 at 18:11, Brian Jackson wrote:

> p.s. Maybe you could keep your replies to constructive criticism, instead of
> just dogging EVMS. Some people actually do want linux to improve.

In case you missed it: EVMS is not the only way you can do volume
management in 2.5... and Christoph is pointing out valid design flaws
(and serious code bugs) in EVMS. Code bugs you can fix after merge;
design flaws should at least be discussed before merging in my opinion.



2002-10-13 16:52:15

by Brian Jackson

Subject: Re: Linux v2.5.42

Arjan van de Ven writes:

> On Sun, 2002-10-13 at 18:11, Brian Jackson wrote:
>
>> p.s. Maybe you could keep your replies to constructive criticism, instead of
>> just dogging EVMS. Some people actually do want linux to improve.
>
> In case you missed it: EVMS is not the only way you can do volume
> management in 2.5... and Christoph is pointing out valid design flaws
> (and serious code bugs) in EVMS. Code bugs you can fix after merge;
> design flaws should at least be discussed before merging in my opinion.
>

Yes I do realize that, but I think EVMS offers more in the long run than any
of the others. I don't mean to speak ill of Christoph(he has done some very
good work in the past, and I think he is very talented), In fact, I thought
he handled the problems with EVMS very well at first(pointing out problems,
etc.), but then as the thread went on you could tell he was just taking it
more and more personally(for whatever reasons), and we all know that doesn't
help getting things done, especially not with a deadline looming. I don't
know as much as I should about all of this considering I am opening my mouth
on the subject, but it seems that something needs to be done soon one way or
another.

--Brian

2002-10-13 16:58:42

by Dipankar Sarma

Subject: Re: Linux v2.5.42

On Sat, Oct 12, 2002 at 01:46:17PM +0000, Christoph Hellwig wrote:
> BTW, there's another infrastructure feature I forgot when you asked
> what should go in before feature freeze. And IMHO it's very important
> (so why did I forget it..): IBM's read copy update synchronisation
> primitives. They've shown significant improvements when used for the
> file tables, dcache and routing cache, it has been around since before
> 2.5 forked, SuSE has it in their production kernel for a while, too and
> akpm has it in his tree for a while.

Yes, rcu core and dcache_rcu has been in -mm since 2.5.37-mm1 and we
haven't seen any problems with it so far. This patch combination
has no regression in lower end of systems and gives us better performance in
webserver and multiuser type of workloads at the higher end of systems.

> Even if those existing users don't get in yet I don't want to miss the
> infrastructure in the 2.6 series.

Andrew, will you be inclined to send this to Linus or should I send them
myself ?

Thanks
--
Dipankar Sarma <[email protected]> http://lse.sourceforge.net
Linux Technology Center, IBM Software Lab, Bangalore, India.

2002-10-13 17:41:14

by Robert Love

Subject: Re: Linux v2.5.42

On Sun, 2002-10-13 at 12:11, Brian Jackson wrote:

> Good for you. Most people can't/won't wait for it. They will see that
> linux doesn't have a key feature for enterprises, and say that linux
> still isn't mature enough for them and at best only use linux on some
> dinky little webservers, like it has been used in the past. There
> isn't a whole lot of that market left. If we want to move forward and
> offer something to a broader base of companies, we need features like
> this included.

s/Most people/Most enterprise people/

And this is really entirely the wrong attitude to take. "Linux does not
have volume management" and "High-end Linux applications need volume
management" do not logically imply "we need to merge EVMS."

You need to fix the issues Christoph and others raised and you need to
work within the system. I won't cry if EVMS is not merged.

Robert Love

2002-10-13 18:20:01

by Brian Jackson

Subject: Re: Linux v2.5.42

Robert Love writes:

> On Sun, 2002-10-13 at 12:11, Brian Jackson wrote:
>
>> Good for you. Most people can't/won't wait for it. They will see that
>> linux doesn't have a key feature for enterprises, and say that linux
>> still isn't mature enough for them and at best only use linux on some
>> dinky little webservers, like it has been used in the past. There
>> isn't a whole lot of that market left. If we want to move forward and
>> offer something to a broader base of companies, we need features like
>> this included.
>
> s/Most people/Most enterprise people/
>
> And this is really entirely the wrong attitude to take. "Linux does not
> have volume management" and "High-end Linux applications need volume
> management" do not logically imply "we need to merge EVMS."
>
> You need to fix the issues Christoph and others raised and you need to
> work within the system. I won't cry if EVMS is not merged.
>

Just for clarification I have nothing to do with EVMS, other than I think it
would be a good addition. I don't want anybody to get the idea that they are
as unprofessional as me.

--Brian

> Robert Love
>

2002-10-13 19:44:22

by Mark Hahn

Subject: Re: Linux v2.5.42

> Yes I do realize that, but I think EVMS offers more in the long run than any
> of the others.

not to put too fine a point on it, but IBM has their own goals.
for instance, some part of EVMS design is motivated by IBM's political
desire to permit its bank customers, who have horrible old OS/2 systems,
to transparently use OS/2 volumes. it's not as if IBM couldn't provide
a simple, user-level migration tool. it's not as if the Linux community
is going to rush out and say "let's all start using OS/2 volumes everywhere!"

using Linux to make your customers happy is great;
the issue is whether it deforms Linux in general to be shaped
by non-shared priorities. yes, I'm a crank, and yes, I'm bothered
by other IBM influences, such as their fixation with performance of
broken platforms like Profusion, or tuning for NUMA.

the best part of Linux is its willingness to throw out old designs;
a big system like EVMS has its own resistance to such redesign.

2002-10-13 19:52:04

by Rik van Riel

Subject: Re: Linux v2.5.42

On Sun, 13 Oct 2002, Mark Hahn wrote:

> > Yes I do realize that, but I think EVMS offers more in the long run than any
> > of the others.
>
> not to put too fine a point on it, but IBM has their own goals. for
> instance, some part of EVMS design is motivated by IBM's political
> desire to permit its bank customers, who have horrible old OS/2 systems,
> to transparently use OS/2 volumes. it's not as if IBM couldn't provide
> a simple, user-level migration tool.

You don't need a migration tool.

All you need is:

1) a kernel level driver that can map devices, ie. a device mapper

2) user space tools that can parse the volume metadata and tell the
kernel how to map each chunk at initialisation or mount time

You don't need a flying circus in kernel space.
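
The two-part split above can be sketched in userspace C. This is an illustration only: struct map_target, struct map_table and map_sector() are hypothetical names, not the real device-mapper structures. The assumption is that user space parses the volume metadata and fills in a table of linear ranges, so the kernel side only needs the lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not the real device-mapper ones. */
struct map_target {
	uint64_t start;   /* first logical sector covered by this target */
	uint64_t len;     /* number of sectors covered */
	int      dev;     /* index of the underlying device */
	uint64_t offset;  /* first sector used on that device */
};

struct map_table {
	const struct map_target *targets; /* non-overlapping ranges */
	size_t count;
};

/*
 * Translate a logical sector into (device, physical sector).
 * Returns 0 on success, -1 if no target covers the sector.
 */
static int map_sector(const struct map_table *t, uint64_t sector,
		      int *dev, uint64_t *phys)
{
	size_t i;

	assert(t && dev && phys);
	for (i = 0; i < t->count; i++) {
		const struct map_target *tg = &t->targets[i];

		if (sector >= tg->start && sector - tg->start < tg->len) {
			*dev = tg->dev;
			*phys = tg->offset + (sector - tg->start);
			return 0;
		}
	}
	return -1;
}
```

With targets {0, 100, dev 0, offset 2048} and {100, 50, dev 1, offset 0}, logical sector 120 falls in the second range and maps to sector 20 on device 1.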

regards,

Rik
--
Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/
Current spamtrap: [email protected]

2002-10-13 19:53:29

by Andrew Morton

Subject: Re: Linux v2.5.42

Mark Hahn wrote:
>
> I'm bothered by other IBM influences, such as their fixation with
> performance of broken platforms like Profusion, or tuning for NUMA.

Actually it's rather useful to have these platforms around. Because
if you fix a problem on profusion and NUMA-Q, it's really, really fixed
for other hardware...

They show up races and lock contention like crazy.

2002-10-13 20:18:19

by Bernd Eckenfels

Subject: Re: Linux v2.5.42

In article <Pine.LNX.4.33.0210131545510.17395-100000@coffee.psychology.mcmaster.ca> you wrote:
> for instance, some part of EVMS design is motivated by IBM's political
> desire to permit its bank customers, who have horrible old OS/2 systems,
> to transparently use OS/2 volumes.

Some parts? I guess it is one module. What is wrong with this. Support for
non standard partition and slice types is currently cluttering up the kernel
source. I will be more than happy to see this in a EVMS module.

Greetings
Bernd

2002-10-13 20:20:13

by Sean Neakums

Subject: Re: Linux v2.5.42

commence Rik van Riel quotation:

> All you need is:
>
> 1) a kernel level driver that can map devices, ie. a device mapper
>
> 2) user space tools that can parse the volume metadata and tell the
> kernel how to map each chunk at initialisation or mount time
>
> You don't need a flying circus in kernel space.

I don't know my arse from my elbow when it comes to kernel design and
coding issues, but my chimpanzee brain really likes this aspect of the
LVM2/dm combination.

--
/ |
[|] Sean Neakums | Questions are a burden to others;
[|] <[email protected]> | answers a prison for oneself.
\ |

2002-10-14 04:21:12

by Andreas Dilger

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On Oct 13, 2002 13:46 -0400, Robert Love wrote:
> And this is really entirely the wrong attitude to take. "Linux does not
> have volume management" and "High-end Linux applications need volume
> management" do not logically imply "we need to merge EVMS."

Well, I think the attitude is more like "I've never used volume
management, and high-end systems use volume management, therefore only
high end systems will benefit from volume management".

It's like Fortran programmers saying "I've gotten by with only static
memory allocation all of these years, so dynamic memory allocation in
C is just useless". (Yes, I know "them's fightin' words" ;-)

The truth is that once you've gotten used to the LVM paradigm, going
back to "partitions" sucks, a lot. Not having to over-allocate huge
gobs of disk to partitions because you don't want to backup, reformat,
restore, repeat each time you manage to run out of space in a partition
is a big win, whether you're administering 1 disk drive or 1000.

Being able to create temporary volumes for whatever need strikes you,
increasing the amount of free space in your filesystem while it's
mounted, that's a big win in my books, even on a laptop (maybe even
_especially_ on a laptop where you can't easily add another disk).

Maybe there are some warts in EVMS, but that doesn't mean we don't
need it (or equivalent) in Linux.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-10-14 04:51:46

by Andreas Dilger

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On Oct 13, 2002 15:58 -0400, Mark Hahn wrote:
> > Yes I do realize that, but I think EVMS offers more in the long run
> > than any of the others.
>
> for instance, some part of EVMS design is motivated by IBM's political
> desire to permit its bank customers, who have horrible old OS/2 systems,
> to transparently use OS/2 volumes. it's not as if IBM couldn't provide
> a simple, user-level migration tool.

Well, you try and convert a few TB of data in a few hour outage window
and pray everything goes well (and then have to convert _back_ to the
old format once you find a bug in the new environment). You have just
never worked in an environment where the time constraints are tight,
and you CANNOT do the migration offline, or in advance, or whatever.

> it's not as if the Linux community is going to rush out and say
> "let's all start using OS/2 volumes everywhere!"

Well, it's not like most of the Linux community is rushing out and saying
"let's all start using Amiga AFFS filesystems" either, but that didn't
prevent it from being included in the kernel.

I actually DO prefer AIX LVM metadata over the Linux LVM metadata,
and it is NO CONTEST when you are comparing it to the "DOS partitions"
that you seem to prefer so much.

> the best part of Linux is its willingness to throw out old designs;

The best part of Linux is that it accepts a lot of people into the fold,
each of whom has their own special needs, and can change it to meet
those needs.

> a big system like EVMS has its own resistance to such redesign.

??? A big system like the VM/VFS/networking/etc has its own resistance to
such redesign too, but that doesn't mean that they haven't been hacked and
diced and re-assembled like Frankenstein several times. Everything has
to start somewhere, and if you wait until everyone reaches "consensus"
on what is the "best" way to implement it, we would all still be running
MS DOS or Minix.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-10-14 09:56:40

by Joe Thornber

Subject: Re: Linux v2.5.42

Linus,

On Fri, Oct 11, 2002 at 09:59:58PM -0700, Linus Torvalds wrote:
> PS: NOTE - I'm not going to merge either EVMS or LVM2 right now as things
> stand. I'm not using any kind of volume management personally, so I just
> don't have the background or inclination to walk through the patches and
> make that kind of decision. My non-scientific opinion is that it looks
> like the EVMS code is going to be merged, but ..
>
> Alan, Jens, Christoph, others - this is going to be an area where I need
> input from people I know, and preferably also help merging. I've been
> happy to see the EVMS patches being discussed on linux-kernel, and I just
> wanted to let people know that this needs outside help.

I've just got a few comments to make:

Yes, there has been a lot more discussion of EVMS than device-mapper
in the last couple of weeks, however not much of it was complimentary.
I feel like adding some obvious design flaws to device-mapper so that
Christoph will give me some free publicity too ;)

I've always tried to argue for the inclusion of device-mapper in the
kernel, rather than the exclusion of EVMS. Admittedly I don't agree
with their design, if I did I would have continued developing the LVM1
driver. However I don't see why we have to deliberately upset
either the large LVM or EVMS userbase by not supporting their software
- unless the respective driver is too broken.

Some people seem to misunderstand the status of the LVM2 system.

i) I consider the software to be more stable than LVM1 and would
always use it in preference, and have done for the last year.

ii) It is backwards compatible with LVM1, the tools look and behave in
an almost identical manner to the LVM1 tools. To migrate from
LVM1 to LVM2 you compile a kernel with dm, compile the userland tools
and use them.

iii) The only major feature that LVM2 doesn't have compared to LVM1 is
'pvmove'. This feature is broken/dangerous in LVM1. EVMS also
doesn't have a pvmove.

The LVM1 driver received a lot of abuse over the last 2 years, I believe
we've addressed these problems very well with the dm driver. I have
also argued why a new driver was necessary rather than fixing LVM1,
and think the vast majority of people agree with me. The LVM users
want to continue with the toolset they are familiar with, so why are
we even considering not continuing to support them by leaving dm out
of 2.5 ?

- Joe

2002-10-14 13:05:06

by Rob Landley

Subject: Re: Linux v2.5.42

On Saturday 12 October 2002 03:37 pm, [email protected] wrote:

> Obvious point:
>
> * Linus can always thaw the tree after 31st just for one addition, if
> something _really_ needs to be added for 2.6
>
> John.

New entry to the famous last words list: "just one addition".

Um, no please? Can of worms? Bad Thing (tm)?

Rob

(I'd much rather have to patch my kernel to get a feature than go through
another nine months of "2.6-pre37-pre2-we_really_mean_it_this_time-ac4".
I don't care if my system won't BOOT without it, everybody has _something_
they can't live without...)

2002-10-14 14:40:08

by Christoph Hellwig

Subject: Re: Linux v2.5.42

(full quote deleted, please try to stop that)

> Good for you. Most people can't/won't wait for it. They will see that linux
> doesn't have a key feature for enterprises, and say that linux still isn't
> mature enough for them and at best only use linux on some dinky little
> webservers, like it has been used in the past.

Please stop trolling. Linux is used in many areas, and you can
patch in whatever you want to feel "enterprise ready". Being
that is not a primary focus of Linux kernel development and won't be.

2002-10-14 15:05:20

by Christoph Hellwig

Subject: Re: Linux v2.5.42

On Sun, Oct 13, 2002 at 10:24:11PM +0200, Bernd Eckenfels wrote:
> In article <Pine.LNX.4.33.0210131545510.17395-100000@coffee.psychology.mcmaster.ca> you wrote:
> > for instance, some part of EVMS design is motivated by IBM's political
> > desire to permit its bank customers, who have horrible old OS/2 systems,
> > to transparently use OS/2 volumes.
>
> Some parts? I guess it is one module. What is wrong with this. Support for
> non standard partition and slice types is currently cluttering up the kernel
> source. I will be more than happy to see this in a EVMS module.

Umm, you consider moving code from fs/partitions/*.c to drivers/evms/*.c
a cleanup?

2002-10-14 16:02:34

by Christoph Hellwig

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On Sun, Oct 13, 2002 at 10:23:23PM -0600, Andreas Dilger wrote:
> Maybe there are some warts in EVMS, but that doesn't mean we don't
> need it (or equivalent) in Linux.

Full agreement here. But until we have something that is in
a mergeable shape I don't want to see it in mainline.

2002-10-14 17:31:19

by Ben Rafanello

Subject: Re: Linux v2.5.42


On Sun, 13 Oct 2002, Rik van Riel wrote:

>All you need is:
>
>1) a kernel level driver that can map devices, ie. a device mapper
>
>2) user space tools that can parse the volume metadata and tell the
> kernel how to map each chunk at initialisation or mount time

This works well for the simple cases where the volume metadata is
static. However, it does not handle cases where the volume
metadata must be updated dynamically, the most obvious cases
being striping with parity, mirroring (esp. the more advanced
forms/features such as smart resync, partial mirrors, remote
mirroring, etc), snapshots, and bad block relocation.

Regards,

Ben Rafanello
EVMS Team Lead
IBM Linux Technology Center
(512) 838-4762
[email protected]


2002-10-14 19:16:18

by Christoph Hellwig

Subject: Re: Linux v2.5.42

On Mon, Oct 14, 2002 at 11:01:50AM +0100, Joe Thornber wrote:
> Yes, there has been a lot more discussion of EVMS than device-mapper
> in the last couple of weeks, however not much of it was complimentary.
> I feel like adding some obvious design flaws to device-mapper so that
> Christoph will give me some free publicity too ;)

Haven't found any big design issues yet, but a number of small
and medium implementation issues and some style nitpicking :)

Comments are against dm_2002-10-09.tar.bz2:

00.patch

Looks fine. Useful for other code (like EVMS..), too.

01.patch

Looks fine, but I wonder whether we really want the
zeroing in kernel mode (yes, I know userspace calloc
does it)

02.patch

It starts to get interesting:

+#define NUM_BUCKETS 64
+#define MASK_BUCKETS (NUM_BUCKETS - 1)
+#define HASH_MULT 2654435387U
+static struct list_head *_dev_buckets;
+static struct list_head *_name_buckets;
+static struct list_head *_uuid_buckets;
+
+/*
+ * Guards access to all three tables.
+ */
+static DECLARE_RWSEM(_hash_lock);

Your heavy _ prefix looks a bit strange from the normal
kernel coding style perspective. It's fine with me as long
as it's consistent with itself (and it is).

+/*-----------------------------------------------------------------
+ * Init/exit code
+ *---------------------------------------------------------------*/
+void dm_hash_exit(void)
+{
+ if (_dev_buckets)
+ kfree(_dev_buckets);
+
+ if (_name_buckets)
+ kfree(_name_buckets);
+
+ if (_uuid_buckets)
+ kfree(_uuid_buckets);
+}

kfree(NULL) is fine.
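
The same rule holds for free() in standard C, so a userspace sketch of the simplified exit path looks like this. It is a minimal illustration, assuming a hypothetical struct hash_tables standing in for the three bucket arrays; it is not the dm code itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the three hash tables in the patch. */
struct hash_tables {
	int *dev_buckets;
	int *name_buckets;
	int *uuid_buckets;
};

/*
 * Release all tables. free(NULL) is defined to be a no-op, so the
 * "if (ptr)" guards can simply be dropped.
 */
static void hash_tables_exit(struct hash_tables *t)
{
	assert(t);
	free(t->dev_buckets);
	free(t->name_buckets);
	free(t->uuid_buckets);
	t->dev_buckets = NULL;
	t->name_buckets = NULL;
	t->uuid_buckets = NULL;
}
```

Resetting the pointers afterwards is a choice made for this sketch so that calling the function twice stays harmless; the original patch does not do that.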

+/*-----------------------------------------------------------------
+ * Code for looking up the device by kdev_t.
+ *---------------------------------------------------------------*/
+static struct hash_cell *__get_dev_cell(kdev_t dev)
+{
+ struct list_head *tmp;
+ struct hash_cell *hc;
+ unsigned int h = hash_dev(dev);
+
+ list_for_each (tmp, _dev_buckets + h) {
+ hc = list_entry(tmp, struct hash_cell, list);
+ if (kdev_same(hc->md->dev, dev))
+ return hc;
+ }
+
+ return NULL;
+}

As the argument is purely a hash value I'd suggest to
use a dev_t. Maybe pass in a struct block_device for
consistency.

+
+struct mapped_device *dm_get_r(kdev_t dev)
+{
+ struct hash_cell *hc;
+ struct mapped_device *md = NULL;
+
+ down_read(&_hash_lock);
+ hc = __get_dev_cell(dev);
+ if (hc && dm_flag(hc->md, DMF_VALID)) {
+ md = hc->md;
+ down_read(&md->lock);
+ }
+ up_read(&_hash_lock);
+
+ return md;
+}

Ditto (and some more).

+/*
+ * Convert a device path to a kdev_t.
+ */
+int lookup_device(const char *path, kdev_t *dev)
+{
+ int r;
+ struct nameidata nd;
+ struct inode *inode;
+
+ if ((r = path_lookup(path, LOOKUP_FOLLOW, &nd)))
+ return r;
+
+ inode = nd.dentry->d_inode;
+ if (!inode) {
+ r = -ENOENT;
+ goto out;
+ }
+
+ if (!S_ISBLK(inode->i_mode)) {
+ r = -EINVAL;
+ goto out;
+ }
+
+ *dev = inode->i_rdev;
+
+ out:
+ path_release(&nd);
+ return r;
+}

What about resolving directly to a struct block_device?
And yes, this name -> struct block_device thing is duplicated
a few times. Al & I need to look into factoring it out.

+/*
+ * Open a device so we can use it as a map destination.
+ */
+static int open_dev(struct dm_dev *d)
+{
+ int r;
+
+ if (d->bdev)
+ BUG();
+
+ if (!(d->bdev = bdget(kdev_t_to_nr(d->dev))))
+ return -ENOMEM;
+
+ r = blkdev_get(d->bdev, d->mode, 0, BDEV_RAW);
+ if (r) {
+ bdput(d->bdev);
+ return r;
+ }
+
+ return 0;
+}

bd_claim is missing..

+/*
+ * Close a device that we've been using.
+ */
+static void close_dev(struct dm_dev *d)
+{
+ if (!d->bdev)
+ return;
+
+ blkdev_put(d->bdev, BDEV_RAW);
+ d->bdev = NULL;
+}

And bd_unclaim here.

+
+ if (sscanf(path, "%x:%x", &major, &minor) == 2) {
+ /* Extract the major/minor numbers */
+ dev = mk_kdev(major, minor);
+ } else {
+ /* convert the path to a device */
+ if ((r = lookup_device(path, &dev)))
+ return r;
+ }

What do you need the major/minor version for?

+static int __init dm_init(void)
+{
+ const int count = sizeof(_inits) / sizeof(*_inits);

Use ARRAY_SIZE()?
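
For readers outside the kernel tree, a minimal userspace sketch of the idiom follows. The macro is defined locally here (the kernel supplies its own), and the init functions and run_inits() are hypothetical placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Element count of a true array; must not be used on a pointer. */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

typedef int (*init_fn)(void);

/* Hypothetical init steps; each returns 0 on success. */
static int init_a(void) { return 0; }
static int init_b(void) { return 0; }
static int init_c(void) { return 0; }

static init_fn inits[] = { init_a, init_b, init_c };

/* Run every init step in order; return how many succeeded. */
static size_t run_inits(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(inits); i++)
		if (inits[i]() != 0)
			break;
	return i;
}
```

The macro states the intent directly and keeps the count correct when entries are added to the table.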

+static int dm_blk_ioctl(struct inode *inode, struct file *file,
+ uint command, unsigned long a)
+{
+ int r;
+ sector_t size;
+ long l_size;
+ unsigned long long ll_size;
+
+ r = get_device_size(inode->i_rdev, &size);
+ if (r)
+ return r;
+
+ switch (command) {
+ case BLKGETSIZE:
+ l_size = (long) size;
+ if (copy_to_user((void *) a, &l_size, sizeof(long)))
+ return -EFAULT;
+ break;
+
+ case BLKGETSIZE64:
+ ll_size = (unsigned long long) size << 9;
+ if (put_user(ll_size, (u64 *) a))
+ return -EFAULT;
+ break;

These two are in generic code and don't need to be implemented
by a lowlevel driver (won't ever be called).

+{
+ struct clone_info ci;
+
+ ci.md = md;
+ ci.bio = bio;
+ ci.io = alloc_io();
+ ci.io->error = 0;

Some indentation issues here (and in a lot of other places). I'd
suggest you run the file through unexpand(1).

+ ci.io->io_count = (atomic_t) ATOMIC_INIT(1);

This cast looks bogus to me.
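
A simplified model of why the cast is odd, assuming atomic_t is a small struct and ATOMIC_INIT() a brace initializer (as it was on i386 at the time). The model_ names below are hypothetical and the type is a plain struct, not actually atomic; the point is only the difference between an initializer and a runtime setter.

```c
#include <assert.h>

/* Simplified model only: a plain struct, NOT really atomic. */
typedef struct { int counter; } model_atomic_t;

/* A brace initializer, meant for use in declarations. */
#define MODEL_ATOMIC_INIT(i) { (i) }

/* Static initialisation: the initializer used as intended. */
static model_atomic_t static_count = MODEL_ATOMIC_INIT(1);

/* Runtime assignment goes through a setter instead of a cast. */
static void model_atomic_set(model_atomic_t *v, int i)
{
	assert(v);
	v->counter = i;
}
```

Under that assumption, casting the initializer only compiles because the cast turns it into a compound literal; a setter says what is meant and does not depend on how the initializer macro happens to expand.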

+/*
+ * Sets or clears the read-only flag for the device. Write lock
+ * must be held.
+ */
+void dm_set_ro(struct mapped_device *md, int ro)
+{
+ if (ro)
+ dm_set_flag(md, DMF_RO);
+ else
+ dm_clear_flag(md, DMF_RO);
+
+ set_device_ro(md->dev, ro);
+}

Split this into dm_set_ro and dm_set_rw? Yes, set_device_ro
has the same braindead API, but no need to do the same.

+/*
+ * We need to be able to change a mapping table under a mounted
+ * filesystem. For example we might want to move some data in
+ * the background. Before the table can be swapped with
+ * dm_bind_table, dm_suspend must be called to flush any in
+ * flight bios and ensure that any further io gets
+ * deferred. Write lock must be held.
+ */
+int dm_suspend(kdev_t dev)

Pass in a struct block_device here?

+ add_wait_queue(&md->wait, &wait);
+ while (1) {
+ set_current_state(TASK_INTERRUPTIBLE);
+
+ if (!atomic_read(&md->pending))
+ break;
+
+ yield();
+ }
+
+ current->state = TASK_RUNNING;
+ remove_wait_queue(&md->wait, &wait);

Hmm, the yield() looks strange and INTERRUPTIBLE without
a check for signals, too. Switch to wait_event_interruptible?

+int dm_resume(kdev_t dev)

struct block_device?

+#include <linux/config.h>
+#include <linux/version.h>
+#include <linux/major.h>
+#include <linux/iobuf.h>

You don't actually use this, do you?

+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+#include <linux/compatmac.h>

What do you need from this one?

+#include <linux/cache.h>
+#include <linux/devfs_fs_kernel.h>
+#include <linux/ctype.h>
+#include <linux/device-mapper.h>
+#include <linux/list.h>
+#include <linux/init.h>
+#include <linux/blkdev.h>

IMHO many of these includes should go into the individual sources
instead. I doubt you need them in the header.

+static inline void dm_put_r(struct mapped_device *md) {
+ up_read(&md->lock);
+}

static inline void dm_put_r(struct mapped_device *md)
{
	up_read(&md->lock);
}

+
+static inline void dm_put_w(struct mapped_device *md) {
+ up_write(&md->lock);
+}

Ditto.

+static inline char *dm_strdup(const char *str)
+{
+ char *r = kmalloc(strlen(str) + 1, GFP_KERNEL);
+ if (r)
+ strcpy(r, str);
+ return r;
+}

What about the following in kernel.h instead?:

static inline char *kstrdup(const char *str, unsigned int gfp_mask)
{
	char *r = kmalloc(strlen(str) + 1, gfp_mask);
	if (likely(r))
		strcpy(r, str);
	return r;
}

+
+static inline int dm_flag(struct mapped_device *md, int flag)
+{
+ return (md->flags & (1 << flag));
+}
+
+static inline void dm_set_flag(struct mapped_device *md, int flag)
+{
+ md->flags |= (1 << flag);
+}
+
+static inline void dm_clear_flag(struct mapped_device *md, int flag)
+{
+ md->flags &= ~(1 << flag);
+}

Are these performance-critical or is there another reason to not
use the generic linux/bitops.h variants?
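
One difference worth spelling out: the open-coded helpers above are plain read-modify-write operations, whereas generic bit operations are typically atomic. A userspace sketch of atomic flag helpers using C11 stdatomic.h follows. The flag_ functions are hypothetical, and the numeric values given to DMF_VALID and DMF_RO are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Flag numbers named after those in the patch; values are assumed. */
enum { DMF_VALID = 0, DMF_RO = 1 };

/* Atomically set bit nr. */
static void flag_set(atomic_ulong *flags, int nr)
{
	atomic_fetch_or(flags, 1UL << nr);
}

/* Atomically clear bit nr. */
static void flag_clear(atomic_ulong *flags, int nr)
{
	atomic_fetch_and(flags, ~(1UL << nr));
}

/* Return true if bit nr is set, normalised to 0 or 1. */
static bool flag_test(atomic_ulong *flags, int nr)
{
	return (atomic_load(flags) >> nr) & 1UL;
}
```

Unlike dm_flag(), which returns the masked word, flag_test() here always yields 0 or 1, which avoids surprises when the result is compared or stored in a narrow type.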

+int __init dm_interface_init(void);

__init is not needed for the prototypes. That way you don't need init.h
in the header.

03.patch
04.patch

Look fine. I wonder whether they want to be separate modules?

05.patch

+
+/*-----------------------------------------------------------------
+ * Implementation of open/close/ioctl on the special char
+ * device.
+ *---------------------------------------------------------------*/
+static int ctl_open(struct inode *inode, struct file *file)
+{
+ /* only root can open this */
+ if (!capable(CAP_SYS_ADMIN))
+ return -EACCES;

Do you really want this in ->open and not the actual ioctl
commands?

+
+ MOD_INC_USE_COUNT;

Not needed - that's what THIS_MODULE in struct file_operations
is for.

+static int ctl_close(struct inode *inode, struct file *file)
+{
+ MOD_DEC_USE_COUNT;
+ return 0;
+}

Method not needed at all.

+ r = devfs_generate_path(_dm_misc.devfs_handle, rname + 3,
+ sizeof rname - 3);
+ if (r == -ENOSYS)
+ return 0; /* devfs not present */
+
+ if (r < 0) {
+ DMERR("devfs_generate_path failed for control device");
+ goto failed;
+ }
+
+ strncpy(rname + r, "../", 3);
+ r = devfs_mk_symlink(NULL, DM_DIR "/control",
+ DEVFS_FL_DEFAULT, rname + r, &_ctl_handle, NULL);

Looks a bit crude. Why do you need this symlink?

+ __kernel_dev_t dev; /* in/out */

Hmm. Can't you just do every ioctl on the actually affected
block device node instead of the character ones? Unlike LVM1
the nodes are there in /dev/mapper and don't need to be created..

And I must admit I don't really like the ioctl interface. But at least
it's separated out properly.

2002-10-14 19:26:15

by Alexander Viro

Subject: Re: Linux v2.5.42



On Mon, 14 Oct 2002, Christoph Hellwig wrote:

> +{
> + int r;
> +
> + if (d->bdev)
> + BUG();
> +
> + if (!(d->bdev = bdget(kdev_t_to_nr(d->dev))))
> + return -ENOMEM;
> +
> + r = blkdev_get(d->bdev, d->mode, 0, BDEV_RAW);
> + if (r) {
> + bdput(d->bdev);

*blam*
failing blkdev_get() does bdput() itself.

> +
> + if (sscanf(path, "%x:%x", &major, &minor) == 2) {
> + /* Extract the major/minor numbers */
> + dev = mk_kdev(major, minor);
> + } else {
> + /* convert the path to a device */
> + if ((r = lookup_device(path, &dev)))
> + return r;
> + }
>
> What do you need the major/minor version for?

... and in any case, both branches should result in struct block_device *
(the former - via bdget(MKDEV(...));)

> + switch (command) {
> + case BLKGETSIZE:
> + l_size = (long) size;
> + if (copy_to_user((void *) a, &l_size, sizeof(long)))
> + return -EFAULT;
> + break;

> These two are in generic code and don't need to be implemented
> by a lowlevel driver (won't ever be called).

Not only that, but BLKGETSIZE above is missing overflow check.
(these two are still called, with generic version called if we
get -EINVAL; with patches submitted to Linus they won't be even
tried).
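
The missing check can be sketched in userspace as follows. This is an illustration under stated assumptions: uint32_t models a 32-bit long, the helper names are hypothetical, and returning -EFBIG is a choice made for the sketch rather than something stated in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Report a size in 512-byte sectors through a 32-bit result,
 * refusing values that do not fit instead of truncating them.
 * On failure *out is left untouched.
 */
static int size_to_u32(uint64_t sectors, uint32_t *out)
{
	assert(out);
	if (sectors > UINT32_MAX)
		return -EFBIG;
	*out = (uint32_t)sectors;
	return 0;
}

/* Size in bytes, as a 64-bit query would report it. */
static uint64_t size_in_bytes(uint64_t sectors)
{
	return sectors << 9;
}
```

Without the range check, a device of 2^32 sectors or more would silently report a truncated size through the narrower interface.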

2002-10-14 22:23:11

by Joe Thornber

Subject: Re: Linux v2.5.42

Christoph,

Thanks for finding time to go through the code. I've just dropped a
lot of patches in

http://people.sistina.com/~thornber/patches/2.5-unstable/

which address the following points of yours (more email after the list):

12.patch
Leave checking for a NULL pointer to the free functions.

13.patch
Use the ARRAY_SIZE() macro

14.patch
Don't reimplement the BLKGETSIZE ioctls

15.patch
Run source through unexpand(1)

16.patch
Use atomic_set rather than casting ATOMIC_INIT()

17.patch
Split dm_set_ro(md, flag) into dm_set_ro(md) and dm_set_rw(md).

18.patch
Move header files out of dm.h to the sources that really need them.

19.patch
Formatting.

20.patch
Remove the dm_flag functions and use the standard bitop ones instead.

21.patch
No need to use __init in a declaration. Remove inclusion of
linux/init.h

22.patch
No need for MOD_INC_USE_COUNT etc.

Move root check to ioctl fn, rather than open.

23.patch
No need to bdput after a failed blkdev_get.


On to the slightly more interesting points:

On Mon, Oct 14, 2002 at 08:21:58PM +0100, Christoph Hellwig wrote:
> 01.patch
>
> Looks fine, but I wonder whether we really want the
> zeroing in kernel mode (yes, I know userspace calloc
> does it)

OK, if we don't zero we'd better not call it calloc. Any preference for
the new name?

> +/*-----------------------------------------------------------------
> + * Code for looking up the device by kdev_t.
> + *---------------------------------------------------------------*/
> +static struct hash_cell *__get_dev_cell(kdev_t dev)
> +{
> + struct list_head *tmp;
> + struct hash_cell *hc;
> + unsigned int h = hash_dev(dev);
> +
> + list_for_each (tmp, _dev_buckets + h) {
> + hc = list_entry(tmp, struct hash_cell, list);
> + if (kdev_same(hc->md->dev, dev))
> + return hc;
> + }
> +
> + return NULL;
> +}
>
> As the argument is purely a hash value I'd suggest to
> use a dev_t. Maybe pass in a struct block_device for
> consistency.

I'm trying to keep dev_ts strictly within the interface (dm-ioctl in
this case). So dm-ioctl will get the dev_t from the ioctl args and
then convert it to a kdev_t for the look up. I'll think more about
this, I am going to remove kdev_ts from all but dm-ioctl.c and
dm-hash.c at some point (ie. when I can).

> +/*
> + * Convert a device path to a kdev_t.
> + */
> +int lookup_device(const char *path, kdev_t *dev)
> +{
> + int r;
> + struct nameidata nd;
> + struct inode *inode;
> +
> + if ((r = path_lookup(path, LOOKUP_FOLLOW, &nd)))
> + return r;
> +
> + inode = nd.dentry->d_inode;
> + if (!inode) {
> + r = -ENOENT;
> + goto out;
> + }
> +
> + if (!S_ISBLK(inode->i_mode)) {
> + r = -EINVAL;
> + goto out;
> + }
> +
> + *dev = inode->i_rdev;
> +
> + out:
> + path_release(&nd);
> + return r;
> +}
>
> What about resolving directly to a struct block_device?
> And yes, this name -> struct block_device thing is duplicated
> a few times. Al & I need to look into factoring out.

Agreed. I'll look at this tomorrow when I'm more awake.


> + * Open a device so we can use it as a map destination.
> + */
> +static int open_dev(struct dm_dev *d)
> +{
> + int r;
> +
> + if (d->bdev)
> + BUG();
> +
> + if (!(d->bdev = bdget(kdev_t_to_nr(d->dev))))
> + return -ENOMEM;
> +
> + r = blkdev_get(d->bdev, d->mode, 0, BDEV_RAW);
> + if (r) {
> + bdput(d->bdev);
> + return r;
> + }
> +
> + return 0;
> +}
>
> bd_claim is missing..

I don't think so, we don't want to claim the whole device, only the
sectors we are using. Another md table might be using other parts of
the dev.

> +static void close_dev(struct dm_dev *d)
> +{
> + if (!d->bdev)
> + return;
> +
> + blkdev_put(d->bdev, BDEV_RAW);
> + d->bdev = NULL;
> +}
>
> And bd_unclaim here.

Ditto

> + if (sscanf(path, "%x:%x", &major, &minor) == 2) {
> + /* Extract the major/minor numbers */
> + dev = mk_kdev(major, minor);
> + } else {
> + /* convert the path to a device */
> + if ((r = lookup_device(path, &dev)))
> + return r;
> + }
>
> What do you need the major/minor version for?

Someone wanted to specify major/minor pairs in the tables provided to
dmsetup rather than a path.
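
For illustration, the same two-way parse can be sketched in userspace.
This is an analogue, not the dm code: parse_device() is a hypothetical
name, stat() stands in for the in-kernel path lookup, and makedev() from
glibc's <sys/sysmacros.h> stands in for mk_kdev(). The hex "%x:%x"
format is taken from the quoted patch.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Accept either "major:minor" (both hex) or a path to a block device
 * node, and produce a dev_t. Returns 0 or a negative errno value.
 */
static int parse_device(const char *spec, dev_t *dev)
{
	unsigned int major_nr, minor_nr;
	struct stat st;

	if (sscanf(spec, "%x:%x", &major_nr, &minor_nr) == 2) {
		/* Numeric form: build the device number directly. */
		*dev = makedev(major_nr, minor_nr);
		return 0;
	}

	/* Path form: look the node up and insist on a block device. */
	if (stat(spec, &st) < 0)
		return -errno;
	if (!S_ISBLK(st.st_mode))
		return -EINVAL;

	*dev = st.st_rdev;
	return 0;
}
```

Note that the numeric form is tried first, so a relative path that
happens to look like "a:b" would be read as a major/minor pair.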

> + add_wait_queue(&md->wait, &wait);
> + while (1) {
> + set_current_state(TASK_INTERRUPTIBLE);
> +
> + if (!atomic_read(&md->pending))
> + break;
> +
> + yield();
> + }
> +
> + current->state = TASK_RUNNING;
> + remove_wait_queue(&md->wait, &wait);
>
> Hmm, the yield() looks strange and INTERRUPTIBLE without
> a check for signals, too. Switch to wait_event_interruptible?

Also agreed, will look at tomorrow.

>
> +int dm_resume(kdev_t dev)
>
> struct block_device?

I changed these this afternoon to take a struct mapped_device the same
as the other dm.c functions (see 11.patch).

> +static inline char *dm_strdup(const char *str)
> +{
> + char *r = kmalloc(strlen(str) + 1, GFP_KERNEL);
> + if (r)
> + strcpy(r, str);
> + return r;
> +}
>
> What about the following in kernel.h instead?:

I'm wary of including linux/slab.h in kernel.h.

> 04.patch
>
> Look fine. I wonder whether they want to be separate modules?

I decided not, they're tiny, and nobody is likely to want to run dm
without them.


> + r = devfs_generate_path(_dm_misc.devfs_handle, rname + 3,
> + sizeof rname - 3);
> + if (r == -ENOSYS)
> + return 0; /* devfs not present */
> +
> + if (r < 0) {
> + DMERR("devfs_generate_path failed for control device");
> + goto failed;
> + }
> +
> + strncpy(rname + r, "../", 3);
> + r = devfs_mk_symlink(NULL, DM_DIR "/control",
> + DEVFS_FL_DEFAULT, rname + r, &_ctl_handle, NULL);
>
> Looks a bit crude. Why do you need this symlink?

This just links /dev/mapper/control -> /dev/misc/device-mapper, I
think it's neater that way. If it looks crude blame devfs.

>
> + __kernel_dev_t dev; /* in/out */
>
> Hmm. Can't you just do every ioctl on the actually affected
> block device node instead of the character ones?

Yes I could, the only reason I'm not is that I'm keeping all the
interface stuff completely separate in dm-ioctl.c.

> And I must admit I don't really like the ioctl interface. But at least
> it's separated out properly.

Nobody likes ioctl interfaces. About a year ago we had a filesystem
interface to device mapper instead, however I thought there would be
more opposition to that approach so we switched to the nasty ioctl
interface. If l-k can agree on a better interface method I'd be happy
to write a new interface module.

- Joe

2002-10-14 22:21:37

by Bernd Eckenfels

[permalink] [raw]
Subject: Re: Linux v2.5.42

In article <[email protected]> you wrote:
> Umm, you consider moving code from fs/partitions/*.c to drivers/evms/*.c
> a cleanup?

Actually no, but I do consider a registration interface for partition
handlers a cleanup. I must admit I am not up to date with 2.5, though.

Personally I would love to see the device mapper stuff as the foundation for
evms. Hopefully those teams may meet in the middle :)

I am afraid partition handling is most of the time needed for root file
systems, so this will put more need for initrd solutions into the picture.

Greetings
Bernd

2002-10-15 02:44:24

by Paul McKenney

[permalink] [raw]
Subject: Re: Linux v2.5.42


On Sat, Oct 12, 2002 at 01:46:17PM +0000, Christoph Hellwig wrote:
> Even if those existing users don't get in yet I don't want to miss the
> infrastructure in the 2.6 series.

One important thing to note: all of the RCU patches
we have constructed have shown benefit. Since we
have not been very selective in choosing what patches
to generate, it seems a reasonable guess that other
parts of Linux would benefit as well. There is no
shortage of read-mostly data structures in Linux!

Thanx, Paul


2002-10-15 17:33:03

by Mark Peloquin

[permalink] [raw]
Subject: Re: Linux v2.5.42


Linus,

You have undoubtedly heard from many people their
opinions of some of the disputed design issues of
EVMS. I would very much like to hear what your
opinions are on the following points:

1) In-kernel vs user space discovery

Today, EVMS employs in-kernel discovery. Some
members of the kernel community have expressed
their desire to rip all existing kernel code for
discovery (of partitions, etc) and move this
support into user space.

Do you agree that having only user space
discovery is a good idea? And if so, will this be
a requirement for 2.6?

2) Separate vol. mgmt. subsystems vs.
Integrating vol. mgmt into the block layer

Christoph's new proposal seems to be to
consolidate all vol. mgmt. into a new block
layer/device interface. In the long term, this
might be the right direction to go. However,
this does not seem likely to be completed in
the 2.5/2.6 timeframe. What will be used for
volume management until then?

Assuming compatible metadata formats, it seems
that MD, LVM, and EVMS as separate components
could be used until such a common infrastructure
was in place. At that point, the existing drivers
could be migrated to using the new infrastructure.

3) Generic access to intermediate (storage)
points that comprise a volume.

One of EVMS' original design goals was to
abstract internals of the composition of volumes
from the user and the block layer. There were
benefits in doing this, primarily in not wasting
some limited resources (ie. majors and minors),
as well as some memory. Each EVMS volume is
exported as a standard block device, but the
internal composition of that volume is considered
EVMS private data and not exposed directly to
the outside world. Thus member elements of a
volume are not accessible through the generic
block device operations.

Other (potential volume manager) implementations
have typically represented member elements as
independent block devices. Each being accessible
through the generic block device operations.

What's your opinion on the abstraction that EVMS
currently provides?

Mark


2002-10-24 11:39:04

by Alexander Kellett

[permalink] [raw]
Subject: Re: Linux v2.5.42

On Sun, Oct 13, 2002 at 05:57:06PM -0200, Rik van Riel wrote:
> All you need is:
> 1) a kernel level driver that can map devices, ie. a device mapper
> 2) user space tools that can parse the volume metadata and tell the
> kernel how to map each chunk at initialisation or mount time

Stupid user question here: does the dm stuff make
VMware partition mounts easy without needing
all the nbd overhead, or would the mappings be
so large that they negate the decrease in nbd
overhead?

Alex
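
To make the "map each chunk" idea quoted above concrete, here is a
minimal userspace sketch of a table of linear mappings and a lookup over
it. It is not the device-mapper implementation: struct linear_target,
map_sector() and the dev_id field are hypothetical, and the table is
assumed to be sorted by start with non-overlapping entries. It does not
answer the VMware question; it only shows that with a sorted table the
lookup cost grows with the logarithm of the number of entries.

```c
#include <stddef.h>
#include <stdint.h>

/* One contiguous run of logical sectors mapped onto a backing device. */
struct linear_target {
	uint64_t start;		/* first logical sector covered */
	uint64_t len;		/* number of sectors covered */
	int dev_id;		/* hypothetical id of the backing device */
	uint64_t dev_offset;	/* sector on that device where the run begins */
};

/*
 * Binary search for the entry covering 'sector'.
 * Returns 0 and fills dev_id/dev_sector, or -1 if nothing covers it.
 */
static int map_sector(const struct linear_target *table, size_t count,
		      uint64_t sector, int *dev_id, uint64_t *dev_sector)
{
	size_t lo = 0, hi = count;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		const struct linear_target *t = &table[mid];

		if (sector < t->start) {
			hi = mid;		/* look in the lower half */
		} else if (sector - t->start >= t->len) {
			lo = mid + 1;		/* look in the upper half */
		} else {
			*dev_id = t->dev_id;
			*dev_sector = t->dev_offset + (sector - t->start);
			return 0;
		}
	}

	return -1;
}
```

For example, a table with { 0, 100, 1, 2048 } and { 100, 50, 2, 0 } sends
logical sector 120 to sector 20 of device 2.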