2000-12-22 17:49:14

by Alan

Subject: Linux 2.4.0test13pre4ac1

This is mostly so people can see what I have merged in my tree and what
has gone from it. The patch for the adventurous is in

ftp://ftp.kernel.org/pub/linux/kernel/people/alan/2.4.0test/..

Next job - merging the 2.2.18 stuff

2.4.0test13pre4-ac1
o Merge Linus pre4
o Fix 8139too signal handling and task scribble (Andrew Morton)
o Fix signal handling for usermode helper (Shuu Yamaguchi)
o Fix network register/hotplug/publish problems (Andrew Morton)
o Fix tty DoS bug (Andrew Morton)
o Fix sun3 scsi, mmu and includes (Geert Uytterhoeven)
o Remove obsolete bits for q40 (Geert Uytterhoeven)
o M68k setup update (Geert Uytterhoeven)
o Tidy m68k includes (Geert Uytterhoeven)
o Hopefully fix quotaless compile (me)
o Help for irda options question (Steven Cole)

2.4.0test13pre3-ac4
o CCISS root= table (Charles White)
o Fix frame size on toshoboe (Christian Gennerat)
o Quota fixes/updates (Jan Kara)
o Fix keyspan usb config (Hugh Blemings)
o Fix module handling in usb serial (Greg Kroah-Hartman)
o Fix sparc64 build of fusion drivers (Eddie Dost)
o Further NetROM tidies (Hans Grobler)
o Further rose fixes (Hans Grobler)
o Wireless include update (Jean Tourrilhes)
o Fix eepro module warnings (Aristeu Filho)
o Clean up config.h includes (Niels Jensen)
o Fix most of the netfilter oops cases (David Miller)

2.4.0test13pre3-ac3
o Fix the patch file. Some stuff got corrupted.

2.4.0test13pre3-ac2 adds
o Resync with the powerpc folks (Cort Dougan)
o Fix appletalk config entry (William McGonigle)
o Documentation/script fixes (Tim Waugh)
o Parport experimental label fix (Tim Waugh)
o Update credits to add Hans Grobler (Hans Grobler)
o Make uhci return the same error code as the (David Brownell)
other USB hub controllers
o Merge Fusion drivers (Steve Ralston)
o BPQ ethernet tidy (Hans Grobler)
o Updated AX.25 tidy (Hans Grobler)
o Shared memory fixes (Christoph Rohland)
o Resync mac ethernet drivers (Cort Dougan)
o Fix missing memory barrier in bootp/dhcp code (Cort Dougan)

2.4.0test13pre3-ac1 adds
o Handle TLB flush reruns caused by APIC rexmit (me)
o Fix leak in link() syscall (Christopher Yeoh)
o Fix ramfs deadlock (Al Viro)
o Fix udf deadlock (Al Viro)
o Improve parport docs (Tim Waugh)
o Document some of the macros (Tim Waugh)
o Fix ppa timing issues (Tim Waugh)
o Mark the parport fifo code as experimental (Tim Waugh)
o Resynch ppa changelog (Tim Waugh)
| Tim please double check as I got offsets
o Fix Yam driver for Linux 2.4test (Hans Grobler)
o Fix AF_ROSE sockets for 2.4 (Hans Grobler)
o Fix AF_NETROM sockets for 2.4 (Hans Grobler)
o Tidy AF_AX25 sockets for 2.4 (Hans Grobler)
o Teach kernel-doc about const (Jani Monoses)
o Add documentation to the PCI api (Jani Monoses)
o Fix inode.c documentation (Jani Monoses)
o Push Davicom support into the main tulip driver (Tobias Ringstrom)
o First block of mkiss driver fixes (Hans Grobler)
o Fix bug in VFAT short name handling (Nicolas Goutte)
o Clean up the i810 driver (Tjeerd Mulder)
o RCPCI45 PCI cleanup fixes (mark 2) (Rasmus Andersen)
o Improve the ALSxxx sound driver documentation (Jonathan Woithe)
o Fix ext2 modular build (Jeff Raubitschek)
o Fix bug in scripts/Configure.in matching (Matthew Wilcox)
o Revert accidental amifb change (Geert Uytterhoeven)
o Fix ext2 file size limiting for large files (Andreas Dilger)
o Clean up misleading indenting in partition code (James Antill)
o Update SiS video drivers (Can-Ru Yeou)
o Yamaha audio doc fix (Pavel Roskin)
o Fix ACPI driver wakeup races (David Woodhouse)
o Remove bogus asserts in 8139too driver (Jeff Garzik)
o Fix timeout problems with rocketports at 249 days
o Update acenic patches (Jes Sorensen)
o Fix i810 tco locking (me)
o Fix drm makefiles (Peter Samuelson)

2.4.0test12-ac1 adds
o ARM bootup/initd fixes (Russell King)
o Fix ymf_sb setup bug (Pavel Roskin)
o Correctly print names of md10+ (me)
[Based on code from Roberto Ragusa]
o Fix sound crashes in various drivers (Tjeerd Mulder)
o Update epic100 to new pci api (Francois Romieu)
o Fix IOC/SIOC ioctl problems in ac97 code (Dick Streefland)

To merge
o Fix Ruffian Alpha boot (Ivan Kokshaysky)
o Bridge handling patches needed for Alpha (Ivan Kokshaysky /
Richard Henderson)
o FPU emulator source set for m68k (Geert Uytterhoeven)
o Fix m68k build with rmw disabled (Geert Uytterhoeven)
o Cleanup ramdisk namespace (Jeff Garzik)
o Link correctly with ACPI on ACPI_INTERPRETER off
o Ramdisk missing blkdev_put
o Acenic update
o Epic100 update
o Support mixed pnp and legacy sb cards
o Hopefully fix the bugs in the FAT and HPFS file systems that
caused fs corruption
o Fix cramfs vanishing data bug
o Fix NLS config.in bug for SMB
o Power management locking fixes
o filemap posix compliance fix
o Fix pte handling race
o Remove unneeded inits to 0 in ide code (Bartlomiej Zolnierkiewicz)
o IDE documentation fixes (Bartlomiej Zolnierkiewicz)

Submitted to Linus
o Add firestream ATM driver (Patrick van de Lageweg)
o Add the powermac extras to the input and (Franz Sirl)
keyboard drivers
o Fix reference counting in ATM (Patrick van de Lageweg)
o Update Changes to give correct modutils rev (Steven Cole)
o Fix xconfig/menuconfig problems with config (Andrzej Krzysztofowicz)
scripts in 2.4test
o Fix kd_mksound declaration (Geert Uytterhoeven)
o Fix warning in sim710 driver (Pavel Rabel)
o Merge bttv 0.7.50 (Gerd Knorr)
o Clean it up to use pci_pci_quirks properly (me)
o SMC token ring driver update (Jay Schulist)
o Support kgcc autodetect
o Rusty's fixes/review of unsafe set_bit usage
(A few left to go)
o I2C bus driver updates (Frodo Looijaard)
o Fix pcmcia ordering on socket remove (David Woodhouse)
o Update USB documentation (Greg Kroah-Hartman)
o Tidy the tachyon 5526 driver (Rasmus Andersen)
o Clean old old compile time config stuff from (Pavel Rabel)
mad16 driver
o Tidy riscom8 and sx namespace (Jeff Garzik)
o Rename block_til_ready in generic_serial (Patrick van de Lageweg)

Merged by Linus from -ac or direct
o Add clocking option to maestro (broken laptop (me)
stuff again)
o Put back the module locking in soundcore (David Schleef)
that someone disabled
o Abyss driver cleanup (Jeff Garzik)
o Fix most of the tq changes (Mohammad A. Haque)
o DOC1000 driver fixes (David Woodhouse)
o Switch tvaudio and msp3400 to use up_and_exit (David Woodhouse)
o usb-uhci was using constants not flags for (Jeff Garzik)
pci interface
o Small fix for kdoc (Tim Waugh)
o Fix nubus build (Geert Uytterhoeven)
o atari/sun3lance update (Geert Uytterhoeven)
o Amiga gayle pcmcia fixups (Geert Uytterhoeven)
o Fixes for amiga scsi drivers (Geert Uytterhoeven)
o Simplify amiga irq handling code (Geert Uytterhoeven)
o Amiga sound/fb driver update (Geert Uytterhoeven)
o Amiga/Mac/Atari keyboard driver changes (Geert Uytterhoeven)
o Integrate atari stram with bootmem (Geert Uytterhoeven)
o Restore atafb_fix that someone deleted (Geert Uytterhoeven)
o m68k include updates for 64bit structs (Geert Uytterhoeven)
o Add driver for MVME147 onboard scsi (Geert Uytterhoeven)
o Enable Q40 ide interface (Geert Uytterhoeven)
o Replace init with initdata in places on m68k (Geert Uytterhoeven)
o MMU code changes for m68k (Geert Uytterhoeven)
o dma_addr_t and other minor updates for m68k (Geert Uytterhoeven)
o m68k ptrace update (Geert Uytterhoeven)
o Fix pmc551 when used without bugfix enabled (David Woodhouse)
o Fix endianness on ftl layer (David Woodhouse)
o Fix atm build (Markus Kossmann)
o Update 8139too driver (Jeff Garzik)
o Fix readdir returns on procfs (Matt Kraai)
o Make SET_MODULE_OWNER macro safer (Jeff Garzik)
o Hisax needed __init (Jeff Garzik)
o APM updates, fix the Dell 5000e check for APM=m (Stephen Rothwell)
o Fix module initialization oops (Keith Owens)
o Clean up Abyss driver (Jeff Garzik)
o Fix raid linking order (Neil Brown)
o Cleanup console_verbose() duplication
o Radio driver cleanups
o BTTV radio config option
o Fix qcam VIDIOCGWIN bugs
o 8390 separate tx timeout path
o Tulip crash fix on weird eeproms
o ISAPnP hang on boot port fix
o Maestro pci_enable fix
o Fix function prototype in wacom driver
o Fix SCSI / PCI dependencies (Jeff Garzik)
o m68k config fixes (Geert Uytterhoeven)
o Fix dquot overflow/recovery (Jan Kara)
o Make uid16 macros safer (Andreas Schwab)
o Fix missing Config doc and sound doc error (Thierry Vignaud)
o APM update (Stephen Rothwell)
o Fix SMP build on x86 (Steven Cole)
o Maestro ioctl locking fix (Zach Brown)
o Make console_* static inline not extern (Jeff Garzik)
o Work arounds for broken Dell laptop APM (me)
o Fix aha1542 memory scribbles (Phil Stracchino)
o Fix ide scsi printk (Geert Uytterhoeven)
o Update EATA driver and Ultrastor driver (Dario Ballabio)
o Clean up printk formatting in a few drivers (me)
o Documentation for CONFIG_TOSHIBA
o Updated version of Rusty's kernel-hacking doc
o Updated SubmittingDrivers
o Added SubmittingPatches
o Updated procfs docs
o Updated initrd docs
o Tidy network drivers module locking (Jeff Garzik)
(Some in, a few to go)
o Alpha PCI fixes (update resource not __init, (Ivan Kokshaysky)
off by one on check)
o Fix warning in rclan driver (Rasmus Andersen)
o Clean up rcpci driver (new style pci etc) (Jeff Garzik)
o Fix generic bitops bugs
o Fix pcnet32 printk problems (Vojtech Pavlik)
o Network driver check/request region fixes
o MDAcon cleanup (Pavel Rabel)
o Tidy up mad16 driver (Pavel Rabel)
o ACPI updates (Andrew Grover)
o Fix FPU emulation compile (Adam Richter)
o M68K/PPC makefile fixes (Geert Uytterhoeven)
o Work around a funny in the Solaris NFS client (Neil Brown)
o Fix building of network modules (Peter Samuelson)
o Fix media makefiles (me)

Superseded by other fixes
o Features is back to flags for compatibility (me)
o MTRR updates (36bit etc)
o Don't crash on boot with a dual cpu board holding a non intel cpu
o CS46xx update
o NFS atomic fixes (Trond Myklebust)
o Fix O_SYNC for ext2fs (Stephen Tweedie)
[ I believe so anyway ]
o Disable PMC511 driver - it's obviously broken (me)
o kbuild documentation improvements (Neil Brown)
o Fix ppa and imm hangs on io_request_lock (Tim Waugh)
o Fix pport reverse/forward logic error (Tim Waugh)
o ACPI updates (Andrew Grover)
o E820 handling fixup (Andrea Arcangeli)

Other

---
Alan Cox <[email protected]>
Red Hat Kernel Hacker
& Linux 2.2 Maintainer Brainbench MVP for TCP/IP
http://www.linux.org.uk/diary http://www.brainbench.com


2000-12-23 07:06:07

by Andrew Morton

Subject: netdevice interface changes in 2.4.0test13pre4ac1

Alan Cox wrote:
> ...
>
> 2.4.0test13pre4-ac1
> ...
> o Fix network register/hotplug/publish problems (Andrew Morton)

This patch is a step on the way to changing how netdevices are
registered. It is a significant restructuring which was
basically forced upon us by a bad race condition which the new
hotplug code exposed.

Alan has applied the core netdevice patch and a set of changes
to the tokenring drivers which make them use the new interface.

There are about eighty drivers which still need to be changed and
I'll be doing these in large lumps. fc, hippi, ethernet, etc.

I'm determined to get them *all* done, and quickly. We cannot
just half-do it and leave another back-compatibility layer in
the kernel.

If anyone wants to help (please) then please send the patches
to me so I can maintain them in nice big, orthogonal
easily-backed-out-if-necessary chunks. Thanks.

Here's a description of the problem, and the interface
changes which were made to address it:




Proposed 2.4 netdevice interface change.
Andrew Morton <[email protected]>
19 December 2000


The problem
===========

A typical netdevice's probe() function is structured like this:

	int xxx_probe()
	{
		struct net_device *dev = init_etherdev();
		...
		initialise()
		...
		dev->open = xxx_open;
		if (errcode)
			unregister_netdevice(dev);
		return errcode;
	}

The problem is, init_etherdev() calls register_netdevice() and makes
the netdevice available in the kernel namespace before it's ready to be
opened. Probe routines can easily take tens of milliseconds talking to
slow EEPROMs and MII devices so the window is large.

The problem becomes acute now we are running /sbin/hotplug from within
register_netdevice(). When /sbin/hotplug tries to open the device it
wins the race every time! The device's `open' method is NULL and the
open "succeeds", but the hardware wasn't prepared for operation. Bad.

Kernel 2.4.0-test12 has a kludge in it (dev_probe_lock) which protects
PCI devices, but that's not a fix.

The big kernel lock accidentally protects us from this race because
it's taken by both sys_init_module() and sys_ioctl(). But if a
probe() routine calls schedule() - and they all can - the BKL is
dropped and we lose.
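
The NULL-`open' failure described above can be modelled in userspace. This is
only a sketch and not kernel code: `struct race_dev', `race_dev_open()',
`race_xxx_open()' and the `hw_ready' flag are hypothetical stand-ins, and it
assumes, as the text says, that opening a device whose `open' method is still
NULL is treated as success.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; not the kernel's layout. */
struct race_dev {
	int (*open)(struct race_dev *dev);	/* NULL until probe() sets it */
	int hw_ready;				/* set once the hardware is initialised */
	int up;					/* set by a successful open */
};

/* Models the generic open path: a missing open method counts as success. */
static int race_dev_open(struct race_dev *dev)
{
	int ret = 0;

	assert(dev != NULL);
	if (dev->open)
		ret = dev->open(dev);
	if (ret == 0)
		dev->up = 1;
	return ret;
}

/* A driver open method that refuses to run on unprepared hardware. */
static int race_xxx_open(struct race_dev *dev)
{
	return dev->hw_ready ? 0 : -1;
}
```

In this model an open that arrives before `open' is assigned marks the
interface up even though `hw_ready' is still 0. Once the method is in place
the same early open is refused.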


"Old style" netdevices which statically allocate their struct
net_device are OK. They do this:

	static struct net_device yyy_dev;

	init_module()
	{
		yyy_dev.init = yyy_probe;
		register_netdev(&yyy_dev);
	}

This works fine because register_netdevice() calls yyy_probe() prior to
registering the device, and register_netdevice() calls /sbin/hotplug
_after_ the call to yyy_probe() and after registering the netdevice.


The tokenring, hippi and fc drivers do quite weird things in their
initialisation. This is not described here, but suffice to say that they
are racy and they can be fixed with the proposed new design.



The fix
=======

The correct fix is to:

1): Not make the netdevice visible in the namespace until
it's ready to accept open()s.

2): Allocate the netdevice's name ("eth0") early in
probe(), because drivers like to print it out in diagnostic
messages.

3): Not call /sbin/hotplug until the device is registered
and ready to accept open()s.

4): Minimise the number of changes which are required.

5): Not break the "Old style" drivers.

6): Provide easy 2.2 back-compatibility for those drivers
which support 2.2.

7): Change ALL the drivers quickly! Don't leave yet
another back-compatibility layer lying around.

8): Support back-compatibility for the duration of the
netdevice changes. Perhaps a couple of weeks.

9): Approximately 83 drivers need to be changed.

The new interface
=================

Netdevices gain a new state called "hidden". A hidden netdevice is not
visible in namespace lookups, but its existence reserves the interface
name to prevent duplicates or races.


We create three new API functions:

prepare_etherdev(), prepare_trdev(), prepare_fcdev(), etc.

These are similar to init_etherdev() and friends,
but the device is created in a hidden state and protocols are
not notified of the device's existence and /sbin/hotplug is
not called.

publish_netdev(struct net_device *dev)

This is a new API function altogether. It takes a
hidden netdevice and "commits" it. The netdevice is
unhidden, the protocols are notified of its existence and
/sbin/hotplug is called.

withdraw_netdev(struct net_device *dev)

This reverts the effects of prepare_etherdev() and
friends. It's called on the error path and it simply removes
the hidden netdevice from the device list. Protocols are not
notified and /sbin/hotplug is not called.
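
The hidden state and the three new calls can be sketched as a small
userspace model. Everything here is an assumption made for illustration:
`struct toy_netdev', `toy_find()', `toy_prepare()', `toy_publish()' and
`toy_withdraw()' are hypothetical names, the `notified' flag stands in for
both the protocol notification and /sbin/hotplug, and no locking is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy userspace model; names and layout are hypothetical, not kernel code. */
struct toy_netdev {
	char name[16];
	int hidden;			/* 1 = name reserved, invisible to lookups */
	int notified;			/* 1 = protocols/hotplug were told about it */
	struct toy_netdev *next;
};

static struct toy_netdev *toy_dev_list;

/* Find a device by name; hidden ones match only if include_hidden is set. */
static struct toy_netdev *toy_find(const char *name, int include_hidden)
{
	struct toy_netdev *d;

	for (d = toy_dev_list; d; d = d->next)
		if (strcmp(d->name, name) == 0 && (include_hidden || !d->hidden))
			return d;
	return NULL;
}

/* Models prepare_etherdev(): reserve the name, keep the device hidden. */
static int toy_prepare(struct toy_netdev *dev, const char *name)
{
	if (toy_find(name, 1))
		return -1;		/* name already taken, even if hidden */
	strncpy(dev->name, name, sizeof(dev->name) - 1);
	dev->name[sizeof(dev->name) - 1] = '\0';
	dev->hidden = 1;
	dev->notified = 0;
	dev->next = toy_dev_list;
	toy_dev_list = dev;
	return 0;
}

/* Models publish_netdev(): unhide the device, then notify. */
static void toy_publish(struct toy_netdev *dev)
{
	assert(dev->hidden);
	dev->hidden = 0;
	dev->notified = 1;
}

/* Models withdraw_netdev(): unlink a hidden device, notifying nobody. */
static void toy_withdraw(struct toy_netdev *dev)
{
	struct toy_netdev **p;

	assert(dev->hidden);
	for (p = &toy_dev_list; *p; p = &(*p)->next)
		if (*p == dev) {
			*p = dev->next;
			break;
		}
}
```

The point of the model is that a prepared device already owns its name, so a
second prepare of "eth0" fails, yet an ordinary lookup cannot see it until it
has been published.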


We retain these:

register_netdev()

This function remains in place. It should only be
used by "old style" drivers and by virtual devices such as
the IP/IP driver and the bonding driver.

register_netdev() registers the device in the
namespace (unhidden), notifies protocols and runs
/sbin/hotplug.

register_netdevice()

Lower-level form of register_netdev(). The device
is registered in the namespace and the protocols are
notified. /sbin/hotplug is not called.

This function must be called with the rtnl_lock
held. It is typically used by virtual devices such as
tunnelling protocols and bonding drivers.

So there is no hotplug notification when one of
these devices is registered. Should there be? If so, they
should use register_netdev().

unregister_netdev()

This function remains in place. Should only be
called for an unhidden interface. It closes the device,
removes the interface from the namespace, notifies protocols
of the disappearing interface and then calls /sbin/hotplug.

unregister_netdevice()

Lower-level form of unregister_netdev(). The
device is unregistered and then the protocols are notified,
but /sbin/hotplug is not called.


Additional details on these functions are available in the source in
kernel-doc format.

Note that /sbin/hotplug is never called under the rtnl_lock. This is
not very important at the time of writing, because /sbin/hotplug is
launched asynchronously. But it leaves open the option of running
/sbin/hotplug synchronously at some time in the future, which would be
nice.


Doomed functions:

init_etherdev(), init_trdev(), etc.

These will remain in existence until all drivers
are changed. They will still have the raciness identified
above.

They register the device in the namespace
(unhidden) and notify protocols. But /sbin/hotplug is NOT
run for these devices.

/sbin/hotplug WILL be called when these devices are
unregistered, however.


Other things:

#define HAVE_PUBLISH_NETDEV

This is for 2.2-compatible drivers. They can do this:

	#ifndef HAVE_PUBLISH_NETDEV
	#define prepare_etherdev init_etherdev
	#define publish_netdev(dev) do {} while (0)
	#define withdraw_netdev unregister_netdev
	#endif
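
A compile-anywhere sketch of how such a shim might behave on a kernel without
the new interface. It assumes the intent is that a driver is written once
against the new names and that the macros only apply where
HAVE_PUBLISH_NETDEV is not defined. The `struct net_device',
`init_etherdev()' and `unregister_netdev()' below are simplified,
hypothetical stubs, not the real 2.2 signatures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2.2-style stubs so the sketch builds outside the kernel. */
struct net_device {
	int registered;
};

static struct net_device toy_dev;

static struct net_device *init_etherdev(void)
{
	toy_dev.registered = 1;		/* 2.2: allocated and registered at once */
	return &toy_dev;
}

static void unregister_netdev(struct net_device *dev)
{
	assert(dev->registered);	/* only a registered device can go away */
	dev->registered = 0;
}

/* The shim: maps the new names onto the old calls when they are absent. */
#ifndef HAVE_PUBLISH_NETDEV
#define prepare_etherdev	init_etherdev
#define publish_netdev(dev)	do {} while (0)
#define withdraw_netdev		unregister_netdev
#endif

/* Driver code written once, against the new names. */
static int xxx_probe(int errcode)
{
	struct net_device *dev = prepare_etherdev();

	if (errcode)
		withdraw_netdev(dev);
	else
		publish_netdev(dev);
	return errcode;
}
```

With the stubs above, a successful probe leaves the device registered and a
failed one unregisters it, which is the old 2.2 behaviour reached through the
new names.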

New driver structure:

	int xxx_probe()
	{
	-	struct net_device *dev = init_etherdev();
	+	struct net_device *dev = prepare_etherdev();
		...
		initialise()
		...
		dev->open = xxx_open;
		if (errcode)
	-		unregister_netdevice(dev);
	+		withdraw_netdev(dev);
	+	else
	+		publish_netdev(dev);
		return errcode;
	}

In other words:

	int xxx_probe()
	{
		struct net_device *dev = prepare_etherdev();
		...
		initialise()
		...
		dev->open = xxx_open;
		if (errcode)
			withdraw_netdev(dev);
		else
			publish_netdev(dev);
		return errcode;
	}