2007-08-02 10:21:27

by K. Ernel

Subject: VIA EPIA EK: strange eth dev numbering


hi,

I see a strange numbering of ethernet devices with a VIA EPIA EK
board. This board has two ethernet connectors, you can see it
here: http://www.via.com.tw/en/products/mainboards/motherboards.jsp?motherboard_id=420

I configured the system such that via-rhine is loaded as a module.
When doing the module load, the kernel says:

via-rhine.c:v1.10-LK1.4.3 2007-03-06 Written by Donald Becker
via-rhine: Broken BIOS detected, avoid_D3 enabled.
ACPI: PCI Interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 17
eth0: VIA Rhine III at 0x1d000, 00:40:63:ee:96:56, IRQ 17.
eth0: MII PHY found at address 1, status 0x7869 advertising 05e1 Link 45e1.
ACPI: PCI Interrupt Link [ALKD] enabled at IRQ 23
ACPI: PCI Interrupt 0000:00:12.0[A] -> Link [ALKD] -> GSI 23 (level, low) -> IRQ 18
eth1: VIA Rhine II at 0x1ec00, 00:40:63:ee:96:55, IRQ 18.
eth1: MII PHY found at address 1, status 0x7849 advertising 05e1 Link 0000.

but, when continuing, no such eth0 or eth1 device is present (and thus, dhcpcd will
not receive any address).

an "ifconfig eth0" (or eth1) will report:

localhost ~ # ifconfig eth0
eth0: error fetching interface information: Device not found
localhost ~ # ifconfig eth1
eth1: error fetching interface information: Device not found

it took a while until, on a gut feeling, I tried "ifconfig -a",
and surprise, surprise, the ethernet devices were in fact there, *but* their
names were eth2 and eth3.

doing an "ifconfig eth2 192.168....." in fact "upped" the interface, and I was
able to work with the net. after ifconfig, the kernel would say:

eth2: link up, 100Mbps, full-duplex, lpa 0x45E1

and ifconfig -a:

localhost ~ # ifconfig -a
eth2      Link encap:Ethernet  HWaddr 00:40:63:EE:96:56
          inet addr:192.168.1.23  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16713 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10522 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:18133391 (17.2 Mb)  TX bytes:1176276 (1.1 Mb)
          Interrupt:17 Base address:0xd000

eth3      Link encap:Ethernet  HWaddr 00:40:63:EE:96:55
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:18 Base address:0xec00

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

localhost ~ #


pretty strange?! I dont think this is the correct behaviour, is it?

cheers,
herp


2007-08-02 10:26:54

by Michael Tokarev

Subject: Re: VIA EPIA EK: strange eth dev numbering

Herbert Rosmanith wrote:
> hi,

Hello.

[]
> When doing the module load, the kernel says:
> eth0: VIA Rhine III at 0x1d000, 00:40:63:ee:96:56, IRQ 17.
> eth0: MII PHY found at address 1, status 0x7869 advertising 05e1 Link 45e1.
> eth1: VIA Rhine II at 0x1ec00, 00:40:63:ee:96:55, IRQ 18.
> eth1: MII PHY found at address 1, status 0x7849 advertising 05e1 Link 0000.
[]
> it took a while until, on a gut feeling, I tried "ifconfig -a",
> and surprise, surprise, the ethernet devices were in fact there, *but* their
> names were eth2 and eth3.
[]
> pretty strange?! I dont think this is the correct behaviour, is it?

Strange or not, correct or not - depends on the point of view.

The key word here is "udev" - check your udev rules. Since some time
ago udev on some distros comes with rules to give persistent device
names for network interfaces. Some time ago you had eth0 and eth1
with different hardware, and udev remembered this fact somewhere.
Now it sees new hardware, and gives it consecutive numbers, renaming
kernel devices.

/mjt

2007-08-02 10:43:30

by K. Ernel

Subject: Re: VIA EPIA EK: strange eth dev numbering

>
> Strange or not, correct or not - depends on the point of view.
>
> The key word here is "udev" - check your udev rules. Since some time
> ago udev on some distros comes with rules to give persistent device
> names for network interfaces. Some time ago you had eth0 and eth1
> with different hardware, and udev remembered this fact somewhere.
> Now it sees new hardware, and gives it consecutive numbers, renaming
> kernel devices.

hu. where are the days when eth0 was eth0 ...

this means that, when I exchange a network card and replace it with a
new one, it will cost an "ethX" device name each time.

FYI, distribution is latest Gentoo.

as a friend of mine puts it:

"Haha das is geil, das is wie bei Windows"
"Wo er merkt, dass du ein neues Mainboard hast :)"

which translates:

"Haha, that's awesome, that's like in Windows, where it
notices that you've got a new mainboard :)"

Bad, bad Ferdl! How dare you compare this to M$Windows ;-)

well, Michael, thanks for the help,
herp

2007-08-02 10:44:33

by Jan Engelhardt

Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 2 2007 12:20, Herbert Rosmanith wrote:
>
>I see a strange numbering of ethernet devices with a VIA EPIA EK
>board. This board has two ethernet connectors, you can see it
>here: http://www.via.com.tw/en/products/mainboards/motherboards.jsp?motherboard_id=420

Maybe udev is configured to do renaming on the interfaces?

>an "ifconfig eth0" (or eth1) will report:

Hence use `ip a`.

>pretty strange?! I dont think this is the correct behaviour, is it?

No, not strange at all. At least openSUSE ensures (or TRIES - there
are _really_ broken chips out there that change their MAC every boot)
that each card has the same interface name regardless of load order etc.




Jan
--

2007-08-02 10:49:26

by Jan Engelhardt

Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>
>hu. where are the days when eth0 was eth0 ...

If you and/or your distribution accidentally or incidentally loaded modules in
the wrong order (which may happen in e.g. parallel-running boot scripts), you
suddenly have eth0 as eth1. Or, when you changed the PCI slots (i.e. swapped
cards around), eth0 would also suddenly become a different one. There never
*were* days when eth0 remained eth0 across such changes.

>which translates:
>
> "Haha, that's awesome, that's like in Windows, where it
> notices that you've got a new mainboard :)"

In Linux, the user recognizes he got himself a new mainboard,
e.g. when the IDE chip changed and the new module is not in the initrd.



Jan
--

2007-08-02 10:56:35

by K. Ernel

Subject: Re: VIA EPIA EK: strange eth dev numbering

>
> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
> There never *were* days when eth0 remained eth0 across such changes.

but there *were* days when eth0 was eth0, if the kernel reports it as such.
now there is no eth0 at all. if I see an "eth0" from dmesg, I expect
it to be present.

Instead, udev remembers the old MAC address in
/etc/udev/rules.d/70-persistent-net.rules

which contains the old MAC addresses, too.
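
To check which names are already taken, the rules file can be read with a few lines of Python. This is only a sketch: the rule syntax in SAMPLE_RULES is an assumption modeled on typical persistent-net rules (it may differ between udev versions), and the two 00:11:22:... addresses are made-up stand-ins for the old cards. The last two addresses are the ones from the first mail.

```python
import re

# Hypothetical sample in the style of 70-persistent-net.rules.
# The first two MAC addresses are invented placeholders for the old cards.
SAMPLE_RULES = '''
# old cards, no longer installed
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:11:22:33:44:55", NAME="eth0"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:11:22:33:44:66", NAME="eth1"
# VIA EPIA EK onboard NICs
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:40:63:ee:96:56", NAME="eth2"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:40:63:ee:96:55", NAME="eth3"
'''

# Accept both ATTR{address} and ATTRS{address} spellings (assumption).
ADDRESS_RE = re.compile(r'ATTRS?\{address\}=="([0-9A-Fa-f:]{17})"')
# NAME="..." is an assignment; NAME=="..." (a match) is deliberately not accepted.
NAME_RE = re.compile(r'NAME="([^"]+)"')


def parse_persistent_rules(text):
    """Return {mac: name} for every rule line that has both a MAC and a NAME."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        mac = ADDRESS_RE.search(line)
        name = NAME_RE.search(line)
        if mac and name:
            table[mac.group(1).lower()] = name.group(1)
    return table


if __name__ == "__main__":
    table = parse_persistent_rules(SAMPLE_RULES)
    for mac, name in sorted(table.items(), key=lambda item: item[1]):
        print(name, mac)
```

With a table like this, eth0 and eth1 are reserved for cards that are no longer in the box, which would explain why the two onboard NICs end up as eth2 and eth3.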

of course, that's a problem with gentoo, not with the kernel.

Now I know why I never trusted udev much :-/

thanks,
herp

2007-08-02 11:12:36

by K. Ernel

Subject: Re: VIA EPIA EK: strange eth dev numbering


> > cards around), eth0 would also suddenly become a different one. There never
> > *were* days when eth0 remained eth0 across such changes.
>
> but there *were* days when eth0 was eth0, if the kernel reports it as such.
> now there is no eth0 at all. if I see an "eth0" from dmesg, I expect
> it to be present.

hm, well, a thought, maybe udev should report what it is doing, like
printing "renamed eth0 to eth2", or such.
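
As an illustration of such a report, the renames can be reconstructed after the fact by matching MAC addresses: the probe lines in dmesg carry the kernel's original name, and the current name comes from the interface listing. A minimal sketch, assuming the dmesg line format shown in the first mail; find_renames and the regular expression are hypothetical helpers, not part of udev:

```python
import re

# Driver probe lines as quoted in the first mail.
DMESG = """\
eth0: VIA Rhine III at 0x1d000, 00:40:63:ee:96:56, IRQ 17.
eth1: VIA Rhine II at 0x1ec00, 00:40:63:ee:96:55, IRQ 18.
"""

# Current interface names and MACs, as shown by "ifconfig -a" in the first mail.
CURRENT = {"eth2": "00:40:63:EE:96:56", "eth3": "00:40:63:EE:96:55"}

# "<name>: ... <mac> ..." on one line; the MAC is six hex octets.
PROBE_RE = re.compile(
    r"^(\w+): .*?\b((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\b", re.M
)


def find_renames(dmesg_text, current):
    """Return 'renamed A to B' strings for NICs whose name changed."""
    by_mac = {mac.lower(): name for name, mac in current.items()}
    renames = []
    for old, mac in PROBE_RE.findall(dmesg_text):
        new = by_mac.get(mac.lower())
        if new is not None and new != old:
            renames.append("renamed %s to %s" % (old, new))
    return renames


if __name__ == "__main__":
    for line in find_renames(DMESG, CURRENT):
        print(line)
```

For the data from this thread it reports "renamed eth0 to eth2" and "renamed eth1 to eth3".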

the problem with this device renaming in my case was that other software,
in particular dhcpcd, didn't get any lease, because (obviously?) dhcpcd
on the other hand _still_ seemed to look for eth0, and thus, after
booting, there was no network configured at all.

Maybe the name "eth0" is hardcoded in the gentoo init scripts, I don't know.

Can this be considered a distribution bug?

cheers,
herp

2007-08-02 11:23:47

by Michael Tokarev

Subject: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

Herbert Rosmanith wrote:
>> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>> There never *were* days when eth0 remained eth0 across such changes.
[]
> of course, that's problem with gentoo, not with the kernel.

Whether it's a problem or not is questionable too. I mean,
ethX order depends on module loading order, or on PCI slot
number, or whatnot. So userspace (udev) tries to compensate
(sometimes its own design.. issues - module loading order
in this case). It's worse if you'll have eth0 and eth1
swapped on every boot depending on tiny module loading
time differences.

To me it'd be a problem, but I don't run udev (more,
I hate udev ;)

By the way, this very approach (renaming "new" eth0
interface to the next "free" ethX) seems to be flawed.

If I were to implement this scheme, I'd do two things
instead of the one currently done, and I'd do whatever is
currently done by udev a bit differently (but the second
half requires (minor) kernel mods):

first of all, I'd turn this behaviour off by default,
and enable it only when the user asked me to do so - say, when
a new NIC is found, ask the user what name he
wants it to be known as. *Or* choose a different
"basename" for the renamed devices. So that
in-kernel eth0 becomes, say, nicX instead of
ethX - to make things explicit. Current way is just
too confusing, when eth0 quietly becomes eth2 or
whatnot.

And second half, which is more important here, is to
always keep kernel names, and create aliases named
by user (or automatic nicX scheme). This is fundamental --
applies to every device on the system. For example,
if kernel says it has disk named "sda", it should be
accessible as /dev/sda (and /sys/block/sda, whatever),
and any alternative names ("boot disk", disk-serial-12345
etc yadda) should be symlinks in /dev. I.e., the general rule
is to remove *ALL* "NAME" statements from the udev rules files
and use "SYMLINK" instead.

For network interfaces, ifconfig -a may omit the kernel
names from the listing (but in this case, say, ifconfig -aa
should still show them), or alternatively it may show
something like

eth0 Link encap:Ethernet HWaddr XX:XX:XX:XX:XX:XX Name nic10
                                                  ^^^^^^^^^^
and both
ifconfig eth0 blablabla
and
ifconfig nic10 blablabla
will work identically.
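
A toy model of this alias rule, in Python, may make the intent clearer. The AliasTable class and its method names are invented for illustration; nothing like it exists in the kernel or in udev as far as this thread shows. The kernel name stays fixed, and any number of aliases resolve to it:

```python
class AliasTable:
    """Toy model: kernel names are fixed, user names are aliases for them."""

    def __init__(self):
        self._kernel = set()   # names handed out by the kernel, e.g. "eth0"
        self._aliases = {}     # alias -> kernel name, e.g. "nic10" -> "eth0"

    def add_device(self, kernel_name):
        self._kernel.add(kernel_name)

    def add_alias(self, alias, kernel_name):
        if kernel_name not in self._kernel:
            raise KeyError("no such device: %s" % kernel_name)
        if alias in self._kernel:
            # An alias must never hide a kernel name.
            raise ValueError("alias shadows a kernel name: %s" % alias)
        self._aliases[alias] = kernel_name

    def resolve(self, name):
        """Map either a kernel name or an alias to the kernel name."""
        if name in self._kernel:
            return name
        return self._aliases[name]


if __name__ == "__main__":
    table = AliasTable()
    table.add_device("eth0")
    table.add_alias("nic10", "eth0")
    # Both names refer to the same device.
    print(table.resolve("eth0"), table.resolve("nic10"))
```

The check in add_alias is one possible design choice: it keeps an alias from ever taking over a name that the kernel itself uses in its messages.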

I.e., the general rule is: kernel has some naming scheme,
because it has to name things in dmesg, in /sys, in /proc/partitions
etc *somehow*. And those "kernel names" should always be accessible,
and stay here at least up to the next reboot. All the rest are
"aliases", alternative names for "kernel names".

I already can see comments from udev/sysfs maintainers here: "naming
is a policy which does not belong to kernel". It's a bullshit, because
kernel too has to use SOME way to name things, and either we should
teach it to use our names EVERYWHERE (including early-boot printk()),
or accept the fact that any userspace naming (the "policy") should
be implemented as aliases for kernel names, not as renames.

(And no, things like "I/O error on SCSI device 8:32 sector XXX" is even
worse - I don't want to care which numbers are used for the devices,
I want to see which sdX it is).

/mjt

2007-08-02 11:47:32

by Jan Engelhardt

Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]


On Aug 2 2007 15:23, Michael Tokarev wrote:
>Herbert Rosmanith wrote:
>>> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>>> There never *were* days when eth0 remained eth0 across such changes.
>[]
>> of course, that's problem with gentoo, not with the kernel.
>
>To me it'd be a problem, but I don't run udev (more, I hate udev ;)
>
>By the way, this very approach (renaming "new" eth0 interface to the next
>"free" ethX) seems to be flawed.

It does not rename ethX to the "next free" one, but to a _persistent_ one.
If it were a "next free" thing, then removing a card would shuffle all
your eth around again (and invalidate your iptables rules at the same
time, to note).

>first of all, I'd turn this behaviour off by default, but only when the
>user asked me to do so - say, when a new NIC is found, ask a user what's
>the name he wants it to be known as. *Or* choosed different "basename"
>for the renamed devices. So that in-kernel eth0 becomes, say, nicX
>instead of ethX - to make things explicit. Current way is just too
>confusing, when eth0 quietly becomes eth2 or whatnot.

Remember that persistent names also need to provide means for
hot-pluggable devices. Say your eth0 was a wireless, then you surely would
_not ever_ want that on removal of eth0, all other cards step one down
(eth1,eth2,ethN->eth0,eth1,ethN-1). Unfortunately, I think it is hard (if
not that, then it's a lot of code) to distinguish coldplugged vs
hotplugged devices.

>And second half, which is more important here, is to always keep kernel
>names, and create aliases named by user (or automatic nicX scheme). This
>is fundamental -- applies to every device on the system.

This is easy. Edit /lib/udev/rename_netiface to always hand out "nicX"
regardless of whether the input device was ethX, trX, raX, wlanX or
whatever.

>For example, if kernel says it has disk named "sda", it should be
>accessible as /dev/sda (and /sys/block/sda, whatever),

Note that /dev/sda is not persistent either.

>and any alternative names ("boot disk", disk-serial-12345 etc yadda)
>should be symlinks in /dev. Ie, general rule is to remove *ALL* "NAME"
>statements from udev.conf file and use "SYMLINK" instead.

See above - make rename_netiface use nicX. (Symlinks don't exist for
netdevices.)

>For network interfaces, ifconfig -a may omit the kernel names from the
>listing (but in this case, say, ifconfig -aa should still show them), or
>alternatively it may show something like

(ifconfig has been superseded by iproute2, please use it :)

>eth0 Link encap:Ethernet HWaddr XX:XX:XX:XX:XX Name nic10
> ^^^^^^^^^^
>and both
> ifconfig eth0 blablabla
>and
> ifconfig nic10 blablabla
>will work identically.

I prefer nic10 directly over having a dual. You'd be totally lost
if syslog showed eth0 (from klog) and nic10 (from userspace).

>I already can see comments from udev/sysfs maintainers here: "naming
>is a policy which does not belong to kernel". It's a bullshit, because
>kernel too has to use SOME way to name things,

(1) The kernel starts with ethX
(2) udev renames it to something else
(3) kernel uses new name too ("ni0: link down")

> and either we should teach it to use our names EVERYWHERE (including
>early-boot printk()), or accept the fact that any userspace naming (the
>"policy") should be implemented as aliases for kernel names, not as
>renames.
>
>(And no, things like "I/O error on SCSI device 8:32 sector XXX" is even
>worse - I don't want to care which numbers are used for the devices,
>I want to see which sdX it is).

Jan
--

2007-08-02 11:47:51

by Jan Engelhardt

Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 2 2007 12:56, Herbert Rosmanith wrote:
>> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>> There never *were* days when eth0 remained eth0 across such changes.
>
>but there *were* days when eth0 was eth0, if the kernel reports it as such.
>now there is no eth0 at all. if I see an "eth0" from dmesg, I expect
>it to be present.

Wait, you forget that something may change the name. That dmesg message
from 1 second ago does not need to be valid anymore, just as anything
else in this world.


On Aug 2 2007 13:12, Herbert Rosmanith wrote:
>> > cards around), eth0 would also suddenly become a different one. There never
>> > *were* days when eth0 remained eth0 across such changes.
>>
>> but there *were* days when eth0 was eth0, if the kernel reports it as such.
>> now there is no eth0 at all. if I see an "eth0" from dmesg, I expect
>> it to be present.
>
>hm, well, a thought, maybe udev should report what it is doing, like
>printing "renamed eth0 to eth2", or such.

I think it once did with suse, but it does not right now. Worth fixing
(yet I am no udev maintainer).

>the problem with this device renaming in my case was that other software,
>in particular dhcpcd, didnt get any lease, because (obviously?) dhcpcd
>on the other hand _still_ seemed to look for eth0, and thus, after
>booting, there was no network configured at all.

So blame your distro for not integrating udev correctly with dhcp-client.
I can only speak for suse, where you define BOOTPROTO=dhcp for an
interface. Then, on /etc/init.d/network, every interface that has a
configuration file gets run, so you never see what ethX udev picked for
the day, but things still work. That's good^TM.


Jan
--

2007-08-02 12:00:51

by K. Ernel

Subject: Re: VIA EPIA EK: strange eth dev numbering


> Wait, you forget that something may change the name. That dmesg message
> from 1 second ago does not need to be valid anymore, just as anything
> else in this world.

there are many things in this world which are usually very persistent, and
people rely on their persistence. e.g. in my office, I assume the phone number
is still the same I used 1 second ago ... but what has this to do with
ethernet device names? well: I expected the names persist - like they did
without udev. I'm used to things persisting, as is probably the rest of the world :-)

cheers,
herp

2007-08-02 12:07:15

by Jan Engelhardt

Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 2 2007 14:00, Herbert Rosmanith wrote:
>> Wait, you forget that something may change the name. That dmesg message
>> from 1 second ago does not need to be valid anymore, just as anything
>> else in this world.
>
>there are many things in this world which are usually very persistent, and
>people rely on their persistence. e.g. in my office, I assume the phone number
>is still the same I used 1 second ago ... but what has this to do with
>ethernet device names? well: I expected the names persist - like they did
>without udev. I'm used that things persist, as probably the rest of the world
>:-)

Yes, your telephone number persists even if you plug it into another
jack (well, at least for my ISDN box, this holds true). Your names
did persist - no card uses eth0 (which seems to belong to another one
that is not currently installed). Note also that not the names per se
are persistent, the (name,device) tuples are.


Jan
--

2007-08-02 12:56:28

by Michael Tokarev

Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

Jan Engelhardt wrote:
> On Aug 2 2007 15:23, Michael Tokarev wrote:
>> Herbert Rosmanith wrote:
>>>> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>>>> There never *were* days when eth0 remained eth0 across such changes.
>> []
>>> of course, that's problem with gentoo, not with the kernel.
>> To me it'd be a problem, but I don't run udev (more, I hate udev ;)
>>
>> By the way, this very approach (renaming "new" eth0 interface to the next
>> "free" ethX) seems to be flawed.
>
> It does not rename ethX to the "next free" one, but to a _persistent_ one.
> If it were a "next free" thing, then removing a card would shuffle all
> your eth around again (and invalidate your iptables rules at the same
> time, to note).

Note I said ``"new" eth0'' -- i.e., when udev sees a new NIC (with a new, yet
unknown to this system, MAC address), it assigns it a "next free" (from
its persistent names list), now persistent, ethN.
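
The assignment described here can be sketched as follows. The table layout (MAC address to name) and the lowest-unused-number policy are assumptions for illustration; how the real udev helper picks the number is not stated in this thread. The two 00:11:22:... addresses are made-up stand-ins for previously seen cards.

```python
import re


def assign_name(table, mac, prefix="eth"):
    """Return the persistent name for mac, adding a new entry if needed.

    table maps lower-case MAC addresses to names and is updated in place.
    A known MAC keeps its name; an unknown MAC gets the lowest number
    that no entry in the table uses yet (assumed policy).
    """
    mac = mac.lower()
    if mac in table:
        return table[mac]
    used = set()
    for name in table.values():
        match = re.fullmatch(re.escape(prefix) + r"(\d+)", name)
        if match:
            used.add(int(match.group(1)))
    number = 0
    while number in used:
        number += 1
    table[mac] = prefix + str(number)
    return table[mac]


if __name__ == "__main__":
    # Names remembered from hardware that is no longer installed.
    table = {"00:11:22:33:44:55": "eth0", "00:11:22:33:44:66": "eth1"}
    print(assign_name(table, "00:40:63:EE:96:56"))
    print(assign_name(table, "00:40:63:EE:96:55"))
```

Note that removing a card does not free its entry in this model, so the remaining cards keep their names, which is the persistence Jan refers to.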

Shuffling the rest of the interfaces when eth0 is removed and no such persistent
naming is present is obvious, including invalidating iptables rules and
breaking dhcp configuration and other stuff. But I was referring to
something different -- below.

>> first of all, I'd turn this behaviour off by default, but only when the
>> user asked me to do so - say, when a new NIC is found, ask a user what's
>> the name he wants it to be known as. *Or* choosed different "basename"
>> for the renamed devices. So that in-kernel eth0 becomes, say, nicX
>> instead of ethX - to make things explicit. Current way is just too
>> confusing, when eth0 quietly becomes eth2 or whatnot.
>
> Remember that persistent names also need to provide means for
> hot-pluggable devices. Say your eth0 was a wireless, then you surely would
> _not ever_ want that on removal of eth0, all other cards step one down
> (eth1,eth2,ethN->eth0,eth1,ethN-1). Unfortunately, I think it is hard (if
> not that, then it's a lot of code) to distinguish coldplugged vs
> hotplugged devices.

There's really no need to distinguish them (and by the way, wireless !=
hot-pluggable. There are PCI wireless cards (non-hotplug), and there
are, say, PCMCIA or USB ethernet cards (hotplug)).

Well, I see your point here (I think) -- by assigning names to hotplug
devices from a different namespace (hpethN vs ethN for example) we will
stop shuffling *everything* (without persistent names like udev does) -
but that doesn't really help anyway because of module loading order for
example, and because non-hotplug devices (like PCI) can be missing
(and new added) too on next reboot.

What I mean is -- if I, as a user, care about interface (or other)
names when I replug my NICs (if I ever do that in the first place),
I can assign names to them explicitly (and some programs that are
running at system startup - like kudzu on redhat for example - may
just ask me when finding something new), and THOSE explicitly set
names should be persistent for sure. Preferably they will be in a
different namespace (not named as ethX but, say, lan or isp or
segment12 - user chooses the name). And nothing will break
(iptables or dhcp or whatnot) when using THOSE persistent names.

When I don't care, usually I don't have many interfaces to worry
about, either. But in this case it's to be expected that names of
existing interfaces after removing one may change.

Kernel uses SOME names for the interfaces anyway when it boots up
(more on this below). If - in case I don't care or just didn't
know (which is more often I think) about renumbering, udev by
default MAY assign persistent names like it does (using nicX
scheme), but it'd be better (IMHO anyway) that those names will
be implemented as aliases, not as rename...

>> And second half, which is more important here, is to always keep kernel
>> names, and create aliases named by user (or automatic nicX scheme). This
>> is fundamental -- applies to every device on the system.
>
> This is easy. Edit /lib/udev/rename_netiface to always hand out "nicX"
> regardless of whether the input device was ethX, trX, raX, wlanX or
> whatever.

Again, the key point here is the "alias" thing. Whatever basename will
be used is irrelevant, but I wanted to preserve BOTH names...

>> For example, if kernel says it has disk named "sda", it should be
>> accessible as /dev/sda (and /sys/block/sda, whatever),
>
> Note that /dev/sda is not persistent either.

Yes it's not, but during kernel lifetime (from boot to shutdown) it
at least MAY be persistent, unless it's a removable device which gets
re-plugged.

>> and any alternative names ("boot disk", disk-serial-12345 etc yadda)
>> should be symlinks in /dev. Ie, general rule is to remove *ALL* "NAME"
>> statements from udev.conf file and use "SYMLINK" instead.
>
> See above - make rename_netiface use nicX. (Symlinks don't exist for
> netdevices.)

And this is what's needed for the scheme to work - to be able to have
symlinks. Which name the alias will have is irrelevant, but the symlink
(alias) is important.

>> For network interfaces, ifconfig -a may omit the kernel names from the
>> listing (but in this case, say, ifconfig -aa should still show them), or
>> alternatively it may show something like
>
> (ifconfig has been superseded by iproute2, please use it :)

I know, but for many people ifconfig is still much more familiar tool.

>> eth0 Link encap:Ethernet HWaddr XX:XX:XX:XX:XX Name nic10
>> ^^^^^^^^^^
>> and both
>> ifconfig eth0 blablabla
>> and
>> ifconfig nic10 blablabla
>> will work identically.
>
> I prefer nic10 directly over having a dual. You'd be totally lost
> if syslog showed eth0 (from klog) and nic10 (from userspace).

Well, both userspace and kernelspace may log nic10(eth0) instead.

>> I already can see comments from udev/sysfs maintainers here: "naming
>> is a policy which does not belong to kernel". It's a bullshit, because
>> kernel too has to use SOME way to name things,
>
> (1) The kernel starts with ethX
> (2) udev renames it to something else
> (3) kernel uses new name too ("ni0: link down")

And now tell me please how can I connect two messages from dmesg:

eth0: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:14:5e:5d:18:26
nic10: Link is up at 100 Mbps, full duplex.

For example, I see the first line and notice the card isn't configured
properly somehow, and want to use ethtool to tweak it - I run ethtool
on eth0 and it obviously fails.

(logging renames from udev seems to be a good idea - with kern priority,
so that the two pieces will be in the same logfile, and it helps, but
only somewhat).

What I wanted to say (here with network devices, and with disk names
and everything else) is -- as long as the device is here (plugged in
but not yet unplugged), I want it to have the same "primary" name in
kernel and in userspace, so that everything - logging, sysfs, /dev,
/proc/partitions&whatnot, maybe configs - will match each other.

Once the device is replugged, it may become different kernnameN
(eth5, sdf, whatever), and again, from the point when it has been
plugged in, the "kernel" names match each other as above. If
persistent naming is configured/in effect, aliases (nic10,
bootdrive etc) "connect" to the new in-kernel name, and
everything relying on those persistent names continues working.

Well, having two (or more) names for non-device nodes (like
network cards which don't have corresponding /dev/ethX) is
probably asking for too much because it's the kernel who
should provide infrastructure for that, unlike things
like /dev/sda vs /dev/bootdrive which is purely userspace
thing. But it's not that difficult to do either and
isn't too bloated.

By the way, It'd be nice to be able to store list of names
a given device is known somewhere in /sys/device/foo/aliases
(in line with modalias, dev, and other attributes in that
directory), instead of the way udevd keeps them in
/dev/.udevdb (or somesuch -- don't remember offhand).
Just a little rw text attribute which has no meaning
for the kernel but which is settable and queriable from
userspace. But that's a very-very minor thing.

/mjt

2007-08-02 13:07:45

by Michael Tokarev

Subject: Re: VIA EPIA EK: strange eth dev numbering

Jan Engelhardt wrote:
> On Aug 2 2007 12:56, Herbert Rosmanith wrote:
>>> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>>> There never *were* days when eth0 remained eth0 across such changes.
>> but there *were* days when eth0 was eth0, if the kernel reports it as such.
>> now there is no eth0 at all. if I see an "eth0" from dmesg, I expect
>> it to be present.
>
> Wait, you forget that something may change the name. That dmesg message
> from 1 second ago does not need to be valid anymore, just as anything
> else in this world.

That was my argument - there should be no way to *change* the name, but
to give an alias(es) - entirely different thing.

Yes, if a device is replugged during that one second, it's at least
another "instance" of that device - similar to the 'ifindex' field in
the interface description (not shown by ifconfig but shown by `ip link'),
or to usb device numbers which get incremented each time one
plugs something in.

But as long as the device is connected, it should have the same
name - that's my key point. You may change its aliases as you
wish, but not the "primary name".

[]
>> the problem with this device renaming in my case was that other software,
>> in particular dhcpcd, didnt get any lease, because (obviously?) dhcpcd
>> on the other hand _still_ seemed to look for eth0, and thus, after
>> booting, there was no network configured at all.
>
> So blame your distro for not integrating udev correctly with dhcp-client.
> I can only speak for suse, where you define BOOTPROTO=dhcp for an
> interface. Then, on /etc/init.d/network, every interface that has a
> configuration file gets run, so you never see what ethX udev picked for
> the day, but things still work. That's good^TM.

Again, this is questionable - the right way to do the integration,
that is.

If - recalling my "naming scheme" with kernel ethX (which may change each
boot or even at runtime, OR may not change at all if I don't replug
devices), and nicN which is based on particular device's MAC address, --
I configured dhcp to listen on eth0, I assume it's the first network
card found by the system, whatever it is. In this case, if I replaced
the card (because previous one was faulty etc), it will continue to
work (provided no other renames were done) without renames done by
udev, and will break with current udev behaviour. But if I configured
dhcp to listen on *this* NIC with *this* serial number and MAC address,
current udev behaviour is right - the system just assumes that this
particular card isn't here (yet?) and hence dhcp shouldn't run on it.

You see - we again have two names - "first interface found by kernel"
and "this particular card with this serial number", and both of them
are useful.

Partially this issue can be solved by - say - kudzu asking for a
name if it finds new hardware (we'll answer it with the name our
replaced card had) - but such behaviour is out of the question
because system startup scripts should not generally ask "random
questions".

/mjt

2007-08-02 13:30:58

by Jan Engelhardt

Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]


On Aug 2 2007 16:56, Michael Tokarev wrote:
>>> I already can see comments from udev/sysfs maintainers here: "naming
>>> is a policy which does not belong to kernel". It's a bullshit, because
>>> kernel too has to use SOME way to name things,
>>
>> (1) The kernel starts with ethX
>> (2) udev renames it to something else
>> (3) kernel uses new name too ("ni0: link down")
>
>And now tell me please how can I connect two messages from dmesg:
>
>eth0: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:14:5e:5d:18:26
>nic10: Link is up at 100 Mbps, full duplex.

Generally, the "link is xyz" message comes directly after loading the module,
so it should be eth0 before udev gets a chance to rename it. Or maybe not -
in which case, well, you're literally fubared, and your distro should put a
"renamed A to B" into syslog.

>What I wanted to say (here with network devices, and with disk names
>and everything else) is -- as long as the device is here (plugged in
>but not yet unplugged), I want it to have the same "primary" name in
>kernel and in userspace, so that everything

Oh I think it already has a "primary name" today --

$ readlink /sys/class/net/eth0/device
../../../../../devices/pci0000:00/0000:00:04.0

there is your primary name, and your secondary name is ethZ. :)
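
The same information can be collected for every interface by walking sysfs. A small sketch, assuming the /sys/class/net/<name>/device link shown above and the per-interface "address" attribute; describe_interfaces is a hypothetical helper name, and the directory is a parameter so the function can be tried on a copy of the tree:

```python
import os


def describe_interfaces(sys_class_net="/sys/class/net"):
    """Return {interface: {"device": bus id or None, "address": MAC or None}}."""
    result = {}
    for name in sorted(os.listdir(sys_class_net)):
        base = os.path.join(sys_class_net, name)
        info = {"device": None, "address": None}

        # "device" is a symlink to the underlying bus device; purely
        # virtual interfaces such as lo do not have one.
        dev_link = os.path.join(base, "device")
        if os.path.islink(dev_link):
            info["device"] = os.path.basename(os.readlink(dev_link))

        # "address" holds the hardware (MAC) address as text.
        addr_file = os.path.join(base, "address")
        if os.path.isfile(addr_file):
            with open(addr_file) as handle:
                info["address"] = handle.read().strip()

        result[name] = info
    return result


if __name__ == "__main__":
    for name, info in describe_interfaces().items():
        print(name, info["device"], info["address"])
```

Since the bus id and the MAC address do not change when the interface is renamed, they can serve to match a current ethN against an earlier dmesg line.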


Jan
--

2007-08-02 13:36:57

by Michael Tokarev

Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

Jan Engelhardt wrote:
> On Aug 2 2007 16:56, Michael Tokarev wrote:
>>>> I already can see comments from udev/sysfs maintainers here: "naming
>>>> is a policy which does not belong to kernel". It's a bullshit, because
>>>> kernel too has to use SOME way to name things,
>>> (1) The kernel starts with ethX
>>> (2) udev renames it to something else
>>> (3) kernel uses new name too ("ni0: link down")
>> And now tell me please how can I connect two messages from dmesg:
>>
>> eth0: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:14:5e:5d:18:26
>> nic10: Link is up at 100 Mbps, full duplex.
>
> Generally, the "link is xyz" message comes directly after loading the module,
> so it should be eth0 before udev gets a chance to rename it. Or maybe not -
> in which case, well, you're literally fubared, and your distro should put a
> "renamed A to B" into syslog.

Yes, the first message is generated before udev has a chance to act.
And no, I just don't use udev, and I hope very much that it will
not become required (it is slowly becoming so - for example, some
packages on Debian (like xen) now explicitly depend
on udev - but so far I managed to satisfy this dependency by
other means).

>> What I wanted to say (here with network devices, and with disk names
>> and everything else) is -- as long as the device is here (plugged in
>> but not yet unplugged), I want it to have the same "primary" name in
>> kernel and in userspace, so that everything
>
> Oh I think it already has a "primary name" today --
>
> $ readlink /sys/class/net/eth0/device
> ../../../../../devices/pci0000:00/0000:00:04.0
>
> there is your primary name, and your secondary name is ethZ. :)

This primary name isn't at all useful - I can't ifconfig or fdisk it,
and it's not shown in log/dmesg either.

/mjt

2007-08-02 13:38:19

by Jan Engelhardt

[permalink] [raw]
Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 2 2007 17:07, Michael Tokarev wrote:
>Jan Engelhardt wrote:
>> On Aug 2 2007 12:56, Herbert Rosmanith wrote:
>>>> On Aug 2 2007 12:42, Herbert Rosmanith wrote:
>>>> There never *were* days when eth0 remained eth0 across such changes.
>>> but there *were* days when eth0 was eth0, if the kernel reports it as such.
>>> now there is no eth0 at all. if I see an "eth0" from dmesg, I expect
>>> it to be present.
>>
>> Wait, you forget that something may change the name. That dmesg message
>> from 1 second ago does not need to be valid anymore, just as anything
>> else in this world.
>
>That was my argument - there should be no way to *change* the name, but
>to give an alias(es) - entirely different thing.

Ah, in that case, make the kernel print the PCI-ID instead of the name.
(Or perhaps both together.)


As for userspace, you are free to do whatever you find most useful.



Jan
--

2007-08-02 13:44:17

by K. Ernel

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

> >And now tell me please how can I connect two messages from dmesg:
> >eth0: Tigon3 [partno(BCM95721) rev 4201 PHY(5750)] (PCI Express) 10/100/1000Base-T Ethernet 00:14:5e:5d:18:26
> >nic10: Link is up at 100 Mbps, full duplex.
>
> Generally, the "link is xyz" message comes directly after loading the module,
> so it should be eth0 before udev gets a chance to rename it. Or maybe not -
^^^^^^^^^^^^

you mean like this:
eth0: MII PHY found at address 1, status 0x7869 advertising 05e1 Link 45e1.
...
eth2: link up, 100Mbps, full-duplex, lpa 0x45E1

with eth0 really being (same MAC) == eth2.

> in which case, well, you're literally fubared, and your distro should put a
> "renamed A to B" into syslog.

I think it would be helpful.

/herp

2007-08-02 14:37:27

by Ondrej Zajicek

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

On Thu, Aug 02, 2007 at 05:36:45PM +0400, Michael Tokarev wrote:
> not become required (it is slowly becoming - for example, some
> packages on Debian (like xen for example) now explicitly depends
> on udev - but so far I managed to satisfy this dependency by
> other means).

udev is not the problem - the problem is the crazy default configuration
of udev used by some distros. If you use udev with a simple, sane
configuration, then it just creates device nodes based on kernel
names.

--
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: [email protected], jabber: [email protected])
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."


2007-08-02 14:52:04

by Ondrej Zajicek

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

On Thu, Aug 02, 2007 at 01:47:23PM +0200, Jan Engelhardt wrote:
> It does not rename ethX to the "next free" one, but to a _persistent_ one.
> If it were a "next free" thing, then removing a card would shuffle all
> your eth around again (and invalidate your iptables rules at the same
> time, to note).

It is questionable what is _persistent_. MAC-based names are persistent
with regard to adding and removing other cards; 'plain' names are persistent
with regard to replacing that card with a different item (of the same kind).

I am very happy that (using 'plain' names) I can send a technician to
replace a broken NIC in our routers without needing a configuration
change.
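
A toy model of the two kinds of persistence, not how udev or the kernel implement either one. The first MAC and PCI address are taken from the via-rhine log at the top of the thread. The replacement card's MAC, the `Nic` class and the two lookup functions are invented for illustration, and the slot-keyed table only stands in for "same slot, same name".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nic:
    mac: str       # hardware address, changes when the card is replaced
    pci_addr: str  # slot address, stays the same when a card is swapped in place

def name_by_mac(nic, table):
    """MAC-based: look the name up in a persistent MAC -> name table."""
    return table.get(nic.mac)  # None means "unknown card, needs a new entry"

def name_by_slot(nic, table):
    """Slot-based: look the name up by PCI address instead."""
    return table.get(nic.pci_addr)

mac_table = {"00:40:63:ee:96:56": "eth0"}
slot_table = {"0000:00:0d.0": "eth0"}

old = Nic("00:40:63:ee:96:56", "0000:00:0d.0")
new = Nic("00:40:63:aa:bb:cc", "0000:00:0d.0")  # hypothetical replacement, same slot

print(name_by_mac(old, mac_table), name_by_mac(new, mac_table))
print(name_by_slot(old, slot_table), name_by_slot(new, slot_table))
```

The replacement card keeps its name under the slot-keyed lookup and is unknown to the MAC-keyed one, which is the router case described above.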

--
Elen sila lumenn' omentielvo

Ondrej 'SanTiago' Zajicek (email: [email protected], jabber: [email protected])
OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, wwwkeys.pgp.net)
"To err is human -- to blame it on a computer is even more so."

2007-08-02 18:38:00

by Jan Engelhardt

[permalink] [raw]
Subject: Re: VIA EPIA EK: strange eth dev numbering

Hey,


I know I have seen my kernel outputting "A renamed to B". Since you two
wanted that information in the first place, I grepped around a bit
and found (drumroll) that the SUSE kernel has carried a proper patch
for this for quite some time [I can't remember how long]. (At least
one distro does the right thing.) I copied it below.


Jan
===
Name: patches.suse/nameif-track-rename.patch
Subject: [PATCH] keep track of network interface renaming
From: Olaf Hering <[email protected]>

Keep track of which network interface names were renamed after the
network device driver printed its banner. Example insanity:

honeydew:~ # dmesg| grep -Ew '(eth[0-9]|rename|renamed)'
e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth2: e1000_probe: Intel(R) PRO/1000 Network Connection
e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
dev_change_name: about to rename 'eth3' to 'eth0'
dev_change_name: about to rename 'eth3' to 'ethxx3'
eth3 renamed to ethxx3
dev_change_name: about to rename 'ethxx3' to 'eth0'
dev_change_name: about to rename 'eth0' to 'eth3'
eth0 renamed to eth3
dev_change_name: about to rename 'eth1' to 'eth2'
dev_change_name: about to rename 'eth1' to 'ethxx1'
eth1 renamed to ethxx1
dev_change_name: about to rename 'ethxx1' to 'eth2'
dev_change_name: about to rename 'eth2' to 'eth1'
eth2 renamed to eth1
dev_change_name: about to rename 'ethxx3' to 'eth0'
ethxx3 renamed to eth0
dev_change_name: about to rename 'ethxx1' to 'eth2'
ethxx1 renamed to eth2
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex



Signed-off-by: Olaf Hering <[email protected]>

net/core/dev.c | 4 ++++
1 files changed, 4 insertions(+)

--- linux-2.6.21.orig/net/core/dev.c
+++ linux-2.6.21/net/core/dev.c
@@ -822,7 +822,11 @@ int dev_change_name(struct net_device *d
 	else if (__dev_get_by_name(newname))
 		return -EEXIST;
 	else
+	{
+		if (strncmp(newname, dev->name, IFNAMSIZ))
+			printk(KERN_INFO "%s renamed to %s\n", dev->name, newname);
 		strlcpy(dev->name, newname, IFNAMSIZ);
+	}
 
 	device_rename(&dev->dev, dev->name);
 	hlist_del(&dev->name_hlist);
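
With lines of the form "A renamed to B" in the log, the swaps in the example output can be followed mechanically. This is a minimal sketch, assuming the lines look exactly like the ones the patch prints, with no timestamp or other prefix. The function name `final_names` is made up for illustration.

```python
import re

# Matches "eth3 renamed to ethxx3" but not the "about to rename" lines
RENAME_RE = re.compile(r"^(\S+) renamed to (\S+)$")

def final_names(log_lines):
    """Return {name printed at probe time: name after all renames}."""
    original_of = {}  # current name -> name the device started with
    for line in log_lines:
        m = RENAME_RE.match(line.strip())
        if not m:
            continue  # driver banner, "about to rename", link messages, ...
        old, new = m.groups()
        # Carry the original name along; a device seen for the first
        # time started out under its current name.
        original_of[new] = original_of.pop(old, old)
    return {orig: cur for cur, orig in original_of.items()}

log = """\
dev_change_name: about to rename 'eth3' to 'eth0'
eth3 renamed to ethxx3
eth0 renamed to eth3
eth1 renamed to ethxx1
eth2 renamed to eth1
ethxx3 renamed to eth0
ethxx1 renamed to eth2
e1000: eth0: e1000_watchdog_task: NIC Link is Up 100 Mbps Full Duplex
""".splitlines()

print(sorted(final_names(log).items()))
# [('eth0', 'eth3'), ('eth1', 'eth2'), ('eth2', 'eth1'), ('eth3', 'eth0')]
```

So the "eth0" in the final link-up line is the card that announced itself as eth3 in its probe banner.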

2007-08-02 22:00:52

by Kay Sievers

[permalink] [raw]
Subject: Re: VIA EPIA EK: strange eth dev numbering

On 8/2/07, Jan Engelhardt <[email protected]> wrote:
> I know I have seen my kernel outputting "A renamed to B". Since you two
> however wanted that information in the first place, I grepped a bit
> around, and actually found, (drumroll), that the SUSE kernel has had a
> proper patch for [I can't remember how long] quite some time. (At least
> one distro which does the right thing.) I copied it below.

> e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
> dev_change_name: about to rename 'eth3' to 'eth0'
> dev_change_name: about to rename 'eth3' to 'ethxx3'
> eth3 renamed to ethxx3

I just made udev write a message to the kernel log whenever it
successfully renames an interface, so dmesg will show:
<6>udev: renamed network interface eth0 to eth1

Kay

2007-08-02 22:40:13

by Jan Engelhardt

[permalink] [raw]
Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 3 2007 00:00, Kay Sievers wrote:
>On 8/2/07, Jan Engelhardt <[email protected]> wrote:
>> I know I have seen my kernel outputting "A renamed to B". Since you two
>> however wanted that information in the first place, I grepped a bit
>> around, and actually found, (drumroll), that the SUSE kernel has had a
>> proper patch for [I can't remember how long] quite some time. (At least
>> one distro which does the right thing.) I copied it below.
>
>> e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
>> dev_change_name: about to rename 'eth3' to 'eth0'
>> dev_change_name: about to rename 'eth3' to 'ethxx3'
>> eth3 renamed to ethxx3
>
>I just made udev when it successfully renamed an interface to write
>that to the kernel log, so dmesg will show:
> <6>udev: renamed network interface eth0 to eth1

I think it is helpful to integrate the suse patch rather than to patch udev
alone. This way, renames that do not involve udev also show up.

Jan
--

2007-08-02 22:47:07

by Kay Sievers

[permalink] [raw]
Subject: Re: VIA EPIA EK: strange eth dev numbering


On Fri, 2007-08-03 at 00:39 +0200, Jan Engelhardt wrote:
> On Aug 3 2007 00:00, Kay Sievers wrote:
> >On 8/2/07, Jan Engelhardt <[email protected]> wrote:
> >> I know I have seen my kernel outputting "A renamed to B". Since you two
> >> however wanted that information in the first place, I grepped a bit
> >> around, and actually found, (drumroll), that the SUSE kernel has had a
> >> proper patch for [I can't remember how long] quite some time. (At least
> >> one distro which does the right thing.) I copied it below.
> >
> >> e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
> >> dev_change_name: about to rename 'eth3' to 'eth0'
> >> dev_change_name: about to rename 'eth3' to 'ethxx3'
> >> eth3 renamed to ethxx3
> >
> >I just made udev when it successfully renamed an interface to write
> >that to the kernel log, so dmesg will show:
> > <6>udev: renamed network interface eth0 to eth1
>
> I think it is helpful to integrate the suse patch rather than to patch udev
> alone. This way, renames that do not involve udev also show up.

But if you need to swap interface names, you will see the useless
temporary device names. On SUSE, nothing else renames network interfaces
these days, so we'll probably just remove that patch. :)

Kay

2007-08-03 04:48:39

by David Lang

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

On Thu, 2 Aug 2007, Ondrej Zajicek wrote:

>
> On Thu, Aug 02, 2007 at 01:47:23PM +0200, Jan Engelhardt wrote:
>> It does not rename ethX to the "next free" one, but to a _persistent_ one.
>> If it were a "next free" thing, then removing a card would shuffle all
>> your eth around again (and invalidate your iptables rules at the same
>> time, to note).
>
> It is questionable what is _persistent_ . MAC-based names are persistent
> with regard to adding and removing of other cards, 'Plain' names are persistent
> with regard to replacing that card with different item (of a same kind).
>
> I am very happy that (using 'plain' names) i can send technician to
> replace broken NIC in our routers without need for configuration
> change.

this is a very important point, and with the distros (and many kernel
people) treating udev as a requirement this is going to bite a lot of
people.

David Lang

2007-08-03 07:46:17

by Jan Engelhardt

[permalink] [raw]
Subject: Re: VIA EPIA EK: strange eth dev numbering


On Aug 3 2007 00:49, Kay Sievers wrote:
>> I think it is helpful to integrate the suse patch rather than to patch udev
>> alone. This way, renames that do not involve udev also show up.
>
>But if you need to swap interface names, you will see the useless
>temporary device names. On SUSE, nothing else renames network interfaces
>these days,

Yeah on SUSE on SUSE. The _user_ may very well swap these, either by using
/usr/sbin/ifrename (which is not even from udev!), or doing the magic himself
(ioctls, netlink, whatever is needed).

Also, it may happen that further things pop up during the two renames, e.g.

eth0 renamed to ethxx0
ethxx0: link up
ethxx0 renamed to foo0

If only udev were to print it:

eth0: RealTek 8139 whatever
ethxx0: link up
eth0 renamed to foo0

and you think "what's ethxx0?".


Jan
--

2007-08-03 15:12:45

by Stefan Richter

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

[email protected] wrote:
> On Thu, 2 Aug 2007, Ondrej Zajicek wrote:
>> On Thu, Aug 02, 2007 at 01:47:23PM +0200, Jan Engelhardt wrote:
>>> It does not rename ethX to the "next free" one, but to a _persistent_ one.
>>> If it were a "next free" thing, then removing a card would shuffle all
>>> your eth around again (and invalidate your iptables rules at the same
>>> time, to note).
>>
>> It is questionable what is _persistent_ . MAC-based names are persistent
>> with regard to adding and removing of other cards, 'Plain' names are persistent
>> with regard to replacing that card with different item (of a same kind).
>>
>> I am very happy that (using 'plain' names) i can send technician to
>> replace broken NIC in our routers without need for configuration
>> change.
>
> this is a very important point, and with the distros (and many kernel
> people) treating udev as a requirement this is going to bite a lot of
> people.

Two notes:

1. Udev doesn't restrict you to any one naming scheme. If you want
something else than a MAC based scheme, e.g. a PCI topology based
scheme, udev most certainly can do that for you. But the kernel can't.

2. Consider udev a kernel extension in userspace, with the benefit of
configurability and scriptability, features that kernel extensions in
kernelspace can't offer. Of course this gain of features doesn't come
at zero cost: You need a minimal userspace environment at boot time.

Quoting myself from http://marc.info/?l=linux-scsi&m=118613786003162:

There is a variety of possible naming schemes:

- Naming by order of discovery.
- Naming by vendor/model name strings.
- Naming by universally unique identifier.
- Naming by topology.
- ...

Only the simplest of these schemes (naming by order of discovery) is
hardwired into the kernel portion of the Linux OS. The other naming
schemes are (or can be) implemented in the userland portion of the Linux OS.

There is only the most primitive naming scheme implemented in the kernel
because naming policy, like most other kinds of policy, is better left
to userland. The kernel is too restricted a framework to implement such
things. The kernel lacks runtime-configuration files, scripting
interfaces, et cetera.
--
Stefan Richter
-=====-=-=== =--- ---==
http://arcgraph.de/sr/

2007-08-04 04:36:35

by David Lang

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

On Fri, 3 Aug 2007, Stefan Richter wrote:

> [email protected] wrote:
>> On Thu, 2 Aug 2007, Ondrej Zajicek wrote:
>>> On Thu, Aug 02, 2007 at 01:47:23PM +0200, Jan Engelhardt wrote:
>>>> It does not rename ethX to the "next free" one, but to a _persistent_ one.
>>>> If it were a "next free" thing, then removing a card would shuffle all
>>>> your eth around again (and invalidate your iptables rules at the same
>>>> time, to note).
>>>
>>> It is questionable what is _persistent_ . MAC-based names are persistent
>>> with regard to adding and removing of other cards, 'Plain' names are persistent
>>> with regard to replacing that card with different item (of a same kind).
>>>
>>> I am very happy that (using 'plain' names) i can send technician to
>>> replace broken NIC in our routers without need for configuration
>>> change.
>>
>> this is a very important point, and with the distros (and many kernel
>> people) treating udev as a requirement this is going to bite a lot of
>> people.
>
> Two notes:
>
> 1. Udev doesn't restrict you to any one naming scheme. If you want
> something else than a MAC based scheme, e.g. a PCI topology based
> scheme, udev most certainly can do that for you. But the kernel can't.
>
> 2. Consider udev a kernel extension in userspace, with the benefit of
> configurability and scriptability, features that kernel extensions in
> kernelspace can't offer. Of course this gain of features doesn't come
> at zero cost: You need a minimal userspace environment at boot time.
>
> Quoting myself from http://marc.info/?l=linux-scsi&m=118613786003162:
>
> There is a variety of possible naming schemes:
>
> - Naming by order of discovery.
> - Naming by vendor/model name strings.
> - Naming by universally unique identifier.
> - Naming by topology.
> - ...
>
> Only the simplest of these schemes (naming by order of discovery) is
> hardwired into the kernel portion of the Linux OS. The other naming
> schemes are (or can be) implemented in the userland portion of the Linux OS.
>
> There is only the most primitive naming scheme implemented in the kernel
> because naming policy, like most other kinds of policy, is better left
> to userland. The kernel is a too restricted framework to implement such
> things. The kernel lacks runtime-configuration files, scripting
> interfaces, et cetera.

I understand the flexibility that this provides. Unfortunately (IMHO),
the default udev rules (or at least what many distros ship by default)
change from this simple naming scheme in a way that hides the fact from
the user. This means that many users will not even realize the change in
policy until the hardware changes and things don't act the way they were
expected to. In my case it was removing 3 quad cards from a machine and
finding that there was no eth0 on the box; instead there was an eth12.
This is fairly benign. What would have caused me significant problems
would have been having a card fail in a production box, having it
replaced, and then finding that the interfaces were now eth4-eth22
instead of eth0-eth18. Having the interfaces named differently on
different boxes with identical hardware, based on the history of what has
been plugged into the boxes in the past, is not what sysadmins expect.
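
A toy simulation of how a box ends up with eth12 and no eth0. It assumes a persistent table that pins each MAC to a name and gives unseen MACs the number after the highest one already recorded. That allocation rule, the discovery order (quad ports first, onboard NIC last) and the placeholder MAC strings are all assumptions made for illustration, not a description of any distro's actual rules.

```python
def assign_names(macs_in_discovery_order, table):
    """Name each MAC from the persistent table, adding entries for new ones."""
    names = []
    for mac in macs_in_discovery_order:
        if mac not in table:
            used = [int(name[3:]) for name in table.values()]
            # Entries are never dropped, so numbers are never reused
            table[mac] = f"eth{max(used, default=-1) + 1}"
        names.append(table[mac])
    return names

table = {}  # survives across boots, like a rules file on disk
quad_ports = [f"quad-port-{i}" for i in range(12)]  # placeholder MACs
onboard = "onboard"                                 # placeholder MAC

first_boot = assign_names(quad_ports + [onboard], table)
after_removal = assign_names([onboard], table)

print(first_boot[0], first_boot[-1])
print(after_removal)
```

The onboard NIC is pinned to eth12 on the first boot and keeps that name once the quad cards are gone, because the table remembers hardware that is no longer in the box.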

David Lang

2007-08-04 09:17:07

by Stefan Richter

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

[email protected] wrote:
> On Fri, 3 Aug 2007, Stefan Richter wrote:
>> There is a variety of possible naming schemes:
>>
>> - Naming by order of discovery.
>> - Naming by vendor/model name strings.
>> - Naming by universally unique identifier.
>> - Naming by topology.
>> - ...
>>
>> Only the simplest of these schemes (naming by order of discovery)

(which is in most cases also the scheme that's the least useful to
admins and users)

>> is hardwired into the kernel portion of the Linux OS. The other naming
>> schemes are (or can be) implemented in the userland portion of the
>> Linux OS.
...
> I understand the flexibility that this provides, unfortunately (IMHO)
> default udev rules (or at least what many distros are shipping by
> default) changes from this simple naming scheme in a way that hides the
> fact from the user. This means that many users will not even realize the
> change in policy until the hardware changes and things don't act the way
> they were expected to. In my case it was removing 3 quad cards from a
> machine and finding that there was no eth0 on the box, instead there was
> a eth12, this is fairly benign. what would have caused me significant
> problems would have been having a card fail in a production box, have it
> replaced and then found that the interfaces were now eth4-eth22 instead
> of eth0-eth18. having the interfaces named differently on different
> boxes with identical hardware based on the history of what has been
> plugged into the boxes in the past is not what sysadmins expect.

Yes, these rules by far don't fit everyone's needs. People who often
use hotpluggable NICs are probably served best by MAC address based
naming. Boxes with field replaceable but otherwise fixed NICs apparently
rather need a naming scheme based on PCI/PCIe topology. (This requires
that the topology is exposed to userspace in comparable manner across
boots and across kernel version updates.)

So, an administrator should get to choose between different well
documented naming schemes.

Also, like Jan mentioned, confusion can already be minimized by renaming
eth[0-9]+ -> net_[a-z]+ (for example, or nic[0-9]+ like Jan wrote)
rather than eth[0-9]+ -> eth[0-9]+. That way it's clearer at all times
whether the original names or names given by userspace are used.

And there should be a log message when a device was renamed. Better
yet, like Michael wrote: In case of device files for mass storage, there
is no _renaming_. Instead, udev creates _aliases_ (symlinks), and it
does so with a few different naming schemes at once so that admins or
users immediately have a choice:
$ ls /dev/disk/
by-id by-path by-uuid

Any chance that there could be aliases to network interfaces? Aliases
for device files are easy --- they live only in userspace.
--
Stefan Richter
-=====-=-=== =--- --=--
http://arcgraph.de/sr/

2007-08-04 17:10:21

by David Lang

[permalink] [raw]
Subject: Re: renaming kernel devices [was: VIA EPIA EK: strange eth dev numbering]

On Sat, 4 Aug 2007, Stefan Richter wrote:

> [email protected] wrote:
>> On Fri, 3 Aug 2007, Stefan Richter wrote:
>>> There is a variety of possible naming schemes:
>>>
>>> - Naming by order of discovery.
>>> - Naming by vendor/model name strings.
>>> - Naming by universally unique identifier.
>>> - Naming by topology.
>>> - ...
>>>
>>> Only the simplest of these schemes (naming by order of discovery)
>
> (which is in most cases also the scheme that's the least useful to
> admins and users)
>
>>> is hardwired into the kernel portion of the Linux OS. The other naming
>>> schemes are (or can be) implemented in the userland portion of the
>>> Linux OS.
> ...
>> I understand the flexibility that this provides, unfortunately (IMHO)
>> default udev rules (or at least what many distros are shipping by
>> default) changes from this simple naming scheme in a way that hides the
>> fact from the user. This means that many users will not even realize the
>> change in policy until the hardware changes and things don't act the way
>> they were expected to. In my case it was removing 3 quad cards from a
>> machine and finding that there was no eth0 on the box, instead there was
>> a eth12, this is fairly benign. what would have caused me significant
>> problems would have been having a card fail in a production box, have it
>> replaced and then found that the interfaces were now eth4-eth22 instead
>> of eth0-eth18. having the interfaces named differently on different
>> boxes with identical hardware based on the history of what has been
>> plugged into the boxes in the past is not what sysadmins expect.
>
> Yes, these rules by far don't fit everyone's needs. People who often
> use hotpluggable NICs are probably served best by MAC address based
> naming. Boxes with field replaceable but otherwise fixed NICs apparently
> rather need a naming scheme based on PCI/PCIe topology. (This requires
> that the topology is exposed to userspace in comparable manner across
> boots and across kernel version updates.)

the detection order works pretty well. we don't do parallel detection yet,
and when we do, parallel detection with serial registration (in a
defined order) will work.

yes, this can change across kernel versions, but in practice it seldom
does. having run linux on production systems since the 2.0 days (and on
personal systems since before 0.99) there have only been a handful of
times when the detection order has changed.

> So, an administrator should get to choose between different well
> documented naming schemes.
>
> Also, like Jan mentioned, confusion can already be minimized by renaming
> eth[0-9]+ -> net_[a-z]+ (for example, or nic[0-9]+ like Jan wrote)
> rather than eth[0-9]+ -> eth[0-9]+. That way it's clearer at all times
> whether the original names or names given by userspace are used.
>
> And there should be a log message when a device was renamed. Better
> yet, like Michael wrote: In case of device files for mass storage, there
> is no _renaming_. Instead, udev creates _aliases_ (symlinks), and it
> does so with a few different naming schemes at once so that admins or
> users immediately have a choice:
> $ ls /dev/disk/
> by-id by-path by-uuid
>
> Any chance that there could be aliases to network interfaces? Aliases
> for device files are easy --- they live only in userspace.

using different names for the MAC based names, and providing both names
at once would work very well.

David Lang