2008-08-19 11:17:24

by David Miller

Subject: [GIT]: Networking


We're still chipping away at the packet scheduler layer locking
issues, but I feel that this is mostly sorted at this point.

Other highlights:

1) Fix for the NAT via loopback perf regression with GSO, by Herbert
Xu.

2) Merge in wired driver fixes via Jeff Garzik.

3) Bluetooth updates via Marcel Holtmann.

4) Wireless driver updates via John Linville.

5) Fix to namespace handling in ipv6 from Brian Haley.

6) DCCP panic fix from Gerrit Renker.

7) Fix for packet scheduler qdisc return value handling, a bug which
can cause TCP crashes.

8) Netfilter bug fixes from Patrick McHardy and co.

Please pull, thanks a lot!

The following changes since commit a7f5aaf36ded825477c4d7167cc6eb1bcdc63191:
Linus Torvalds (1):
Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/.../tip/linux-2.6-tip

are available in the git repository at:

master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6.git master

Adrian Bunk (2):
[netdrvr] uninline atl1e_setup_mac_ctrl()
ath9k: work around gcc ICEs (again)

Anders Grafström (1):
netfilter: ipt_addrtype: Fix matching of inverted destination address type

Atsushi Nemoto (1):
[netdrvr] ne: Use CONFIG_MACH_TX49XX

Ben Dooks (1):
AX88796: Fix locking in ethtool support

Brian Haley (1):
netns: Add network namespace argument to rt6_fill_node() and ipv6_dev_get_saddr()

Brice Goglin (1):
myri10ge: myri10ge_fw_name also overrides the rss firmware

Bruce Allan (7):
e1000e: Return 1 instead of a non-zero value for link up indication
e1000e: Set InterruptThrottleRate to default when invalid value used
e1000e: Use skb_copy_to_linear_data_offset introduced in 2.6.22
e1000e: Increase Tx timeout factor for 10Mbps
e1000e: increase minimum frame size allowed
e1000e: test for unusable MSI support
e1000e: remove unnecessary snippet missed in prior check_options update

Christian Lamparter (3):
p54: Fix regression due to "net: Delete NETDEVICES_MULTIQUEUE kconfig option"
p54: move p54_vdcf_init to the right place.
p54u: reset skb's data/tail pointer on requeue

David Brownell (1):
Kconfig: HSO driver bugfixes and updates

David S. Miller (18):
Merge branch 'upstream-davem' of master.kernel.org:/.../jgarzik/netdev-2.6
loopback: Remove rest of LOOPBACK_TSO code.
bnx2: Fix build with VLAN_8021Q disabled.
pkt_sched: Add 'deactivated' state.
pkt_sched: Simplify dev_deactivate() polling loop.
pkt_sched: No longer destroy qdiscs from RCU.
sch_prio: Use NET_XMIT_SUCCESS instead of "0" constant.
pkt_sched: Fix missed RCU unlock in dev_queue_xmit()
pkt_sched: Fix return value corruption in HTB and TBF.
pkt_sched: Never schedule non-root qdiscs.
pkt_sched: Don't hold qdisc lock over qdisc_destroy().
Merge branch 'master' of git://git.kernel.org/.../linville/wireless-2.6
Revert "pkt_sched: Protect gen estimators under est_lock."
Revert "pkt_sched: Add BH protection for qdisc_stab_lock."
Merge branch 'master' of git://git.kernel.org/.../holtmann/bluetooth-2.6
pkt_sched: Prevent livelock in TX queue running.

Dhananjay Phadke (6):
netxen: fix mac addr setup
netxen: fix rxbuf leak across driver reload
netxen: force link update across ifdown/ifup
netxen: fix dma watchdog
netxen: cleanup interrupt code
netxen: update driver version

Gerrit Renker (1):
dccp: Fix panic caused by too early termination of retransmission mechanism

Greg Kroah-Hartman (2):
USB: HSO: make tty_operations const
USB: HSO: minor fixes due to code review

Henrique de Moraes Holschuh (1):
rfkill: protect suspended rfkill controllers

Herbert Xu (4):
ipv4: Disable route secret interval on zero interval
loopback: Enable TSO
net: Preserve netfilter attributes in skb_gso_segment using __copy_skb_header
loopback: Drop obsolete ip_summed setting

Holger Schurig (1):
ssb: allow compilation on systems without PCI

Huang Weiyi (2):
[netdrvr] remove unnecessary #include
removed unused #include <version.h>

Ilpo Järvinen (1):
pkt_sched: remove bogus block (cleanup)

Jarek Poplawski (4):
pkt_sched: Fix unlocking in tc_ctl_tfilter()
net: Change handling of the __QDISC_STATE_SCHED flag in net_tx_action().
pkt_sched: Grab correct lock in notify_and_destroy().
pkt_sched: Add lockdep annotation for qdisc locks

Jesse Brandeburg (1):
ixgbe: add cx4 device ID

Jiri Slaby (1):
iwlwifi: fix printk newlines

Jochen Friedrich (1):
rt2x00: Fix txdone_entry_desc_flags

Jussi Kivilinna (1):
sch_prio: Use return value from inner qdisc requeue

Larry Finger (2):
b43: Fix for SPROM coding error in Linksys WMP54G (BCM4306/3)
b43: Fix for another Bluetooth Coexistence SPROM Programming error for BCM4306

Luis R. Rodriguez (1):
mac80211: remove kdoc references to IEEE80211_HW_HOST_GEN_BEACON_TEMPLATE

Marcel Holtmann (3):
[Bluetooth] Add SCO support to btusb driver
[Bluetooth] Fix userspace breakage due missing class links
[Bluetooth] Consolidate maintainers information

Mark McLoughlin (1):
tun: TUNGETIFF interface to query name and flags

Matt Carlson (6):
tg3: Add APE register access locking
tg3: Refine APE status check
tg3: Preserve register settings for DASH
tg3: Turn off ASF "driver alive" heartbeats for APE
tg3: Fix firmware event timeouts
tg3: Update version to 3.94

Michael Chan (4):
bnx2: Fix logic to setup VLAN rx tagging.
bnx2: Use proper CONFIG_VLAN_8021Q to compile the VLAN code.
bnx2: Reinsert VLAN tag when necessary.
bnx2: Update version to 1.8.0.

Michael Karcher (1):
ath5k: Don't fiddle with MSI on suspend/resume.

Mikael Pettersson (1):
ixp4xx_eth: fix dma_mapping_error() compile errors

Olivier Blin (2):
hso: fix oops in read/write callbacks
hso: fix refcounting on the ttyHSx devices

Pablo Neira Ayuso (3):
netfilter: ctnetlink: fix double helper assignation for NAT'ed conntracks
netfilter: ctnetlink: fix sleep in read-side lock section
netfilter: ctnetlink: sleepable allocation with spin lock bh

Rafael J. Wysocki (1):
sky2: Fix suspend/hibernation/shutdown regression with WOL enabled (rev. 2)

Robert Fitzsimons (1):
tlan: Fix two regressions introduced by 64bit conversion.

Ron Rindjunsky (1):
mac80211: update new sta's rx timestamp

Rusty Russell (2):
net: skb_copy_datagram_from_iovec()
tun: fallback if skb_alloc() fails on big packets

Scott Wood (1):
gianfar: Call gfar_halt_nodisable() from gfar_halt().

Stefan Buehler (1):
tg3: fix 64 bit counter for ethtool stats

Stephen Hemminger (2):
bridge: show offload settings
nf_nat: use secure_ipv4_port_ephemeral() for NAT port randomization

Vegard Nossum (1):
au1000_eth: use 'unsigned long' for irqflags

Yang Hongyang (1):
ipv6: Fix the return interface index when get it while no message is received.

matthieu Barthélemy (1):
rtl8187: Add USB ID for Netgear WG111V3

roel kluin (1):
atl1e: WAKE_MCAST 2x. 1st WAKE_UCAST?

Documentation/rfkill.txt | 5 +
MAINTAINERS | 87 +------
drivers/bluetooth/Kconfig | 10 +-
drivers/bluetooth/bt3c_cs.c | 2 +-
drivers/bluetooth/btusb.c | 282 +++++++++++++++++++-
drivers/bluetooth/hci_ldisc.c | 2 +-
drivers/bluetooth/hci_usb.c | 2 +-
drivers/bluetooth/hci_vhci.c | 2 +-
drivers/char/random.c | 1 +
drivers/net/Kconfig | 2 +-
drivers/net/acenic.c | 1 -
drivers/net/arm/ixp4xx_eth.c | 6 +-
drivers/net/atl1e/atl1e_ethtool.c | 2 +-
drivers/net/au1000_eth.c | 2 +-
drivers/net/ax88796.c | 4 +-
drivers/net/bnx2.c | 47 +++-
drivers/net/bnx2x_link.c | 1 -
drivers/net/bnx2x_main.c | 1 -
drivers/net/cpmac.c | 1 -
drivers/net/e1000e/defines.h | 2 +-
drivers/net/e1000e/e1000.h | 1 +
drivers/net/e1000e/ethtool.c | 2 +-
drivers/net/e1000e/netdev.c | 185 ++++++++++++-
drivers/net/e1000e/param.c | 25 ++-
drivers/net/gianfar.c | 6 +-
drivers/net/gianfar_sysfs.c | 1 -
drivers/net/ipg.h | 2 -
drivers/net/ixgbe/ixgbe_82598.c | 1 +
drivers/net/ixgbe/ixgbe_main.c | 4 +-
drivers/net/ixgbe/ixgbe_type.h | 1 +
drivers/net/loopback.c | 67 -----
drivers/net/myri10ge/myri10ge.c | 6 +-
drivers/net/ne.c | 4 +-
drivers/net/netxen/netxen_nic.h | 7 +-
drivers/net/netxen/netxen_nic_hw.c | 59 +++--
drivers/net/netxen/netxen_nic_init.c | 28 +-
drivers/net/netxen/netxen_nic_main.c | 210 +++++++--------
drivers/net/netxen/netxen_nic_phan_reg.h | 2 +
drivers/net/ppp_mppe.c | 1 -
drivers/net/pppol2tp.c | 1 -
drivers/net/r6040.c | 1 -
drivers/net/sh_eth.c | 1 -
drivers/net/sky2.c | 8 +-
drivers/net/tehuti.h | 1 -
drivers/net/tg3.c | 101 ++++++--
drivers/net/tg3.h | 6 +
drivers/net/tlan.c | 8 +-
drivers/net/tun.c | 105 +++++++-
drivers/net/typhoon.c | 1 -
drivers/net/usb/Kconfig | 21 +-
drivers/net/usb/hso.c | 53 +++--
drivers/net/wireless/ath5k/base.c | 9 +-
drivers/net/wireless/ath9k/hw.c | 6 +-
drivers/net/wireless/b43/main.c | 3 +-
drivers/net/wireless/ipw2100.c | 1 -
drivers/net/wireless/ipw2200.c | 1 -
drivers/net/wireless/iwlwifi/iwl-3945.c | 1 -
drivers/net/wireless/iwlwifi/iwl-4965.c | 3 +-
drivers/net/wireless/iwlwifi/iwl-5000.c | 1 -
drivers/net/wireless/iwlwifi/iwl-agn.c | 1 -
drivers/net/wireless/iwlwifi/iwl-core.c | 1 -
drivers/net/wireless/iwlwifi/iwl-eeprom.c | 7 +-
drivers/net/wireless/iwlwifi/iwl-hcmd.c | 1 -
drivers/net/wireless/iwlwifi/iwl-power.c | 1 -
drivers/net/wireless/iwlwifi/iwl-sta.c | 4 +-
drivers/net/wireless/iwlwifi/iwl-tx.c | 4 +-
drivers/net/wireless/iwlwifi/iwl3945-base.c | 7 +-
drivers/net/wireless/p54/p54common.c | 51 ++--
drivers/net/wireless/p54/p54common.h | 18 +-
drivers/net/wireless/p54/p54usb.c | 10 +
drivers/net/wireless/rt2x00/rt2x00queue.h | 8 +-
drivers/net/wireless/rt2x00/rt2x00usb.c | 1 +
drivers/net/wireless/rtl8187_dev.c | 1 +
drivers/ssb/main.c | 8 +
include/linux/if_tun.h | 1 +
include/linux/skbuff.h | 4 +
include/net/addrconf.h | 3 +-
include/net/ip6_route.h | 1 +
include/net/mac80211.h | 11 +-
include/net/sch_generic.h | 2 +-
net/bluetooth/af_bluetooth.c | 2 +-
net/bluetooth/bnep/core.c | 2 +-
net/bluetooth/hci_sysfs.c | 376 ++++++++++++++-------------
net/bluetooth/l2cap.c | 2 +-
net/bluetooth/rfcomm/core.c | 2 +-
net/bluetooth/sco.c | 2 +-
net/bridge/br_device.c | 15 +-
net/core/datagram.c | 87 ++++++
net/core/dev.c | 49 +++--
net/core/gen_estimator.c | 9 +-
net/core/skbuff.c | 12 +-
net/dccp/input.c | 12 +-
net/ipv4/netfilter/ipt_addrtype.c | 2 +-
net/ipv4/netfilter/nf_nat_proto_common.c | 8 +-
net/ipv4/route.c | 76 +++++-
net/ipv6/addrconf.c | 3 +-
net/ipv6/fib6_rules.c | 3 +-
net/ipv6/ip6_fib.c | 1 +
net/ipv6/ip6_output.c | 2 +-
net/ipv6/ipv6_sockglue.c | 4 +-
net/ipv6/ndisc.c | 2 +-
net/ipv6/route.c | 12 +-
net/ipv6/xfrm6_policy.c | 4 +-
net/mac80211/mlme.c | 2 +
net/netfilter/nf_conntrack_netlink.c | 36 ++--
net/rfkill/rfkill.c | 14 +-
net/sched/cls_api.c | 2 +-
net/sched/sch_api.c | 47 ++--
net/sched/sch_cbq.c | 2 +-
net/sched/sch_generic.c | 68 ++----
net/sched/sch_htb.c | 4 +-
net/sched/sch_prio.c | 4 +-
net/sched/sch_tbf.c | 11 +-
net/sctp/ipv6.c | 3 +-
114 files changed, 1533 insertions(+), 898 deletions(-)


2008-08-19 17:03:50

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> 114 files changed, 1533 insertions(+), 898 deletions(-)

David, this absolutely _has_ to stop.

We're after -rc3. Your network merges continue to be too f*cking large,
and this has been going on for many months now. If you cannot throttle
people, I will have to throttle you and stop pulling things.

I'm going to take this, but really - this isn't just new drivers or
something like that that you've used as an excuse for big pulls before,
this is a _lot_ of changes to existing code.

Tell your people to look at the regression list, and if it's not there,
they should stop.

I realize that this problem is partly because when I see the pull requests
from you, I effectively see a combined pull from multiple different
sources, and in that sense it's not quite as big. But the networking pulls
have _consistently_ had the problem that they keep on being big not just
after -rc3, but after -rc4 and on, and I get the distinct feeling that
you're not moving the pain downwards, and aren't telling the people under
you to keep it clean and minimal and regressions only.

For example, those BT updates looked in no way like regression fixes. So
what the f*ck were they doing there? And why do you think all those driver
updates cannot cause new regressions?

If it's not a regression fix, it shouldn't be there. It should be in the
queue for the next version. Why is that apparently so hard for the network
people?

Linus

2008-08-19 18:27:40

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Linus,

> For example, those BT updates looked in no way like regression fixes.

the Bluetooth fixes do fix one regression that broke user space
assumptions.

I included additional support for one new driver since I was under the
assumption that new driver support is fine since it can't introduce a
regression. If that has changed then please spell this out and we have
to apply this rule to all subsystems.

Also I cleaned up the MAINTAINERS file entries for Bluetooth. Are these
considered harmful now and should be postponed to the next merge window?
They can obviously not introduce any regressions?

Regards

Marcel

2008-08-19 20:15:12

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 10:03:07 -0700 (PDT)

> For example, those BT updates looked in no way like regression fixes. So
> what the f*ck were they doing there? And why do you think all those driver
> updates cannot cause new regressions?

The BT bits were the only part I really considered borderline,
and I was going to push back on Marcel.

But to be honest, I haven't seen bluetooth updates from him
for such a long time I felt that being strict here would just
exacerbate the problem.

Guess I was wrong.

2008-08-19 20:48:21

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> The BT bits were the only part I really considered borderline,
> and I was going to push back on Marcel.

I really don't see the e1000 and netxen updates as being critical either.
Sure, they look like driver improvement, but "improvement" is not what the
-rc3+ series is about.

Same goes for all the loopback changes. They look like cleanups or feature
enables.

IOW, it all looks like good commits, but quite a _lot_ of that queue looks
like good commits that should happen during the merge window, not during
the stabilization phase.

And this is by no means unique to _this_ pull request. It's been a very
clear pattern for a long time now. The networking area tends to be one of
the absolutely *most* active ones during the post-rc1 phase.

[ Yeah, in all fairness some architectures also do that, but at least I
feel like I _really_ don't need to care when I get a diffstat that only
touches arch/sh/* or something like that. ]

> But to be honest, I haven't seen bluetooth updates from him
> for such a long time I felt that being strict here would just
> exacerbate the problem.

I pointed out the BT ones as standing out (they were larger than some of
the other patches too), but I really don't think this was in any way
limited to BT in any shape, form or color. Quite frankly, looking through
the thing, my gut feel is that about _half_ the commits over-all should
probably have been in the queue for the next release.

Linus

2008-08-19 20:54:22

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, Marcel Holtmann wrote:
>
> Also I cleaned up the MAINTAINERS file entries for Bluetooth. Are these
> considered harmful now and should be postponed to the next merge window?
> They can obviously not introduce any regressions?

What I consider harmful is not any individual commit per se, but the
mindset that clearly says "hey, this particular commit is good, let's
push it up".

And all of the commits are _individually_ fine and the likelihood for
breakage is probably damn low, but when you have lots of them, that
doesn't work any more.

The whole point of the merge window is that you should be sending good,
tested commits _then_. And if you miss the merge window, then you queue
them up for the next one.

As it is, it seems like some people think that the merge window is when
you send any random crap that hasn't even been tested, and then after the
merge window you send the stuff that looks "obviously good".

How about raising your quality control a bit, so that I don't have to
berate you? Send the _obviously good_ stuff during the merge window, and
don't send the "random crap" AT ALL. And then, during the -rc series, you
don't do any "obviously good" stuff at all, but you do the "absolutely
required" stuff.

The rule should be that if you have any doubt _what-so-ever_ that
something is absolutely required, you simply don't send it during the -rc
phase. And if you have any doubt at all about something not working, you
don't send it during the merge window either!

The merge window is not for "let's get this tested, so that we can fix it
during the -rc". And the stabilization phase is not for "this one looks
obviously correct and safe".

Linus
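
One way to read the rule above is as a small decision function. The sketch below is only an interpretation, not an official kernel policy: `Phase`, `should_send`, and the two boolean parameters are hypothetical names, and treating "not sure it works" as an automatic "wait" in both phases is an assumption drawn from the last two paragraphs of the message.

```python
from enum import Enum


class Phase(Enum):
    """The two phases of a release cycle discussed in the thread."""
    MERGE_WINDOW = "merge window"
    RC = "-rc stabilization"


def should_send(phase: Phase, sure_it_works: bool, sure_it_is_required: bool) -> bool:
    """Return True if a commit should be sent upstream now, False if it should wait.

    Hypothetical model of the rule in the message above, not an official policy.
    """
    if not sure_it_works:
        # Any doubt about it working: do not send it, not even in the merge window.
        return False
    if phase is Phase.MERGE_WINDOW:
        # Good, tested commits belong in the merge window.
        return True
    # During -rc, "obviously good" is not enough; it must be absolutely required.
    return sure_it_is_required


# A tested cleanup that is "obviously good" but not required:
print(should_send(Phase.MERGE_WINDOW, True, False))  # True
print(should_send(Phase.RC, True, False))            # False
```

Under this reading, a commit that returns False during the -rc phase is not dropped; it is queued for the next merge window.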

2008-08-19 21:05:15

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 13:47:59 -0700 (PDT)

> Same goes for all the loopback changes. They look like cleanups or feature
> enables.

Those fix a performance regression reported by a real user.

Sure I did a cleanup of dead code in one of those commits,
but if you look at the commit beforehand from Herbert, the
context, you can see that it made no sense to leave that
in there any longer as half of what it was standing there
as "documenting" was removed by Herbert's commit.

2008-08-19 21:12:23

by Rafael J. Wysocki

Subject: Re: [GIT]: Networking

On Tuesday, 19 of August 2008, David Miller wrote:
> From: Linus Torvalds <[email protected]>
> Date: Tue, 19 Aug 2008 13:47:59 -0700 (PDT)
>
> > Same goes for all the loopback changes. They look like cleanups or feature
> > enables.
>
> Those fix a performance regression reported by a real user.

FWIW, they fix the recent regression tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=11316 .

Thanks,
Rafael

2008-08-19 21:16:15

by Evgeniy Polyakov

Subject: Re: [GIT]: Networking

On Tue, Aug 19, 2008 at 01:47:59PM -0700, Linus Torvalds ([email protected]) wrote:
> I really don't see the e1000 and netxen updates as being critical either.
> Sure, they look like driver improvement, but "improvement" is not what the
> -rc3+ series is about.

The netxen driver update contains bug fixes (a leak and races) and a hardware
workaround. Well, it has a driver version bump too; I agree, that
one was an error. The e1000 update contains a number of regression fixes and
a performance improvement via a module parameter change.

> Same goes for all the loopback changes. They look like cleanups or feature
> enables.

It fixes a performance regression.

--
Evgeniy Polyakov

2008-08-19 21:22:10

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 13:54:03 -0700 (PDT)

> As it is, it seems like some people think that the merge window is when
> you send any random crap that hasn't even been tested, and then after the
> merge window you send the stuff that looks "obviously good".

We are perpetuating this mind set, aren't we? I could be wrong
but this is how I see things currently.

I agree, we should be working on regression fixes now. And we should
essentially be doing so up until the merge window opens up again,
right?

When do people following those rules have time to work on new stuff?
Especially people like me who have to review and merge everyone else's
work as well as help fix bugs.

And not just subsystem maintainers like me, it's also the same for
people who are experienced, diligent, and work on fixing bugs.
That kind of work is very time consuming.

So given that, who spends a decent amount of time working on features?
People who aren't diligent working on bugs before the merge window,
and new developers, that's who.

linux-next is great, I love it, it solves all the merge hassles that
used to knock us out during the merge window and make life hell.

But it doesn't fix the time delegation problem.

There is always this "oh crap, I just spent 3 months doing nothing
but fixing bugs" feeling a lot of us core folks get right before
the merge window opens up.

So instead of getting the best work from the best people we have,
we get this last minute flurry of development in the days leading
up to the merge window opening up.

Maybe it's just a longing for the golden era of 2.${ODD}.x style
development, who knows :-)

2008-08-19 21:23:07

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> Those fix a performance regression reported by a real user.

Since when?

The thing is, I can do a

gitk v2.6.24.. drivers/net/loopback.c

as well as anybody else, and TSO has not been enabled for loopback at
least since 2.6.24. Going back to 2.6.23 (which has more changes that I
won't comment on), it looks like that LOOPBACK_TSO thing you removed was
there back then too.

So the performance regression if it happened must have been due to
something else, no?

Oh, I'm sure that enabling TSO speeds things up, but apparently it also
basically enables a code-path that hasn't been enabled since at least
2.6.23, no?

Really, David. Was the performance regression due to something else, and
then by enabling LOOPBACK_TSO it hid the problem? Or what? The thing is,
-rc3 is _not_ the point to apparently change something that hasn't been
changed in about a year (I didn't go any further back in history).

So what's going on? Do you seriously think it's a good point in time to
enable TSO for loopback after a long time of apparently _not_ being
enabled?

It smells like excuses to me. Was this really a "must be in 2.6.27" thing?

And no, it wouldn't bother me if this was a rare thing. Again, let me
repeat: the problem is not any of the individual commits _per_se_. The
problem is that the network layer stands out. And not in a good way. It
stands out as being a layer that gets a _lot_ of churn late in the -rc
game.

Linus

2008-08-19 21:27:50

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 14:21:57 -0700 (PDT)

>
>
> On Tue, 19 Aug 2008, David Miller wrote:
> >
> > Those fix a performance regression reported by a real user.
>
> Since when?

Check the regression list entry you were pointed to in another
reply.

But I'll save you some time and I'll explain the problem for you.

We enabled GSO segmentation offload, which is a software variant of
TSO we've had in the tree for ages, when a card can do scatter-gather
and checksumming offloading in HW. We do this because Lennert
Buytenhek validated with many tests that this consistently decreases
cpu utilization.

However, a user reported that if they NAT'd a remote destination port
using netfilter to a loopback addr:port, then there was a performance
degradation.

Herbert discovered the cause, which was multi-fold. And smashing the
SKB checksum and not indicating TSO capability in loopback was the end
cause.

Loopback should enable TSO for other reasons, not just to fix this
bug. If loopback says it can do TSO then the TSO packet gets passed
straight through to the receive side, and our entire stack has been
able to handle that for years.
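
The difference between the two paths described above can be illustrated with a toy model. This is a sketch, not kernel code: `gso_segment`, `transmit`, and the `device_can_tso` flag are hypothetical names, and the MSS of 1448 bytes is an assumed value chosen only for the example. It shows why a device that advertises TSO is handed one large packet, while a device that does not is handed several software-built segments.

```python
from typing import List


def gso_segment(payload: bytes, mss: int) -> List[bytes]:
    """Split one large send into MSS-sized segments in software (the GSO idea)."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


def transmit(payload: bytes, mss: int, device_can_tso: bool) -> List[bytes]:
    """Return the packets handed to the (hypothetical) device.

    If the device advertises TSO, the large packet is passed through whole;
    otherwise it is segmented in software first.
    """
    if device_can_tso or len(payload) <= mss:
        return [payload]
    return gso_segment(payload, mss)


payload = bytes(4000)  # 4000 zero bytes standing in for one large send
via_gso = transmit(payload, mss=1448, device_can_tso=False)
via_tso = transmit(payload, mss=1448, device_can_tso=True)

print([len(p) for p in via_gso])  # [1448, 1448, 1104]
print([len(p) for p in via_tso])  # [4000]
```

In this model the loopback case corresponds to `device_can_tso=True`: nothing is split on the way to the receive side.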

2008-08-19 21:29:15

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, Rafael J. Wysocki wrote:
>
> FWIW, they fix the recent regression tracked as
> http://bugzilla.kernel.org/show_bug.cgi?id=11316 .

Yeah, and the real cause was apparently another commit that *ALSO*
happened after the merge window!

Guys, you're making excuses for the problem.

The problem that triggered this bogus loopback change was commit
e5a4a72d4f88f4389e9340d383ca67031d1b8536. Look at when that one was done.

This is my whole _point_. The networking layer is doing development during
the -rc window. And you guys are making excuses for it. Wake up, guys!

Linus

2008-08-19 21:32:08

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> I agree, we should be working on regression fixes now.

Not just now. For the last two weeks, yes.

> And we should essentially be doing so up until the merge window opens up
> again, right?

Yes. But any new code should go into another branch (or delayed entirely,
but that probably doesn't work well for you guys) so that by the time the
merge window opens up, it's already ready and rearing to go, and
preferably pretty well tested too.

The problem is, you guys end up accepting a lot of stuff even after the
merge window. I know why - it's easy to do. It looks obviously fine. And
yeah, I let things slide.

The problem is, I've let things slide for a long time, and you guys don't
feel the pain.

> When do people following those rules have time to work on new stuff?

You can work on the new stuff too, but DON'T F*CKING SEND IT TO ME!

What's so hard to understand about that?

Linus

2008-08-19 21:33:22

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> Check the regression list entry you were pointed to in another
> reply.

I did. You apparently didn't do that yourself.

> We enabled GSO segmentation offload [ .. ]

yeah. And look at when that happened.

Dammit, all I ask is that you

- admit that you have a problem
- work on fixing it.

Stop the incessant excuses already.

Linus

2008-08-19 21:36:15

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 14:28:49 -0700 (PDT)

> Yeah, and the real cause was apparently another commit that *ALSO*
> happened after the merge window!
>
> Guys, you're making excuses for the problem.
>
> The problem that triggered this bogus loopback change was commit
> e5a4a72d4f88f4389e9340d383ca67031d1b8536. Look at when that one was done.
>
> This is my whole _point_. The networking layer is doing development during
> the -rc window. And you guys are making excuses for it. Wake up, guys!

That change was made under the pretext that it was tested heavily and
that if we hit any problem whatsoever with it that we couldn't fix
quickly it would be reverted.

If you look at the pull request I sent you that contained that change,
I pointed this change out as "highlight" and explained the situation,
in detail. Here it is:

4) Lennert Buytenhek did some really nice analysis on a network
device that cannot do TSO offloading in hardware. He checked
out what happens if you enable the software TSO mechanism fallback
we have in the kernel, and it improves CPU utilization tremendously.

It is safe to do this as long as the device in question can
support scatter-gather.

Herbert and I are discussing a way to do this even more efficiently
with some help from the device (currently the code has to allocate
extra sk_buff objects as we split up the TSO frame, and then do
a bunch of extra page ref counting, when all we need is some headers
and some way to say where the data portion split points are).

If this causes any problems whatsoever, it's trivial to revert this.

Did you read it? I write those for you specifically, so that you know
what changes in there are "of note" and you may want to be aware of.

But anyways, let's chalk this one up as inappropriate.

Looking through the rest of the networking changes in this
pull I see real bug fixes in all of the netxen and e1000
changes that seemed to stand out. All of the wireless stuff
looks like real bug fixes for things reported by real users.

And then there are 2 or 3 cleanups that probably could have
waited.

And then there is the Bluetooth SCO change which I agree was
borderline and I should have pushed back on.

There are simply a lot of people fixing a lot of bugs. And I have to
stay on top of it all. And I also have to be able to trust John
Linville, Jeff Garzik, Marcel, and others so that I don't have to be
checking up on them every single time. I look at what they send me,
and I do push back when I see obviously bogus stuff, but there is a
trust breakpoint.

2008-08-19 21:40:32

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 14:33:03 -0700 (PDT)

> Dammit, all I ask is that you
>
> - admit that you have a problem
> - work on fixing it.
>
> Stop the incessant excuses already.

I happily consider this example as inappropriate, sure.

But I think you're throwing the baby out with the bath water, the
majority of that pull contained legitimate real bug regression fixes.
That's why some other developers are coming out of the woods and
defending me, they don't have to do that, but they do it because they
feel I'm being slighted at least a little bit.

I don't know what to say, because I spent most of my sunny weekend
working on bug fixes as well as integrating other people's work.

2008-08-19 21:41:09

by Denys Fedoryschenko

Subject: Re: [GIT]: Networking

> Oh, I'm sure that enabling TSO speeds things up, but apparently it also
> basically enables a code-path that hasn't been enabled since at least
> 2.6.23, no?

Well, I reported a performance regression before too; it seems to be a
related case, but my report was unclear.

And the bug was terrible for me: it was causing very bad performance on
REDIRECT and loopback transfers. I am also testing recent net-2.6 with these
loopback changes, and they seem to improve things a lot for me. So it is a
really important bugfix for me too.

There is always a chance that some fix will cause a regression. I will try to
test these changes on intensive real-life workloads to make sure all is fine.

2008-08-19 21:47:23

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> That change was made under the pretext that it was tested heavily and
> that if we hit any problem whatsoever with it that we couldn't fix
> quickly it would be reverted.

David, I will say this one more time:

- as long as you concentrate on individual commits, you're missing the
big picture.

You can _always_ make excuses for individual commits. That's not my point.
Or rather, it actually very much _is_ my point. If you have the mindset
that you're looking for excuses why any individual commit is ok to merge,
then you don't end up with a couple of individual commits any more: you
end up with a LOT OF CHURN.

It's not the individual commits. You're looking at the individual trees,
and you're missing the forest. The problem isn't the individual trees. The
problem is that there's a metric sh*tload of individual trees, what we in
the tree industry call a 'forest'. You're not seeing it.

And btw, don't get me wrong - you're not the only problem spot. During the
-rc's leading up to 2.6.26, drivers/media was actually a _bigger_
problem. I happen to care less about that (the same way I care less about
some odd-ball architectures), but I have to admit that drivers/media was a
total disaster last time around.

So if it makes you feel any better, others have been even worse. But this
networking problem has been going on for quite a while.

So the problem here really is that you seem overly eager to make excuses
for individual patches. And if they _stayed_ "individual" it would all be
good. But they don't seem to.

Linus

2008-08-19 21:53:31

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Tue, 19 Aug 2008, David Miller wrote:
>
> But I think you're throwing the baby out with the bath water, the
> majority of that pull contained legitimate real bug regression fixes.

.. and notice how I

(a) took it

and

(b) am asking for you to be more careful?

In other words, I would be a lot happier if you didn't say "majority". I
would be a ton happier if you could HONESTLY say that every single one
was a regression.

And the thing is, you cannot. Some of the ones I pointed you to were
actually regressions due to _other_ patches you had much too happily sent
me after the merge window had already closed.

> That's why some other developers are coming out of the woods and
> defending me, they don't have to do that, but they do it because they
> feel I'm being slighted at least a little bit.

Umm. The only defending I have seen was a F*CKING DISGRACE, since nobody
apparently had the balls to stand up and admit that the whole problem
happened after -rc1 in the first place!

In other words, the "defense" was just making excuses for EXACTLY the
behaviour I'm trying to tell you shouldn't have happened in the first
place.

Please. You're still making excuses for this, even after I pointed out
that ALL of the problems with the whole loopback driver thing happened
after the merge window.

Linus

2008-08-19 21:57:00

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 14:50:36 -0700 (PDT)

> In other words, I would be a lot happier if you didn't say "majority". I
> would be a ton happier if you could HONESTLY say that every single one
> was a regression.
>
> And the thing is, you cannot. Some of the ones I pointed you to were
> actually regressions due to _other_ patches you had much too happily sent
> me after the merge window had already closed.

Fair enough.

2008-08-19 22:27:17

by Evgeniy Polyakov

Subject: Re: [GIT]: Networking

On Tue, Aug 19, 2008 at 02:50:36PM -0700, Linus Torvalds ([email protected]) wrote:
> And the thing is, you cannot. Some of the ones I pointed you to were
> actually regressions due to _other_ patches you had much too happily sent
> me after the merge window had already closed.

... philosophical thoughts skipped ...

> Please. You're still making excuses for this, even after I pointed out
> that ALL of the problems with the whole loopback driver thing happened
> after the merge window.

I believe it was you who said that there is no black and white (another
guy said that there is no spoon, I frequently confuse the two).

Any changes made no matter when can not be 100% tested in laboratory
environment, even fixes, which look obvious. Even changes which do fix
some problems can introduce another. And some fixes can introduce
problems, which are not immediately shown in majority of the tests, so
changes on top of them can look like introducing new bugs. If you have
multiple changes and result which produce an error, it does not mean
that the last one was wrong. Of course it can, but 'there is no black
and white', and in really complex system only trivial changes can be
thought of not touching others. The same applies to loopback. So this
particular note is just about the fact, that fixing regression means
either reverting a change or introducing a new change. The latter is
preferred (or at least should be), since it is a move forward.

As for the other changes, which you believe are not suitable for the
post merge window releases... People do know that major changes are not
allowed to be made, but there are always last strikes in the head which
are supposed to fix problems, not to introduce new ones. Where did you
see experimental code in -rc cycle? Where did you see patches which
break things without bringing improvement at that time? Yes, there are
changes which are not supposed to be in the tree at that time, but
that's just a development process, which even in a short run makes good
result. Regressions and bugs are fixed, and things are not getting worse
with time.

--
Evgeniy Polyakov

2008-08-19 22:41:16

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Wed, 20 Aug 2008, Evgeniy Polyakov wrote:
>
> I believe it was you who said that there is no black and white (another
> guy said that there is no spoon, I frequently confuse the two).

Yes.

> Any changes made no matter when can not be 100% tested in laboratory
> environment, even fixes, which look obvious.

100% agreed.

Please note that I'm not against these things slipping in occasionally.
The reason I brought this up in the first place really wasn't the loopback
driver issue at all. The reason I brought it up was simply the fact that
when I compare the size and frequency of changes, the networking pulls
tend to be the worst of the lot of the "core" kernel changes.

I say "core" kernel changes, because things are usually worse for the
outliers. As mentioned, networking is actually one of the _better_ guys if
you start comparing to the DVB people, or to some of the architectures
that often slip the merge window _entirely_, and *all* their changes come
in during -rc2 or something.

So it's not that networking is especially bad on an absolute scale in this
regard. And it's not like it doesn't happen all the time for everybody
else too. But I think networking has been a bit more cavalier about things
than many other core areas.

So no, I'm not asking for black-and-white absolutes here. But I'm asking
for a "tightening of the belts". Please don't let it all hang out, ok?

Linus

2008-08-19 22:52:28

by David Miller

Subject: Re: [GIT]: Networking

From: Linus Torvalds <[email protected]>
Date: Tue, 19 Aug 2008 15:40:35 -0700 (PDT)

> So no, I'm not asking for black-and-white absolutes here. But I'm asking
> for a "tightening of the belts". Please don't let it all hang out, ok?

I'm in agreement :)

2008-08-20 02:25:52

by Josh Boyer

Subject: Re: [GIT]: Networking

On Tue, 2008-08-19 at 14:21 -0700, David Miller wrote:
> But it doesn't fix the time delegation problem.
>
> There is always this "oh crap, I just spent 3 months doing nothing
> but fixing bugs" feeling a lot of us core folks get right before
> the merge window opens up.
>
> So instead of getting the best work from the best people we have,
> we get this last minute flurry of development in the days leading
> up to the merge window opening up.

So, not to add fuel to a fire that seems to be calming down, but what is
so wrong with having that feeling? If the core people are spending 3
months doing nothing but fixing bugs, then I consider that to _be_ the
best work from the best people.

Bugs happen and it's not really worth debating over how they got into
the tree to begin with because it happens in a number of different ways.
However, if it takes 3 months of bug fixing to get a tree in shape then
I don't see how that's a problem at all. Personally, I'd rather take a
relatively bug free tree over new shiny features on top of a buggy as
hell tree.

Call me old fashioned.

josh

2008-08-20 02:51:27

by David Miller

Subject: Re: [GIT]: Networking

From: Josh Boyer <[email protected]>
Date: Tue, 19 Aug 2008 22:25:15 -0400

> So, not to add fuel to a fire that seems to be calming down, but what is
> so wrong with having that feeling? If the core people are spending 3
> months doing nothing but fixing bugs, then I consider that to _be_ the
> best work from the best people.

All work and no play makes Dave a dull boy, that's the
problem. :-)

I don't care how much someone claims they enjoy bug fixing
and gathering up other people's patches, you will go out
of your mind or become bored to death if you don't get to
spend real time implementing something significant from
time to time.

2008-08-20 03:42:45

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Linus,

> > Also I cleaned up the MAINTAINERS file entries for Bluetooth. Are these
> > considered harmful now and should be postponed to the next merge window?
> > They can obviously not introduce any regressions?
>
> What I consider harmful is not any individual commit per se, but the
> mindset that clearly says "hey, this particular commit is good, let's
> push it up".

again, my current understanding was that updates to the documentation
that would help people to navigate and understand the kernel better and
make it easier for them to do bug reports etc. are always welcome and
should be pushed immediately. Same goes for new drivers that would
enable people to use their hardware.

If you don't want these patches during the -rc phase, then this is fine
by me. It is no extra work for me to queue these up until the next merge
window. Actually GIT makes merges for me so simple that I couldn't care
less. I was misled into thinking that you want these kinds of fixes to go in quickly
and I apologize for the trouble. I will stop bothering Dave with these
from now on and wait until the next merge window.

> And all of the commits are _individually_ fine and the likelihood for
> breakage is probably damn low, but when you have lots of them, that
> doesn't work any more.
>
> The whole point of the merge window is that you should be sending good,
> tested commits _then_. And if you miss the merge window, then you queue
> them up for the next one.
>
> As it is, it seems like some people think that the merge window is when
> you send any random crap that hasn't even been tested, and then after the
> merge window you send the stuff that looks "obviously good".
>
> How about raising your quality control a bit, so that I don't have to
> berate you? Send the _obviously good_ stuff during the merge window, and
> don't send the "random crap" AT ALL. And then, during the -rc series, you
> don't do any "obviously good" stuff at all, but you do the "absolutely
> required" stuff.
>
> The rule should be that if you have any doubt _what-so-ever_ that
> something is absolutely required, you simply don't send it during the -rc
> phase. And if you have any doubt at all about something not working, you
> don't send it during the merge window either!
>
> The merge window is not for "let's get this tested, so that we can fix it
> during the -rc". And the stabilization phase is not for "this one looks
> obviously correct and safe".

I get your point! And I was never using the merge window for "random
crap". All my stuff is heavily tested and even on non-x86 systems.

So why does it happen that I touched the MAINTAINERS file outside the
merge window? Simply because I ran into it while looking at what it says and
then fixed it. And when I had the regression fix for Dave to pull, I picked
the other two patches that couldn't introduce any regression and sent them
with it. You don't want these. I get it and from now on they will stay
in my queue until the next merge window.

Regards

Marcel

2008-08-20 03:44:08

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Dave,

> > For example, those BT updates looked in no way like regression fixes. So
> > what the f*ck were they doing there? And why do you think all those driver
> > updates cannot cause new regressions?
>
> The BT bits were the only part I really considered borderline,
> and I was going to push back on Marcel.
>
> But to be honest, I haven't seen bluetooth updates from him
> for such a long time I felt that being strict here would just
> exacerbate the problem.
>
> Guess I was wrong.

as I explained to Linus, my current assumption was that documentation
updates and new driver stuff should go in quickly. You will not get any
of these from me anymore. Next time you only get the one regression fix
and all my queued up stuff in the next merge window.

Don't hold back if you think that a patch is not acceptable. Really, I
am using GIT for everything. Merging is no problem for me. I also do
backports in my -mh patch for the latest stable kernel. So there is no
extra work for me. It is just sending you a new email with another tree
to pull from :)

Regards

Marcel

2008-08-20 04:20:28

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi guys,

> And then there is the Bluetooth SCO change which I agree was
> borderline and I should have pushed back on.

so this is the statement, I sent Dave to explain why that change was in
there:

---
For the btusb driver this adds the promised SCO support. The btusb
driver is a new driver and will eventually replace hci_usb. Adding SCO
support was the last missing piece. All distributions are using the
hci_usb driver at the moment and you can only select one of them. So
this can't introduce any regression. With this change the distributions
are now able to select the new driver if they really want to.
---

Was this absolutely needed after -rc3? Of course not. No questions asked
about it. So why did it end up in there?

Almost everybody is using the hci_usb driver and that one has issues
that are beyond fixable. So the btusb is its replacement and with this
change it became a real alternate solution. For me this is a new driver
that would allow people to use it in case hci_usb gives them a hard time
and falls over again. And fixing hci_usb is not an option. A lot of
people tried it and they failed. I think the last one was Pavel a month
ago. This is why I re-wrote the whole beast from scratch.

So that is my excuse why I thought this would be a good choice to push
to Dave. No more excuses and no new drivers after the merge window. At
least not from me.

Regards

Marcel

2008-08-20 04:21:24

by Willy Tarreau

Subject: Re: [GIT]: Networking

On Tue, Aug 19, 2008 at 07:51:15PM -0700, David Miller wrote:
> From: Josh Boyer <[email protected]>
> Date: Tue, 19 Aug 2008 22:25:15 -0400
>
> > So, not to add fuel to a fire that seems to be calming down, but what is
> > so wrong with having that feeling? If the core people are spending 3
> > months doing nothing but fixing bugs, then I consider that to _be_ the
> > best work from the best people.
>
> All work and no play makes Dave a dull boy, that's the
> problem. :-)
>
> I don't care how much someone claims they enjoy bug fixing
> and gathering up other people's patches, you will go out
> of your mind or become bored to death if you don't get to
> spend real time implementing something significant from
> time to time.

That's true and I would also add that it's very common for bugs to
be discovered and fixed while implementing new features. However,
it's so convenient to manage several branches with git that it should
not be a problem to "play" in one branch and push all the stuff during
the merge window only. One of the problems with networking is that you
need a lot of testers. I don't think it's too hard for them to pull
from your development tree. And if it is, maybe you can incite them
from time to time by releasing snapshots as plain patches.

BTW, it also helps testers a lot to be able to play with topic trees
provided as patches against last release, because they generally can
apply them to stable kernels without the fear of losing their data.
I'm sure that many people already run stable kernels with not-yet-merged
patches on top of them and are happy that way.

Regards,
Willy

2008-08-20 04:47:31

by David Lang

Subject: Re: [GIT]: Networking

On Wed, 20 Aug 2008, Marcel Holtmann wrote:

>> And then there is the Bluetooth SCO change which I agree was
>> borderline and I should have pushed back on.
>
> so this is the statement, I sent Dave to explain why that change was in
> there:
>
> ---
> For the btusb driver this adds the promised SCO support. The btusb
> driver is a new driver and will eventually replace hci_usb. Adding SCO
> support was the last missing piece. All distributions are using the
> hci_usb driver at the moment and you can only select one of them. So
> this can't introduce any regression. With this change the distributions
> are now able to select the new driver if they really want to.
> ---
>
> Was this absolutely needed after -rc3? Of course not. No questions asked
> about it. So why did it end up in there?
>
> Almost everybody is using the hci_usb driver and that one has issues
> that are beyond fixable. So the btusb is its replacement and with this
> change it became a real alternate solution. For me this is a new driver
> that would allow people to use it in case hci_usb gives them a hard time
> and falls over again. And fixing hci_usb is not an option. A lot of
> people tried it and they failed. I think the last one was Pavel a month
> ago. This is why I re-wrote the whole beast from scratch.
>
> So that is my excuse why I thought this would be a good choice to push
> to Dave. No more excuses and no new drivers after the merge window. At
> least not from me.

one of the goals of the new release approach was to make releases
frequently enough that it's not a big deal to miss a merge window, you
only have to wait a couple of months (rather than a couple of years under
the old model).

while I don't see a big problem with drivers going in for previously
unsupported hardware (at least since I custom compile my kernels with all
unnecessary drivers disabled, so I wouldn't even try to compile them ;-)
it doesn't hurt much, either as a user, or for you as a developer (as you
note above) to go ahead and delay till the next merge window.

the benefits of delaying are that the changes in the -rc cycle are clearer
and smaller. this should make the progress towards the release more
obvious, and avoid distractions like the one that started this thread.
yes, this will make the -rc1/-rc2 even bigger as there is more stuff going
in, but it looks like that is being handled well (in part thanks to the
preview that -next is providing)

so as a user/tester I want to thank you for being so willing to delay new
stuff for the next merge window, may others learn to follow your example.

David Lang

2008-08-20 05:23:18

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Linus,

> And btw, don't get me wrong - you're not the only problem spot. During the
> -rc's leading up to 2.6.26, drivers/media was actually a _bigger_
> problem. I happen to care less about that (the same way I care less about
> some odd-ball architectures), but I have to admit that drivers/media was a
> total disaster last time around.

to be quite honest, don't you think this is a little bit unfair? So if
it is drivers/media/ or arch/sh/ or whatever you don't care. Do you
actually care what I do in drivers/bluetooth/?

I think that networking and USB are the two biggest trees of the
kernel and the number of changes they produce is big. And most things
are drivers. I do think that we are doing pretty well in holding back
architectural changes of a subsystem and testing them properly in -mm or
-next before merging them.

The actual drivers make a big portion of Linux and we need them and we
want them. However drivers are the biggest problem for most subsystem
maintainers since they don't own all hardware. And just face it. Most
people that have certain hardware bits are not going to run -next or -mm
kernels. If we get them to test -rc kernels we are lucky.

So drivers/net/ alone is 41M in size. That is a quarter of all drivers/
directory and to be fair the others also contain the actual subsystem in
there while networking maintains it outside that directory.

Just look at your EeePC. Booting a 2.6.26 kernel on it and you have no
Ethernet and no WiFi drivers for the built-in hardware.

I personally think that we need to be conservative for the actual
subsystem changes and a little bit more open for driver changes.
Especially when it comes to new drivers (not a tg3 or e100 driver) since
we wanna ease the entry level to get Linux running. When it comes to
networking, there are just a lot of drivers. Plain simple as that.

And when it comes to architectural or subsystem changes, I think that
the merge window rule is followed quite literally. With minor things
during -rc1 because of merge conflicts, but eventually the -next tree
will solve all of these.

So what about the drivers? Should drivers for new hardware go in? Even
if the maintainers don't think they are stable enough? The current
approach is that even an almost stable driver is better than no driver.
If this no longer applies, then please spell it out and I am more than
happy to oblige.

Regards

Marcel

2008-08-20 05:38:31

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi David,

> >> And then there is the Bluetooth SCO change which I agree was
> >> borderline and I should have pushed back on.
> >
> > so this is the statement, I sent Dave to explain why that change was in
> > there:
> >
> > ---
> > For the btusb driver this adds the promised SCO support. The btusb
> > driver is a new driver and will eventually replace hci_usb. Adding SCO
> > support was the last missing piece. All distributions are using the
> > hci_usb driver at the moment and you can only select one of them. So
> > this can't introduce any regression. With this change the distributions
> > are now able to select the new driver if they really want to.
> > ---
> >
> > Was this absolutely needed after -rc3? Of course not. No questions asked
> > about it. So why did it end up in there?
> >
> > Almost everybody is using the hci_usb driver and that one has issues
> > that are beyond fixable. So the btusb is its replacement and with this
> > change it became a real alternate solution. For me this is a new driver
> > that would allow people to use it in case hci_usb gives them a hard time
> > and falls over again. And fixing hci_usb is not an option. A lot of
> > people tried it and they failed. I think the last one was Pavel a month
> > ago. This is why I re-wrote the whole beast from scratch.
> >
> > So that is my excuse why I thought this would be a good choice to push
> > to Dave. No more excuses and no new drivers after the merge window. At
> > least not from me.
>
> one of the goals of the new release approach was to make releases
> frequently enough that it's not a big deal to miss a merge window, you
> only have to wait a couple of months (rather than a couple of years under
> the old model).
>
> while I don't see a big problem with drivers going in for previously
> unsupported hardware (at least since I custom compile my kernels with all
> unnecessary drivers disabled, so I wouldn't even try to compile them ;-)
> it doesn't hurt much, either as a user, or for you as a developer (as you
> note above) to go ahead and delay till the next merge window.

the downside is that users who wanna use this hardware have to wait for the
next kernel release. The -next and -mm trees are simply not for
everybody. Even some of the -rc kernels are a pain if you happen to use
a non-x86 system. The kernel developers can fix them easily or know who
to ask for a fix. So decisions to include certain driver updates or new
drivers are made from the perspective of the end users.

From a developer perspective, if I work on a well separated subsystem
or an individual driver, I can go for many kernel releases without
running into major merge conflicts.

We do have the fast moving targets like wireless. And it is not always
the developers' fault. The hardware manufacturers are putting out new
chips so fast nowadays that keeping up with the drivers is a hard job.
Also laptop/desktop manufacturers are a lot quicker in integrating these
chips and bringing them to market.

I made the EeePC 901 example. A 2.6.26 kernel has no support for the
Ethernet card in it. The last time this happened to me was with a 2.2
kernel, where I bought an Ethernet card that was not supported.

So when it comes to new driver support, it is a judgment call. Sometimes
we make the wrong one.

Regards

Marcel

2008-08-20 06:15:54

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Linus,

> > I believe it was you who said that there is no black and white (another
> > guy said that there is no spoon, I frequently confuse the two).
>
> Yes.
>
> > Any changes made no matter when can not be 100% tested in laboratory
> > environment, even fixes, which look obvious.
>
> 100% agreed.
>
> Please note that I'm not against these things slipping in occasionally.
> The reason I brought this up in the first place really wasn't the loopback
> driver issue at all. The reason I brought it up was simply the fact that
> when I compare the size and frequency of changes, the networking pulls
> tend to be the worst of the lot of the "core" kernel changes.
>
> I say "core" kernel changes, because things are usually worse for the
> outliers. As mentioned, networking is actually one of the _better_ guys if
> you start comparing to the DVB people, or to some of the architectures
> that often slip the merge window _entirely_, and *all* their changes come
> in during -rc2 or something.
>
> So it's not that networking is especially bad on an absolute scale in this
> regard. And it's not like it doesn't happen all the time for everybody
> else too. But I think networking has been a bit more cavalier about things
> than many other core areas.

I was always under the impression that Dave was quite strict when it
comes to merging things after the merge window. Yes, some things would
have been better off waiting for the next release, but we always had exceptions
for various things that fall out of the merge window anyway.

All the small "soldiers" like me make a call on what to pick to send to
Dave and it is not that we try to sneak anything in. It just made sense
to us and either he agrees or not. Some choices are better than others,
no questions asked, but I think what you are seeing is that networking
is getting huge. And this is mostly networking drivers. And I don't
expect this to slow down or anything. The Linux WiFi support is just at
the edge of becoming really competitive and taking a real lead over all
other operating systems.

When looking at the actual split between net/ and drivers/, then the
drivers part is the big one. And I don't expect this to go down any time
soon. We are about to get Ultra-Wideband support merged. After that we
will have plain networking over UWB, Wireless USB using UWB and soon
Bluetooth over UWB. Also we will see Bluetooth over 802.11 and then
another Ultra-Low-Power thing for sensor devices. And I forgot WiMAX.
That is just the short term wireless future. Every one of these comes with
new networking drivers and then you have new Ethernet drivers and so on.
It is just a lot of stuff.

While looking at the actual diffstat, I realized that I really was the
biggest offender:

net/bluetooth/hci_sysfs.c | 376 ++++++++++++++-------------

This is actually the regression fix. It is big, because thanks to sysfs,
I had to move some code around in that file and rename things. I should
have seen that earlier and given Dave an extra comment why it is so big.
That was my bad. Sorry for that.

Regards

Marcel

2008-08-20 06:36:25

by Grant Coady

Subject: Re: [GIT]: Networking

On Wed, 20 Aug 2008 06:20:29 +0200, Willy Tarreau <[email protected]> wrote:

...
>I'm sure that many people already run stable kernels with not-yet-merged
>patches on top of them and are happy that way.

Count me as a patch tester -- trying to learn git but I break the thing
too often at the moment :(

Grant.

2008-08-20 15:03:20

by John W. Linville

Subject: Re: [GIT]: Networking

On Wed, Aug 20, 2008 at 05:42:33AM +0200, Marcel Holtmann wrote:

> again, my current understanding was that updates to the documentation
> that would help people to navigate and understand the kernel better and
> make it easier for them to do bug reports etc. are always welcome and
> should be pushed immediately. Same goes for new drivers that would
> enable people to use their hardware.

This is my (current) understanding as well. If that is not the case,
then someone should clarify.

John
--
John W. Linville
[email protected]

2008-08-20 15:03:49

by John W. Linville

Subject: Re: [GIT]: Networking

On Tue, Aug 19, 2008 at 07:51:15PM -0700, David Miller wrote:

> I don't care how much someone claims they enjoy bug fixing
> and gathering up other people's patches, you will go out
> of your mind or become bored to death if you don't get to
> spend real time implementing something significant from
> time to time.

Are you talking to me??? :-)

John
--
John W. Linville
[email protected]

2008-08-20 15:31:47

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Wed, 20 Aug 2008, John W. Linville wrote:

> On Wed, Aug 20, 2008 at 05:42:33AM +0200, Marcel Holtmann wrote:
> >
> > again, my current understanding was that updates to the documentation
> > that would help people to navigate and understand the kernel better and
> > make it easier for them to do bug reports etc. are always welcome and
> > should be pushed immediately. Same goes for new drivers that would
> > enable people to use their hardware.
>
> This is my (current) understanding as well. If that is not the case,
> then someone should clarify.

Guys, which part of "it wasn't any individual commit" didn't you
understand?

Why are you concentrating on one documentation commit that I didn't even
point to?

But that said - no, I don't think there is any reason to even push
documentation commits, unless there is a real and pressing reason (ie the
documentation is really important or will really matter from a future
merge standpoint). I generally won't complain about them, but I also don't
see the point.

I'd _much_ rather see you guys queue it up for future merges.

Linus

2008-08-20 15:40:26

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Linus,

> > > again, my current understanding was that updates to the documentation
> > > that would help people to navigate and understand the kernel better and
> > > make it easier for them to do bug reports etc. are always welcome and
> > > should be pushed immediately. Same goes for new drivers that would
> > > enable people to use their hardware.
> >
> > This is my (current) understanding as well. If that is not the case,
> > then someone should clarify.
>
> Guys, which part of "it wasn't any individual commit" didn't you
> understand?
>
> Why are you concentrating on one documentation commit that I didn't even
> point to?
>
> But that said - no, I don't think there is any reason to even push
> documentation commits, unless there is a real and pressing reason (ie the
> documentation is really important or will really matter from a future
> merge standpoint). I generally won't complain about them, but I also don't
> see the point.
>
> I'd _much_ rather see you guys queue it up for future merges.

John was just pointing out (like myself before) that a lot of people are
under the impression that documentation updates and new drivers should
not be queued up, but rather merged as soon as possible.

I am really fine either way. Queuing them up is not a problem and you
made your point that you don't wanna see them after -rc1. That is fair
enough, but it really needs to be spelled out here since the overall
consensus is different. I will not send any of these to Dave anymore
after the merge window.

Regards

Marcel

2008-08-20 16:10:46

by Linus Torvalds

Subject: Re: [GIT]: Networking



On Wed, 20 Aug 2008, Marcel Holtmann wrote:
>
> John was just pointing out (like myself before) that a lot of people are
> under the impression that documentation updates and new drivers should
> not be queued up, but rather merged as soon as possible.

I think (and hey, I'm flexible, and we can discuss this) that the rules
should be:

- by default, the answer should always be "don't push anything after the
merge window unless it fixes a regression or a nasty bug".

Here "nasty bug" is something that is a problem in practice, and not
something theoretical that people haven't really reported.

- but as a special case, we relax that for totally new drivers (and that
includes things like just adding new PCI or USB IDs to old drivers),
because (a) it can't really regress and (b) support for a specific
piece of hardware can often be critical.

With regard to that second case, I'd like to note that obviously even a
totally new driver _can_ regress, in the sense that it can cause build
errors, or problems that simply wouldn't have happened without that
driver. So the "cannot regress" obviously isn't strictly true, but I
think everybody understands what I really mean.

It should also be noted that the "new driver" exception should only be an
issue for things that _matter_.

For example, a machine without networking support (or without support for
some other really core driver that provides basic functionality) is
practically useless. But a machine without support for some particular
webcam or support for some special keys on a particular keyboard? That
really doesn't matter, and might as well wait for the next release.

So the "merge drivers early" is for drivers that reasonably _matter_ in
the sense that it allows people to test Linux AT ALL on the platform. It
shouldn't be "any possible random driver".

IOW, think about the drivers a bit like a distro would think about
backporting drivers to a stable kernel. Which ones are really needed?

Also, note that "new driver" really should be that. If it's an older
driver, and you need to touch _any_ old code to add a new PCI ID or
something, the whole argument about it not breaking falls away. Don't do
it. I think, for example, that the SCSI people seem to be a bit too eager
sometimes to update their drivers for new revisions of cards, and they do
it to old drivers.

And finally - the rules should be guidelines. It really isn't always
black-and-white, but most of the time the simple question of "could this
_possibly_ be just queued for the next release without hurting anything"
should be the basic one. If the answer is "yes", then wait.

Linus

2008-08-20 16:46:51

by Jiri Kosina

Subject: Re: [GIT]: Networking

On Wed, 20 Aug 2008, Linus Torvalds wrote:

> - but as a special case, we relax that for totally new drivers (and that
> includes things like just adding new PCI or USB IDs to old drivers),
> because (a) it can't really regress

It in fact depends on your definition of regression really :)

If we merge a buggy driver that hangs the user's machine when loaded, well
... before the driver has been merged, the machine had been booting well,
just some hardware was not functioning at all. After this late driver
merge, the driver gets autoloaded upon boot and crashes the machine. Users
will probably see this as a regression.

This doesn't mean that I am against merging new drivers as aggressively as
possible, I just wanted to point out that it might bring actual
regressions to users.

--
Jiri Kosina
SUSE Labs

2008-08-20 17:33:33

by Marcel Holtmann

Subject: Re: [GIT]: Networking

Hi Linus,

> > John was just pointing out (like myself before) that a lot of people are
> > under the impression that documentation updates and new drivers should
> > not be queued up but instead merged as soon as possible.
>
> I think (and hey, I'm flexible, and we can discuss this) that the rules
> should be:
>
> - by default, the answer should always be "don't push anything after the
> merge window unless it fixes a regression or a nasty bug".
>
> Here "nasty bug" is something that is a problem in practice, and not
> something theoretical that people haven't really reported.
>
> - but as a special case, we relax that for totally new drivers (and that
> includes things like just adding new PCI or USB IDs to old drivers),
> because (a) it can't really regress and (b) support for a specific
> piece of hardware can often be critical.
>
> With regard to that second case, I'd like to note that obviously even a
> totally new driver _can_ regress, in the sense that it can cause build
> errors, or problems that simply wouldn't have happened without that
> driver. So the "cannot regress" obviously isn't strictly true, but I
> think everybody understands what I really mean.
>
> It should also be noted that the "new driver" exception should only be an
> issue for things that _matter_.
>
> For example, a machine without networking support (or without support for
> some other really core driver that provides basic functionality) is
> practically useless. But a machine without support for some particular
> webcam or support for some special keys on a particular keyboard? That
> really doesn't matter, and might as well wait for the next release.
>
> So the "merge drivers early" is for drivers that reasonably _matter_ in
> the sense that it allows people to test Linux AT ALL on the platform. It
> shouldn't be "any possible random driver".
>
> IOW, think about the drivers a bit like a distro would think about
> backporting drivers to a stable kernel. Which ones are really needed?
>
> Also, note that "new driver" really should be that. If it's an older
> driver, and you need to touch _any_ old code to add a new PCI ID or
> something, the whole argument about it not breaking falls away. Don't do
> it. I think, for example, that the SCSI people seem to be a bit too eager
> sometimes to update their drivers for new revisions of cards, and they do
> it to old drivers.
>
> And finally - the rules should be guidelines. It really isn't always
> black-and-white, but most of the time the simple question of "could this
> _possibly_ be just queued for the next release without hurting anything"
> should be the basic one. If the answer is "yes", then wait.

I am perfectly fine with these rules. You only had to spell them out :)

Regards

Marcel

2008-08-20 18:50:58

by Paolo Ciarrocchi

Subject: Re: [GIT]: Networking

On Wed, Aug 20, 2008 at 7:33 PM, Marcel Holtmann wrote:
> Hi Linus,
>
>> > John was just pointing out (like myself before) that a lot of people are
>> > under the impression that documentation updates and new drivers should
>> > not be queued up but instead merged as soon as possible.
>>
>> I think (and hey, I'm flexible, and we can discuss this) that the rules
>> should be:
>>
>> - by default, the answer should always be "don't push anything after the
>> merge window unless it fixes a regression or a nasty bug".
>>
>> Here "nasty bug" is something that is a problem in practice, and not
>> something theoretical that people haven't really reported.
>>
>> - but as a special case, we relax that for totally new drivers (and that
>> includes things like just adding new PCI or USB IDs to old drivers),
>> because (a) it can't really regress and (b) support for a specific
>> piece of hardware can often be critical.
>>
>> With regard to that second case, I'd like to note that obviously even a
>> totally new driver _can_ regress, in the sense that it can cause build
>> errors, or problems that simply wouldn't have happened without that
>> driver. So the "cannot regress" obviously isn't strictly true, but I
>> think everybody understands what I really mean.
>>
>> It should also be noted that the "new driver" exception should only be an
>> issue for things that _matter_.
>>
>> For example, a machine without networking support (or without support for
>> some other really core driver that provides basic functionality) is
>> practically useless. But a machine without support for some particular
>> webcam or support for some special keys on a particular keyboard? That
>> really doesn't matter, and might as well wait for the next release.
>>
>> So the "merge drivers early" is for drivers that reasonably _matter_ in
>> the sense that it allows people to test Linux AT ALL on the platform. It
>> shouldn't be "any possible random driver".
>>
>> IOW, think about the drivers a bit like a distro would think about
>> backporting drivers to a stable kernel. Which ones are really needed?
>>
>> Also, note that "new driver" really should be that. If it's an older
>> driver, and you need to touch _any_ old code to add a new PCI ID or
>> something, the whole argument about it not breaking falls away. Don't do
>> it. I think, for example, that the SCSI people seem to be a bit too eager
>> sometimes to update their drivers for new revisions of cards, and they do
>> it to old drivers.
>>
>> And finally - the rules should be guidelines. It really isn't always
>> black-and-white, but most of the time the simple question of "could this
>> _possibly_ be just queued for the next release without hurting anything"
>> should be the basic one. If the answer is "yes", then wait.
>
> I am perfectly fine with these rules. You only had to spell them out :)

I wonder whether it would be a good idea to periodically send out an email
with the basic rules to be followed in each phase of the project.


regards,
--
Paolo
http://paolo.ciarrocchi.googlepages.com/