2002-12-12 20:25:50

by Jeff Garzik

Subject: Re: pci-skeleton duplex check

Donald Becker wrote:
> [[ I don't know why I bother. The people that now control what goes into
> the kernel would rather put in random patches from other people than
> accept a correct fix from me. ]]


I'm very interested in applying fixes from you! I am publicly begging
you to do so, and even CC'ing lkml on my request.

Please re-post any patches I or Andrew missed.

Jeff




2002-12-12 21:03:15

by Donald Becker

Subject: Re: pci-skeleton duplex check

On Thu, 12 Dec 2002, Jeff Garzik wrote:
> Donald Becker wrote:
> > [[ I don't know why I bother. The people that now control what goes into
> > the kernel would rather put in random patches from other people than
> > accept a correct fix from me. ]]
>
> I'm very interested in applying fixes from you! I am publicly begging
> you to do so, and even CC'ing lkml on my request.

This is a very disingenuous statement.

The drivers in the kernel are now heavily modified and have significantly
diverged from my version. Sure, you are fine with having someone else
do the difficult and unrewarding debugging and maintenance work, while
you work on just the latest cool hardware, change the interfaces and are
concerned only with the current kernel version.

I've been actively developing Linux drivers for over a decade, and run
about two dozen mailing lists for specific drivers. I write diagnostic
routines for every released driver. I thoroughly test and frequently
update the driver set I maintain. And since about 2000, my patches were
ignored, while the first notice I've gotten of changes in my drivers
is the bug reports. And the response: "submit a patch to fix those
newly introduced bugs". I've even had patches ignored in favor of people
that wrote "I don't have the NIC, but here is a change".

A good example is the tulip driver. You repeatedly restructured my
driver in the kernel, splitting into different files. It was still 90+%
my code, but the changes made it impossible to track the modification
history. The kernel driver was long-broken with 21143-SYM cards, but no
one took the responsibility for fixing it.


It's easy to make the first few patches, when you don't have to deal
with reversion testing, many different models, and have an unlimited
sandbox where it doesn't matter if a specific release works or not. But
it takes a huge amount of work to keep a stable, traceable driver development
process that works with many different kernel versions and hardware
environments.


--
Donald Becker [email protected]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210 Scyld Beowulf cluster system
Annapolis MD 21403 410-990-9993

2002-12-12 21:31:59

by Jeff Garzik

Subject: Re: pci-skeleton duplex check

Donald Becker wrote:
> On Thu, 12 Dec 2002, Jeff Garzik wrote:
>
>>Donald Becker wrote:
>>
>>>[[ I don't know why I bother. The people that now control what goes into
>>>the kernel would rather put in random patches from other people than
>>>accept a correct fix from me. ]]
>>
>>I'm very interested in applying fixes from you! I am publicly begging
>>you to do so, and even CC'ing lkml on my request.
>
>
> This is a very disingenuous statement.


Oh come on, it's far less disingenuous than what you said:

[[ I don't know why I bother. The people that now control what
goes into the kernel would rather put in random patches from
other people than accept a correct fix from me. ]]

I'm sure you'll continue making snide comments on every mailing list you
maintain, but the fact remains:

I would much rather accept a fix from you.

That hasn't changed in the past year. or two. or any amount of time.
Your input is very valuable, and I typically save quite a few of your
emails.


> The drivers in the kernel are now heavily modified and have significantly
> diverged from my version. Sure, you are fine with having someone else
> do the difficult and unrewarding debugging and maintenance work, while
> you work on just the latest cool hardware, change the interfaces and are
> concerned only with the current kernel version.

While I disagree with this assessment, I think we can safely draw the
conclusion that the problem is _not_ people ignoring your patches, or
preferring other patches over yours.


> I've been actively developing Linux drivers for over a decade, and run
> about two dozen mailing lists for specific drivers. I write diagnostic
> routines for every released driver. I thoroughly test and frequently
> update the driver set I maintain. And since about 2000, my patches were
> ignored, while the first notice I've gotten of changes in my drivers
> is the bug reports. And the response: "submit a patch to fix those
> newly introduced bugs". I've even had patches ignored in favor of people
> that wrote "I don't have the NIC, but here is a change".

I don't recall _ever_ getting a patch from you or seeing one posted on
lkml or netdev. How can you be ignored if you're not sending patches?


> A good example is the tulip driver. You repeatedly restructured my
> driver in the kernel, splitting into different files. It was still 90+%
> my code, but the changes made it impossible to track the modification
> history. The kernel driver was long-broken with 21143-SYM cards, but no
> one took the responsibility for fixing it.

s/was/is/
I take responsibility for fixing it, I just haven't fixed it yet :)


> It's easy to make the first few patches, when you don't have to deal
> with reversion testing, many different models, and have an unlimited
> sandbox where it doesn't matter if a specific release works or not. But
> it takes a huge amount of work to keep a stable, traceable driver development
> process that works with many different kernel versions and hardware
> environments.


We're slowly getting there, in terms of regression and stress testing.

Since you haven't sent patches for a long time, I was really the
only one [stupid enough?] to stand up and even bother to help collect
and review net driver changes.

I would love to integrate your drivers directly, but they don't come
anywhere close to using current kernel APIs. The biggie of biggies is
not using the pci_driver API. So, given that we cannot directly merge
your drivers, and you don't send patches to kernel developers, what is
the next best alternative? (a) let kernel net drivers bitrot, or (b)
maintain them as best we can without Don Becker patches? I say that "b"
is far better than "a" for Linux users.

Jeff



2002-12-12 23:01:19

by Ben Greear

Subject: Re: pci-skeleton duplex check

Donald Becker wrote:

> I've been actively developing Linux drivers for over a decade, and run
> about two dozen mailing lists for specific drivers. I write diagnostic
> routines for every released driver. I thoroughly test and frequently
> update the driver set I maintain. And since about 2000, my patches were
> ignored, while the first notice I've gotten of changes in my drivers
> is the bug reports. And the response: "submit a patch to fix those
> newly introduced bugs". I've even had patches ignored in favor of people
> that wrote "I don't have the NIC, but here is a change".
>
> A good example is the tulip driver. You repeatedly restructured my
> driver in the kernel, splitting into different files. It was still 90+%
> my code, but the changes made it impossible to track the modification
> history. The kernel driver was long-broken with 21143-SYM cards, but no
> one took the responsibility for fixing it.

For what it's worth, I have yet to find a tulip driver that works with
all of my 4-port NICs. Becker's fails with the Phobos 4-port NIC;
a very recent kernel driver fails to negotiate correctly (sometimes)
with the DFE-570tx NIC. Both of them failed a while back when I tried
to put 3 DFE-570tx's into a single machine.

On average, I've had better luck with the kernel driver than with
Becker's, and since it is quite a pain to compile and test it, I
have been ignoring it more and more.

Mr Becker: Perhaps you could rename your tulip driver becker_tulip and have
it separately buildable and configurable in the kernel config options? If
it were back in the kernel proper, it would be much easier to experiment with.

Thanks,
Ben

--
Ben Greear <[email protected]> <Ben_Greear AT excite.com>
President of Candela Technologies Inc http://www.candelatech.com
ScryMUD: http://scry.wanfear.com http://scry.wanfear.com/~greear


2002-12-13 01:10:54

by Donald Becker

Subject: Re: pci-skeleton duplex check

On Thu, 12 Dec 2002, Jeff Garzik wrote:
> Donald Becker wrote:
> > On Thu, 12 Dec 2002, Jeff Garzik wrote:
> >>Donald Becker wrote:
> [[ I don't know why I bother. The people that now control what
> goes into the kernel would rather put in random patches from
> other people than accept a correct fix from me. ]]
> I'm sure you'll continue making snide comments on every mailing list you
> maintain, but the fact remains:
> I would much rather accept a fix from you.

Perhaps we have a different idea of "fix".

You are looking for a patch to modifications you have made to code
I have written. In the meantime I have been providing working code, and
have been updating that code to work with new hardware. So a fix is the
working, continuously maintained version of the driver.

To put an admittedly simplified spin on it, you are saying "I want you
to tell me what I have broken when I changed this", and to continuously
monitor and test your changes, made for unknown reasons on a very
divergent source base.

> > The drivers in the kernel are now heavily modified and have significantly
> > diverged from my version. Sure, you are fine with having someone else
> > do the difficult and unrewarding debugging and maintenance work, while
> > you work on just the latest cool hardware, change the interfaces and are
> > concerned only with the current kernel version.
>
> While I disagree with this assessment, I think we can safely draw the
> conclusion that the problem is _not_ people ignoring your patches, or
> preferring other patches over yours.

I can point to specific instances. Just looking at the drivers in the
kernel, it is clear that this has happened.

> > A good example is the tulip driver. You repeatedly restructured my
...
> I take responsibility for fixing it, I just haven't fixed it yet :)

> > It's easy to make the first few patches, when you don't have to deal
> > with reversion testing, many different models, and have an unlimited
> > sandbox where it doesn't matter if a specific release works or not.

I think that these two statements fit well together.


> > But
> > it takes a huge of work to keep a stable, traceable driver development
> > process that works with many different kernel versions and hardware
> > environments.
>
> We're slowly getting there, in terms of regression and stress testing.

But it existed before, and was discarded!
Yes, the kernel is now _returning_ to a stable state while making
improvements. But there was a period of time when interface stability
and detailed correctness was thrown away in favor of many inexperienced
people quickly and repeatedly restructuring interfaces without
understanding the underlying requirements.

I could mention VM, but I'll go back to one that had a very large direct
impact on me: CardBus. CardBus is a difficult case of hot-swap PCI --
the device can go away without warning, and it's usually implemented on
machines where suspend and resume must work correctly. We had perhaps
the best operational implementation in the industry, and I had written
about half of the CardBus drivers. Yet my proven interface
implementation was completely discarded in favor of one that needed to be
repeatedly changed as the requirements were slowly understood.

> I would love to integrate your drivers directly, but they don't come
> anywhere close to using current kernel APIs. The biggie of biggies is
> not using the pci_driver API. So, given that we cannot directly merge

Yup, that is just what I was talking about. Let me continue:

The result is that today other systems now have progressed to a great
implementation of suspend/resume, while Linux distributions now default
to unloading and reloading drivers to avoid various suspend bugs. And
when the driver cannot be unloaded because some part of the networking
stack is holding the interface, things go horribly wrong...

You might be able to naysay the individual details, but the end
technical result is clear.

> your drivers, and you don't send patches to kernel developers, what is
> the next best alternative? (a) let kernel net drivers bitrot, or (b)
> maintain them as best we can without Don Becker patches? I say that "b"
> is far better than "a" for Linux users.

Or perhaps recognizing that when someone that has been a significant,
continuous contributor since the early days of Linux says "this is
screwed up", they might have a point. When Linux suddenly had thousands
of people wanting to submit patches, that didn't mean that there were
more people that could understand, implement and maintain complex
systems.

--
Donald Becker [email protected]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210 Scyld Beowulf cluster system
Annapolis MD 21403 410-990-9993


2002-12-13 08:43:04

by David Miller

Subject: Re: pci-skeleton duplex check

On Thu, 2002-12-12 at 17:18, Donald Becker wrote:
> Or perhaps recognizing that when someone that has been a significant,
> continuous contributor since the early days of Linux

Until you learn to play nice with people and mesh within the
fabric of Linux development, I adamantly do not classify you
as you appear to classify yourself. You don't contribute,
you sit in your sandbox and then point fingers at the people who
do know how to work with other human beings and say "see how much
that stuff sucks? well my stuff works, nyah!"

I fear you will hold a grudge about this forever.

If Linux itself is worse off and went backwards in time for a while, it
is because of your inability to work together with people.

I know it may be hard for you to accept this fact, but I can tell you
that continuing to point the fingers elsewhere is going to be a repeated
dead end.

2002-12-13 16:48:34

by Donald Becker

Subject: Re: pci-skeleton duplex check

On 13 Dec 2002, David S. Miller wrote:
> On Thu, 2002-12-12 at 17:18, Donald Becker wrote:
> > Or perhaps recognizing that when someone that has been a significant,
> > continuous contributor since the early days of Linux
>
> Until you learn to play nice with people and mesh within the
> fabric of Linux development, I adamantly do not classify you
> as you appear to classify yourself. You don't contribute,
> you sit in your sandbox and then point fingers at the people who
> do know how to work with other human beings and say "see how much
> that stuff sucks? well my stuff works, nyah!"
..
> If Linux itself is worse off and went backwards in time for a while...

The development criteria used to be technically based, and that is still
the public statement. Now, as your statement makes clear, working code
is an irrelevant criterion.

Your comments immediately moved the subject from the technical merit and
correctness of the code to an ad hominem attack. The facts, and the
code, clearly show the long term interaction and contribution. In most
cases the code and interfaces we are talking about were written and
defined by me throughout the past decade.



--
Donald Becker [email protected]
Scyld Computing Corporation http://www.scyld.com
410 Severn Ave. Suite 210 Scyld Beowulf cluster system
Annapolis MD 21403 410-990-9993

2002-12-13 18:25:47

by David Miller

Subject: Re: pci-skeleton duplex check

   From: Donald Becker <[email protected]>
   Date: Fri, 13 Dec 2002 11:56:17 -0500 (EST)

   The development criteria used to be technically based, and that is still
   the public statement. Now, as your statement makes clear, working code
   is an irrelevant criterion.

No, working code is only part of the equation. If you're a total and
complete asshole, your work is likely to get lost to the sands of
time. In such a case nobody wants to deal with you.

Welcome to the real world where you have to interact with other human
beings (not just be technically capable) in order to accomplish
things.

It's always been like this Donald. If you piss off, or are a jerk to,
the primary maintainers you're going to get the short end of the
stick.

2002-12-13 20:51:02

by Jeff Garzik

Subject: Re: pci-skeleton duplex check

Donald Becker wrote:
> On Thu, 12 Dec 2002, Jeff Garzik wrote:
>
>>Donald Becker wrote:
>>
>>>On Thu, 12 Dec 2002, Jeff Garzik wrote:
>>>
>>>>Donald Becker wrote:
>>
>> [[ I don't know why I bother. The people that now control what
>> goes into the kernel would rather put in random patches from
>> other people than accept a correct fix from me. ]]
>>I'm sure you'll continue making snide comments on every mailing list you
>>maintain, but the fact remains:
>>I would much rather accept a fix from you.
>
>
> Perhaps we have a different idea of "fix".
>
> You are looking for a patch to modifications you have made to code
> I have written. In the meantime I have been providing working code, and
> have been updating that code to work with new hardware. So a fix is the
> working, continuously maintained version of the driver.
>
> To put an admittedly simplified spin on it, you are saying "I want you
> to tell me what I have broken when I changed this", and to continuously
> monitor and test your changes, made for unknown reasons on a very
> divergent source base.

No, that's not it at all.

I would be ecstatic if you even posted the changes made to your own drivers
to [email protected] or similar...

I'm asking for _any_ contributions at all. The more fine-grained the
better...



>>>The drivers in the kernel are now heavily modified and have significantly
>>>diverged from my version. Sure, you are fine with having someone else
>>>do the difficult and unrewarding debugging and maintenance work, while
>>>you work on just the latest cool hardware, change the interfaces and are
>>>concerned only with the current kernel version.
>>
>>While I disagree with this assessment, I think we can safely draw the
>>conclusion that the problem is _not_ people ignoring your patches, or
>>preferring other patches over yours.
>
>
> I can point to specific instances. Just looking at the drivers in the
> kernel, it is clear that this has happened.

There is an admitted preference to people who actually send me patches.
That sometimes translates to "other change" being preferred over logic
in one of your drivers.

I would still greatly prefer patches from you, however. And your
comments on other people's patches are very welcome [and there have been
plenty of those in the past -- thanks].


> But it existed before, and was discarded!
> Yes, the kernel is now _returning_ to a stable state while making
> improvements. But there was a period of time when interface stability
> and detailed correctness was thrown away in favor of many inexperienced
> people quickly and repeatedly restructuring interfaces without
> understanding the underlying requirements.
>
> I could mention VM, but I'll go back to one that had a very large direct
> impact on me: CardBus. CardBus is a difficult case of hot-swap PCI --
> the device can go away without warning, and it's usually implemented on
> machines where suspend and resume must work correctly. We had perhaps
> the best operational implementation in the industry, and I had written
> about half of the CardBus drivers. Yet my proven interface
> implementation was completely discarded in favor of one that needed to be
> repeatedly changed as the requirements were slowly understood.

But... this is how Linux development works. Believe me, I understand
you don't like that very much, but here is a central question to you:

what can we do to move forward?

The CardBus implementation still fails on some systems, and still wants
work. However, the pci_driver API is not only codified in 2.4.x, but it
is extended to the more generic driver model in 2.5.x. _And_ I have
proven it works just fine under 2.2.x (see kcompat24 toolkit).

What can we as kernel developers do to reintegrate you back into kernel
development? Some of the APIs you obviously don't like, but pretending
they don't exist is not a solution. This is the Linux game, for better
or worse. At the end of the day, if we don't like Linus's decisions, we
can either swallow our pride and continue with Linux, or fork a Linux
tree and make it work "the right way." The driver model (nee
pci_driver) is the direction of Linux.


>>I would love to integrate your drivers directly, but they don't come
>>anywhere close to using current kernel APIs. The biggie of biggies is
>>not using the pci_driver API. So, given that we cannot directly merge
>
>
> Yup, that is just what I was talking about. Let me continue:
>
> The result is that today other systems now have progressed to a great
> implementation of suspend/resume, while Linux distributions now default
> to unloading and reloading drivers to avoid various suspend bugs. And
> when the driver cannot be unloaded because some part of the networking
> stack is holding the interface, things go horribly wrong...
>
> You might be able to naysay the individual details, but the end
> technical result is clear.

That's what is currently in development in 2.5.x: sane suspend and
resume. I would dispute that other systems have a decently designed
suspend/resume -- that said, working is obviously better right now than
non-working but nicer design ;-)


>>your drivers, and you don't send patches to kernel developers, what is
>>the next best alternative? (a) let kernel net drivers bitrot, or (b)
>>maintain them as best we can without Don Becker patches? I say that "b"
>>is far better than "a" for Linux users.
>
>
> Or perhaps recognizing that when someone that has been a significant,
> continuous contributor since the early days of Linux says "this is
> screwed up", they might have a point. When Linux suddenly had thousands
> of people wanting to submit patches, that didn't mean that there were
> more people that could understand, implement and maintain complex
> systems.


Shit, dude, _I_ recognize this. Probably better than most people, since
I see on a daily basis the benefits of your overall design in the net
drivers, and want to push good elements of that design into the kernel
net drivers. At the end of the day you'd be surprised how much I wind
up defending your code to other kernel hackers, and educating them on
why -not- to do certain things.

IMO the bigger sticking point is - at what point do you say "yeah,
2.4.x/2.5.x APIs may suck in my opinion, but they are the official APIs
so I will use them"? There are tons of reasons why Red Hat (my current
employer) is very leery of taking patches which will not eventually find
their way to the mainline kernel.org kernel. A lot of those reasons
apply in the case of your drivers, too. Using non-standard APIs has all
sorts of software engineering implications which wind up with a negative
developer and user experience.

Jeff



2002-12-14 00:12:21

by Michael Richardson

Subject: Re: pci-skeleton duplex check

-----BEGIN PGP SIGNED MESSAGE-----


>>>>> "Donald" == Donald Becker <[email protected]> writes:
Donald> The drivers in the kernel are now heavily modified and have significantly
Donald> diverged from my version. Sure, you are fine with having someone else
Donald> do the difficult and unrewarding debugging and maintenance work, while
Donald> you work on just the latest cool hardware, change the interfaces and are
Donald> concerned only with the current kernel version.

I agree strongly with Donald.

Interfaces should NEVER change in patch level versions.
Just *DO NOT DO IT*.

Go wild in odd-numbered.. get the interfaces right there.
But leave them alone afterward.

This is a fundamental tenet of being professional. Otherwise, the kernel
people are the biggest reason I've ever seen for using *BSD.
Microsoft is not the real enemy. Gratuitous change is.

] ON HUMILITY: to err is human. To moo, bovine. | firewalls [
] Michael Richardson, Sandelman Software Works, Ottawa, ON |net architect[
] [email protected] http://www.sandelman.ottawa.on.ca/ |device driver[
] panic("Just another Debian GNU/Linux using, kernel hacking, security guy"); [





-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)
Comment: Finger me for keys

iQCVAwUBPfp5IIqHRg3pndX9AQHW7gP9FC0BgskaVBb9HNjKUp8DhR5bJK+YTVa7
YeVGZFRxuFi2O9oDiMuUvYq++y+8PR4LXpJZuNoShA36wqV38QS8pBFhqFt/JrEb
xHNozohQ/7IyncJsG0UkBTfhqIbxbfsd19DUx0ehcqNAh7N3c95qeEEHODTs2DKy
jqtgSrXvOBY=
=JsKT
-----END PGP SIGNATURE-----

2002-12-14 01:36:36

by Oliver Xymoron

Subject: Re: pci-skeleton duplex check

On Fri, Dec 13, 2002 at 07:19:47PM -0500, Michael Richardson wrote:
>
>
> >>>>> "Donald" == Donald Becker <[email protected]> writes:
> Donald> The drivers in the kernel are now heavily modified and have significantly
> Donald> diverged from my version. Sure, you are fine with having someone else
> Donald> do the difficult and unrewarding debugging and maintenance work, while
> Donald> you work on just the latest cool hardware, change the interfaces and are
> Donald> concerned only with the current kernel version.
>
> I agree strongly with Donald.
>
> Interfaces should NEVER change in patch level versions.
> Just *DO NOT DO IT*.
>
> Go wild in odd-numbered.. get the interfaces right there.
> But leave them alone afterward.

Umm, if I recall correctly, they're rehashing a flame war about stuff
that occurred in 2.3. It doesn't need any additional kindling.

--
"Love the dolphins," she advised him. "Write by W.A.S.T.E.."