2002-09-21 00:22:10

by Rhoads, Rob

Subject: [ANNOUNCE] Linux Hardened Device Drivers Project


Project Announcement:
--------------------
We've started a new project on sourceforge.net w/ focus
on hardening Linux device drivers for highly available
systems. This project is being worked on with folks from
OSDL's CGL and DCL projects as well.

Initially we've created a specification, a few kernel modules
that implement a set of driver programming interfaces, and
a sample device driver that demonstrates those interfaces.

We are actively soliciting involvement with others in the
Linux developer community. We need your help to make this
project relevant and useful.

Below I've included an overview of the hardened driver project.
By no means is this complete or final. It's just our initial
attempt at defining what is meant by the term hardened driver
and the areas we want to focus on.

For additional info, please check out the links at the bottom
of this message and the Hardened Drivers web site at
http://hardeneddrivers.sf.net.


Hardened Driver Project Overview:
--------------------------------
Device drivers have traditionally been a significant source
of software faults. For this reason, they are of key concern
in improving the availability and stability of the operating
system. A critical element in creating a Highly Available (HA)
environment is to reduce the likelihood of faults in key
drivers, a methodology called driver hardening.

A device driver is typically implemented with emphasis on
the proper operation of the hardware. Attention to how it
will function in the event of hardware faults is often
minimal. Hardened drivers, on the other hand, are designed
with the assumption that the underlying hardware that they
control will fail. They need to respond to such failures by
handling faults gracefully, limiting the impact on the overall
system. Hardened device drivers must continue to operate when
the hardware has failed (e.g. allow device fail-over), and
must not allow the propagation of corrupt data from a failed
device to other components of the system.

Hardened device drivers must also be active participants in
the recovery of detected faults, by locally recovering them or
by reporting them to higher-level system management software
that subsequently instructs the driver to take a specific
action.

The goal of a hardened driver is to provide an environment
in which hardware and software failures are transparent to
the applications using their services, where possible. The
way to effectively achieve this goal is to analyze a
driver's software design and implement appropriate changes
to improve stability, reliability and availability, and
to provide instrumentation for management middleware.

We believe that improving driver stability and reliability
includes such measures as ensuring that all wait loops are
limited with a timeout, validating input and output data and
structuring the driver to anticipate hardware errors.
Improving availability includes adding support for device
hot swapping and validating the driver with fault injection.
Instrumentation for management middleware includes functions
such as reporting of statistical indicators and logging of
pertinent events to enable postmortem analysis in the event
of a failure.
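
As an illustration of the first measure, a wait loop bounded by a
timeout might look like the sketch below. It is a user-space model and
not code from the project: struct fake_dev, read_status() and
wait_ready() are hypothetical names, and a poll count stands in for a
real time-based deadline.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_READY 0x1u

/* Hypothetical device model: the status register reports "ready"
 * after a fixed number of reads. UINT32_MAX models failed hardware
 * that never becomes ready within any realistic poll budget. */
struct fake_dev {
	uint32_t reads_until_ready;
	uint32_t reads;
};

static uint32_t read_status(struct fake_dev *dev)
{
	dev->reads++;
	return dev->reads >= dev->reads_until_ready ? STATUS_READY : 0;
}

/* Poll for readiness, giving up after max_polls attempts.
 * Returns 0 on success, -1 on timeout. */
static int wait_ready(struct fake_dev *dev, uint32_t max_polls)
{
	uint32_t i;

	assert(dev != NULL);
	for (i = 0; i < max_polls; i++) {
		if (read_status(dev) & STATUS_READY)
			return 0;
	}
	return -1;	/* hardware never answered: report it, do not spin */
}
```

The point of the bound is that a dead device turns into an error code
the caller can act on (fail over, log, retry later) instead of a hung
CPU.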

To minimize instability contributed by device drivers and to
enhance the availability of HA systems, we've attempted to
define a set of requirements that a device driver should
adhere to in order to be considered a hardened driver. We
then define different hardening traits and the required
programming interfaces to support these hardening traits.

We've identified four areas in which drivers can be hardened:
o Hardening with code robustness
o Hardening with event logging
o Hardening with diagnostics
o Hardening with resource monitoring and statistics

We've also identified some key areas we feel are most critical
to overall system stability and plan to focus initial hardening
efforts on drivers for network interface cards, physical storage,
and logical storage.

Project Links:
-------------
o The Driver Hardening website:
http://hardeneddrivers.sourceforge.net

o The SourceForge project related info:
http://sourceforge.net/projects/hardeneddrivers

o Hardened Drivers Mailing List Info (subscribe here):
http://lists.sourceforge.net/mailman/listinfo/hardeneddrivers-discuss


+=+=+
Rob Rhoads mailto:[email protected]
Staff Software Engineer office: 503-677-5498
Telecom Software Platforms
Intel Communications Group

This email message solely contains my own personal views, and not
necessarily those of my employer.


2002-09-21 01:05:18

by Andre Hedrick

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project


Hi Rob,

My opinion only, and you may think it "stinks" ... oh well.

Obviously this is a way for the telecom folks to get something for free that
really should be paid for by funding the project with CASH. Or funding
(a) startup(s) related to generating such support.

Regardless, it takes (fill in the blank) to boldly ask people to add APIs
for an industry who is only interested in using and not contributing.
Prove that all the stuff which is going to be plugged into these
security-hole^Wbug-generators^Wfeatures will be scheduled for open source.
Or is this another attempt to try and take over the license and shove BSD
down the piles?

Pointed Blunt Raw, but nice.

Regards,

Andre Hedrick
LAD Storage Consulting Group

PS: I see a lot of "wants", are there any "gives" ?


On Fri, 20 Sep 2002, Rhoads, Rob wrote:

>
> Project Announcement:
> --------------------
> We've started a new project on sourceforge.net w/ focus
> on hardening Linux device drivers for highly available
> systems. This project is being worked on with folks from
> OSDL's CGL and DCL projects as well.
>
> Initially we've created a specification, a few kernel modules
> that implement a set of driver programming interfaces, and
> a sample device driver that demonstrates those interfaces.
>
> We are actively soliciting involvement with others in the
> Linux developer community. We need your help to make this
> project relevant and useful.

We need your CAPITAL to pay for our TIME.

> Below I've included an overview of the hardened driver project.
> By no means is this complete or final. It's just our initial
> attempt at defining what is meant by the term hardened driver
> and the areas we want to focus on.

Great, do they serve the needs of more than "INTEL"?

> For additional info, please checkout the links at the bottom
> of this message and the Hardened Drivers web site at
> http://hardeneddrivers.sf.net.
>
>
> Hardened Driver Project Overview:
> --------------------------------
> Device drivers have traditionally been a significant source
> of software faults. For this reason, they are of key concern
> in improving the availability and stability of the operating
> system. A critical element in creating Highly Available (HA)
> environment is to reduce the likelihood of faults in key
> drivers, a methodology called driver hardening.
>
> A device driver is typically implemented with emphasis on
> the proper operation of the hardware. Attention to how it
> will function in the event of hardware faults is often
> minimal. Hardened drivers, on the other hand, are designed
> with the assumption that the underlying hardware that they
> control will fail. They need to respond to such failures by
> handling faults gracefully, limiting the impact on the overall
> system. Hardened device drivers must continue to operate when
> the hardware has failed (e.g. allow device fail-over), and
> must not allow the propagation of corrupt data from a failed
> device to other components of the system.
>
> Hardened device drivers must also be active participants in
> the recovery of detected faults, by locally recovering them or
> by reporting them to higher-level system management software
> that subsequently instructs the driver to take a specific
> action.
>
> The goal of a hardened driver is to provide an environment
> in which hardware and software failures are transparent to
> the applications using their services, where possible. The
> way to effectively achieve this goal is to analyze a
> driver's software design and implement appropriate changes
> to improve stability, reliability and availability, and
> to provide instrumentation for management middleware.
>
> We believe that improving driver stability and reliability
> includes such measures as ensuring that all wait loops are
> limited with a timeout, validating input and output data and
> structuring the driver to anticipate hardware errors.
> Improving availability includes adding support for device
> hot swapping and validating the driver with fault injection.
> Instrumentation for management middleware includes functions
> such as reporting of statistical indicators and logging of
> pertinent events to enable postmortem analysis in the event
> of a failure.
>
> To minimize instability contributed by device drivers and to
> enhance the availability of HA systems, we've attempted to
> define a set of requirements that a device driver should
> adhere to in order to be considered a hardened driver. We
> then define different hardening traits and the required
> programming interfaces to support these hardening traits.
>
> We've identified four areas in which drivers can be hardened:
> o Hardening with code robustness
> o Hardening with event logging
> o Hardening with diagnostics
> o Hardening with resource monitoring and statistics
>
> We've also identified some key areas we feel are most critical
> to overall system stability and plan to focus initial hardening
> efforts on drivers for network interface cards, physical storage,
> and logical storage.
>
> Project Links:
> -------------
> o The Driver Hardening website:
> http://hardeneddrivers.sourceforge.net
>
> o The SourceForge project related info:
> http://sourceforge.net/projects/hardeneddrivers
>
> o Hardened Drivers Mailing List Info (subscribe here):
> http://lists.sourceforge.net/mailman/listinfo/hardeneddrivers-discuss
>
>
> +=+=+
> Rob Rhoads mailto:[email protected]
> Staff Software Engineer office: 503-677-5498
> Telecom Software Platforms
> Intel Communications Group
>
> This email message solely contains my own personal views, and not
> necessarily those of my employer.
>


2002-09-21 01:36:17

by Greg KH

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

Hi,

I've just started to read over the published spec, and will reserve
comment on it, and the example code you've created after I'm done
reading it. But I'll make a few comments right now on your
announcement:


On Fri, Sep 20, 2002 at 05:26:47PM -0700, Rhoads, Rob wrote:
>
> Project Announcement:
> --------------------
> We've started a new project on sourceforge.net w/ focus
> on hardening Linux device drivers for highly available
> systems. This project is being worked on with folks from
> OSDL's CGL and DCL projects as well.

Who is "we"?

> Hardened Driver Project Overview:
> --------------------------------
> Device drivers have traditionally been a significant source
> of software faults. For this reason, they are of key concern
> in improving the availability and stability of the operating
> system. A critical element in creating Highly Available (HA)
> environment is to reduce the likelihood of faults in key
> drivers, a methodology called driver hardening.

Or in simpler terms, making drivers that work, right?
Do you have any statistics that show that existing Linux drivers are a
problem with HA systems? If so, which drivers?

> A device driver is typically implemented with emphasis on
> the proper operation of the hardware. Attention to how it
> will function in the event of hardware faults is often
> minimal.

Ah, a broad generalization, very nice to set up for the reasoning behind
your project. But is this really true? Lots of existing kernel drivers
can handle a wide range of hardware faults, and user faults. Again, do
people have any specific problems with existing drivers, or driver
subsystems?

> The goal of a hardened driver is to provide an environment
> in which hardware and software failures are transparent to
> the applications using their services, where possible. The
> way to effectively achieve this goal is to analyze a
> driver's software design and implement appropriate changes
> to improve stability, reliability and availability, and
> to provide instrumentation for management middleware.

So in order to achieve reliable drivers, we want to add more lines of
code to the driver to allow for instrumentation? What happens when the
fault happens in the instrumentation interface? And what is watching
this interface for problems in its handling of data?

> We believe that improving driver stability and reliability
> includes such measures as ensuring that all wait loops are
> limited with a timeout, validating input and output data and
> structuring the driver to anticipate hardware errors.

All good things to achieve. Have you looked at the kernel-janitors
project? There are lots of places where you all can jump in to achieve
this right now in the existing code. Patches for these items are always
welcome, a spec is not needed :)

> Improving availability includes adding support for device
> hot swapping and validating the driver with fault injection.

Hot-swap needs to have hardware that can support this. Linux currently
supports these kinds of hardware configurations (USB, IEEE1394, PCI
Hotplug, cPCI Hotplug, hotplug CPU, etc.). Are there existing types of
hardware present in your systems that do not have support on
Linux? If so, creating drivers for this hardware would be greatly
appreciated.

As for "fault injection", this traditionally requires hardware test
setups that are beyond the means of most kernel programmers. Will your
group be providing access to this kind of hardware for kernel developers
to test their drivers with?

> Instrumentation for management middleware includes functions
> such as reporting of statistical indicators and logging of
> pertinent events to enable postmortem analysis in the event
> of a failure.

Um, about this middleware management layer, are you talking about
RAS-style kernel logging? If so, please see the archives about why the
current implementation of this has been rejected by the kernel
community.

> We've identified four areas in which drivers can be hardened:
> o Hardening with code robustness

You mean the driver core? That should be a requirement of any Linux
kernel driver today, hardened or not. So all Linux drivers already meet
this, right? If not, please let us know and they will be fixed.

> o Hardening with event logging

See the above comment about RAS.

> o Hardening with diagnostics

Ah, but most hardware does not support diagnostics. What do you
suggest be done for this?

> o Hardening with resource monitoring and statistics

The middle management layer, right? I'll get into my response of this
once I've gone over the spec.

> We've also identified some key areas we feel are most critical
> to overall system stability and plan to focus initial hardening
> efforts on drivers for network interface cards, physical storage,
> and logical storage.

In a quick look at your example code and documentation, this is all for
the 2.4 kernel. As the 2.5 deadline is almost a month away, do you have
any intention of trying to get these features and layers into the 2.5
kernel? And if not, are you willing to wait until the 2.7 kernel is
opened up?

That's probably enough questions for now :)

thanks,

greg k-h

2002-09-21 02:09:46

by Pete Zaitcev

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

> Obvious this is a way for the telecom folks to get something for free that
> really should be paid for by funding the project with CASH. Or funding
> (a) startup(s) related to generating such support.

Andre, if I read you right, you are articulating the following
idea: "Those guys collect drivers written by students and try
to run them in production. Of course, it cannot work. If paid
professionals wrote them, there would be no problem."

If this is what you are saying here, it is very misguided.
I had a chance to examine some of the drivers written by paid
professionals, and the picture was pretty bleak. Also, the
problem of hardening is not unique to Linux or Open Source;
I have had run-ins with it before.

So, I do not think there's a budgetary issue here. I talked to
the C-G Linux folks at OLS, and they do have funding. But I do
not think the hardening is going to fly the way they push it,
for two technical reasons.

First, you cannot race crappy driver writers. As soon as you
harden and qualify something, technology changes and brings
a whole bunch of crappy drivers.

Second, the resulting "hardened" system is no less fragile than
it was before.

If I was going the C-G Linux, I would abandon the "hardening"
efforts as they are now, and shift in-house hackers to work on
clusters and UML (including a cluster of UMLs).

As far as giving goes, the C-G people expended a lot of effort
on documentation of their wishes (again, judging by their OLS
performance). And I mean *A F. LOT* of effort. If they
coded as much as they wrote reports and reviews, we'd probably
have something working by now.

-- Pete

2002-09-21 02:55:27

by Rhoads, Rob

Subject: RE: [ANNOUNCE] Linux Hardened Device Drivers Project

> Obvious this is a way for the telecom folks to get something
> for free that
> really should be paid for by funding the project with CASH.
> Or funding (a) startup(s) related to generating such support.
>
> Regardless, it takes (fill in the blank) to boldly ask people
> to add APIs
> for an industry who is only interested in using and not contributing.
> Prove that all the stuff which is going to be plugged into these
> security-hole^Wbug-generators^Wfeatures will be scheduled for
> open source.

This project is open to anyone who wants to participate and is
being paid for by Intel and a host of other companies. The
idea is to enable Linux to play in the Carrier space with all
the work given away under the GPL.

> Or this another attempt to try and take over the license and shove BSD
> down the piles?

The project is open and released under the terms of the GPL.

>
> Pointed Blunt Raw, but nice.
>
> Regards,
>
> Andre Hedrick
> LAD Storage Consulting Group
>
> PS: I see a lot of "wants", are there any "gives" ?

What, paying professional developers to work on an Open Source project
and giving their work away under the terms of the GPL isn't enough?

+=+=+
Rob Rhoads mailto:[email protected]
Staff Software Engineer office: 503-677-5498
Telecom Software Platforms
Intel Communications Group

This email message solely contains my own personal views, and not
necessarily those of my employer.

2002-09-21 03:29:05

by Andre Hedrick

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Fri, 20 Sep 2002, Pete Zaitcev wrote:

> > Obvious this is a way for the telecom folks to get something for free that
> > really should be paid for by funding the project with CASH. Or funding
> > (a) startup(s) related to generating such support.
>
> Andre, if I read you right, you are articulating the following
> idea: "Those guys collect drivers written by students and try
> to run them in production. Of course, it cannot work. If paid
> professionals wrote them, there would be no problem."

You can read that into it sure, how about reading the other side.
Also Pete, you know me better than to paint me into that corner so bogus.
Treat the students as professionals, as they will be soon enough.

Sheesh, fund a university project to get the fresh young minds to derive
the future fabric. Regardless of whether the students are paid or offered
scholarships in return, would it not be a "WIN-WIN" for "ALL"?
Now the cherry on top comes to a few or many who are super talented, and
find they have a career resulting from the work.

> If this is what you are saying here, it is very misguided.
> I had a chance to examine some of drivers written by paid
> professionals, and the picture was pretty bleak. Also, the
> problem of hardening is not unique to Linux or Open Source,
> I had runs with it before.
>
> So, I do not think there's a budgetary issue here. I talked to
> the C-G Linux folks at OLS, and they do have funding. But I do

So if this is true, where is the sign up list for contracts based on
deliverables?

> not think the hardening is going to fly the way they push it,
> for two technical reasons.
>
> First, you cannot race crappy driver writers. As soon as you
> harden and qualify something, technology changes and brings
> a whole bunch of crappy drivers.

No, but a legally binding contract of deliverables will bring those along who
rise to the challenge, correct?

> Second, the resulting "hardened" system is no less fragile than
> it was before.

Erm, more likely the basic infrastructure for permitting in-band device
recovery and communication pathways back to the requesting thread or
application above is what appears to be lacking, but then again I could be
wrong.

> If I was going the C-G Linux, I would abandon the "hardening"
> efforts as they are now, and shift in-house hackers to work on
> clusters and UML (including a cluster or UMLs).
>
> As far as giving goes, the C-G people expended a lot of effort
> on documentation of their wishes (again, judging by their OLS
> performance). And I mean *A F. LOT* of effort. If they
> coded as much as they wrote reports and reviews, we'd probably
> have something working by now.

Nice, so they do a great dog-n-pony show?

Cheers,

Andre Hedrick
LAD Storage Consulting Group

2002-09-21 03:46:16

by Andre Hedrick

Subject: RE: [ANNOUNCE] Linux Hardened Device Drivers Project

On Fri, 20 Sep 2002, Rhoads, Rob wrote:

> > Obvious this is a way for the telecom folks to get something
> > for free that
> > really should be paid for by funding the project with CASH.
> > Or funding (a) startup(s) related to generating such support.
> >
> > Regardless, it takes (fill in the blank) to boldly ask people
> > to add APIs
> > for an industry who is only interested in using and not contributing.
> > Prove that all the stuff which is going to be plugged into these
> > security-hole^Wbug-generators^Wfeatures will be scheduled for
> > open source.
>
> This project is open to anyone who wants to participate and is
> being paid for by Intel and a host of other companies. The

Explain how it is paid and to whom?

> idea is to enable Linux to play in the Carrier space with all
> the work given away under the GPL.

Rephrase: "Carrier space" needs Linux to succeed.


> > Or this another attempt to try and take over the license and shove BSD
> > down the piles?
>
> The project is open and released under the terms of the GPL.

Okay, is there not a cause for loading "closed source modules" via the new
APIs and management tools?

> > PS: I see a lot of "wants", are there any "gives" ?
>
> What paying professional developers to work on an Open Source project
> and giving their work away under the terms of the GPL isn't enough?

I am sorry, I do not understand the context.

I give away lots of work which is paid for by various companies who desire
broader support for their product. If I recall correctly, Intel is a
promoter of Serial ATA, yet it took another vendor whose interest in
working with the open source community funded the "free release" of a
generic IOPS driver layer change, and the crossover support for various
archs. Just so you know, they are totally aware the release of the
project would also enable their competition. They are betting their product
is superior and put the money down to prove it!

So why not have the "Carrier" people post a list of tasks to be completed
and the monetary value and let the opensource community play in the
bidding process to earn the contract?

You specify it to be totally open source and IP generated shall be deemed
public and not open for patents.

Will the carrier folks step up to the BAR and do the right thing by the
many individuals in the community or shield themselves in wordy
specifications and compliance terms?

Please point out why this is wrong?

Cheers,

Andre Hedrick
LAD Storage Consulting Group

2002-09-21 03:52:29

by Mark Veltzer

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

> This project is open to anyone who wants to participate and is
> being paid for by Intel and a host of other companies. The
> idea is to enable Linux to play in the Carrier space with all
> the work given away under the GPL.

Enable Linux to play in the Carrier space. That IS interesting. This is, I
expect, as opposed to all the other operating systems which run on Intel
platforms which are already robust and already play in the "Carrier space" ?
The patronization of commercial companies never ceases to amaze me...

Let me reverse this: Intel wants to play in the Carrier space and needs Linux
to do it... Ok. Now we've got it right. I think this is what other posters
thought of as "taking". Intel has everything to gain here since it was never
a player in the "Carrier space".

Don't get me wrong, I'm not saying NO to free code but if we really have to
come face to face with the truth then it's quite obvious from history that
commercial companies aren't that hot when it comes to coding (it's my general
experience that code that comes out of commercial companies needs to be more
heavily reviewed because marketing/featurism and deadlines produce bad
code...).

Regarding marketing slogans. Even a bad mouse driver can screw up your
system. This means that you just have to write good driver code. I certainly
wouldn't want all of this commercial bla bla to turn into a big fat API where
old and new semantics are mixed and are not clear like in the "other" carrier
grade operating system which is well known and runs on Intel. APIs have to be
as lean as possible with robust semantics. This should not change and this is
actually the chief strength of Linux (because all driver code is available
the API is quite mature and robust). All that is left to do is improve driver
code. So why don't you call the project "Driver improvement project" or
something like that and drop the commercial bla bla. Under this title the
project has probably been going on (under some form or another) since 1991.

> What paying professional developers to work on an Open Source project
> and giving their work away under the terms of the GPL isn't enough?

You mean when Intel finally gets a real operating system to run on its
machines for PRACTICALLY NOTHING ?!? I think Intel is getting a real sweet
deal here. I would love to be a chip maker and get a full operating system
(with thousands of applications and a full desktop) for the price of a few
developers. Also the big commercial noise that such a project would generate
would probably win a few fat accounts away from SUN eh ?!?

BTW: would you be paying developers to work on other architecture drivers too
? ! ? That would be interesting but I guess the answer is no... This is a
major problem since the arsenal of tools at the disposal of a driver coder in
Linux is quite generic (with regard to platform). When you aim to produce a
driver just for i386 you tend to hardcode x86 details into your driver which
makes for a bad driver since using the platform agnostic Linux arsenal would
probably produce a better driver. You do code for x86 but if you are
developing a mixed set of drivers (for different archs) then you tend to
understand the generic tools semantics better and use them better which in
turn produces better drivers. It may sound strange but when coding in Linux
you're better off being familiar with several archs and working on details of
several archs because you tend to produce better drivers that way (even if
the drivers are arch specific). The generic tools that you mentioned
(regarding more robust error handling etc..) which seem to me like
improvements in API would certainly need to be approved for ALL architectures
which in turn will need a big janitor type project which means that it's out
of this development cycle.

Mark.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE9i/DpxlxDIcceXTgRAkWxAJ9vUND+LnzCg3c0dQepZ6sYFwBkEwCgvVpb
YA6gC8XeeM4Ct/w44SHXLhA=
=HP4A
-----END PGP SIGNATURE-----

2002-09-21 05:30:19

by Greg KH

Subject: my review of the Device Driver Hardening Design Spec

<sorry for the mangled header on the first send of this>

On Fri, Sep 20, 2002 at 06:40:54PM -0700, Greg KH wrote:
> Hi,
>
> I've just started to read over the published spec, and will reserve
> comment on it, and the example code you've created after I'm done
> reading it.

Ok, here's some comments on the 0.5h release of the Device Driver
Hardening Design Specification:

(I'll skip the intro, and feel good sections and get into the details
that you lay out, starting in section 2)

Section 2:
2.1:
- do NOT use /proc for driver info. Use driverfs.
- If you are using a kernel version that does not have driverfs,
  put all /proc driver info under /proc/drivers, which is where
  it belongs.
- Only have 1 value per file, and no binary data in the files.
- Do not put the "kernel version for which the driver was
  compiled", as that _always_ must match the kernel version that
  is running, so is redundant.
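
To show what "1 value per file" buys on the consumer side, here is a
small user-space sketch. It is only an illustration: the helper name
parse_single_value() is made up, and it assumes the attribute text has
already been read into a string. With one value per file, the reader
can reject anything that is not exactly one number instead of parsing
a free-form report.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Parse the contents of a one-value-per-file attribute.
 * Accepts a single decimal integer (strtol also tolerates leading
 * whitespace) with an optional trailing newline.
 * Returns 0 on success, -1 if the text is empty, non-numeric,
 * out of range, or holds more than one value. */
static int parse_single_value(const char *text, long *out)
{
	char *end;
	long v;

	assert(text != NULL && out != NULL);
	errno = 0;
	v = strtol(text, &end, 10);
	if (end == text || errno == ERANGE)
		return -1;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -1;	/* trailing data: more than one value */
	*out = v;
	return 0;
}
```

A file containing "1500\n" parses cleanly, while something like
"rx 10 tx 20" is refused, which is the kind of multi-value text the
rule is meant to keep out of these files.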

2.2:
- do NOT use typedef

2.5.5:
- you do not have to always check data returned from functions,
if you wrote the functions in the first place. Redundant
checking of all data within the kernel, slows things down.
Sure, some checking is good, but do not say that it is a
requirement, or no one will want to use your driver.

The majority of section 2 is very nice, it's a good list of things that
drivers should do.


Section 3:

Wow, where to start...

The Common Statistic Manager:
- why does this have to live in the kernel? It should be in
userspace, grabbing all of the data from the /proc files you
just specified in section 2.1.

POSIX event logging:
- wow, not much I can say here, that hasn't already been said
before :(

Diagnostics:
- now these are a good idea. A common subsystem that drivers
can register what kind of diagnostics they can run on their
hardware, nice.

3.1.1:
- UUIDs!!!??? You have got to be kidding. Here, for the
benefit of those who have not read this, I'll quote:
"Each subsystem, and each resource contained within each
subsystem, needs to be uniquely identified. In order to
do this a hardened driver developer shall pre-assign a
Universally Unique Identifier (UUID) as the Subsystem ID
for each subsystem, and shall provide a means to assign
a unique Resource ID string for each resource within a
subsystem."

So for every resource, a string shall be associated with it.
But that means for most resources, the string will take up more
memory than the resource itself does. Does that make sense?

It's also up to the driver to create these resource ids at
runtime and guarantee their uniqueness over the lifetime of the
kernel. How in the world can you expect every driver author to
do this? Any example code out there?

And what are these UUIDs going to be used for, ah, event
logging. Enough said.

3.2 Statistics:
You actually want every driver to support SNMP compliant
statistics groups within themselves? Why? What a bloat of a
kernel.

All of this should be done (if at all) from userspace.


3.2.5.2:
(I'm not condoning ANY of these functions or code, just trying to point out how
they should be done properly, if they were to be in the kernel.)
- do not use typedef
- struct stat_info does not need *unit, as that is already
specified in the scale field, right?
- the stat_value_t union is just a horrible abomination, don't
do that.

3.3 Diagnostics:
- not a bad idea, but some work could be done on the
implementation. Would fit in nicely with the device driver
model in 2.5. For 2.4, it would be another subsystem a driver
would register with.

3.3.3.2:
- no typedefs
- run() is horrible, you are trying to fit all kinds of possible
diagnosis into one function callback. Not a good idea.
Break the different kinds of callbacks out into different
functions. That ensures type safety, right now you are just
creating another ioctl() type mess.
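
To make the point about separate callbacks concrete, here is a minimal
sketch of what a typed diagnostics interface could look like. Every name in
it (struct diag_ops, diag_read_reg(), the demo_* functions) is hypothetical
and chosen for the example; it is not the spec's API and not an existing
kernel interface.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-in for a device; a real driver would hold its own state here. */
struct diag_device {
	unsigned int regs[4];
};

/* One callback per kind of diagnostic, each with its own typed arguments. */
struct diag_ops {
	int (*selftest)(struct diag_device *dev);
	int (*read_reg)(struct diag_device *dev, unsigned int index,
			unsigned int *value);
};

static int demo_selftest(struct diag_device *dev)
{
	/* Pretend register 0 must hold a known signature. */
	return dev->regs[0] == 0xabcd ? 0 : -EIO;
}

static int demo_read_reg(struct diag_device *dev, unsigned int index,
			 unsigned int *value)
{
	if (index >= 4)
		return -EINVAL;
	*value = dev->regs[index];
	return 0;
}

static const struct diag_ops demo_ops = {
	.selftest = demo_selftest,
	.read_reg = demo_read_reg,
};

/* Core-side helper: a missing callback is reported, never dereferenced. */
static int diag_read_reg(const struct diag_ops *ops, struct diag_device *dev,
			 unsigned int index, unsigned int *value)
{
	if (!ops || !ops->read_reg)
		return -ENOSYS;
	return ops->read_reg(dev, index, value);
}
```

Because each diagnostic has its own prototype, the compiler checks the
arguments at every call site, which a single catch-all run() taking an opaque
argument cannot do.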

3.4 Event logging:
- I'm not even going to touch this, sorry.

4: High Availability
- are you all working with the existing HA group?

4.1:
- um, what are you trying to say here? This section is
pointless. Yes we all think Hot Swap is a good idea, that's
why Linux currently supports it.

4.2:
- RAID and ethernet bonding are nice. Again, Linux already has
projects and support for these things. Why mention them?


The rest of this section is fine, and I welcome any test harnesses that
are created to do this kind of fault injection for driver testing.
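
For readers who have not seen software fault injection, the idea can be
sketched in a few lines: wrap the hardware access, make it fail on demand,
and check that the driver's error path copes. Everything below (struct
fault_injector, dev_read_status(), driver_poll()) is a hypothetical stand-in
written for this example, not code from the project.

```c
#include <errno.h>

/* Hypothetical fault-injection state: fail every Nth hardware read. */
struct fault_injector {
	unsigned int fail_every;	/* 0 disables injection */
	unsigned int calls;		/* reads attempted so far */
};

static int fi_should_fail(struct fault_injector *fi)
{
	if (!fi->fail_every)
		return 0;
	return (++fi->calls % fi->fail_every) == 0;
}

/* Stand-in for a hardware read; a real harness would wrap the bus access. */
static int dev_read_status(struct fault_injector *fi, unsigned int *status)
{
	if (fi_should_fail(fi))
		return -EIO;
	*status = 0x1;
	return 0;
}

/* Driver path under test: retry a bounded number of times, then give up. */
static int driver_poll(struct fault_injector *fi, unsigned int max_retries)
{
	unsigned int status, tries;

	for (tries = 0; tries <= max_retries; tries++) {
		if (dev_read_status(fi, &status) == 0)
			return 0;
	}
	return -EIO;
}
```

The interesting property to test is that driver_poll() terminates with an
error when every read fails, rather than looping forever.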

5:
- Here you back-pedal on everything you said up till now. Let
me summarize what is said in these 3 paragraphs in 1 sentence:
"Yes, all these things are well and good, but don't let
them affect the currently great performance Linux has
today."
Sorry, but you can't have it both ways.

5.1:
- do NOT use #ifdef in the .c files. Only in .h files.
- why is CONFIG_DRIVER_HOTSWAP an option? What does it do that
CONFIG_HOTPLUG does not do today?
- actually, what do any of these CONFIG_ options do, and why
would someone not want the CONFIG_DRIVER_ROBUST to be always
enabled?
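
The usual way to keep #ifdef out of .c files is to confine it to a header
that provides either the real declaration or an empty inline stub. The sketch
below shows both halves in one listing; CONFIG_MYDRV_DIAG and the mydrv_*
names are made up for the example.

```c
/* ---- header part (would live in a .h file): the only #ifdef ---- */
#ifdef CONFIG_MYDRV_DIAG		/* hypothetical option name */
int mydrv_diag_register(const char *name);	/* real version built elsewhere */
#else
static inline int mydrv_diag_register(const char *name)
{
	(void)name;
	return 0;	/* feature compiled out: succeed and do nothing */
}
#endif

/* ---- driver part (would live in a .c file): no #ifdef needed ---- */
int mydrv_init(void)
{
	int err;

	err = mydrv_diag_register("mydrv");
	if (err)
		return err;
	return 0;
}
```

The call site reads the same whether or not the option is enabled, and the
compiler still type-checks the arguments in both configurations.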


In summary, I think that a lot of people have spent a lot of time in
creating this document, and the surrounding code that matches this
document. I really wish that a tiny bit of that effort had gone into
contacting the Linux kernel development community, and asking to work
with them on a project like this. Due to that not happening, and by
looking at the resultant spec and code, I'm really afraid the majority
of that time and effort will have been wasted.

What do I think can be salvaged? Diagnostics are a good idea, and I
think they fit into the driver model in 2.5 pretty well. A lot of
kernel janitoring work could be done by the CG team to clean up, and
harden (by applying the things in section 2) the existing kernel
drivers. That effort alone would go a long way in helping the stability
of Linux, and also introduce the CG developers into the kernel community
as active, helping developers. It would allow the CG developers to
learn from the existing developers, as we must be doing something right
for Linux to be working as well as it does :)

Also, open specs for the hardware the CG members produce, to allow
existing kernel drivers to be enhanced (instead of having to be reverse
engineered), and new kernel drivers to be created, would also go a long
way in helping out both the CG's members and the entire Linux
community's cause of having a robust, stable kernel be achieved more easily.
Closed specs, and closed drivers do not help anyone.


thanks for reading this far,

greg k-h

2002-09-21 10:36:50

by Bernd Eckenfels

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

In article <[email protected]> you wrote:
> Regardless, it takes (fill in the blank) to boldly ask people to add APIs
> for an industry who is only interested in using and not contributing.

There is more than one industry interested in it. It simply sucks if your
kernel panics only because you remove a SCSI cable. It also sucks if your
kernel panics only because you have a bad block on a disk.

Companies which build carrier grade Linux systems (like HP, IBM and SGI) _do_
contribute to making Linux an enterprise system.

So personally I find this project good, and involving the Linux testing
community is needed. But I don't think that a lot of new APIs are needed in
the first place. (Well, possibly for things like path failover/md somebody
needs to define actual error handling, like it is done currently), but
"debugging" all drivers by review is needed. On the other hand, the fact that
this has not happened just shows us that it is not trivial to find a second
person who understands the hardware's error behaviour.

Greetings
Bernd

2002-09-21 11:15:38

by Russell King

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Sat, Sep 21, 2002 at 12:41:59PM +0200, Bernd Eckenfels wrote:
> In article <[email protected]> you wrote:
> > Regardless, it takes (fill in the blank) to boldly ask people to add APIs
> > for an industry who is only interested in using and not contributing.
>
> There is more than one industry interested in it. It simply sucks if your
> kernel panics only because you remove a SCSI cable. It also sucks if your
> kernel panics only because you have a bad block on a disk.

Both of which I'd classify as bugs. I recently submitted a few patches
that fix some of the idiotic or bad error handling in the 2.4 SCSI
layer. Although they didn't completely fix some of the problems, it
did highlight some of the problem areas.

> On the other hand, the reason this has not
> happened just shows us that it is not trivial to find a second person who
> understands hardware's error behaviour.

Or people with broken hardware don't report that the error paths are
broken; they just fix their hardware.

I have a Syquest 270MB drive here. Bought from new, but it has never
worked 100% properly. It mostly complains about media errors and the
like. After several rounds with Syquest, I lost faith in it. However,
I still have it. Why?

I keep test filesystems on the cartridges. Perfect when I want to run
some tests that could well take out a filesystem, or when I want to test
out the SCSI error handling. That's how I found that the 2.4 SCSI error
handling code has the possibility to eat disks alive when it encounters
an error.

Would extra API's have helped find this? Would it have made the driver
more stable? Would it have caught the bug in my SCSI driver that caused
it not to request sense on error and therefore throw the SCSI subsystem
into a never-ending loop? The answers are: no, no, no.

Would testing with broken hardware have found this? Would it make the
driver more stable? Yes, and yes.

IMO, driver stability comes with testing and review by people who know
both the hardware _and_ who know the kernel API inside out. There seems
to be a lack of the latter, and a lack of people with broken hardware for
the former.

So next time when your hard disk develops media errors, or your network
card starts corrupting data, think about whether it would be a useful
test device to someone. (Obviously not if it's completely 100% dead.)

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-09-21 12:46:08

by Pete Zaitcev

Subject: Re: my review of the Device Driver Hardening Design Spec

> - actually, what do any of these CONFIG_ options do, and why
> would someone not want the CONFIG_DRIVER_ROBUST to be always
> enabled?

Probably performance blows when it is enabled.

> In summary, I think that a lot of people have spent a lot of time in
> creating this document, and the surrounding code that matches this
> document. I really wish that a tiny bit of that effort had gone into
> contacting the Linux kernel development community, and asking to work
> with them on a project like this. Due to that not happening, and by
> looking at the resultant spec and code, I'm really afraid the majority
> of that time and effort will have been wasted.

Eek. They never mentioned any code before now. In fact they
explicitly said they weren't going to code before the spec
was ready.

-- Pete

2002-09-21 15:19:00

by Martin J. Bligh

Subject: Re: my review of the Device Driver Hardening Design Spec

> What do I think can be salvaged? Diagnostics are a good idea, and I
> think they fit into the driver model in 2.5 pretty well. A lot of
> kernel janitoring work could be done by the CG team to clean up, and
> harden (by applying the things in section 2) the existing kernel
> drivers. That effort alone would go a long way in helping the stability
> of Linux, and also introduce the CG developers into the kernel community
> as active, helping developers. It would allow the CG developers to
> learn from the existing developers, as we must be doing something right
> for Linux to be working as well as it does :)

People with fault injection hardware are also extremely helpful
(assuming they do something useful with it). That's not something most
of the community would have access to, but the CG-type people probably
do. A couple of people who spent their full time kicking the hell out
of Sequent's fibrechannel system made a massive difference to its
quality and reliability.

That's definitely something this project could help by doing ...
whatever people feel about some of the more theoretical aspects of
their work being discussed, I think few would object to some real-world
help from people tracking down and fixing existing bugs, especially in
the error handling.

M.

2002-09-21 18:11:36

by Greg KH

Subject: Re: my review of the Device Driver Hardening Design Spec

On Sat, Sep 21, 2002 at 08:51:15AM -0400, Pete Zaitcev wrote:
>
> > In summary, I think that a lot of people have spent a lot of time in
> > creating this document, and the surrounding code that matches this
> > document. I really wish that a tiny bit of that effort had gone into
> > contacting the Linux kernel development community, and asking to work
> > with them on a project like this. Due to that not happening, and by
> > looking at the resultant spec and code, I'm really afraid the majority
> > of that time and effort will have been wasted.
>
> Eek. They never mentioned any code before now. In fact they
> explicitly said they weren't going to code before the spec
> was ready.

Oh, there's lots of code:
A "hardened" binary kernel driver:
http://unc.dl.sourceforge.net/sourceforge/hardeneddrivers/sampledriver-0.1-1.i386.rpm
(um people, why a binary? Where's the source for this?)

Some header files:
http://unc.dl.sourceforge.net/sourceforge/hardeneddrivers/ddhardening_headerfiles.tar.gz

A bunch of diagnostics code:
http://linux-diag.sourceforge.net/code/cpu_affinity-v0.2.1.tar.gz
http://linux-diag.sourceforge.net/code/pmem-0.2.1.tar.gz
http://linux-diag.sourceforge.net/code/crms-0.1.1.tar.gz

And a bunch of resource monitoring code:
http://sourceforge.net/project/showfiles.php?group_id=54710

CG people, are you wanting any of this code to be in the main kernel?
If so, why have you not submitted it to anyone yet? And why did you
write any code before the spec was ready if you said you were not going
to do that?

thanks,

greg k-h

2002-09-21 21:47:19

by Francois Romieu

Subject: Re: my review of the Device Driver Hardening Design Spec

[Cc list trimmed]

Greg KH <[email protected]> :
[...]
> Oh, there's lots of code:
> A "hardened" binary kernel driver:
> http://unc.dl.sourceforge.net/sourceforge/hardeneddrivers/sampledriver-0.1-1.i386.rpm
> (um people, why a binary? Where's the source for this?)

In the cvs. See:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hardeneddrivers/sample_driver/src/

--
Ueimor

2002-09-21 21:58:00

by Mr. James W. Laferriere

Subject: Re: my review of the Device Driver Hardening Design Spec


Hello Francois , A suggestion for Documentation . Please produce
output of Windows docs into non-proprietary formats .
txt , pdf , ps , ... Tia , JimL

On Sat, 21 Sep 2002, Francois Romieu wrote:

> [Cc list trimmed]
>
> Greg KH <[email protected]> :
> [...]
> > Oh, there's lots of code:
> > A "hardened" binary kernel driver:
> > http://unc.dl.sourceforge.net/sourceforge/hardeneddrivers/sampledriver-0.1-1.i386.rpm
> > (um people, why a binary? Where's the source for this?)
>
> In the cvs. See:
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hardeneddrivers/sample_driver/src/
>
> --
> Ueimor

+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+

2002-09-21 22:12:55

by Francois Romieu

Subject: Re: my review of the Device Driver Hardening Design Spec

Mr. James W. Laferriere <[email protected]> :
[...]
> Hello Francois , A suggestion for Documentation . Please produce
> output of Windows docs into non-proprietary formats .
> txt , pdf , ps , ... Tia , JimL

I don't remember having written a doc in proprietary format lately.

--
Ueimor

2002-09-21 22:32:50

by Mr. James W. Laferriere

Subject: Re: my review of the Device Driver Hardening Design Spec


Hello Francois , found the pdf of the same doc I believe .
Tia , JimL

On Sat, 21 Sep 2002, Mr. James W. Laferriere wrote:

>
> Hello Francois , ... Hth , JimL
>
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hardeneddrivers/docs/DD_hardening_spec.doc
>
> On Sun, 22 Sep 2002, Francois Romieu wrote:
> > Mr. James W. Laferriere <[email protected]> :
> > [...]
> > > Hello Francois , A suggestion for Documentation . Please produce
> > > output of Windows docs into non-proprietary formats .
> > > txt , pdf , ps , ... Tia , JimL
> > I don't remember having written a doc in proprietary format lately.
>
> +------------------------------------------------------------------+
> | James W. Laferriere | System Techniques | Give me VMS |
> | Network Engineer | P.O. Box 854 | Give me Linux |
> | [email protected] | Coudersport PA 16915 | only on AXP |
> +------------------------------------------------------------------+
>

+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+

2002-09-21 22:29:57

by Mr. James W. Laferriere

Subject: Re: my review of the Device Driver Hardening Design Spec


Hello Francois , ... Hth , JimL

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hardeneddrivers/docs/DD_hardening_spec.doc

On Sun, 22 Sep 2002, Francois Romieu wrote:
> Mr. James W. Laferriere <[email protected]> :
> [...]
> > Hello Francois , A suggestion for Documentation . Please produce
> > output of Windows docs into non-proprietary formats .
> > txt , pdf , ps , ... Tia , JimL
> I don't remember having written a doc in proprietary format lately.

+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network Engineer | P.O. Box 854 | Give me Linux |
| [email protected] | Coudersport PA 16915 | only on AXP |
+------------------------------------------------------------------+

2002-09-21 22:38:01

by Greg KH

Subject: Re: my review of the Device Driver Hardening Design Spec

On Sat, Sep 21, 2002 at 11:52:19PM +0200, Francois Romieu wrote:
> [Cc list trimmed]
>
> Greg KH <[email protected]> :
> [...]
> > Oh, there's lots of code:
> > A "hardened" binary kernel driver:
> > http://unc.dl.sourceforge.net/sourceforge/hardeneddrivers/sampledriver-0.1-1.i386.rpm
> > (um people, why a binary? Where's the source for this?)
>
> In the cvs. See:
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hardeneddrivers/sample_driver/src/

Thanks for pointing this out, I missed it.

Hm, if this is the code that the CG group is proposing for reliable
drivers, we are all in trouble. See:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hardeneddrivers/sample_driver/src/sampledriver.h?rev=1.1.1.1

as a very small example of what not to do :)

I'd be glad to provide concrete criticism of the other files in this
directory, if I thought people would actually change their programming
style to follow what their own spec says to do...

{sigh}

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hardeneddrivers/sample_driver/src/sampledriver_init.c?rev=1.1.1.1
contains so many examples of bad style, and real bugs...

greg k-h

2002-09-22 00:56:04

by Andre Hedrick

Subject: Re: my review of the Device Driver Hardening Design Spec



#ifndef __MYDRVR_H__
#define __MYDRVR_H__

#define MYDRVR_IOCTL_MAGIC 'm'

#define MYDRVR_IOCTL_RCV _IO(MYDRVR_IOCTL_MAGIC, 0)

typedef struct _MYDRVR_CONTEXT {

unsigned long cRCV; /* number of RCV ioctls made */
unsigned long cDeviceOpen; /* device open count */

} MYDRVR_CONTEXT, *PMYDRVR_CONTEXT;

#define ZEROMEMORY(pAddr, cbSize) \
{ \
int i; \
char *d = (char *)(pAddr); \
for ( i = 0; i < (cbSize); i++, *d++ = 0 ); \
}

#endif /* __MYDRVR_H__ */

SHEESH, COULD THEY LEARN THERE IS MORE TO LIFE THAN ALL CAPS??

Sweet, that is "stick ugly"!

stick ugly : you can beat it with a stick and it cannot be made any uglier
than it is presently.
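
For contrast, a version of that header's data structure in ordinary kernel
style might look roughly like the sketch below: a plain struct with lowercase
names, no typedef or pointer typedef, and memset() in place of the hand-rolled
ZEROMEMORY macro. The names mydrvr_context and mydrvr_context_init() are
chosen for this example only; the ioctl definitions are left out so that the
fragment compiles on its own.

```c
#include <string.h>

/* Per-driver counters, written as a plain struct without typedefs. */
struct mydrvr_context {
	unsigned long rcv_count;	/* number of RCV ioctls made */
	unsigned long open_count;	/* device open count */
};

/* memset() replaces the ZEROMEMORY macro and its hidden loop variable. */
static void mydrvr_context_init(struct mydrvr_context *ctx)
{
	memset(ctx, 0, sizeof(*ctx));
}
```

Apart from readability, a function avoids the macro's pitfalls: the
brace-block expansion misbehaves in an if/else without braces, and its local
"i" can shadow a caller's variable.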

Andre Hedrick
LAD Storage Consulting Group

On Sat, 21 Sep 2002, Greg KH wrote:

> On Sat, Sep 21, 2002 at 11:52:19PM +0200, Francois Romieu wrote:
> > [Cc list trimmed]
> >
> > Greg KH <[email protected]> :
> > [...]
> > > Oh, there's lots of code:
> > > A "hardened" binary kernel driver:
> > > http://unc.dl.sourceforge.net/sourceforge/hardeneddrivers/sampledriver-0.1-1.i386.rpm
> > > (um people, why a binary? Where's the source for this?)
> >
> > In the cvs. See:
> > http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/hardeneddrivers/sample_driver/src/
>
> Thanks for pointing this out, I missed it.
>
> Hm, if this is the code that the CG group is proposing for reliable
> drivers, we are all in trouble. See:
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hardeneddrivers/sample_driver/src/sampledriver.h?rev=1.1.1.1
>
> as a very small example of what not to do :)
>
> I'd be glad to provide concrete criticism of the other files in this
> directory, if I thought people would actually change their programming
> style to follow what their own spec says to do...
>
> {sigh}
>
> http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/*checkout*/hardeneddrivers/sample_driver/src/sampledriver_init.c?rev=1.1.1.1
> contains so many examples of bad style, and real bugs...
>
> greg k-h


2002-09-23 06:11:45

by Randy.Dunlap

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Fri, 20 Sep 2002, Rhoads, Rob wrote:

| Project Announcement:
| --------------------
|
| Initially we've created a specification, a few kernel modules
| that implement a set of driver programming interfaces, and
| a sample device driver that demonstrates those interfaces.
| -

Only addressing spec bugs for now.
More comments tomorrow when I'm more awake.

section
3.1.1.1 "Table 2 takes a closer look at the fields...."
No, it doesn't.

3.2.3 "The CONFIG_DRIVER_STATISTICS flag...."
but section 5.1 calls it CONFIG_DRIVER_STATS.

3.3.2 "The CONFIG_DRIVER_STATISTICS build configuration option"
should be CONFIG_DRIVER_DIAGNOSTICS
and change "statistics support" to "diagnostics support"

3.4.2.7.3, example 2: missing final '|' after "%s"

3.4.3.8.1, Comments on #defines:
aren't several of these backwards?

3.4.3.8.3, last #define: bad font change.

--
~Randy

2002-09-23 12:25:40

by Lars Marowsky-Bree

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On 2002-09-20T17:26:47,
"Rhoads, Rob" <[email protected]> said:

Hi Rob,

I fully support the idea to audit the Linux device drivers - using guidelines,
hardware fault injection, stress testing etc - and fixing any potential bugs.
This is obviously a very important task, because the drivers are some of the
most ugly code I've seen in the kernel.

"Pro-active monitoring", ie by basically gathering whatever statistics are
available and feeding them to some sort of user-space application and then
trying to deduce a potential failure is also a very valuable goal; so exposing
more statistics seems definitely good, too. As long as that doesn't introduce
even more errors...

Any help you can offer on the above is surely appreciated by all involved and
will have a direct, positive impact on Linux.

That said, and the fluff in your specification aside (which was very likely
necessary for management ;-), your spec certainly contains some good points on
how to write stable and robust code. (Aside from the comments the others have
raised already regarding event logging and that of course all recommendations
need to be thoughtfully applied to the case in question)

The statistics can best be exposed via driverfs or /proc (for kernels which
don't have driverfs); however, neither the statistics analyser nor the SNMP
agent pre-processing belongs in the kernel itself. Keep the drivers as lean as
possible; that will introduce fewer errors at this level. I object to the CSM
being in kernel space. Having a more or less common API for the statistics to
be gathered and exposed by the drivers would be highly valuable indeed though.

What are your further timelines?

A lot of the above - ie, audit and test current drivers - can be done without
further planning (or at least without much more); I'm always rather amazed at
how much effort Intel, IBM and their child OSDL spent on pretty specifications
when that effort could also be applied to real work ;-)



Sincerely,
Lars Marowsky-Brée <[email protected]>

--
Principal Squirrel
Research and Development, SuSE Linux AG

``Immortality is an adequate definition of high availability for me.''
--- Gregory F. Pfister

2002-09-23 13:18:00

by Manfred Spraul

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

Lars Marowsky-Bree <[email protected]> wrote:
>
> I fully support the idea to audit the Linux device drivers - using guidelines,
> hardware fault injection, stress testing etc - and fixing any potential bugs.
> This is obviously a very important task, because the drivers are some of the
> most ugly code I've seen in the kernel.
>

Are there any recipes for stress testing drivers?
I have my own list of stress tests I run on my network drivers, but the
list is more or less random:

http://www.colorfullife.com/~manfred/net-stress/net-stresstest.txt

--
Manfred

2002-09-23 22:33:38

by Rhoads, Rob

Subject: RE: [ANNOUNCE] Linux Hardened Device Drivers Project

I appreciate all the feedback. Based on the wide variety
of ideas/comments, it looks like I need to go back and
incorporate these ideas into the document, potentially
changing areas in major ways where appropriate.

Rather than bog down this mailing list with exchanges,
I would like to move this discussion to the hardened
driver mailing list. Please don't feel like I'm
ignoring your feedback--just moving the forum.

An underlying theme tends to revolve around the binding
of the concepts of 'hardening' and RAS features being
added to drivers. We will be looking into splitting
these two different approaches out from this singular
document and into their appropriate locations.

If you are interested (even if you aren't) please go
to http://lists.sourceforge.net/lists/listinfo/hardeneddrivers-discuss
and subscribe to the mailing list.

+=+=+
Rob Rhoads mailto:[email protected]
Staff Software Engineer office: 503-677-5498
Telecom Software Platforms
Intel Communications Group

This email message solely contains my own personal views, and not
necessarily those of my employer.

2002-09-24 00:04:05

by Greg KH

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Mon, Sep 23, 2002 at 03:38:32PM -0700, Rhoads, Rob wrote:
>
> Rather than bog down this mailing list with exchanges,
> I would like to move this discussion to the hardened
> driver mailing list. Please don't feel like I'm
> ignoring your feedback--just moving the forum.

No, please don't move this off to another mailing list. This is where
the majority of all kernel programmers are, don't try to make us move to
yet-another-mailing-list just to discuss your project. If you want our
contributions, and want our input, use this list.

If you stay on smaller mailing lists, like cg-discuss and
hardened-drivers, you do not reach the widest group of people, which is
what you will have to do if you want to have a chance for your
contributions to become part of the main kernel.

> An underlying theme tends to revolve around the binding
> of the concepts of 'hardening' and RAS features being
> added to drivers. We will be looking into splitting
> these two different approaches out from this singular
> document and into their appropriate locations.

Where would these locations be?

> If you are interested (even if you aren't) please go
> to http://lists.sourceforge.net/lists/listinfo/hardeneddrivers-discuss
> and subscribe to the mailing list.

Sorry, but major kernel driver discussions should occur on lkml.

thanks,

greg k-h

2002-09-24 17:07:52

by Greg KH

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Mon, Sep 23, 2002 at 03:38:32PM -0700, Rhoads, Rob wrote:
> I appreciate all the feedback. Based on the wide variety
> of ideas/comments, it looks like I need to go back and
> incorporate these ideas into the document, potentially
> changing areas in major ways where appropriate.

Not to be a pest, but I, and a lot of other people, posted some very
specific questions in response to both your original posting, and in
response to the published specification and published code. It would be
considered proper etiquette if you would at least try to respond to
_some_ of these questions, as you did ask for them, rather than stating
that you are going to go mull over everything and come back with a
modified document.

If you don't, any expectations of people reviewing future specs, or
proposals from this project should be kept quite low.

thanks,

greg k-h

2002-09-24 19:25:46

by Rhoads, Rob

Subject: RE: [ANNOUNCE] Linux Hardened Device Drivers Project

>
> On Mon, Sep 23, 2002 at 03:38:32PM -0700, Rhoads, Rob wrote:
> > I appreciate all the feedback. Based on the wide variety
> > of ideas/comments, it looks like I need to go back and
> > incorporate these ideas into the document, potentially
> > changing areas in major ways where appropriate.
>
> Not to be a pest, but I, and a lot of other people, posted some very
> specific questions in response to both your original posting, and in
> response to the published specification and published code. It would be
> considered proper etiquette if you would at least try to respond to
> _some_ of these questions, as you did ask for them, rather than stating
> that you are going to go mull over everything and come back with a
> modified document.

I've been overwhelmed with the hailstorm of posts hitting
my mailbox, since I made the project announcement.

>
> If you don't, any expectations of people reviewing future specs, or
> proposals from this project should be kept quite low.
>

The responses I have received have fallen into several buckets:

1. INTEL???? wtf? You're evil. Go away.
2. Good goal; bad approach.
3. Good goal, bad approach in places, here are areas for improvement.
4. Good goal, here are my thoughts and questions on X.

Keep in mind the original post was the announcement of a new project.
Sure, there was a big document with lots of information--but the
project is STARTING, not ending. Personally I didn't think that
there would be a huge following on LKML. I thought those interested
in the topic would read the spec we have, see where they like it
and where they don't and then hopefully give me feedback to make
the spec and the results better. This isn't something that
can be solved overnight.

What I'm seeing from the messages is that a lot of people have
been thinking about this topic, and a lot of people have ideas
on how they think the problems are best solved.

Areas of common desire to be looked at:

1. validate kernel interfaces (i.e.: kernel janitor)
2. common logging mechanisms (i.e.: not POSIX logging)
3. validation/testing tools capabilities
4. driver-howto; best known methods by kernel driver
developers for writing stable maintainable drivers

I am trying to understand what people are looking for so that I
can provide meaningful posts.

That said, I will go back and address the specific questions that
you and others have asked.

> thanks,
>
> greg k-h
>

-RobR

2002-09-24 19:51:17

by Greg KH

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Tue, Sep 24, 2002 at 12:30:28PM -0700, Rhoads, Rob wrote:
>
> That said, I will go back and address the specific questions that
> you and others have asked.

Thank you.

greg k-h

2002-09-24 23:24:20

by Rhoads, Rob

Subject: RE: [ANNOUNCE] Linux Hardened Device Drivers Project

> From: Greg KH [mailto:[email protected]]
> On Tue, Sep 24, 2002 at 02:46:35PM -0700, Rhoads, Rob wrote:
> >
> > First throw away any idea of a spec. That was a bad idea. :)
> >
> > Next, turn the first section, "Stability & Reliability" of our
> > original doc into a "Driver Hardening HOWTO". It would be a
> > list of characteristics that all good drivers should have,
> > packed with examples to back it up.
>
> Sounds very good. I recommend that it be written in DocBook and added
> to the Documentation/DocBook directory of the kernel tree.

Agreed. :-)

>
> > BTW, by no means did I or anyone involved on this project, ever
> > mean to imply that the current drivers in the kernel are "bad".
> > Rather, I'd like to capture a list of the best practices and
> > document them. In any event our current list needs to be
> > strengthened with concrete examples. My thinking is that we
> > should work with the Kernel Janitor project. This is where
> > Intel can probably really help out.
>
> Great, the janitor project can really use extra people to help out. I
> suggest that you read over their TODO list again and pick up the pieces
> from there that are missing from your "Driver Hardening HOWTO".

I will do.

[snip]

>
> It would be wonderful if there were some good FI tools that were
> available for our use. It can only help to make better drivers.
>
> Thank you for your response, and for listening to the community.
>
> greg k-h
>

-RobR

2002-09-24 23:03:19

by Greg KH

Subject: Re: [ANNOUNCE] Linux Hardened Device Drivers Project

On Tue, Sep 24, 2002 at 02:46:35PM -0700, Rhoads, Rob wrote:
>
> First throw away any idea of a spec. That was a bad idea. :)
>
> Next, turn the first section, "Stability & Reliability" of our
> original doc into a "Driver Hardening HOWTO". It would be a
> list of characteristics that all good drivers should have,
> packed with examples to back it up.

Sounds very good. I recommend that it be written in DocBook and added
to the Documentation/DocBook directory of the kernel tree.

> BTW, by no means did I or anyone involved on this project, ever
> mean to imply that the current drivers in the kernel are "bad".
> Rather, I'd like to capture a list of the best practices and
> document them. In any event our current list needs to be
> strengthened with concrete examples. My thinking is that we
> should work with the Kernel Janitor project. This is where
> Intel can probably really help out.

Great, the janitor project can really use extra people to help out. I
suggest that you read over their TODO list again and pick up the pieces
from there that are missing from your "Driver Hardening HOWTO".

> The section on Instrumentation should be broken up and each piece
> dealt with as a separate project, most likely either killed outright
> or folded into existing efforts. I see this section as not having
> anything to do with driver hardening and more to do with driver RAS.

Agreed.

> POSIX Event Logging is a dead issue. The mailing list feedback
> is making that point very clear, many thanks. Judging by the current
> thread on an alternative, there does seem to be some sort of need
> for event logging. Whatever the Linux community finally decides,
> we'll do.

Thanks for listening.

> There seems to be a desire to have some sort of driver diagnostics.
> We can work on that with the existing linux-diag project.

Sounds good. I know those people are actively working to get their code
into the 2.5 kernel, using the driver model. This is a good thing.

> Statistics needs to be debated on its own merits. There are some
> arguments for keeping it, but I think that stats could be better
> handled in user-space and NOT kernel space. IMHO it's not driver
> hardening, therefore it's a separate project.

Agreed, it should be done in userspace.

> Third, most of the section on High Availability should just
> be axed. The big exception being "fault injection testing".
>
> I see value in keeping FI testing. I think that getting FI
> tools into the hands of developers would be worthwhile. Why?
> Because letting people do more complicated testing produces
> better code. I think there is room for us to work on a set of
> FI tools.

It would be wonderful if there were some good FI tools that were
available for our use. It can only help to make better drivers.

Thank you for your response, and for listening to the community.

greg k-h

2002-09-24 21:41:37

by Rhoads, Rob

Subject: RE: [ANNOUNCE] Linux Hardened Device Drivers Project

> From: Greg KH
>
[snip]
>
> > An underlying theme tends to revolve around the binding
> > of the concepts of 'hardening' and RAS features being
> > added to drivers. We will be looking into splitting
> > these two different approaches out from this singular
> > document and into their appropriate locations.
>
> Where would these locations be?
>

First throw away any idea of a spec. That was a bad idea. :)

Next, turn the first section, "Stability & Reliability" of our
original doc into a "Driver Hardening HOWTO". It would be a
list of characteristics that all good drivers should have,
packed with examples to back it up.

BTW, by no means did I or anyone involved on this project, ever
mean to imply that the current drivers in the kernel are "bad".
Rather, I'd like to capture a list of the best practices and
document them. In any event our current list needs to be
strengthened with concrete examples. My thinking is that we
should work with the Kernel Janitor project. This is where
Intel can probably really help out.

The section on Instrumentation should be broken up and each piece
dealt with as a separate project, most likely either killed outright
or folded into existing efforts. I see this section as not having
anything to do with driver hardening and more to do with driver RAS.

POSIX Event Logging is a dead issue. The mailing list feedback
is making that point very clear, many thanks. Judging by the current
thread on an alternative, there does seem to be some sort of need
for event logging. Whatever the Linux community finally decides,
we'll do.

There seems to be a desire to have some sort of driver diagnostics.
We can work on that with the existing linux-diag project.

Statistics needs to be debated on its own merits. There are some
arguments for keeping it, but I think that stats could be better
handled in user-space and NOT kernel space. IMHO it's not driver
hardening, therefore it's a separate project.

Third, most of the section on High Availability should just
be axed. The big exception being "fault injection testing".

I see value in keeping FI testing. I think that getting FI
tools into the hands of developers would be worthwhile. Why?
Because letting people do more complicated testing produces
better code. I think there is room for us to work on a set of
FI tools.

> > If you are interested (even if you aren't) please go
> > to http://lists.sourceforge.net/lists/listinfo/hardeneddrivers-discuss
> > and subscribe to the mailing list.
>
> Sorry, but major kernel driver discussions should occur on lkml.
>
> thanks,
>
> greg k-h