Linus, as far as I can see there's no exclusion between
the code that walks pci_devices and pci_insert_device(). It's
not a big deal wrt security (not many laptops with remote access)
but...
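
(To make the window concrete - a rough sketch, with the reader side
standing in for any of the pci_devices walkers; pci_dev_g() and
->global_list as in <linux/pci.h>:)

	/* reader side - e.g. the proc code */
	struct list_head *n;
	list_for_each(n, &pci_devices) {	/* no lock taken... */
		struct pci_dev *dev = pci_dev_g(n);
		/* ...so dev can be a half-inserted entry */
	}

	/* writer side - pci_insert_device() */
	list_add_tail(&dev->global_list, &pci_devices);	/* races with the walk */
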
What locking is supposed to be there?
On Wed, Nov 14, 2001 at 11:00:26PM -0500, Alexander Viro wrote:
> Linus, as far as I can see there's no exclusion between
> the code that walks pci_devices and pci_insert_device(). It's
> not a big deal wrt security (not many laptops with remote access)
> but...
It's a bigger deal with large servers that have PCI Hotplug controllers.
> What locking is supposed to be there?
I'll add a lock to keep the problem from happening.
thanks,
greg k-h
Alexander Viro wrote:
>
> Linus, as far as I can see there's no exclusion between
> the code that walks pci_devices and pci_insert_device(). It's
> not a big deal wrt security (not many laptops with remote access)
> but...
>
> What locking is supposed to be there?
alas, yes, that's been there since time began, and since the window was
so minimal nobody cared enough to do anything about it. Even on the
larger hotplug PCI servers that Greg KH mentioned, the pci list really
isn't traversed much, much less updated.
I haven't looked at it in over a year, but from a quick look, all the
list accesses look like they can be protected by a simple spinlock.
You'll need to grep the tree, though, because not all users of
pci_devices are in drivers/pci/pci.c. [users outside of drivers/pci
should really be converted to use pci_find_xxx functions in any case]
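
(For illustration, here's pci_find_device() with a hypothetical global
pci_dev_lock folded in - this mirrors the existing walk in
drivers/pci/pci.c, just with the lock added. Note it still returns a
dev that can go away after the unlock, so on its own it only shrinks
the window:)

	static spinlock_t pci_dev_lock = SPIN_LOCK_UNLOCKED;

	struct pci_dev *
	pci_find_device(unsigned int vendor, unsigned int device,
			const struct pci_dev *from)
	{
		struct list_head *n;

		spin_lock(&pci_dev_lock);
		n = from ? from->global_list.next : pci_devices.next;
		while (n != &pci_devices) {
			struct pci_dev *dev = pci_dev_g(n);
			if ((vendor == PCI_ANY_ID || dev->vendor == vendor) &&
			    (device == PCI_ANY_ID || dev->device == device)) {
				spin_unlock(&pci_dev_lock);
				return dev;
			}
			n = n->next;
		}
		spin_unlock(&pci_dev_lock);
		return NULL;
	}
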
--
Jeff Garzik | Only so many songs can be sung
Building 1024 | with two lips, two lungs, and one tongue.
MandrakeSoft | - nomeansno
On Thu, 15 Nov 2001, Jeff Garzik wrote:
> alas, yes, that's been there since time began, and since the window was
> so minimal nobody cared enough to do anything about it. Even on the
> larger hotplug PCI servers that Greg KH mentioned, the pci list really
> isn't traversed much, much less updated.
	while true; do cat </proc/bus/pci/devices >/dev/null; done

Notice that it's world-readable (ditto for /proc/pci).
Jeff Garzik wrote:
>
> I haven't looked at it in over a year, but from a quick look, all the
> list accesses look like they can be protected by a simple spinlock.
I don't think so? We do things like calling driver probe methods
in the middle of a driver list walk.
An rwsem _may_ be suitable, but I'm not sure that we don't do
a nested walk in some circumstances, and AFAIK our rwsems
still are not safe for the same thread to do a down_read() twice.
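
(The down_read() recursion hazard, for reference - it's the usual
reader/writer/reader interleave, nothing PCI-specific:)

	down_read(&sem);	/* thread A: takes the lock for read */
				/* thread B: down_write(&sem) blocks and
				 * queues itself as a waiting writer */
	down_read(&sem);	/* thread A again: queues behind the waiting
				 * writer, which in turn waits on A - deadlock */
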
Then there's the bus list, and the order of its lock wrt the device
list.
One approach would be to use a spinlock and a per-device refcount.
So something like:
	spin_lock(&pci_dev_lock);
	dev = pci_dev_g(pci_devices.next);
	while (dev != pci_dev_g(&pci_devices)) {
		struct pci_dev *next;

		pci_dev_get(dev);		/* pin dev across the unlock */
		spin_unlock(&pci_dev_lock);
		diddle(dev);			/* can sleep, call driver methods, ... */
		spin_lock(&pci_dev_lock);
		next = pci_dev_g(dev->global_list.next);
		pci_dev_put(dev);		/* may free dev; next was read first */
		dev = next;
	}
	spin_unlock(&pci_dev_lock);

	void pci_dev_get(struct pci_dev *dev)
	{
	#ifdef CONFIG_SMP
		/* spin_is_locked() is always 0 on UP, hence the ifdef */
		if (!spin_is_locked(&pci_dev_lock))
			BUG();
	#endif
		dev->refcount++;
	}

	void pci_dev_put(struct pci_dev *dev)
	{
	#ifdef CONFIG_SMP
		if (!spin_is_locked(&pci_dev_lock))
			BUG();
	#endif
		dev->refcount--;
		if (dev->refcount == 0)
			kfree(dev);
	}
I _think_ all this list traversal happens in process context now.
Not sure about the PCI hotplug driver though.
It's really sticky. Which is why it isn't fixed :(
Sigh. Maybe go for an rwsem in 2.5, backport when it stops
deadlocking?
On Thu, 15 Nov 2001, Andrew Morton wrote:
> Jeff Garzik wrote:
> >
> > I haven't looked at it in over a year, but from a quick look, all the
> > list accesses look like they can be protected by a simple spinlock.
>
> I don't think so? We do things like calling driver probe methods
> in the middle of a driver list walk.
That's true. However, ->probe() and the rest of the struct pci_driver
methods are all called from the PCI layer. It seems that a per-device
rwsem could protect competing calls from screwing things up (e.g. a
->remove() call during a ->probe() call).
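
(Sketching that with a hypothetical rwsem in struct pci_dev, taken by
the PCI layer around the call-down - the shape follows 2.4's
pci_announce_device() in drivers/pci/pci.c, simplified:)

	static int pci_announce_device(struct pci_driver *drv, struct pci_dev *dev)
	{
		const struct pci_device_id *id;
		int ret = 0;

		down_write(&dev->lock);		/* excludes a concurrent ->remove() */
		id = pci_match_device(drv->id_table, dev);
		if (id && drv->probe(dev, id) >= 0) {
			dev->driver = drv;
			ret = 1;
		}
		up_write(&dev->lock);
		return ret;
	}
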
> An rwsem _may_ be suitable, but I'm not sure that we don't do
> a nested walk in some circumstances, and AFAIK our rwsems
> still are not safe for the same thread to do a down_read() twice.
If the device was locked by the PCI layer before all calls down to the
driver, and it was documented in BIG bold letters, this _should_ prevent
drivers from doing silly things like locking the device again, right? ;)
> Then there's the bus list, and the order of its lock wrt the device
> list.
>
> One approach would be to use a spinlock and a per-device refcount.
> So something like:
It seems as though having two lists of devices complicates things quite a
bit. I was looking into trying to obviate the need for a global device
list yesterday, but it seems painful at best (and probably worthy of
another thread).
Suppose you didn't have a global device list, but you had a global bus
list (including all subordinate buses).
Could you do:
	struct pci_bus * bus;
	struct pci_dev * dev;

	for_each_bus(bus) {
		down_read(&bus->lock);
		for_each_child(bus, dev) {
			down_read(&dev->lock);
			dev->driver->foo();
			up_read(&dev->lock);
		}
		up_read(&bus->lock);
	}
[snip]
I started to implement proper locking for the new driver model a couple of
days ago. So far, it only touches the top-level device tree, but it attempts
to deal with the same class of problems. (Though I won't assert that it
is even close to the best solution.)
The simple cases I've touched so far are registering and unregistering
devices in the tree. I don't want a device to be unregistered before it's
completely done registering, and I don't want two register calls to stomp
on the same parent's list of children.
So, I do something like:
	device_register(struct device * dev)
	{
		down_write(&dev->lock);
		...
		down_write(&parent->lock);
		list_add_tail(&dev->node, &parent->devices);
		up_write(&parent->lock);
		...
		up_write(&dev->lock);
	}
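
The unregister side would presumably be the mirror image - sketched
under the same assumptions (the dev->lock and parent->lock rwsems
above, with dev->parent as the parent pointer):

	device_unregister(struct device * dev)
	{
		down_write(&dev->lock);		/* waits out an in-flight register */
		...
		down_write(&dev->parent->lock);
		list_del(&dev->node);		/* drop from the parent's children */
		up_write(&dev->parent->lock);
		...
		up_write(&dev->lock);
	}
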
Is this right, or is there something better?
> I _think_ all this list traversal happens in process context now.
> Not sure about the PCI hotplug driver though.
>
> It's really sticky. Which is why it isn't fixed :(
>
> Sigh. Maybe go for an rwsem in 2.5, backport when it stops
> deadlocking?
I am more than willing to work on this, as I have a strong interest in
getting this right for the global device tree (just point me in the right
direction..)
-pat