LinuxLists.cc - pci_alloc_consistent from interrupt == BAD

2002-01-18 19:03:04

Subject: pci_alloc_consistent from interrupt == BAD

Somehow the docs in DMA-mappings.txt say pci_alloc_consistent is allowed from
interrupt, but this is a "bad thing" on at least arm and PPC non-cache
coherent cpus.

On these cpus we have to allocate page tables for consistent_alloc.. I really
dont' think we want to be doing this during interrupt context.

This also causes the sym53c8xx_2 driver to not work on some embedded ppc 4xx
boards, and in general, seems to be a 'bad thing' to allow.

For example, in the arch/arm/consistent.c:

/*
* This allocates one page of cache-coherent memory space and returns
* both the virtual and a "dma" address to that space. It is not clear
* whether this could be called from an interrupt context or not. For
* now, we expressly forbid it, especially as some of the stuff we do
* here is not interrupt context safe.
*
* Note that this does *not* zero the allocated area!
*/
void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle)

(arm's pci_alloc_consistent always calls consistent_alloc).

The PPC version calls a similiar function when CONFIG_NOT_COHERENT_CACHE is
defined. On 'regular' ppc machines, it's just a __get_free_pages, which is
why no one from the pmac crowd has screamed.

--
Troy Benjegerdes | master of mispeeling | 'da hozer' | [email protected]
-----"If this message isn't misspelled, I didn't write it" -- Me -----
"Why do musicians compose symphonies and poets write poems? They do it
because life wouldn't have any meaning for them if they didn't. That's
why I draw cartoons. It's my life." -- Charles Schulz

2002-01-18 19:23:05

by Dan Malek

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

Troy Benjegerdes wrote:

> Somehow the docs in DMA-mappings.txt say pci_alloc_consistent is allowed from
> interrupt, but this is a "bad thing" on at least arm and PPC non-cache
> coherent cpus.

This isn't unique to PowerPC or ARM, and has nothing to do with allocating
page tables.

I don't understand how pci_alloc_consistent could ever be claimed to work
from an interrupt function because it actually allocates pages of memory
for all architectures. Anytime you call alloc_pages() (or friends) you could
potentially block or return an error (out of memory) condition.

Either option is undesirable for an interrupt function. If your software can
handle the case of not being able to allocate memory, then why not remove the
complexity and do it that way all of the time?

Thanks.

-- Dan

2002-01-18 20:23:07

by Gérard Roudier

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

On Fri, 18 Jan 2002, Troy Benjegerdes wrote:

> Somehow the docs in DMA-mappings.txt say pci_alloc_consistent is allowed from
> interrupt, but this is a "bad thing" on at least arm and PPC non-cache
> coherent cpus.

Everything gets bad when it doesn't work for you. :-)

> On these cpus we have to allocate page tables for consistent_alloc.. I really
> dont' think we want to be doing this during interrupt context.
>
> This also causes the sym53c8xx_2 driver to not work on some embedded ppc 4xx
> boards, and in general, seems to be a 'bad thing' to allow.

I have noted that some ports may [ever] require pci_alloc_consistent not
to be called from interrupt context. Just I will look into this when time
will allow.

Btw, I thought long ago about using a thread in the driver to deal with
completions and some other events on which the driver may want to allocate
memory. But this would have added useless latency. Instead I preferred to
be careful about the driver to be as fast as possible when completing an
IO under interrupt.

Btw-bis, FreeBSD-5 wants to thread everything (as most of you know). We
will see the result... No need to say that I posted my disagreement when
this project started. As FreeBSD-5 should be ready when Linux-2.6 will be,
we will be able to compare interrupt threading cost against simple fast
interrupt on typical hardware (99% are UP in my guessing).

> For example, in the arch/arm/consistent.c:
>
> /*
> * This allocates one page of cache-coherent memory space and returns
> * both the virtual and a "dma" address to that space. It is not clear
> * whether this could be called from an interrupt context or not. For
> * now, we expressly forbid it, especially as some of the stuff we do
> * here is not interrupt context safe.
> *
> * Note that this does *not* zero the allocated area!
> */
> void *consistent_alloc(int gfp, size_t size, dma_addr_t *dma_handle)
>
> (arm's pci_alloc_consistent always calls consistent_alloc).

ARM arch is simplistic. Wanting to run modern O/Ses on this crap looks
utter stupidity to me. I don't care about ARM at all.

> The PPC version calls a similiar function when CONFIG_NOT_COHERENT_CACHE is
> defined. On 'regular' ppc machines, it's just a __get_free_pages, which is
> why no one from the pmac crowd has screamed.

I am not going to ever use not cache coherent hardware, even if I am ready
to make the sym driver work reliably on such brain-dead things. Just it is
not high priority stuff for now.

G?rard.

2002-01-18 20:34:07

by David Miller

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

From: Dan Malek <[email protected]>
Date: Fri, 18 Jan 2002 14:22:03 -0500

> Somehow the docs in DMA-mappings.txt say pci_alloc_consistent is allowed from
> interrupt, but this is a "bad thing" on at least arm and PPC non-cache
> coherent cpus.

This isn't unique to PowerPC or ARM, and has nothing to do with allocating
page tables.

Yes it is in fact unique to those ports...

I don't understand how pci_alloc_consistent could ever be claimed to work
from an interrupt function because it actually allocates pages of memory
for all architectures. Anytime you call alloc_pages() (or friends) you could
potentially block or return an error (out of memory) condition.

If it specifies GFP_ATOMIC, there are no problems from interrupts.
You will see that every port other than the mentioned two above use
GFP_ATOMIC in their pci_alloc_consistent implementation, for this very
reason.

The ARM and PPC ports could set __GFP_HIGH in their page table
allocation calls when invoked via pci_alloc_consistent, and this is
the change I suggest they make. It is a trivial fix whereas backing
out this ability to call pci_alloc_consistent from interrupts is not
a trivial change at all.

2002-01-18 20:40:39

by David Miller

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

From: G?rard Roudier <[email protected]>
Date: Fri, 18 Jan 2002 21:21:35 +0100 (CET)

I have noted that some ports may [ever] require pci_alloc_consistent not
to be called from interrupt context. Just I will look into this when time
will allow.

Do not bother Gerard, these ports really are broken and
pci_alloc_consistent must work from interrupts.

I am not going to ever use not cache coherent hardware, even if I am ready
to make the sym driver work reliably on such brain-dead things. Just it is
not high priority stuff for now.

Perhaps you misunderstand, it is not "lack of cache coherency" it is
"cache needs flushing around DMA transfers" and this is handled
perfectly by PCI DMA interfaces. It is nothing you should be
concerned about.

2002-01-18 20:52:18

by Gérard Roudier

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

On Fri, 18 Jan 2002, Dan Malek wrote:

> Troy Benjegerdes wrote:
>
> > Somehow the docs in DMA-mappings.txt say pci_alloc_consistent is allowed from
> > interrupt, but this is a "bad thing" on at least arm and PPC non-cache
> > coherent cpus.
>
> This isn't unique to PowerPC or ARM, and has nothing to do with allocating
> page tables.
>
> I don't understand how pci_alloc_consistent could ever be claimed to work
> from an interrupt function because it actually allocates pages of memory
> for all architectures. Anytime you call alloc_pages() (or friends) you could
> potentially block or return an error (out of memory) condition.
>
> Either option is undesirable for an interrupt function. If your software can
> handle the case of not being able to allocate memory, then why not remove the
> complexity and do it that way all of the time?

I donnot see your point. Any software that tries to allocate from
interrupt context must be ready to handle failure, but when the allocation
does not require more than a single page, the failure should not happen
very often.

Not allocating under interrupt requires to pool everything that may be
ever needed. Speaking about Linux SCSI, you can:

1) Allocate all data structures at initialisation.
A driver that can support 1022 IOs per HBA for example with 2K CCB
will allocate 2 MB even if user just uses a single CD/ROM without tagged
commands supported.

2) Resize the CCB pool on select_queue_depth() call.
This will allocate far less memory when only few devices will be
attached to the BUS.

No need to say that 'sym' will never implement #1. At most I may add this
inside some #ifdef/#endif pair.

For example, on my personnal machine #1 scenario will allocate 8 MB but
actual need will be about 1 MB.

For now, I am not sure about #2 being cleanly implementable, but it looks
smarter to me. Note that I would prefer to also be able to free CCBs that
are no more needed. This will require a driver entry to be called when a
device goes away.

G?rard.

2002-01-18 21:08:50

by Gérard Roudier

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

On Fri, 18 Jan 2002, David S. Miller wrote:

> From: G?rard Roudier <[email protected]>
> Date: Fri, 18 Jan 2002 21:21:35 +0100 (CET)
>
> I have noted that some ports may [ever] require pci_alloc_consistent not
> to be called from interrupt context. Just I will look into this when time
> will allow.
>
> Do not bother Gerard, these ports really are broken and
> pci_alloc_consistent must work from interrupts.
>
> I am not going to ever use not cache coherent hardware, even if I am ready
> to make the sym driver work reliably on such brain-dead things. Just it is
> not high priority stuff for now.
>
> Perhaps you misunderstand, it is not "lack of cache coherency" it is
> "cache needs flushing around DMA transfers" and this is handled
> perfectly by PCI DMA interfaces. It is nothing you should be
> concerned about.

Depends on the OS.

Under Linux, PCI consistent allocation support is a requirement. Let me
cross fingers for this to stay as it is. :)

Under some other (NetBSD, for example, just not to name it :)), such
feature is just a hint and as a result drivers are theorically also
required to care about cache flushing for DMAs that involve driver
internal data structures if full portability is in concern. Btw, I donnot
really use NetBSD O/S but just ported sym-2 to it (NetBSD-1.5 with
coherency being a requirement for driver data structure allocations) as an
exercise.

G?rard.

2002-01-18 21:14:51

by David Miller

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

From: G?rard Roudier <[email protected]>
Date: Fri, 18 Jan 2002 22:07:09 +0100 (CET)

Under Linux, PCI consistent allocation support is a requirement. Let me
cross fingers for this to stay as it is. :)

It is my intention to keep it like this. I have shown 3 or 4 ports,
after I've been told "we cannot do consistent memory on our platform",
that in fact they could :-)

For this reason, someone will have to do the equivalent of parting the
seas to convince me to change things :-)

2002-01-18 21:30:11

by Russell King

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

On Fri, Jan 18, 2002 at 12:32:21PM -0800, David S. Miller wrote:
> The ARM and PPC ports could set __GFP_HIGH in their page table
> allocation calls when invoked via pci_alloc_consistent, and this is
> the change I suggest they make. It is a trivial fix whereas backing
> out this ability to call pci_alloc_consistent from interrupts is not
> a trivial change at all.

ARM will get fixed some way if and when it becomes a requirement to fix
it. "requirement" here is defined by someone wanting to use such a
driver on an ARM system. Currently, there is no call for it, so there's
not much point in implementing it.

However, if it becomes easy to implement without impacting the code too
much, then it will get fixed in due coarse. The problem currently is
that there is no way for the page table allocation functions to know
that they should be using atomic and emergency pool memory allocations.

Its a balance of pain vs gain. Pain is currently high, gain is currently
zero.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-01-18 21:34:51

by David Miller

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

From: Russell King <[email protected]>
Date: Fri, 18 Jan 2002 21:29:49 +0000

However, if it becomes easy to implement without impacting the code too
much, then it will get fixed in due coarse. The problem currently is
that there is no way for the page table allocation functions to know
that they should be using atomic and emergency pool memory allocations.

Encapsultate the page table allocation core interfaces into
__whatever_alloc() routines that take a GFP arg perhaps?
It is like a 15 minute hack.

BTW, the USB host controller drivers do this (allocate potentially
from interrupts) so anyone using USB on ARM...

2002-01-18 21:44:12

by Dan Malek

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

David S. Miller wrote:

> If it specifies GFP_ATOMIC, there are no problems from interrupts.
> You will see that every port other than the mentioned two above use
> GFP_ATOMIC in their pci_alloc_consistent implementation, for this very
> reason.

The PowerPC does use (or is supposed to use) GFP_ATOMIC. I don't
know why ARM doesn't.

I guess I'm just trying to understand the value of calling it from interrupt.
If the purpose is to allocate a page of memory, and that doesn't happen,
all you have done is pushed added complexity to a device driver. If
an interrupt handler (and the remainder of the driver) must handle the
condition of wanting a page but not getting one in the interrupt handler,
why bother trying to do it at all?

> The ARM and PPC ports could set __GFP_HIGH in their page table
> allocation calls when invoked via pci_alloc_consistent,

How does this solve anything? All it does is make more memory potentially
available. If memory isn't available, the call is still going to fail.

> ...... It is a trivial fix whereas backing
> out this ability to call pci_alloc_consistent from interrupts is not
> a trivial change at all.

You can call pci_alloc_consistent from anywhere on any architecture,
but anyone calling it and not handling an error condition is a broken
implementation.

I have to apologize to hozer, because the allocation of page tables would
be a condition where pci_alloc_consistent could return an error. However,
he must be looking into the future because that is the way it will be
later today when I'm done changing the functions, not the current implementation.

Thanks.

-- Dan

2002-01-18 21:43:02

by Russell King

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

On Fri, Jan 18, 2002 at 01:33:06PM -0800, David S. Miller wrote:
> Encapsultate the page table allocation core interfaces into
> __whatever_alloc() routines that take a GFP arg perhaps?
> It is like a 15 minute hack.

That may be, but there's all the page table manipulation code that goes
along with it as well. Yes, a similar thing could be done with that,
it needs someone with the time to look into it and cook up a patch.

(I've got my hands full, so I'm not eager to pick this up right now).

> BTW, the USB host controller drivers do this (allocate potentially
> from interrupts) so anyone using USB on ARM...

Well, I've got a BUG() in there that'll trigger when pci_alloc_consistent()
is called from IRQ, and so far no one has reported an incidence of
that occuring, despite there being USB OHCI controllers available on ARM.
Maybe no one is pushing USB hard enough on ARM to cause these allocations,
I don't know.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-01-18 21:47:42

by Dan Malek

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

Russell King wrote:

> ..... The problem currently is
> that there is no way for the page table allocation functions to know
> that they should be using atomic and emergency pool memory allocations.

Hmmm...I thought they already do this. I always assumed the gfp_flag passed
into alloc_pages would find its way all of the way through the page table
allocation as well. You just have to be ready to handle the case where
it returns with an -ENOMEM :-).

-- Dan

2002-01-18 21:55:42

by Russell King

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

(CC list trimmed)

On Fri, Jan 18, 2002 at 04:46:37PM -0500, Dan Malek wrote:
> Russell King wrote:
> > ..... The problem currently is
> > that there is no way for the page table allocation functions to know
> > that they should be using atomic and emergency pool memory allocations.
>
> Hmmm...I thought they already do this. I always assumed the gfp_flag passed
> into alloc_pages would find its way all of the way through the page table
> allocation as well. You just have to be ready to handle the case where
> it returns with an -ENOMEM :-).

I refer you to your nearest function prototype for pte_alloc_one()
rather than alloc_pages().

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-01-18 22:14:18

by Dan Malek

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

Russell King wrote:

> I refer you to your nearest function prototype for pte_alloc_one()
> rather than alloc_pages().

Oooo...thanks. I guess I'll be adding the flag into pte_alloc as well
here in just a few minutes. We'll see if that flies. Everything prior
to this seems capable of handling the error if it doesn't succeed.

Thanks.

-- Dan

2002-01-19 17:23:49

by David Brownell

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

"> > " == David Miller
"> " == Russell King

> > BTW, the USB host controller drivers do this (allocate potentially
> > from interrupts) so anyone using USB on ARM...
>
> Well, I've got a BUG() in there that'll trigger when pci_alloc_consistent()
> is called from IRQ, and so far no one has reported an incidence of
> that occuring, despite there being USB OHCI controllers available on ARM.

Yes, that'd be rare -- but legal.

The USB host controller drivers would do that primarily in the
case where (a) some driver submitted a new URB in_interrupt(),
so the HCD needed to queue new requests to the hardware,
and (b) the pools of endpoint or transfer descriptors didn't have
enough free entries.

The main reason for (a) is that the last URB just completed, and
freed its resources, and a new request for that endpoint needs to
get submitted. As for (b), endpoint descriptors are normally
preserved until the device is disconnected. That leaves transport
descriptors (TDs). OHCI (and EHCI) don't need many of those;
typically one per request. So if one was just put back into the pool,
it'd typically still be available. UHCI needs more memory, but
the same general rules apply.

- Dave

2002-01-21 23:52:26

by Pete Zaitcev

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

> It [GFP_HIGH] is a trivial fix whereas backing
> out this ability to call pci_alloc_consistent from interrupts is not
> a trivial change at all.

David, would you make a statement about pci_free_consistent.
I find it annoying that it cannot be called from an interrupt.

-- Pete

2002-01-22 00:11:07

by David Miller

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

From: Pete Zaitcev <[email protected]>
Date: Mon, 21 Jan 2002 18:52:05 -0500

> It [GFP_HIGH] is a trivial fix whereas backing
> out this ability to call pci_alloc_consistent from interrupts is not
> a trivial change at all.

David, would you make a statement about pci_free_consistent.
I find it annoying that it cannot be called from an interrupt.

It is not an unreasonable requirement. Wasn't there some problem with
the pci pool stuff if it free'd up a chunk in an interrupt?

2002-01-22 00:21:17

by Pete Zaitcev

[permalink] [raw]

Subject: Re: pci_alloc_consistent from interrupt == BAD

> Date: Mon, 21 Jan 2002 16:08:53 -0800 (PST)
> From: "David S. Miller" <[email protected]>
>
> From: Pete Zaitcev <[email protected]>
> Date: Mon, 21 Jan 2002 18:52:05 -0500
>
> > It [GFP_HIGH] is a trivial fix whereas backing
> > out this ability to call pci_alloc_consistent from interrupts is not
> > a trivial change at all.
>
> David, would you make a statement about pci_free_consistent.
> I find it annoying that it cannot be called from an interrupt.
>
> It is not an unreasonable requirement. Wasn't there some problem with
> the pci pool stuff if it free'd up a chunk in an interrupt?

Yes, the pci_pool_free is one of the reasons why I ask.
I understand that the fundamental difference is that
allocation can fail (given right GFP flag), so this is what
we do in an interrupt (and we drop a packet in the driver).
But frees cannot fail, so if that tries to slip then we are stuck.

-- Pete