2002-02-11 00:44:47

by Daniel Stodden

Subject: pci_pool reap?

hi.

is it true that pci pools are never shrunk? or am i just missing the
point where it happens?

try_to_free_pages() seems to care just about kmem_caches.

looks odd to me...


thanx,
dns

--
___________________________________________________________________________
mailto:[email protected]



2002-02-11 02:50:55

by Pete Zaitcev

Subject: Re: pci_pool reap?

> is it true that pci pools are never shrunk? or am i just missing the
> point where it happens?
>
> try_to_free_pages() seems to care just about kmem_caches.

Yes, they do not shrink. When David-B wrote them, they shrunk.
Later I found an interrupt availability violation (pci_pool_free
can be called from an interrupt, it can call pci_free_consistent,
which cannot be called from an interrupt), so we got it removed.

There is a certain controversy about pci_free_consistent called
from an interrupt. It seems that most architectures would
have no problems, and only arm is problematic. RMK says that
it's not intrinsically so; this is one of my TODO notes:

## 2001/12/18
<zaitcev> rmk: you do have some stuff broken, for instance using vmalloc for
pci_alloc_consistent was a major pain for everyone else
<_rmk_> zaitcev: errrrrrrrr
<_rmk_> zaitcev: I don't use vmalloc there, never have done.
<_rmk_> I use alloc_pages and ioremap
<_rmk_> You're thinking about the sa1100 people who decided to make pci_map_*
fail I think.
<zaitcev> hmm. I'll re-investigate.
<_rmk_> which... is not something I agree with either.
<_rmk_> but quite honestly, with Intel breaking the hardware soo badly, it
being popular and continues to be reused on other platforms, its
something we're going to have to live with.

Wanna take it up personally? I never seem to have the time.

-- Pete
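
Note: the behaviour Pete describes (a pool that grows but never gives pages back) can be illustrated with a minimal userspace sketch. This is a model under stated assumptions, not the kernel's pci_pool code: every name here (`model_pool`, `model_pool_alloc`, `model_pool_free`, `backing_alloc`, `BLOCKS_PER_PAGE`) is hypothetical, and `malloc` merely stands in for pci_alloc_consistent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel API. */
#define BLOCKS_PER_PAGE 4

struct model_block {
    struct model_block *next;   /* free-list link, stored inside the idle block */
};

struct model_pool {
    size_t block_size;          /* bytes per block, rounded up for alignment */
    struct model_block *free;   /* idle blocks, ready for reuse */
    unsigned pages;             /* backing "pages" obtained so far; never decreases */
};

/* Stand-in for the page-level allocator (pci_alloc_consistent in the thread). */
static void *backing_alloc(size_t bytes)
{
    return malloc(bytes);
}

void model_pool_init(struct model_pool *p, size_t block_size)
{
    const size_t a = sizeof(struct model_block);

    /* Round up so every block can hold, and is aligned for, a link. */
    if (block_size < a)
        block_size = a;
    p->block_size = (block_size + a - 1) / a * a;
    p->free = NULL;
    p->pages = 0;
}

void *model_pool_alloc(struct model_pool *p)
{
    struct model_block *b;

    if (!p->free) {
        /* Pool is empty: grow by one backing page and carve it up. */
        char *page = backing_alloc(p->block_size * BLOCKS_PER_PAGE);
        if (!page)
            return NULL;
        p->pages++;
        for (int i = 0; i < BLOCKS_PER_PAGE; i++) {
            b = (struct model_block *)(page + (size_t)i * p->block_size);
            b->next = p->free;
            p->free = b;
        }
    }
    b = p->free;
    p->free = b->next;
    return b;
}

/* Free only relinks the block. It never calls into the page-level free
 * path, which is the part the thread says must not run in an interrupt. */
void model_pool_free(struct model_pool *p, void *ptr)
{
    struct model_block *b = ptr;

    b->next = p->free;
    p->free = b;
}
```

The cost of this design is visible in `pages`: it only ever grows, which is exactly what the original question noticed.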

2002-02-11 02:58:55

by Alan Cox

Subject: Re: pci_pool reap?

> There is a certain controversy about pci_free_consistent called
> from an interrupt. It seems that most architectures would
> have no problems, and only arm is problematic. RMK says that

The discussion was about pci_alloc_consistent. The free case seems to be
explicitly disallowed in all cases.

(from DMA-mapping.txt)

To unmap and free such a DMA region, you call:

pci_free_consistent(dev, size, cpu_addr, dma_handle);

where dev, size are the same as in the above call and cpu_addr and
dma_handle are the values pci_alloc_consistent returned to you.
This function may not be called in interrupt context.

2002-02-11 19:21:39

by Gérard Roudier

Subject: Re: pci_pool reap?


On Mon, 11 Feb 2002, Alan Cox wrote:

> > There is a certain controversy about pci_free_consistent called
> > from an interrupt. It seems that most architectures would
> > have no problems, and only arm is problematic. RMK says that
>
> The discussion was about pci_alloc_consistent. The free case seems to be
> explicitly disallowed in all cases.
>
> (from DMA-mapping.txt)
>
> To unmap and free such a DMA region, you call:
>
> pci_free_consistent(dev, size, cpu_addr, dma_handle);
>
> where dev, size are the same as in the above call and cpu_addr and
> dma_handle are the values pci_alloc_consistent returned to you.
> This function may not be called in interrupt context.

Such limitation looks poor implementation to me.

At least, could existing driver interface be clearly documented about what
methods may/may not/might/should/shall/ever will/never will/ etc.. be
called in interrupt context or whatever context and what others may be
called...
...in a different way :-).

Gérard.

2002-02-12 02:46:38

by David Miller

Subject: Re: pci_pool reap?

   From: Gérard Roudier <[email protected]>
   Date: Sun, 10 Feb 2002 21:20:05 +0100 (CET)

   On Mon, 11 Feb 2002, Alan Cox wrote:

   > This function may not be called in interrupt context.

   Such limitation looks poor implementation to me.

I agree with you Gerard, and probably nobody truly even requires
this limitation. I do plan to remove it after I've done a thorough
investigation of the platform implementations.

2002-02-12 15:38:30

by Daniel Stodden

Subject: Re: pci_pool reap?

hi.

On Tue, 2002-02-12 at 03:44, David S. Miller wrote:
> From: Gérard Roudier <[email protected]>
> Date: Sun, 10 Feb 2002 21:20:05 +0100 (CET)
>
> On Mon, 11 Feb 2002, Alan Cox wrote:
>
> > This function may not be called in interrupt context.
>
> Such limitation looks poor implementation to me.
>
> I agree with you Gerard, and probably nobody truly even requires
> this limitation. I do plan to remove it after I've done a thorough
> investigation of the platform implementations.

ok, i've looked through most of 2.5.4 now.
results look like this:

                 pci_alloc_consistent()   pci_free_consistent()
i386:       [1]  ok                       ok
ppc:        [1]  ok                       ok
mips:       [1]  ok                       ok
sh:         [1]  ok                       ok
  stm:      [1]  ok                       ok
  dc:       [3]  ok                       ok
mips64:
  ip32:     [1]  ok                       ok
  ip27:     [1]  ok                       ok
sparc:      [1]  GFP_KERNEL               ok
sparc64:    [2]  ok                       ok
arm:        [4]  BUG()/GFP_KERNEL         BUG()
alpha:      [2]  ok                       ok
ia64:       [5]  ok?                      ok?


[1] gfp() + __pa() (or similar)

[2] gfp() + IOMMU

[3] dummy, offsets only

[4] ARM does GFP_KERNEL, and then __ioremaps the underlying pages.
    ugh. is that the only way to get the area coherent?
    furthermore i don't see why this could not be interrupt safe.

[5] i don't understand ia64. but it looks somewhat atomic :)

well, assuming i didn't overlook anything, there are indeed few reasons
left why the whole _consistent() machinery shouldn't be callable from
interrupts.

back to my original question: what were the last trees with shrinking
pools? would the original version still work or any redesigns needed?


regards,
dns

--
___________________________________________________________________________
mailto:[email protected]



2002-02-12 15:48:51

by Russell King

Subject: Re: pci_pool reap?

On Tue, Feb 12, 2002 at 04:36:34PM +0100, Daniel Stodden wrote:
> ARM does GFP_KERNEL, and then __ioremaps the underlying pages.
> ugh. is that the only way to get the area coherent?

Yes. Cache bits are in the page tables, and it would be idiotic to
manipulate the cache bits on a 1MB granularity over the kernel
direct mapped space.

> furthermore i don't see why this could not be interrupt safe.

GFP_KERNEL in the page table allocation functions mainly. We've been
around and around this recently on this mailing list, so I'm not going
to say anything further. I don't want another long discussion about
this subject taking my time away from doing real work on ARM. If you're
really interested in the outcome, please examine the lkml archives.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-02-12 15:51:31

by David Miller

Subject: Re: pci_pool reap?

   From: Daniel Stodden <[email protected]>
   Date: 12 Feb 2002 16:36:34 +0100

   back to my original question: what were the last trees with shrinking
   pools? would the original version still work or any redesigns needed?

Probably yes and it was the first 2.4.x that the pci_pool stuff
appeared in. Peter Zaitcev disabled the shrinking in the next
release I believe, or soon thereafter.

2002-02-12 15:53:11

by David Miller

Subject: Re: pci_pool reap?

   From: Russell King <[email protected]>
   Date: Tue, 12 Feb 2002 15:48:16 +0000

   If you're really interested in the outcome, please examine the lkml
   archives.

The conclusion we came to is that there is no reason you can't do the
remapping from interrupts on ARM and propagate the GFP_ATOMIC
properly as well. Right?

Or is this another "I'm not going to make the change until it
is required of me" situation? If so I'll just make it so :-)

2002-02-12 15:59:51

by Russell King

Subject: Re: pci_pool reap?

On Tue, Feb 12, 2002 at 07:50:51AM -0800, David S. Miller wrote:
> The conclusion we came to is that there is no reason you can't do the
> remapping from interrupts on ARM and propagate the GFP_ATOMIC
> properly as well. Right?
>
> Or is this another "I'm not going to make the change until it
> is required of me" situation? If so I'll just make it so :-)

Well, seeing as I'm currently on 2.5.2 still, waiting for various changes
to stabilise, it's still not really high on my priority list. Things
that are high on it are to move forward RSN and put in place all the
changes for ARM that are needed between 2.5.3-pre1 and 2.5.4. There's
several bits that need to be looked at, and some of the changes that
have happened in 2.5.2-rmk clash with some of the changes in these
patches, c'est la vie.

Also high is to do something about the growing mountain of patches in
my patch system that need to be processed.

So, I hope you can see that any changes you put into current Linus
kernels won't change the situation for a while because I'm too overloaded
with other stuff and stuck back at 2.5.2 currently.

--
Russell King ([email protected]) The developer of ARM Linux
http://www.arm.linux.org.uk/personal/aboutme.html

2002-02-12 17:29:25

by Daniel Stodden

Subject: Re: pci_pool reap?

hi.

On Tue, 2002-02-12 at 16:48, Russell King wrote:
> On Tue, Feb 12, 2002 at 04:36:34PM +0100, Daniel Stodden wrote:
> > ARM does GFP_KERNEL, and then __ioremaps the underlying pages.
> > ugh. is that the only way to get the area coherent?
>
> Yes. Cache bits are in the page tables, and it would be idiotic to
> manipulate the cache bits on a 1MB granularity over the kernel
> direct mapped space.
>
> > furthermore i don't see why this could not be interrupt safe.
>
> GFP_KERNEL in the page table allocation functions mainly. We've been
> around and around this recently on this mailing list, so I'm not going
> to say anything further. I don't want another long discussion about
> this subject taking my time away from doing real work on ARM. If you're
> really interested in the outcome, please examine the lkml archives.

ok. i read part of the old thread now. sorry. didn't know that this had
already been issued.

so, based on the fact that
1. _most_ archs can easily do atomically.
2. those which don't aren't necessarily the better ones.
3. many drivers may prefer/be able to alloc through during
_init()/_release()
3.5 some may not.
4. even on arm, __ioremap() takes a gfp for quite some time now
and nobody seems to disagree.

then why does pci_alloc_consistent() not just take gfp flags and people
put in what their personal preference is?

regards,
dns
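
Note: the proposal above (let the caller pass allocation flags instead of hard-wiring one policy) can be sketched in userspace as follows. This is a model only, not a kernel interface: `model_alloc_consistent`, `model_set_interrupt`, and the `MODEL_GFP_*` values are hypothetical names, `malloc` stands in for the real allocation, and the model returns NULL where Daniel's table lists BUG() for ARM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model; the flag names echo the kernel's but are defined
 * locally for illustration only. */
enum model_gfp {
    MODEL_GFP_ATOMIC,   /* caller cannot sleep (e.g. interrupt context) */
    MODEL_GFP_KERNEL    /* caller may sleep while memory is found */
};

static int model_in_interrupt;   /* simulated execution context */
static unsigned model_sleeps;    /* how often the allocator "slept" */

void model_set_interrupt(int on)
{
    model_in_interrupt = on;
}

/* The caller states its own preference through 'flags'; the allocator
 * no longer has to assume the most restrictive or most relaxed case. */
void *model_alloc_consistent(size_t size, enum model_gfp flags)
{
    if (flags == MODEL_GFP_KERNEL) {
        /* A sleeping allocation is a bug in interrupt context.
         * The model simply refuses it. */
        if (model_in_interrupt)
            return NULL;
        model_sleeps++;          /* stands in for blocking on memory */
    }
    return malloc(size);
}
```

A driver that allocates everything at init time would pass the sleeping variant; one that must allocate from its interrupt handler would pass the atomic one and cope with failure.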


--
___________________________________________________________________________
mailto:[email protected]



2002-02-12 19:36:32

by Gérard Roudier

Subject: Re: pci_pool reap?



On Mon, 11 Feb 2002, David S. Miller wrote:

> From: Gérard Roudier <[email protected]>
> Date: Sun, 10 Feb 2002 21:20:05 +0100 (CET)
>
> On Mon, 11 Feb 2002, Alan Cox wrote:
>
> > This function may not be called in interrupt context.
>
> Such limitation looks poor implementation to me.
>
> I agree with you Gerard, and probably nobody truly even requires
> this limitation. I do plan to remove it after I've done a thorough
> investigation of the platform implementations.

In the meantime, you may just queue the thing to memory (as the allocated
memory chunk is likely to be larger than a pointer given alignment) and
use some helper thread that dequeue things to free and does the actual
free in 'non-interrupt' context (only on ports that are unable to free
under interrupt context, obviously).

Gérard.
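
Note: the workaround described above (queue the chunk using its own memory, free it later outside interrupt context) can be sketched in userspace like this. It is a model under stated assumptions: all names (`deferred_queue`, `deferred_free`, `deferred_drain`, `model_real_free`) are hypothetical, `free` stands in for pci_free_consistent, and the node stores only a size, whereas the real call would also need the device and the DMA handle. There is no locking here; a real version would need an interrupt-safe lock around the list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model of the deferred-free idea; not kernel API. */
struct deferred_node {
    struct deferred_node *next;  /* link kept in the first bytes of the dead chunk */
    size_t size;                 /* remembered for the eventual real free */
};

struct deferred_queue {
    struct deferred_node *head;
    unsigned pending;
};

/* "Interrupt side": only links the chunk, never calls the real free. */
void deferred_free(struct deferred_queue *q, void *chunk, size_t size)
{
    struct deferred_node *n = chunk;

    /* The chunk must be big enough to hold its own queue node. */
    assert(size >= sizeof(struct deferred_node));
    n->size = size;
    n->next = q->head;
    q->head = n;
    q->pending++;
}

/* "Helper thread side": runs where the real free is allowed.
 * Returns the number of chunks released. */
unsigned deferred_drain(struct deferred_queue *q,
                        void (*real_free)(void *chunk, size_t size))
{
    unsigned released = 0;

    while (q->head) {
        struct deferred_node *n = q->head;
        size_t size = n->size;   /* read before the memory goes away */

        q->head = n->next;
        q->pending--;
        real_free(n, size);
        released++;
    }
    return released;
}

/* Stand-in for pci_free_consistent in this model. */
void model_real_free(void *chunk, size_t size)
{
    (void)size;
    free(chunk);
}
```

Because the link lives inside the chunk being freed, queuing needs no extra allocation, which is the point of the suggestion.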



2002-02-12 20:11:55

by Gérard Roudier

Subject: Re: pci_pool reap?


So, everything is ok. :-)

Gérard.

On 12 Feb 2002, Daniel Stodden wrote:

> hi.
>
> On Tue, 2002-02-12 at 03:44, David S. Miller wrote:
> > From: Gérard Roudier <[email protected]>
> > Date: Sun, 10 Feb 2002 21:20:05 +0100 (CET)
> >
> > On Mon, 11 Feb 2002, Alan Cox wrote:
> >
> > > This function may not be called in interrupt context.
> >
> > Such limitation looks poor implementation to me.
> >
> > I agree with you Gerard, and probably nobody truly even requires
> > this limitation. I do plan to remove it after I've done a thorough
> > investigation of the platform implementations.
>
> ok, i've looked through most of 2.5.4 now.
> results look like this:
>
>                  pci_alloc_consistent()   pci_free_consistent()
> i386:       [1]  ok                       ok
> ppc:        [1]  ok                       ok
> mips:       [1]  ok                       ok
> sh:         [1]  ok                       ok
>   stm:      [1]  ok                       ok
>   dc:       [3]  ok                       ok
> mips64:
>   ip32:     [1]  ok                       ok
>   ip27:     [1]  ok                       ok
> sparc:      [1]  GFP_KERNEL               ok
> sparc64:    [2]  ok                       ok
> arm:        [4]  BUG()/GFP_KERNEL         BUG()
> alpha:      [2]  ok                       ok
> ia64:       [5]  ok?                      ok?
>
>
> [1] gfp() + __pa() (or similar)
>
> [2] gfp() + IOMMU
>
> [3] dummy, offsets only
>
> [4] ARM does GFP_KERNEL, and then __ioremaps the underlying pages.
>     ugh. is that the only way to get the area coherent?
>     furthermore i don't see why this could not be interrupt safe.
>
> [5] i don't understand ia64. but it looks somewhat atomic :)
>
> well, assuming i didn't overlook anything, there are indeed few reasons
> left why the whole _consistent() machinery shouldn't be callable from
> interrupts.
>
> back to my original question: what were the last trees with shrinking
> pools? would the original version still work or any redesigns needed?
>
>
> regards,
> dns
>
> --
> ___________________________________________________________________________
> mailto:[email protected]
>
>

2002-02-12 21:16:33

by Daniel Stodden

Subject: Re: pci_pool reap?

On Mon, 2002-02-11 at 22:10, Gérard Roudier wrote:
>
> So, everything is ok. :-)

hey,

mio nada great hacker von hardware. just the guy who wants to allocate
coherent buffers into shrinking pci pools, preferably at interrupts.

since everybody seems to come up with something like "well, most systems.."
and "but some arch.." i thought it might be actually of interest to look
it up, no?

<:)

dns


--
___________________________________________________________________________
mailto:[email protected]

And don't EVER make the mistake that you can design something better
than what you get from ruthless massively parallel trial-and-error
with a feedback cycle.
- Linus Torvalds

