2003-11-02 18:12:26

by Jamie Wellnitz

Subject: virt_to_page/pci_map_page vs. pci_map_single

I see code similar to the following in a few drivers (qlogicfc,
sym53c8xx, acenic does something similar):

page = virt_to_page(buffer);
offset = ((unsigned long)buffer & ~PAGE_MASK);
busaddr = pci_map_page(pci_dev, page, offset, len, direction);

How is this preferable to:

pci_map_single( pci_dev, buffer, len, direction);

?

pci_map_single can't handle highmem pages (because they don't have a
kernel virtual address) but doesn't virt_to_page suffer from the same
limitation? Is there some benefit on architectures that don't have
highmem?

Thanks,
Jamie Wellnitz
[email protected]


2003-11-03 08:10:52

by Jes Sorensen

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

>>>>> "Jamie" == Jamie Wellnitz <[email protected]> writes:

Jamie> page = virt_to_page(buffer); offset = ((unsigned long)buffer &
Jamie> ~PAGE_MASK); busaddr = pci_map_page(pci_dev, page, offset, len,
Jamie> direction);

Jamie> How is this preferable to:

Jamie> pci_map_single( pci_dev, buffer, len, direction);

Jamie> pci_map_single can't handle highmem pages (because they don't
Jamie> have a kernel virtual address) but doesn't virt_to_page suffer
Jamie> from the same limitation? Is there some benefit on
Jamie> architectures that don't have highmem?

virt_to_page() can handle any page in the standard kernel region,
including pages that are physically in 64-bit space if the
architecture requires it. It doesn't handle vmalloc pages etc., but
one shouldn't try to dynamically map vmalloc pages at
random. pci_map_page() can handle all memory in the system though, as
every page that can be mapped has a struct page * entry.

pci_map_page() is the correct API to use, pci_map_single() is
deprecated.

Cheers,
Jes

2003-11-03 12:53:37

by Jamie Wellnitz

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Mon, Nov 03, 2003 at 03:10:46AM -0500, Jes Sorensen wrote:
> >>>>> "Jamie" == Jamie Wellnitz <[email protected]> writes:
>
> Jamie> page = virt_to_page(buffer); offset = ((unsigned long)buffer &
> Jamie> ~PAGE_MASK); busaddr = pci_map_page(pci_dev, page, offset, len,
> Jamie> direction);
>
> Jamie> How is this preferable to:
>
> Jamie> pci_map_single( pci_dev, buffer, len, direction);
>
> Jamie> pci_map_single can't handle highmem pages (because they don't
> Jamie> have a kernel virtual address) but doesn't virt_to_page suffer
> Jamie> from the same limitation? Is there some benefit on
> Jamie> architectures that don't have highmem?
>
> virt_to_page() can handle any page in the standard kernel region

What is the "standard kernel region"? ZONE_NORMAL?

> including pages that are physically in 64-bit space if the
> architecture requires it. It doesn't handle vmalloc pages etc. but
> one shouldn't try and dynamically map vmalloc pages at
> random. pci_map_page() can handle all memory in the system though as
> every page that can be mapped has a struct page * entry.
>
> pci_map_page() is the correct API to use, pci_map_single() is
> deprecated.

Are you talking about 2.4 or 2.6 or both?

The Documentation/DMA-mapping.txt in 2.6.0-test9 says "To map a single
region, you do:" and then shows pci_map_single. Is DMA-mapping.txt in
need of patching?

>
> Cheers,
> Jes

Thanks,
Jamie
[email protected]

2003-11-03 14:18:05

by Jes Sorensen

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

>>>>> "Jamie" == Jamie Wellnitz <[email protected]> writes:

Jamie> On Mon, Nov 03, 2003 at 03:10:46AM -0500, Jes Sorensen wrote:
>> virt_to_page() can handle any page in the standard kernel region

Jamie> What is the "standard kernel region"? ZONE_NORMAL?

Hmmm, my brain has gotten ia64ified ;-) It's basically the normal
mappings of the kernel, i.e. the kernel text/data/bss segments as well
as anything you do not get back from a dynamic mapping such as
ioremap/vmalloc/kmap.

>> pci_map_page() is the correct API to use, pci_map_single() is
>> deprecated.

Jamie> Are you talking about 2.4 or 2.6 or both?

Both really.

Jamie> The Document/DMA-mapping.txt in 2.6.0-test9 says "To map a
Jamie> single region, you do:" and then shows pci_map_single. Is
Jamie> DMA-mapping.txt in need of patching?

Sounds like it needs an update.

Cheers,
Jes

2003-11-03 18:49:23

by James Bottomley

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single



Jamie> The Document/DMA-mapping.txt in 2.6.0-test9 says "To map a
Jamie> single region, you do:" and then shows pci_map_single. Is
Jamie> DMA-mapping.txt in need of patching?

Sounds like it needs an update.

Erm, I don't think so. pci_map_single() covers a different use case
from pci_map_page().

The thing pci_map_single() can do that pci_map_page() can't is cope with
contiguous regions greater than PAGE_SIZE in length (which you get
either from kmalloc() or __get_free_pages()). This feature is used in
the SCSI layer for instance.

There has been talk of deprecating dma_map_single() in favour of
dma_map_sg() (i.e. make all transfers use scatter/gather and eliminate
dma_map_single() in favour of a single sg entry table) but nothing has
been done about it (at least as far as I know).

James


2003-11-03 22:03:01

by David Mosberger

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

>>>>> On 03 Nov 2003 09:17:59 -0500, Jes Sorensen <[email protected]> said:

>>>>> "Jamie" == Jamie Wellnitz <[email protected]> writes:
Jamie> On Mon, Nov 03, 2003 at 03:10:46AM -0500, Jes Sorensen wrote:
>>> virt_to_page() can handle any page in the standard kernel region

Jamie> What is the "standard kernel region"? ZONE_NORMAL?

Jes> Hmmm, my brain has gotten ia64ified ;-) It's basically the normal
Jes> mappings of the kernel, ie. the kernel text/data/bss segments as well
Jes> as anything you do not get back as a dynamic mapping such as
Jes> ioremap/vmalloc/kmap.

I don't think it's safe to use virt_to_page() on static kernel
addresses (text, data, and bss). For example, ia64 linux nowadays
uses a virtual mapping for the static kernel memory, so it's not part
of the identity-mapped segment.

--david

2003-11-03 22:03:41

by Jamie Wellnitz

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Mon, Nov 03, 2003 at 12:48:42PM -0600, James Bottomley wrote:
>
>
> Jamie> The Document/DMA-mapping.txt in 2.6.0-test9 says "To map a
> Jamie> single region, you do:" and then shows pci_map_single. Is
> Jamie> DMA-mapping.txt in need of patching?
>
> Sounds like it needs an update.
>
> Erm, I don't think so. pci_map_single() covers a different use case
> from pci_map_page().
>
> The thing pci_map_single() can do that pci_map_page() can't is cope with
> contiguous regions greater than PAGE_SIZE in length (which you get
> either from kmalloc() or __get_free_pages()). This feature is used in
> the SCSI layer for instance.

Does "pci_map_page(virt_to_page(addr))" handle contiguous regions of
multiple pages? It seems like the i386 implementation will (assuming
we're dealing with kmalloc'd memory). The semantics of map_single seem
better to me than map_page, though, if I'm mapping a single "region"
of multiple pages.

>
> There has been talk of deprecating dma_map_single() in favour of
> dma_map_sg() (i.e. make all transfers use scatter/gather and eliminate
> dma_map_single() in favour of a single sg entry table) but nothing has
> been done about it (at least as far as I know).
>
> James

Thanks,
Jamie Wellnitz

2003-11-03 23:44:25

by David Miller

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Mon, 3 Nov 2003 14:02:57 -0800
David Mosberger <[email protected]> wrote:

> >>>>> On 03 Nov 2003 09:17:59 -0500, Jes Sorensen <[email protected]> said:
>
> Jes> Hmmm, my brain has gotten ia64ified ;-) It's basically the normal
> Jes> mappings of the kernel, ie. the kernel text/data/bss segments as well
> Jes> as anything you do not get back as a dynamic mapping such as
> Jes> ioremap/vmalloc/kmap.
>
> I don't think it's safe to use virt_to_page() on static kernel
> addresses (text, data, and bss). For example, ia64 linux nowadays
> uses a virtual mapping for the static kernel memory, so it's not part
> of the identity-mapped segment.

That's correct and it'll break on sparc64 for similar reasons.

It's also not safe to do virt_to_page() on kernel stack addresses
either.

2003-11-04 08:42:54

by Ihar 'Philips' Filipau

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

Hi!

Can anyone draw a conclusion?
Which function should be used in which case?

So this will be at least reflected in lkml archives ;-)

David S. Miller wrote:
> On Mon, 3 Nov 2003 14:02:57 -0800
> David Mosberger <[email protected]> wrote:
>
>
>>>>>>>On 03 Nov 2003 09:17:59 -0500, Jes Sorensen <[email protected]> said:
>>
>> Jes> Hmmm, my brain has gotten ia64ified ;-) It's basically the normal
>> Jes> mappings of the kernel, ie. the kernel text/data/bss segments as well
>> Jes> as anything you do not get back as a dynamic mapping such as
>> Jes> ioremap/vmalloc/kmap.
>>
>>I don't think it's safe to use virt_to_page() on static kernel
>>addresses (text, data, and bss). For example, ia64 linux nowadays
>>uses a virtual mapping for the static kernel memory, so it's not part
>>of the identity-mapped segment.
>
>
> That's correct and it'll break on sparc64 for similar reasons.
>
> It's also not safe to do virt_to_page() on kernel stack addresses
> either.
>


--
Ihar 'Philips' Filipau / with best regards from Saarbruecken.
-- _ _ _
"... and for $64000 question, could you get yourself |_|*|_|
vaguely familiar with the notion of on-topic posting?" |_|_|*|
-- Al Viro @ LKML |*|*|*|

2003-11-04 09:48:14

by Jes Sorensen

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

>>>>> "James" == James Bottomley <[email protected]> writes:

James> Erm, I don't think so. pci_map_single() covers a different use
James> case from pci_map_page().

James> The thing pci_map_single() can do that pci_map_page() can't is
James> cope with contiguous regions greater than PAGE_SIZE in length
James> (which you get either from kmalloc() or __get_free_pages()).
James> This feature is used in the SCSI layer for instance.

The question is whether that should be allowed in the first place. Some
IOMMUs will have to map it page-by-page anyway. However, if it is to
remain a valid use, then I don't see why pci_map_page() shouldn't be
able to handle it under the same conditions by passing it a
size > PAGE_SIZE.

Cheers,
Jes


2003-11-04 09:51:01

by Jes Sorensen

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

>>>>> "David" == David Mosberger <[email protected]> writes:

Jes> Hmmm, my brain has gotten ia64ified ;-) It's basically the normal
Jes> mappings of the kernel, ie. the kernel text/data/bss segments as
Jes> well as anything you do not get back as a dynamic mapping such as
Jes> ioremap/vmalloc/kmap.

David> I don't think it's safe to use virt_to_page() on static kernel
David> addresses (text, data, and bss). For example, ia64 linux
David> nowadays uses a virtual mapping for the static kernel memory,
David> so it's not part of the identity-mapped segment.

Mmmm good point, I stand corrected. It doesn't really make sense to do
it anyway, but maybe it's worth underlining in the doc if it hasn't
been done so already.

Cheers,
Jes

2003-11-04 16:36:00

by Matt Porter

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Tue, Nov 04, 2003 at 04:48:10AM -0500, Jes Sorensen wrote:
> >>>>> "James" == James Bottomley <[email protected]> writes:
>
> James> Erm, I don't think so. pci_map_single() covers a different use
> James> case from pci_map_page().
>
> James> The thing pci_map_single() can do that pci_map_page() can't is
> James> cope with contiguous regions greater than PAGE_SIZE in length
> James> (which you get either from kmalloc() or __get_free_pages()).
> James> This feature is used in the SCSI layer for instance.
>
> The question is whether that should be allowed in the first place. Some
> IOMMU's will have to map it page-by-page anyway. However if it is to
> remain a valid use then I don't see why pci_map_page() shouldn't be
> able to handle it under the same conditions by passing it a
> size > PAGE_SIZE.

This raises a question for me regarding these rules in 2.4 versus
2.6. While fixing a bug in PPC's 2.4 pci_map_page/pci_map_sg
implementations I noticed that a scatterlist created by the IDE
subsystem will pass nents by page struct reference with a
size > PAGE_SIZE. Is this a 2.4ism resulting from allowing both
address and page reference scatterlist entries? This isn't
explicitly mentioned in the DMA docs AFAICT. I'm wondering
if this is the same expected behavior in 2.6 as well. If
pci_map_page() is limited to size <= PAGE_SIZE then I would
expect pci_map_sg() to be limited as well (and vice versa).

-Matt

2003-11-04 16:44:20

by James Bottomley

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Tue, 2003-11-04 at 03:48, Jes Sorensen wrote:
> The question is whether that should be allowed in the first place. Some
> IOMMU's will have to map it page-by-page anyway. However if it is to
> remain a valid use then I don't see why pci_map_page() shouldn't be
> able to handle it under the same conditions by passing it a
> size > PAGE_SIZE.

I really don't see what's to be gained by doing this. map_page is for
mapping one page or a fragment of it. It's designed for small zero copy
stuff, like networking. To get it to map more than one page, really we
should pass in an array of struct pages.

If the data you have to map is larger than one page, then the
scatter-gather interface (dma_map_sg()) is a better way to handle it.
dma_map_single() is for the special case of data generated in the
kernel which is virtually and physically contiguous (i.e. generated by
kmalloc or get_free_pages, so no highmem issues either) and which we
haven't passed through the SG machinery. It's special-cased out in all
the drivers that handle SG transactions.

James


2003-11-04 16:48:32

by James Bottomley

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Tue, 2003-11-04 at 10:35, Matt Porter wrote:
> This raises a question for me regarding these rules in 2.4 versus
> 2.6. While fixing a bug in PPC's 2.4 pci_map_page/pci_map_sg
> implementations I noticed that a scatterlist created by the IDE
> subsystem will pass nents by page struct reference with a
> size > PAGE_SIZE. Is this a 2.4ism resulting from allowing both
> address and page reference scatterlist entries? This isn't
> explicitly mentioned in the DMA docs AFAICT. I'm wondering
> if this is the same expected behavior in 2.6 as well. If
> pci_map_page() is limited to size <= PAGE_SIZE then I would
> expect pci_map_sg() to be limited as well (and vice versa).

Not really. By design, the SG interface can handle entries that are
physically contiguous.

If you have a limit on the length of your SG elements (because of the
device hardware, say), you can express this to the block layer with the
blk_queue_max_segment_size() API.

James

2003-11-04 17:11:53

by Matt Porter

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

On Tue, Nov 04, 2003 at 10:47:25AM -0600, James Bottomley wrote:
> On Tue, 2003-11-04 at 10:35, Matt Porter wrote:
> > This raises a question for me regarding these rules in 2.4 versus
> > 2.6. While fixing a bug in PPC's 2.4 pci_map_page/pci_map_sg
> > implementations I noticed that a scatterlist created by the IDE
> > subsystem will pass nents by page struct reference with a
> > size > PAGE_SIZE. Is this a 2.4ism resulting from allowing both
> > address and page reference scatterlist entries? This isn't
> > explicitly mentioned in the DMA docs AFAICT. I'm wondering
> > if this is the same expected behavior in 2.6 as well. If
> > pci_map_page() is limited to size <= PAGE_SIZE then I would
> > expect pci_map_sg() to be limited as well (and vice versa).
>
> Not really. By design, the SG interface can handle entries that are
> physically contiguous.
>
> If you have a limit on the length of your SG elements (because of the
> device hardware, say), you can express this to the block layer with the
> blk_queue_max_segment_size() API.

Ok, then to make this clear to implementers of the
pci_map_page()/pci_map_sg() APIs would the following
addition to the docs be correct?

-Matt

===== Documentation/DMA-mapping.txt 1.17 vs edited =====
--- 1.17/Documentation/DMA-mapping.txt Sat Aug 16 11:46:50 2003
+++ edited/Documentation/DMA-mapping.txt Tue Nov 4 09:59:53 2003
@@ -499,7 +499,8 @@

pci_unmap_page(dev, dma_handle, size, direction);

-Here, "offset" means byte offset within the given page.
+Here, "offset" means byte offset within the given page. In addition,
+"size" must be <= PAGE_SIZE - offset.

With scatterlists, you map a region gathered from several regions by:

@@ -520,6 +521,9 @@
advantage for cards which either cannot do scatter-gather or have very
limited number of scatter-gather entries) and returns the actual number
of sg entries it mapped them to.
+
+Note: sglist entries are not limited to PAGE_SIZE, the pci_map_sg()
+implementation must handle entries with size > PAGE_SIZE.

Then you should loop count times (note: this can be less than nents times)
and use sg_dma_address() and sg_dma_len() macros where you previously

2003-11-05 16:27:01

by Anton Blanchard

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single


> I really don't see what's to be gained by doing this. map_page is for
> mapping one page or a fragment of it. It's designed for small zero copy
> stuff, like networking. To get it to map more than one page, really we
> should pass in an array of struct pages.

As an aside it would be nice if networking used the map_sg infrastructure
for zero copy. Some architectures need to do things to make the DMA
mapping visible to IO and at the moment we do it for each map_page.

Anton

2003-11-06 08:29:15

by Jes Sorensen

Subject: Re: virt_to_page/pci_map_page vs. pci_map_single

>>>>> "James" == James Bottomley <[email protected]> writes:

James> On Tue, 2003-11-04 at 03:48, Jes Sorensen wrote:
>> The question is whether that should be allowed in the first
>> place. Some IOMMU's will have to map it page-by-page
>> anyway. However if it is to remain a valid use then I don't see why
>> pci_map_page() shouldn't be able to handle it under the same
>> conditions by passing it a size > PAGE_SIZE.

James> I really don't see what's to be gained by doing this. map_page
James> is for mapping one page or a fragment of it. It's designed for
James> small zero copy stuff, like networking. To get it to map more
James> than one page, really we should pass in an array of struct
James> pages.

I am totally in favor of that. I think it's a really bad idea to rely
on the pci_map infrastructure to do the page chopping for multi-page
mappings, since the IOMMUs will normally have to chop it up anyway.
Driver authors need to be aware of this.

The above was more meant as an example of how pci_map_page() can be
hacked to do the same thing as pci_map_single if we really wanted to
rely on that behavior.

Cheers,
Jes