2010-07-28 10:14:27

by Evan Lavelle

Subject: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

I'm writing a driver for a PCIe card which has to support DMA.

I can get this to work by using 'pci_alloc_consistent' to get a coherent
mapping for a DMA buffer (when I pass the returned 'dma_addr_t' to my
card, it can use it to successfully DMA into the PC).

The problem is that I actually want to use a streaming mapping, for
direct I/O, with a scatter-gather list, and I can't get this to work.
'pci_map_sg' returns me a bus address for the user's read buffer, but
this doesn't appear to be a valid bus address. When the PCIe card DMAs
to this bus address, the DMA operation appears to complete, but the
user's read buffer is not modified. Any ideas?

This is the code that works:

dmaCPUAddr = pci_alloc_consistent(
    PCI_Dev_Cfg, dmaBufSize, &dmaPCIBusAddr);

This allocates a 64KB kernel buffer. The PCIe card can DMA into this
buffer, and I can then copy this buffer back to the user.
'dmaPCIBusAddr' is set to something in the region of 0x32xxxxxx to
0x34xxxxxx (on x86), so this is presumably a valid bus address into
kernel memory.

This is a simplified version of the code which doesn't work:

===================================================
// get bus addresses to DMA into user memory
// 'userAddr', for 'numPages' pages

down_read(&current->mm->mmap_sem);
ret = get_user_pages(
    current, current->mm, userAddr,
    numPages, 1, 1, pageList, NULL);
up_read(&current->mm->mmap_sem);

if(ret != numPages) {...error}

...'kmalloc' and clear scatterlist 'sgList'

for(i = 0; i < numPages; i++) {
    if(pageList[i] == NULL)
        ... error
    sgList[i].page   = pageList[i];
    sgList[i].offset = 0;
    sgList[i].length = PAGE_SIZE;
}

sgLen = pci_map_sg(
    pPciDev,     // the PCI device
    sgList,      // the place to store the list
    numPages,    // how many initial pages are in it
    direction);  // data flow direction

if(sgLen == 0) { ...error }

{ // DEBUG
    struct scatterlist *sg = sgList;
    for(i = 0; i < sgLen; i++, sg++)
        printk(
            KERN_INFO "busAddr 0x%08x; len %d\n",
            (u32)sg_dma_address(sg), sg_dma_len(sg));
}
===================================================
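
Note that the DEBUG block casts the bus address to 'u32' before printing it. On a kernel where 'dma_addr_t' is wider than 32 bits, that cast would hide any upper address bits in the log. The following is a minimal user-space sketch of the effect, not driver code: 'bus_addr_t', 'low32_as_printed' and 'print_bus_addr' are hypothetical names, and the 64-bit width is an assumption about the kernel configuration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a 64-bit dma_addr_t (PAE or 64-bit kernel). */
typedef uint64_t bus_addr_t;

/* What a (u32) cast keeps: only bits 0..31 of the bus address. */
static uint32_t low32_as_printed(bus_addr_t addr)
{
    return (uint32_t)addr;
}

/* Print all 64 bits so that bit 32 and above stay visible in the log.
 * Returns the number of characters written, as fprintf does. */
static int print_bus_addr(FILE *out, bus_addr_t addr, unsigned int len)
{
    assert(out != NULL);
    return fprintf(out, "busAddr 0x%016" PRIx64 "; len %u\n", addr, len);
}
```

In kernel code the usual equivalent is to cast the address to 'unsigned long long' and print it with '%llx'.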

For one page of user memory, 'sg_dma_address' returns a bus address in
the region of 0x13xxxxxx (on x86). When the PCIe card tries to DMA to
this address, the data disappears - I can't see it in my userland test
program. Any ideas?

Thanks -

Evan

================================================

Extra information:

- modern x86_64 HP multiprocessor server motherboard, running 32-bit
RHEL 5.1

- 2.6.18-53el5xen, i686/athlon/i386

- 'page_address(pageList[0])' sometimes returns NULL, which I find
surprising - isn't this meant to be locked down?

- 'kmap(pageList[0])' always returns a valid address, and I can use this
address to write directly to the user-space buffer from the driver

- 'virt_to_bus', 'virt_to_phys', and '__pa' don't seem to do anything
useful on this platform; they just give a fixed offset from the user
virtual address, which is nowhere near the bus address which works


2010-08-04 09:28:06

by Evan Lavelle

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

Made some progress here. The problem is that this is a 32-bit PAE kernel,
so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
32 bits but this is *not* sufficient:

pci_set_dma_mask(my_dev, DMA_32BIT_MASK)

Is this a bug, or do I have to do something else? LDD doesn't seem to
have anything to say about this. I had previously assumed that an IOMMU
would translate the (32-bit) dma_addr_t to a 36- or 64-bit value, but I
don't think there's an IOMMU in this system. Do x86 systems have IOMMUs?
This is a server motherboard, so I don't think it even has AGP. However,
even if I had an IOMMU, I would still need a way to generate a 32-bit
dma_addr_t to start with.

Second problem: can I use the scatter-gather code ('pci_map_sg') on PAE
or 64-bit systems? I've found one post that says this isn't possible,
and that the DAC routines have to be used instead (second post in
http://www.alteraforum.com/forum/showthread.php?t=4171). These comments
seem to be incorrect, but I'd appreciate some confirmation of this.

The specific question in my first post was why the coherent mapping
worked, and the streaming mapping didn't. The answer was that, for this
system, the dma_addr_t for a coherent buffer in kernel space is in the
low 4GB, but the dma_addr_t for the streaming buffer in user space has
bit 32 set. I hadn't realised that dma_addr_t was 64-bit, and I was just
writing the low 32 bits to the DMA registers on the PCI card. The DMA op
to the coherent buffer worked, but the DMA op to the streaming buffer
didn't, since the PCIe card can't drive bit 32.
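
A minimal user-space sketch of a guard against this kind of silent truncation is shown below. It is an illustration under stated assumptions, not the driver's actual code: 'bus_addr_t' stands in for a 64-bit 'dma_addr_t', and 'DEVICE_DMA_MASK', 'device_can_reach' and 'program_dma_addr' are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t bus_addr_t;           /* hypothetical stand-in for dma_addr_t */

#define DEVICE_DMA_MASK 0xFFFFFFFFULL  /* assumption: the card drives 32 address bits */

/* True if every byte of [addr, addr + len) is reachable by the device. */
static bool device_can_reach(bus_addr_t addr, uint32_t len)
{
    assert(len > 0);
    return addr <= DEVICE_DMA_MASK &&
           (uint64_t)len - 1 <= DEVICE_DMA_MASK - addr;
}

/* Hypothetical register write: refuse instead of silently truncating. */
static int program_dma_addr(uint32_t *reg, bus_addr_t addr, uint32_t len)
{
    if (!device_can_reach(addr, len))
        return -1;           /* caller must bounce or fail the request */
    *reg = (uint32_t)addr;   /* safe: the upper 32 bits are known to be zero */
    return 0;
}
```

With a check like this, an address with bit 32 set is rejected up front rather than being written to the card with its top bits dropped.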

Thanks -

Evan

2010-08-04 10:09:03

by FUJITA Tomonori

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

On Wed, 04 Aug 2010 10:26:29 +0100
Evan Lavelle wrote:

> Made some progress here. The problem is that this is a 32-bit PAE kernel,
> so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
> a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
> 32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
> 32 bits but this is *not* sufficient:
>
> pci_set_dma_mask(my_dev, DMA_32BIT_MASK)

It doesn't work on an x86_32 kernel if your driver doesn't work with the
block layer or the network subsystem.

If your driver can't handle 64-bit DMA, you need a bounce buffer. I don't
know what your driver does, but a subsystem passes a buffer to your
driver. If the buffer is not below the 32-bit boundary and, for example,
you are reading data from hardware, you need to allocate a temporary
buffer (below the 32-bit boundary), do the DMA into that buffer, copy the
data to the original buffer, then free the temporary buffer.

The block layer and the network subsystem have their own bounce
mechanisms. The x86_64 kernel has swiotlb, which is the generic bounce
buffer mechanism. So if a driver sets the DMA mask, they do the bouncing
for the driver.
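
The allocate, DMA, copy, free sequence described above can be sketched in user space as follows. This is only a model of the control flow: 'malloc' stands in for an allocation that the device can reach, and 'dma_read_fn', 'fake_device_read' and 'bounce_read' are hypothetical names that do not come from any kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device hook: "DMAs" len bytes into a reachable buffer. */
typedef void (*dma_read_fn)(unsigned char *reachable, size_t len);

/* Hypothetical fake device that produces a repeating 0..255 pattern. */
static void fake_device_read(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(i & 0xFF);
}

/* Read len bytes from the device into dst, which the device cannot reach.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
static int bounce_read(dma_read_fn dev_read, unsigned char *dst, size_t len)
{
    unsigned char *tmp;

    assert(dev_read != NULL && dst != NULL);
    if (len == 0)
        return 0;
    tmp = malloc(len);        /* stands in for an allocation below 4GB */
    if (tmp == NULL)
        return -1;
    dev_read(tmp, len);       /* 1. device writes into the temporary buffer */
    memcpy(dst, tmp, len);    /* 2. CPU copies to the original buffer */
    free(tmp);                /* 3. release the temporary buffer */
    return 0;
}
```

The extra copy is the cost of bouncing: every byte is moved once by the device and once more by the CPU.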

2010-08-04 11:22:46

by Evan Lavelle

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

FUJITA Tomonori wrote:
>> Made some progress here. The problem is that this is a 32-bit PAE kernel,
>> so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
>> a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
>> 32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
>> 32 bits but this is *not* sufficient:
>>
>> pci_set_dma_mask(my_dev, DMA_32BIT_MASK)
>
> It doesn't work on an x86_32 kernel if your driver doesn't work with the
> block layer or the network subsystem.

Sorry, not sure that I understand this. Are you saying that I can't set
a DMA mask on x86_32 unless I have a block or network driver?

> If your driver can't handle 64-bit DMA, you need a bounce buffer.

The problem is not that I can't handle 64-bit DMA in the driver, but
that the PCI card can't do 64-bit DMA. I tell the kernel this by calling
'pci_set_dma_mask' with a 32-bit mask, but it appears to be ignoring my
request and then giving me a 64-bit dma_addr_t for the 32-bit PCI card.

Thanks -

Evan

2010-08-04 12:04:07

by FUJITA Tomonori

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

On Wed, 04 Aug 2010 12:22:32 +0100
Evan Lavelle wrote:

> FUJITA Tomonori wrote:
> >> Made some progress here. The problem is that this is a 32-bit PAE kernel,
> >> so 'dma_addr_t' is 64-bit. However, I have a 32-bit PCIe card, so I need
> >> a 32-bit dma_addr_t. How do I do this? In other words, how do I handle
> >> 32-bit PCI cards on PAE or 64-bit systems? My code sets the DMA mask to
> >> 32 bits but this is *not* sufficient:
> >>
> >> pci_set_dma_mask(my_dev, DMA_32BIT_MASK)
> >
> > It doesn't work on an x86_32 kernel if your driver doesn't work with the
> > block layer or the network subsystem.
>
> Sorry, not sure that I understand this. Are you saying that I can't set
> a DMA mask on x86_32 unless I have a block or network driver?

Yeah, the mask is ignored. As I wrote in the previous mail, x86_32
doesn't have a bounce mechanism, so dma_map_{single|sg} can't do
anything for a buffer above the 32-bit boundary even if the mask is 32 bits.
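
Since the mapping call may hand back addresses outside the mask in this situation, a driver can check the mapped entries itself before starting the transfer. The sketch below models that check in user space; 'struct seg' is a hypothetical stand-in for a mapped scatterlist entry, and 'first_unreachable' is not a kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one mapped scatterlist entry. */
struct seg {
    uint64_t bus_addr;   /* what sg_dma_address() would report */
    uint32_t len;        /* what sg_dma_len() would report */
};

/* Index of the first segment that reaches past mask, or -1 if all fit. */
static long first_unreachable(const struct seg *segs, size_t n, uint64_t mask)
{
    assert(segs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        uint64_t last;

        if (segs[i].len == 0)
            continue;                                  /* nothing to transfer */
        last = segs[i].bus_addr + segs[i].len - 1;     /* last byte touched */
        if (segs[i].bus_addr > mask || last > mask)
            return (long)i;
    }
    return -1;
}
```

A result other than -1 means the request has to go through a bounce path or be failed, because the card cannot address at least one segment.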


> > If your driver can't handle 64-bit DMA, you need a bounce buffer.
>
> The problem is not that I can't handle 64-bit DMA in the driver, but
> that the PCI card can't do 64-bit DMA. I tell the kernel this by calling
> 'pci_set_dma_mask' with a 32-bit mask, but it appears to be ignoring my
> request and then giving me a 64-bit dma_addr_t for the 32-bit PCI card.

If your card can't do 64-bit DMA, you need a bounce buffer mechanism.

Options are:

- your driver implements its own bounce buffer mechanism (as some
drivers do).

- add swiotlb support to x86_32 (I don't think that it's difficult but
I might miss something).
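
For the first option, one possible shape is a fixed bounce buffer that the card can reach, with larger reads split into chunks. The following user-space sketch only models that loop. The 64KB size mirrors the coherent buffer mentioned earlier in the thread; 'bounce', 'fake_dma_into_bounce' and 'read_via_bounce' are hypothetical names, and the fake device stands in for starting a DMA and waiting for it to complete.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BOUNCE_SIZE 65536u   /* mirrors the 64KB coherent buffer in the thread */

/* Hypothetical stand-in for the coherent buffer the card can reach. */
static unsigned char bounce[BOUNCE_SIZE];

/* Hypothetical device model: fills the bounce buffer with the data that
 * starts at stream offset off. A real driver would run a DMA here. */
static void fake_dma_into_bounce(size_t off, size_t len)
{
    assert(len <= BOUNCE_SIZE);
    for (size_t i = 0; i < len; i++)
        bounce[i] = (unsigned char)((off + i) % 251);
}

/* Fill dst with total bytes, one bounce-buffer-sized chunk at a time.
 * Returns the number of DMA operations issued. */
static size_t read_via_bounce(unsigned char *dst, size_t total)
{
    size_t done = 0, ops = 0;

    assert(dst != NULL || total == 0);
    while (done < total) {
        size_t chunk = total - done;

        if (chunk > BOUNCE_SIZE)
            chunk = BOUNCE_SIZE;
        fake_dma_into_bounce(done, chunk);
        memcpy(dst + done, bounce, chunk);  /* a driver would use copy_to_user() */
        done += chunk;
        ops++;
    }
    return ops;
}
```

The trade-off is one CPU copy per chunk, which limits throughput but keeps every DMA target inside memory the card is known to reach.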

2010-08-04 14:52:45

by Konrad Rzeszutek Wilk

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

On Wed, Aug 04, 2010 at 09:03:53PM +0900, FUJITA Tomonori wrote:
> If your card can't do 64-bit DMA, you need a bounce buffer mechanism.
>
> Options are:
>
> - your driver implements its own bounce buffer mechanism (as some
> drivers do).
>
> - add swiotlb support to x86_32 (I don't think that it's difficult but
> I might miss something).

I think the highmem support might be a bit tricky. The PowerPC folks
did some work in there, so it _ought_ to work.

Evan, you could edit arch/x86/Kconfig and change:

    config SWIOTLB
            def_bool y if X86_64

to say

            def_bool y if X86

and see how it works? FYI, it might wreak havoc on your machine though,
so be sure you have a fail-safe kernel and back up your root/home
directory.

(FYI, I made Xen-SWIOTLB be capable of running under X86_32 and so far
no trouble.. but that is not baremetal obviously).

2010-08-13 01:36:05

by Yuhong Bao

Subject: RE: Driver: PCIe: 'pci_map_sg' returning invalid bus address?


> > - add swiotlb support to x86_32 (I don't think that it's difficult but
> > I might miss something).
>
> I think the highmem support might be a bit tricky. The PowerPC folks
> did some work in there, so it _ought_ to work.
>
> Evan, you could edit arch/x86/Kconfig and change:
> config SWIOTLB
> def_bool y if X86_64
>
> to say
> def_bool y if X86
>
> and see how it works? FYI, it might wreak havoc on your machine though,
> so be sure you have a fail-safe kernel and back up your root/home
> directory.
>
> (FYI, I made Xen-SWIOTLB be capable of running under X86_32 and so far
> no trouble.. but that is not baremetal obviously).

In fact, if you are going to port swiotlb, why not port the entire IOMMU
support to x86_32 with PAE too? I am really irritated at how the x86-64
port was developed completely separately from mainline, when it is just a
variant of the same x86 arch. For another example of this, look at the
history of ACPI SRAT support in Linux.

Yuhong Bao

2010-08-14 15:25:59

by Evan Lavelle

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

Thanks guys. I had to get this working quickly so I just stuck with my
bounce buffer code. I'm not sure that it's technically a 'bounce
buffer'; it just so happens that 'pci_alloc_consistent' returns an
address in the low 32 bits. This may stop working if the user installs
more than 4Gig of memory but I can live with that for now. Performance
isn't great (~110Mbytes/s on 4-channel PCIe) but it's good enough.

It's disappointing that LDD didn't have anything to say about this; it's
pretty fundamental to DMA on x86_32 and PAE.

Thanks -

Evan

2010-08-16 03:31:34

by Robert Hancock

Subject: Re: Driver: PCIe: 'pci_map_sg' returning invalid bus address?

On 08/14/2010 09:25 AM, Evan Lavelle wrote:
> Thanks guys. I had to get this working quickly so I just stuck with my
> bounce buffer code. I'm not sure that it's technically a 'bounce
> buffer'; it just so happens that 'pci_alloc_consistent' returns an
> address in the low 32 bits. This may stop working if the user installs
> more than 4Gig of memory but I can live with that for now. Performance

That should be safe as far as x86-32 goes - pci_alloc_consistent will
return a low memory address which will be below 1GB.

> isn't great (~110Mbytes/s on 4-channel PCIe) but it's good enough.
>
> It's disappointing that LDD didn't have anything to say about this; it's
> pretty fundamental to DMA on x86_32 and PAE.

The situation kind of sucks with that combination, yes. The block and
network subsystems have their own workarounds but other drivers just
have to sort of hack something together.