>Hello All,
>
>I have a 2.4.x char driver which works fine, except in boxes with lots of
>memory.
>
>user_buffer -> write() -> map_user_kiobuf() -> pci_map_sg() -> PCI DMA
>
>I'm using the .page/.offset version of the scatterlist, but in the HIGHMEM case,
>map_user_kiobuf() seems to return peculiar page addresses.
>
>What is the state of kiobufs/HIGHMEM in 2.4.x? Do I need to implement
>a bounce buffer in the driver? Some email correspondence indicates so,
>but I would be grateful for a definitive word from the kernel folks.
>
I finally googled up a couple of threads that shed some light ...
Seems that page_address() will return 0 when used on a highmem entry
in the kiobuf maplist.
Looks like three (?) options: go back to copying to a kernel DMA
buffer for all cases (swell for performance), split the code path into
map_user and copy_user branches (not that fond of spaghetti),
or - in the highmem case - copy to a local buffer and populate the
kiobuf with those pages and feed that to pci_map_sg().
The last is my preference, as it keeps the code cleaner, and since
my hardware is scatter-gather, I can either build the local buffer out
of discrete pages (at load time) or allocate a (possibly) non-contiguous
kernel buffer. I would prefer not to use kmalloc if possible, since I
don't really need contiguous pages, and would like to keep the chances
of allocation success as high as possible. I haven't yet figured out
how to allocate a (possibly) non-contiguous buffer (since vmalloc is
frowned on), or how to populate the kiobuf with its pages.
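Roughly the shape I have in mind for the discrete-page pool - just an
untested sketch, and the names (MAX_BOUNCE_PAGES and so on) are mine:

#include <linux/mm.h>
#include <linux/errno.h>
#include <asm/scatterlist.h>
#include <asm/uaccess.h>	/* copy_from_user() */

#define MAX_BOUNCE_PAGES 64	/* made-up pool size */

static struct page *bounce_pages[MAX_BOUNCE_PAGES];

/* At load time: grab a pool of single, unrelated pages.  alloc_page()
 * with GFP_KERNEL returns lowmem pages, so there is no contiguity
 * requirement and no highmem problem for the pool itself. */
static int alloc_bounce_pool(void)
{
	int i;

	for (i = 0; i < MAX_BOUNCE_PAGES; i++) {
		bounce_pages[i] = alloc_page(GFP_KERNEL);
		if (!bounce_pages[i]) {
			while (--i >= 0)
				__free_page(bounce_pages[i]);
			return -ENOMEM;
		}
	}
	return 0;
}

/* Per write() in the highmem case: copy the user data into the pool
 * pages and build the scatterlist from them for pci_map_sg(). */
static int fill_bounce_sgl(struct scatterlist *sgl, const char *ubuf,
			   size_t count)
{
	unsigned int i, n = (count + PAGE_SIZE - 1) >> PAGE_SHIFT;

	if (n > MAX_BOUNCE_PAGES)
		return -ENOMEM;
	for (i = 0; i < n; i++) {
		size_t chunk = (count < PAGE_SIZE) ? count : PAGE_SIZE;

		if (copy_from_user(page_address(bounce_pages[i]),
				   ubuf + i * PAGE_SIZE, chunk))
			return -EFAULT;
		sgl[i].page = bounce_pages[i];
		sgl[i].offset = 0;
		sgl[i].length = chunk;
		count -= chunk;
	}
	return n;	/* number of SGL entries ready for pci_map_sg() */
}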
Any advice gratefully accepted,
Bill
---------------------------------------
William D Waddington
Bainbridge Island, WA, USA
[email protected]
[email protected]
[email protected]
---------------------------------------
On Fri, Jul 19, 2002 at 08:00:00AM -0700, William D Waddington wrote:
> Looks like three (?) options: go back to copying to a kernel DMA
> buffer for all cases (swell for performance), split the code path into
> map_user and copy_user branches (not that fond of spaghetti),
> or - in the highmem case - copy to a local buffer and populate the
> kiobuf with those pages and feed that to pci_map_sg().
Or use the PCI-DMA API function pci_map_single() that's documented in
Documentation/DMA-mapping.txt to get a 64-bit bus address? Don't forget to
do a pci_set_dma_mask() too, but that's mentioned in DMA-mapping.txt.
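Something like this, roughly - a sketch only, and pdev/buf/len stand in
for whatever your driver already has:

#include <linux/pci.h>

static int dma_one_buffer(struct pci_dev *pdev, void *buf, size_t len)
{
	dma_addr_t bus;

	/* Once, at init time really: claim full 64-bit addressing.
	 * If this fails the platform can't do it and you must bounce. */
	if (pci_set_dma_mask(pdev, 0xffffffffffffffffULL))
		return -EIO;

	bus = pci_map_single(pdev, buf, len, PCI_DMA_TODEVICE);
	/* hand `bus' to the card, start the DMA, wait for completion */
	pci_unmap_single(pdev, bus, len, PCI_DMA_TODEVICE);
	return 0;
}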
-ben
--
"You will be reincarnated as a toad; and you will be much happier."
On Fri, Jul 19, 2002 at 08:00:00AM -0700, William D Waddington wrote:
> I haven't yet figured out how to allocate a (possibly) non-contiguous
> buffer, since vmalloc is frowned on, or how to populate the kiobuf
> with its pages.
Don't use the kiobuf, since it is bloated.
Try this instead:
#include <linux/mm.h>		/* get_user_pages() */
#include <linux/pagemap.h>	/* page_cache_release() */
#include <linux/sched.h>	/* current, mmap_sem */
#include <linux/string.h>	/* memset() */
#include <linux/fs.h>		/* READ/WRITE */
#include <linux/errno.h>
#include <asm/scatterlist.h>

/* Pin down user pages and put them into a scatter gather list */
int sg_map_user_pages(struct scatterlist *sgl, const unsigned int max_pages,
		      unsigned long uaddr, size_t count, int rw)
{
	int res, i;
	/* pages spanned by [uaddr, uaddr + count) */
	unsigned int nr_pages =
		((uaddr & ~PAGE_MASK) + count + ~PAGE_MASK) >> PAGE_SHIFT;
	struct page *pages[max_pages];	/* on the stack - see warning below */

	/* User attempted overflow! */
	if ((uaddr + count) < uaddr)
		return -EINVAL;

	/* Too big: not enough SGL entries provided */
	if (nr_pages > max_pages)
		return -ENOMEM;

	/* Nothing to do? */
	if (count == 0)
		return 0;

	down_read(&current->mm->mmap_sem);
	res = get_user_pages(
		current,
		current->mm,
		uaddr,
		nr_pages,
		rw == READ,	/* logic is perversed^Wreversed here :-( */
		0,		/* don't force */
		&pages[0],
		NULL);
	up_read(&current->mm->mmap_sem);

	/* Errors and no page mapped should return here */
	if (res <= 0)
		return res;

	/* Partial mapping is no use: drop what we got and bail out */
	if (res < nr_pages) {
		for (i = 0; i < res; i++)
			page_cache_release(pages[i]);
		return -EFAULT;
	}

	memset(sgl, 0, sizeof(*sgl) * nr_pages);

	sgl[0].page = pages[0];
	sgl[0].offset = uaddr & ~PAGE_MASK;	/* offset *within* the page */

	/* Page crossing transfers need these adjustments */
	if (res > 1) {
		for (i = 1; i < res; i++) {
			sgl[i].offset = 0;
			sgl[i].page = pages[i];
			sgl[i].length = PAGE_SIZE;
		}
		sgl[0].length = PAGE_SIZE - sgl[0].offset;
		count -= sgl[0].length;
		count -= (res - 2) * PAGE_SIZE;
	}
	sgl[res - 1].length = count;

	return res;
}
/* And unmap them... */
int sg_unmap_user_pages(struct scatterlist *sgl, const unsigned int nr_pages)
{
	int i;

	for (i = 0; i < nr_pages; i++)
		page_cache_release(sgl[i].page);

	return 0;
}
That will give you nearly the same thing; you just have to lock_page()
the pages and UnlockPage() them yourself.
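I.e. something like this (sketch, for the res entries the function
returned):

	int i;

	for (i = 0; i < res; i++)
		lock_page(sgl[i].page);
	/* pci_map_sg(), run the DMA, pci_unmap_sg() */
	for (i = 0; i < res; i++)
		UnlockPage(sgl[i].page);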
Don't pass too high a value for max_pages, because the on-stack
pages[] array might overflow your stack.
Do the pci_map_sg() after sg_map_user_pages() and pci_unmap_sg()
before sg_unmap_user_pages().
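Putting it together, a transfer comes out roughly like this (sketch;
MAX_PAGES and pdev are placeholders for your driver's own):

static int do_user_dma(struct pci_dev *pdev, struct scatterlist *sgl,
		       unsigned long uaddr, size_t count, int rw)
{
	/* rw == READ means the device fills user memory */
	int dir = (rw == READ) ? PCI_DMA_FROMDEVICE : PCI_DMA_TODEVICE;
	int n, hwents;

	n = sg_map_user_pages(sgl, MAX_PAGES, uaddr, count, rw);
	if (n <= 0)
		return n;

	hwents = pci_map_sg(pdev, sgl, n, dir);
	/* program the card from sg_dma_address(&sgl[i]) / sg_dma_len(&sgl[i])
	 * for i = 0..hwents-1, start it, wait for completion */
	pci_unmap_sg(pdev, sgl, n, dir);
	sg_unmap_user_pages(sgl, n);
	return 0;
}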
This should work. If you have highmem pages and your device can't
handle >32-bit physical addresses, then kmap() them
before pci_map_sg()ing them and kunmap() them after
pci_unmap_sg()ing.
kiobufs are (next to) useless for character io devices.
Hope that helps
Regards
Ingo Oeser
--
Science is what we can tell a computer. Art is everything else. --- D.E.Knuth
> This should work. If you have highmem pages and your device can't
> handle >32-bit physical addresses, then kmap() them
> before pci_map_sg()ing them and kunmap() them after
> pci_unmap_sg()ing.
--verbose please, I don't see how kmap() will fix the 32-bit limit issue.
As far as I know kmap() doesn't move the page in physical memory, but
creates a virtual mapping for it. Thus the kernel (i.e. the CPU) can
access it, but PCI busmasters still can't ...
Gerd
--
You can't please everybody. And usually if you _try_ to please
everybody, the end result is one big mess.
-- Linus Torvalds, 2002-04-20