Hi,
I've been looking over the 2.3.41 patch, and have come across a major problem
area for ARM.
On ARM, there is no such thing as "dma coherent" memory. Unfortunately, the
new PCI code (pci_alloc_consistent) appears to assume that there is a way
of doing this.
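For reference, the interface in question looks roughly like this in
2.3.41 (prototypes only; the comments are my reading of the promise it
makes, which is exactly the promise ARM can't keep):

        /* return 'size' bytes that both CPU and device can touch
           without any explicit cache maintenance; the bus address
           for the device comes back in *dma_handle */
        void *pci_alloc_consistent(struct pci_dev *hwdev, size_t size,
                                   dma_addr_t *dma_handle);
        void pci_free_consistent(struct pci_dev *hwdev, size_t size,
                                 void *vaddr, dma_addr_t dma_handle);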
I have had ideas about ways to do this on the ARM, but it will not be trivial
changes to the mm layer, and certainly has not been implemented yet.
This effectively means that I seem to have two options:
1. either we lose any hope of IDE DMA for the rest of 2.3 and 2.4, or
2. the IDE DMA code gets the dma_cache_* macros added back in
I would have preferred to have heard about the extent of these changes (and
that the dma_cache_* macros were going to be removed, along with my comments
marking them with my initials) before it was submitted.
For now, I'm adding the dma_cache_* macros back in, and if I don't hear anything,
I will be re-submitting that code back to Linus.
(very pissed)
_____
|_____| ------------------------------------------------- ---+---+-
| | Russell King [email protected] --- ---
| | | | http://www.arm.linux.org.uk/~rmk/aboutme.html / / |
| +-+-+ --- -+-
/ | THE developer of ARM Linux |+| /|\
/ | | | --- |
+-+-+ ------------------------------------------------- /\\\ |
Russell,
I missed covering your a**... yes, this is a WTF.
Where did you find the break point? It will get fixed.
Andre Hedrick
The Linux ATA/IDE guy
THE USE OF EMAIL FOR THE TRANSMISSION OF UNSOLICITED COMMERCIAL
MATERIAL IS PROHIBITED UNDER FEDERAL LAW (47 USC 227). Violations may
result in civil penalties and claims of $500.00 PER OCCURRENCE
(47 USC 227[c]). Commercial spam WILL be forwarded to postmasters.
On Sun, Jan 30, 2000 at 12:06:15AM +0000, Russell King wrote:
> Hi,
>
> I've been looking over the 2.3.41 patch, and have come across a major problem
> area for ARM.
>
> On ARM, there is no such thing as "dma coherent" memory. Unfortunately, the
> new PCI code (pci_alloc_consistent) appears to assume that there is a way
> of doing this.
Some SPARCs are not DMA coherent either, and a similar interface has
worked for them for quite a few years.
With the pci_map_single/pci_map_sg/pci_unmap_single/pci_unmap_sg
you can sync caches in those routines as required (plus there are
pci_dma_sync_single/pci_dma_sync_sg which should sync as well).
With pci_alloc_consistent, on DMA non-coherent systems the trick is to
allocate non-cacheable memory (or make it uncacheable after allocating).
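Roughly like this (untested sketch; dma_cache_wback_inv here stands in
for whatever writeback+invalidate primitive the port provides):

        dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
                                  size_t size)
        {
                /* non-coherent port: push dirty lines out to memory
                   and drop the cached copies before the device looks
                   at the buffer */
                dma_cache_wback_inv((unsigned long)ptr, size);
                return virt_to_bus(ptr);
        }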
>
> I have had ideas about ways to do this on the ARM, but it will not be trivial
> changes to the mm layer, and certainly has not been implemented yet.
>
> This effectively means that I seem to have two options:
>
> 1. either we lose any hope of IDE DMA for the rest of 2.3 and 2.4, or
> 2. the IDE DMA code gets the dma_cache_* macros added back in
>
> I would have preferred to have heard about the extent of these changes (and
> that the dma_cache_* macros were going to be removed, along with my comments
> marking them with my initials) before it was submitted.
The interface was outlined e.g. during the
Alpha: virt_to_bus/GFP_DMA problem
thread on l-k in December.
Cheers,
Jakub
___________________________________________________________________
Jakub Jelinek | [email protected] | http://sunsite.mff.cuni.cz/~jj
Linux version 2.3.41 on a sparc64 machine (1343.49 BogoMips)
___________________________________________________________________
Jakub Jelinek writes:
> With the pci_map_single/pci_map_sg/pci_unmap_single/pci_unmap_sg
> you can sync caches in those routines as required (plus there are
> pci_dma_sync_single/pci_dma_sync_sg which should sync as well).
Unfortunately, that is not sufficient. Those pci_* functions have no
knowledge of what the memory is going to be used for, and they need to
know. I am assuming that a particular usage of this interface may be:
        CPU writes to block (some time in the past)
        pci_map_single (cleans and flushes cache)
        dma operation (dma writes to block)
        CPU reads from block
        pci_unmap_single
You have expended possibly a lot of CPU cycles writing data out to
memory, which is then immediately overwritten by the DMA operation,
when instead you could just flush the cache to discard the data.
[ Note: clean means write back the data to memory, flush means remove data
from cache here. These are two distinctly different operations ].
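In those terms, and using the dma_cache_* names loosely, the waste
looks like this:

        /* what a direction-blind mapping function has to do: */
        dma_cache_wback(ptr, size);     /* clean: write dirty lines back */
        dma_cache_inv(ptr, size);       /* flush: discard cached copies  */

        /* what would suffice if it knew the device only writes: */
        dma_cache_inv(ptr, size);       /* flush alone, no writeback     */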
> With pci_alloc_consistent, on DMA non-coherent systems the trick is to
> allocate non-cacheable memory (or make it uncacheable after allocating).
This is one point that I keep on making over and over again whenever the
DMA cache discussion comes up. People think "oh, page tables, one entry
per page, you can turn off the cache on a per-page basis".
I'm sure people are aware of the 4MB pages on the Intel architecture? Well,
believe it or not, other architectures have them. In this specific case,
the ARM has 1MB large pages. We use them to map in the kernel memory, and
IO, and they save TLB lookups etc.
With this scheme that's being advocated, not only does it hit us with
inefficiency in cache handling, it also hits us on the TLB. Double
whammy.
> The interface was outlined e.g. during the
> Alpha: virt_to_bus/GFP_DMA problem
> thread on l-k in December.
Err, this pci_* interface wasn't. I still have a record of the DMA coherency
discussion sitting in my l-k mailbox. The discussion was between Pete, Ingo,
Roman, Jes, Ralf and myself.
_____
|_____| ------------------------------------------------- ---+---+-
| | Russell King [email protected] --- ---
| | | | http://www.arm.linux.org.uk/~rmk/aboutme.html / / |
| +-+-+ --- -+-
/ | THE developer of ARM Linux |+| /|\
/ | | | --- |
+-+-+ ------------------------------------------------- /\\\ |
   From: Russell King <[email protected]>
   Date: Sun, 30 Jan 2000 00:06:15 +0000 (GMT)

   On ARM, there is no such thing as "dma coherent" memory. Unfortunately, the
   new PCI code (pci_alloc_consistent) appears to assume that there is a way
   of doing this.
You have no mechanism whatsoever to disable the cache on a per-page
basis with MMU mappings? This would very much surprise me.
   I would have preferred to have heard about the extent of these changes (and
   that the dma_cache_* macros were going to be removed, along with my comments
   marking them with my initials) before it was submitted.
For the actual transfers, you can do the dma_cache_*() calls in the
pci_{un,}map_streaming() calls.
The only place you could possibly need it is for the IDE scatter list
tables, and that would only be if you have _no_ mechanism to disable
the CPU cache in the MMU, which I severely doubt.
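Concretely, the PRD table would then simply come from the consistent
allocator. A sketch, with field and constant names invented for
illustration:

        /* allocate the busmaster PRD table in consistent memory;
           prd_cpu is what the kernel dereferences, prd_dma is what
           gets programmed into the busmaster address register */
        hwif->prd_cpu = pci_alloc_consistent(hwif->pci_dev,
                                             PRD_ENTRIES * PRD_BYTES,
                                             &hwif->prd_dma);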
   For now, I'm adding the dma_cache_* macros back in, and if I don't hear anything,
   I will be re-submitting that code back to Linus.
   (very pissed)
Actually, don't be pissed, and instead work with us to get your
port working again. The changes were designed so that all of
the cache flushing hacks could just disappear and be hidden
within the new interfaces on the ports which needed them.
I was completely convinced that if all dma mappings were handled
explicitly in the manner they are now, none of that crap would be
needed anymore.
Later,
David S. Miller
[email protected]
I've taken this conversation offline in the hope of making it a
productive conversation instead of one of pointing fingers.
I'm in no way asking you to map the kernel without the larger
mappings, and I've even shown you two methods by which to
do this in the private emails.
Later,
David S. Miller
[email protected]
David,
It is important that we all work together, but there has been at least one
other occasion where this code was nuked by accident. And yes, getting
pissed off has never worked, going back to the beginning of the hand-off
of IDE between 2.1.112 and 2.1.122. Russell, I am in your corner to help
fix this.
Andre Hedrick
The Linux ATA/IDE guy
THE USE OF EMAIL FOR THE TRANSMISSION OF UNSOLICITED COMMERCIAL
MATERIAL IS PROHIBITED UNDER FEDERAL LAW (47 USC 227). Violations may
result in civil penalties and claims of $500.00 PER OCCURRENCE
(47 USC 227[c]). Commercial spam WILL be forwarded to postmasters.
   Date: Sun, 30 Jan 2000 20:27:18 -0800 (PST)
   From: Andre Hedrick <[email protected]>

   It is important that we all work together, but there has been at
   least one other occasion where this code was nuked by accident.
   And yes, getting pissed off has never worked, going back to the
   beginning of the hand-off of IDE between 2.1.112 and 2.1.122.
   Russell, I am in your corner to help fix this.
This was an intentional nuke. And I am working offline with Russell,
Jakub, and Alan to address the issues.
I am very sure that making the new (and documented) interfaces work
properly for him is going to be preferable to sticking the flush hacks
back in. I say this because the new interfaces create a situation where
properly written PCI drivers of any type will work on his, and every,
platform wrt. DMA issues - not just the drivers he happens to sprinkle
his dma flush calls into.
Later,
David S. Miller
[email protected]
David S. Miller writes:
> You have no mechanism whatsoever to disable the cache on a per-page
> basis with MMU mappings? This would very much surprise me.
Please see my later mailing about 1MB mappings. I do not wish to map
the whole of kernel memory using 4k page tables - that would be just
too sick to even consider. What if I came up with something which
required x86 to map in its kernel memory using ptes, i.e. which
prevented the use of PSEs? How would that be accepted? Probably the
same way that I'm reacting to this change, but on a larger scale.
> For the actual transfers, you can do the dma_cache_*() calls in the
> pci_{un,}map_streaming() calls.
That is what I thought, but this introduces one of two problems:
1. pci_map_* needs to handle the cache *and* pci_unmap_* has to as well,
thereby expending twice the number of CPU cycles over cache handling
that the old way did.
2. pci_map_* needs to clean and flush the cache unconditionally, which
will result in DMA reads needlessly cleaning stale data out of the
cache (with the associated slow bus cycles).
Either way, we suffer at least a doubling of the cache-handling cost
over what the old way allowed.
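What would avoid both costs is the mapping function being told the
transfer direction. A hypothetical sketch - this is not the 2.3.41
interface, and it glosses over partially-dirty cache lines:

        #define PCI_DMA_TODEVICE   1    /* device reads the buffer  */
        #define PCI_DMA_FROMDEVICE 2    /* device writes the buffer */

        dma_addr_t pci_map_single(struct pci_dev *hwdev, void *ptr,
                                  size_t size, int direction)
        {
                if (direction == PCI_DMA_TODEVICE)
                        dma_cache_wback((unsigned long)ptr, size);
                else                            /* FROMDEVICE */
                        dma_cache_inv((unsigned long)ptr, size);
                return virt_to_bus(ptr);
        }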
> The only place you could possibly need it is for the IDE scatter list
> tables, and that would only be if you have _no_ mechanism to disable
> the CPU cache in the MMU, which I severely doubt.
Read above.
> Actually, don't be pissed, and instead work with us to get your
> port working again. The changes were designed so that all of
> the cache flushing hacks could just disappear and be hidden
> within the new interfaces on the ports which needed them.
Unfortunately, they are not designed well enough.
> I was completely convinced that if all dma mappings were handled
> explicitly in the manner they are now, none of that crap would be
> needed anymore.
My main bugbear with this is that there was virtually no evidence of
discussion of this method of fixing the problem among the concerned
people (i.e. the architecture maintainers) who get hit hardest by the
change. If there had been, we wouldn't be in the situation we are in
now.
_____
|_____| ------------------------------------------------- ---+---+-
| | Russell King [email protected] --- ---
| | | | http://www.arm.linux.org.uk/~rmk/aboutme.html / / |
| +-+-+ --- -+-
/ | THE developer of ARM Linux |+| /|\
/ | | | --- |
+-+-+ ------------------------------------------------- /\\\ |
>>>>> "David" == David S Miller <[email protected]> writes:
David> From: Russell King <[email protected]> Date: Sun, 30 Jan
David> 2000 00:06:15 +0000 (GMT)
> I would have preferred to have heard about the extent of
> these changes (and that the dma_cache_* macros were going to be
> removed, along with my comments marking them with my initials)
> before it was submitted.
David> For the actual transfers, you can do the dma_cache_*() calls in
David> the pci_{un,}map_streaming() calls.
David> The only place you could possibly need it is for the IDE
David> scatter list tables, and that would only be if you have _no_
David> mechanism to disable the CPU cache in the MMU, which I severely
David> doubt.
Hmmm, ok, I just noticed this and I haven't read that DMA mapping
document yet. I'll have to look at it to see how it affects PCI
devices that are 64-bit address capable.
The one thing for the m68k is that we have very few machines with
PCI, though we still suffer a lot from the DMA coherency problem on
the busses we do have.
The place where this is a real problem is in drivers where data is
shared between the adapter and the host CPU, for instance the 53c7xx
driver. On the m68k we currently use a kernel_set_cachemode() function
to change the caching of the page allocated for the shared structures,
but that's a pretty non-portable way of doing it. I would like to see
something like a get_free_cachecoherent_page() interface instead; what
do you think of that?
Jes
   From: Jes Sorensen <[email protected]>
   Date: 31 Jan 2000 11:02:28 +0100

   The place where this is a real problem is in drivers where data is
   shared between the adapter and the host CPU, for instance the
   53c7xx driver. On the m68k we currently use a
   kernel_set_cachemode() function to change the caching of the page
   allocated for the shared structures, but that's a pretty
   non-portable way of doing it. I would like to see something like a
   get_free_cachecoherent_page() interface instead; what do you think
   of that?
Sounds like pci_alloc_consistent().
Go read the document and check out the interfaces, I believe
you'll be pleasantly surprised :-)
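For the 53c7xx shared-structure case it comes out something like this
(sketch; the structure is made up, pdev is your struct pci_dev):

        struct ncr_shared {             /* invented example layout */
                u32 script[64];
        };

        struct ncr_shared *shm;
        dma_addr_t shm_dma;

        /* one CPU pointer for the driver, one bus address for the
           chip; a non-coherent port hands back uncached memory */
        shm = pci_alloc_consistent(pdev, sizeof(*shm), &shm_dma);
        if (shm == NULL)
                return -ENOMEM;
        /* ... program the chip with shm_dma, use *shm from the CPU ... */
        pci_free_consistent(pdev, sizeof(*shm), shm, shm_dma);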
Later,
David S. Miller
[email protected]