Hello, all.
This subject was brought up on the linux-ide mailing list by James
Steward in the following message.
http://marc.theaimsgroup.com/?l=linux-ide&m=113494968723962&w=2
Russell King kindly explained the problem in detail in the following
message.
http://marc.theaimsgroup.com/?l=linux-ide&m=113517728717638&w=2
To restate the problem for the purpose of discussion: some block
drivers (IDE, libata and a few SCSI drivers) sometimes perform PIO
instead of DMA. Block drivers often perform I/O on page cache pages,
which can also be mapped into user space. On machines where caches
are physically indexed, PIO doesn't cause any problem as the CPU
handles all cache coherency.
However, on machines with virtually indexed caches, the same page can
have aliased cache lines, which can result in accessing stale data if
they are mismanaged. The DMA API takes good care of cache coherency
issues, but currently most block drivers don't properly manage cache
coherency for PIO.
The question is what kind of flushing is required where. AFAIK, the
DMA API does the following flushing operations.
* On READ (DMA_FROM_DEVICE)
  Invalidate all CPU caches of the target memory area before IO.
  There's no need for flushing after IO as DMA transfers don't
  affect CPU caches.
* On WRITE (DMA_TO_DEVICE)
  Writeback (but don't invalidate) all CPU caches of the target
  memory area before IO. There's no need for flushing after IO
  as DMA write transfers don't dirty CPU caches.
PIO READs differ from DMA READs in that the CPU itself stores the data
into the page, which creates (on write-allocate caches) or dirties
cache lines, and those lines need to be flushed before the page is
mapped to user space.
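To make the PIO READ case concrete, here is a toy userspace model (not
kernel code) of one page that is cached under two aliases, a kernel one
and a user one. All names in it (struct toy, line_flush,
pio_read_visible and so on) are made up for illustration, and a single
8-byte "line" stands in for the whole page.

```c
#include <assert.h>
#include <string.h>

enum { LINE = 8 };

/* One cache line: write-back, write-allocate. */
struct line { int valid, dirty; char data[LINE]; };

/* One "page" of physical memory plus its two aliased cache lines. */
struct toy {
    char mem[LINE];
    struct line kern, user;
};

/* Fill the line from memory on a miss. */
static void line_fill(struct toy *t, struct line *l)
{
    if (!l->valid) {
        memcpy(l->data, t->mem, LINE);
        l->valid = 1;
        l->dirty = 0;
    }
}

/* CPU store through one alias: the data stays in that line only. */
static void cpu_write(struct toy *t, struct line *l, const char *src)
{
    assert(t && l && src);
    line_fill(t, l);
    memcpy(l->data, src, LINE);
    l->dirty = 1;
}

/* CPU load through one alias. */
static void cpu_read(struct toy *t, struct line *l, char *dst)
{
    line_fill(t, l);
    memcpy(dst, l->data, LINE);
}

/* Write the line back if dirty, then invalidate it. */
static void line_flush(struct toy *t, struct line *l)
{
    if (l->valid && l->dirty)
        memcpy(t->mem, l->data, LINE);
    l->valid = l->dirty = 0;
}

/*
 * PIO READ: the CPU copies "NEWDATA" from the device into the kernel
 * alias, then user space reads the page through its own alias.
 * Returns 1 if user space sees the new data, 0 if it sees stale data.
 */
int pio_read_visible(int flush_after)
{
    struct toy t;
    char out[LINE];

    memset(&t, 0, sizeof(t));
    memcpy(t.mem, "OLDDATA", LINE);
    cpu_write(&t, &t.kern, "NEWDATA");
    if (flush_after)
        line_flush(&t, &t.kern);
    cpu_read(&t, &t.user, out);
    return memcmp(out, "NEWDATA", LINE) == 0;
}
```

In this model pio_read_visible(0) reports stale data and
pio_read_visible(1) reports the new data, which is the effect the
flush after a PIO READ is meant to have.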
As the ramdisk driver (drivers/block/rd.c) deals with a similar problem
and performs flush_dcache_page after a READ operation, there's no
question that flush_dcache_page is needed after READ, but we're not
sure...
* Is flush_dcache_page needed before PIO READ? IOW, is it guaranteed
  that there are no dirty user-mapped cache lines on entry to the
  block layer for READ?
* Is flush_dcache_page needed before PIO WRITE? IOW, is it guaranteed
  that there are no dirty user-mapped cache lines on entry to the
  block layer for WRITE?
* Is there any better (lighter) function to call to flush dirty
  kernel-mapped cachelines? flush_dcache_page seems quite heavy for
  that purpose.
Also, I think it would be a good idea to have kmap/unmap wrappers in
the block layer, say, kmap/unmap_for_pio(page, rw), which deal with
the above cache coherency issues. How does that sound?
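A rough sketch of what such wrappers might look like follows. It is a
userspace mock-up so that it compiles on its own: struct page, kmap(),
kunmap() and flush_dcache_page() here are stand-ins, not the kernel's
definitions, and the spelled-out names kmap_for_pio()/kunmap_for_pio(),
enum pio_dir and pio_flush_count() are assumptions for illustration.
It only encodes the one thing that seems settled (flush after a PIO
READ) and leaves the "before IO" questions open.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins so the sketch builds outside the kernel. */
struct page { char data[4096]; int dcache_flushes; };

static void *kmap(struct page *page) { return page->data; }
static void kunmap(struct page *page) { (void)page; }
static void flush_dcache_page(struct page *page) { page->dcache_flushes++; }

/* Hypothetical direction argument for the proposed wrappers. */
enum pio_dir { PIO_READ, PIO_WRITE };

static void *kmap_for_pio(struct page *page, enum pio_dir rw)
{
    assert(page != NULL);
    /* Whether anything must be flushed before IO is the open question. */
    (void)rw;
    return kmap(page);
}

static void kunmap_for_pio(struct page *page, enum pio_dir rw)
{
    /* A PIO READ stored data through the kernel mapping: flush it. */
    if (rw == PIO_READ)
        flush_dcache_page(page);
    kunmap(page);
}

/* Map, pretend to do PIO, unmap; return how many flushes were issued. */
int pio_flush_count(enum pio_dir rw)
{
    static struct page page;
    char *buf;

    page.dcache_flushes = 0;
    buf = kmap_for_pio(&page, rw);
    (void)buf;              /* the driver's PIO loop would go here */
    kunmap_for_pio(&page, rw);
    return page.dcache_flushes;
}
```

A driver would then bracket its PIO loop with the two calls instead of
calling kmap()/kunmap() directly, and the flushing policy would live
in one place.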
TIA.
--
tejun
On Thu, Dec 22, 2005 at 02:55:07PM +0900, Tejun Heo wrote:
> The question is what kind of flushing is required where. AFAIK, DMA
> API does the following flushing operations.
>
> * On READ (DMA_FROM_DEVICE)
>
> Invalidate all cpu caches of the target memory area before IO.
> There's no need for flushing after IO as DMA transfers don't
> affect cpu caches.
>
> * On WRITE (DMA_TO_DEVICE)
>
> Writeback (but don't invalidate) all cpu caches of the target
> memory area before IO. There's no need for flushing after IO
> as DMA write transfers don't dirty cpu caches.
>
> PIO READs are different from DMA READs in that read operation creates
> (on write-allocate caches) or dirties cache lines and they need to be
> flushed before the page is mapped to user space.
This is correct.
> As the ramdisk driver (drivers/block/rd.c) deals with similar problem
> and it performs flush_dcache_page after READ operation, there's no
> question that flush_dcache_page is needed after READ, but we're not
> sure...
>
> * Is flush_dcache_page needed before PIO READ? IOW, is it guaranteed
> that there's no dirty user-mapped cache lines on entry to block
> layer for READ?
That's a slightly different problem, one which is handled by the mm
layer. Basically, whenever a page is unmapped from userspace on
aliasing caches, user cache lines need to be invalidated. Consider
this case:
- A page is mapped into userspace, and is subsequently removed and
freed without invalidating the cache lines.
- The kernel allocates this page for its own purposes and writes data
to the kernel mapping of the page (eg, a slab page).
- At some time later, the cache lines corresponding with the old
userspace mapping are evicted from the cache and written back to
the page.
The result is that this re-used page is corrupted - and no device
driver has been involved. So this isn't a problem that device driver
authors need to care about.
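The sequence above can be replayed with a toy userspace model (not
kernel code). The names struct toy, writeback(), invalidate() and
reuse_page_intact() are made up for illustration, and one 8-byte
"line" stands in for the page.

```c
#include <assert.h>
#include <string.h>

enum { LINE = 8 };

/* One write-back cache line. */
struct line { int valid, dirty; char data[LINE]; };

/* Physical memory for one "page" and its two aliased cache lines. */
struct toy {
    char mem[LINE];
    struct line kern, user;
};

/* Full-line CPU store through one alias; the line becomes dirty. */
static void cpu_write(struct line *l, const char *src)
{
    assert(l && src);
    memcpy(l->data, src, LINE);
    l->valid = l->dirty = 1;
}

/* Eviction: a dirty line is written back to memory. */
static void writeback(struct toy *t, struct line *l)
{
    if (l->valid && l->dirty) {
        memcpy(t->mem, l->data, LINE);
        l->dirty = 0;
    }
}

/* Drop the line without writing it back. */
static void invalidate(struct line *l)
{
    l->valid = l->dirty = 0;
}

/* Returns 1 if the kernel's data is what ends up in memory. */
int reuse_page_intact(int invalidate_on_unmap)
{
    struct toy t;

    memset(&t, 0, sizeof(t));
    cpu_write(&t.user, "USRDATA");    /* user dirties its alias */
    if (invalidate_on_unmap)
        invalidate(&t.user);          /* what the mm layer must do */
    /* page is freed and re-allocated by the kernel here */
    cpu_write(&t.kern, "SLABOBJ");    /* kernel writes its mapping */
    writeback(&t, &t.kern);           /* kernel line reaches memory */
    writeback(&t, &t.user);           /* stale user line evicted later */
    return memcmp(t.mem, "SLABOBJ", LINE) == 0;
}
```

Without the invalidate on unmap the late eviction overwrites the
kernel's data; with it, the page stays intact.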
> * Is flush_dcache_page needed before PIO WRITE? IOW, is it guaranteed
> that there's no dirty user-mapped cache lines on entry to block
> layer for WRITE?
msync() uses flush_cache_range() prior to marking the pages dirty
(and therefore candidates for being written to devices).
In the case of swapping pages out, flush_cache_page() is used to
ensure that user data is written to the page.
The final case is write(), where the standard rule is followed -
if the kernel mapping of a page cache page is written by the CPU,
flush_dcache_page() is used after the page has been written.
There are probably other cases as well (I'm not an mm hacker, so it's
an area of the kernel I'm not familiar with).
> * Is there any better (lighter) function to call to flush dirty
> kernel-mapped cachelines? flush_dcache_page seems quite heavy for
> that purpose.
That depends. The common case for reads from device drivers is
that a page has just been allocated, but not mapped into user
space. It has been submitted to the block layer to have the
required data placed into the page.
Given that, and a read of Documentation/cachetlb.txt, if an
architecture implements the idea given there, flush_dcache_page()
should just set the PG_arch_1 bit for this case.
So it should be rather light.
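As a sketch of that deferred-flush idea, here is a toy userspace model
(not kernel code). struct page and its fields, real_flush(),
map_to_user() and demo_real_flushes() are made up for illustration;
arch_dirty plays the role of the PG_arch_1 bit, and map_to_user()
stands in for the point (update_mmu_cache() in the scheme
cachetlb.txt describes) where the deferred flush is finally done.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the kernel's struct page. */
struct page {
    int user_mappings;   /* number of user-space mappings */
    int arch_dirty;      /* plays the role of PG_arch_1 */
    int real_flushes;    /* count of expensive cache flushes */
};

/* The expensive operation we would like to avoid or defer. */
static void real_flush(struct page *page)
{
    page->real_flushes++;
}

/* Deferred variant: only flush now if user space can see the page. */
static void flush_dcache_page(struct page *page)
{
    assert(page != NULL);
    if (page->user_mappings == 0)
        page->arch_dirty = 1;    /* cheap: just remember */
    else
        real_flush(page);
}

/* First user mapping: perform the flush that was deferred, if any. */
static void map_to_user(struct page *page)
{
    if (page->arch_dirty) {
        real_flush(page);
        page->arch_dirty = 0;
    }
    page->user_mappings++;
}

/* PIO READ into a fresh, unmapped page, optionally mapped afterwards. */
int demo_real_flushes(int then_map)
{
    struct page page = { 0, 0, 0 };

    flush_dcache_page(&page);    /* driver call after the PIO READ */
    if (then_map)
        map_to_user(&page);
    return page.real_flushes;
}
```

In this model the driver's call costs nothing beyond setting the bit,
and the one real flush happens only if and when the page is mapped
into user space.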
> And, I think it would be a good idea to have kmap/unmap wrappers in
> block layer, say, kmap/unmap_for_pio(page, rw) which deal with above
> cache coherency issues. How does it sound?
That's something I can't comment on.
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of: 2.6 Serial core