Some stability changes to improve ION robustness, plus a perf-related
change that makes it easier for clients to avoid unnecessary cache
maintenance, such as when buffers are clean and haven't had any CPU
access.
Liam Mark (4):
  staging: android: ion: Support cpu access during dma_buf_detach
  staging: android: ion: Restrict cache maintenance to dma mapped memory
  dma-buf: add support for mapping with dma mapping attributes
  staging: android: ion: Support for mapping with dma mapping attributes

 drivers/staging/android/ion/ion.c | 33 +++++++++++++++++++++++++--------
 include/linux/dma-buf.h           |  3 +++
 2 files changed, 28 insertions(+), 8 deletions(-)
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
The ION begin_cpu_access and end_cpu_access functions use the
dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
maintenance.
Currently it is possible to apply cache maintenance, via the
begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
dma mapped.
The dma sync sg APIs should not be called on sg lists which have not been
dma mapped as this can result in cache maintenance being applied to the
wrong address. If an sg list has not been dma mapped then its dma_address
field has not been populated, and some dma ops, such as swiotlb_dma_ops,
use the dma_address field to calculate the address onto which to apply
cache maintenance.
Also I don't think we want CMOs to be applied to a buffer which is not
dma mapped as the memory should already be coherent for access from the
CPU. Any CMOs required for device access are taken care of in the
dma_buf_map_attachment and dma_buf_unmap_attachment calls.
So really it only makes sense for begin_cpu_access and end_cpu_access to
apply CMOs if the buffer is dma mapped.
Fix the ION begin_cpu_access and end_cpu_access functions to only apply
cache maintenance to buffers which are dma mapped.
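The gating described above can be sketched in userspace C. This is only a
model, not the kernel code: struct model_attachment, model_map, model_unmap
and model_begin_cpu_access are hypothetical stand-ins, the buffer lock is
left out for brevity, and a counter stands in for dma_sync_sg_for_cpu.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ion_dma_buf_attachment. */
struct model_attachment {
	bool dma_mapped;	/* set on map, cleared on unmap */
	int sync_calls;		/* counts simulated dma_sync_sg_for_cpu() calls */
};

/* Models a successful ion_map_dma_buf(): the sg list is now dma mapped. */
static void model_map(struct model_attachment *a)
{
	a->dma_mapped = true;
}

/* Models ion_unmap_dma_buf(): unmapping something never mapped is a bug. */
static void model_unmap(struct model_attachment *a)
{
	assert(a->dma_mapped);
	a->dma_mapped = false;
}

/*
 * Models the begin_cpu_access loop: attachments that are not dma mapped
 * are skipped, so no cache maintenance is issued against an sg list whose
 * dma_address was never populated. Returns how many attachments were synced.
 */
static int model_begin_cpu_access(struct model_attachment *atts, size_t n)
{
	int synced = 0;

	for (size_t i = 0; i < n; i++) {
		if (!atts[i].dma_mapped)
			continue;
		atts[i].sync_calls++;
		synced++;
	}
	return synced;
}
```

With two attachments of which only one is mapped, only that one is synced;
after it is unmapped again, nothing is.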
Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
Signed-off-by: Liam Mark <[email protected]>
---
drivers/staging/android/ion/ion.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 6f5afab7c1a1..1fe633a7fdba 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -210,6 +210,7 @@ struct ion_dma_buf_attachment {
struct device *dev;
struct sg_table *table;
struct list_head list;
+ bool dma_mapped;
};
static int ion_dma_buf_attach(struct dma_buf *dmabuf,
@@ -231,6 +232,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
a->table = table;
a->dev = attachment->dev;
+ a->dma_mapped = false;
INIT_LIST_HEAD(&a->list);
attachment->priv = a;
@@ -261,12 +263,18 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
{
struct ion_dma_buf_attachment *a = attachment->priv;
struct sg_table *table;
+ struct ion_buffer *buffer = attachment->dmabuf->priv;
table = a->table;
+ mutex_lock(&buffer->lock);
if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
- direction))
+ direction)) {
+ mutex_unlock(&buffer->lock);
return ERR_PTR(-ENOMEM);
+ }
+ a->dma_mapped = true;
+ mutex_unlock(&buffer->lock);
return table;
}
@@ -275,7 +283,13 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
struct sg_table *table,
enum dma_data_direction direction)
{
+ struct ion_dma_buf_attachment *a = attachment->priv;
+ struct ion_buffer *buffer = attachment->dmabuf->priv;
+
+ mutex_lock(&buffer->lock);
dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+ a->dma_mapped = false;
+ mutex_unlock(&buffer->lock);
}
static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
@@ -346,8 +360,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
- dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
- direction);
+ if (a->dma_mapped)
+ dma_sync_sg_for_cpu(a->dev, a->table->sgl,
+ a->table->nents, direction);
}
unlock:
@@ -369,8 +384,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
mutex_lock(&buffer->lock);
list_for_each_entry(a, &buffer->attachments, list) {
- dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
- direction);
+ if (a->dma_mapped)
+ dma_sync_sg_for_device(a->dev, a->table->sgl,
+ a->table->nents, direction);
}
mutex_unlock(&buffer->lock);
Add support for configuring dma mapping attributes when mapping
and unmapping memory through dma_buf_map_attachment and
dma_buf_unmap_attachment.
For example this will allow ION clients to skip cache maintenance, by
using DMA_ATTR_SKIP_CPU_SYNC, for buffers which are clean and haven't been
accessed by the CPU.
Signed-off-by: Liam Mark <[email protected]>
---
drivers/staging/android/ion/ion.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index 1fe633a7fdba..0aae845b20ba 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -268,8 +268,8 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
table = a->table;
mutex_lock(&buffer->lock);
- if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
- direction)) {
+ if (!dma_map_sg_attrs(attachment->dev, table->sgl, table->nents,
+ direction, attachment->dma_map_attrs)) {
mutex_unlock(&buffer->lock);
return ERR_PTR(-ENOMEM);
}
@@ -287,7 +287,8 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
struct ion_buffer *buffer = attachment->dmabuf->priv;
mutex_lock(&buffer->lock);
- dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
+ dma_unmap_sg_attrs(attachment->dev, table->sgl, table->nents, direction,
+ attachment->dma_map_attrs);
a->dma_mapped = false;
mutex_unlock(&buffer->lock);
}
Often userspace doesn't know when the kernel will be calling dma_buf_detach
on the buffer.
If userspace starts its CPU access at the same time as the sg list is being
freed, it could end up accessing the sg list after it has been freed.
Thread A                             Thread B
- DMA_BUF_IOCTL_SYNC IOCTL
 - ion_dma_buf_begin_cpu_access
  - list_for_each_entry
                                     - ion_dma_buf_detatch
                                      - free_duped_table
   - dma_sync_sg_for_cpu
Fix this by getting the ion_buffer lock before freeing the sg table memory.
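As a rough userspace illustration of the ordering (a sketch only, not the
ION code): struct model_buffer, model_attach, model_detach and
model_sync_all are hypothetical names, a pthread mutex stands in for the
ion_buffer lock, and a small int array stands in for the duplicated sg
table. The point is that detach unlinks the attachment under the lock
first, and frees the table only once no list walker can reach it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct model_attachment {
	int *table;			/* stand-in for the duped sg_table */
	struct model_attachment *next;
};

struct model_buffer {
	pthread_mutex_t lock;		/* stand-in for ion_buffer->lock */
	struct model_attachment *attachments;
};

/* Allocates an attachment and links it under the lock; NULL on failure. */
static struct model_attachment *model_attach(struct model_buffer *b)
{
	struct model_attachment *a = calloc(1, sizeof(*a));

	if (!a)
		return NULL;
	a->table = calloc(4, sizeof(*a->table));
	if (!a->table) {
		free(a);
		return NULL;
	}
	pthread_mutex_lock(&b->lock);
	a->next = b->attachments;
	b->attachments = a;
	pthread_mutex_unlock(&b->lock);
	return a;
}

/* Unlink under the lock first; free the table only afterwards. */
static void model_detach(struct model_buffer *b, struct model_attachment *a)
{
	struct model_attachment **pp;

	assert(a != NULL);
	pthread_mutex_lock(&b->lock);
	for (pp = &b->attachments; *pp; pp = &(*pp)->next) {
		if (*pp == a) {
			*pp = a->next;
			break;
		}
	}
	pthread_mutex_unlock(&b->lock);

	/* 'a' is no longer reachable from the list, so this cannot race. */
	free(a->table);
	free(a);
}

/* Walks the list under the lock, as begin_cpu_access does. */
static int model_sync_all(struct model_buffer *b)
{
	int visited = 0;

	pthread_mutex_lock(&b->lock);
	for (struct model_attachment *a = b->attachments; a; a = a->next) {
		a->table[0]++;		/* simulated access to the sg table */
		visited++;
	}
	pthread_mutex_unlock(&b->lock);
	return visited;
}
```

Any walker that holds the lock either sees the attachment with its table
intact, or does not see it at all.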
Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
Signed-off-by: Liam Mark <[email protected]>
---
drivers/staging/android/ion/ion.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
index a0802de8c3a1..6f5afab7c1a1 100644
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -248,10 +248,10 @@ static void ion_dma_buf_detatch(struct dma_buf *dmabuf,
struct ion_dma_buf_attachment *a = attachment->priv;
struct ion_buffer *buffer = dmabuf->priv;
- free_duped_table(a->table);
mutex_lock(&buffer->lock);
list_del(&a->list);
mutex_unlock(&buffer->lock);
+ free_duped_table(a->table);
kfree(a);
}
Add support for configuring dma mapping attributes when mapping
and unmapping memory through dma_buf_map_attachment and
dma_buf_unmap_attachment.
Signed-off-by: Liam Mark <[email protected]>
---
include/linux/dma-buf.h | 3 +++
1 file changed, 3 insertions(+)
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 58725f890b5b..59bf33e09e2d 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -308,6 +308,8 @@ struct dma_buf {
* @dev: device attached to the buffer.
* @node: list of dma_buf_attachment.
* @priv: exporter specific attachment data.
+ * @dma_map_attrs: DMA mapping attributes to be used in
+ * dma_buf_map_attachment() and dma_buf_unmap_attachment().
*
* This structure holds the attachment information between the dma_buf buffer
* and its user device(s). The list contains one attachment struct per device
@@ -323,6 +325,7 @@ struct dma_buf_attachment {
struct device *dev;
struct list_head node;
void *priv;
+ unsigned long dma_map_attrs;
};
/**
On 1/18/19 12:37 PM, Liam Mark wrote:
> Often userspace doesn't know when the kernel will be calling dma_buf_detach
> on the buffer.
> If userpace starts its CPU access at the same time as the sg list is being
> freed it could end up accessing the sg list after it has been freed.
>
> Thread A Thread B
> - DMA_BUF_IOCTL_SYNC IOCT
> - ion_dma_buf_begin_cpu_access
> - list_for_each_entry
> - ion_dma_buf_detatch
> - free_duped_table
> - dma_sync_sg_for_cpu
>
The window for this seems really small, but it does seem technically
possible. Good find. For what it's worth:
Acked-by: Andrew F. Davis <[email protected]>
> Fix this by getting the ion_buffer lock before freeing the sg table memory.
>
> Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> Signed-off-by: Liam Mark <[email protected]>
> ---
> drivers/staging/android/ion/ion.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index a0802de8c3a1..6f5afab7c1a1 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -248,10 +248,10 @@ static void ion_dma_buf_detatch(struct dma_buf *dmabuf,
> struct ion_dma_buf_attachment *a = attachment->priv;
> struct ion_buffer *buffer = dmabuf->priv;
>
> - free_duped_table(a->table);
> mutex_lock(&buffer->lock);
> list_del(&a->list);
> mutex_unlock(&buffer->lock);
> + free_duped_table(a->table);
>
> kfree(a);
> }
>
On 1/18/19 12:37 PM, Liam Mark wrote:
> The ION begin_cpu_access and end_cpu_access functions use the
> dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
> maintenance.
>
> Currently it is possible to apply cache maintenance, via the
> begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
> dma mapped.
>
> The dma sync sg APIs should not be called on sg lists which have not been
> dma mapped as this can result in cache maintenance being applied to the
> wrong address. If an sg list has not been dma mapped then its dma_address
> field has not been populated, some dma ops such as the swiotlb_dma_ops ops
> use the dma_address field to calculate the address onto which to apply
> cache maintenance.
>
> Also I don’t think we want CMOs to be applied to a buffer which is not
> dma mapped as the memory should already be coherent for access from the
> CPU. Any CMOs required for device access taken care of in the
> dma_buf_map_attachment and dma_buf_unmap_attachment calls.
> So really it only makes sense for begin_cpu_access and end_cpu_access to
> apply CMOs if the buffer is dma mapped.
>
> Fix the ION begin_cpu_access and end_cpu_access functions to only apply
> cache maintenance to buffers which are dma mapped.
>
> Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> Signed-off-by: Liam Mark <[email protected]>
> ---
> drivers/staging/android/ion/ion.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index 6f5afab7c1a1..1fe633a7fdba 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -210,6 +210,7 @@ struct ion_dma_buf_attachment {
> struct device *dev;
> struct sg_table *table;
> struct list_head list;
> + bool dma_mapped;
> };
>
> static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> @@ -231,6 +232,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
>
> a->table = table;
> a->dev = attachment->dev;
> + a->dma_mapped = false;
> INIT_LIST_HEAD(&a->list);
>
> attachment->priv = a;
> @@ -261,12 +263,18 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> {
> struct ion_dma_buf_attachment *a = attachment->priv;
> struct sg_table *table;
> + struct ion_buffer *buffer = attachment->dmabuf->priv;
>
> table = a->table;
>
> + mutex_lock(&buffer->lock);
> if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> - direction))
> + direction)) {
> + mutex_unlock(&buffer->lock);
> return ERR_PTR(-ENOMEM);
> + }
> + a->dma_mapped = true;
> + mutex_unlock(&buffer->lock);
>
> return table;
> }
> @@ -275,7 +283,13 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> struct sg_table *table,
> enum dma_data_direction direction)
> {
> + struct ion_dma_buf_attachment *a = attachment->priv;
> + struct ion_buffer *buffer = attachment->dmabuf->priv;
> +
> + mutex_lock(&buffer->lock);
> dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> + a->dma_mapped = false;
> + mutex_unlock(&buffer->lock);
> }
>
> static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> @@ -346,8 +360,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
>
> mutex_lock(&buffer->lock);
> list_for_each_entry(a, &buffer->attachments, list) {
When no devices are attached then buffer->attachments is empty and the
below does not run, so if I understand this patch correctly then what
you are protecting against is CPU access in the window after
dma_buf_attach but before dma_buf_map.
This is the kind of thing that again makes me think a couple more
ordering requirements on DMA-BUF ops are needed. DMA-BUFs do not require
the backing memory to be allocated until map time; this is why the
dma_address field would still be null, as you note in the commit message.
So why should the CPU be performing accesses on a buffer that is not
actually backed yet?
I can think of two solutions:
1) Only allow CPU access (mmap, kmap, {begin,end}_cpu_access) while at
least one device is mapped.
2) Treat the CPU access request like a device map request and
trigger the allocation of backing memory, just as if a device map had
come in.
I know the current Ion heaps (and most other DMA-BUF exporters) all do
the allocation up front so the memory is already there, but DMA-BUF was
designed with late allocation in mind. I have a use-case I'm working on
that finally exercises this DMA-BUF functionality and I would like to
have it export through ION. This patch doesn't prevent that, but it seems
to endorse the idea that buffers always need to be backed, even before
device attach/map has occurred.
Either of the above two solutions would need to target the DMA-BUF
framework.
Sumit,
Any comment?
Thanks,
Andrew
> - dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
> - direction);
> + if (a->dma_mapped)
> + dma_sync_sg_for_cpu(a->dev, a->table->sgl,
> + a->table->nents, direction);
> }
>
> unlock:
> @@ -369,8 +384,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
>
> mutex_lock(&buffer->lock);
> list_for_each_entry(a, &buffer->attachments, list) {
> - dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
> - direction);
> + if (a->dma_mapped)
> + dma_sync_sg_for_device(a->dev, a->table->sgl,
> + a->table->nents, direction);
> }
> mutex_unlock(&buffer->lock);
>
>
On 1/18/19 10:37 AM, Liam Mark wrote:
> Often userspace doesn't know when the kernel will be calling dma_buf_detach
> on the buffer.
> If userpace starts its CPU access at the same time as the sg list is being
> freed it could end up accessing the sg list after it has been freed.
>
> Thread A Thread B
> - DMA_BUF_IOCTL_SYNC IOCT
> - ion_dma_buf_begin_cpu_access
> - list_for_each_entry
> - ion_dma_buf_detatch
> - free_duped_table
> - dma_sync_sg_for_cpu
>
> Fix this by getting the ion_buffer lock before freeing the sg table memory.
>
> Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> Signed-off-by: Liam Mark <[email protected]>
> ---
> drivers/staging/android/ion/ion.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index a0802de8c3a1..6f5afab7c1a1 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -248,10 +248,10 @@ static void ion_dma_buf_detatch(struct dma_buf *dmabuf,
> struct ion_dma_buf_attachment *a = attachment->priv;
> struct ion_buffer *buffer = dmabuf->priv;
>
> - free_duped_table(a->table);
> mutex_lock(&buffer->lock);
> list_del(&a->list);
> mutex_unlock(&buffer->lock);
> + free_duped_table(a->table);
>
> kfree(a);
> }
>
Acked-by: Laura Abbott <[email protected]>
On 1/18/19 10:37 AM, Liam Mark wrote:
> Add support for configuring dma mapping attributes when mapping
> and unmapping memory through dma_buf_map_attachment and
> dma_buf_unmap_attachment.
>
> Signed-off-by: Liam Mark <[email protected]>
> ---
> include/linux/dma-buf.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> index 58725f890b5b..59bf33e09e2d 100644
> --- a/include/linux/dma-buf.h
> +++ b/include/linux/dma-buf.h
> @@ -308,6 +308,8 @@ struct dma_buf {
> * @dev: device attached to the buffer.
> * @node: list of dma_buf_attachment.
> * @priv: exporter specific attachment data.
> + * @dma_map_attrs: DMA mapping attributes to be used in
> + * dma_buf_map_attachment() and dma_buf_unmap_attachment().
> *
> * This structure holds the attachment information between the dma_buf buffer
> * and its user device(s). The list contains one attachment struct per device
> @@ -323,6 +325,7 @@ struct dma_buf_attachment {
> struct device *dev;
> struct list_head node;
> void *priv;
> + unsigned long dma_map_attrs;
> };
>
> /**
>
Did you miss part of this patch? This only adds it to the structure but doesn't
add it to any API. The same comment applies to the follow-up patch;
I don't quite see how it's being used.
Thanks,
Laura
On Fri, 18 Jan 2019, Andrew F. Davis wrote:
> On 1/18/19 12:37 PM, Liam Mark wrote:
> > The ION begin_cpu_access and end_cpu_access functions use the
> > dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
> > maintenance.
> >
> > Currently it is possible to apply cache maintenance, via the
> > begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
> > dma mapped.
> >
> > The dma sync sg APIs should not be called on sg lists which have not been
> > dma mapped as this can result in cache maintenance being applied to the
> > wrong address. If an sg list has not been dma mapped then its dma_address
> > field has not been populated, some dma ops such as the swiotlb_dma_ops ops
> > use the dma_address field to calculate the address onto which to apply
> > cache maintenance.
> >
> > Also I don’t think we want CMOs to be applied to a buffer which is not
> > dma mapped as the memory should already be coherent for access from the
> > CPU. Any CMOs required for device access taken care of in the
> > dma_buf_map_attachment and dma_buf_unmap_attachment calls.
> > So really it only makes sense for begin_cpu_access and end_cpu_access to
> > apply CMOs if the buffer is dma mapped.
> >
> > Fix the ION begin_cpu_access and end_cpu_access functions to only apply
> > cache maintenance to buffers which are dma mapped.
> >
> > Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> > Signed-off-by: Liam Mark <[email protected]>
> > ---
> > drivers/staging/android/ion/ion.c | 26 +++++++++++++++++++++-----
> > 1 file changed, 21 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> > index 6f5afab7c1a1..1fe633a7fdba 100644
> > --- a/drivers/staging/android/ion/ion.c
> > +++ b/drivers/staging/android/ion/ion.c
> > @@ -210,6 +210,7 @@ struct ion_dma_buf_attachment {
> > struct device *dev;
> > struct sg_table *table;
> > struct list_head list;
> > + bool dma_mapped;
> > };
> >
> > static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> > @@ -231,6 +232,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> >
> > a->table = table;
> > a->dev = attachment->dev;
> > + a->dma_mapped = false;
> > INIT_LIST_HEAD(&a->list);
> >
> > attachment->priv = a;
> > @@ -261,12 +263,18 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> > {
> > struct ion_dma_buf_attachment *a = attachment->priv;
> > struct sg_table *table;
> > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> >
> > table = a->table;
> >
> > + mutex_lock(&buffer->lock);
> > if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> > - direction))
> > + direction)) {
> > + mutex_unlock(&buffer->lock);
> > return ERR_PTR(-ENOMEM);
> > + }
> > + a->dma_mapped = true;
> > + mutex_unlock(&buffer->lock);
> >
> > return table;
> > }
> > @@ -275,7 +283,13 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> > struct sg_table *table,
> > enum dma_data_direction direction)
> > {
> > + struct ion_dma_buf_attachment *a = attachment->priv;
> > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> > +
> > + mutex_lock(&buffer->lock);
> > dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> > + a->dma_mapped = false;
> > + mutex_unlock(&buffer->lock);
> > }
> >
> > static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > @@ -346,8 +360,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> >
> > mutex_lock(&buffer->lock);
> > list_for_each_entry(a, &buffer->attachments, list) {
>
> When no devices are attached then buffer->attachments is empty and the
> below does not run, so if I understand this patch correctly then what
> you are protecting against is CPU access in the window after
> dma_buf_attach but before dma_buf_map.
>
Yes
> This is the kind of thing that again makes me think a couple more
> ordering requirements on DMA-BUF ops are needed. DMA-BUFs do not require
> the backing memory to be allocated until map time, this is why the
> dma_address field would still be null as you note in the commit message.
> So why should the CPU be performing accesses on a buffer that is not
> actually backed yet?
>
> I can think of two solutions:
>
> 1) Only allow CPU access (mmap, kmap, {begin,end}_cpu_access) while at
> least one device is mapped.
>
Would be quite limiting to clients.
> 2) Treat the CPU access request like the a device map request and
> trigger the allocation of backing memory just like if a device map had
> come in.
>
Which is, as you mention, pretty much what we have now (though the buffer
is allocated even earlier).
> I know the current Ion heaps (and most other DMA-BUF exporters) all do
> the allocation up front so the memory is already there, but DMA-BUF was
> designed with late allocation in mind. I have a use-case I'm working on
> that finally exercises this DMA-BUF functionality and I would like to
> have it export through ION. This patch doesn't prevent that, but seems
> like it is endorsing the the idea that buffers always need to be backed,
> even before device attach/map is has occurred.
>
I didn't interpret the DMA-buf contract as requiring the dma-map to be
called in order for a backing store to be provided, I interpreted it as
meaning there could be a backing store before the dma-map but at the
dma-map call the final backing store configuration would be decided
(perhaps involving migrating the memory to the final backing store).
I will let the dma-buf experts correct me on that.
Limiting userspace clients so that they cannot access buffers until after
they are dma-mapped seems unfortunate to me. dma-mapping usually means a
change of ownership of the memory from the CPU to the device. So generally
while a buffer is dma-mapped you have the device access it (though of
course CPU access to the buffer while it is dma-mapped is supported),
and then once the buffer is dma-unmapped the CPU can access it. This is
how the DMA APIs are frequently used, and the changes above make ION align
more with the way the DMA APIs are used. Basically, when the buffer is not
dma-mapped the CPU doesn't need to do any CMOs to access the buffer (and
ION ensures no CMOs are applied), but if the CPU does want to access the
buffer while it is dma-mapped then ION ensures that the appropriate CMOs
are applied.
It seems like a legitimate use case to me to allow clients to access the
buffer before (and after) dma-mapping, for example for post-processing of
buffers.
> Either of the above two solutions would need to target the DMA-BUF
> framework,
>
> Sumit,
>
> Any comment?
>
> Thanks,
> Andrew
>
> > - dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
> > - direction);
> > + if (a->dma_mapped)
> > + dma_sync_sg_for_cpu(a->dev, a->table->sgl,
> > + a->table->nents, direction);
> > }
> >
> > unlock:
> > @@ -369,8 +384,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
> >
> > mutex_lock(&buffer->lock);
> > list_for_each_entry(a, &buffer->attachments, list) {
> > - dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
> > - direction);
> > + if (a->dma_mapped)
> > + dma_sync_sg_for_device(a->dev, a->table->sgl,
> > + a->table->nents, direction);
> > }
> > mutex_unlock(&buffer->lock);
> >
> >
>
On Fri, 18 Jan 2019, Laura Abbott wrote:
> On 1/18/19 10:37 AM, Liam Mark wrote:
> > Add support for configuring dma mapping attributes when mapping
> > and unmapping memory through dma_buf_map_attachment and
> > dma_buf_unmap_attachment.
> >
> > Signed-off-by: Liam Mark <[email protected]>
> > ---
> > include/linux/dma-buf.h | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
> > index 58725f890b5b..59bf33e09e2d 100644
> > --- a/include/linux/dma-buf.h
> > +++ b/include/linux/dma-buf.h
> > @@ -308,6 +308,8 @@ struct dma_buf {
> > * @dev: device attached to the buffer.
> > * @node: list of dma_buf_attachment.
> > * @priv: exporter specific attachment data.
> > + * @dma_map_attrs: DMA mapping attributes to be used in
> > + * dma_buf_map_attachment() and dma_buf_unmap_attachment().
> > *
> > * This structure holds the attachment information between the dma_buf
> > buffer
> > * and its user device(s). The list contains one attachment struct per
> > device
> > @@ -323,6 +325,7 @@ struct dma_buf_attachment {
> > struct device *dev;
> > struct list_head node;
> > void *priv;
> > + unsigned long dma_map_attrs;
> > };
> > /**
> >
>
> Did you miss part of this patch? This only adds it to the structure but
> doesn't
> add it to any API. The same commment applies to the follow up patch,
> I don't quite see how it's being used.
>
Were you asking for a cleaner DMA-buf API to set this field or were you
asking for a change to an upstream client to make use of this field?
I have clients set the dma_map_attrs field directly on their
dma_buf_attachment struct before calling dma_buf_map_attachment (if they
need this functionality).
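To make the intended usage concrete, here is a minimal userspace model (a
sketch, not kernel code). struct model_attachment and the model_* functions
are hypothetical stand-ins for dma_buf_attachment and the exporter's
map/unmap callbacks, MODEL_DMA_ATTR_SKIP_CPU_SYNC mirrors
DMA_ATTR_SKIP_CPU_SYNC with an assumed value, and a counter stands in for
the cache maintenance done at map and unmap time.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors DMA_ATTR_SKIP_CPU_SYNC; the value here is an assumption. */
#define MODEL_DMA_ATTR_SKIP_CPU_SYNC (1UL << 5)

/* Hypothetical stand-in for struct dma_buf_attachment. */
struct model_attachment {
	unsigned long dma_map_attrs;	/* field the client sets before mapping */
	int cache_ops;			/* counts simulated cache maintenance */
	bool mapped;
};

/* Models the exporter's map callback honouring the client's attributes. */
static void model_map_attachment(struct model_attachment *att)
{
	if (!(att->dma_map_attrs & MODEL_DMA_ATTR_SKIP_CPU_SYNC))
		att->cache_ops++;
	att->mapped = true;
}

/* Models the exporter's unmap callback; same attributes apply. */
static void model_unmap_attachment(struct model_attachment *att)
{
	assert(att->mapped);
	if (!(att->dma_map_attrs & MODEL_DMA_ATTR_SKIP_CPU_SYNC))
		att->cache_ops++;
	att->mapped = false;
}
```

A client that knows the buffer is clean would set
att.dma_map_attrs = MODEL_DMA_ATTR_SKIP_CPU_SYNC before mapping; a client
that leaves the field at zero gets the existing behaviour.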
Of course this is all being used in Android for out of tree drivers, but
I assume it is just as useful to everyone else who has cached ION buffers
which aren't always accessed by the CPU.
My understanding is that AOSP Android on Hikey 960 is also currently
suffering from too many CMOs due to dma_buf_map_attachment always applying
CMOs, so this support should help them avoid it.
> Thanks,
> Laura
>
On 1/18/19 1:32 PM, Liam Mark wrote:
> On Fri, 18 Jan 2019, Laura Abbott wrote:
>
>> On 1/18/19 10:37 AM, Liam Mark wrote:
>>> Add support for configuring dma mapping attributes when mapping
>>> and unmapping memory through dma_buf_map_attachment and
>>> dma_buf_unmap_attachment.
>>>
>>> Signed-off-by: Liam Mark <[email protected]>
>>> ---
>>> include/linux/dma-buf.h | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
>>> index 58725f890b5b..59bf33e09e2d 100644
>>> --- a/include/linux/dma-buf.h
>>> +++ b/include/linux/dma-buf.h
>>> @@ -308,6 +308,8 @@ struct dma_buf {
>>> * @dev: device attached to the buffer.
>>> * @node: list of dma_buf_attachment.
>>> * @priv: exporter specific attachment data.
>>> + * @dma_map_attrs: DMA mapping attributes to be used in
>>> + * dma_buf_map_attachment() and dma_buf_unmap_attachment().
>>> *
>>> * This structure holds the attachment information between the dma_buf
>>> buffer
>>> * and its user device(s). The list contains one attachment struct per
>>> device
>>> @@ -323,6 +325,7 @@ struct dma_buf_attachment {
>>> struct device *dev;
>>> struct list_head node;
>>> void *priv;
>>> + unsigned long dma_map_attrs;
>>> };
>>> /**
>>>
>>
>> Did you miss part of this patch? This only adds it to the structure but
>> doesn't
>> add it to any API. The same commment applies to the follow up patch,
>> I don't quite see how it's being used.
>>
>
> Were you asking for a cleaner DMA-buf API to set this field or were you
> asking for a change to an upstream client to make use of this field?
>
> I have clients set the dma_map_attrs field directly on their
> dma_buf_attachment struct before calling dma_buf_map_attachment (if they
> need this functionality).
> Of course this is all being used in Android for out of tree drivers, but
> I assume it is just as useful to everyone else who has cached ION buffers
> which aren't always accessed by the CPU.
>
> My understanding is that AOSP Android on Hikey 960 also is currently
> suffering from too many CMOs due to dma_map_attachemnt always applying
> CMOs, so this support should help them avoid it.
>
Ahhhh I see how you intend this to be used now! I was missing
that clients would do attachment->dma_map_attrs = blah
and that was how it would be stored as opposed to passing
it in at the top level for dma_buf_map. I'll give this some
more thought but I think it could work if Sumit is okay
with the approach.
Thanks,
Laura
On Fri, Jan 18, 2019 at 10:37:46AM -0800, Liam Mark wrote:
> Add support for configuring dma mapping attributes when mapping
> and unmapping memory through dma_buf_map_attachment and
> dma_buf_unmap_attachment.
>
> Signed-off-by: Liam Mark <[email protected]>
And who is going to decide which ones to pass? And who documents
which ones are safe?
I'd much rather have explicit, well documented dma-buf flags that
might get translated to the DMA API flags, which are not error checked,
not very well documented and way too easy to get wrong.
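One possible shape for such a translation layer, sketched in userspace C.
Everything here is hypothetical: no MODEL_DMABUF_MAP_* flag exists in
dma-buf, model_dmabuf_flags_to_attrs is an invented helper, and
MODEL_DMA_ATTR_SKIP_CPU_SYNC only mirrors DMA_ATTR_SKIP_CPU_SYNC with an
assumed value. The idea is just that unknown flags are rejected instead of
being passed through unchecked.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical dma-buf level flag and the set of flags considered valid. */
#define MODEL_DMABUF_MAP_SKIP_CPU_SYNC	(1UL << 0)
#define MODEL_DMABUF_MAP_VALID_FLAGS	(MODEL_DMABUF_MAP_SKIP_CPU_SYNC)

/* Mirrors DMA_ATTR_SKIP_CPU_SYNC; the value here is an assumption. */
#define MODEL_DMA_ATTR_SKIP_CPU_SYNC	(1UL << 5)

/*
 * Translates documented dma-buf flags into DMA API attributes.
 * Returns 0 and stores the result in *attrs on success, or -EINVAL
 * (leaving *attrs untouched) if any unknown flag bit is set.
 */
static int model_dmabuf_flags_to_attrs(unsigned long flags,
				       unsigned long *attrs)
{
	assert(attrs != NULL);

	if (flags & ~MODEL_DMABUF_MAP_VALID_FLAGS)
		return -EINVAL;

	*attrs = 0;
	if (flags & MODEL_DMABUF_MAP_SKIP_CPU_SYNC)
		*attrs |= MODEL_DMA_ATTR_SKIP_CPU_SYNC;
	return 0;
}
```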
On 1/19/19 2:25 AM, Christoph Hellwig wrote:
> On Fri, Jan 18, 2019 at 10:37:46AM -0800, Liam Mark wrote:
>> Add support for configuring dma mapping attributes when mapping
>> and unmapping memory through dma_buf_map_attachment and
>> dma_buf_unmap_attachment.
>>
>> Signed-off-by: Liam Mark <[email protected]>
>
> And who is going to decide which ones to pass? And who documents
> which ones are safe?
>
> I'd much rather have explicit, well documented dma-buf flags that
> might get translated to the DMA API flags, which are not error checked,
> not very well documented and way to easy to get wrong.
>
I'm not sure having flags in dma-buf really solves anything
given drivers can use the attributes directly with dma_map
anyway, which is what we're looking to do. The intention
is for the driver creating the dma_buf attachment to have
the knowledge of which flags to use.
Thanks,
Laura
On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
> > And who is going to decide which ones to pass? And who documents
> > which ones are safe?
> >
> > I'd much rather have explicit, well documented dma-buf flags that
> > might get translated to the DMA API flags, which are not error checked,
> > not very well documented and way to easy to get wrong.
> >
>
> I'm not sure having flags in dma-buf really solves anything
> given drivers can use the attributes directly with dma_map
> anyway, which is what we're looking to do. The intention
> is for the driver creating the dma_buf attachment to have
> the knowledge of which flags to use.
Well, there are very few flags that you can simply use for all calls of
dma_map*. And given how badly these flags are defined I just don't want
people to add more places where they indirectly use these flags, as
it will be more than enough work to clean up the current mess.
What flag(s) do you want to pass this way, btw? Maybe that is where
the problem is.
Hi Liam,
On Fri, Jan 18, 2019 at 10:37:47AM -0800, Liam Mark wrote:
> Add support for configuring dma mapping attributes when mapping
> and unmapping memory through dma_buf_map_attachment and
> dma_buf_unmap_attachment.
>
> For example this will allow ION clients to skip cache maintenance, by
> using DMA_ATTR_SKIP_CPU_SYNC, for buffers which are clean and haven't been
> accessed by the CPU.
How can a client know that the buffer won't be accessed by the CPU in
the future though?
I don't think we can push this decision to clients, because they are
lacking information about what else is going on with the buffer. It
needs to be done by the exporter, IMO.
Thanks,
-Brian
>
> Signed-off-by: Liam Mark <[email protected]>
> ---
> drivers/staging/android/ion/ion.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> index 1fe633a7fdba..0aae845b20ba 100644
> --- a/drivers/staging/android/ion/ion.c
> +++ b/drivers/staging/android/ion/ion.c
> @@ -268,8 +268,8 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> table = a->table;
>
> mutex_lock(&buffer->lock);
> - if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> - direction)) {
> + if (!dma_map_sg_attrs(attachment->dev, table->sgl, table->nents,
> + direction, attachment->dma_map_attrs)) {
> mutex_unlock(&buffer->lock);
> return ERR_PTR(-ENOMEM);
> }
> @@ -287,7 +287,8 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> struct ion_buffer *buffer = attachment->dmabuf->priv;
>
> mutex_lock(&buffer->lock);
> - dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> + dma_unmap_sg_attrs(attachment->dev, table->sgl, table->nents, direction,
> + attachment->dma_map_attrs);
> a->dma_mapped = false;
> mutex_unlock(&buffer->lock);
> }
On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
> > > And who is going to decide which ones to pass? And who documents
> > > which ones are safe?
> > >
> > > I'd much rather have explicit, well documented dma-buf flags that
> > > might get translated to the DMA API flags, which are not error checked,
> > > not very well documented and way to easy to get wrong.
> > >
> >
> > I'm not sure having flags in dma-buf really solves anything
> > given drivers can use the attributes directly with dma_map
> > anyway, which is what we're looking to do. The intention
> > is for the driver creating the dma_buf attachment to have
> > the knowledge of which flags to use.
>
> Well, there are very few flags that you can simply use for all calls of
> dma_map*. And given how badly these flags are defined I just don't want
> people to add more places where they indirectly use these flags, as
> it will be more than enough work to clean up the current mess.
>
> What flag(s) do you want to pass this way, btw? Maybe that is where
> the problem is.
>
The main use case is for allowing clients to pass in
DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
ION the buffers aren't usually accessed from the CPU so this allows
clients to often avoid doing unnecessary cache maintenance.
On 1/21/19 1:44 PM, Liam Mark wrote:
> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
>
>> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
>>>> And who is going to decide which ones to pass? And who documents
>>>> which ones are safe?
>>>>
>>>> I'd much rather have explicit, well documented dma-buf flags that
>>>> might get translated to the DMA API flags, which are not error checked,
>>>> not very well documented and way to easy to get wrong.
>>>>
>>>
>>> I'm not sure having flags in dma-buf really solves anything
>>> given drivers can use the attributes directly with dma_map
>>> anyway, which is what we're looking to do. The intention
>>> is for the driver creating the dma_buf attachment to have
>>> the knowledge of which flags to use.
>>
>> Well, there are very few flags that you can simply use for all calls of
>> dma_map*. And given how badly these flags are defined I just don't want
>> people to add more places where they indirectly use these flags, as
>> it will be more than enough work to clean up the current mess.
>>
>> What flag(s) do you want to pass this way, btw? Maybe that is where
>> the problem is.
>>
>
> The main use case is for allowing clients to pass in
> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> ION the buffers aren't usually accessed from the CPU so this allows
> clients to often avoid doing unnecessary cache maintenance.
>
How can a client know that no CPU access has occurred that needs to be
flushed out?
On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> On 1/21/19 1:44 PM, Liam Mark wrote:
> > On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> >
> >> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
> >>>> And who is going to decide which ones to pass? And who documents
> >>>> which ones are safe?
> >>>>
> >>>> I'd much rather have explicit, well documented dma-buf flags that
> >>>> might get translated to the DMA API flags, which are not error checked,
> >>>> not very well documented and way to easy to get wrong.
> >>>>
> >>>
> >>> I'm not sure having flags in dma-buf really solves anything
> >>> given drivers can use the attributes directly with dma_map
> >>> anyway, which is what we're looking to do. The intention
> >>> is for the driver creating the dma_buf attachment to have
> >>> the knowledge of which flags to use.
> >>
> >> Well, there are very few flags that you can simply use for all calls of
> >> dma_map*. And given how badly these flags are defined I just don't want
> >> people to add more places where they indirectly use these flags, as
> >> it will be more than enough work to clean up the current mess.
> >>
> >> What flag(s) do you want to pass this way, btw? Maybe that is where
> >> the problem is.
> >>
> >
> > The main use case is for allowing clients to pass in
> > DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> > which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> > ION the buffers aren't usually accessed from the CPU so this allows
> > clients to often avoid doing unnecessary cache maintenance.
> >
>
> How can a client know that no CPU access has occurred that needs to be
> flushed out?
>
I have left this to clients, but if they own the buffer they can have the
knowledge as to whether CPU access is needed in that use case (for example,
for post-processing).
For example, with the previous version of ION we left all decisions of
whether cache maintenance was required up to the client; they would use
the ION cache maintenance IOCTL to force cache maintenance only when it
was required.
In these cases almost all of the access was being done by the device, and
in the rare cases where CPU access was required clients would initiate the
required cache maintenance before and after the CPU access.
On 1/21/19 2:20 PM, Liam Mark wrote:
> On Mon, 21 Jan 2019, Andrew F. Davis wrote:
>
>> On 1/21/19 1:44 PM, Liam Mark wrote:
>>> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
>>>
>>>> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
>>>>>> And who is going to decide which ones to pass? And who documents
>>>>>> which ones are safe?
>>>>>>
>>>>>> I'd much rather have explicit, well documented dma-buf flags that
>>>>>> might get translated to the DMA API flags, which are not error checked,
>>>>>> not very well documented and way to easy to get wrong.
>>>>>>
>>>>>
>>>>> I'm not sure having flags in dma-buf really solves anything
>>>>> given drivers can use the attributes directly with dma_map
>>>>> anyway, which is what we're looking to do. The intention
>>>>> is for the driver creating the dma_buf attachment to have
>>>>> the knowledge of which flags to use.
>>>>
>>>> Well, there are very few flags that you can simply use for all calls of
>>>> dma_map*. And given how badly these flags are defined I just don't want
>>>> people to add more places where they indirectly use these flags, as
>>>> it will be more than enough work to clean up the current mess.
>>>>
>>>> What flag(s) do you want to pass this way, btw? Maybe that is where
>>>> the problem is.
>>>>
>>>
>>> The main use case is for allowing clients to pass in
>>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
>>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
>>> ION the buffers aren't usually accessed from the CPU so this allows
>>> clients to often avoid doing unnecessary cache maintenance.
>>>
>>
>> How can a client know that no CPU access has occurred that needs to be
>> flushed out?
>>
>
> I have left this to clients, but if they own the buffer they can have the
> knowledge as to whether CPU access is needed in that use case (example for
> post-processing).
>
> For example with the previous version of ION we left all decisions of
> whether cache maintenance was required up to the client, they would use
> the ION cache maintenance IOCTL to force cache maintenance only when it
> was required.
> In these cases almost all of the access was being done by the device and
> in the rare cases CPU access was required clients would initiate the
> required cache maintenance before and after the CPU access.
>
I think we have different definitions of "client". I'm talking about the
DMA-BUF client (the importer); that is who can set this flag. It seems
you mean the userspace application, which has no control over this flag.
On Mon, Jan 21, 2019 at 11:44:10AM -0800, Liam Mark wrote:
> The main use case is for allowing clients to pass in
> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> ION the buffers aren't usually accessed from the CPU so this allows
> clients to often avoid doing unnecessary cache maintenance.
This can't work. The cpu can still easily speculate into this area.
Moreover in general these operations should be cheap if the addresses
aren't cached.
On Mon, Jan 21, 2019 at 12:20:42PM -0800, Liam Mark wrote:
> I have left this to clients, but if they own the buffer they can have the
> knowledge as to whether CPU access is needed in that use case (example for
> post-processing).
That is an API design which the user is more likely to get wrong than
right and thus does not pass the smell test.
On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> On Mon, Jan 21, 2019 at 11:44:10AM -0800, Liam Mark wrote:
> > The main use case is for allowing clients to pass in
> > DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> > which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> > ION the buffers aren't usually accessed from the CPU so this allows
> > clients to often avoid doing unnecessary cache maintenance.
>
> This can't work. The cpu can still easily speculate into this area.
Can you provide more detail on your concern here?
The use case I am thinking about here is a cached buffer which is accessed
by a non IO-coherent device (quite a common use case for ION).
Guessing on your concern:
The speculative access can be an issue if you are going to access the
buffer from the CPU after the device has written to it, however if you
know you aren't going to do any CPU access before the buffer is again
returned to the device then I don't think the speculative access is a
concern.
> Moreover in general these operations should be cheap if the addresses
> aren't cached.
>
I am thinking of use cases with cached buffers here, so CMO isn't cheap.
On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> On Mon, Jan 21, 2019 at 12:20:42PM -0800, Liam Mark wrote:
> > I have left this to clients, but if they own the buffer they can have the
> > knowledge as to whether CPU access is needed in that use case (example for
> > post-processing).
>
> That is an API design which the user is more likely to get wrong than
> right and thus does not pass the smell test.
>
With the previous version of ION, Android ION clients were successfully
managing all their cache maintenance.
On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> On 1/21/19 2:20 PM, Liam Mark wrote:
> > On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> >
> >> On 1/21/19 1:44 PM, Liam Mark wrote:
> >>> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> >>>
> >>>> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
> >>>>>> And who is going to decide which ones to pass? And who documents
> >>>>>> which ones are safe?
> >>>>>>
> >>>>>> I'd much rather have explicit, well documented dma-buf flags that
> >>>>>> might get translated to the DMA API flags, which are not error checked,
> >>>>>> not very well documented and way to easy to get wrong.
> >>>>>>
> >>>>>
> >>>>> I'm not sure having flags in dma-buf really solves anything
> >>>>> given drivers can use the attributes directly with dma_map
> >>>>> anyway, which is what we're looking to do. The intention
> >>>>> is for the driver creating the dma_buf attachment to have
> >>>>> the knowledge of which flags to use.
> >>>>
> >>>> Well, there are very few flags that you can simply use for all calls of
> >>>> dma_map*. And given how badly these flags are defined I just don't want
> >>>> people to add more places where they indirectly use these flags, as
> >>>> it will be more than enough work to clean up the current mess.
> >>>>
> >>>> What flag(s) do you want to pass this way, btw? Maybe that is where
> >>>> the problem is.
> >>>>
> >>>
> >>> The main use case is for allowing clients to pass in
> >>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> >>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> >>> ION the buffers aren't usually accessed from the CPU so this allows
> >>> clients to often avoid doing unnecessary cache maintenance.
> >>>
> >>
> >> How can a client know that no CPU access has occurred that needs to be
> >> flushed out?
> >>
> >
> > I have left this to clients, but if they own the buffer they can have the
> > knowledge as to whether CPU access is needed in that use case (example for
> > post-processing).
> >
> > For example with the previous version of ION we left all decisions of
> > whether cache maintenance was required up to the client, they would use
> > the ION cache maintenance IOCTL to force cache maintenance only when it
> > was required.
> > In these cases almost all of the access was being done by the device and
> > in the rare cases CPU access was required clients would initiate the
> > required cache maintenance before and after the CPU access.
> >
>
> I think we have different definitions of "client", I'm talking about the
> DMA-BUF client (the importer), that is who can set this flag. It seems
> you mean the userspace application, which has no control over this flag.
>
I am also talking about dma-buf clients; I am referring to both the
userspace and kernel components of the client. For example, our Camera ION
client has both a userspace and a kernel component, and they have ION
buffers, which they control the access to, which may or may not be
accessed by the CPU in certain use cases.
On 1/21/19 4:18 PM, Liam Mark wrote:
> On Mon, 21 Jan 2019, Andrew F. Davis wrote:
>
>> On 1/21/19 2:20 PM, Liam Mark wrote:
>>> On Mon, 21 Jan 2019, Andrew F. Davis wrote:
>>>
>>>> On 1/21/19 1:44 PM, Liam Mark wrote:
>>>>> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
>>>>>
>>>>>> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
>>>>>>>> And who is going to decide which ones to pass? And who documents
>>>>>>>> which ones are safe?
>>>>>>>>
>>>>>>>> I'd much rather have explicit, well documented dma-buf flags that
>>>>>>>> might get translated to the DMA API flags, which are not error checked,
>>>>>>>> not very well documented and way to easy to get wrong.
>>>>>>>>
>>>>>>>
>>>>>>> I'm not sure having flags in dma-buf really solves anything
>>>>>>> given drivers can use the attributes directly with dma_map
>>>>>>> anyway, which is what we're looking to do. The intention
>>>>>>> is for the driver creating the dma_buf attachment to have
>>>>>>> the knowledge of which flags to use.
>>>>>>
>>>>>> Well, there are very few flags that you can simply use for all calls of
>>>>>> dma_map*. And given how badly these flags are defined I just don't want
>>>>>> people to add more places where they indirectly use these flags, as
>>>>>> it will be more than enough work to clean up the current mess.
>>>>>>
>>>>>> What flag(s) do you want to pass this way, btw? Maybe that is where
>>>>>> the problem is.
>>>>>>
>>>>>
>>>>> The main use case is for allowing clients to pass in
>>>>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
>>>>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
>>>>> ION the buffers aren't usually accessed from the CPU so this allows
>>>>> clients to often avoid doing unnecessary cache maintenance.
>>>>>
>>>>
>>>> How can a client know that no CPU access has occurred that needs to be
>>>> flushed out?
>>>>
>>>
>>> I have left this to clients, but if they own the buffer they can have the
>>> knowledge as to whether CPU access is needed in that use case (example for
>>> post-processing).
>>>
>>> For example with the previous version of ION we left all decisions of
>>> whether cache maintenance was required up to the client, they would use
>>> the ION cache maintenance IOCTL to force cache maintenance only when it
>>> was required.
>>> In these cases almost all of the access was being done by the device and
>>> in the rare cases CPU access was required clients would initiate the
>>> required cache maintenance before and after the CPU access.
>>>
>>
>> I think we have different definitions of "client", I'm talking about the
>> DMA-BUF client (the importer), that is who can set this flag. It seems
>> you mean the userspace application, which has no control over this flag.
>>
>
> I am also talking about dma-buf clients, I am referring to both the
> userspace and kernel component of the client. For example our Camera ION
> client has both a usersapce and kernel component and they have ION
> buffers, which they control the access to, which may or may not be
> accessed by the CPU in certain uses cases.
>
I know they often work together, but for this discussion it would be
good to keep kernel clients and userspace clients separate. There are
three types of actors at play here: userspace clients, kernel clients,
and exporters.
DMA-BUF only provides the basic sync primitive + mmap directly to
userspace, both operations are fulfilled by the exporter. This patch is
about adding more control to the kernel side clients. The kernel side
clients cannot know what userspace or other kernel side clients have
done with the buffer, *only* the exporter has the whole picture.
Therefore neither type of client should be deciding whether the CPU caches
need to be flushed. Only the exporter can make this decision, based on the
type of buffer, the current set of attachments, and previous actions (is
this the first attachment, did the CPU get access in between, etc.).
Your goal seems to be to avoid unneeded CPU side CMOs when a device
detaches and another attaches with no CPU access in-between, right?
That's reasonable to me, but it must be the exporter who keeps track and
skips the CMO. This patch allows the client to tell the exporter the CMO
is not needed and that is not safe.
Andrew
On 1/21/19 4:12 PM, Liam Mark wrote:
> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
>
>> On Mon, Jan 21, 2019 at 11:44:10AM -0800, Liam Mark wrote:
>>> The main use case is for allowing clients to pass in
>>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
>>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
>>> ION the buffers aren't usually accessed from the CPU so this allows
>>> clients to often avoid doing unnecessary cache maintenance.
>>
>> This can't work. The cpu can still easily speculate into this area.
>
> Can you provide more detail on your concern here.
> The use case I am thinking about here is a cached buffer which is accessed
> by a non IO-coherent device (quite a common use case for ION).
>
> Guessing on your concern:
> The speculative access can be an issue if you are going to access the
> buffer from the CPU after the device has written to it, however if you
> know you aren't going to do any CPU access before the buffer is again
> returned to the device then I don't think the speculative access is a
> concern.
>
>> Moreover in general these operations should be cheap if the addresses
>> aren't cached.
>>
>
> I am thinking of use cases with cached buffers here, so CMO isn't cheap.
>
These buffers are cacheable, not cached. If you haven't written anything
the data won't actually be in cache. And in the case of speculative cache
filling the lines are marked clean. In either case the only cost is the
little 7 instruction loop calling the clean/invalidate instruction (dc
civac for ARMv8) for the cache-lines. Unless that is the cost you are
trying to avoid?
In that case if you are mapping and unmapping so much that the little
CMO here is hurting performance then I would argue your usage is broken
and needs to be re-worked a bit.
Andrew
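To put a rough shape on the loop being described, here is a portable model of a per-cache-line maintenance walk. It is only a sketch: the function and callback names are hypothetical, the line size is a parameter rather than something read from hardware, and the callback stands in for where an ARMv8 implementation would issue the clean/invalidate instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-line operation; this is where "dc civac" would be issued on ARMv8. */
typedef void (*line_op_t)(uintptr_t line_addr, void *ctx);

/*
 * Walk [start, start + size) one cache line at a time and apply op to each
 * line. line_size must be a power of two and the range must not wrap.
 * Returns the number of cache lines visited.
 */
size_t for_each_cache_line(uintptr_t start, size_t size, size_t line_size,
			   line_op_t op, void *ctx)
{
	size_t count = 0;
	uintptr_t addr, end;

	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
	if (size == 0)
		return 0;

	end = start + size;
	addr = start & ~((uintptr_t)line_size - 1);	/* align down to a line */
	for (; addr < end; addr += line_size) {
		if (op)
			op(addr, ctx);
		count++;
	}
	return count;
}
```

With an assumed 64-byte line, a 4 KiB range costs 64 iterations; the thread gives no line sizes or timings, so that figure is illustrative only.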
On Mon, 21 Jan 2019, Brian Starkey wrote:
> Hi Liam,
>
> On Fri, Jan 18, 2019 at 10:37:47AM -0800, Liam Mark wrote:
> > Add support for configuring dma mapping attributes when mapping
> > and unmapping memory through dma_buf_map_attachment and
> > dma_buf_unmap_attachment.
> >
> > For example this will allow ION clients to skip cache maintenance, by
> > using DMA_ATTR_SKIP_CPU_SYNC, for buffers which are clean and haven't been
> > accessed by the CPU.
>
> How can a client know that the buffer won't be accessed by the CPU in
> the future though?
>
Yes, for use cases where you don't know if it will be accessed in the future
then you would only use it to optimize the dma map path, but as I
mentioned in the other thread there are cases (such as in our Camera)
where we have complete ownership of buffers and do know if it will be
accessed in the future.
> I don't think we can push this decision to clients, because they are
> lacking information about what else is going on with the buffer. It
> needs to be done by the exporter, IMO.
>
I do agree it would be better to handle in the exporter, but in a
pipelining use case where there might not be any devices attached that
doesn't seem very doable.
> Thanks,
> -Brian
>
> >
> > Signed-off-by: Liam Mark <[email protected]>
> > ---
> > drivers/staging/android/ion/ion.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> > index 1fe633a7fdba..0aae845b20ba 100644
> > --- a/drivers/staging/android/ion/ion.c
> > +++ b/drivers/staging/android/ion/ion.c
> > @@ -268,8 +268,8 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> > table = a->table;
> >
> > mutex_lock(&buffer->lock);
> > - if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> > - direction)) {
> > + if (!dma_map_sg_attrs(attachment->dev, table->sgl, table->nents,
> > + direction, attachment->dma_map_attrs)) {
> > mutex_unlock(&buffer->lock);
> > return ERR_PTR(-ENOMEM);
> > }
> > @@ -287,7 +287,8 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> > struct ion_buffer *buffer = attachment->dmabuf->priv;
> >
> > mutex_lock(&buffer->lock);
> > - dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> > + dma_unmap_sg_attrs(attachment->dev, table->sgl, table->nents, direction,
> > + attachment->dma_map_attrs);
> > a->dma_mapped = false;
> > mutex_unlock(&buffer->lock);
> > }
On Tue, 22 Jan 2019, Andrew F. Davis wrote:
> On 1/21/19 4:18 PM, Liam Mark wrote:
> > On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> >
> >> On 1/21/19 2:20 PM, Liam Mark wrote:
> >>> On Mon, 21 Jan 2019, Andrew F. Davis wrote:
> >>>
> >>>> On 1/21/19 1:44 PM, Liam Mark wrote:
> >>>>> On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> >>>>>
> >>>>>> On Sat, Jan 19, 2019 at 08:50:41AM -0800, Laura Abbott wrote:
> >>>>>>>> And who is going to decide which ones to pass? And who documents
> >>>>>>>> which ones are safe?
> >>>>>>>>
> >>>>>>>> I'd much rather have explicit, well documented dma-buf flags that
> >>>>>>>> might get translated to the DMA API flags, which are not error checked,
> >>>>>>>> not very well documented and way to easy to get wrong.
> >>>>>>>>
> >>>>>>>
> >>>>>>> I'm not sure having flags in dma-buf really solves anything
> >>>>>>> given drivers can use the attributes directly with dma_map
> >>>>>>> anyway, which is what we're looking to do. The intention
> >>>>>>> is for the driver creating the dma_buf attachment to have
> >>>>>>> the knowledge of which flags to use.
> >>>>>>
> >>>>>> Well, there are very few flags that you can simply use for all calls of
> >>>>>> dma_map*. And given how badly these flags are defined I just don't want
> >>>>>> people to add more places where they indirectly use these flags, as
> >>>>>> it will be more than enough work to clean up the current mess.
> >>>>>>
> >>>>>> What flag(s) do you want to pass this way, btw? Maybe that is where
> >>>>>> the problem is.
> >>>>>>
> >>>>>
> >>>>> The main use case is for allowing clients to pass in
> >>>>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> >>>>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> >>>>> ION the buffers aren't usually accessed from the CPU so this allows
> >>>>> clients to often avoid doing unnecessary cache maintenance.
> >>>>>
> >>>>
> >>>> How can a client know that no CPU access has occurred that needs to be
> >>>> flushed out?
> >>>>
> >>>
> >>> I have left this to clients, but if they own the buffer they can have the
> >>> knowledge as to whether CPU access is needed in that use case (example for
> >>> post-processing).
> >>>
> >>> For example with the previous version of ION we left all decisions of
> >>> whether cache maintenance was required up to the client, they would use
> >>> the ION cache maintenance IOCTL to force cache maintenance only when it
> >>> was required.
> >>> In these cases almost all of the access was being done by the device and
> >>> in the rare cases CPU access was required clients would initiate the
> >>> required cache maintenance before and after the CPU access.
> >>>
> >>
> >> I think we have different definitions of "client", I'm talking about the
> >> DMA-BUF client (the importer), that is who can set this flag. It seems
> >> you mean the userspace application, which has no control over this flag.
> >>
> >
> > I am also talking about dma-buf clients, I am referring to both the
> > userspace and kernel component of the client. For example our Camera ION
> > client has both a usersapce and kernel component and they have ION
> > buffers, which they control the access to, which may or may not be
> > accessed by the CPU in certain uses cases.
> >
>
> I know they often work together, but for this discussion it would be
> good to keep kernel clients and usperspace clients separate. There are
> three types of actors at play here, userspace clients, kernel clients,
> and exporters.
>
> DMA-BUF only provides the basic sync primitive + mmap directly to
> userspace,
Well, dma-buf does provide dma_buf_kmap/dma_buf_begin_cpu_access, which
allows the same functionality in the kernel, but I don't think that changes
your argument.
> both operations are fulfilled by the exporter. This patch is
> about adding more control to the kernel side clients. The kernel side
> clients cannot know what userspace or other kernel side clients have
> done with the buffer, *only* the exporter has the whole picture.
>
> Therefor neither type of client should be deciding if the CPU needs
> flushed or not, only the exporter, based on the type of buffer, the
> current set attachments, and previous actions (is this first attachment,
> CPU get access in-between, etc...) can make this decision.
>
> You goal seems to be to avoid unneeded CPU side CMOs when a device
> detaches and another attaches with no CPU access in-between, right?
> That's reasonable to me, but it must be the exporter who keeps track and
> skips the CMO. This patch allows the client to tell the exporter the CMO
> is not needed and that is not safe.
>
I agree it would be better to have this logic in the exporter, but I just
haven't heard an upstreamable way to make that work.
But maybe to explore that a bit more.
If we consider having CPU access with no devices attached a legitimate use
case:
The pipelining use case I am thinking of is
1) dev 1 attach, map, access, unmap
2) dev 1 detach
3) (maybe) CPU access
4) dev 2 attach
5) dev 2 map, access
6) ...
It would be unfortunate to not consider this something legitimate for
userspace to do in a pipelining use case.
Requiring devices to stay attached doesn't seem very clean to me as there
isn't necessarily a nice place to tell them when to detach.
If we considered the above a supported use case I think we could support
it in dma-buf (based on past discussions) if we had 2 things
#1 if we tracked the state of the buffer (for example, whether it has had a
previous cached/uncached write with no following CMO). Then when either the
CPU or a device was going to access the buffer it could decide, based on the
previous access, whether any CMO needs to be applied first.
#2 if we had a non-architecture-specific way to apply cache maintenance
without a device, so that in step 3 the begin_cpu_access call could
successfully invalidate the buffer.
I think #1 is doable since we can tell tell if devices are IO coherent or
not and we know the direction of accesses in dma map and begin cpu access.
I think we would probably agree that #2 is a problem though, getting the
kernel to expose that API seems like a hard argument.
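As a rough illustration of the bookkeeping in #1, here is a minimal userspace
sketch in C. It is not ION or dma-buf code: `buf_state`,
`buf_begin_device_access()` and `buf_begin_cpu_access()` are hypothetical
names, and the model assumes the only thing that matters is whether a write
is still pending behind a cache. Each helper returns whether a CMO would be
needed before the access.

```c
#include <stdbool.h>

/* Hypothetical model: which side last wrote without a following CMO. */
enum buf_pending {
	PENDING_NONE,		/* nothing outstanding */
	PENDING_CPU_WRITE,	/* dirty lines may sit in the CPU cache */
	PENDING_DEV_WRITE,	/* memory is newer than any cached lines */
};

struct buf_state {
	enum buf_pending pending;
};

/*
 * A device is about to access the buffer.
 * Returns true if a clean would be needed first, i.e. the CPU has
 * written and the device cannot snoop the CPU cache.
 */
static inline bool buf_begin_device_access(struct buf_state *s,
					   bool io_coherent, bool dev_writes)
{
	bool need_clean = s->pending == PENDING_CPU_WRITE && !io_coherent;

	if (need_clean)
		s->pending = PENDING_NONE;
	if (dev_writes && !io_coherent)
		s->pending = PENDING_DEV_WRITE;
	return need_clean;
}

/*
 * The CPU is about to access the buffer.
 * Returns true if an invalidate would be needed first, i.e. a
 * non-coherent device has written since the last CMO.
 */
static inline bool buf_begin_cpu_access(struct buf_state *s, bool cpu_writes)
{
	bool need_invalidate = s->pending == PENDING_DEV_WRITE;

	if (need_invalidate)
		s->pending = PENDING_NONE;
	if (cpu_writes)
		s->pending = PENDING_CPU_WRITE;
	return need_invalidate;
}
```

Walking the pipeline above through this model: dev 1 (not IO coherent)
writes, and if dev 2 then maps with no CPU access in between, no clean is
needed. If the CPU does read in step 3), an invalidate is needed, which is
where #2 becomes the problem because no device is attached at that point.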
Liam
On Tue, 22 Jan 2019, Andrew F. Davis wrote:
> On 1/21/19 4:12 PM, Liam Mark wrote:
> > On Mon, 21 Jan 2019, Christoph Hellwig wrote:
> >
> >> On Mon, Jan 21, 2019 at 11:44:10AM -0800, Liam Mark wrote:
> >>> The main use case is for allowing clients to pass in
> >>> DMA_ATTR_SKIP_CPU_SYNC in order to skip the default cache maintenance
> >>> which happens in dma_buf_map_attachment and dma_buf_unmap_attachment. In
> >>> ION the buffers aren't usually accessed from the CPU so this allows
> >>> clients to often avoid doing unnecessary cache maintenance.
> >>
> >> This can't work. The cpu can still easily speculate into this area.
> >
> > Can you provide more detail on your concern here.
> > The use case I am thinking about here is a cached buffer which is accessed
> > by a non IO-coherent device (quite a common use case for ION).
> >
> > Guessing on your concern:
> > The speculative access can be an issue if you are going to access the
> > buffer from the CPU after the device has written to it, however if you
> > know you aren't going to do any CPU access before the buffer is again
> > returned to the device then I don't think the speculative access is a
> > concern.
> >
> >> Moreover in general these operations should be cheap if the addresses
> >> aren't cached.
> >>
> >
> > I am thinking of use cases with cached buffers here, so CMO isn't cheap.
> >
>
> These buffers are cacheable, not cached, if you haven't written anything
> the data won't actually be in cache.
That's true
> And in the case of speculative cache
> filling the lines are marked clean. In either case the only cost is the
> little 7 instruction loop calling the clean/invalidate instruction (dc
> civac for ARMv8) for the cache-lines. Unless that is the cost you are
> trying to avoid?
>
This is the cost I am trying to avoid and this comes back to our previous
discussion. We have a coherent system cache, so doing this for every cache
line of a large buffer adds up, between the work itself and the traffic
going out to the bus.
For example I believe 1080P buffers are 8MB, and 4K buffers are even
larger.
I also still think you would want to solve this properly such that
invalidates aren't being done unnecessarily.
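For scale, a rough sketch of that arithmetic in C. The 4 bytes per pixel and
the 64-byte cache line are assumptions for illustration only; real formats
and line sizes vary, and `frame_bytes()` and `cache_lines()` are hypothetical
helper names.

```c
#include <stdint.h>

/* Size of one frame in bytes for a given resolution and pixel size. */
static inline uint64_t frame_bytes(uint32_t width, uint32_t height,
				   uint32_t bytes_per_pixel)
{
	return (uint64_t)width * height * bytes_per_pixel;
}

/* Number of cache lines a by-line clean/invalidate loop has to visit. */
static inline uint64_t cache_lines(uint64_t bytes, uint32_t line_size)
{
	return (bytes + line_size - 1) / line_size;
}
```

Under those assumptions a 1920x1080 frame is 8,294,400 bytes, or 129,600
cache lines per map or unmap, and a 3840x2160 frame is 33,177,600 bytes, or
518,400 cache lines.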
> In that case if you are mapping and unmapping so much that the little
> CMO here is hurting performance then I would argue your usage is broken
> and needs to be re-worked a bit.
>
I am not sure I would say it is broken; the large buffers (for example 1080P
buffers) are mapped and unmapped on every frame. I don't think there is
any clean way to avoid that in a pipelining framework. You could ask
clients to keep the buffers dma mapped but there isn't necessarily a good
time to tell them to unmap.
It would be unfortunate to not consider this something legitimate for
userspace to do in a pipelining use case.
Requiring devices to stay attached doesn't seem very clean to me as there
isn't necessarily a nice place to tell them when to detach.
On Fri, 18 Jan 2019, Liam Mark wrote:
> On Fri, 18 Jan 2019, Andrew F. Davis wrote:
>
> > On 1/18/19 12:37 PM, Liam Mark wrote:
> > > The ION begin_cpu_access and end_cpu_access functions use the
> > > dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
> > > maintenance.
> > >
> > > Currently it is possible to apply cache maintenance, via the
> > > begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
> > > dma mapped.
> > >
> > > The dma sync sg APIs should not be called on sg lists which have not been
> > > dma mapped as this can result in cache maintenance being applied to the
> > > wrong address. If an sg list has not been dma mapped then its dma_address
> > > field has not been populated, some dma ops such as the swiotlb_dma_ops ops
> > > use the dma_address field to calculate the address onto which to apply
> > > cache maintenance.
> > >
> > > Also I don’t think we want CMOs to be applied to a buffer which is not
> > > dma mapped as the memory should already be coherent for access from the
> > > CPU. Any CMOs required for device access taken care of in the
> > > dma_buf_map_attachment and dma_buf_unmap_attachment calls.
> > > So really it only makes sense for begin_cpu_access and end_cpu_access to
> > > apply CMOs if the buffer is dma mapped.
> > >
> > > Fix the ION begin_cpu_access and end_cpu_access functions to only apply
> > > cache maintenance to buffers which are dma mapped.
> > >
> > > Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> > > Signed-off-by: Liam Mark <[email protected]>
> > > ---
> > > drivers/staging/android/ion/ion.c | 26 +++++++++++++++++++++-----
> > > 1 file changed, 21 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> > > index 6f5afab7c1a1..1fe633a7fdba 100644
> > > --- a/drivers/staging/android/ion/ion.c
> > > +++ b/drivers/staging/android/ion/ion.c
> > > @@ -210,6 +210,7 @@ struct ion_dma_buf_attachment {
> > > struct device *dev;
> > > struct sg_table *table;
> > > struct list_head list;
> > > + bool dma_mapped;
> > > };
> > >
> > > static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> > > @@ -231,6 +232,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> > >
> > > a->table = table;
> > > a->dev = attachment->dev;
> > > + a->dma_mapped = false;
> > > INIT_LIST_HEAD(&a->list);
> > >
> > > attachment->priv = a;
> > > @@ -261,12 +263,18 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> > > {
> > > struct ion_dma_buf_attachment *a = attachment->priv;
> > > struct sg_table *table;
> > > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> > >
> > > table = a->table;
> > >
> > > + mutex_lock(&buffer->lock);
> > > if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> > > - direction))
> > > + direction)) {
> > > + mutex_unlock(&buffer->lock);
> > > return ERR_PTR(-ENOMEM);
> > > + }
> > > + a->dma_mapped = true;
> > > + mutex_unlock(&buffer->lock);
> > >
> > > return table;
> > > }
> > > @@ -275,7 +283,13 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> > > struct sg_table *table,
> > > enum dma_data_direction direction)
> > > {
> > > + struct ion_dma_buf_attachment *a = attachment->priv;
> > > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> > > +
> > > + mutex_lock(&buffer->lock);
> > > dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> > > + a->dma_mapped = false;
> > > + mutex_unlock(&buffer->lock);
> > > }
> > >
> > > static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > > @@ -346,8 +360,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> > >
> > > mutex_lock(&buffer->lock);
> > > list_for_each_entry(a, &buffer->attachments, list) {
> >
> > When no devices are attached then buffer->attachments is empty and the
> > below does not run, so if I understand this patch correctly then what
> > you are protecting against is CPU access in the window after
> > dma_buf_attach but before dma_buf_map.
> >
>
> Yes
>
> > This is the kind of thing that again makes me think a couple more
> > ordering requirements on DMA-BUF ops are needed. DMA-BUFs do not require
> > the backing memory to be allocated until map time, this is why the
> > dma_address field would still be null as you note in the commit message.
> > So why should the CPU be performing accesses on a buffer that is not
> > actually backed yet?
> >
> > I can think of two solutions:
> >
> > 1) Only allow CPU access (mmap, kmap, {begin,end}_cpu_access) while at
> > least one device is mapped.
> >
>
> Would be quite limiting to clients.
>
> > 2) Treat the CPU access request like a device map request and
> > trigger the allocation of backing memory just like if a device map had
> > come in.
> >
>
> Which is, as you mention pretty much what we have now (though the buffer
> is allocated even earlier).
>
> > I know the current Ion heaps (and most other DMA-BUF exporters) all do
> > the allocation up front so the memory is already there, but DMA-BUF was
> > designed with late allocation in mind. I have a use-case I'm working on
> > that finally exercises this DMA-BUF functionality and I would like to
> > have it export through ION. This patch doesn't prevent that, but seems
> > like it is endorsing the idea that buffers always need to be backed,
> > even before device attach/map has occurred.
> >
>
> I didn't interpret the DMA-buf contract as requiring the dma-map to be
> called in order for a backing store to be provided, I interpreted it as
> meaning there could be a backing store before the dma-map but at the
> dma-map call the final backing store configuration would be decided
> (perhaps involving migrating the memory to the final backing store).
> I will let the dma-buf experts correct me on that.
>
> Limiting userspace clients to not be able to access buffers until after
> they are dma-mapped seems unfortunate to me; dma-mapping usually means a
> change of ownership of the memory from the CPU to the device. So generally
> while a buffer is dma mapped you have the device access it (though of
> course it is supported for the CPU to access the buffer while dma mapped)
> and then once the buffer is dma-unmapped the CPU can access it. This is
> how the DMA APIs are frequently used, and the changes above make ION align
> more with the way the DMA APIs are used. Basically when the buffer is not
> dma-mapped the CPU doesn't need to do any CMOs to access the buffer (and
> ION ensures no CMOs are applied) but if the CPU does want to access the
> buffer while it is dma mapped then ION ensures that the appropriate CMOs
> are applied.
>
> It seems like a legitimate use case to me to allow clients to access the
> buffer before (and after) dma-mapping, for example post processing of buffers.
>
>
> > Either of the above two solutions would need to target the DMA-BUF
> > framework,
> >
> > Sumit,
> >
> > Any comment?
> >
In a separate thread Sumit seems to have confirmed that it is not a
requirement for exporters to defer the allocation until first dma map.
https://lore.kernel.org/lkml/CAO_48GEYPW0u6uWkkFgqjmmabLcBm69OD34QihSNGewqz_AqSQ@mail.gmail.com/
From Sumit:
"""
> Maybe it should be up to the exporter if early CPU access is allowed?
>
> I'm hoping someone with authority over the DMA-BUF framework can clarify
> original intentions here.
>
I suppose dma-buf as a framework can't know or decide what the exporter
wants or can do - whether the exporter wants to use it for 'only
zero-copy', or do some intelligent things behind the scene, I think should
be best left to the exporter.
"""
So it seems like it is acceptable for ION to continue to support access to
the buffer from the CPU before it is DMA mapped.
I was wondering if there was any additional feedback on this change since
it does fix a bug where userspace can cause the system to crash and I
think the change also results in a more logical application of CMOs.
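To restate the behaviour of the change in isolation, here is a minimal
userspace model in C. It is only a sketch: `attachment_model` and
`begin_cpu_access_model()` are hypothetical stand-ins for
ion_dma_buf_attachment and ion_dma_buf_begin_cpu_access, the attachment list
is modelled as an array, and a counter stands in for the
dma_sync_sg_for_cpu call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry on buffer->attachments. */
struct attachment_model {
	bool dma_mapped;	/* set on map, cleared on unmap */
	int sync_count;		/* stands in for dma_sync_sg_for_cpu() calls */
};

/*
 * Model of begin_cpu_access after the patch: only attachments that are
 * currently dma mapped get cache maintenance. Unmapped attachments are
 * skipped because their sg dma_address has not been populated.
 * Returns the number of attachments that were synced.
 */
static inline int begin_cpu_access_model(struct attachment_model *a, size_t n)
{
	int synced = 0;

	for (size_t i = 0; i < n; i++) {
		if (!a[i].dma_mapped)
			continue;
		a[i].sync_count++;
		synced++;
	}
	return synced;
}
```

With one attached-but-unmapped device and one mapped device, only the mapped
one is synced; with no mapped devices nothing is synced at all.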
> > Thanks,
> > Andrew
> >
> > > - dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
> > > - direction);
> > > + if (a->dma_mapped)
> > > + dma_sync_sg_for_cpu(a->dev, a->table->sgl,
> > > + a->table->nents, direction);
> > > }
> > >
> > > unlock:
> > > @@ -369,8 +384,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
> > >
> > > mutex_lock(&buffer->lock);
> > > list_for_each_entry(a, &buffer->attachments, list) {
> > > - dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
> > > - direction);
> > > + if (a->dma_mapped)
> > > + dma_sync_sg_for_device(a->dev, a->table->sgl,
> > > + a->table->nents, direction);
> > > }
> > > mutex_unlock(&buffer->lock);
> > >
> > >
> >
>
Hi Liam,
On Tue, Jan 29, 2019 at 03:44:53PM -0800, Liam Mark wrote:
> In a separate thread Sumit seems to have confirmed that it is not a
> requirement for exporters to defer the allocation until first dma map.
>
> https://lore.kernel.org/lkml/CAO_48GEYPW0u6uWkkFgqjmmabLcBm69OD34QihSNGewqz_AqSQ@mail.gmail.com/
>
> From Sumit:
> """
> > Maybe it should be up to the exporter if early CPU access is allowed?
> >
> > I'm hoping someone with authority over the DMA-BUF framework can clarify
> > original intentions here.
> >
>
> I suppose dma-buf as a framework can't know or decide what the exporter
> wants or can do - whether the exporter wants to use it for 'only
> zero-copy', or do some intelligent things behind the scene, I think should
> be best left to the exporter.
> """
>
> So it seems like it is acceptable for ION to continue to support access to
> the buffer from the CPU before it is DMA mapped.
>
> I was wondering if there was any additional feedback on this change since
> it does fix a bug where userspace can cause the system to crash and I
> think the change also results in a more logical application of CMOs.
>
We hit the same crash, and this patch certainly looks like it would
fix it. On that basis:
Reviewed-by: Brian Starkey <[email protected]>
I don't think anyone here had a chance to test it yet, though.
Thanks,
-Brian
On 1/29/19 5:44 PM, Liam Mark wrote:
> On Fri, 18 Jan 2019, Liam Mark wrote:
>
>> On Fri, 18 Jan 2019, Andrew F. Davis wrote:
>>
>>> When no devices are attached then buffer->attachments is empty and the
>>> below does not run, so if I understand this patch correctly then what
>>> you are protecting against is CPU access in the window after
>>> dma_buf_attach but before dma_buf_map.
>>>
>>
>> Yes
>>
>>> This is the kind of thing that again makes me think a couple more
>>> ordering requirements on DMA-BUF ops are needed. DMA-BUFs do not require
>>> the backing memory to be allocated until map time, this is why the
>>> dma_address field would still be null as you note in the commit message.
>>> So why should the CPU be performing accesses on a buffer that is not
>>> actually backed yet?
>>>
>>> I can think of two solutions:
>>>
>>> 1) Only allow CPU access (mmap, kmap, {begin,end}_cpu_access) while at
>>> least one device is mapped.
>>>
>>
>> Would be quite limiting to clients.
>>
I can agree with that, option two seems more reasonable.
>>> 2) Treat the CPU access request like a device map request and
>>> trigger the allocation of backing memory just like if a device map had
>>> come in.
>>>
>>
>> Which is, as you mention pretty much what we have now (though the buffer
>> is allocated even earlier).
>>
It only behaves like it does because the buffer is always allocated. We
still need to give Ion heap exporters a way to allocate at map/CPU access
time.
>>> I know the current Ion heaps (and most other DMA-BUF exporters) all do
>>> the allocation up front so the memory is already there, but DMA-BUF was
>>> designed with late allocation in mind. I have a use-case I'm working on
>>> that finally exercises this DMA-BUF functionality and I would like to
>>> have it export through ION. This patch doesn't prevent that, but seems
>>> like it is endorsing the idea that buffers always need to be backed,
>>> even before device attach/map has occurred.
>>>
>>
>> I didn't interpret the DMA-buf contract as requiring the dma-map to be
>> called in order for a backing store to be provided, I interpreted it as
>> meaning there could be a backing store before the dma-map but at the
>> dma-map call the final backing store configuration would be decided
>> (perhaps involving migrating the memory to the final backing store).
>> I will let the dma-buf experts correct me on that.
>>
>> Limiting userspace clients to not be able to access buffers until after
>> they are dma-mapped seems unfortunate to me; dma-mapping usually means a
>> change of ownership of the memory from the CPU to the device. So generally
>> while a buffer is dma mapped you have the device access it (though of
>> course it is supported for the CPU to access the buffer while dma mapped)
>> and then once the buffer is dma-unmapped the CPU can access it. This is
>> how the DMA APIs are frequently used, and the changes above make ION align
>> more with the way the DMA APIs are used. Basically when the buffer is not
>> dma-mapped the CPU doesn't need to do any CMOs to access the buffer (and
>> ION ensures no CMOs are applied) but if the CPU does want to access the
>> buffer while it is dma mapped then ION ensures that the appropriate CMOs
>> are applied.
>>
>> It seems like a legitimate use case to me to allow clients to access the
>> buffer before (and after) dma-mapping, for example post processing of buffers.
>>
>>
>>> Either of the above two solutions would need to target the DMA-BUF
>>> framework,
>>>
>>> Sumit,
>>>
>>> Any comment?
>>>
>
> In a separate thread Sumit seems to have confirmed that it is not a
> requirement for exporters to defer the allocation until first dma map.
>
> https://lore.kernel.org/lkml/CAO_48GEYPW0u6uWkkFgqjmmabLcBm69OD34QihSNGewqz_AqSQ@mail.gmail.com/
>
> From Sumit:
> """
>> Maybe it should be up to the exporter if early CPU access is allowed?
>>
>> I'm hoping someone with authority over the DMA-BUF framework can clarify
>> original intentions here.
>>
>
> I suppose dma-buf as a framework can't know or decide what the exporter
> wants or can do - whether the exporter wants to use it for 'only
> zero-copy', or do some intelligent things behind the scene, I think should
> be best left to the exporter.
> """
>
> So it seems like it is acceptable for ION to continue to support access to
> the buffer from the CPU before it is DMA mapped.
>
It sounds like it is to be left to the exporter, which means some heaps
should be allowed to *not* allow such a thing if they choose. More
control needs to be given to Ion heaps to make the framework usable by
more types of heaps.
But that is beyond the scope of this patch.
> I was wondering if there was any additional feedback on this change since
> it does fix a bug where userspace can cause the system to crash and I
> think the change also results in a more logical application of CMOs.
>
>
I'm not sure I like the direction, but this patch does seem technically
correct at blocking a crash,
Reviewed-by: Andrew F. Davis <[email protected]>
On Wed, Jan 30, 2019 at 11:31:23AM +0000, Brian Starkey wrote:
>
> On Tue, Jan 29, 2019 at 03:44:53PM -0800, Liam Mark wrote:
> > On Fri, 18 Jan 2019, Liam Mark wrote:
> >
> > > On Fri, 18 Jan 2019, Andrew F. Davis wrote:
> > >
> > > > On 1/18/19 12:37 PM, Liam Mark wrote:
> > > > > The ION begin_cpu_access and end_cpu_access functions use the
> > > > > dma_sync_sg_for_cpu and dma_sync_sg_for_device APIs to perform cache
> > > > > maintenance.
> > > > >
> > > > > Currently it is possible to apply cache maintenance, via the
> > > > > begin_cpu_access and end_cpu_access APIs, to ION buffers which are not
> > > > > dma mapped.
> > > > >
> > > > > The dma sync sg APIs should not be called on sg lists which have not been
> > > > > dma mapped as this can result in cache maintenance being applied to the
> > > > > wrong address. If an sg list has not been dma mapped then its dma_address
> > > > > field has not been populated, some dma ops such as the swiotlb_dma_ops ops
> > > > > use the dma_address field to calculate the address onto which to apply
> > > > > cache maintenance.
> > > > >
> > > > > Also I don’t think we want CMOs to be applied to a buffer which is not
> > > > > dma mapped as the memory should already be coherent for access from the
> > > > > > CPU. Any CMOs required for device access are taken care of in the
> > > > > dma_buf_map_attachment and dma_buf_unmap_attachment calls.
> > > > > So really it only makes sense for begin_cpu_access and end_cpu_access to
> > > > > apply CMOs if the buffer is dma mapped.
> > > > >
> > > > > Fix the ION begin_cpu_access and end_cpu_access functions to only apply
> > > > > cache maintenance to buffers which are dma mapped.
> > > > >
> > > > > Fixes: 2a55e7b5e544 ("staging: android: ion: Call dma_map_sg for syncing and mapping")
> > > > > Signed-off-by: Liam Mark <[email protected]>
> > > > > ---
> > > > > drivers/staging/android/ion/ion.c | 26 +++++++++++++++++++++-----
> > > > > 1 file changed, 21 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/drivers/staging/android/ion/ion.c b/drivers/staging/android/ion/ion.c
> > > > > index 6f5afab7c1a1..1fe633a7fdba 100644
> > > > > --- a/drivers/staging/android/ion/ion.c
> > > > > +++ b/drivers/staging/android/ion/ion.c
> > > > > @@ -210,6 +210,7 @@ struct ion_dma_buf_attachment {
> > > > > struct device *dev;
> > > > > struct sg_table *table;
> > > > > struct list_head list;
> > > > > + bool dma_mapped;
> > > > > };
> > > > >
> > > > > static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> > > > > @@ -231,6 +232,7 @@ static int ion_dma_buf_attach(struct dma_buf *dmabuf,
> > > > >
> > > > > a->table = table;
> > > > > a->dev = attachment->dev;
> > > > > + a->dma_mapped = false;
> > > > > INIT_LIST_HEAD(&a->list);
> > > > >
> > > > > attachment->priv = a;
> > > > > @@ -261,12 +263,18 @@ static struct sg_table *ion_map_dma_buf(struct dma_buf_attachment *attachment,
> > > > > {
> > > > > struct ion_dma_buf_attachment *a = attachment->priv;
> > > > > struct sg_table *table;
> > > > > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> > > > >
> > > > > table = a->table;
> > > > >
> > > > > + mutex_lock(&buffer->lock);
> > > > > if (!dma_map_sg(attachment->dev, table->sgl, table->nents,
> > > > > - direction))
> > > > > + direction)) {
> > > > > + mutex_unlock(&buffer->lock);
> > > > > return ERR_PTR(-ENOMEM);
> > > > > + }
> > > > > + a->dma_mapped = true;
> > > > > + mutex_unlock(&buffer->lock);
> > > > >
> > > > > return table;
> > > > > }
> > > > > @@ -275,7 +283,13 @@ static void ion_unmap_dma_buf(struct dma_buf_attachment *attachment,
> > > > > struct sg_table *table,
> > > > > enum dma_data_direction direction)
> > > > > {
> > > > > + struct ion_dma_buf_attachment *a = attachment->priv;
> > > > > + struct ion_buffer *buffer = attachment->dmabuf->priv;
> > > > > +
> > > > > + mutex_lock(&buffer->lock);
> > > > > dma_unmap_sg(attachment->dev, table->sgl, table->nents, direction);
> > > > > + a->dma_mapped = false;
> > > > > + mutex_unlock(&buffer->lock);
> > > > > }
> > > > >
> > > > > static int ion_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
> > > > > @@ -346,8 +360,9 @@ static int ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
> > > > >
> > > > > mutex_lock(&buffer->lock);
> > > > > list_for_each_entry(a, &buffer->attachments, list) {
> > > >
> > > > When no devices are attached then buffer->attachments is empty and the
> > > > below does not run, so if I understand this patch correctly then what
> > > > you are protecting against is CPU access in the window after
> > > > dma_buf_attach but before dma_buf_map.
> > > >
> > >
> > > Yes
> > >
> > > > This is the kind of thing that again makes me think a couple more
> > > > ordering requirements on DMA-BUF ops are needed. DMA-BUFs do not require
> > > > the backing memory to be allocated until map time, this is why the
> > > > dma_address field would still be null as you note in the commit message.
> > > > So why should the CPU be performing accesses on a buffer that is not
> > > > actually backed yet?
> > > >
> > > > I can think of two solutions:
> > > >
> > > > 1) Only allow CPU access (mmap, kmap, {begin,end}_cpu_access) while at
> > > > least one device is mapped.
> > > >
> > >
> > > Would be quite limiting to clients.
> > >
> > > > > 2) Treat the CPU access request like a device map request and
> > > > trigger the allocation of backing memory just like if a device map had
> > > > come in.
> > > >
> > >
> > > Which is, as you mention pretty much what we have now (though the buffer
> > > is allocated even earlier).
> > >
> > > > I know the current Ion heaps (and most other DMA-BUF exporters) all do
> > > > the allocation up front so the memory is already there, but DMA-BUF was
> > > > designed with late allocation in mind. I have a use-case I'm working on
> > > > that finally exercises this DMA-BUF functionality and I would like to
> > > > have it export through ION. This patch doesn't prevent that, but seems
> > > > > like it is endorsing the idea that buffers always need to be backed,
> > > > > even before device attach/map has occurred.
> > > >
> > >
> > > I didn't interpret the DMA-buf contract as requiring the dma-map to be
> > > called in order for a backing store to be provided, I interpreted it as
> > > meaning there could be a backing store before the dma-map but at the
> > > dma-map call the final backing store configuration would be decided
> > > (perhaps involving migrating the memory to the final backing store).
> > > I will let the dma-buf experts correct me on that.
> > >
> > > Limiting userspace clients to not be able to access buffers until after
> > > > they are dma-mapped seems unfortunate to me, dma-mapping usually means a
> > > change of ownership of the memory from the CPU to the device. So generally
> > > while a buffer is dma mapped you have the device access it (though of
> > > course it is supported for CPU to access to the buffer while dma mapped)
> > > and then once the buffer is dma-unmapped the CPU can access it. This is
> > > how the DMA APIs are frequently used, and the changes above make ION align
> > > more with the way the DMA APIs are used. Basically when the buffer is not
> > > dma-mapped the CPU doesn't need to do any CMOs to access the buffer (and
> > > > ION ensures no CMOs are applied) but if the CPU does want to access the
> > > buffer while it is dma mapped then ION ensures that the appropriate CMOs
> > > are applied.
> > >
> > > > It seems like a legitimate use case to me to allow clients to access the
> > > > buffer before (and after) dma-mapping, for example post processing of buffers.
> > >
> > >
> > > > Either of the above two solutions would need to target the DMA-BUF
> > > > framework,
> > > >
> > > > Sumit,
> > > >
> > > > Any comment?
> > > >
> >
> > In a separate thread Sumit seems to have confirmed that it is not a
> > requirement for exporters to defer the allocation until first dma map.
> >
> > https://lore.kernel.org/lkml/CAO_48GEYPW0u6uWkkFgqjmmabLcBm69OD34QihSNGewqz_AqSQ@mail.gmail.com/
> >
> > From Sumit:
> > """
> > > Maybe it should be up to the exporter if early CPU access is allowed?
> > >
> > > I'm hoping someone with authority over the DMA-BUF framework can clarify
> > > original intentions here.
> > >
> >
> > I suppose dma-buf as a framework can't know or decide what the exporter
> > wants or can do - whether the exporter wants to use it for 'only
> > zero-copy', or do some intelligent things behind the scene, I think should
> > be best left to the exporter.
> > """
> >
> > So it seems like it is acceptable for ION to continue to support access to
> > the buffer from the CPU before it is DMA mapped.
> >
> > I was wondering if there was any additional feedback on this change since
> > it does fix a bug where userspace can cause the system to crash and I
> > think the change also results in a more logical application of CMOs.
> >
>
> We hit the same crash, and this patch certainly looks like it would
> fix it. On that basis:
>
> Reviewed-by: Brian Starkey <[email protected]>
>
> I don't think anyone here had a chance to test it yet, though.
I've run some testing, and this patch does indeed fix the crash in
dma_sync_sg_for_cpu when it tried to use the 0 dma_address from the sg
list.
Tested-by: Ørjan Eide <[email protected]>
I tested this on an older kernel, v4.14, since the dma-mapping code
moved, in v4.19, to ignore the dma_address and instead use sg_phys() to
get a valid address from the page, which is always valid in the ion sg
lists. While this wouldn't crash on newer kernels, it's still good to
avoid the unnecessary work when no CMO is needed.
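
To make the difference concrete, here is a toy userspace model of the two
ways a sync routine can pick the address to operate on. It is only a sketch,
not kernel code: struct toy_sg and the toy_* helpers are hypothetical names,
and an identity mapping between physical and DMA addresses is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy scatterlist entry (hypothetical; the real struct scatterlist differs). */
struct toy_sg {
	uint64_t page_phys;   /* physical address of the backing page, always set */
	uint64_t dma_address; /* filled in by the map step, 0 until then */
};

/* Simulated map step; an identity mapping is assumed for simplicity. */
void toy_dma_map(struct toy_sg *sg)
{
	assert(sg != NULL);
	sg->dma_address = sg->page_phys;
}

/* Older style: the sync target is derived from dma_address. */
uint64_t toy_sync_addr_from_dma(const struct toy_sg *sg)
{
	return sg->dma_address;
}

/* Newer style: the sync target is derived from the page, as sg_phys() does. */
uint64_t toy_sync_addr_from_page(const struct toy_sg *sg)
{
	return sg->page_phys;
}
```

On an entry that has not been mapped, the first helper yields 0 while the
second still yields the page address, which matches the behaviour seen on
v4.14 versus newer kernels.
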
Is this patch a candidate for the relevant stable kernels, those that
have this bug exposed to user space via Ion and DMA_BUF_IOCTL_SYNC?
Thanks,
-Ørjan
>
> Thanks,
> -Brian
>
> >
> > > > Thanks,
> > > > Andrew
> > > >
> > > > > - dma_sync_sg_for_cpu(a->dev, a->table->sgl, a->table->nents,
> > > > > - direction);
> > > > > + if (a->dma_mapped)
> > > > > + dma_sync_sg_for_cpu(a->dev, a->table->sgl,
> > > > > + a->table->nents, direction);
> > > > > }
> > > > >
> > > > > unlock:
> > > > > @@ -369,8 +384,9 @@ static int ion_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
> > > > >
> > > > > mutex_lock(&buffer->lock);
> > > > > list_for_each_entry(a, &buffer->attachments, list) {
> > > > > - dma_sync_sg_for_device(a->dev, a->table->sgl, a->table->nents,
> > > > > - direction);
> > > > > + if (a->dma_mapped)
> > > > > + dma_sync_sg_for_device(a->dev, a->table->sgl,
> > > > > + a->table->nents, direction);
> > > > > }
> > > > > mutex_unlock(&buffer->lock);
> > > > >
> > > > >
> > > >
> > >
The CPU may only access DMA mapped memory if ownership has been
transferred back to the CPU using dma_sync_{single,sg}_for_cpu, and then
before the device can access it again ownership needs to be transferred
back to the device using dma_sync_{single,sg}_for_device.
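
The ownership rule above can be sketched as a small state machine. This is a
userspace toy under stated assumptions, not the DMA API itself: struct
own_buf and the own_* functions are hypothetical names, and a sync on an
unmapped buffer is modelled as an error return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum own_owner { OWN_CPU, OWN_DEVICE };

/* Toy buffer state: whether it is DMA mapped and who currently owns it. */
struct own_buf {
	bool mapped;
	enum own_owner owner;
};

/* Mapping hands ownership to the device. */
void own_map(struct own_buf *b)
{
	assert(b != NULL);
	b->mapped = true;
	b->owner = OWN_DEVICE;
}

/* Unmapping hands ownership back to the CPU. */
void own_unmap(struct own_buf *b)
{
	assert(b != NULL);
	b->mapped = false;
	b->owner = OWN_CPU;
}

/* Models a for_cpu sync; only valid while mapped. Returns 0 or -1. */
int own_sync_for_cpu(struct own_buf *b)
{
	if (!b->mapped)
		return -1;
	b->owner = OWN_CPU;
	return 0;
}

/* Models a for_device sync; only valid while mapped. Returns 0 or -1. */
int own_sync_for_device(struct own_buf *b)
{
	if (!b->mapped)
		return -1;
	b->owner = OWN_DEVICE;
	return 0;
}

/* The CPU may touch the buffer only while it owns it. */
bool own_cpu_may_access(const struct own_buf *b)
{
	return b->owner == OWN_CPU;
}
```
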
> I've run some testing, and this patch does indeed fix the crash in
> dma_sync_sg_for_cpu when it tried to use the 0 dma_address from the sg
> list.
>
> Tested-by: Ørjan Eide <[email protected]>
>
> I tested this on an older kernel, v4.14, since the dma-mapping code
> moved, in v4.19, to ignore the dma_address and instead use sg_phys() to
> get a valid address from the page, which is always valid in the ion sg
> lists. While this wouldn't crash on newer kernels, it's still good to
> avoid the unnecessary work when no CMO is needed.
Can you also test it with CONFIG_DMA_API_DEBUG enabled, as that should
catch all the usual mistakes in DMA API usage, including the one found?
On Wed, Feb 06, 2019 at 11:31:04PM -0800, Christoph Hellwig wrote:
> The CPU may only access DMA mapped memory if ownership has been
> transferred back to the CPU using dma_sync_{single,sg}_for_cpu, and then
> before the device can access it again ownership needs to be transferred
> back to the device using dma_sync_{single,sg}_for_device.
>
> > I've run some testing, and this patch does indeed fix the crash in
> > dma_sync_sg_for_cpu when it tried to use the 0 dma_address from the sg
> > list.
> >
> > Tested-by: Ørjan Eide <[email protected]>
> >
> > I tested this on an older kernel, v4.14, since the dma-mapping code
> > moved, in v4.19, to ignore the dma_address and instead use sg_phys() to
> > get a valid address from the page, which is always valid in the ion sg
> > lists. While this wouldn't crash on newer kernels, it's still good to
> > avoid the unnecessary work when no CMO is needed.
>
> Can you also test it with CONFIG_DMA_API_DEBUG enabled, as that should
> catch all the usual mistakes in DMA API usage, including the one found?
I checked again with CONFIG_DMA_API_DEBUG=y, both with and without this
patch, and I didn't get any dma-mapping errors.
The issue I hit, without this patch, is when a CPU access starts after a
device has attached, which caused ion to create a copy of the buffer's
sg list with dma_address zeroed, but before the device has mapped the
buffer.
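
That attach-but-not-yet-mapped window, and how the patch's dma_mapped flag
closes it, can be sketched in userspace as follows. This is a simplified
model only: struct win_attachment and the win_* helpers are hypothetical
stand-ins for the ION attachment handling, and the sync itself is reduced to
a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an ION attachment (hypothetical, heavily simplified). */
struct win_attachment {
	bool dma_mapped;    /* set on map, cleared on unmap, as in the patch */
	unsigned int syncs; /* counts simulated dma_sync_sg_for_cpu() calls */
};

/* Models the map callback: the attachment becomes DMA mapped. */
void win_map(struct win_attachment *a)
{
	assert(a != NULL);
	a->dma_mapped = true;
}

/* Models the unmap callback: the attachment is no longer DMA mapped. */
void win_unmap(struct win_attachment *a)
{
	assert(a != NULL);
	a->dma_mapped = false;
}

/*
 * Models begin_cpu_access with the patch applied: attachments that are
 * attached but not mapped are skipped. Returns how many were synced.
 */
size_t win_begin_cpu_access(struct win_attachment *atts, size_t n)
{
	size_t synced = 0;

	for (size_t i = 0; i < n; i++) {
		if (!atts[i].dma_mapped)
			continue; /* no valid dma_address yet, nothing to sync */
		atts[i].syncs++;
		synced++;
	}
	return synced;
}
```

With two attached devices of which only one has mapped the buffer, a CPU
access in this model syncs exactly one attachment and leaves the other
untouched.
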
--
Ørjan
+ Sumit
Hi Sumit,
Do you have any thoughts on this patch?
It fixes a potential crash on older kernels and I think limiting
begin/end_cpu_access to only apply cache maintenance when the buffer is
dma mapped makes sense from a logical perspective and performance
perspective.
On Wed, 6 Feb 2019, Ørjan Eide wrote:
>
> I've run some testing, and this patch does indeed fix the crash in
> dma_sync_sg_for_cpu when it tried to use the 0 dma_address from the sg
> list.
>
> Tested-by: Ørjan Eide <[email protected]>
>
> I tested this on an older kernel, v4.14, since the dma-mapping code
> moved, in v4.19, to ignore the dma_address and instead use sg_phys() to
> get a valid address from the page, which is always valid in the ion sg
> lists. While this wouldn't crash on newer kernels, it's still good to
> avoid the unnecessary work when no CMO is needed.
>
Isn't a fix like this also required from a stability perspective for
future kernels? I understand from your analysis that the crash has
been fixed after 4.19 by using sg_phys to get the address, but aren't we
breaking the DMA API contract by calling dma_sync_* without first dma
mapping the memory? If so, then we have no guarantee that future
implementations of functions like dma_direct_sync_sg_for_cpu will properly
handle calls to dma_sync_* if the memory is not dma mapped.
> Is this patch a candidate for the relevant stable kernels, those that
> have this bug exposed to user space via Ion and DMA_BUF_IOCTL_SYNC?
>
My belief is that it is relevant for older kernels, otherwise an unprivileged
malicious userspace application may be able to crash the system if it
can call DMA_BUF_IOCTL_SYNC at the right time.
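
For reference, this is roughly what the userspace side of that call looks
like. It is a minimal sketch using the uapi definitions from
<linux/dma-buf.h>; the helper name dmabuf_cpu_access and its `end` parameter
are made up for illustration, and obtaining the dma-buf fd from ION is left
out.

```c
#include <assert.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: brackets a CPU read/write access to a dma-buf.
 * end == 0 issues DMA_BUF_SYNC_START (reaches begin_cpu_access in the
 * exporter), end != 0 issues DMA_BUF_SYNC_END (reaches end_cpu_access).
 * Returns the ioctl() result: 0 on success, -1 with errno set on failure.
 */
int dmabuf_cpu_access(int dmabuf_fd, int end)
{
	struct dma_buf_sync sync = {
		.flags = (end ? DMA_BUF_SYNC_END : DMA_BUF_SYNC_START) |
			 DMA_BUF_SYNC_RW,
	};

	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

A client would call dmabuf_cpu_access(fd, 0) before touching its mmap of the
buffer and dmabuf_cpu_access(fd, 1) afterwards. Nothing stops it from doing
so after a device has attached but before that device has mapped the buffer.
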
BTW thanks Ørjan for the testing and analysis you have carried out on this
change.
Liam