2010-12-20 18:44:09

by Felipe Contreras

Subject: [PATCH v2] staging: tidspbridge: protect dmm_map properly

We need to protect not only the dmm_map list, but the individual
map_obj's; otherwise, we might be building the scatter-gather list with
garbage. So, use the existing proc_lock for that.

I observed race conditions which caused kernel panics while running
stress tests. This patch fixes those.

Signed-off-by: Felipe Contreras <[email protected]>
---
drivers/staging/tidspbridge/rmgr/proc.c | 19 ++++++++++++++-----
1 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/tidspbridge/rmgr/proc.c b/drivers/staging/tidspbridge/rmgr/proc.c
index b47d7aa..e2fe165 100644
--- a/drivers/staging/tidspbridge/rmgr/proc.c
+++ b/drivers/staging/tidspbridge/rmgr/proc.c
@@ -781,12 +781,14 @@ int proc_begin_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
(u32)pmpu_addr,
ul_size, dir);

+ mutex_lock(&proc_lock);
+
/* find requested memory are in cached mapping information */
map_obj = find_containing_mapping(pr_ctxt, (u32) pmpu_addr, ul_size);
if (!map_obj) {
pr_err("%s: find_containing_mapping failed\n", __func__);
status = -EFAULT;
- goto err_out;
+ goto no_map;
}

if (memory_give_ownership(map_obj, (u32) pmpu_addr, ul_size, dir)) {
@@ -795,6 +797,8 @@ int proc_begin_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
status = -EFAULT;
}

+no_map:
+ mutex_unlock(&proc_lock);
err_out:

return status;
@@ -819,21 +823,24 @@ int proc_end_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
(u32)pmpu_addr,
ul_size, dir);

+ mutex_lock(&proc_lock);
+
/* find requested memory are in cached mapping information */
map_obj = find_containing_mapping(pr_ctxt, (u32) pmpu_addr, ul_size);
if (!map_obj) {
pr_err("%s: find_containing_mapping failed\n", __func__);
status = -EFAULT;
- goto err_out;
+ goto no_map;
}

if (memory_regain_ownership(map_obj, (u32) pmpu_addr, ul_size, dir)) {
pr_err("%s: InValid address parameters %p %x\n",
__func__, pmpu_addr, ul_size);
status = -EFAULT;
- goto err_out;
}

+no_map:
+ mutex_unlock(&proc_lock);
err_out:
return status;
}
@@ -1726,9 +1733,8 @@ int proc_un_map(void *hprocessor, void *map_addr,
(p_proc_object->hbridge_context, va_align, size_align);
}

- mutex_unlock(&proc_lock);
if (status)
- goto func_end;
+ goto unmap_failed;

/*
* A successful unmap should be followed by removal of map_obj
@@ -1737,6 +1743,9 @@ int proc_un_map(void *hprocessor, void *map_addr,
*/
remove_mapping_information(pr_ctxt, (u32) map_addr, size_align);

+unmap_failed:
+ mutex_unlock(&proc_lock);
+
func_end:
dev_dbg(bridge, "%s: hprocessor: 0x%p map_addr: 0x%p status: 0x%x\n",
__func__, hprocessor, map_addr, status);
--
1.7.3.3


2010-12-21 14:22:33

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

Hi Felipe,

On Mon, Dec 20, 2010 at 8:43 PM, Felipe Contreras
<[email protected]> wrote:
> We need to protect not only the dmm_map list, but the individual
> map_obj's; otherwise, we might be building the scatter-gather list with
> garbage.

Thanks for reporting this !

> So, use the existing proc_lock for that.

I don't like this.

You are taking this lock over the entire proc_begin_dma() and
proc_end_dma() functions, and by that artificially imposing sequential
behavior of their execution (system-wide).

I strongly prefer protecting the dmm map object itself, and not the
code that uses it, in order to avoid redundant bottlenecks.

Something like the patch below should fix the panics without globally
locking the dma functions (UNTESTED! I still want to have a gst-dsp
setup like yours so I can reproduce the issues you see. If needed,
I'll get to it).

Please note: the code below assumes that applications call
proc_end_dma() for every proc_begin_dma(). This is anyway mandatory,
because otherwise we are risking data corruptions, but we know older
applications (e.g. probably TI samples) don't do that (since the
dspbridge's DMA API is relatively new). If we want/need to support
those older applications with this patch, it's a 2-liner change.
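
To illustrate the pairing requirement, schematically (not real driver
code; it assumes the refcounting scheme from the patch below, where
proc_begin_dma() leaves a reference held that only proc_end_dma()
drops):

/* every begin must eventually be paired with an end */
proc_begin_dma(hprocessor, pmpu_addr, ul_size, DMA_BIDIRECTIONAL);
/* ... the DSP works on the buffer ... */
proc_end_dma(hprocessor, pmpu_addr, ul_size, DMA_BIDIRECTIONAL);
/*
 * Skipping proc_end_dma() means dma_unmap_sg() is never called for
 * that buffer, and the map_obj's refcnt never drops back.
 */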

But there is another issue. I'm not quite sure how gst-dsp is using
the bridge DMA API, but those panics may suggest that there are
additional races here: if the application unmaps a memory region,
before the DMA operation completes (i.e. proc_end_dma() completed
execution), the application will face:

1. errors: when proc_end_dma() does get invoked, the map_obj will
already be destroyed, and find_containing_mapping() will simply fail

2. data corruptions: as a result of (1), dma_unmap_sg() will not be
called, and that may result in data corruptions and generally is very
bad

We can treat this as an application error, but we don't have to, and
it will actually be very clean if we won't.

Given the below patch, it should be quite easy to solve those
additional races too, e.g., by adding a boolean 'dying' flag to each
map_obj, which would (on un_map) mark an object as not available for a
new DMA operation (proc_begin_dma), but it'd still wait until all of
its references are removed (proc_end_dma), and only then it will be
finally removed from the list.
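
A rough, untested sketch of that 'dying' idea (the field and the check
are hypothetical; they are not part of the patch below):

/* hypothetical new field in struct dmm_map_object */
	bool dying;	/* set on un_map; no new DMA operations allowed */

/* in find_containing_mapping(), skip objects that are going away */
	if (match_containing_map_obj(map_obj, mpu_addr, size)) {
		if (map_obj->dying)
			continue;	/* being unmapped, don't hand it out */
		map_obj_get(map_obj);
		goto out;
	}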

Such a change will only work with applications that call
proc_end_dma() for each proc_begin_dma(). How is gst-dsp in that
respect ? do we want to keep supporting older applications which don't
do that ? if we want, we can still support both old and new API with
some ioctl games (introduce new ioctl commands which will work the new
way, and keep the old ones around, maybe only mark them as
deprecated).

Thanks,
Ohad.

---
.../staging/tidspbridge/include/dspbridge/drv.h | 2 +
drivers/staging/tidspbridge/rmgr/proc.c | 39 ++++++++++++++++++--
2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/tidspbridge/include/dspbridge/drv.h b/drivers/staging/tidspbridge/include/dspbridge/drv.h
index c1f363e..cad0626 100644
--- a/drivers/staging/tidspbridge/include/dspbridge/drv.h
+++ b/drivers/staging/tidspbridge/include/dspbridge/drv.h
@@ -25,6 +25,7 @@

#include <dspbridge/drvdefs.h>
#include <linux/idr.h>
+#include <linux/atomic.h>

#define DRV_ASSIGN 1
#define DRV_RELEASE 0
@@ -106,6 +107,7 @@ struct dmm_map_object {
u32 num_usr_pgs;
struct page **pages;
struct bridge_dma_map_info dma_info;
+ atomic_t refcnt;
};

/* Used for DMM reserved memory accounting */
diff --git a/drivers/staging/tidspbridge/rmgr/proc.c b/drivers/staging/tidspbridge/rmgr/proc.c
index b47d7aa..4aabfcc 100644
--- a/drivers/staging/tidspbridge/rmgr/proc.c
+++ b/drivers/staging/tidspbridge/rmgr/proc.c
@@ -112,6 +112,26 @@ static s32 get_envp_count(char **envp);
static char **prepend_envp(char **new_envp, char **envp, s32 envp_elems,
s32 cnew_envp, char *sz_var);

+/*
+ * Grab the dmm_map_object's reference count. This operation is valid only
+ * when map_obj is ALREADY grabbed, e.g., it is still in the dmm_map list
+ * and list's lock is taken
+ */
+static inline void map_obj_get(struct dmm_map_object *map_obj)
+{
+ atomic_inc(&map_obj->refcnt);
+}
+
+/* Ungrab map_obj and destroy it, if it was the last reference */
+static inline void map_obj_put(struct dmm_map_object *map_obj)
+{
+ if (atomic_dec_and_test(&map_obj->refcnt)) {
+ kfree(map_obj->dma_info.sg);
+ kfree(map_obj->pages);
+ kfree(map_obj);
+ }
+}
+
/* remember mapping information */
static struct dmm_map_object *add_mapping_info(struct process_context *pr_ctxt,
u32 mpu_addr, u32 dsp_addr, u32 size)
@@ -143,6 +163,8 @@ static struct dmm_map_object *add_mapping_info(struct process_context *pr_ctxt,
map_obj->dsp_addr = dsp_addr;
map_obj->size = size;
map_obj->num_usr_pgs = num_usr_pgs;
+ /* dmm map objects are initially grabbed */
+ atomic_set(&map_obj->refcnt, 1);

spin_lock(&pr_ctxt->dmm_map_lock);
list_add(&map_obj->link, &pr_ctxt->dmm_map_list);
@@ -181,9 +203,7 @@ static void remove_mapping_information(struct process_context *pr_ctxt,
if (match_exact_map_obj(map_obj, dsp_addr, size)) {
pr_debug("%s: match, deleting map info\n", __func__);
list_del(&map_obj->link);
- kfree(map_obj->dma_info.sg);
- kfree(map_obj->pages);
- kfree(map_obj);
+ map_obj_put(map_obj);
goto out;
}
pr_debug("%s: candidate didn't match\n", __func__);
@@ -203,6 +223,11 @@ static int match_containing_map_obj(struct dmm_map_object *map_obj,
mpu_addr + size <= map_obj_end;
}

+/*
+ * Find a dmm map object that contains the given memory block,
+ * and increase its reference count to prevent its destruction
+ * until released
+ */
static struct dmm_map_object *find_containing_mapping(
struct process_context *pr_ctxt,
u32 mpu_addr, u32 size)
@@ -220,6 +245,7 @@ static struct dmm_map_object *find_containing_mapping(
map_obj->size);
if (match_containing_map_obj(map_obj, mpu_addr, size)) {
pr_debug("%s: match!\n", __func__);
+ map_obj_get(map_obj);
goto out;
}

@@ -793,6 +819,8 @@ int proc_begin_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
pr_err("%s: InValid address parameters %p %x\n",
__func__, pmpu_addr, ul_size);
status = -EFAULT;
+ /* release the reference count we took in the find above */
+ map_obj_put(map_obj);
}

err_out:
@@ -834,6 +862,11 @@ int proc_end_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
goto err_out;
}

+ /* release first the reference count we took in the find above */
+ map_obj_put(map_obj);
+ /* now release the reference count that was taken in proc_begin_dma */
+ map_obj_put(map_obj);
+
err_out:
return status;
}
--
1.7.1

2010-12-21 16:44:13

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

Hi Ohad,

On Tue, Dec 21, 2010 at 4:06 PM, Ohad Ben-Cohen <[email protected]> wrote:
> On Mon, Dec 20, 2010 at 8:43 PM, Felipe Contreras
> <[email protected]> wrote:
>> We need to protect not only the dmm_map list, but the individual
>> map_obj's; otherwise, we might be building the scatter-gather list with
>> garbage.
>
> Thanks for reporting this !
>
>> So, use the existing proc_lock for that.
>
> I don't like this.
>
> You are taking this lock over the entire proc_begin_dma() and
> proc_end_dma() functions, and by that artificially imposing sequential
> behavior of their execution (system-wide).

Wouldn't you want the proc_*_dma() operations to finish before
unmapping the pages? I think this sequence is not artificial; it's
actually the desired behavior.

> I strongly prefer protecting the dmm map object itself, and not the
> code that uses it, in order to avoid redundant bottlenecks.
>
> Something like the patch below should fix the panics without globally
> locking the dma functions (UNTESTED! I still want to have a gst-dsp
> setup like yours so I can reproduce the issues you see. If needed,
> I'll get to it).

I have actually documented how I create it:
http://felipec.wordpress.com/2010/10/08/my-arm-development-notes/

I even provided a tarball with all the GStreamer stuff you can extract
to /opt/gst and use on any system (with a reasonable glibc):
http://people.freedesktop.org/~felipec/arm-gst-dev.tar.gz

The actual test I'm using to trigger this issue is way more
complicated, but I could try to simplify it.

> Please note: the code below assumes that applications call
> proc_end_dma() for every proc_begin_dma().  This is anyway mandatory,
> because otherwise we are risking data corruptions, but we know older
> applications (e.g. probably TI samples) don't do that (since the
> dspbridge's DMA API is relatively new). If we want/need to support
> those older applications with this patch, it's a 2-liner change.

I analyzed this when you introduced the patches, and I found that at
least the OpenMAX code at that time wasn't even doing any DMA
operation; just mapping and unmapping. So it's not an issue for that.

> But there is another issue. I'm not quite sure how gst-dsp is using
> the bridge DMA API, but those panics may suggest that there are
> additional races here: if the application unmaps a memory region,
> before the DMA operation completes (i.e. proc_end_dma() completed
> execution), the application will face:

gst-dsp is so far using the old API.

> 1. errors: when proc_end_dma() does get invoked, the map_obj will
> already be destroyed, and find_containing_mapping() will simply fail

That's not a big issue; a mere warning (that wouldn't happen anyway
because user-space either calls proc_end_dma(), or it doesn't).

> 2. data corruptions: as a result of (1), dma_unmap_sg() will not be
> called, and that may result in data corruptions and generally is very
> bad

That depends on particular systems. My understanding is that unless
speculative pre-fetching is enabled, which TI suggests not to do on
OMAP3430 and OMAP3630, there's actually no issue.

Not to mention that gst-dsp is supposed to work on old kernels that
don't even do the proper flushes on dma_unmap_sg() (pre .34 I think).

> We can treat this as an application error, but we don't have to, and
> it will actually be very clean if we won't.
>
> Given the below patch, it should be quite easy to solve those
> additional races too, e.g., by adding a boolean 'dying' flag to each
> map_obj, which would (on un_map) mark an object as not available for a
> new DMA operation (proc_begin_dma), but it'd still wait until all of
> its references are removed (proc_end_dma), and only then it will be
> finally removed from the list.

That sounds very dangerous. So basically you would be allowing
dma_unmap_sg() to run *after* the pages have been unmapped, and thus no
longer pinned (put_page()); I don't think you are supposed to do that.
Wouldn't it be possible that the pages get swapped out?
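
In other words, the teardown would have to keep the usual order; a
sketch, where 'dev', 'sg', 'num_ents', 'dir' and 'pages' stand for the
values kept in the map_obj:

	/* DMA must be quiesced before the page pins are dropped */
	dma_unmap_sg(dev, sg, num_ents, dir);	/* CPU regains ownership */
	for (i = 0; i < num_pages; i++)
		put_page(pages[i]);	/* only now may the pages go away */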

> Such a change will only work with applications that call
> proc_end_dma() for each proc_begin_dma(). How is gst-dsp in that
> respect ? do we want to keep supporting older applications which don't
> do that ? if we want, we can still support both old and new API with
> some ioctl games (introduce new ioctl commands which will work the new
> way, and keep the old ones around, maybe only mark them as
> deprecated).

IMO, as long as the old API is there (and not even deprecated), the
code should support that behavior. Once that API is removed, then we
can think about stopping that support.

I don't see what the problem is. Currently the bulk of DMA operations
are done through mapping/unmapping and thus the proc_lock is used _all
the time_ already (by both gst-dsp and TI OpenMAX). The problem is
that gst-dsp uses the flush/invalidate API for a few small chunks of
data (1 page), and this patch is just extending the lock for that.
Even if somebody uses this API for big chunks of data, the performance
can't be worse than the current mapping/unmapping of huge memory
areas.

Cheers.

--
Felipe Contreras

2010-12-27 10:30:07

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

Hi Felipe,

On Tue, Dec 21, 2010 at 6:44 PM, Felipe Contreras
<[email protected]> wrote:
> Wouldn't you want the proc_*_dma() operations to finish before
> unmapping the pages?

Yes, but that's not what your patch is doing exactly: it serializes
the execution of proc_begin_dma(), proc_end_dma(), proc_map() and
proc_un_map() using a single global mutex and by that prevents their
concurrent execution (system-wide).

I understand your intention: you don't want proc_un_map() to kick in
in the middle of a proc_*_dma() operation. That's fine, but the patch
has a side effect, which serializes the DMA operations themselves, and
prevents their concurrent execution (even if they are for separate
memory addresses, or invoked by different processes, etc...).

Looking ahead, this DMM code is going to be extracted into an
independent module, where it will be shared between dspbridge and
syslink (the IPC framework for OMAP4 and forth): both projects use
this DMM concept, and it won't make sense for syslink to duplicate the
code. So even if today's dspbridge use cases don't have a lot of
concurrency, and performance is satisfying even in light of your
patch, we still want the code to be ready for more demanding use cases
(several remote processors, several multimedia processes running on
the host, several concurrent multimedia streams, etc...). If we use
coarse-grained locking now, it will just make it harder to remove it
later when scalability issues will show up (see the BKL removal
efforts...).

So I prefer we don't add any locking to the proc_*_dma() API at all.
Instead, let's just take a reference count every time a map_obj is
found (and therefore is about to be used), and release it after it is
used. And now, inside proc_un_map(), if we notice that our map_obj is
still being used, it means the application called us prematurely and
we can't proceed. Something like (revised the patch from my previous
email):

diff --git a/drivers/staging/tidspbridge/include/dspbridge/drv.h b/drivers/staging/tidspbridge/include/dspbridge/drv.h
index c1f363e..cad0626 100644
--- a/drivers/staging/tidspbridge/include/dspbridge/drv.h
+++ b/drivers/staging/tidspbridge/include/dspbridge/drv.h
@@ -25,6 +25,7 @@

#include <dspbridge/drvdefs.h>
#include <linux/idr.h>
+#include <linux/atomic.h>

#define DRV_ASSIGN 1
#define DRV_RELEASE 0
@@ -106,6 +107,7 @@ struct dmm_map_object {
u32 num_usr_pgs;
struct page **pages;
struct bridge_dma_map_info dma_info;
+ atomic_t refcnt;
};

/* Used for DMM reserved memory accounting */
diff --git a/drivers/staging/tidspbridge/rmgr/proc.c b/drivers/staging/tidspbridge/rmgr/proc.c
index b47d7aa..d060692 100644
--- a/drivers/staging/tidspbridge/rmgr/proc.c
+++ b/drivers/staging/tidspbridge/rmgr/proc.c
@@ -112,9 +112,37 @@ static s32 get_envp_count(char **envp);
static char **prepend_envp(char **new_envp, char **envp, s32 envp_elems,
s32 cnew_envp, char *sz_var);

+/* Increase map_obj's reference count */
+static inline void map_obj_get(struct dmm_map_object *map_obj)
+{
+ atomic_inc(&map_obj->refcnt);
+}
+
+/* Decrease map_obj's reference count */
+static inline void map_obj_put(struct dmm_map_object *map_obj)
+{
+ atomic_dec(&map_obj->refcnt);
+}
+
+/**
+ * is_map_obj_used() - check out whether a given map_obj is still being used
+ * @map_obj: a dmm map object
+ *
+ * Returns 'true' if a given map_obj is being used, or 'false' otherwise.
+ *
+ * This function must be used while the dmm map list spinlock is taken,
+ * otherwise it is not safe to use its answer (if the list is not locked,
+ * someone can find and start using a map_obj immediately after this functions
+ * replys 'false'.
+ */
+static inline bool is_map_obj_used(struct dmm_map_object *map_obj)
+{
+ return atomic_read(&map_obj->refcnt) ? true : false;
+}
+
/* remember mapping information */
-static struct dmm_map_object *add_mapping_info(struct process_context *pr_ctxt,
- u32 mpu_addr, u32 dsp_addr, u32 size)
+static struct dmm_map_object *create_mapping_info(u32 mpu_addr, u32 dsp_addr,
+ u32 size)
{
struct dmm_map_object *map_obj;

@@ -144,11 +172,15 @@ static struct dmm_map_object *add_mapping_info(struct process_context *pr_ctxt,
map_obj->size = size;
map_obj->num_usr_pgs = num_usr_pgs;

+ return map_obj;
+}
+
+static void add_mapping_info(struct process_context *pr_ctxt,
+ struct dmm_map_object *map_obj)
+{
spin_lock(&pr_ctxt->dmm_map_lock);
list_add(&map_obj->link, &pr_ctxt->dmm_map_list);
spin_unlock(&pr_ctxt->dmm_map_lock);
-
- return map_obj;
}

static int match_exact_map_obj(struct dmm_map_object *map_obj,
@@ -162,10 +194,18 @@ static int match_exact_map_obj(struct dmm_map_object *map_obj,
map_obj->size == size;
}

-static void remove_mapping_information(struct process_context *pr_ctxt,
+static void free_mapping_info(struct dmm_map_object *map_obj)
+{
+ kfree(map_obj->dma_info.sg);
+ kfree(map_obj->pages);
+ kfree(map_obj);
+}
+
+static int remove_mapping_information(struct process_context *pr_ctxt,
u32 dsp_addr, u32 size)
{
struct dmm_map_object *map_obj;
+ int ret = 0;

pr_debug("%s: looking for virt 0x%x size 0x%x\n", __func__,
dsp_addr, size);
@@ -180,10 +220,12 @@ static void remove_mapping_information(struct process_context *pr_ctxt,

if (match_exact_map_obj(map_obj, dsp_addr, size)) {
pr_debug("%s: match, deleting map info\n", __func__);
+ if (is_map_obj_used(map_obj)) {
+ ret = -EBUSY;
+ goto out;
+ }
list_del(&map_obj->link);
- kfree(map_obj->dma_info.sg);
- kfree(map_obj->pages);
- kfree(map_obj);
+ free_mapping_info(map_obj);
goto out;
}
pr_debug("%s: candidate didn't match\n", __func__);
@@ -192,6 +234,7 @@ static void remove_mapping_information(struct process_context *pr_ctxt,
pr_err("%s: failed to find given map info\n", __func__);
out:
spin_unlock(&pr_ctxt->dmm_map_lock);
+ return ret;
}

static int match_containing_map_obj(struct dmm_map_object *map_obj,
@@ -203,6 +246,11 @@ static int match_containing_map_obj(struct dmm_map_object *map_obj,
mpu_addr + size <= map_obj_end;
}

+/*
+ * Find a dmm map object that contains the given memory block,
+ * and increase its reference count to prevent its destruction
+ * until released
+ */
static struct dmm_map_object *find_containing_mapping(
struct process_context *pr_ctxt,
u32 mpu_addr, u32 size)
@@ -220,6 +268,7 @@ static struct dmm_map_object *find_containing_mapping(
map_obj->size);
if (match_containing_map_obj(map_obj, mpu_addr, size)) {
pr_debug("%s: match!\n", __func__);
+ map_obj_get(map_obj);
goto out;
}

@@ -795,6 +844,9 @@ int proc_begin_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
status = -EFAULT;
}

+ /* release the reference count we took in the find above */
+ map_obj_put(map_obj);
+
err_out:

return status;
@@ -831,9 +883,11 @@ int proc_end_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
pr_err("%s: InValid address parameters %p %x\n",
__func__, pmpu_addr, ul_size);
status = -EFAULT;
- goto err_out;
}

+ /* release the reference count we took in the find above */
+ map_obj_put(map_obj);
+
err_out:
return status;
}
@@ -1356,7 +1410,7 @@ int proc_map(void *hprocessor, void *pmpu_addr, u32 ul_size,
u32 size_align;
int status = 0;
struct proc_object *p_proc_object = (struct proc_object *)hprocessor;
- struct dmm_map_object *map_obj;
+ struct dmm_map_object *map_obj = NULL;
u32 tmp_addr = 0;

#ifdef CONFIG_TIDSPBRIDGE_CACHE_LINE_CHECK
@@ -1394,8 +1448,7 @@ int proc_map(void *hprocessor, void *pmpu_addr, u32 ul_size,
/* Mapped address = MSB of VA | LSB of PA */
tmp_addr = (va_align | ((u32) pmpu_addr & (PG_SIZE4K - 1)));
/* mapped memory resource tracking */
- map_obj = add_mapping_info(pr_ctxt, pa_align, tmp_addr,
- size_align);
+ map_obj = create_mapping_info(pa_align, tmp_addr, size_align);
if (!map_obj)
status = -ENOMEM;
else
@@ -1406,10 +1459,15 @@ int proc_map(void *hprocessor, void *pmpu_addr, u32 ul_size,
if (!status) {
/* Mapped address = MSB of VA | LSB of PA */
*pp_map_addr = (void *) tmp_addr;
+ /* Actually add the new map_obj and make it usable */
+ add_mapping_info(pr_ctxt, map_obj);
} else {
- remove_mapping_information(pr_ctxt, tmp_addr, size_align);
+ /* if a map_obj was created, let it go */
+ if (map_obj)
+ free_mapping_info(map_obj);
dmm_un_map_memory(dmm_mgr, va_align, &size_align);
}
+
mutex_unlock(&proc_lock);

if (status)
@@ -1715,6 +1773,24 @@ int proc_un_map(void *hprocessor, void *map_addr,

/* Critical section */
mutex_lock(&proc_lock);
+
+ /*
+ * Remove the map_obj first, so it won't be used after the region is
+ * unmapped.
+ *
+ * Note: if the map_obj is still in use, the application messed up and
+ * called proc_un_map() before its proc_*_dma() operations completed.
+ * In this case we just return an error and let the application
+ * deal with it (e.g. by calling us again).
+ */
+ status = remove_mapping_information(pr_ctxt, (u32) map_addr,
+ size_align);
+ if (status == -EBUSY) {
+ dev_err(bridge, "%s: map_obj is still in use (addr: 0x%p)\n",
+ __func__, map_addr);
+ goto unlock_proc;
+ }
+
/*
* Update DMM structures. Get the size to unmap.
* This function returns error if the VA is not mapped
@@ -1726,17 +1802,11 @@ int proc_un_map(void *hprocessor, void *map_addr,
(p_proc_object->hbridge_context, va_align, size_align);
}

+unlock_proc:
mutex_unlock(&proc_lock);
if (status)
goto func_end;

- /*
- * A successful unmap should be followed by removal of map_obj
- * from dmm_map_list, so that mapped memory resource tracking
- * remains uptodate
- */
- remove_mapping_information(pr_ctxt, (u32) map_addr, size_align);
-
func_end:
dev_dbg(bridge, "%s: hprocessor: 0x%p map_addr: 0x%p status: 0x%x\n",
__func__, hprocessor, map_addr, status);


Additionally, the patch has two other changes:
- proc_map: don't register a new map_obj to the dmm_map_list before
the mapping is successful. This prevents a similar race, where
proc_*_dma() can find a memory region before it is really mapped.
- proc_un_map: before unmapping a memory region, first remove it from
the dmm_map_list. The motivation is the same: we want to make sure the
map_obj can't be found by proc_*_dma() after we unmap its region.

Currently, in this patch, if proc_un_map() realizes the map_obj is
still in use, it just returns an error. IMHO this is the right thing
to do, because if it happened it is clear that the application is
misusing the bridge API, by calling proc_un_map() without waiting for
the DMA operation to complete.

This way applications will have to fix themselves, because otherwise
they face another race which isn't fixable by the kernel: if
proc_un_map() is fast enough, it can manage to actually unmap the
memory region before proc_begin_dma() manages to make any progress at
all, and in this case, proc_begin_dma()'s cache operation will not
take place at all and then the user's data will surely be corrupted.
Obviously, that can only be fixed in the application itself, so we
better catch those races and fix them in the application.

Having said that, it's still possible to preserve the current bridge
behavior, and instead of letting proc_un_map() return a failure when
the map_obj is still in use, we can make it sleep until the map_obj's
ref count is dropped to zero (using wait and complete)... Not sure we
want that, though... I prefer the application to know if something as bad
as that has happened...
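
For completeness, a rough sketch of that variant (untested; 'refs_gone'
would be a new struct completion field in dmm_map_object, initialized
when the map_obj is created, and the dmm_map_lock would have to be
dropped before sleeping):

/* the last map_obj_put() wakes up a sleeping proc_un_map() */
static inline void map_obj_put(struct dmm_map_object *map_obj)
{
	if (atomic_dec_and_test(&map_obj->refcnt))
		complete(&map_obj->refs_gone);
}

/* in proc_un_map(), instead of returning -EBUSY: */
	if (is_map_obj_used(map_obj))
		wait_for_completion(&map_obj->refs_gone);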

> I have actually documented how I create it:
> http://felipec.wordpress.com/2010/10/08/my-arm-development-notes/
>
> I even provided a tarball with all the GStreamer stuff you can extract
> to /opt/gst and use on any system (with a reasonable glibc):
> http://people.freedesktop.org/~felipec/arm-gst-dev.tar.gz

Thanks! Also thanks for the other bits of information you gave
throughout your reply.

>
> The actual test I'm using to trigger this issue is way more
> complicated, but I could try to simplify it.

Hopefully it's not necessary. I managed to reproduce the panic by
adding manual delays to the code, but anyway it looks like we
understand the bug well enough, so it might not be needed to duplicate
the exact test you had.

> That sounds very dangerous.

I agree. Let's drop that other proposal.

Thanks,
Ohad.

2010-12-27 13:55:31

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

[email protected] wrote:
> On Tue, Dec 21, 2010 at 6:44 PM, Felipe Contreras
> <[email protected]> wrote:
> > Wouldn't you want the proc_*_dma() operations to finish before
> > unmapping the pages?
>
> Yes, but that's not what your patch is doing exactly: it serializes
> the execution of proc_begin_dma(), proc_end_dma(), proc_map() and
> proc_un_map() using a single global mutex and by that prevents their
> concurrent execution (system-wide).
>
> I understand your intention: you don't want proc_un_map() to kick in
> in the middle of a proc_*_dma() operation. That's fine, but the patch
> has a side effect, which serializes the DMA operations themselves, and
> prevents their concurrent execution (even if they are for separate
> memory addresses, or invoked by different processes, etc...).

Yeah, but as I said, the proc_map() and proc_un_map() operations are
_already_ serialized, and those are the main operations used by both
gst-dsp, and TI's openmax, which are the only known users of this
driver.

So, effectively, serializing the proc_begin_dma() and proc_end_dma()
would not affect anyone negatively for the time being.

> Looking ahead, this DMM code is going to be extracted into an
> independent module, where it will be shared between dspbridge and
> syslink (the IPC framework for OMAP4 and forth): both projects use
> this DMM concept, and it won't make sense for syslink to duplicate the
> code. So even if today's dspbridge use cases don't have a lot of
> concurrency, and performance is satisfying even in light of your
> patch, we still want the code to be ready for more demanding use cases
> (several remote processors, several multimedia processes running on
> the host, several concurrent multimedia streams, etc...). If we use
> coarse-grained locking now, it will just make it harder to remove it
> later when scalability issues will show up (see the BKL removal
> efforts...).
>
> So I prefer we don't add any locking to the proc_*_dma() API at all.
> Instead, let's just take a reference count every time a map_obj is
> found (and therefore is about to be used), and release it after it is
> used. And now, inside proc_un_map(), if we notice that our map_obj is
> still being used, it means the application called us prematurely and
> we can't proceed. Something like (revised the patch from my previous
> email):

For the long-term goal I agree with that approach, however, first, I
think my patch should be applied, since it's fixing a problem using an
already existing and actively exercised mechanism. In fact, I think
this should be pushed to 2.6.37 as:

1) prevents a kernel panic
2) the issue is reproducible and clearly identified
3) the patch is small and obvious

> diff --git a/drivers/staging/tidspbridge/include/dspbridge/drv.h b/drivers/staging/tidspbridge/include/dspbridge/drv.h
> index c1f363e..cad0626 100644
> --- a/drivers/staging/tidspbridge/include/dspbridge/drv.h
> +++ b/drivers/staging/tidspbridge/include/dspbridge/drv.h
> @@ -25,6 +25,7 @@
>
> #include <dspbridge/drvdefs.h>
> #include <linux/idr.h>
> +#include <linux/atomic.h>
>
> #define DRV_ASSIGN 1
> #define DRV_RELEASE 0
> @@ -106,6 +107,7 @@ struct dmm_map_object {
> u32 num_usr_pgs;
> struct page **pages;
> struct bridge_dma_map_info dma_info;
> + atomic_t refcnt;
> };
>
> /* Used for DMM reserved memory accounting */
> diff --git a/drivers/staging/tidspbridge/rmgr/proc.c b/drivers/staging/tidspbridge/rmgr/proc.c
> index b47d7aa..d060692 100644
> --- a/drivers/staging/tidspbridge/rmgr/proc.c
> +++ b/drivers/staging/tidspbridge/rmgr/proc.c
> @@ -112,9 +112,37 @@ static s32 get_envp_count(char **envp);
> static char **prepend_envp(char **new_envp, char **envp, s32 envp_elems,
> s32 cnew_envp, char *sz_var);
>
> +/* Increase map_obj's reference count */
> +static inline void map_obj_get(struct dmm_map_object *map_obj)
> +{
> + atomic_inc(&map_obj->refcnt);
> +}
> +
> +/* Decrease map_obj's reference count */
> +static inline void map_obj_put(struct dmm_map_object *map_obj)
> +{
> + atomic_dec(&map_obj->refcnt);
> +}
> +
> +/**
> + * is_map_obj_used() - check out whether a given map_obj is still being used
> + * @map_obj: a dmm map object
> + *
> + * Returns 'true' if a given map_obj is being used, or 'false' otherwise.
> + *
> + * This function must be used while the dmm map list spinlock is taken,
> + * otherwise it is not safe to use its answer (if the list is not locked,
> + * someone can find and start using a map_obj immediately after this function
> + * replies 'false').
> + */
> +static inline bool is_map_obj_used(struct dmm_map_object *map_obj)
> +{
> + return atomic_read(&map_obj->refcnt) ? true : false;
> +}
> +
> /* remember mapping information */
> -static struct dmm_map_object *add_mapping_info(struct process_context *pr_ctxt,
> - u32 mpu_addr, u32 dsp_addr, u32 size)
> +static struct dmm_map_object *create_mapping_info(u32 mpu_addr, u32 dsp_addr,
> + u32 size)
> {
> struct dmm_map_object *map_obj;
>
> @@ -144,11 +172,15 @@ static struct dmm_map_object *add_mapping_info(struct process_context *pr_ctxt,
> map_obj->size = size;
> map_obj->num_usr_pgs = num_usr_pgs;
>
> + return map_obj;
> +}
> +
> +static void add_mapping_info(struct process_context *pr_ctxt,
> + struct dmm_map_object *map_obj)
> +{
> spin_lock(&pr_ctxt->dmm_map_lock);
> list_add(&map_obj->link, &pr_ctxt->dmm_map_list);
> spin_unlock(&pr_ctxt->dmm_map_lock);
> -
> - return map_obj;
> }
>
> static int match_exact_map_obj(struct dmm_map_object *map_obj,
> @@ -162,10 +194,18 @@ static int match_exact_map_obj(struct dmm_map_object *map_obj,
> map_obj->size == size;
> }
>
> -static void remove_mapping_information(struct process_context *pr_ctxt,
> +static void free_mapping_info(struct dmm_map_object *map_obj)
> +{
> + kfree(map_obj->dma_info.sg);
> + kfree(map_obj->pages);
> + kfree(map_obj);
> +}
> +
> +static int remove_mapping_information(struct process_context *pr_ctxt,
> u32 dsp_addr, u32 size)
> {
> struct dmm_map_object *map_obj;
> + int ret = 0;
>
> pr_debug("%s: looking for virt 0x%x size 0x%x\n", __func__,
> dsp_addr, size);
> @@ -180,10 +220,12 @@ static void remove_mapping_information(struct process_context *pr_ctxt,
>
> if (match_exact_map_obj(map_obj, dsp_addr, size)) {
> pr_debug("%s: match, deleting map info\n", __func__);
> + if (is_map_obj_used(map_obj)) {
> + ret = -EBUSY;
> + goto out;
> + }
> list_del(&map_obj->link);
> - kfree(map_obj->dma_info.sg);
> - kfree(map_obj->pages);
> - kfree(map_obj);
> + free_mapping_info(map_obj);
> goto out;
> }
> pr_debug("%s: candidate didn't match\n", __func__);
> @@ -192,6 +234,7 @@ static void remove_mapping_information(struct process_context *pr_ctxt,
> pr_err("%s: failed to find given map info\n", __func__);
> out:
> spin_unlock(&pr_ctxt->dmm_map_lock);
> + return ret;
> }
>
> static int match_containing_map_obj(struct dmm_map_object *map_obj,
> @@ -203,6 +246,11 @@ static int match_containing_map_obj(struct dmm_map_object *map_obj,
> mpu_addr + size <= map_obj_end;
> }
>
> +/*
> + * Find a dmm map object that contains the given memory block,
> + * and increase its reference count to prevent its destruction
> + * until released
> + */
> static struct dmm_map_object *find_containing_mapping(
> struct process_context *pr_ctxt,
> u32 mpu_addr, u32 size)
> @@ -220,6 +268,7 @@ static struct dmm_map_object *find_containing_mapping(
> map_obj->size);
> if (match_containing_map_obj(map_obj, mpu_addr, size)) {
> pr_debug("%s: match!\n", __func__);
> + map_obj_get(map_obj);
> goto out;
> }
>
> @@ -795,6 +844,9 @@ int proc_begin_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
> status = -EFAULT;
> }
>
> + /* release the reference count we took in the find above */
> + map_obj_put(map_obj);
> +
> err_out:
>
> return status;
> @@ -831,9 +883,11 @@ int proc_end_dma(void *hprocessor, void *pmpu_addr, u32 ul_size,
> pr_err("%s: InValid address parameters %p %x\n",
> __func__, pmpu_addr, ul_size);
> status = -EFAULT;
> - goto err_out;
> }
>
> + /* release the reference count we took in the find above */
> + map_obj_put(map_obj);
> +
> err_out:
> return status;
> }
> @@ -1356,7 +1410,7 @@ int proc_map(void *hprocessor, void *pmpu_addr, u32 ul_size,
> u32 size_align;
> int status = 0;
> struct proc_object *p_proc_object = (struct proc_object *)hprocessor;
> - struct dmm_map_object *map_obj;
> + struct dmm_map_object *map_obj = NULL;
> u32 tmp_addr = 0;
>
> #ifdef CONFIG_TIDSPBRIDGE_CACHE_LINE_CHECK
> @@ -1394,8 +1448,7 @@ int proc_map(void *hprocessor, void *pmpu_addr, u32 ul_size,
> /* Mapped address = MSB of VA | LSB of PA */
> tmp_addr = (va_align | ((u32) pmpu_addr & (PG_SIZE4K - 1)));
> /* mapped memory resource tracking */
> - map_obj = add_mapping_info(pr_ctxt, pa_align, tmp_addr,
> - size_align);
> + map_obj = create_mapping_info(pa_align, tmp_addr, size_align);
> if (!map_obj)
> status = -ENOMEM;
> else
> @@ -1406,10 +1459,15 @@ int proc_map(void *hprocessor, void *pmpu_addr, u32 ul_size,
> if (!status) {
> /* Mapped address = MSB of VA | LSB of PA */
> *pp_map_addr = (void *) tmp_addr;
> + /* Actually add the new map_obj and make it usable */
> + add_mapping_info(pr_ctxt, map_obj);
> } else {
> - remove_mapping_information(pr_ctxt, tmp_addr, size_align);
> + /* if a map_obj was created, let it go */
> + if (map_obj)
> + free_mapping_info(map_obj);
> dmm_un_map_memory(dmm_mgr, va_align, &size_align);
> }
> +
> mutex_unlock(&proc_lock);
>
> if (status)
> @@ -1715,6 +1773,24 @@ int proc_un_map(void *hprocessor, void *map_addr,
>
> /* Critical section */
> mutex_lock(&proc_lock);
> +
> + /*
> + * Remove the map_obj first, so it won't be used after the region is
> + * unmapped.
> + *
> + * Note: if the map_obj is still in use, the application messed up and
> + * called proc_un_map() before its proc_*_dma() operations completed.
> + * In this case we just return an error and let the application
> + * deal with it (e.g. by calling us again).
> + */
> + status = remove_mapping_information(pr_ctxt, (u32) map_addr,
> + size_align);
> + if (status == -EBUSY) {
> + dev_err(bridge, "%s: map_obj is still in use (addr: 0x%p)\n",
> + __func__, map_addr);
> + goto unlock_proc;
> + }
> +
> /*
> * Update DMM structures. Get the size to unmap.
> * This function returns error if the VA is not mapped
> @@ -1726,17 +1802,11 @@ int proc_un_map(void *hprocessor, void *map_addr,
> (p_proc_object->hbridge_context, va_align, size_align);
> }
>
> +unlock_proc:
> mutex_unlock(&proc_lock);
> if (status)
> goto func_end;
>
> - /*
> - * A successful unmap should be followed by removal of map_obj
> - * from dmm_map_list, so that mapped memory resource tracking
> - * remains uptodate
> - */
> - remove_mapping_information(pr_ctxt, (u32) map_addr, size_align);
> -
> func_end:
> dev_dbg(bridge, "%s: hprocessor: 0x%p map_addr: 0x%p status: 0x%x\n",
> __func__, hprocessor, map_addr, status);

This approach looks cleaner; however, we need a flag in
remove_mapping_information() in order to force the removal; otherwise,
there will be memory leaks. Imagine a process crashes, and
remove_mapping_information() returns -EBUSY.
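
Something like this, roughly (an untested sketch on top of your patch;
proc_un_map() would pass 'false', while the process cleanup path, e.g.
drv_remove_all_resources(), would pass 'true'):

static int remove_mapping_information(struct process_context *pr_ctxt,
				      u32 dsp_addr, u32 size, bool force)
{
	...
	if (match_exact_map_obj(map_obj, dsp_addr, size)) {
		/* refuse only when the caller can handle -EBUSY */
		if (is_map_obj_used(map_obj) && !force) {
			ret = -EBUSY;
			goto out;
		}
		list_del(&map_obj->link);
		free_mapping_info(map_obj);
		goto out;
	}
	...
}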

> Additionally, the patch has two other changes:
> - proc_map: don't register a new map_obj to the dmm_map_list before
> the mapping is successful. This prevents a similar race, where
> proc_*_dma() can find a memory region before it is really mapped.
> - proc_un_map: before unmapping a memory region, first remove it from
> the dmm_map_list. The motivation is the same: we want to make sure the
> map_obj can't be found by proc_*_dma() after we unmap its region.
>
> Currently, in this patch, if proc_un_map() realizes the map_obj is
> still in use, it just returns an error. IMHO this is the right thing
> to do, because if it happened it is clear that the application is
> misusing the bridge API, by calling proc_un_map() without waiting for
> the DMA operation to complete.
>
> This way applications will have to fix themselves, because otherwise
> they face another race which isn't fixable by the kernel: if
> proc_un_map() is fast enough, it can manage to actually unmap the
> memory region before proc_begin_dma() manages to make any progress at
> all, and in this case, proc_begin_dma()'s cache operation will not
> take place at all and then the user's data will surely be corrupted.
> Obviously, that can only be fixed in the application itself, so we
> better catch those races and fix them in the application.
>
> Having said that, it's still possible to preserve the current bridge
> behavior, and instead of letting proc_un_map() return a failure when
> the map_obj is still in use, we can make it sleep until the map_obj's
> ref count is dropped to zero (using wait and complete)... Not sure we
> want that, though... I prefer the application to know if something as bad
> as that has happened...

Sure, but I see this as a broader effort to have finer locking, part of
this should be to remove the already existing proc_lock. For now however
I don't think it's feasible to get this into 2.6.37, while my patch
should.

Cheers.

--
Felipe Contreras

2010-12-28 06:33:52

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Mon, Dec 27, 2010 at 3:55 PM, Felipe Contreras
<[email protected]> wrote:
> So, effectively, serializing the proc_begin_dma() and proc_end_dma()
> would not affect anyone negatively for the time being.

You can never really tell who is using the kernel (or will be using
this kernel version), how and under which workload.

> For the long-term goal I agree with that approach, however, first, I
> think my patch should be applied, since it's fixing a problem using an
> already existing and actively exercised mechanism. In fact, I think
> this should be pushed to 2.6.37 as:
>
> 1) prevents a kernel panic
> 2) the issue is reproducible and clearly identified
> 3) the patch is small and obvious

Both patches are (IMHO). But frankly I don't mind if your patch is
applied now, as long as it doesn't stay. I can rebase my patch against
it after it is applied, and send it separately.

> This approach looks cleaner, however, we need a flag in
> remove_mapping_information() in order to force the removal, otherwise
> there will be memory leaks. Imagine a process crashes, and
> remove_mapping_information() returns -EBUSY.

Can't happen; both proc_*_dma() operations decrease the reference
count before exiting, so it's not up to the application.

> Sure, but I see this as a broader effort to have finer locking, part of
> this should be to remove the already existing proc_lock.

Having bad locking is not an excuse for adding more.

Anyway, as I said, I don't really mind if your patch is applied, as
long as it is a temporary workaround.

Thanks,
Ohad.

2010-12-28 10:37:00

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 8:33 AM, Ohad Ben-Cohen <[email protected]> wrote:
> On Mon, Dec 27, 2010 at 3:55 PM, Felipe Contreras
> <[email protected]> wrote:
>> So, effectively, serializing the proc_begin_dma() and proc_end_dma()
>> would not affect anyone negatively for the time being.
>
> You can never really tell who is using the kernel (or will be using
> this kernel version), how and under which workload.

No, but it's better to address real issues rather than hypothetical.

However, as I said, everybody is using proc_map() and proc_un_map()
which take a lock, and there are no complaints. This patch would make
proc_begin_dma() and proc_end_dma() as slow as the map operations, so
even these hypothetical users would not get affected negatively.

>> For the long-term goal I agree with that approach, however, first, I
>> think my patch should be applied, since it's fixing a problem using an
>> already existing and actively excersised mechanism. In fact, I think
>> this should be pushed to 2.6.37 as:
>>
>>  1) prevents a kernel panic
>>  2) the issue is reproducible and clearly identified
>>  3) the patch is small and obvious
>
> Both patches are (IMHO). But frankly I don't mind if your patch is
> applied now, as long as it doesn't stay. I can rebase my patch against
> it after it is applied, and send it separately.

Ok, can I get your Ack? I guess Omar would be able to push it to Greg,
and perhaps it would make .37.

>> This approach looks cleaner, however, we need a flag in
>> remove_mapping_information() in order to force the removal, otherwise
>> there will be memory leaks. Imagine a process crashes, and
>> remove_mapping_information() returns -EBUSY.
>
> Can't happen; both proc_*_dma() operations decrease the reference
> count before exiting, so it's not up to the application.

Then why did you add that check for is_map_obj_used(), and then return
-EBUSY? If that can happen, then it can happen when the application is
crashing; user-space crashes while kernel-space is in the middle of a
proc_*_dma() operation.

>> Sure, but I see this as a broader effort to have finer locking, part of
>> this should be to remove the already existing proc_lock.
>
> Having bad locking is not an excuse for adding more.

No, but not being a permanent solution is not an excuse for not fixing
a kernel panic right away.

> Anyway, as I said, I don't really mind your patch will be applied as
> long as it is a temporary workaround.

Can you Ack?

--
Felipe Contreras

2010-12-28 10:57:09

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 12:36 PM, Felipe Contreras
<[email protected]> wrote:
>> You can never really tell who is using the kernel (or will be using
>> this kernel version), how and under which workload.
>
> No, but it's better to address real issues rather than hypothetical.

Right. And both patches do that. One adds locks, and one doesn't.

> However, as I said, everybody is using proc_map() and proc_un_map()
> which take a lock, and there are no complaints.

I didn't complain about that; I didn't like you adding locks to the DMA API.

> Ok, can I get your Ack?

Frankly, I don't like the locks you are adding. But as I said, I
wouldn't resist them as long as it's temporary.

> Then why did you add that check for is_map_obj_used(), and then return
> -EBUSY? If that can happen, then it can happen when the application is
> crashing; user-space crashes while kernel-space is in the middle of a
> proc_*_dma() operation.

I still don't know how exactly you triggered the bug: is gst-dsp
multithreaded ? and one of its threads invoked proc_un_map() while
another thread called proc_begin_dma() ?

Anyhow, a thread that is calling proc_*_dma() will both increase the
reference count and decrease it back before going back to user space.
Otherwise your patch would be problematic as well - who will unlock
the mutex you take in proc_*_dma() ?

>>> Sure, but I see this as a broader effort to have finer locking, part of
>>> this should be to remove the already existing proc_lock.
>>
>> Having bad locking is not an excuse for adding more.
>
> No, but not being a permanent solution is not an excuse for not fixing
> a kernel panic right away.

Right. But we have a fix that doesn't add any additional locking... I
don't see why it can't be taken now, but as I said, I wouldn't resist
staging it for the next cycle.

2010-12-28 12:12:45

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 12:56 PM, Ohad Ben-Cohen <[email protected]> wrote:
> On Tue, Dec 28, 2010 at 12:36 PM, Felipe Contreras
> <[email protected]> wrote:
>>> You can never really tell who is using the kernel (or will be using
>>> this kernel version), how and under which workload.
>>
>> No, but it's better to address real issues rather than hypothetical.
>
> Right. And both patches do that. One adds locks, and one doesn't.
>
>> However, as I said, everybody is using proc_map() and proc_un_map()
>> which take a lock, and there are no complaints.
>
> I didn't complain about that; I didn't like you adding locks to the DMA API.
>
>> Ok, can I get your Ack?
>
> Frankly, I don't like the locks you are adding. But as I said, I
> wouldn't resist them as long as it's temporary.
>
>> Then why did you add that check for is_map_obj_used(), and then return
>> -EBUSY? If that can happen, then it can happen when the application is
>> crashing; user-space crashes while kernel-space is in the middle of a
>> proc_*_dma() operation.
>
> I still don't know how exactly you triggered the bug: is gst-dsp
> multithreaded ? and one of its threads invoked proc_un_map() while
> another thread called proc_begin_dma() ?

I haven't investigated why that happens, but kernel-space should not
panic regardless of what user-space does.

> Anyhow, a thread that is calling proc_*_dma() will both increase the
> reference count and decrease it back before going back to user space.
> Otherwise your patch would be problematic as well - who will unlock
> the mutex you take in proc_*_dma() ?

I'm saying that user-space might crash *before* proc_*_dma() finishes,
before the reference count has been decreased.

In my patch there would be no issue because proc_un_map() would wait
until proc_*_dma() has released the lock.

>>>> Sure, but I see this as a broader effort to have finer locking, part of
>>>> this should be to remove the already existing proc_lock.
>>>
>>> Having bad locking is not an excuse for adding more.
>>
>> No, but not being a permanent solution is not an excuse for not fixing
>> a kernel panic right away.
>
> Right. But we have a fix that doesn't add any additional locking... I
> don't see why it can't be taken now, but as I said, I wouldn't resist
> staging it for the next cycle.

Because:
1) Your patch changes 114 lines, mine 18
2) It hasn't been reviewed, nor tested by other people
3) At least I see a potential issue
4) 2.6.37 is imminent

IMO one patch has a chance of going into 2.6.37, the other one doesn't.
I don't see the problem with pushing my patch to 2.6.37, and once your
patch has been properly reviewed and tested, queuing it for the 2.6.38
merge window. Anyway, it's looking more and more like this patch will
not be ack'ed in time, so I guess we will have to wait and see what
other people (Omar?) say, which would probably mean the 2.6.38 timeline.

Cheers.

--
Felipe Contreras

2010-12-28 12:18:24

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 2:12 PM, Felipe Contreras
<[email protected]> wrote:
> I haven't investigated why that happens, but kernel-space should not
> panic regardless of what user-space does.

Agree of course. But that's not what we were discussing...

>> Anyhow, a thread that is calling proc_*_dma() will both increase the
>> reference count and decrease it back before going back to user space.
>> Otherwise your patch would be problematic as well - who will unlock
>> the mutex you take in proc_*_dma() ?
>
> I'm saying that user-space might crash *before* proc_*_dma() finishes,
> before the reference count has been decreased.
>
> In my patch there would be no issue because proc_un_map() would wait
> until proc_*_dma() has released the lock.

But what will happen if, as you say, user-space would crash before
proc_*_dma() has released the lock ? how could proc_un_map() run ?

> Because:
> 1) Your patch changes 114 lines, mine 18
> 2) It hasn't been reviewed, nor tested by other people
> 3) At least I see a potential issue
> 4) 2.6.37 is imminent
>
> IMO one patch has chances going into 2.6.37, the other one doesn't. I
> don't see the problem of pushing my patch to 2.6.37, and once your
> patch has been properly reviewed, and tested, put it for the 2.6.38
> merge window. Anyway, it's looking more and more that this patch would
> not be ack'ed in time, so I guess we would have to wait to see what
> other people (Omar?) say, which would probably be 2.6.38 timeline.

This is all good, and I have no problem with it. As I said, I don't
resist your patch as a temporary fix. But it doesn't mean I like it...

2010-12-28 12:24:07

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 2:18 PM, Ohad Ben-Cohen <[email protected]> wrote:
> On Tue, Dec 28, 2010 at 2:12 PM, Felipe Contreras
> <[email protected]> wrote:
>> I haven't investigated why that happens, but kernel-space should not
>> panic regardless of what user-space does.
>
> Agree of course. But that's not what we were discussing...

Well, hopefully after applying your patch it would be easier to figure that out.

>>> Anyhow, a thread that is calling proc_*_dma() will both increase the
>>> reference count and decrease it back before going back to user space.
>>> Otherwise your patch would be problematic as well - who will unlock
>>> the mutex you take in proc_*_dma() ?
>>
>> I'm saying that user-space might crash *before* proc_*_dma() finishes,
>> before the reference count has been decreased.
>>
>> In my patch there would be no issue because proc_un_map() would wait
>> until proc_*_dma() has released the lock.
>
> But what will happen if, as you say, user-space would crash before
> proc_*_dma() has released the lock ? how could proc_un_map() run ?

user-space crashed, not kernel-space; the code would continue to run
and eventually release the lock.

> This is all good, and I have no problem with it. As I said, I don't
> resist your patch as a temporary fix. But it doesn't mean I like it...

Yeah, so the chances of getting this fixed in 2.6.37 are dim.

--
Felipe Contreras

2010-12-28 12:37:48

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 2:24 PM, Felipe Contreras
<[email protected]> wrote:
> user-space crashed, not kernel-space; the code would continue to run
> and eventually release the lock.

So you'll have to be more specific about the scenario you are describing.

If there's a user thread that is still running the proc_*_dma()
function, and we agree that this thread keeps running until completion
and then returns to user space, what's the problem ?

If that user thread crashes, drv_remove_all_resources() will clean
up all map_obj's.

> Yeah, so the chances of getting this fixed in 2.6.37 are dim.

I wouldn't worry about that.. In the worst case, "Cc:
[email protected]" will push the fix into 2.6.37.x..

2010-12-28 14:49:09

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 2:12 PM, Felipe Contreras
<[email protected]> wrote:
> On Tue, Dec 28, 2010 at 12:56 PM, Ohad Ben-Cohen <[email protected]> wrote:
>> I still don't know how exactly you triggered the bug: is gst-dsp
>> multithreaded ? and one of its threads invoked proc_un_map() while
>> another thread called proc_begin_dma() ?
>
> I haven't investigated why that happens

Btw, I still think you should look into this.

The kernel panic will be solved, but you may still have a race there
that can lead to data corruption: if proc_un_map will be fast enough,
it will acquire the proc_lock mutex before proc_begin_dma(), and then
you will miss a cache operation.

2010-12-29 09:39:11

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 2:37 PM, Ohad Ben-Cohen <[email protected]> wrote:
> On Tue, Dec 28, 2010 at 2:24 PM, Felipe Contreras
> <[email protected]> wrote:
>> user-space crashed, not kernel-space; the code would continue to run
>> and eventually release the lock.
>
> So you'll have to be more specific about the scenario you are describing.
>
> If there's a user thread that is still running the proc_*_dma()
> function, and we agree that this thread keeps running until completion
> and then returns to user space, what's the problem ?

The problem is if the user-space process crashes exactly in the middle
of it, *before* completing. With locks there's no problem, as
proc_un_map() would wait for the lock in my patch. In your patch it
would not wait, just return -EBUSY.

> If that user thread crashes, drv_remove_all_resources() will clean
> up all map_obj's.

Not if a proc_*_dma() is still running.

There's a very narrow timing window where this could happen, but it is there.

--
Felipe Contreras

2010-12-29 09:42:23

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Tue, Dec 28, 2010 at 4:48 PM, Ohad Ben-Cohen <[email protected]> wrote:
> On Tue, Dec 28, 2010 at 2:12 PM, Felipe Contreras
> <[email protected]> wrote:
>> On Tue, Dec 28, 2010 at 12:56 PM, Ohad Ben-Cohen <[email protected]> wrote:
>>> I still don't know how exactly you triggered the bug: is gst-dsp
>>> multithreaded ? and one of its threads invoked proc_un_map() while
>>> another thread called proc_begin_dma() ?
>>
>> I haven't investigated why that happens
>
> Btw, I still think you should look into this.
>
> The kernel panic will be solved, but you may still have a race there
> that can lead to data corruption: if proc_un_map will be fast enough,
> it will acquire the proc_lock mutex before proc_begin_dma(), and then
> you will miss a cache operation.

Acquiring the lock is the first thing done; if proc_un_map() acquires
the lock first, it's because it was run first, and thus a problem for
user-space. If user-space wants the cache operation, it must run
proc_begin_dma() first, there's nothing kernel-space can do to fix
that.

--
Felipe Contreras

2010-12-29 09:46:47

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Wed, Dec 29, 2010 at 11:39 AM, Felipe Contreras
<[email protected]> wrote:
> On Tue, Dec 28, 2010 at 2:37 PM, Ohad Ben-Cohen <[email protected]> wrote:
>> On Tue, Dec 28, 2010 at 2:24 PM, Felipe Contreras
>> <[email protected]> wrote:
>>> user-space crashed, not kernel-space; the code would continue to run
>>> and eventually release the lock.
>>
>> So you'll have to be more specific about the scenario you are describing.
>>
>> If there's a user thread that is still running the proc_*_dma()
>> function, and we agree that this thread keeps running until completion
>> and then returns to user space, what's the problem ?
>
> The problem is if the user-space process crashes exactly in the middle
> of it, *before* completing. With locks there's no problem, as
> proc_un_map() would wait for the lock in my patch. In your patch it
> would not wait, just return -EBUSY.

We have two threads.

One called proc_un_map(), and one called proc_begin_dma().

The first crashed, but the second didn't. It still holds the bridge
device open. When it exits and releases the device,
drv_remove_all_resources() will be called, and all the map_obj's will
be cleaned up.

>
>> If that user thread crashes, drv_remove_all_resources() will clean
>> up all map_obj's.
>
> Not if a proc_*_dma() is still running.

It will be called after it returns, and its thread exits (or crashes).

2010-12-29 09:51:57

by Ohad Ben Cohen

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Wed, Dec 29, 2010 at 11:42 AM, Felipe Contreras
<[email protected]> wrote:
> On Tue, Dec 28, 2010 at 4:48 PM, Ohad Ben-Cohen <[email protected]> wrote:
>> On Tue, Dec 28, 2010 at 2:12 PM, Felipe Contreras
>> <[email protected]> wrote:
>>> On Tue, Dec 28, 2010 at 12:56 PM, Ohad Ben-Cohen <[email protected]> wrote:
>>>> I still don't know how exactly you triggered the bug: is gst-dsp
>>>> multithreaded ? and one of its threads invoked proc_un_map() while
>>>> another thread called proc_begin_dma() ?
>>>
>>> I haven't investigated why that happens
>>
>> Btw, I still think you should look into this.
>>
>> The kernel panic will be solved, but you may still have a race there
>> that can lead to data corruption: if proc_un_map will be fast enough,
>> it will acquire the proc_lock mutex before proc_begin_dma(), and then
>> you will miss a cache operation.
>
> Acquiring the lock is the first thing done; if proc_un_map() acquires
> the lock first, it's because it was run first

Not true.

Again, we have two threads:

T1 - called proc_begin_dma()

T2 - called proc_un_map()

Let's say T1 called proc_begin_dma() first, but it still hasn't
acquired the lock.

At this point the scheduler lets T2 run. It calls proc_un_map(),
acquires the lock, and frees all map_obj's.

Then the scheduler lets T1 continue to execute, but it can't find any
map_obj, and the required cache operation is not performed.

Of course the kernel has nothing to do with this, and doesn't really care.

That's why I said it's a user-space race - you need to make sure T2
will not call proc_un_map() before T1 (or any other thread you have)
has completed executing all DMA operations (that you care about).
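
Schematically:

/*
 *  T1: proc_begin_dma(buf)            T2: proc_un_map(buf's region)
 *  -----------------------            -----------------------------
 *  enters the function
 *                                     takes proc_lock
 *                                     unmaps region, frees map_obj
 *                                     releases proc_lock
 *  takes proc_lock
 *  find_containing_mapping() fails
 *  -> -EFAULT, cache op never done
 */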

>, and thus a problem for
> user-space. If user-space wants the cache operation, it must run
> proc_begin_dma() first, there's nothing kernel-space can do to fix
> that.

2010-12-29 10:06:28

by Felipe Contreras

Subject: Re: [PATCH v2] staging: tidspbridge: protect dmm_map properly

On Wed, Dec 29, 2010 at 11:46 AM, Ohad Ben-Cohen <[email protected]> wrote:
> On Wed, Dec 29, 2010 at 11:39 AM, Felipe Contreras
> <[email protected]> wrote:
>> On Tue, Dec 28, 2010 at 2:37 PM, Ohad Ben-Cohen <[email protected]> wrote:
>>> On Tue, Dec 28, 2010 at 2:24 PM, Felipe Contreras
>>> <[email protected]> wrote:
>>>> user-space crashed, not kernel-space; the code would continue to run
>>>> and eventually release the lock.
>>>
>>> So you'll have to be more specific about the scenario you are describing.
>>>
>>> If there's a user thread that is still running the proc_*_dma()
>>> function, and we agree that this thread keeps running until completion
>>> and then returns to user space, what's the problem ?
>>
>> The problem is if the user-space process crashes exactly in the middle
>> of it, *before* completing. With locks there's no problem, as
>> proc_un_map() would wait for the lock in my patch. In your patch it
>> would not wait, just return -EBUSY.
>
> We have two threads.
>
> One called proc_un_map(), and one called proc_begin_dma().
>
> The first crashed, but the second didn't. It still holds the bridge
> device open. When it exits and releases the device,
> drv_remove_all_resources() will be called, and all the map_obj's will
> be cleaned up.

I'm not familiar with how crashes are handled; if you say as long as one
task is still running the device release is not called, then I guess
there's no issue.

--
Felipe Contreras