2013-03-18 22:14:40

by Duyck, Alexander H

[permalink] [raw]
Subject: [PATCH 0/2] Address issues in dma-debug API

Christoph Paasch recently reported a "device driver failed to check map
error" on igb. However after reviewing the code there was no possibility of
that. On closer inspection there was a bug in the DMA debug API that was
causing the issue. These two patches address the issues I found.

The first issue I found while trying to implement a workaround. Specifically
the problem is a locking bug which is triggered if a multiple mapped buffer
exists and there is not an exact match for the unmap. This results in the CPU
becoming deadlocked.

The second issue, which was the original problem, is resolved by guaranteeing
that if we call dma_mapping_error we set a matching entry to MAP_ERR_CHECKED
that was not previously set.

I'm not sure if these are critical enough to go into one of the upcoming RC
kernels or if these can wait until the merge since this is in a debugging API.
I'm leaving that for the sub-maintainers to decide.

---

Alexander Duyck (2):
dma-debug: Fix locking bug in check_unmap
dma-debug: Update DMA debug API to better handle multiple mappings of a buffer


lib/dma-debug.c | 42 ++++++++++++++++++++++++++++--------------
1 files changed, 28 insertions(+), 14 deletions(-)

--


2013-03-18 22:15:03

by Duyck, Alexander H

[permalink] [raw]
Subject: [PATCH 2/2] dma-debug: Update DMA debug API to better handle multiple mappings of a buffer

There were reports of the igb driver unmapping buffers without calling
dma_mapping_error. On closer inspection issues were found in the DMA debug
API and how it handled multiple mappings of the same buffer.

The issue I found is the fact that the debug_dma_mapping_error would only set
the map_err_type to MAP_ERR_CHECKED in the case that the was only one match
for device and device address. However in the case of non-IOMMU, multiple
addresses existed and as a result it was not setting this field once a
second mapping was instantiated. I have resolved this by changing the search
so that it instead will now set MAP_ERR_CHECKED on the first buffer that
matches the device and DMA address that is currently in the state
MAP_ERR_NOT_CHECKED.

A secondary side effect of this patch is that in the case of multiple buffers
using the same address only the last mapping will have a valid map_err_type.
The previous mappings will all end up with map_err_type set to
MAP_ERR_CHECKED because of the dma_mapping_error call in debug_dma_map_page.
However this behavior may be preferable as it means you will likely only see
one real error per multi-mapped buffer, versus the current behavior of
multiple false errors mer multi-mapped buffer.

Signed-off-by: Alexander Duyck <[email protected]>
---
lib/dma-debug.c | 24 +++++++++++++++++++-----
1 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 724bd4d..aa465d9 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -1082,13 +1082,27 @@ void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
ref.dev = dev;
ref.dev_addr = dma_addr;
bucket = get_hash_bucket(&ref, &flags);
- entry = bucket_find_exact(bucket, &ref);

- if (!entry)
- goto out;
+ list_for_each_entry(entry, &bucket->list, list) {
+ if (!exact_match(&ref, entry))
+ continue;
+
+ /*
+ * The same physical address can be mapped multiple
+ * times. Without a hardware IOMMU this results in the
+ * same device addresses being put into the dma-debug
+ * hash multiple times too. This can result in false
+ * positives being reported. Therefore we implement a
+ * best-fit algorithm here which updates the first entry
+ * from the hash which fits the reference value and is
+ * not currently listed as being checked.
+ */
+ if (entry->map_err_type == MAP_ERR_NOT_CHECKED) {
+ entry->map_err_type = MAP_ERR_CHECKED;
+ break;
+ }
+ }

- entry->map_err_type = MAP_ERR_CHECKED;
-out:
put_hash_bucket(bucket, &flags);
}
EXPORT_SYMBOL(debug_dma_mapping_error);

2013-03-18 22:15:00

by Duyck, Alexander H

[permalink] [raw]
Subject: [PATCH 1/2] dma-debug: Fix locking bug in check_unmap

In check_unmap it is possible to get into a dead-locked state if
dma_mapping_error is called. The problem is that the bucket is locked in
check_unmap, and locked again by debug_dma_mapping_error which is called by
dma_mapping_error. To resolve that we must release the lock on the bucket
before making the call to dma_mapping_error.

Signed-off-by: Alexander Duyck <[email protected]>
---
lib/dma-debug.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 5e396ac..724bd4d 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -862,17 +862,18 @@ static void check_unmap(struct dma_debug_entry *ref)
entry = bucket_find_exact(bucket, ref);

if (!entry) {
+ /* must drop lock before calling dma_mapping_error */
+ put_hash_bucket(bucket, &flags);
+
if (dma_mapping_error(ref->dev, ref->dev_addr)) {
err_printk(ref->dev, NULL,
- "DMA-API: device driver tries "
- "to free an invalid DMA memory address\n");
- return;
+ "DMA-API: device driver tries to free an invalid DMA memory address\n");
+ } else {
+ err_printk(ref->dev, NULL,
+ "DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x%016llx] [size=%llu bytes]\n",
+ ref->dev_addr, ref->size);
}
- err_printk(ref->dev, NULL, "DMA-API: device driver tries "
- "to free DMA memory it has not allocated "
- "[device address=0x%016llx] [size=%llu bytes]\n",
- ref->dev_addr, ref->size);
- goto out;
+ return;
}

if (ref->size != entry->size) {
@@ -936,7 +937,6 @@ static void check_unmap(struct dma_debug_entry *ref)
hash_bucket_del(entry);
dma_entry_free(entry);

-out:
put_hash_bucket(bucket, &flags);
}

2013-03-19 20:29:13

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 1/2] dma-debug: Fix locking bug in check_unmap

On Mon, 2013-03-18 at 15:12 -0700, Alexander Duyck wrote:
> In check_unmap it is possible to get into a dead-locked state if
> dma_mapping_error is called. The problem is that the bucket is locked in
> check_unmap, and locked again by debug_dma_mapping_error which is called by
> dma_mapping_error. To resolve that we must release the lock on the bucket
> before making the call to dma_mapping_error.
>
> Signed-off-by: Alexander Duyck <[email protected]>

Looks good.

Reviewed-by: Shuah Khan
Tested-by Shuah Khan

Thanks for finding and fixing the problem.
-- Shuah

2013-03-19 20:30:31

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 2/2] dma-debug: Update DMA debug API to better handle multiple mappings of a buffer

On Mon, 2013-03-18 at 15:12 -0700, Alexander Duyck wrote:
> There were reports of the igb driver unmapping buffers without calling
> dma_mapping_error. On closer inspection issues were found in the DMA debug
> API and how it handled multiple mappings of the same buffer.
>
> The issue I found is the fact that the debug_dma_mapping_error would only set
> the map_err_type to MAP_ERR_CHECKED in the case that the was only one match
> for device and device address. However in the case of non-IOMMU, multiple
> addresses existed and as a result it was not setting this field once a
> second mapping was instantiated. I have resolved this by changing the search
> so that it instead will now set MAP_ERR_CHECKED on the first buffer that
> matches the device and DMA address that is currently in the state
> MAP_ERR_NOT_CHECKED.
>
> A secondary side effect of this patch is that in the case of multiple buffers
> using the same address only the last mapping will have a valid map_err_type.
> The previous mappings will all end up with map_err_type set to
> MAP_ERR_CHECKED because of the dma_mapping_error call in debug_dma_map_page.
> However this behavior may be preferable as it means you will likely only see
> one real error per multi-mapped buffer, versus the current behavior of
> multiple false errors mer multi-mapped buffer.
>
> Signed-off-by: Alexander Duyck <[email protected]>
> ---

Looks good. Tested it as well.

Reviewed-by: [email protected]
Tested-by: [email protected]

Thanks for finding and fixing the problem.

-- Shuah