2012-08-10 22:46:47

by Shuah Khan

Subject: dma mapping error check analysis

I analyzed current calls to dma_map_single() and dma_map_page() in the kernel
to see if dma mapping errors are checked after mapping routines return.

Reference: linux-next, August 6, 2012.

This analysis stemmed from the discussion on my patch that disables the
swiotlb overflow buffer as a first step towards removing the support
altogether. Please refer to the thread below:

https://lkml.org/lkml/2012/7/24/391

The goal of this analysis is to find drivers that don't currently check dma
mapping errors and fix them. I did a grep for dma_map_single() and
dma_map_page() and looked at the code that calls these routines. I classified
the error-check status of each call site as follows:

Broken:
1. No error checks
2. Partial checks - In that source file, not all calls are followed by checks.
3. Checks dma mapping errors, but doesn't unmap already-mapped pages when a
mapping error occurs in the middle of a multiple-mapping attempt.

The first two categories are classified as broken and need fixing.

The third one needs fixing, since it leaves dangling mapped pages and holds
on to them, which is equivalent to a memory leak. Some drivers release all
mapped pages when the device closes, but others don't. Skipping the unmap
might be harmless on some architectures, going by comments I found in some
source files.
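
For reference, here is what the missing check looks like when it is present
(a minimal sketch; example_send() and its error handling are hypothetical,
but the dma_map_single()/dma_mapping_error() pattern is the one documented in
Documentation/DMA-API-HOWTO.txt):

        #include <linux/dma-mapping.h>

        static int example_send(struct device *dev, void *buf, size_t len)
        {
                dma_addr_t dma;

                dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);
                if (dma_mapping_error(dev, dma))
                        return -ENOMEM; /* don't hand dma to the hardware */

                /* ... program the device with dma, wait for completion ... */

                dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
                return 0;
        }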

Good:
1. Checks dma mapping errors and unmaps already-mapped pages when a mapping
error occurs in the middle of a multiple-mapping attempt.
2. Checks dma mapping errors without unlikely()
3. Checks dma mapping errors with unlikely()

I lumped the above three cases together as good. Using unlikely() is icing on
the cake, and a minor concern compared to the other problems in this area.
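
To make good case 1 concrete, the unwind on a partial failure looks roughly
like this (a sketch; example_map_pages() and its arguments are hypothetical):

        static int example_map_pages(struct device *dev, struct page **pages,
                                     dma_addr_t *dma, int n)
        {
                int i;

                for (i = 0; i < n; i++) {
                        dma[i] = dma_map_page(dev, pages[i], 0, PAGE_SIZE,
                                              DMA_TO_DEVICE);
                        if (dma_mapping_error(dev, dma[i]))
                                goto unwind;
                }
                return 0;

        unwind:
                /* release only the mappings that succeeded */
                while (--i >= 0)
                        dma_unmap_page(dev, dma[i], PAGE_SIZE, DMA_TO_DEVICE);
                return -ENOMEM;
        }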

- dma_map_single() - results:
No error checks: 195 (46%)
Partial checks: 46 (11%)
Doesn't unmap: 26 (6%)
Good: 147 (35%)

- dma_map_page() - results:
No error checks: 61 (59%)
Partial checks: 7 (7%)
Doesn't unmap: 15 (14.5%)
Good: 20 (19%)

In summary, a large percentage of the call sites (more than 50%) go
unchecked. That raises the following questions:

When do mapping errors get detected?
How often do these errors occur?
Why don't we see failures related to missing dma mapping error checks?
Are they silent failures?

Based on what I found, I am not too eager to remove swiotlb overflow support,
since removing it would increase the probability of returning dma mapping
errors to drivers that never check for them.
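
For context: with the overflow buffer in place, swiotlb never returns an
error from its map routines. When it is full, it hands back the bus address
of the overflow buffer, and the error check is a comparison against that
address (current lib/swiotlb.c):

        int
        swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
        {
                return (dma_addr == swiotlb_virt_to_bus(hwdev,
                                                io_tlb_overflow_buffer));
        }

Without the overflow buffer, a full swiotlb would have to return
DMA_ERROR_CODE, which only helps the drivers that actually call
dma_mapping_error().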

However, I propose the following to gather more information:

- Change swiotlb to log (pr_info() or pr_debug()) cases where the overflow
buffer is triggered. (This is a delta on the disable-swiotlb-overflow patch
I sent a few weeks ago; see the reference above.)
- Change dma_map_single() and dma_map_page() to track how many times they
return mapping errors, before attempting to fix all the places that don't do
dma mapping error checks. (Maybe a counter that keeps count; pr_* is not an
option here. Rough sketch below.)
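
A rough sketch of the kind of counter I have in mind (a sketch only;
dma_map_error_count and dma_map_error_hit() are placeholders, not an
existing API):

        #include <linux/atomic.h>

        static atomic64_t dma_map_error_count = ATOMIC64_INIT(0);

        /* bump in the error path of the mapping routines, instead of printing */
        static inline void dma_map_error_hit(void)
        {
                atomic64_inc(&dma_map_error_count);
        }

The count could then be read out through debugfs rather than flooding the log.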

Comments, thoughts on the analysis and proposal are welcome.

-- Shuah


2012-08-17 14:21:30

by Konrad Rzeszutek Wilk

Subject: Re: dma mapping error check analysis

On Fri, Aug 10, 2012 at 04:46:42PM -0600, Shuah Khan wrote:
> I analyzed current calls to dma_map_single() and dma_map_page() in the kernel
> to see if dma mapping errors are checked after mapping routines return.
>
> [...]
>
> However, I propose the following to gather more information:
>
> - Change swiotlb to log (pr_info() or pr_debug()) cases where the overflow
> buffer is triggered. (This is a delta on the disable-swiotlb-overflow patch
> I sent a few weeks ago; see the reference above.)

As opposed to printk(KERN_ERR)? Why?

> - Change dma_map_single() and dma_map_page() to track how many times they
> return mapping errors, before attempting to fix all the places that don't do
> dma mapping error checks. (Maybe a counter that keeps count; pr_* is not an
> option here. Rough sketch below.)

Perhaps this should be done in the DMA debug API instead?


2012-08-17 18:03:13

by Shuah Khan

Subject: Re: dma mapping error check analysis

On Fri, 2012-08-17 at 10:11 -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 10, 2012 at 04:46:42PM -0600, Shuah Khan wrote:
> > I analyzed current calls to dma_map_single() and dma_map_page() in the kernel
> > to see if dma mapping errors are checked after mapping routines return.
> >
> > [...]
> >
> > However, I propose the following to gather more information:
> >
> > - Change swiotlb to log (pr_info() or pr_debug()) cases where the overflow
> > buffer is triggered. (This is a delta on the disable-swiotlb-overflow patch
> > I sent a few weeks ago; see the reference above.)
>
> As opposed to printk(KERN_ERR)? Why?

printk(KERN_ERR) is just fine.

>
> > - Change dma_map_single() and dma_map_page() to track how many times they
> > return mapping errors, before attempting to fix all the places that don't
> > do dma mapping error checks. (Maybe a counter that keeps count; pr_* is
> > not an option here. Rough sketch below.)
>
> Perhaps this should be done in the DMA debug API instead?

Yes, that is a good idea. I will explore that.

-- Shuah

2012-08-24 21:14:12

by Shuah Khan

Subject: [PATCH] swiotlb: Add support to disable swiotlb overflow buffer with deprecation warning

Add support for disabling the swiotlb overflow buffer by setting its size
to zero, to help test disabled-overflow scenarios and find drivers that
don't check dma mapping errors. Add a kernel error message to track how
often the overflow buffer gets triggered. Add a deprecation warning message
and deprecation schedule documentation as a first step towards removing
overflow support and returning DMA_ERROR_CODE instead, consistent with
other iommu implementations. Once drivers are fixed, overflow support can
be removed.

Tested on x86-64 and compile-tested on IA64.

Signed-off-by: Shuah Khan <[email protected]>
---
Documentation/feature-removal-schedule.txt | 15 ++++++
lib/swiotlb.c | 73 ++++++++++++++++++++--------
2 files changed, 69 insertions(+), 19 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index a52924e..b5938ca 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -646,3 +646,18 @@ Who: Russell King <[email protected]>,
Santosh Shilimkar <[email protected]>

----------------------------
+
+What: SWIOTLB overflow buffer support.
+When: 3.8
+Why: Remove SWIOTLB overflow buffer support and return DMA_ERROR_CODE
+ (a value of zero) to make it consistent with iommu implementation
+ on Intel, AMD, and swiotlb-xen. In 3.7, add logging to track swiotlb
+ buffer overflow triggers to understand how often overflow buffer
+ gets used. The next step is to fix drivers that don't call
+ dma_mapping_error to check for errors returned by the mapping
+ interfaces. Once drivers are fixed overflow support can be removed.
+ If you see any problems related to disabling SWIOTLB overflow buffer,
+ please report to us!
+ E-mail us at: [email protected]
+Who: Shuah Khan <[email protected]> <[email protected]>
+
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 45bc1f8..658f35a 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -92,6 +92,24 @@ static DEFINE_SPINLOCK(io_tlb_lock);

static int late_alloc;

+static void swiotlb_print_overflow_deprecation_notice(void)
+{
+ if (io_tlb_overflow) {
+ pr_warn("SWIOTLB overflow buffer will be deprecated.\n"
+ " If you have a driver that depends on this feature\n"
+ " please Email us at: [email protected],\n"
+ " Shuah Khan ([email protected]), and\n"
+ " Konrad Wilk ([email protected])\n");
+ } else {
+ pr_warn("SWIOTLB overflow buffer is disabled and will be\n"
+ " deprecated. Please report problems related to\n"
+ " disabling overflow buffer to\n"
+ " [email protected],\n"
+ " Shuah Khan ([email protected]), and\n"
+ " Konrad Wilk ([email protected])\n");
+ }
+}
+
static int __init
setup_io_tlb_npages(char *str)
{
@@ -108,7 +126,6 @@ setup_io_tlb_npages(char *str)
return 1;
}
__setup("swiotlb=", setup_io_tlb_npages);
-/* make io_tlb_overflow tunable too? */

unsigned long swiotlb_nr_tbl(void)
{
@@ -156,12 +173,18 @@ void __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
io_tlb_index = 0;
io_tlb_orig_addr = alloc_bootmem_pages(PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t)));

- /*
- * Get the overflow emergency buffer
- */
- io_tlb_overflow_buffer = alloc_bootmem_low_pages(PAGE_ALIGN(io_tlb_overflow));
- if (!io_tlb_overflow_buffer)
- panic("Cannot allocate SWIOTLB overflow buffer!\n");
+ if (io_tlb_overflow) {
+ /*
+ * Get the overflow emergency buffer
+ */
+ io_tlb_overflow_buffer = alloc_bootmem_low_pages(
+ PAGE_ALIGN(io_tlb_overflow));
+ if (!io_tlb_overflow_buffer)
+ panic("Cannot allocate SWIOTLB overflow buffer!\n");
+ }
+
+ swiotlb_print_overflow_deprecation_notice();
+
if (verbose)
swiotlb_print_info();
}
@@ -264,14 +287,17 @@ swiotlb_late_init_with_default_size(size_t default_size)

memset(io_tlb_orig_addr, 0, io_tlb_nslabs * sizeof(phys_addr_t));

- /*
- * Get the overflow emergency buffer
- */
- io_tlb_overflow_buffer = (void *)__get_free_pages(GFP_DMA,
- get_order(io_tlb_overflow));
- if (!io_tlb_overflow_buffer)
- goto cleanup4;
+ if (io_tlb_overflow) {
+ /*
+ * Get the overflow emergency buffer
+ */
+ io_tlb_overflow_buffer = (void *)
+ __get_free_pages(GFP_DMA, get_order(io_tlb_overflow));
+ if (!io_tlb_overflow_buffer)
+ goto cleanup4;
+ }

+ swiotlb_print_overflow_deprecation_notice();
swiotlb_print_info();

late_alloc = 1;
@@ -297,12 +323,13 @@ cleanup1:

void __init swiotlb_free(void)
{
- if (!io_tlb_overflow_buffer)
+ if (!io_tlb_orig_addr)
return;

if (late_alloc) {
- free_pages((unsigned long)io_tlb_overflow_buffer,
- get_order(io_tlb_overflow));
+ if (io_tlb_overflow_buffer)
+ free_pages((unsigned long)io_tlb_overflow_buffer,
+ get_order(io_tlb_overflow));
free_pages((unsigned long)io_tlb_orig_addr,
get_order(io_tlb_nslabs * sizeof(phys_addr_t)));
free_pages((unsigned long)io_tlb_list, get_order(io_tlb_nslabs *
@@ -310,8 +337,9 @@ void __init swiotlb_free(void)
free_pages((unsigned long)io_tlb_start,
get_order(io_tlb_nslabs << IO_TLB_SHIFT));
} else {
- free_bootmem_late(__pa(io_tlb_overflow_buffer),
- PAGE_ALIGN(io_tlb_overflow));
+ if (io_tlb_overflow_buffer)
+ free_bootmem_late(__pa(io_tlb_overflow_buffer),
+ PAGE_ALIGN(io_tlb_overflow));
free_bootmem_late(__pa(io_tlb_orig_addr),
PAGE_ALIGN(io_tlb_nslabs * sizeof(phys_addr_t)));
free_bootmem_late(__pa(io_tlb_list),
@@ -681,7 +709,10 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
map = map_single(dev, phys, size, dir);
if (!map) {
swiotlb_full(dev, size, dir, 1);
+ if (!io_tlb_overflow)
+ return DMA_ERROR_CODE;
map = io_tlb_overflow_buffer;
+ dev_err(dev, "SWIOTLB is full. Mapping overflow buffer.\n");
}

dev_addr = swiotlb_virt_to_bus(dev, map);
@@ -691,6 +722,8 @@ dma_addr_t swiotlb_map_page(struct device *dev, struct page *page,
*/
if (!dma_capable(dev, dev_addr, size)) {
swiotlb_tbl_unmap_single(dev, map, size, dir);
+ if (!io_tlb_overflow)
+ return DMA_ERROR_CODE;
dev_addr = swiotlb_virt_to_bus(dev, io_tlb_overflow_buffer);
}

@@ -910,6 +943,8 @@ EXPORT_SYMBOL(swiotlb_sync_sg_for_device);
int
swiotlb_dma_mapping_error(struct device *hwdev, dma_addr_t dma_addr)
{
+ if (!io_tlb_overflow)
+ return (dma_addr == DMA_ERROR_CODE);
return (dma_addr == swiotlb_virt_to_bus(hwdev, io_tlb_overflow_buffer));
}
EXPORT_SYMBOL(swiotlb_dma_mapping_error);
--
1.7.9.5


2012-08-25 18:26:08

by Konrad Rzeszutek Wilk

Subject: Re: [PATCH] swiotlb: Add support to disable swiotlb overflow buffer with deprecation warning

On Fri, Aug 24, 2012 at 03:14:05PM -0600, Shuah Khan wrote:
> Add support for disabling the swiotlb overflow buffer by setting its size
> to zero, to help test disabled-overflow scenarios and find drivers that
> don't check dma mapping errors. Add a kernel error message to track how
> often the overflow buffer gets triggered. Add a deprecation warning message
> and deprecation schedule documentation as a first step towards removing
> overflow support and returning DMA_ERROR_CODE instead, consistent with
> other iommu implementations. Once drivers are fixed, overflow support can
> be removed.

This looks like the previous patch. I thought we discussed this and
decided that the overflow buffer support should not be disabled outright
(meaning by default it is ON) - but we can provide a knob to turn it
off.

And alongside you were going to look at DMA API debug code to see if you
can come up with a mechanism to track that the users of DMA API call
dma_mapping_error after they have called various variants of 'map'?


2012-08-27 12:38:22

by Shuah Khan

Subject: Re: [PATCH] swiotlb: Add support to disable swiotlb overflow buffer with deprecation warning

On Sat, 2012-08-25 at 14:25 -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 24, 2012 at 03:14:05PM -0600, Shuah Khan wrote:
> > Add support for disabling the swiotlb overflow buffer by setting its size
> > to zero, to help test disabled-overflow scenarios and find drivers that
> > don't check dma mapping errors. Add a kernel error message to track how
> > often the overflow buffer gets triggered. Add a deprecation warning message
> > and deprecation schedule documentation as a first step towards removing
> > overflow support and returning DMA_ERROR_CODE instead, consistent with
> > other iommu implementations. Once drivers are fixed, overflow support can
> > be removed.
>
> This looks like the previous patch. I thought we discussed this and
> decided that the overflow buffer support should not be disabled outright
> (meaning by default it is ON) - but we can provide a knob to turn it
> off.

No change in plans. This patch doesn't disable the overflow buffer. I kept
the functionality from my earlier patch that checks whether the overflow
buffer size is 0 and, if so, doesn't allocate the overflow buffer, as a way
to disable it for testing. By default the size is 32*1024, the buffer gets
created as usual, and swiotlb keeps returning the overflow buffer when it is
full.

>
> And alongside you were going to look at DMA API debug code to see if you
> can come up with a mechanism to track that the users of DMA API call
> dma_mapping_error after they have called various variants of 'map'?

I am working on this now.
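
The rough idea is to piggyback on the dma_debug_entry that dma-debug already
keeps for every active mapping (a sketch only; map_err_checked and
find_mapping_entry() are placeholders, not existing dma-debug code):

        /* new field in struct dma_debug_entry */
        bool map_err_checked;

        /* to be called from dma_mapping_error() when DMA API debug is on */
        void debug_dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
        {
                struct dma_debug_entry *entry;

                entry = find_mapping_entry(dev, dma_addr); /* placeholder */
                if (entry)
                        entry->map_err_checked = true;
        }

        /* at unmap time, warn about drivers that never checked */
        if (!entry->map_err_checked)
                err_printk(dev, entry,
                           "driver never checked for dma mapping error\n");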

-- Shuah