2016-11-11 08:50:44

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH 0/4] Compact userptr object backing store allocation

From: Tvrtko Ursulin <[email protected]>

Userptr backing store with SWIOTBL active is currently allocated in the same
inefficient manner, with one sg entry per object page, as what the commit
871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") fixed
for regular GEM objects.

We can fix that by adding new a __sg_alloc_table_from_pages core function which
allows us to control the maximum desired coalesced segment size.

Other than that the series starts with two simple fixes to
sg_alloc_table_from_pages which deal with incorrect data type usage and a
theoretical overflow condition. Fixing the latter enables easy addition of the
above mentioned __sg_alloc_table_from_pages.

Tvrtko Ursulin (4):
lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
lib/scatterlist: Avoid potential scatterlist entry overflow
lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

drivers/gpu/drm/i915/i915_drv.h | 9 +++
drivers/gpu/drm/i915/i915_gem.c | 15 +----
drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++-------
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 +-
drivers/rapidio/devices/rio_mport_cdev.c | 4 +-
include/linux/scatterlist.h | 11 ++--
lib/scatterlist.c | 78 ++++++++++++++++++++------
7 files changed, 87 insertions(+), 62 deletions(-)

--
2.7.4


2016-11-11 08:50:48

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

From: Tvrtko Ursulin <[email protected]>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: [email protected]
---
include/linux/scatterlist.h | 11 +++++----
lib/scatterlist.c | 55 ++++++++++++++++++++++++++++++++++-----------
2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..29591dbb20fd 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
struct scatterlist *, gfp_t, sg_alloc_fn *);
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask,
+ unsigned int max_segment);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);

size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index de15f369b317..c0521420f931 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
EXPORT_SYMBOL(sg_alloc_table);

/**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- * an array of pages
- * @sgt: The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages: Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size: Number of valid bytes in the buffer (after offset)
- * @gfp_mask: GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @gfp_mask: GFP allocation mask
+ * @max_segment: Maximum size of a single scatterlist node in bytes
*
* Description:
* Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,12 +390,11 @@ EXPORT_SYMBOL(sg_alloc_table);
* Returns:
* 0 on success, negative error on failure
*/
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask,
+ unsigned int max_segment)
{
- const unsigned int max_segment = UINT_MAX;
unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
@@ -444,6 +444,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

return 0;
}
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @gfp_mask: GFP allocation mask
+ *
+ * Description:
+ * Allocate and initialize an sg table from a list of pages. Contiguous
+ * ranges of the pages are squashed into a single scatterlist node. A user
+ * may provide an offset at a start and a size of valid data in a buffer
+ * specified by the page array. The returned sg table is released by
+ * sg_free_table.
+ *
+ * Returns:
+ * 0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask)
+{
+ return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset,
+ size, gfp_mask, UINT_MAX);
+}
EXPORT_SYMBOL(sg_alloc_table_from_pages);

void __sg_page_iter_start(struct sg_page_iter *piter,
--
2.7.4

2016-11-11 08:50:55

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow

From: Tvrtko Ursulin <[email protected]>

Since the scatterlist length field is an unsigned int, make
sure that sg_alloc_table_from_pages does not overflow it while
coallescing pages to a single entry.

v2: Drop reference to future use. Use UINT_MAX.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: [email protected]
---
lib/scatterlist.c | 25 +++++++++++++++++++------
1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index e05e7fc98892..de15f369b317 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
unsigned int offset, unsigned long size,
gfp_t gfp_mask)
{
- unsigned int chunks;
+ const unsigned int max_segment = UINT_MAX;
+ unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
int ret;
@@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

/* compute number of contiguous chunks */
chunks = 1;
- for (i = 1; i < n_pages; ++i)
- if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
+ seg_len = PAGE_SIZE;
+ for (i = 1; i < n_pages; ++i) {
+ if (seg_len >= max_segment ||
+ page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
++chunks;
+ seg_len = PAGE_SIZE;
+ } else {
+ seg_len += PAGE_SIZE;
+ }
+ }

ret = sg_alloc_table(sgt, chunks, gfp_mask);
if (unlikely(ret))
@@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
/* merging chunks and putting them into the scatterlist */
cur_page = 0;
for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
- unsigned long chunk_size;
+ unsigned int chunk_size;
unsigned int j;

/* look for the end of the current chunk */
+ seg_len = PAGE_SIZE;
for (j = cur_page + 1; j < n_pages; ++j)
- if (page_to_pfn(pages[j]) !=
+ if (seg_len >= max_segment ||
+ page_to_pfn(pages[j]) !=
page_to_pfn(pages[j - 1]) + 1)
break;
+ else
+ seg_len += PAGE_SIZE;

chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;
- sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
+ sg_set_page(s, pages[cur_page],
+ min_t(unsigned long, size, chunk_size), offset);
size -= chunk_size;
offset = 0;
cur_page = j;
--
2.7.4

2016-11-11 08:50:53

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

From: Tvrtko Ursulin <[email protected]>

Scatterlist entries have an unsigned int for the offset so
correct the sg_alloc_table_from_pages function accordingly.

Since these are offsets withing a page, unsigned int is
wide enough.

Also converts callers which were using unsigned long locally
with the lower_32_bits annotation to make it explicitly
clear what is happening.

v2: Use offset_in_page. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Pawel Osciak <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Kyungmin Park <[email protected]>
Cc: Tomasz Stanislawski <[email protected]>
Cc: Matt Porter <[email protected]>
Cc: Alexandre Bounine <[email protected]>
Cc: [email protected]
Cc: [email protected]
Acked-by: Marek Szyprowski <[email protected]> (v1)
---
drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 ++--
drivers/rapidio/devices/rio_mport_cdev.c | 4 ++--
include/linux/scatterlist.h | 2 +-
lib/scatterlist.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/media/v4l2-core/videobuf2-dma-contig.c b/drivers/media/v4l2-core/videobuf2-dma-contig.c
index fb6a177be461..51e8765bc3c6 100644
--- a/drivers/media/v4l2-core/videobuf2-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf2-dma-contig.c
@@ -478,7 +478,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
{
struct vb2_dc_buf *buf;
struct frame_vector *vec;
- unsigned long offset;
+ unsigned int offset;
int n_pages, i;
int ret = 0;
struct sg_table *sgt;
@@ -506,7 +506,7 @@ static void *vb2_dc_get_userptr(struct device *dev, unsigned long vaddr,
buf->dev = dev;
buf->dma_dir = dma_dir;

- offset = vaddr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(vaddr));
vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
if (IS_ERR(vec)) {
ret = PTR_ERR(vec);
diff --git a/drivers/rapidio/devices/rio_mport_cdev.c b/drivers/rapidio/devices/rio_mport_cdev.c
index 9013a585507e..0fae29ff47ba 100644
--- a/drivers/rapidio/devices/rio_mport_cdev.c
+++ b/drivers/rapidio/devices/rio_mport_cdev.c
@@ -876,10 +876,10 @@ rio_dma_transfer(struct file *filp, u32 transfer_mode,
* offset within the internal buffer specified by handle parameter.
*/
if (xfer->loc_addr) {
- unsigned long offset;
+ unsigned int offset;
long pinned;

- offset = (unsigned long)(uintptr_t)xfer->loc_addr & ~PAGE_MASK;
+ offset = lower_32_bits(offset_in_page(xfer->loc_addr));
nr_pages = PAGE_ALIGN(xfer->length + offset) >> PAGE_SHIFT;

page_list = kmalloc_array(nr_pages,
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe6acd7..c981bee1a3ae 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -263,7 +263,7 @@ int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask);

size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 004fc70fc56a..e05e7fc98892 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -391,7 +391,7 @@ EXPORT_SYMBOL(sg_alloc_table);
*/
int sg_alloc_table_from_pages(struct sg_table *sgt,
struct page **pages, unsigned int n_pages,
- unsigned long offset, unsigned long size,
+ unsigned int offset, unsigned long size,
gfp_t gfp_mask)
{
unsigned int chunks;
--
2.7.4

2016-11-11 08:51:41

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

From: Tvrtko Ursulin <[email protected]>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Chris Wilson <[email protected]>
---
drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
drivers/gpu/drm/i915/i915_gem.c | 15 +--------------
drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
3 files changed, 16 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..319f8def0f86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,13 @@ int remap_io_mapping(struct vm_area_struct *vma,
__T; \
})

+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+ return swiotlb_nr_tbl() << IO_TLB_SHIFT;
+#else
+ return UINT_MAX;
+#endif
+}
+
#endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
mutex_unlock(&obj->mm.lock);
}

-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
- return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
- return 0;
-#endif
-}
-
static void i915_sg_trim(struct sg_table *orig_st)
{
struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
int ret;
gfp_t gfp;

@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

- max_segment = swiotlb_max_size();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..b4461f1832a6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,20 @@ struct get_pages_work {
struct task_struct *task;
};

-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
static int
st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
{
- struct scatterlist *sg;
- int ret, n;
+ int ret;

*st = kmalloc(sizeof(**st), GFP_KERNEL);
if (*st == NULL)
return -ENOMEM;

- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
+ ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
+ num_pages << PAGE_SHIFT,
+ GFP_KERNEL, i915_sg_segment_size());
+ if (ret)
+ goto err;

return 0;

--
2.7.4

2016-11-11 08:59:29

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH v3 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

From: Tvrtko Ursulin <[email protected]>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.

v3:
* Actually include the swiotlb override fix.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Chris Wilson <[email protected]>
---
drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
drivers/gpu/drm/i915/i915_gem.c | 15 +--------------
drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
3 files changed, 18 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..38a555c9c44b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,15 @@ int remap_io_mapping(struct vm_area_struct *vma,
__T; \
})

+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+ unsigned int nr_tbl = swiotlb_nr_tbl();
+
+ return nr_tbl > 0 ? nr_tbl << IO_TLB_SHIFT : UINT_MAX;
+#else
+ return UINT_MAX;
+#endif
+}
+
#endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
mutex_unlock(&obj->mm.lock);
}

-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
- return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
- return 0;
-#endif
-}
-
static void i915_sg_trim(struct sg_table *orig_st)
{
struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
int ret;
gfp_t gfp;

@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

- max_segment = swiotlb_max_size();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..b4461f1832a6 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,20 @@ struct get_pages_work {
struct task_struct *task;
};

-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
static int
st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
{
- struct scatterlist *sg;
- int ret, n;
+ int ret;

*st = kmalloc(sizeof(**st), GFP_KERNEL);
if (*st == NULL)
return -ENOMEM;

- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
+ ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
+ num_pages << PAGE_SHIFT,
+ GFP_KERNEL, i915_sg_segment_size());
+ if (ret)
+ goto err;

return 0;

--
2.7.4

2016-11-11 10:19:23

by Chris Wilson

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 2/4] lib/scatterlist: Avoid potential scatterlist entry overflow

On Fri, Nov 11, 2016 at 08:50:18AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> Since the scatterlist length field is an unsigned int, make
> sure that sg_alloc_table_from_pages does not overflow it while
> coallescing pages to a single entry.
>
> v2: Drop reference to future use. Use UINT_MAX.
>
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: [email protected]
> ---
> lib/scatterlist.c | 25 +++++++++++++++++++------
> 1 file changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/lib/scatterlist.c b/lib/scatterlist.c
> index e05e7fc98892..de15f369b317 100644
> --- a/lib/scatterlist.c
> +++ b/lib/scatterlist.c
> @@ -394,7 +394,8 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
> unsigned int offset, unsigned long size,
> gfp_t gfp_mask)
> {
> - unsigned int chunks;
> + const unsigned int max_segment = UINT_MAX;
> + unsigned int seg_len, chunks;
> unsigned int i;
> unsigned int cur_page;
> int ret;
> @@ -402,9 +403,16 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
>
> /* compute number of contiguous chunks */
> chunks = 1;
> - for (i = 1; i < n_pages; ++i)
> - if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)
> + seg_len = PAGE_SIZE;
> + for (i = 1; i < n_pages; ++i) {
> + if (seg_len >= max_segment ||
> + page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) {
> ++chunks;
> + seg_len = PAGE_SIZE;
> + } else {
> + seg_len += PAGE_SIZE;
> + }
> + }
>
> ret = sg_alloc_table(sgt, chunks, gfp_mask);
> if (unlikely(ret))
> @@ -413,17 +421,22 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,
> /* merging chunks and putting them into the scatterlist */
> cur_page = 0;
> for_each_sg(sgt->sgl, s, sgt->orig_nents, i) {
> - unsigned long chunk_size;
> + unsigned int chunk_size;
> unsigned int j;
>
> /* look for the end of the current chunk */
> + seg_len = PAGE_SIZE;
> for (j = cur_page + 1; j < n_pages; ++j)
> - if (page_to_pfn(pages[j]) !=
> + if (seg_len >= max_segment ||
> + page_to_pfn(pages[j]) !=
> page_to_pfn(pages[j - 1]) + 1)
> break;
> + else
> + seg_len += PAGE_SIZE;
>
> chunk_size = ((j - cur_page) << PAGE_SHIFT) - offset;

chunk_size = seg_len - offset;
?

> - sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
> + sg_set_page(s, pages[cur_page],
> + min_t(unsigned long, size, chunk_size), offset);
> size -= chunk_size;
> offset = 0;
> cur_page = j;

Could BUG_ON(size); afterwards and BUG_ON(!size) inside? Although that
requires the caller didn't make a mistake in n_pages vs offset+size.

Minor comments,
Reviewed-by: Chris Wilson <[email protected]>
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2016-11-11 10:23:25

by Chris Wilson

[permalink] [raw]
Subject: Re: [PATCH 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

On Fri, Nov 11, 2016 at 08:50:20AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coallescing size and eliminate a separate path for
> allocating backing store here.
>
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
>
> v2:
> * Rename helper to i915_sg_segment_size and fix swiotlb override.
> * Commit message update.
>
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Chris Wilson <[email protected]>
> ---
> drivers/gpu/drm/i915/i915_drv.h | 9 +++++++++
> drivers/gpu/drm/i915/i915_gem.c | 15 +--------------
> drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++++++----------------------
> 3 files changed, 16 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 30777dee3f9c..319f8def0f86 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4175,4 +4175,13 @@ int remap_io_mapping(struct vm_area_struct *vma,
> __T; \
> })
>
> +static inline unsigned int i915_sg_segment_size(void)
> +{
> +#if IS_ENABLED(CONFIG_SWIOTLB)
> + return swiotlb_nr_tbl() << IO_TLB_SHIFT;
> +#else
> + return UINT_MAX;
> +#endif
> +}
> +
> #endif
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1c20edba7f2a..cb4c188a395c 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
> mutex_unlock(&obj->mm.lock);
> }
>
> -static unsigned int swiotlb_max_size(void)
> -{
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> - return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
> -#else
> - return 0;
> -#endif
> -}
> -
> static void i915_sg_trim(struct sg_table *orig_st)
> {
> struct sg_table new_st;
> @@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> struct sgt_iter sgt_iter;
> struct page *page;
> unsigned long last_pfn = 0; /* suppress gcc warning */
> - unsigned int max_segment;
> + unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
> int ret;
> gfp_t gfp;
>
> @@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
>
> - max_segment = swiotlb_max_size();
> - if (!max_segment)
> - max_segment = rounddown(UINT_MAX, PAGE_SIZE);
> -
> st = kmalloc(sizeof(*st), GFP_KERNEL);
> if (st == NULL)
> return ERR_PTR(-ENOMEM);
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 64261639f547..b4461f1832a6 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -390,36 +390,20 @@ struct get_pages_work {
> struct task_struct *task;
> };
>
> -#if IS_ENABLED(CONFIG_SWIOTLB)
> -#define swiotlb_active() swiotlb_nr_tbl()
> -#else
> -#define swiotlb_active() 0
> -#endif
> -
> static int
> st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
> {
> - struct scatterlist *sg;
> - int ret, n;
> + int ret;
>
> *st = kmalloc(sizeof(**st), GFP_KERNEL);
> if (*st == NULL)
> return -ENOMEM;
>
> - if (swiotlb_active()) {
> - ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
> - if (ret)
> - goto err;
> -
> - for_each_sg((*st)->sgl, sg, num_pages, n)
> - sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
> - } else {
> - ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
> - 0, num_pages << PAGE_SHIFT,
> - GFP_KERNEL);
> - if (ret)
> - goto err;
> - }
> + ret = __sg_alloc_table_from_pages(*st, pvec, num_pages, 0,
> + num_pages << PAGE_SHIFT,

Petty:

ret = __sg_alloc_table_from_pages(*st, pvec, num_pages,

pvec + num_pages are paired

0, num_pages << PAGE_SHIFT,
offset + size are paired

i915_sg_segment_size()),
GFP_KERNEL);

And for some reason I always like to see gfp_t last.

Otherwise looks good,
Reviewed-by: Chris Wilson <[email protected]>
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2016-11-11 10:24:36

by Chris Wilson

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

On Fri, Nov 11, 2016 at 08:50:19AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> Drivers like i915 benefit from being able to control the maxium
> size of the sg coallesced segment while building the scatter-
> gather list.
>
> Introduce and export the __sg_alloc_table_from_pages function
> which will allow it that control.
>
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: [email protected]


> ---
> include/linux/scatterlist.h | 11 +++++----
> lib/scatterlist.c | 55 ++++++++++++++++++++++++++++++++++-----------
> 2 files changed, 49 insertions(+), 17 deletions(-)
>
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index c981bee1a3ae..29591dbb20fd 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
> int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
> struct scatterlist *, gfp_t, sg_alloc_fn *);
> int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
> -int sg_alloc_table_from_pages(struct sg_table *sgt,
> - struct page **pages, unsigned int n_pages,
> - unsigned int offset, unsigned long size,
> - gfp_t gfp_mask);
> +int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
> + unsigned int n_pages, unsigned int offset,
> + unsigned long size, gfp_t gfp_mask,
> + unsigned int max_segment);

Just the question of parameter order, I like gfp_t last :)
And I think offset / size / max_segment tie together.
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2016-11-11 12:35:55

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH v2 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

From: Tvrtko Ursulin <[email protected]>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

v2: Reorder parameters. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: [email protected]
Cc: Chris Wilson <[email protected]>
---
include/linux/scatterlist.h | 11 +++++----
lib/scatterlist.c | 55 ++++++++++++++++++++++++++++++++++-----------
2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
struct scatterlist *, gfp_t, sg_alloc_fn *);
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, unsigned int max_segment,
+ gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);

size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index de15f369b317..8ee1ece81626 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
EXPORT_SYMBOL(sg_alloc_table);

/**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- * an array of pages
- * @sgt: The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages: Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size: Number of valid bytes in the buffer (after offset)
- * @gfp_mask: GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a single scatterlist node in bytes
+ * @gfp_mask: GFP allocation mask
*
* Description:
* Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,12 +390,11 @@ EXPORT_SYMBOL(sg_alloc_table);
* Returns:
* 0 on success, negative error on failure
*/
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, unsigned int max_segment,
+ gfp_t gfp_mask)
{
- const unsigned int max_segment = UINT_MAX;
unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
@@ -444,6 +444,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

return 0;
}
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @gfp_mask: GFP allocation mask
+ *
+ * Description:
+ * Allocate and initialize an sg table from a list of pages. Contiguous
+ * ranges of the pages are squashed into a single scatterlist node. A user
+ * may provide an offset at a start and a size of valid data in a buffer
+ * specified by the page array. The returned sg table is released by
+ * sg_free_table.
+ *
+ * Returns:
+ * 0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask)
+{
+ return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset,
+ size, gfp_mask, UINT_MAX);
+}
EXPORT_SYMBOL(sg_alloc_table_from_pages);

void __sg_page_iter_start(struct sg_page_iter *piter,
--
2.7.4

2016-11-11 12:36:24

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH v4 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

From: Tvrtko Ursulin <[email protected]>

With the addition of __sg_alloc_table_from_pages we can control
the maximum coallescing size and eliminate a separate path for
allocating backing store here.

Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
SWIOTLB max segment size") this enables more compact sg lists to
be created and so has a beneficial effect on workloads with many
and/or large objects of this class.

v2:
* Rename helper to i915_sg_segment_size and fix swiotlb override.
* Commit message update.

v3:
* Actually include the swiotlb override fix.

v4:
* Regroup parameters a bit. (Chris Wilson)

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]> (v2)
---
drivers/gpu/drm/i915/i915_drv.h | 11 +++++++++++
drivers/gpu/drm/i915/i915_gem.c | 15 +--------------
drivers/gpu/drm/i915/i915_gem_userptr.c | 29 +++++++----------------------
3 files changed, 19 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 30777dee3f9c..38a555c9c44b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -4175,4 +4175,15 @@ int remap_io_mapping(struct vm_area_struct *vma,
__T; \
})

+static inline unsigned int i915_sg_segment_size(void)
+{
+#if IS_ENABLED(CONFIG_SWIOTLB)
+ unsigned int nr_tbl = swiotlb_nr_tbl();
+
+ return nr_tbl > 0 ? nr_tbl << IO_TLB_SHIFT : UINT_MAX;
+#else
+ return UINT_MAX;
+#endif
+}
+
#endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1c20edba7f2a..cb4c188a395c 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2223,15 +2223,6 @@ void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
mutex_unlock(&obj->mm.lock);
}

-static unsigned int swiotlb_max_size(void)
-{
-#if IS_ENABLED(CONFIG_SWIOTLB)
- return rounddown(swiotlb_nr_tbl() << IO_TLB_SHIFT, PAGE_SIZE);
-#else
- return 0;
-#endif
-}
-
static void i915_sg_trim(struct sg_table *orig_st)
{
struct sg_table new_st;
@@ -2267,7 +2258,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
struct sgt_iter sgt_iter;
struct page *page;
unsigned long last_pfn = 0; /* suppress gcc warning */
- unsigned int max_segment;
+ unsigned int max_segment = rounddown(i915_sg_segment_size(), PAGE_SIZE);
int ret;
gfp_t gfp;

@@ -2278,10 +2269,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
GEM_BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
GEM_BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);

- max_segment = swiotlb_max_size();
- if (!max_segment)
- max_segment = rounddown(UINT_MAX, PAGE_SIZE);
-
st = kmalloc(sizeof(*st), GFP_KERNEL);
if (st == NULL)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 64261639f547..f3fcb2d1bde9 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -390,36 +390,21 @@ struct get_pages_work {
struct task_struct *task;
};

-#if IS_ENABLED(CONFIG_SWIOTLB)
-#define swiotlb_active() swiotlb_nr_tbl()
-#else
-#define swiotlb_active() 0
-#endif
-
static int
st_set_pages(struct sg_table **st, struct page **pvec, int num_pages)
{
- struct scatterlist *sg;
- int ret, n;
+ int ret;

*st = kmalloc(sizeof(**st), GFP_KERNEL);
if (*st == NULL)
return -ENOMEM;

- if (swiotlb_active()) {
- ret = sg_alloc_table(*st, num_pages, GFP_KERNEL);
- if (ret)
- goto err;
-
- for_each_sg((*st)->sgl, sg, num_pages, n)
- sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
- } else {
- ret = sg_alloc_table_from_pages(*st, pvec, num_pages,
- 0, num_pages << PAGE_SHIFT,
- GFP_KERNEL);
- if (ret)
- goto err;
- }
+ ret = __sg_alloc_table_from_pages(*st, pvec, num_pages,
+ 0, num_pages << PAGE_SHIFT,
+ i915_sg_segment_size(),
+ GFP_KERNEL);
+ if (ret)
+ goto err;

return 0;

--
2.7.4

2016-11-11 14:17:54

by Tvrtko Ursulin

[permalink] [raw]
Subject: [PATCH v3 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

From: Tvrtko Ursulin <[email protected]>

Drivers like i915 benefit from being able to control the maxium
size of the sg coallesced segment while building the scatter-
gather list.

Introduce and export the __sg_alloc_table_from_pages function
which will allow it that control.

v2: Reorder parameters. (Chris Wilson)
v3: Fix incomplete reordering in v2.

Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: [email protected]
Cc: Chris Wilson <[email protected]>
---
include/linux/scatterlist.h | 11 +++++----
lib/scatterlist.c | 55 ++++++++++++++++++++++++++++++++++-----------
2 files changed, 49 insertions(+), 17 deletions(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index c981bee1a3ae..16b740afeed2 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -261,10 +261,13 @@ void sg_free_table(struct sg_table *);
int __sg_alloc_table(struct sg_table *, unsigned int, unsigned int,
struct scatterlist *, gfp_t, sg_alloc_fn *);
int sg_alloc_table(struct sg_table *, unsigned int, gfp_t);
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask);
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, unsigned int max_segment,
+ gfp_t gfp_mask);
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask);

size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
size_t buflen, off_t skip, bool to_buffer);
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index de15f369b317..7f4d637d3606 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -370,14 +370,15 @@ int sg_alloc_table(struct sg_table *table, unsigned int nents, gfp_t gfp_mask)
EXPORT_SYMBOL(sg_alloc_table);

/**
- * sg_alloc_table_from_pages - Allocate and initialize an sg table from
- * an array of pages
- * @sgt: The sg table header to use
- * @pages: Pointer to an array of page pointers
- * @n_pages: Number of pages in the pages array
- * @offset: Offset from start of the first page to the start of a buffer
- * @size: Number of valid bytes in the buffer (after offset)
- * @gfp_mask: GFP allocation mask
+ * __sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @max_segment: Maximum size of a single scatterlist node in bytes
+ * @gfp_mask: GFP allocation mask
*
* Description:
* Allocate and initialize an sg table from a list of pages. Contiguous
@@ -389,12 +390,11 @@ EXPORT_SYMBOL(sg_alloc_table);
* Returns:
* 0 on success, negative error on failure
*/
-int sg_alloc_table_from_pages(struct sg_table *sgt,
- struct page **pages, unsigned int n_pages,
- unsigned int offset, unsigned long size,
- gfp_t gfp_mask)
+int __sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, unsigned int max_segment,
+ gfp_t gfp_mask)
{
- const unsigned int max_segment = UINT_MAX;
unsigned int seg_len, chunks;
unsigned int i;
unsigned int cur_page;
@@ -444,6 +444,35 @@ int sg_alloc_table_from_pages(struct sg_table *sgt,

return 0;
}
+EXPORT_SYMBOL(__sg_alloc_table_from_pages);
+
+/**
+ * sg_alloc_table_from_pages - Allocate and initialize an sg table from
+ * an array of pages
+ * @sgt: The sg table header to use
+ * @pages: Pointer to an array of page pointers
+ * @n_pages: Number of pages in the pages array
+ * @offset: Offset from start of the first page to the start of a buffer
+ * @size: Number of valid bytes in the buffer (after offset)
+ * @gfp_mask: GFP allocation mask
+ *
+ * Description:
+ * Allocate and initialize an sg table from a list of pages. Contiguous
+ * ranges of the pages are squashed into a single scatterlist node. A user
+ * may provide an offset at a start and a size of valid data in a buffer
+ * specified by the page array. The returned sg table is released by
+ * sg_free_table.
+ *
+ * Returns:
+ * 0 on success, negative error on failure
+ */
+int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
+ unsigned int n_pages, unsigned int offset,
+ unsigned long size, gfp_t gfp_mask)
+{
+ return __sg_alloc_table_from_pages(sgt, pages, n_pages, offset,
+ size, UINT_MAX, gfp_mask);
+}
EXPORT_SYMBOL(sg_alloc_table_from_pages);

void __sg_page_iter_start(struct sg_page_iter *piter,
--
2.7.4

2016-11-14 09:53:13

by Chris Wilson

[permalink] [raw]
Subject: Re: [PATCH v4 4/4] drm/i915: Use __sg_alloc_table_from_pages for userptr allocations

On Fri, Nov 11, 2016 at 12:36:13PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> With the addition of __sg_alloc_table_from_pages we can control
> the maximum coallescing size and eliminate a separate path for
> allocating backing store here.
>
> Similar to 871dfbd67d4e ("drm/i915: Allow compaction upto
> SWIOTLB max segment size") this enables more compact sg lists to
> be created and so has a beneficial effect on workloads with many
> and/or large objects of this class.
>
> v2:
> * Rename helper to i915_sg_segment_size and fix swiotlb override.
> * Commit message update.
>
> v3:
> * Actually include the swiotlb override fix.
>
> v4:
> * Regroup parameters a bit. (Chris Wilson)
>
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Chris Wilson <[email protected]>
> Reviewed-by: Chris Wilson <[email protected]> (v2)
(bump)
Reviewed-by: Chris Wilson <[email protected]>
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2016-11-14 09:53:39

by Chris Wilson

[permalink] [raw]
Subject: Re: [PATCH v3 3/4] lib/scatterlist: Introduce and export __sg_alloc_table_from_pages

On Fri, Nov 11, 2016 at 02:17:44PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> Drivers like i915 benefit from being able to control the maxium
> size of the sg coallesced segment while building the scatter-
> gather list.
>
> Introduce and export the __sg_alloc_table_from_pages function
> which will allow it that control.
>
> v2: Reorder parameters. (Chris Wilson)
> v3: Fix incomplete reordering in v2.
>
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: [email protected]
> Cc: Chris Wilson <[email protected]>

Works for me,
Reviewed-by: Chris Wilson <[email protected]>
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2016-11-14 09:57:36

by Chris Wilson

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

On Fri, Nov 11, 2016 at 08:50:17AM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> Scatterlist entries have an unsigned int for the offset so
> correct the sg_alloc_table_from_pages function accordingly.
>
> Since these are offsets withing a page, unsigned int is
> wide enough.
>
> Also converts callers which were using unsigned long locally
> with the lower_32_bits annotation to make it explicitly
> clear what is happening.
>
> v2: Use offset_in_page. (Chris Wilson)
>
> Signed-off-by: Tvrtko Ursulin <[email protected]>
> Cc: Masahiro Yamada <[email protected]>
> Cc: Pawel Osciak <[email protected]>
> Cc: Marek Szyprowski <[email protected]>
> Cc: Kyungmin Park <[email protected]>
> Cc: Tomasz Stanislawski <[email protected]>
> Cc: Matt Porter <[email protected]>
> Cc: Alexandre Bounine <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Acked-by: Marek Szyprowski <[email protected]> (v1)

If there were kerneldoc, it would nicely explain that having an offset
larger then a page is silly when passing in array of pages.

Changes elsewhere look ok (personally I'd be happy with just
offset_in_page(), 4GiB superpages are somebody else's problem :)

Reviewed-by: Chris Wilson <[email protected]>
-Chris

--
Chris Wilson, Intel Open Source Technology Centre

2016-11-14 11:45:22

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation


Hi Andrew,

On 11/11/2016 08:50, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <[email protected]>
>
> Userptr backing store with SWIOTBL active is currently allocated in the same
> inefficient manner, with one sg entry per object page, as what the commit
> 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment size") fixed
> for regular GEM objects.
>
> We can fix that by adding new a __sg_alloc_table_from_pages core function which
> allows us to control the maximum desired coalesced segment size.
>
> Other than that the series starts with two simple fixes to
> sg_alloc_table_from_pages which deal with incorrect data type usage and a
> theoretical overflow condition. Fixing the latter enables easy addition of the
> above mentioned __sg_alloc_table_from_pages.
>
> Tvrtko Ursulin (4):
> lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
> lib/scatterlist: Avoid potential scatterlist entry overflow
> lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
> drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
>
> drivers/gpu/drm/i915/i915_drv.h | 9 +++
> drivers/gpu/drm/i915/i915_gem.c | 15 +----
> drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++-------
> drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 +-
> drivers/rapidio/devices/rio_mport_cdev.c | 4 +-
> include/linux/scatterlist.h | 11 ++--
> lib/scatterlist.c | 78 ++++++++++++++++++++------
> 7 files changed, 87 insertions(+), 62 deletions(-)

I have three patches to lib/scatterlist.c in this series which has all
been reviewed and tested.

We would like to merge them via the DRM tree if there are no objections?

Kind regards,

Tvrtko


2016-11-22 14:32:01

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 1/4] lib/scatterlist: Fix offset type in sg_alloc_table_from_pages

Em Mon, 14 Nov 2016 09:55:48 +0000
Chris Wilson <[email protected]> escreveu:

> On Fri, Nov 11, 2016 at 08:50:17AM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <[email protected]>
> >
> > Scatterlist entries have an unsigned int for the offset so
> > correct the sg_alloc_table_from_pages function accordingly.
> >
> > Since these are offsets withing a page, unsigned int is
> > wide enough.
> >
> > Also converts callers which were using unsigned long locally
> > with the lower_32_bits annotation to make it explicitly
> > clear what is happening.
> >
> > v2: Use offset_in_page. (Chris Wilson)
> >
> > Signed-off-by: Tvrtko Ursulin <[email protected]>
> > Cc: Masahiro Yamada <[email protected]>
> > Cc: Pawel Osciak <[email protected]>
> > Cc: Marek Szyprowski <[email protected]>
> > Cc: Kyungmin Park <[email protected]>
> > Cc: Tomasz Stanislawski <[email protected]>
> > Cc: Matt Porter <[email protected]>
> > Cc: Alexandre Bounine <[email protected]>
> > Cc: [email protected]
> > Cc: [email protected]
> > Acked-by: Marek Szyprowski <[email protected]> (v1)
>
> If there were kerneldoc, it would nicely explain that having an offset
> larger then a page is silly when passing in array of pages.
>
> Changes elsewhere look ok (personally I'd be happy with just
> offset_in_page(), 4GiB superpages are somebody else's problem :)

For the media changes, that looked OK. We don't have any needs
to stream 4GB images nowadays :-)

Reviewed-by: Mauro Carvalho Chehab <[email protected]>

> Reviewed-by: Chris Wilson <[email protected]>
> -Chris
>


Thanks,
Mauro

2016-11-28 10:24:01

by Tvrtko Ursulin

[permalink] [raw]
Subject: Re: [Intel-gfx] [PATCH 0/4] Compact userptr object backing store allocation


On 14/11/2016 11:45, Tvrtko Ursulin wrote:
>
> Hi Andrew,
>
> On 11/11/2016 08:50, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <[email protected]>
>>
>> Userptr backing store with SWIOTBL active is currently allocated in
>> the same
>> inefficient manner, with one sg entry per object page, as what the commit
>> 871dfbd67d4e ("drm/i915: Allow compaction upto SWIOTLB max segment
>> size") fixed
>> for regular GEM objects.
>>
>> We can fix that by adding new a __sg_alloc_table_from_pages core
>> function which
>> allows us to control the maximum desired coalesced segment size.
>>
>> Other than that the series starts with two simple fixes to
>> sg_alloc_table_from_pages which deal with incorrect data type usage and a
>> theoretical overflow condition. Fixing the latter enables easy
>> addition of the
>> above mentioned __sg_alloc_table_from_pages.
>>
>> Tvrtko Ursulin (4):
>> lib/scatterlist: Fix offset type in sg_alloc_table_from_pages
>> lib/scatterlist: Avoid potential scatterlist entry overflow
>> lib/scatterlist: Introduce and export __sg_alloc_table_from_pages
>> drm/i915: Use __sg_alloc_table_from_pages for userptr allocations
>>
>> drivers/gpu/drm/i915/i915_drv.h | 9 +++
>> drivers/gpu/drm/i915/i915_gem.c | 15 +----
>> drivers/gpu/drm/i915/i915_gem_userptr.c | 28 ++-------
>> drivers/media/v4l2-core/videobuf2-dma-contig.c | 4 +-
>> drivers/rapidio/devices/rio_mport_cdev.c | 4 +-
>> include/linux/scatterlist.h | 11 ++--
>> lib/scatterlist.c | 78
>> ++++++++++++++++++++------
>> 7 files changed, 87 insertions(+), 62 deletions(-)
>
> I have three patches to lib/scatterlist.c in this series which has all
> been reviewed and tested.
>
> We would like to merge them via the DRM tree if there are no objections?

Excuse me for the re-ping - still looking for an ack to merge these few
patches via the drm tree. (The code in question does not have a
designated maintainer.)

Regards,

Tvrtko