2017-04-25 18:20:48

by Logan Gunthorpe

Subject: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

This patch introduces functions which kmap the pages inside an sgl.
These functions replace a common pattern of kmap(sg_page(sg)) that is
used in more than 50 places within the kernel.

The motivation for this work is to eventually safely support sgls that
contain io memory. In order for that to work, any access to the contents
of an iomem SGL will need to be done with an iomem-safe copy (e.g.
memcpy_fromio()) or hit some warning. (The exact details of how this
will work have yet to be worked out.) Having all the kmaps in one place
is just a first step in that direction. Additionally, since this helps
cut down on the users of sg_page, it should make any effort to go to
struct-page-less DMAs a little easier (should that idea ever swing back
into favour again).

A flags option is added to select between a regular or atomic mapping
so these functions can replace both kmap(sg_page(...)) and
kmap_atomic(sg_page(...)). Future work may expand this to have flags
for using page_address or vmap. We include a flag requiring the
function not to fail, to support legacy code that has no easy error
path. Much further in the future, there may be a flag to allocate
memory and copy the data from/to iomem.

We also add the semantic that sg_map can fail to create a mapping,
even though the code it replaces is assumed never to fail and the
current version of these functions cannot fail. This is to support
iomem, which may either fail to create the mapping or have to allocate
memory as a bounce buffer, which itself can fail.

Also, in terms of cleanup, a few of the existing kmap(sg_page) users
play things a bit loose with whether they apply sg->offset, so using
these helper functions should help avoid such issues.
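
To illustrate, a typical conversion ends up looking roughly like the
following sketch (buf, src and len here stand in for hypothetical
driver code):

        /* Before: open-coded mapping; offset handling varies by caller */
        buf = kmap(sg_page(sg)) + sg->offset;
        memcpy(buf, src, len);
        kunmap(sg_page(sg));

        /* After: sg->offset is applied for us and failure is explicit */
        buf = sg_map(sg, 0, SG_KMAP);
        if (IS_ERR(buf))
                return PTR_ERR(buf);
        memcpy(buf, src, len);
        sg_unmap(sg, buf, 0, SG_KMAP);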

Signed-off-by: Logan Gunthorpe <logang-OTvnGxWRz7hWk0Htik3J/[email protected]>
---
include/linux/scatterlist.h | 85 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 85 insertions(+)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe..fad170b 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -5,6 +5,7 @@
#include <linux/types.h>
#include <linux/bug.h>
#include <linux/mm.h>
+#include <linux/highmem.h>
#include <asm/io.h>

struct scatterlist {
@@ -126,6 +127,90 @@ static inline struct page *sg_page(struct scatterlist *sg)
return (struct page *)((sg)->page_link & ~0x3);
}

+#define SG_KMAP (1 << 0) /* create a mapping with kmap */
+#define SG_KMAP_ATOMIC (1 << 1) /* create a mapping with kmap_atomic */
+#define SG_MAP_MUST_NOT_FAIL (1 << 2) /* indicate sg_map should not fail */
+
+/**
+ * sg_map - kmap a page inside an sgl
+ * @sg: SG entry
+ * @offset: Offset into entry
+ * @flags: Flags for creating the mapping
+ *
+ * Description:
+ * Use this function to map a page in the scatterlist at the specified
+ * offset. sg->offset is already added for you. Note: the semantics of
+ * this function are that it may fail. Thus, its output should be checked
+ * with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
+ * in the mapped page is returned.
+ *
+ * Flags can be any of:
+ * * SG_KMAP - Use kmap to create the mapping
+ * * SG_KMAP_ATOMIC - Use kmap_atomic to map the page atomically.
+ * Thus, the rules of that function apply: the
+ * CPU may not sleep until it is unmapped.
+ * * SG_MAP_MUST_NOT_FAIL - Indicate that sg_map must not fail.
+ * If it does, it will issue a BUG_ON instead.
+ * This is intended for legacy code only, it
+ * is not to be used in new code.
+ *
+ * Also, consider carefully whether this function is appropriate. It is
+ * largely not recommended for new code, and if the sgl came from another
+ * subsystem and you don't know what kind of memory might be in the list,
+ * then you definitely should not call it. Non-mappable memory may be in
+ * the sgl and thus this function may fail unexpectedly. Consider using
+ * sg_copy_to_buffer instead.
+ **/
+static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
+{
+ struct page *pg;
+ unsigned int pg_off;
+ void *ret;
+
+ offset += sg->offset;
+ pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+ pg_off = offset_in_page(offset);
+
+ if (flags & SG_KMAP_ATOMIC)
+ ret = kmap_atomic(pg) + pg_off;
+ else if (flags & SG_KMAP)
+ ret = kmap(pg) + pg_off;
+ else
+ ret = ERR_PTR(-EINVAL);
+
+ /*
+ * In theory, this can't happen yet. Once we start adding
+ * unmappable memory, it also shouldn't happen unless developers
+ * start putting unmappable struct pages in sgls and passing
+ * them to code that doesn't support it.
+ */
+ BUG_ON(flags & SG_MAP_MUST_NOT_FAIL && IS_ERR(ret));
+
+ return ret;
+}
+
+/**
+ * sg_unmap - unmap a page that was mapped with sg_map
+ * @sg: SG entry
+ * @addr: address returned by sg_map
+ * @offset: Offset into entry (same as specified for sg_map)
+ * @flags: Flags, which are the same specified for sg_map
+ *
+ * Description:
+ * Unmap the page that was mapped with sg_map
+ **/
+static inline void sg_unmap(struct scatterlist *sg, void *addr,
+ size_t offset, int flags)
+{
+ struct page *pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+ unsigned int pg_off = offset_in_page(offset);
+
+ if (flags & SG_KMAP_ATOMIC)
+ kunmap_atomic(addr - sg->offset - pg_off);
+ else if (flags & SG_KMAP)
+ kunmap(pg);
+}
+
/**
* sg_set_buf - Set sg entry to point at given data
* @sg: SG entry
--
2.1.4


2017-04-26 07:44:16

by Christoph Hellwig

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

On Tue, Apr 25, 2017 at 12:20:48PM -0600, Logan Gunthorpe wrote:
> This patch introduces functions which kmap the pages inside an sgl.
> These functions replace a common pattern of kmap(sg_page(sg)) that is
> used in more than 50 places within the kernel.
>
> The motivation for this work is to eventually safely support sgls that
> contain io memory. In order for that to work, any access to the contents
> of an iomem SGL will need to be done with an iomem-safe copy (e.g.
> memcpy_fromio()) or hit some warning.
> (The exact details of how this will work have yet to be worked out.)

I think we'll at least need a draft of those to make sense of these
patches. Otherwise they just look very clumsy.

> + * Use this function to map a page in the scatterlist at the specified
> + * offset. sg->offset is already added for you. Note: the semantics of
> + * this function are that it may fail. Thus, its output should be checked
> + * with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
> + * in the mapped page is returned.
> + *
> + * Flags can be any of:
> + * * SG_KMAP - Use kmap to create the mapping
> + * * SG_KMAP_ATOMIC - Use kmap_atomic to map the page atomically.
> + * Thus, the rules of that function apply: the
> + * CPU may not sleep until it is unmapped.
> + * * SG_MAP_MUST_NOT_FAIL - Indicate that sg_map must not fail.
> + * If it does, it will issue a BUG_ON instead.
> + * This is intended for legacy code only, it
> + * is not to be used in new code.

I'm sorry but this API is just a trainwreck. Right now we have the
nice little kmap_atomic API, which never fails and has a very nice
calling convention where we just pass back the return address, but does
not support sleeping inside the critical section.

And kmap, which may fail and requires the original page to be passed
back. Anything that mixes these two concepts up is simply a non-starter.


2017-04-26 08:59:17

by Christian König

[permalink] [raw]
Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

Am 25.04.2017 um 20:20 schrieb Logan Gunthorpe:
> This patch introduces functions which kmap the pages inside an sgl.
> These functions replace a common pattern of kmap(sg_page(sg)) that is
> used in more than 50 places within the kernel.
>
> The motivation for this work is to eventually safely support sgls that
> contain io memory. In order for that to work, any access to the contents
> of an iomem SGL will need to be done with an iomem-safe copy (e.g.
> memcpy_fromio()) or hit some warning. (The exact details of how this
> will work have yet to be worked out.) Having all the kmaps in one place
> is just a first step in that direction. Additionally, since this helps
> cut down on the users of sg_page, it should make any effort to go to
> struct-page-less DMAs a little easier (should that idea ever swing back
> into favour again).
>
> A flags option is added to select between a regular or atomic mapping
> so these functions can replace both kmap(sg_page(...)) and
> kmap_atomic(sg_page(...)). Future work may expand this to have flags
> for using page_address or vmap. We include a flag requiring the
> function not to fail, to support legacy code that has no easy error
> path. Much further in the future, there may be a flag to allocate
> memory and copy the data from/to iomem.
>
> We also add the semantic that sg_map can fail to create a mapping,
> even though the code it replaces is assumed never to fail and the
> current version of these functions cannot fail. This is to support
> iomem, which may either fail to create the mapping or have to allocate
> memory as a bounce buffer, which itself can fail.
>
> Also, in terms of cleanup, a few of the existing kmap(sg_page) users
> play things a bit loose with whether they apply sg->offset, so using
> these helper functions should help avoid such issues.
>
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---

Good to know that somebody is working on this. Those problems troubled
us as well.

Patch is Acked-by: Christian König <[email protected]>.

Regards,
Christian.


2017-04-26 18:11:33

by Logan Gunthorpe

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions



On 26/04/17 01:44 AM, Christoph Hellwig wrote:
> I think we'll at least need a draft of those to make sense of these
> patches. Otherwise they just look very clumsy.

Ok, I'll work up a draft proposal and send it in a couple of days. But
without a lot of cleanup, such as this series, it won't even compile.

> I'm sorry but this API is just a trainwreck. Right now we have the
> nice little kmap_atomic API, which never fails and has a very nice
> calling convention where we just pass back the return address, but does
> not support sleeping inside the critical section.
>
> And kmap, which may fail and requires the original page to be passed
> back. Anything that mixes these two concepts up is simply a non-starter.

Ok, well for starters I think you are mistaken about kmap being able to
fail. I'm having a hard time finding many users of that function that
bother to check for an error when calling it. The main difficulty we
have now is that neither of those functions is expected to fail, and we
need them to be able to in cases where the page doesn't map to system
RAM. This patch series is trying to address it for users of scatterlist.
I'm certainly open to other suggestions.

I also have to disagree that kmap and kmap_atomic are all that "nice".
Except for the sleeping restriction and performance, they effectively do
the same thing. And it was necessary to write a macro wrapper around
kunmap_atomic to ensure that users of that function don't screw it up.
(See 597781f3e5.) I'd say the kmap/kmap_atomic functions are the
trainwreck and I'm trying to do my best to cleanup a few cases.

There are a fair number of cases in the kernel that do something like:

if (something)
        x = kmap(page);
else
        x = kmap_atomic(page);
...
if (something)
        kunmap(page);
else
        kunmap_atomic(x);

Which just seems cumbersome to me.
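
With the helpers in this series, that pattern collapses to something
like the following sketch, where sg is the entry the page came from:

        flags = something ? SG_KMAP : SG_KMAP_ATOMIC;

        x = sg_map(sg, 0, flags);
        ...
        sg_unmap(sg, x, 0, flags);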

In any case, if you can accept an sg_kmap and sg_kmap_atomic API, just
say so and I'll make the change. But I'll still need a flags variable
for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path,
and both of those functions would end up being nearly replicas of each
other.

Logan



2017-04-26 23:30:47

by Logan Gunthorpe

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions



On 26/04/17 02:59 AM, Christian König wrote:
> Good to know that somebody is working on this. Those problems troubled
> us as well.

Thanks Christian. It's a daunting problem and there's a lot of work to
do before we will ever be where we need to be, so any help, even an ack,
is greatly appreciated.

Logan

2017-04-27 06:53:41

by Christoph Hellwig

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

On Wed, Apr 26, 2017 at 12:11:33PM -0600, Logan Gunthorpe wrote:
> Ok, well for starters I think you are mistaken about kmap being able to
> fail. I'm having a hard time finding many users of that function that
> bother to check for an error when calling it.

A quick audit of the arch code shows you're right - kmap can't fail
anywhere anymore.

> The main difficulty we
> have now is that neither of those functions is expected to fail, and we
> need them to be able to in cases where the page doesn't map to system
> RAM. This patch series is trying to address it for users of scatterlist.
> I'm certainly open to other suggestions.

I think you'll need to follow the existing kmap semantics and never
fail the iomem version either. Otherwise you'll have a special case
that's almost never used and has a different error path.

> There are a fair number of cases in the kernel that do something like:
>
> if (something)
>         x = kmap(page);
> else
>         x = kmap_atomic(page);
> ...
> if (something)
>         kunmap(page);
> else
>         kunmap_atomic(x);
>
> Which just seems cumbersome to me.

Passing a different flag based on something isn't really much better.

> In any case, if you can accept an sg_kmap and sg_kmap_atomic api just
> say so and I'll make the change. But I'll still need a flags variable
> for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path
> and both of those functions will need to be pretty nearly replicas of
> each other.

Again, wrong way. Suddenly making things that normally don't fail
start failing for your special case is a recipe for bugs.

2017-04-27 15:27:20

by Jason Gunthorpe

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:

> > The main difficulty we
> > have now is that neither of those functions is expected to fail, and we
> > need them to be able to in cases where the page doesn't map to system
> > RAM. This patch series is trying to address it for users of scatterlist.
> > I'm certainly open to other suggestions.
>
> I think you'll need to follow the existing kmap semantics and never
> fail the iomem version either. Otherwise you'll have a special case
> that's almost never used and has a different error path.

How about first switching as many call sites as possible to use
sg_copy_X_buffer instead of kmap?

A random audit of Logan's series suggests this is actually a fairly
common thing.

e.g. drivers/mmc/host/sdhci.c is only doing this:

buffer = sdhci_kmap_atomic(sg, &flags);
memcpy(buffer, align, size);
sdhci_kunmap_atomic(buffer, &flags);

drivers/scsi/mvsas/mv_sas.c is this:

+ to = sg_map(sg_resp, 0, SG_KMAP_ATOMIC);
+ memcpy(to,
+ slot->response + sizeof(struct mvs_err_info),
+ sg_dma_len(sg_resp));
+ sg_unmap(sg_resp, to, 0, SG_KMAP_ATOMIC);

etc.
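
For instance, assuming sg_resp is a single-entry list, the mvsas hunk
above could presumably collapse to one call to the existing helper:

        sg_copy_from_buffer(sg_resp, 1,
                            slot->response + sizeof(struct mvs_err_info),
                            sg_dma_len(sg_resp));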

Lots of other places seem similar, if not sometimes a little bit more
convoluted.

Switching all the trivial cases to use copy might bring more clarity
as to what is actually required for the remaining few users? If there
are only a few then it may no longer matter if the API is not ideal.

Jason

2017-04-27 15:44:18

by Logan Gunthorpe

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions



On 27/04/17 12:53 AM, Christoph Hellwig wrote:
> I think you'll need to follow the existing kmap semantics and never
> fail the iomem version either. Otherwise you'll have a special case
> that's almost never used and has a different error path.
>
> Again, wrong way. Suddenly making things that normally don't fail
> start failing for your special case is a recipe for bugs.

I don't disagree, but don't these restrictions make the problem
impossible to solve? If there is iomem behind a page in an SGL and
someone tries to map it, we either have to fail or we break the iomem
safety that was your original concern.

Logan


2017-04-27 15:57:58

by Logan Gunthorpe

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions



On 27/04/17 09:27 AM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:
> How about first switching as many call sites as possible to use
> sg_copy_X_buffer instead of kmap?

Yeah, I could look at doing that first.

One problem is that we might get more NAKs like Herbert Xu's, from
maintainers concerned about the performance implications.

These changes are definitely a bit more invasive than thin wrappers
around kmap calls.

> A random audit of Logan's series suggests this is actually a fairly
> common thing.

It's not _that_ common, but a significant fraction of call sites fit
the pattern. One of my patches actually did this to two places that
seemed to be reimplementing the sg_copy_X_buffer logic.

Thanks,

Logan

2017-04-27 20:13:47

by Logan Gunthorpe

Subject: Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions


On 26/04/17 01:44 AM, Christoph Hellwig wrote:
> I think we'll at least need a draft of those to make sense of these
> patches. Otherwise they just look very clumsy.

Ok, what follows is a draft patch attempting to show where I'm thinking
of going with this. Obviously it will not compile because it assumes
the users throughout the kernel are a bit different than they are today.
Notably, there is no sg_page anymore.
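
For example, a legacy sg_page() user would turn into something roughly
like this (a sketch against the draft API below only):

        struct page *pg;

        if (sg_to_mappable_page(sg, &pg))
                return -EFAULT;

        /* pg is now a regular page that is safe to map */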

There are also likely a ton of issues and arguments to have over a bunch
of the specifics below and I'd expect the concept to evolve more
as cleanup occurs. This itself is an evolution of the draft I posted
replying to you in my last RFC thread.

Also, before any of this is truly useful to us, pfn_t would have to
infect a few other places in the kernel.

Thanks,

Logan


diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index fad170b..85ef928 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -6,13 +6,14 @@
#include <linux/bug.h>
#include <linux/mm.h>
#include <linux/highmem.h>
+#include <linux/pfn_t.h>
#include <asm/io.h>

struct scatterlist {
#ifdef CONFIG_DEBUG_SG
unsigned long sg_magic;
#endif
- unsigned long page_link;
+ pfn_t pfn;
unsigned int offset;
unsigned int length;
dma_addr_t dma_address;
@@ -60,15 +61,68 @@ struct sg_table {

#define SG_MAGIC 0x87654321

-/*
- * We overload the LSB of the page pointer to indicate whether it's
- * a valid sg entry, or whether it points to the start of a new scatterlist.
- * Those low bits are there for everyone! (thanks mason :-)
- */
-#define sg_is_chain(sg) ((sg)->page_link & 0x01)
-#define sg_is_last(sg) ((sg)->page_link & 0x02)
-#define sg_chain_ptr(sg) \
- ((struct scatterlist *) ((sg)->page_link & ~0x03))
+static inline bool sg_is_chain(struct scatterlist *sg)
+{
+ return sg->pfn.val & PFN_SG_CHAIN;
+}
+
+static inline bool sg_is_last(struct scatterlist *sg)
+{
+ return sg->pfn.val & PFN_SG_LAST;
+}
+
+static inline struct scatterlist *sg_chain_ptr(struct scatterlist *sg)
+{
+ unsigned long sgl = pfn_t_to_pfn(sg->pfn);
+ return (struct scatterlist *)(sgl << PAGE_SHIFT);
+}
+
+static inline bool sg_is_iomem(struct scatterlist *sg)
+{
+ return pfn_t_is_iomem(sg->pfn);
+}
+
+/**
+ * sg_assign_pfn - Assign a given pfn_t to an SG entry
+ * @sg: SG entry
+ * @pfn: The pfn
+ *
+ * Description:
+ * Assign a pfn to an sg entry. Also see sg_set_pfn(), the most
+ * commonly used variant.
+ *
+ **/
+static inline void sg_assign_pfn(struct scatterlist *sg, pfn_t pfn)
+{
+#ifdef CONFIG_DEBUG_SG
+ BUG_ON(sg->sg_magic != SG_MAGIC);
+ BUG_ON(sg_is_chain(sg));
+ BUG_ON(pfn.val & (PFN_SG_CHAIN | PFN_SG_LAST));
+#endif
+
+ sg->pfn = pfn;
+}
+
+/**
+ * sg_set_pfn - Set sg entry to point at given pfn
+ * @sg: SG entry
+ * @pfn: The page
+ * @len: Length of data
+ * @offset: Offset into page
+ *
+ * Description:
+ * Use this function to set an sg entry pointing at a pfn, never assign
+ * the page directly. We encode sg table information in the lower bits
+ * of the page pointer. See sg_pfn_t for looking up the pfn_t belonging
+ * to an sg entry.
+ **/
+static inline void sg_set_pfn(struct scatterlist *sg, pfn_t pfn,
+ unsigned int len, unsigned int offset)
+{
+ sg_assign_pfn(sg, pfn);
+ sg->offset = offset;
+ sg->length = len;
+}

/**
* sg_assign_page - Assign a given page to an SG entry
@@ -82,18 +136,13 @@ struct sg_table {
**/
static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
{
- unsigned long page_link = sg->page_link & 0x3;
+ if (!page) {
+ pfn_t null_pfn = {0};
+ sg_assign_pfn(sg, null_pfn);
+ return;
+ }

- /*
- * In order for the low bit stealing approach to work, pages
- * must be aligned at a 32-bit boundary as a minimum.
- */
- BUG_ON((unsigned long) page & 0x03);
-#ifdef CONFIG_DEBUG_SG
- BUG_ON(sg->sg_magic != SG_MAGIC);
- BUG_ON(sg_is_chain(sg));
-#endif
- sg->page_link = page_link | (unsigned long) page;
+ sg_assign_pfn(sg, page_to_pfn_t(page));
}

/**
@@ -106,8 +155,7 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
* Description:
* Use this function to set an sg entry pointing at a page, never assign
* the page directly. We encode sg table information in the lower bits
- * of the page pointer. See sg_page() for looking up the page belonging
- * to an sg entry.
+ * of the page pointer.
*
**/
static inline void sg_set_page(struct scatterlist *sg, struct page *page,
@@ -118,13 +166,53 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page,
sg->length = len;
}

-static inline struct page *sg_page(struct scatterlist *sg)
+/**
+ * sg_pfn_t - Return the pfn_t for the sg
+ * @sg: SG entry
+ *
+ **/
+static inline pfn_t sg_pfn_t(struct scatterlist *sg)
{
#ifdef CONFIG_DEBUG_SG
BUG_ON(sg->sg_magic != SG_MAGIC);
BUG_ON(sg_is_chain(sg));
#endif
- return (struct page *)((sg)->page_link & ~0x3);
+
+ return sg->pfn;
+}
+
+/**
+ * sg_to_mappable_page - Try to return a struct page safe for general
+ * use in the kernel
+ * @sg: SG entry
+ * @page: A pointer to the returned page
+ *
+ * Description:
+ * If possible, return a mappable page that's safe for use around the
+ * kernel. Should only be used in legacy situations. sg_pfn_t() is a
+ * better choice for new code. This is deliberately more awkward than
+ * the old sg_page to enforce the __must_check rule and discourage future
+ * use.
+ *
+ * An example where this is required is in nvme-fabrics: a page from an
+ * sgl is placed into a bio. This function would be required until we can
+ * convert bios to use pfn_t as well. Similar issues with skbs, etc.
+ **/
+static inline __must_check int sg_to_mappable_page(struct scatterlist *sg,
+ struct page **ret)
+{
+ struct page *pg;
+
+ if (unlikely(sg_is_iomem(sg)))
+ return -EFAULT;
+
+ pg = pfn_t_to_page(sg->pfn);
+ if (unlikely(!pg))
+ return -EFAULT;
+
+ *ret = pg;
+
+ return 0;
}

#define SG_KMAP (1 << 0) /* create a mapping with kmap */
@@ -167,8 +255,19 @@ static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
unsigned int pg_off;
void *ret;

+ if (unlikely(sg_is_iomem(sg))) {
+ ret = ERR_PTR(-EFAULT);
+ goto out;
+ }
+
+ pg = pfn_t_to_page(sg->pfn);
+ if (unlikely(!pg)) {
+ ret = ERR_PTR(-EFAULT);
+ goto out;
+ }
+
offset += sg->offset;
- pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+ pg = nth_page(pg, offset >> PAGE_SHIFT);
pg_off = offset_in_page(offset);

if (flags & SG_KMAP_ATOMIC)
@@ -178,12 +277,7 @@ static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
else
ret = ERR_PTR(-EINVAL);

- /*
- * In theory, this can't happen yet. Once we start adding
- * unmappable memory, it also shouldn't happen unless developers
- * start putting unmappable struct pages in sgls and passing
- * them to code that doesn't support it.
- */
+out:
BUG_ON(flags & SG_MAP_MUST_NOT_FAIL && IS_ERR(ret));

return ret;
@@ -202,9 +296,15 @@ static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
static inline void sg_unmap(struct scatterlist *sg, void *addr,
size_t offset, int flags)
{
- struct page *pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+ struct page *pg;
unsigned int pg_off = offset_in_page(offset);

+ pg = pfn_t_to_page(sg->pfn);
+ if (unlikely(!pg))
+ return;
+
+ pg = nth_page(pg, offset >> PAGE_SHIFT);
+
if (flags & SG_KMAP_ATOMIC)
kunmap_atomic(addr - sg->offset - pg_off);
else if (flags & SG_KMAP)
@@ -246,17 +346,18 @@ static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
static inline void sg_chain(struct scatterlist *prv, unsigned int prv_nents,
struct scatterlist *sgl)
{
+ pfn_t pfn;
+ unsigned long _sgl = (unsigned long) sgl;
+
/*
* offset and length are unused for chain entry. Clear them.
*/
prv[prv_nents - 1].offset = 0;
prv[prv_nents - 1].length = 0;

- /*
- * Set lowest bit to indicate a link pointer, and make sure to clear
- * the termination bit if it happens to be set.
- */
- prv[prv_nents - 1].page_link = ((unsigned long) sgl | 0x01) & ~0x02;
+ BUG_ON(_sgl & ~PAGE_MASK);
+ pfn = __pfn_to_pfn_t(_sgl >> PAGE_SHIFT, PFN_SG_CHAIN);
+ prv[prv_nents - 1].pfn = pfn;
}

/**
@@ -276,8 +377,8 @@ static inline void sg_mark_end(struct scatterlist *sg)
/*
* Set termination bit, clear potential chain bit
*/
- sg->page_link |= 0x02;
- sg->page_link &= ~0x01;
+ sg->pfn.val |= PFN_SG_LAST;
+ sg->pfn.val &= ~PFN_SG_CHAIN;
}

/**
@@ -293,7 +394,7 @@ static inline void sg_unmark_end(struct scatterlist *sg)
#ifdef CONFIG_DEBUG_SG
BUG_ON(sg->sg_magic != SG_MAGIC);
#endif
- sg->page_link &= ~0x02;
+ sg->pfn.val &= ~PFN_SG_LAST;
}

/**
@@ -301,14 +402,13 @@ static inline void sg_unmark_end(struct scatterlist *sg)
* @sg: SG entry
*
* Description:
- * This calls page_to_phys() on the page in this sg entry, and adds the
- * sg offset. The caller must know that it is legal to call page_to_phys()
- * on the sg page.
+ * This calls pfn_t_to_phys() on the pfn in this sg entry, and adds the
+ * sg offset.
*
**/
static inline dma_addr_t sg_phys(struct scatterlist *sg)
{
- return page_to_phys(sg_page(sg)) + sg->offset;
+ return pfn_t_to_phys(sg->pfn) + sg->offset;
}

/**
@@ -323,7 +423,12 @@ static inline dma_addr_t sg_phys(struct scatterlist *sg)
**/
static inline void *sg_virt(struct scatterlist *sg)
{
- return page_address(sg_page(sg)) + sg->offset;
+ struct page *pg = pfn_t_to_page(sg->pfn);
+
+ BUG_ON(sg_is_iomem(sg));
+ BUG_ON(!pg);
+
+ return page_address(pg) + sg->offset;
}

int sg_nents(struct scatterlist *sg);
@@ -422,10 +527,18 @@ void __sg_page_iter_start(struct sg_page_iter *piter,
/**
* sg_page_iter_page - get the current page held by the page iterator
* @piter: page iterator holding the page
+ *
+ * This function will require some cleanup. Some users simply mark
+ * attributes of the pages, which is fine; others actually map them and
+ * will require some safety there.
*/
static inline struct page *sg_page_iter_page(struct sg_page_iter *piter)
{
- return nth_page(sg_page(piter->sg), piter->sg_pgoffset);
+ struct page *pg = pfn_t_to_page(piter->sg->pfn);
+ if (!pg)
+ return NULL;
+
+ return nth_page(pg, piter->sg_pgoffset);
}

/**
@@ -468,11 +581,13 @@ static inline dma_addr_t sg_page_iter_dma_address(struct sg_page_iter *piter)
#define SG_MITER_ATOMIC (1 << 0) /* use kmap_atomic */
#define SG_MITER_TO_SG (1 << 1) /* flush back to phys on unmap */
#define SG_MITER_FROM_SG (1 << 2) /* nop */
+#define SG_MITER_SUPPORTS_IOMEM (1 << 3) /* caller supports iomem */

struct sg_mapping_iter {
/* the following three fields can be accessed directly */
struct page *page; /* currently mapped page */
void *addr; /* pointer to the mapped area */
+ void __iomem *ioaddr; /* pointer to iomem */
size_t length; /* length of the mapped area */
size_t consumed; /* number of consumed bytes */
struct sg_page_iter piter; /* page iterator */
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index c6cf822..2d1c58c 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -571,6 +571,8 @@ EXPORT_SYMBOL(sg_miter_skip);
*/
bool sg_miter_next(struct sg_mapping_iter *miter)
{
+ void *addr;
+
sg_miter_stop(miter);

/*
@@ -580,13 +582,25 @@ bool sg_miter_next(struct sg_mapping_iter *miter)
if (!sg_miter_get_next_page(miter))
return false;

+ if (sg_is_iomem(miter->piter.sg) &&
+ !(miter->__flags & SG_MITER_SUPPORTS_IOMEM))
+ return false;
+
miter->page = sg_page_iter_page(&miter->piter);
miter->consumed = miter->length = miter->__remaining;

if (miter->__flags & SG_MITER_ATOMIC)
- miter->addr = kmap_atomic(miter->page) + miter->__offset;
+ addr = kmap_atomic(miter->page) + miter->__offset;
else
- miter->addr = kmap(miter->page) + miter->__offset;
+ addr = kmap(miter->page) + miter->__offset;
+
+ if (sg_is_iomem(miter->piter.sg)) {
+ miter->addr = NULL;
+ miter->ioaddr = (void __iomem *) addr;
+ } else {
+ miter->addr = addr;
+ miter->ioaddr = NULL;
+ }

return true;
}
@@ -651,7 +665,7 @@ size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
{
unsigned int offset = 0;
struct sg_mapping_iter miter;
- unsigned int sg_flags = SG_MITER_ATOMIC;
+ unsigned int sg_flags = SG_MITER_ATOMIC | SG_MITER_SUPPORTS_IOMEM;

if (to_buffer)
sg_flags |= SG_MITER_FROM_SG;
@@ -668,10 +682,17 @@ size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,

len = min(miter.length, buflen - offset);

- if (to_buffer)
- memcpy(buf + offset, miter.addr, len);
- else
- memcpy(miter.addr, buf + offset, len);
+ if (miter.addr) {
+ if (to_buffer)
+ memcpy(buf + offset, miter.addr, len);
+ else
+ memcpy(miter.addr, buf + offset, len);
+ } else if (miter.ioaddr) {
+ if (to_buffer)
+ memcpy_fromio(buf + offset, miter.ioaddr, len);
+ else
+ memcpy_toio(miter.ioaddr, buf + offset, len);
+ }

offset += len;
}