2019-01-22 10:02:24

by Joel Nider

Subject: [PATCH v3 0/3] update infiniband uverbs documentation

A small patchset to update the verbs API documentation with some
information regarding the ioctl syscall. First patch converts the
file format to ReST, since this is the new preferred format, moves
the file to Documentation/userspace-api, and updates the index.
The 2nd patch adds the new content, documenting a bit of the internal
workings of the kernel side of the API functions. The goal is to make
it easier for developers unfamiliar with the structure to understand
what is going on when adding a new function.
The 3rd patch updates the MAINTAINERS file.

v3 addresses comments from Jon, Willy and Jason:
The location of the new content should be driver-api
The location of the old (converted) content should be userspace-api
MAINTAINERS file must be updated

Joel Nider (3):
docs-rst: Convert user verbs doc to rst
docs-rst: driver-api: Add infiniband interface documentation
MAINTAINERS: add new RDMA/Infiniband documentation

Documentation/driver-api/index.rst | 1 +
Documentation/driver-api/infiniband.rst | 73 +++++++++++++++++++++++++
Documentation/infiniband/user_verbs.txt | 69 -----------------------
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/rdma_user_verbs.rst | 70 ++++++++++++++++++++++++
MAINTAINERS | 2 +
6 files changed, 147 insertions(+), 69 deletions(-)
create mode 100644 Documentation/driver-api/infiniband.rst
delete mode 100644 Documentation/infiniband/user_verbs.txt
create mode 100644 Documentation/userspace-api/rdma_user_verbs.rst

--
2.7.4



2019-01-22 10:02:32

by Joel Nider

Subject: [PATCH v3 1/3] docs-rst: Convert user verbs doc to rst

Move user_verbs from infiniband to userspace while changing the
format. Replace the existing Documentation/infiniband/user_verbs.txt
with Documentation/userspace-api/rdma_user_verbs.rst. No substantial
changes to the content - just some minor reformatting to have the
rendering come out nicely.

Since this documents a userspace API, its home should be with the
other userspace API docs.

Signed-off-by: Joel Nider <[email protected]>
---
Documentation/infiniband/user_verbs.txt | 69 ------------------------
Documentation/userspace-api/index.rst | 1 +
Documentation/userspace-api/rdma_user_verbs.rst | 70 +++++++++++++++++++++++++
3 files changed, 71 insertions(+), 69 deletions(-)
delete mode 100644 Documentation/infiniband/user_verbs.txt
create mode 100644 Documentation/userspace-api/rdma_user_verbs.rst

diff --git a/Documentation/infiniband/user_verbs.txt b/Documentation/infiniband/user_verbs.txt
deleted file mode 100644
index df049b9..0000000
--- a/Documentation/infiniband/user_verbs.txt
+++ /dev/null
@@ -1,69 +0,0 @@
-USERSPACE VERBS ACCESS
-
- The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
- enables direct userspace access to IB hardware via "verbs," as
- described in chapter 11 of the InfiniBand Architecture Specification.
-
- To use the verbs, the libibverbs library, available from
- https://github.com/linux-rdma/rdma-core, is required. libibverbs contains a
- device-independent API for using the ib_uverbs interface.
- libibverbs also requires appropriate device-dependent kernel and
- userspace driver for your InfiniBand hardware. For example, to use
- a Mellanox HCA, you will need the ib_mthca kernel module and the
- libmthca userspace driver be installed.
-
-User-kernel communication
-
- Userspace communicates with the kernel for slow path, resource
- management operations via the /dev/infiniband/uverbsN character
- devices. Fast path operations are typically performed by writing
- directly to hardware registers mmap()ed into userspace, with no
- system call or context switch into the kernel.
-
- Commands are sent to the kernel via write()s on these device files.
- The ABI is defined in drivers/infiniband/include/ib_user_verbs.h.
- The structs for commands that require a response from the kernel
- contain a 64-bit field used to pass a pointer to an output buffer.
- Status is returned to userspace as the return value of the write()
- system call.
-
-Resource management
-
- Since creation and destruction of all IB resources is done by
- commands passed through a file descriptor, the kernel can keep track
- of which resources are attached to a given userspace context. The
- ib_uverbs module maintains idr tables that are used to translate
- between kernel pointers and opaque userspace handles, so that kernel
- pointers are never exposed to userspace and userspace cannot trick
- the kernel into following a bogus pointer.
-
- This also allows the kernel to clean up when a process exits and
- prevent one process from touching another process's resources.
-
-Memory pinning
-
- Direct userspace I/O requires that memory regions that are potential
- I/O targets be kept resident at the same physical address. The
- ib_uverbs module manages pinning and unpinning memory regions via
- get_user_pages() and put_page() calls. It also accounts for the
- amount of memory pinned in the process's locked_vm, and checks that
- unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.
-
- Pages that are pinned multiple times are counted each time they are
- pinned, so the value of locked_vm may be an overestimate of the
- number of pages pinned by a process.
-
-/dev files
-
- To create the appropriate character device files automatically with
- udev, a rule like
-
- KERNEL=="uverbs*", NAME="infiniband/%k"
-
- can be used. This will create device nodes named
-
- /dev/infiniband/uverbs0
-
- and so on. Since the InfiniBand userspace verbs should be safe for
- use by non-privileged processes, it may be useful to add an
- appropriate MODE or GROUP to the udev rule.
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index a3233da..b82720f 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -20,6 +20,7 @@ place where this information is gathered.
seccomp_filter
unshare
spec_ctrl
+ rdma_user_verbs

.. only:: subproject and html

diff --git a/Documentation/userspace-api/rdma_user_verbs.rst b/Documentation/userspace-api/rdma_user_verbs.rst
new file mode 100644
index 0000000..ffc4aec
--- /dev/null
+++ b/Documentation/userspace-api/rdma_user_verbs.rst
@@ -0,0 +1,70 @@
+======================
+Userspace Verbs Access
+======================
+The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
+enables direct userspace access to IB hardware via "verbs," as
+described in chapter 11 of the InfiniBand Architecture Specification.
+
+To use the verbs, the libibverbs library, available from
+https://github.com/linux-rdma/rdma-core, is required. libibverbs contains a
+device-independent API for using the ib_uverbs interface.
+libibverbs also requires the appropriate device-dependent kernel and
+userspace drivers for your InfiniBand hardware. For example, to use
+a Mellanox HCA, you will need the ib_mthca kernel module and the
+libmthca userspace driver to be installed.
+
+User-kernel communication
+=========================
+Userspace communicates with the kernel for slow path, resource
+management operations via the /dev/infiniband/uverbsN character
+devices. Fast path operations are typically performed by writing
+directly to hardware registers mmap()ed into userspace, with no
+system call or context switch into the kernel.
+
+Commands are sent to the kernel via write()s on these device files.
+The ABI is defined in include/uapi/rdma/ib_user_verbs.h.
+The structs for commands that require a response from the kernel
+contain a 64-bit field used to pass a pointer to an output buffer.
+Status is returned to userspace as the return value of the write()
+system call.
+
+Resource management
+===================
+Since creation and destruction of all IB resources is done by
+commands passed through a file descriptor, the kernel can keep track
+of which resources are attached to a given userspace context. The
+ib_uverbs module maintains idr tables that are used to translate
+between kernel pointers and opaque userspace handles, so that kernel
+pointers are never exposed to userspace and userspace cannot trick
+the kernel into following a bogus pointer.
+
+This also allows the kernel to clean up when a process exits and
+prevent one process from touching another process's resources.
+
+Memory pinning
+==============
+Direct userspace I/O requires that memory regions that are potential
+I/O targets be kept resident at the same physical address. The
+ib_uverbs module manages pinning and unpinning memory regions via
+get_user_pages() and put_page() calls. It also accounts for the
+amount of memory pinned in the process's locked_vm, and checks that
+unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.
+
+Pages that are pinned multiple times are counted each time they are
+pinned, so the value of locked_vm may be an overestimate of the
+number of pages pinned by a process.
+
+/dev files
+==========
+To create the appropriate character device files automatically with
+udev, a rule like::
+
+ KERNEL=="uverbs*", NAME="infiniband/%k"
+
+can be used. This will create device nodes named::
+
+ /dev/infiniband/uverbs0
+
+and so on. Since the InfiniBand userspace verbs should be safe for
+use by non-privileged processes, it may be useful to add an
+appropriate MODE or GROUP to the udev rule.
--
2.7.4
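
The memory pinning section of the converted document says that unprivileged processes must stay within their RLIMIT_MEMLOCK limit. As a rough userspace illustration (not part of the patch), the sketch below reads that limit with Python's standard `resource` module and checks whether a pin request of a given size would fit. The helper name `fits_memlock` and its parameters are hypothetical. The kernel's real accounting is done per page against locked_vm and, as the text notes, counts a page each time it is pinned, so this is only an estimate.

```python
import resource


def fits_memlock(request_bytes, already_locked_bytes=0, limit=None):
    """Estimate whether a pin request stays within the soft memlock limit.

    `limit` defaults to the calling process's soft RLIMIT_MEMLOCK.
    Hypothetical helper: a byte-based approximation, not the kernel's
    page-based locked_vm check.
    """
    if limit is None:
        limit, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    if limit == resource.RLIM_INFINITY:
        # No limit configured for this process.
        return True
    return already_locked_bytes + request_bytes <= limit
```

Calling `fits_memlock(size)` before registering a large memory region gives an early hint that the registration may fail. The authoritative answer is still the error returned by the kernel.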


2019-01-22 10:03:50

by Joel Nider

Subject: [PATCH v3 3/3] MAINTAINERS: add new RDMA/Infiniband documentation

The RDMA (Infiniband) user verb documentation has been converted to .rst
format, and moved to the Documentation/userspace-api directory. In
addition, a new file is added to Documentation/driver-api with some
details on Infiniband internals. Thus, the MAINTAINERS file must be
updated with the responsibility of the new files.

Signed-off-by: Joel Nider <[email protected]>
---
MAINTAINERS | 2 ++
1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 2cf9c1c..977b83d 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -7487,6 +7487,8 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
S: Supported
F: Documentation/devicetree/bindings/infiniband/
F: Documentation/infiniband/
+F: Documentation/userspace-api/rdma_user_verbs.rst
+F: Documentation/driver-api/infiniband.rst
F: drivers/infiniband/
F: include/uapi/linux/if_infiniband.h
F: include/uapi/rdma/
--
2.7.4


2019-01-22 10:03:52

by Joel Nider

Subject: [PATCH v3 2/3] docs-rst: driver-api: Add infiniband interface documentation

A short document regarding the user verbs interface implementation on
the kernel side. Also, the corresponding index entry in the
documentation tree.

Signed-off-by: Joel Nider <[email protected]>
---
Documentation/driver-api/index.rst | 1 +
Documentation/driver-api/infiniband.rst | 73 +++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+)
create mode 100644 Documentation/driver-api/infiniband.rst

diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
index ab38ced..ecb3f8a5 100644
--- a/Documentation/driver-api/index.rst
+++ b/Documentation/driver-api/index.rst
@@ -28,6 +28,7 @@ available subsections can be seen below.
regulator
iio/index
input
+ infiniband
usb/index
firewire
pci/index
diff --git a/Documentation/driver-api/infiniband.rst b/Documentation/driver-api/infiniband.rst
new file mode 100644
index 0000000..2de47ff
--- /dev/null
+++ b/Documentation/driver-api/infiniband.rst
@@ -0,0 +1,73 @@
+==========================
+Infiniband Interface Guide
+==========================
+
+This guide is for people who wish to understand the implementation details of
+handler functions in the Infiniband subsystem. There are currently two system
+calls for executing Infiniband commands: write() and ioctl(). Older commands
+are sent to the kernel via write()s on the device files described in
+:doc:`../userspace-api/rdma_user_verbs`. New commands must use the ioctl()
+method. For completeness, both mechanisms are described here.
+
+The interface between userspace and kernel is kept in sync by checking the
+version number. In the kernel, it is defined by IB_USER_VERBS_ABI_VERSION
+(in include/uapi/rdma/ib_user_verbs.h).
+
+Write system call
+-----------------
+The entry point to the kernel is the ib_uverbs_write() function, which is
+invoked as a response to the 'write' system call. The requested function is
+looked up from an array called uverbs_cmd_table which contains function pointers
+to the various command handlers.
+
+Write Command Handlers
+~~~~~~~~~~~~~~~~~~~~~~
+These command handler functions are declared
+with the IB_VERBS_DECLARE_CMD macro in drivers/infiniband/core/uverbs.h. There
+are also extended commands, which are kept in a similar manner in the
+uverbs_ex_cmd_table. The extended commands use 64-bit values in the command
+header, as opposed to the 32-bit values used in the regular command table.
+
+Ioctl system call
+-----------------
+The entry point for the 'ioctl' system call is the ib_uverbs_ioctl() function.
+Unlike write(), ioctl() accepts a 'cmd' parameter, which must have the value
+defined by RDMA_VERBS_IOCTL. More documentation regarding the ioctl numbering
+scheme can be found in: Documentation/ioctl/ioctl-number.txt. The
+command-specific information is passed as a pointer in the 'arg' parameter,
+which is cast as a 'struct ib_uverbs_ioctl_hdr*'.
+
+The way command handler functions (methods) are looked up is more complicated
+than the array index used for write(). Here, the ib_uverbs_cmd_verbs() function
+uses a radix tree to search for the correct command handler. If the lookup
+succeeds, the method is invoked by ib_uverbs_run_method().
+
+Ioctl Command Handlers
+~~~~~~~~~~~~~~~~~~~~~~
+Command handlers (also known as 'methods') for ioctl are declared with the
+UVERBS_HANDLER macro. The handler is registered for use by the
+DECLARE_UVERBS_NAMED_METHOD macro, which binds the name of the handler with its
+attributes. By convention, the methods are implemented in files named with the
+prefix 'uverbs_std_types_'.
+
+Each method can accept a set of parameters called attributes. There are 6
+types of attributes: idr, fd, pointer, enum, const and flags. The idr attribute
+declares an indirect (translated) handle for the method, and
+specifies the object that the method will act upon. The first attribute should
+be a handle to the uobj (ib_uobject) which contains private data. There may be
+0 or more
+additional attributes, including other handles. The 'pointer' attribute must be
+specified as 'in' or 'out', depending on if it is an input from userspace, or
+meant to return a value to userspace.
+
+The method also needs to be bound to an object, which is done with the
+DECLARE_UVERBS_NAMED_OBJECT macro. This macro takes a variable
+number of methods and stores them in an array attached to the object.
+
+Objects are declared using DECLARE_UVERBS_NAMED_OBJECT macro. Most of the
+objects (including pd, mw, cq, etc.) are defined in uverbs_std_types.c,
+and the remaining objects are declared in files that are prefixed with the
+name 'uverbs_std_types_'.
+
+Objects trees are declared using the DECLARE_UVERBS_OBJECT_TREE macro. This
+combines all of the objects.
--
2.7.4
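
The two lookup schemes described in the patch above can be contrasted with a toy model. This is a Python sketch, not kernel code: every name in it is hypothetical, a plain list stands in for the write() command table, and a dictionary stands in for the radix tree used on the ioctl() path. Keying the ioctl lookup by an (object, method) pair is an assumption made for illustration, based on the text's statement that methods are bound to objects.

```python
# Toy model only: none of these names exist in the kernel.

def create_pd(payload):
    """Stand-in for a command handler."""
    return ("create_pd", payload)


def destroy_pd(payload):
    """Stand-in for a second command handler."""
    return ("destroy_pd", payload)


# write()-style dispatch: the command number indexes a flat table.
WRITE_TABLE = [create_pd, destroy_pd]


def dispatch_write(cmd, payload):
    if not 0 <= cmd < len(WRITE_TABLE):
        raise ValueError("unknown command")
    return WRITE_TABLE[cmd](payload)


# ioctl()-style dispatch: the handler (method) is found by a keyed lookup.
# A dict replaces the radix tree here.
IOCTL_METHODS = {
    ("pd", "create"): create_pd,
    ("pd", "destroy"): destroy_pd,
}


def dispatch_ioctl(object_id, method_id, attrs):
    handler = IOCTL_METHODS.get((object_id, method_id))
    if handler is None:
        raise LookupError("no such method")
    return handler(attrs)
```

The model shows only the shape of the two lookups. Attribute validation, handle translation and locking, which the real ioctl path also performs, are left out.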


2019-01-30 12:59:12

by Joel Nider

Subject: Re: [PATCH v3 0/3] update infiniband uverbs documentation

Hi Jon,

Have you had a chance to review this patchset?

Thanks,

Joel Nider/Haifa/IBM@IBMIL wrote on 01/22/2019 12:00:32 PM:

> From: Joel Nider/Haifa/IBM@IBMIL
> To: "Jonathan Corbet" <[email protected]>
> Cc: "Jason Gunthorpe" <[email protected]>, "Leon Romanovsky" <[email protected]>,
> "Doug Ledford" <[email protected]>, "Mike Rapoport" <[email protected]>,
> Joel Nider/Haifa/IBM@IBMIL, [email protected], [email protected]
> Date: 01/22/2019 12:00 PM
> Subject: [PATCH v3 0/3] update infiniband uverbs documentation
>
> A small patchset to update the verbs API documentation with some
> information regarding the ioctl syscall. First patch converts the
> file format to ReST, since this is the new preferred format, moves
> the file to Documentation/userspace-api, and updates the index.
> The 2nd patch adds the new content, documenting a bit of the internal
> workings of the kernel side of the API functions. The goal is to make
> it easier for developers unfamiliar with the structure to understand
> what is going on when adding a new function.
> The 3rd patch updates the MAINTAINERS file.
>
> v3 addresses comments from Jon, Willy and Jason:
> The location of the new content should be driver-api
> The location of the old (converted) content should be userspace-api
> MAINTAINERS file must be updated
>
> Joel Nider (3):
> docs-rst: Convert user verbs doc to rst
> docs-rst: driver-api: Add infiniband interface documentation
> MAINTAINERS: add new RDMA/Infiniband documentation
>
> Documentation/driver-api/index.rst | 1 +
> Documentation/driver-api/infiniband.rst | 73 +++++++++++++++++++++++++
> Documentation/infiniband/user_verbs.txt | 69 -----------------------
> Documentation/userspace-api/index.rst | 1 +
> Documentation/userspace-api/rdma_user_verbs.rst | 70 ++++++++++++++++++++++++
> MAINTAINERS | 2 +
> 6 files changed, 147 insertions(+), 69 deletions(-)
> create mode 100644 Documentation/driver-api/infiniband.rst
> delete mode 100644 Documentation/infiniband/user_verbs.txt
> create mode 100644 Documentation/userspace-api/rdma_user_verbs.rst
>
> --
> 2.7.4
>



2019-01-30 19:25:55

by Jonathan Corbet

Subject: Re: [PATCH v3 0/3] update infiniband uverbs documentation

On Wed, 30 Jan 2019 14:57:21 +0200
"Joel Nider" <[email protected]> wrote:

> Have you had a chance to review this patchset?

I've been mostly away from the keyboard for the last week; will be back
and dealing with things soon.

Thanks,

jon

2019-02-01 17:49:58

by Jason Gunthorpe

Subject: Re: [PATCH v3 0/3] update infiniband uverbs documentation

On Tue, Jan 22, 2019 at 12:00:32PM +0200, Joel Nider wrote:
> A small patchset to update the verbs API documentation with some
> information regarding the ioctl syscall. First patch converts the
> file format to ReST, since this is the new preferred format, moves
> the file to Documentation/userspace-api, and updates the index.
> The 2nd patch adds the new content, documenting a bit of the internal
> workings of the kernel side of the API functions. The goal is to make
> it easier for developers unfamiliar with the structure to understand
> what is going on when adding a new function.
> The 3rd patch updates the MAINTAINERS file.
>
> v3 addresses comments from Jon, Willy and Jason:
> The location of the new content should be driver-api
> The location of the old (converted) content should be userspace-api
> MAINTAINERS file must be updated
>
> Joel Nider (3):
> docs-rst: Convert user verbs doc to rst
> docs-rst: driver-api: Add infiniband interface documentation
> MAINTAINERS: add new RDMA/Infiniband documentation

Doc folks, what is the feedback on these patches? Should I take
them through the rdma tree?

Thanks,
Jason

2019-02-01 17:59:55

by Jason Gunthorpe

Subject: Re: [PATCH v3 2/3] docs-rst: driver-api: Add infiniband interface documentation

On Tue, Jan 22, 2019 at 12:00:34PM +0200, Joel Nider wrote:
> A short document regarding the user verbs interface implementation on
> the kernel side. Also, the corresponding index entry in the
> documentation tree.
>
> Signed-off-by: Joel Nider <[email protected]>
> Documentation/driver-api/index.rst | 1 +
> Documentation/driver-api/infiniband.rst | 73 +++++++++++++++++++++++++++++++++
> 2 files changed, 74 insertions(+)
> create mode 100644 Documentation/driver-api/infiniband.rst
>
> diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst
> index ab38ced..ecb3f8a5 100644
> +++ b/Documentation/driver-api/index.rst
> @@ -28,6 +28,7 @@ available subsections can be seen below.
> regulator
> iio/index
> input
> + infiniband
> usb/index
> firewire
> pci/index
> diff --git a/Documentation/driver-api/infiniband.rst b/Documentation/driver-api/infiniband.rst
> new file mode 100644
> index 0000000..2de47ff
> +++ b/Documentation/driver-api/infiniband.rst
> @@ -0,0 +1,73 @@
> +==========================
> +Infiniband Interface Guide
> +==========================
> +
> +This guide is for people who wish to understand the implementation details of
> +handler functions in the Infiniband subsystem. There are currently two system
> +calls for executing Infiniband commands: write() and ioctl(). Older commands
> +are sent to the kernel via write()s on the device files described in
> +:doc:`../userspace-api/rdma_user_verbs`. New commands must use the ioctl()
> +method. For completeness, both mechanisms are described here.
> +
> +The interface between userspace and kernel is kept in sync by checking the
> +version number. In the kernel, it is defined by IB_USER_VERBS_ABI_VERSION
> +(in include/uapi/rdma/ib_user_verbs.h).
> +
> +Write system call
> +-----------------
> +The entry point to the kernel is the ib_uverbs_write() function, which is
> +invoked as a response to the 'write' system call. The requested function is
> +looked up from an array called uverbs_cmd_table which contains function pointers
> +to the various command handlers.

This array was deleted recently

> +Write Command Handlers
> +~~~~~~~~~~~~~~~~~~~~~~
> +These command handler functions are declared
> +with the IB_VERBS_DECLARE_CMD macro in drivers/infiniband/core/uverbs.h. There
> +are also extended commands, which are kept in a similar manner in the
> +uverbs_ex_cmd_table. The extended commands use 64-bit values in the command
> +header, as opposed to the 32-bit values used in the regular command table.

IB_VERBS_DECLARE_CMD is also deleted

> +Objects trees are declared using the DECLARE_UVERBS_OBJECT_TREE macro. This
> +combines all of the objects.

DECLARE_UVERBS_OBJECT_TREE as well

Jason

2019-02-01 23:17:30

by Jonathan Corbet

Subject: Re: [PATCH v3 0/3] update infiniband uverbs documentation

On Fri, 1 Feb 2019 09:52:52 -0700
Jason Gunthorpe <[email protected]> wrote:

> Doc folks, what is the feedback on these patches? Should I take
> them through the rdma tree?

If you think they are ready (it seemed that there were still comments on
one of the patches?) I'll take them, just let me know.

Thanks,

jon (aka "doc folks" :)

2019-02-03 12:58:20

by Joel Nider

Subject: Re: [PATCH v3 0/3] update infiniband uverbs documentation

Jonathan Corbet <[email protected]> wrote on 02/02/2019 01:15:33 AM:

> Subject: Re: [PATCH v3 0/3] update infiniband uverbs documentation
>
> On Fri, 1 Feb 2019 09:52:52 -0700
> Jason Gunthorpe <[email protected]> wrote:
>
> > Doc folks, what is the feedback on these patches? Should I take
> > them through the rdma tree?
>
> If you think they are ready (it seemed that there were still comments on
> one of the patches?) I'll take them, just let me know.

From Jason's latest review, it looks like 90% of what I wrote is no longer
relevant:
> This array was deleted recently
> IB_VERBS_DECLARE_CMD is also deleted
> DECLARE_UVERBS_OBJECT_TREE as well

So I'm abandoning this patch.

> Thanks,
>
> jon (aka "doc folks" :)
>