2021-01-26 19:55:55

by Fox Chen

[permalink] [raw]
Subject: [PATCH 00/12] docs: path-lookup: Update pathlookup docs

The Path lookup is a very complex subject in VFS. The path-lookup
document provides a very detailed guidance to help people understand
how path lookup works in the kernel.This document was originally
written based on three lwn articles five years ago. As times goes by,
some of the content was outdated. This patchset is intended to update
the document to make it more relevant to current codebase.


Fox Chen (12):
docs: path-lookup: update follow_managed() part
docs: path-lookup: update path_to_nameidata() parth
docs: path-lookup: update path_mountpoint() part
docs: path-lookup: update do_last() part
docs: path-lookup: remove filename_mountpoint
docs: path-lookup: Add macro name to symlink limit description
docs: path-lookup: i_op->follow_link replaced with i_op->get_link
docs: path-lookup: update i_op->put_link and cookie description
docs: path-lookup: no get_link()
docs: path-lookup: update WALK_GET, WALK_PUT desc
docs: path-lookup: update get_link() ->follow_link description
docs: path-lookup: update symlink description

Documentation/filesystems/path-lookup.rst | 146 ++++++++++------------
1 file changed, 63 insertions(+), 83 deletions(-)

--
2.30.0


2021-01-26 20:00:20

by Fox Chen

[permalink] [raw]
Subject: [PATCH 06/12] docs: path-lookup: Add macro name to symlink limit description

Add macro name MAXSYMLINKS to the symlink limit description, so
that it is consistent with path name length description above.

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index bc450e0864d6..25d2a5a59f45 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -994,8 +994,8 @@ is 4096. There are a number of reasons for this limit; not letting the
kernel spend too much time on just one path is one of them. With
symbolic links you can effectively generate much longer paths so some
sort of limit is needed for the same reason. Linux imposes a limit of
-at most 40 symlinks in any one path lookup. It previously imposed a
-further limit of eight on the maximum depth of recursion, but that was
+at most 40 (MAXSYMLINKS) symlinks in any one path lookup. It previously imposed
+a further limit of eight on the maximum depth of recursion, but that was
raised to 40 when a separate stack was implemented, so there is now
just the one limit.

--
2.30.0

2021-01-26 20:00:21

by Fox Chen

[permalink] [raw]
Subject: [PATCH 04/12] docs: path-lookup: update do_last() part

traling_symlink() was merged into lookup_last, do_last().

do_last() has later been split into open_last_lookups()
and do_open().

see related commit: c5971b8c6354a95c9ee7eb722928af5000bac247

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 34 +++++++++++------------
1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 4e77c8520fa9..1f05b1417a55 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -496,11 +496,11 @@ one provided by a dead NFS server. In the current kernel, path_mountpoint
has been merged into ``path_lookup_at()`` with a new flag LOOKUP_MOUNTPOINT.

Finally ``path_openat()`` is used for the ``open()`` system call; it
-contains, in support functions starting with "``do_last()``", all the
+contains, in support functions starting with "``open_last_lookups()``", all the
complexity needed to handle the different subtleties of O_CREAT (with
or without O_EXCL), final "``/``" characters, and trailing symbolic
links. We will revisit this in the final part of this series, which
-focuses on those symbolic links. "``do_last()``" will sometimes, but
+focuses on those symbolic links. "``open_last_lookups()``" will sometimes, but
not always, take ``i_rwsem``, depending on what it finds.

Each of these, or the functions which call them, need to be alert to
@@ -1201,26 +1201,26 @@ symlink.
This case is handled by the relevant caller of ``link_path_walk()``, such as
``path_lookupat()`` using a loop that calls ``link_path_walk()``, and then
handles the final component. If the final component is a symlink
-that needs to be followed, then ``trailing_symlink()`` is called to set
-things up properly and the loop repeats, calling ``link_path_walk()``
-again. This could loop as many as 40 times if the last component of
-each symlink is another symlink.
+that needs to be followed, then ``open_last_lookups()`` and ``do_open()`` is
+called to set things up properly and the loop repeats, calling
+``link_path_walk()`` again. This could loop as many as 40 times if the last
+component of each symlink is another symlink.

The various functions that examine the final component and possibly
-report that it is a symlink are ``lookup_last()``, ``mountpoint_last()``
-and ``do_last()``, each of which use the same convention as
-``walk_component()`` of returning ``1`` if a symlink was found that needs
-to be followed.
+report that it is a symlink are ``lookup_last()``, ``open_last_lookups()``
+, each of which use the same convention as
+``walk_component()`` of returning ``char *name`` if a symlink was found that
+needs to be followed.

-Of these, ``do_last()`` is the most interesting as it is used for
-opening a file. Part of ``do_last()`` runs with ``i_rwsem`` held and this
-part is in a separate function: ``lookup_open()``.
+Of these, ``open_last_lookups()``, ``do_open()`` is the most interesting as it is
+used for opening a file. Part of ``open_last_lookups()`` runs with ``i_rwsem``
+held and this part is in a separate function: ``lookup_open()``.

-Explaining ``do_last()`` completely is beyond the scope of this article,
-but a few highlights should help those interested in exploring the
-code.
+Explaining ``open_last_lookups()``, ``do_open()`` completely is beyond the scope
+of this article, but a few highlights should help those interested in exploring
+the code.

-1. Rather than just finding the target file, ``do_last()`` needs to open
+1. Rather than just finding the target file, ``do_open()`` needs to open
it. If the file was found in the dcache, then ``vfs_open()`` is used for
this. If not, then ``lookup_open()`` will either call ``atomic_open()`` (if
the filesystem provides it) to combine the final lookup with the open, or
--
2.30.0

2021-01-26 20:04:02

by Fox Chen

[permalink] [raw]
Subject: Re: [PATCH 00/12] docs: path-lookup: Update pathlookup docs

On Tue, Jan 26, 2021 at 3:25 PM Fox Chen <[email protected]> wrote:
>
> The Path lookup is a very complex subject in VFS. The path-lookup
> document provides a very detailed guidance to help people understand
> how path lookup works in the kernel.This document was originally
> written based on three lwn articles five years ago. As times goes by,
> some of the content was outdated. This patchset is intended to update
> the document to make it more relevant to current codebase.
>
>
> Fox Chen (12):
> docs: path-lookup: update follow_managed() part
> docs: path-lookup: update path_to_nameidata() parth
> docs: path-lookup: update path_mountpoint() part
> docs: path-lookup: update do_last() part
> docs: path-lookup: remove filename_mountpoint
> docs: path-lookup: Add macro name to symlink limit description
> docs: path-lookup: i_op->follow_link replaced with i_op->get_link
> docs: path-lookup: update i_op->put_link and cookie description
> docs: path-lookup: no get_link()
> docs: path-lookup: update WALK_GET, WALK_PUT desc
> docs: path-lookup: update get_link() ->follow_link description
> docs: path-lookup: update symlink description
>
> Documentation/filesystems/path-lookup.rst | 146 ++++++++++------------
> 1 file changed, 63 insertions(+), 83 deletions(-)
>
> --
> 2.30.0
>


To help review the patches, I annotated and highlighted changes using hypothesis

you can check the current docs:
https://hyp.is/go?url=https%3A%2F%2Fwww.kernel.org%2Fdoc%2Fhtml%2Flatest%2Ffilesystems%2Fpath-lookup.html&group=__world__

the docs after patches applied:
https://hyp.is/go?url=http%3A%2F%2Flinux-docs.54fox.com%2Flinux_docs%2Ffilesystems%2Fpath-lookup.html&group=__world__

2021-01-26 20:04:03

by Fox Chen

[permalink] [raw]
Subject: [PATCH 08/12] docs: path-lookup: update i_op->put_link and cookie description

No inode->put_link operation anymore. We use delayed_call to
deal with link destruction. Cookie has been replaced with
struct delayed_call.

Related commit: fceef393a538134f03b778c5d2519e670269342f

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 31 +++++++----------------
1 file changed, 9 insertions(+), 22 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 0a362849b26f..8572300b5405 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -1068,34 +1068,21 @@ method. This is called both in RCU-walk and REF-walk. In RCU-walk the
RCU-walk. Much like the ``i_op->permission()`` method we
looked at previously, ``->get_link()`` would need to be careful that
all the data structures it references are safe to be accessed while
-holding no counted reference, only the RCU lock. Though getting a
-reference with ``->follow_link()`` is not yet done in RCU-walk mode, the
-code is ready to release the reference when that does happen.
-
-This need to drop the reference to a symlink adds significant
-complexity. It requires a reference to the inode so that the
-``i_op->put_link()`` inode operation can be called. In REF-walk, that
-reference is kept implicitly through a reference to the dentry, so
-keeping the ``struct path`` of the symlink is easiest. For RCU-walk,
-the pointer to the inode is kept separately. To allow switching from
-RCU-walk back to REF-walk in the middle of processing nested symlinks
-we also need the seq number for the dentry so we can confirm that
-switching back was safe.
-
-Finally, when providing a reference to a symlink, the filesystem also
-provides an opaque "cookie" that must be passed to ``->put_link()`` so that it
-knows what to free. This might be the allocated memory area, or a
-pointer to the ``struct page`` in the page cache, or something else
-completely. Only the filesystem knows what it is.
+holding no counted reference, only the RCU lock.
+
+Finally, a callback ``struct delayed_called`` will be passed to get_link,
+file systems can set their own put_link function and argument through
+``set_delayed_call``. Latter on, when vfs wants to put link, it will call
+``do_delayed_call`` to invoke that callback function with the argument.

In order for the reference to each symlink to be dropped when the walk completes,
whether in RCU-walk or REF-walk, the symlink stack needs to contain,
along with the path remnants:

-- the ``struct path`` to provide a reference to the inode in REF-walk
-- the ``struct inode *`` to provide a reference to the inode in RCU-walk
+- the ``struct path`` to provide a reference to the previous path
+- the ``const char *`` to provide a reference to the to previous name
- the ``seq`` to allow the path to be safely switched from RCU-walk to REF-walk
-- the ``cookie`` that tells ``->put_path()`` what to put.
+- the ``struct delayed_call`` for later invocation.

This means that each entry in the symlink stack needs to hold five
pointers and an integer instead of just one pointer (the path
--
2.30.0

2021-01-26 20:04:06

by Fox Chen

[permalink] [raw]
Subject: [PATCH 12/12] docs: path-lookup: update symlink description

instead of lookup_real()/vfs_create(), i_op->lookup() and
i_op->create() will be called directly.

update vfs_open() logic

should_follow_link is merged into lookup_last() or open_last_lookup()
which returns symlink name instead of an integer.

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 2bb3ca486acd..0c6fc296056c 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -1204,16 +1204,15 @@ the code.
it. If the file was found in the dcache, then ``vfs_open()`` is used for
this. If not, then ``lookup_open()`` will either call ``atomic_open()`` (if
the filesystem provides it) to combine the final lookup with the open, or
- will perform the separate ``lookup_real()`` and ``vfs_create()`` steps
+ will perform the separate ``i_op->lookup()`` and ``i_op->create()`` steps
directly. In the later case the actual "open" of this newly found or
created file will be performed by ``vfs_open()``, just as if the name
were found in the dcache.

2. ``vfs_open()`` can fail with ``-EOPENSTALE`` if the cached information
- wasn't quite current enough. Rather than restarting the lookup from
- the top with ``LOOKUP_REVAL`` set, ``lookup_open()`` is called instead,
- giving the filesystem a chance to resolve small inconsistencies.
- If that doesn't work, only then is the lookup restarted from the top.
+ wasn't quite current enough. If it's in RCU-walk -ECHILD will be returned
+ otherwise will return -ESTALE. When -ESTALE is returned, the caller may
+ retry with LOOKUP_REVAL flag set.

3. An open with O_CREAT **does** follow a symlink in the final component,
unlike other creation system calls (like ``mkdir``). So the sequence::
@@ -1223,8 +1222,8 @@ the code.

will create a file called ``/tmp/bar``. This is not permitted if
``O_EXCL`` is set but otherwise is handled for an O_CREAT open much
- like for a non-creating open: ``should_follow_link()`` returns ``1``, and
- so does ``do_last()`` so that ``trailing_symlink()`` gets called and the
+ like for a non-creating open: ``lookup_last()`` or ``open_last_lookup()``
+ returns a non ``Null`` value, and ``link_path_walk()`` gets called and the
open process continues on the symlink that was found.

Updating the access time
--
2.30.0

2021-01-27 05:57:01

by Fox Chen

[permalink] [raw]
Subject: [PATCH 02/12] docs: path-lookup: update path_to_nameidata() parth

No path_to_namei() anymore, step_into() will be called.
Related commit: c99687a03a78775f77d57fe9b07af4c8ec3dd03c

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index e778db767120..2ad96e1e3c49 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -455,7 +455,7 @@ In the absence of symbolic links, ``walk_component()`` creates a new
``struct path`` containing a counted reference to the new dentry and a
reference to the new ``vfsmount`` which is only counted if it is
different from the previous ``vfsmount``. It then calls
-``path_to_nameidata()`` to install the new ``struct path`` in the
+``step_into()`` to install the new ``struct path`` in the
``struct nameidata`` and drop the unneeded references.

This "hand-over-hand" sequencing of getting a reference to the new
--
2.30.0

2021-01-27 05:57:29

by Fox Chen

[permalink] [raw]
Subject: [PATCH 11/12] docs: path-lookup: update get_link() ->follow_link description

get_link() is merged into pick_link(). i_op->follow_link is
replaced with i_op->get_link(). get_link() can return ERR_PTR(0)
which equals NULL.

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 921779a4636f..2bb3ca486acd 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -1137,10 +1137,10 @@ Symlinks with no final component

A pair of special-case symlinks deserve a little further explanation.
Both result in a new ``struct path`` (with mount and dentry) being set
-up in the ``nameidata``, and result in ``get_link()`` returning ``NULL``.
+up in the ``nameidata``, and result in ``pick_link()`` returning ``NULL``.

The more obvious case is a symlink to "``/``". All symlinks starting
-with "``/``" are detected in ``get_link()`` which resets the ``nameidata``
+with "``/``" are detected in ``pick_link()`` which resets the ``nameidata``
to point to the effective filesystem root. If the symlink only
contains "``/``" then there is nothing more to do, no components at all,
so ``NULL`` is returned to indicate that the symlink can be released and
@@ -1157,12 +1157,11 @@ something that looks like a symlink. It is really a reference to the
target file, not just the name of it. When you ``readlink`` these
objects you get a name that might refer to the same file - unless it
has been unlinked or mounted over. When ``walk_component()`` follows
-one of these, the ``->follow_link()`` method in "procfs" doesn't return
+one of these, the ``->get_link()`` method in "procfs" doesn't return
a string name, but instead calls ``nd_jump_link()`` which updates the
-``nameidata`` in place to point to that target. ``->follow_link()`` then
-returns ``NULL``. Again there is no final component and ``get_link()``
-reports this by leaving the ``last_type`` field of ``nameidata`` as
-``LAST_BIND``.
+``nameidata`` in place to point to that target. ``->get_link()`` then
+returns ``0``. Again there is no final component and ``pick_link()``
+returns NULL.

Following the symlink in the final component
--------------------------------------------
--
2.30.0

2021-01-27 05:57:42

by Fox Chen

[permalink] [raw]
Subject: [PATCH 07/12] docs: path-lookup: i_op->follow_link replaced with i_op->get_link

follow_link has been replaced by get_link() which can be
called in RCU mode.

see commit: 6b2553918d8b4e6de9853fd6315bec7271a2e592

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 25d2a5a59f45..0a362849b26f 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -1062,13 +1062,11 @@ filesystem cannot successfully get a reference in RCU-walk mode, it
must return ``-ECHILD`` and ``unlazy_walk()`` will be called to return to
REF-walk mode in which the filesystem is allowed to sleep.

-The place for all this to happen is the ``i_op->follow_link()`` inode
-method. In the present mainline code this is never actually called in
-RCU-walk mode as the rewrite is not quite complete. It is likely that
-in a future release this method will be passed an ``inode`` pointer when
-called in RCU-walk mode so it both (1) knows to be careful, and (2) has the
-validated pointer. Much like the ``i_op->permission()`` method we
-looked at previously, ``->follow_link()`` would need to be careful that
+The place for all this to happen is the ``i_op->get_link()`` inode
+method. This is called both in RCU-walk and REF-walk. In RCU-walk the
+``dentry*`` argument is NULL, ``->get_link()`` can return -ECHILD to drop
+RCU-walk. Much like the ``i_op->permission()`` method we
+looked at previously, ``->get_link()`` would need to be careful that
all the data structures it references are safe to be accessed while
holding no counted reference, only the RCU lock. Though getting a
reference with ``->follow_link()`` is not yet done in RCU-walk mode, the
--
2.30.0

2021-01-27 05:58:01

by Fox Chen

[permalink] [raw]
Subject: [PATCH 09/12] docs: path-lookup: no get_link()

no get_link() anymore. we have step_into() and pick_link().

walk_component() will call step_into(), in turn call pick_link,
and return symlink name.

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 8572300b5405..915c9ffe22c1 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -1106,12 +1106,10 @@ doesn't need to notice. Getting this ``name`` variable on and off the
stack is very straightforward; pushing and popping the references is
a little more complex.

-When a symlink is found, ``walk_component()`` returns the value ``1``
-(``0`` is returned for any other sort of success, and a negative number
-is, as usual, an error indicator). This causes ``get_link()`` to be
-called; it then gets the link from the filesystem. Providing that
-operation is successful, the old path ``name`` is placed on the stack,
-and the new value is used as the ``name`` for a while. When the end of
+When a symlink is found, ``walk_component()`` calls ``pick_link()``,
+it then gets the link from the filesystem returning new path ``name``.
+Providing that operation is successful, the old path ``name`` is placed on the
+stack, and the new value is used as the ``name`` for a while. When the end of
the path is found (i.e. ``*name`` is ``'\0'``) the old ``name`` is restored
off the stack and path walking continues.

--
2.30.0

2021-01-27 05:58:03

by Fox Chen

[permalink] [raw]
Subject: [PATCH 10/12] docs: path-lookup: update WALK_GET, WALK_PUT desc

WALK_GET is changed to WALK_TRAILING with a different meaning.
Here it should be WALK_NOFOLLOW. WALK_PUT dosn't exist, we have
WALK_MORE.

WALK_PUT == !WALK_MORE

And there is not should_follow_link().

related commits:
8c4efe22e7c4de1d44f624120a979e31e3a584b8
1c4ff1a87e46a06fc00a83da2fbbc3ac0298f221

Signed-off-by: Fox Chen <[email protected]>
---
Documentation/filesystems/path-lookup.rst | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
index 915c9ffe22c1..921779a4636f 100644
--- a/Documentation/filesystems/path-lookup.rst
+++ b/Documentation/filesystems/path-lookup.rst
@@ -1126,13 +1126,11 @@ stack in ``walk_component()`` immediately when the symlink is found;
old symlink as it walks that last component. So it is quite
convenient for ``walk_component()`` to release the old symlink and pop
the references just before pushing the reference information for the
-new symlink. It is guided in this by two flags; ``WALK_GET``, which
-gives it permission to follow a symlink if it finds one, and
-``WALK_PUT``, which tells it to release the current symlink after it has been
-followed. ``WALK_PUT`` is tested first, leading to a call to
-``put_link()``. ``WALK_GET`` is tested subsequently (by
-``should_follow_link()``) leading to a call to ``pick_link()`` which sets
-up the stack frame.
+new symlink. It is guided in this by two flags; ``WALK_NOFOLLOW``, which
+suggests whether to follow a symlink if it finds one, and
+``WALK_MORE``, which tells whether to release the current symlink after it has
+been followed. ``WALK_MORE`` is tested first, leading to a call to
+``put_link()``.

Symlinks with no final component
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
--
2.30.0

2021-01-27 19:51:27

by Jonathan Corbet

[permalink] [raw]
Subject: Re: [PATCH 00/12] docs: path-lookup: Update pathlookup docs

On Tue, 26 Jan 2021 15:24:31 +0800
Fox Chen <[email protected]> wrote:

> The Path lookup is a very complex subject in VFS. The path-lookup
> document provides a very detailed guidance to help people understand
> how path lookup works in the kernel.This document was originally
> written based on three lwn articles five years ago. As times goes by,
> some of the content was outdated. This patchset is intended to update
> the document to make it more relevant to current codebase.
>
>
> Fox Chen (12):
> docs: path-lookup: update follow_managed() part
> docs: path-lookup: update path_to_nameidata() parth
> docs: path-lookup: update path_mountpoint() part
> docs: path-lookup: update do_last() part
> docs: path-lookup: remove filename_mountpoint
> docs: path-lookup: Add macro name to symlink limit description
> docs: path-lookup: i_op->follow_link replaced with i_op->get_link
> docs: path-lookup: update i_op->put_link and cookie description
> docs: path-lookup: no get_link()
> docs: path-lookup: update WALK_GET, WALK_PUT desc
> docs: path-lookup: update get_link() ->follow_link description
> docs: path-lookup: update symlink description
>
> Documentation/filesystems/path-lookup.rst | 146 ++++++++++------------
> 1 file changed, 63 insertions(+), 83 deletions(-)

Neil Brown (copied) is the original author of this document; I'd really
like his feedback on these changes. Neil, the full set is at:

https://lore.kernel.org/lkml/[email protected]/

Thanks,

jon

2021-01-28 03:26:25

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 02/12] docs: path-lookup: update path_to_nameidata() parth

On Tue, Jan 26 2021, Fox Chen wrote:

> No path_to_namei() anymore, step_into() will be called.
> Related commit: c99687a03a78775f77d57fe9b07af4c8ec3dd03c
>
> Signed-off-by: Fox Chen <[email protected]>
> ---
> Documentation/filesystems/path-lookup.rst | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
> index e778db767120..2ad96e1e3c49 100644
> --- a/Documentation/filesystems/path-lookup.rst
> +++ b/Documentation/filesystems/path-lookup.rst
> @@ -455,7 +455,7 @@ In the absence of symbolic links, ``walk_component()`` creates a new
> ``struct path`` containing a counted reference to the new dentry and a
> reference to the new ``vfsmount`` which is only counted if it is
> different from the previous ``vfsmount``. It then calls
> -``path_to_nameidata()`` to install the new ``struct path`` in the
> +``step_into()`` to install the new ``struct path`` in the
> ``struct nameidata`` and drop the unneeded references.

The logic describe here is now embodied by the code in step_into(), so
the change doesn't make the description any more correct.

Possibly you need to change the hero of the story from walk_component()
to step_into(), but that is just a guess.

NeilBrown


>
> This "hand-over-hand" sequencing of getting a reference to the new
> --
> 2.30.0


Attachments:
signature.asc (869.00 B)

2021-01-28 04:07:53

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 04/12] docs: path-lookup: update do_last() part

On Tue, Jan 26 2021, Fox Chen wrote:

> traling_symlink() was merged into lookup_last, do_last().
>
> do_last() has later been split into open_last_lookups()
> and do_open().
>
> see related commit: c5971b8c6354a95c9ee7eb722928af5000bac247
>
> Signed-off-by: Fox Chen <[email protected]>
> ---
> Documentation/filesystems/path-lookup.rst | 34 +++++++++++------------
> 1 file changed, 17 insertions(+), 17 deletions(-)
>
> diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
> index 4e77c8520fa9..1f05b1417a55 100644
> --- a/Documentation/filesystems/path-lookup.rst
> +++ b/Documentation/filesystems/path-lookup.rst
> @@ -496,11 +496,11 @@ one provided by a dead NFS server. In the current kernel, path_mountpoint
> has been merged into ``path_lookup_at()`` with a new flag LOOKUP_MOUNTPOINT.
>
> Finally ``path_openat()`` is used for the ``open()`` system call; it
> -contains, in support functions starting with "``do_last()``", all the
> +contains, in support functions starting with "``open_last_lookups()``", all the
> complexity needed to handle the different subtleties of O_CREAT (with
> or without O_EXCL), final "``/``" characters, and trailing symbolic
> links. We will revisit this in the final part of this series, which
> -focuses on those symbolic links. "``do_last()``" will sometimes, but
> +focuses on those symbolic links. "``open_last_lookups()``" will sometimes, but
> not always, take ``i_rwsem``, depending on what it finds.
>
> Each of these, or the functions which call them, need to be alert to
> @@ -1201,26 +1201,26 @@ symlink.
> This case is handled by the relevant caller of ``link_path_walk()``, such as
> ``path_lookupat()`` using a loop that calls ``link_path_walk()``, and then
> handles the final component. If the final component is a symlink
> -that needs to be followed, then ``trailing_symlink()`` is called to set
> -things up properly and the loop repeats, calling ``link_path_walk()``
> -again. This could loop as many as 40 times if the last component of
> -each symlink is another symlink.
> +that needs to be followed, then ``open_last_lookups()`` and ``do_open()`` is
> +called to set things up properly and the loop repeats, calling

This implies that do_open() is inside the loop (in path_openat()). But
it isn't, it comes after the loop.

(I haven't closely examined this rest of this patch).

NeilBrown


> +``link_path_walk()`` again. This could loop as many as 40 times if the last
> +component of each symlink is another symlink.
>
> The various functions that examine the final component and possibly
> -report that it is a symlink are ``lookup_last()``, ``mountpoint_last()``
> -and ``do_last()``, each of which use the same convention as
> -``walk_component()`` of returning ``1`` if a symlink was found that needs
> -to be followed.
> +report that it is a symlink are ``lookup_last()``, ``open_last_lookups()``
> +, each of which use the same convention as
> +``walk_component()`` of returning ``char *name`` if a symlink was found that
> +needs to be followed.
>
> -Of these, ``do_last()`` is the most interesting as it is used for
> -opening a file. Part of ``do_last()`` runs with ``i_rwsem`` held and this
> -part is in a separate function: ``lookup_open()``.
> +Of these, ``open_last_lookups()``, ``do_open()`` is the most interesting as it is
> +used for opening a file. Part of ``open_last_lookups()`` runs with ``i_rwsem``
> +held and this part is in a separate function: ``lookup_open()``.
>
> -Explaining ``do_last()`` completely is beyond the scope of this article,
> -but a few highlights should help those interested in exploring the
> -code.
> +Explaining ``open_last_lookups()``, ``do_open()`` completely is beyond the scope
> +of this article, but a few highlights should help those interested in exploring
> +the code.
>
> -1. Rather than just finding the target file, ``do_last()`` needs to open
> +1. Rather than just finding the target file, ``do_open()`` needs to open
> it. If the file was found in the dcache, then ``vfs_open()`` is used for
> this. If not, then ``lookup_open()`` will either call ``atomic_open()`` (if
> the filesystem provides it) to combine the final lookup with the open, or
> --
> 2.30.0


Attachments:
signature.asc (869.00 B)

2021-01-28 04:09:40

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 08/12] docs: path-lookup: update i_op->put_link and cookie description

On Tue, Jan 26 2021, Fox Chen wrote:

> No inode->put_link operation anymore. We use delayed_call to
> deal with link destruction. Cookie has been replaced with
> struct delayed_call.
>
> Related commit: fceef393a538134f03b778c5d2519e670269342f
>
> Signed-off-by: Fox Chen <[email protected]>
> ---
> Documentation/filesystems/path-lookup.rst | 31 +++++++----------------
> 1 file changed, 9 insertions(+), 22 deletions(-)
>
> diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
> index 0a362849b26f..8572300b5405 100644
> --- a/Documentation/filesystems/path-lookup.rst
> +++ b/Documentation/filesystems/path-lookup.rst
> @@ -1068,34 +1068,21 @@ method. This is called both in RCU-walk and REF-walk. In RCU-walk the
> RCU-walk. Much like the ``i_op->permission()`` method we
> looked at previously, ``->get_link()`` would need to be careful that
> all the data structures it references are safe to be accessed while
> -holding no counted reference, only the RCU lock. Though getting a
> -reference with ``->follow_link()`` is not yet done in RCU-walk mode, the
> -code is ready to release the reference when that does happen.
> -
> -This need to drop the reference to a symlink adds significant
> -complexity. It requires a reference to the inode so that the
> -``i_op->put_link()`` inode operation can be called. In REF-walk, that
> -reference is kept implicitly through a reference to the dentry, so
> -keeping the ``struct path`` of the symlink is easiest. For RCU-walk,
> -the pointer to the inode is kept separately. To allow switching from
> -RCU-walk back to REF-walk in the middle of processing nested symlinks
> -we also need the seq number for the dentry so we can confirm that
> -switching back was safe.
> -
> -Finally, when providing a reference to a symlink, the filesystem also
> -provides an opaque "cookie" that must be passed to ``->put_link()`` so that it
> -knows what to free. This might be the allocated memory area, or a
> -pointer to the ``struct page`` in the page cache, or something else
> -completely. Only the filesystem knows what it is.
> +holding no counted reference, only the RCU lock.
> +
> +Finally, a callback ``struct delayed_called`` will be passed to get_link,
> +file systems can set their own put_link function and argument through
> +``set_delayed_call``. Latter on, when vfs wants to put link, it will call

"Later", not "Latter".

Also, I'm not sure that the "Finally" at the start of the sentence makes
sense any more. I think it was meant to introduce the final part of the
"significant complexity", but now that significant complexity is gone.
At least, I assume it is gone. I haven't checked to code to see if
maybe it has just been moved.

NeilBrown


> +``do_delayed_call`` to invoke that callback function with the argument.
>
> In order for the reference to each symlink to be dropped when the walk completes,
> whether in RCU-walk or REF-walk, the symlink stack needs to contain,
> along with the path remnants:
>
> -- the ``struct path`` to provide a reference to the inode in REF-walk
> -- the ``struct inode *`` to provide a reference to the inode in RCU-walk
> +- the ``struct path`` to provide a reference to the previous path
> +- the ``const char *`` to provide a reference to the to previous name
> - the ``seq`` to allow the path to be safely switched from RCU-walk to REF-walk
> -- the ``cookie`` that tells ``->put_path()`` what to put.
> +- the ``struct delayed_call`` for later invocation.
>
> This means that each entry in the symlink stack needs to hold five
> pointers and an integer instead of just one pointer (the path
> --
> 2.30.0


Attachments:
signature.asc (869.00 B)

2021-01-28 04:15:53

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 07/12] docs: path-lookup: i_op->follow_link replaced with i_op->get_link

On Tue, Jan 26 2021, Fox Chen wrote:

> follow_link has been replaced by get_link() which can be
> called in RCU mode.
>
> see commit: 6b2553918d8b4e6de9853fd6315bec7271a2e592
>
> Signed-off-by: Fox Chen <[email protected]>
> ---
> Documentation/filesystems/path-lookup.rst | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/Documentation/filesystems/path-lookup.rst b/Documentation/filesystems/path-lookup.rst
> index 25d2a5a59f45..0a362849b26f 100644
> --- a/Documentation/filesystems/path-lookup.rst
> +++ b/Documentation/filesystems/path-lookup.rst
> @@ -1062,13 +1062,11 @@ filesystem cannot successfully get a reference in RCU-walk mode, it
> must return ``-ECHILD`` and ``unlazy_walk()`` will be called to return to
> REF-walk mode in which the filesystem is allowed to sleep.
>
> -The place for all this to happen is the ``i_op->follow_link()`` inode
> -method. In the present mainline code this is never actually called in
> -RCU-walk mode as the rewrite is not quite complete. It is likely that
> -in a future release this method will be passed an ``inode`` pointer when
> -called in RCU-walk mode so it both (1) knows to be careful, and (2) has the
> -validated pointer. Much like the ``i_op->permission()`` method we
> -looked at previously, ``->follow_link()`` would need to be careful that
> +The place for all this to happen is the ``i_op->get_link()`` inode
> +method. This is called both in RCU-walk and REF-walk. In RCU-walk the
> +``dentry*`` argument is NULL, ``->get_link()`` can return -ECHILD to drop
> +RCU-walk. Much like the ``i_op->permission()`` method we

The phrase "drop RCU-walk" isn't consistent with the rest of the text.
It talks about "dropping down to REF-walk", so you could write "dropping
out of RCU-walk", but not just "dropping RCU-walk".

NeilBrown


> +looked at previously, ``->get_link()`` would need to be careful that
> all the data structures it references are safe to be accessed while
> holding no counted reference, only the RCU lock. Though getting a
> reference with ``->follow_link()`` is not yet done in RCU-walk mode, the
> --
> 2.30.0


Attachments:
signature.asc (869.00 B)

2021-01-28 04:16:15

by NeilBrown

[permalink] [raw]
Subject: Re: [PATCH 00/12] docs: path-lookup: Update pathlookup docs

On Tue, Jan 26 2021, Fox Chen wrote:

> The Path lookup is a very complex subject in VFS. The path-lookup
> document provides a very detailed guidance to help people understand
> how path lookup works in the kernel.This document was originally
> written based on three lwn articles five years ago. As times goes by,
> some of the content was outdated. This patchset is intended to update
> the document to make it more relevant to current codebase.
>
>
> Fox Chen (12):
> docs: path-lookup: update follow_managed() part
> docs: path-lookup: update path_to_nameidata() parth
> docs: path-lookup: update path_mountpoint() part
> docs: path-lookup: update do_last() part
> docs: path-lookup: remove filename_mountpoint
> docs: path-lookup: Add macro name to symlink limit description
> docs: path-lookup: i_op->follow_link replaced with i_op->get_link
> docs: path-lookup: update i_op->put_link and cookie description
> docs: path-lookup: no get_link()
> docs: path-lookup: update WALK_GET, WALK_PUT desc
> docs: path-lookup: update get_link() ->follow_link description
> docs: path-lookup: update symlink description
>

Thanks for doing this. I've responded individually to several of the
patches. As you can see from my comments, there is often more to it
than just changing function names. In some places you have capture the
more general nature of the change fairly well. In other places the
result is incoherent or confusion.
Making small updates to this sort of documentation is not easy. You
need to step have and see the "big picture", to overall story-arc.
Sometimes you can fit changes into that arc, sometimes you might need to
restructure or re-tell the story.

This is part of why I haven't put much effort into the document -
re-telling a story can be a lot of work.

NeilBrown


Attachments:
signature.asc (869.00 B)