2022-02-16 13:55:24

by Miklos Szeredi

[permalink] [raw]
Subject: Re: race between vfs_rename and do_linkat (mv and link)

On Wed, 16 Feb 2022 at 14:18, Xavier Roche <[email protected]> wrote:
>
> On Wed, Feb 16, 2022 at 11:28:18AM +0100, Miklos Szeredi wrote:
> > Something like this:
> > diff --git a/fs/namei.c b/fs/namei.c
> > index 3f1829b3ab5b..dd6908cee49d 100644
>
> Tested-by: Xavier Roche <[email protected]>
>
> I confirm this completely fixes at least the specific race. Tested on a
> unpatched and then patched 5.16.5, with the trivial bash test, and then
> with a C++ torture test.

Thanks for testing.

One issue with the patch is nesting of lock_rename() calls in stacked
fs (rwsem is not allowed to recurse even for read locks).

So the lock needs to be per-sb, but then do_linkat() becomes more
complex due to not being able to use the filename_create() helper.
But it's still much simpler than the special lookup loop described by
Al.

Thanks,
Miklos
>
> Before:
> -------
>
> $ time ./linkbug
> Failed after 4 with No such file or directory
>
> real 0m0,004s
> user 0m0,000s
> sys 0m0,004s
>
> After:
> ------
>
> (no error after ten minutes of running the program)
>
> Torture test program:
> ---------------------
>
> /* Linux rename vs. linkat race condition.
> * Rationale: both (1) moving a file to a target and (2) linking the target to a file in parallel leads to a race
> * on Linux kernel.
> * Sample file courtesy of Xavier Grand at Algolia
> * g++ -pthread linkbug.c -o linkbug
> */
>
> #include <thread>
> #include <unistd.h>
> #include <assert.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <iostream>
> #include <string.h>
>
> static const char* producedDir = "/tmp";
> static const char* producedFile = "/tmp/file.txt";
> static const char* producedTmpFile = "/tmp/file.txt.tmp";
> static const char* producedThreadDir = "/tmp/tmp";
> static const char* producedThreadFile = "/tmp/file.txt.tmp.2";
>
> bool createFile(const char* path)
> {
> const int fdOut = open(path,
> O_WRONLY | O_CREAT | O_TRUNC | O_EXCL | O_CLOEXEC,
> S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);
> assert(fdOut != -1);
> assert(write(fdOut, "Foo", 4) == 4);
> assert(close(fdOut) == 0);
> return true;
> }
>
> void func()
> {
> int nbSuccess = 0;
> // Loop producedThread a hardlink of the file
> while (true) {
> if (link(producedFile, producedThreadFile) != 0) {
> std::cout << "Failed after " << nbSuccess << " with " << strerror(errno) << std::endl;
> exit(EXIT_FAILURE);
> } else {
> nbSuccess++;
> }
> assert(unlink(producedThreadFile) == 0);
> }
> }
>
> int main()
> {
> // Setup env
> unlink(producedTmpFile);
> unlink(producedFile);
> unlink(producedThreadFile);
> createFile(producedFile);
> mkdir(producedThreadDir, 0777);
>
> // Async thread doing a hardlink and moving it
> std::thread t(func);
> // Loop creating a .tmp and moving it
> while (true) {
> assert(createFile(producedTmpFile));
> assert(rename(producedTmpFile, producedFile) == 0);
> }
> return 0;
> }


2022-02-18 17:17:46

by Miklos Szeredi

[permalink] [raw]
Subject: Re: race between vfs_rename and do_linkat (mv and link)

On Wed, 16 Feb 2022 at 14:37, Miklos Szeredi <[email protected]> wrote:

> One issue with the patch is nesting of lock_rename() calls in stacked
> fs (rwsem is not allowed to recurse even for read locks).

Scratch that, there's no recursion since do_linkat() is not called
from stacked fs. And taking link_rwsem is optional for the link
operation, so this is fine. For stacked fs the race is hopefully
taken care by the top layer locking, no need to repeat it in lower
layers.

I've now sent this patch with a proper header comment to Al.

Thanks,
Miklos