Hi all,
I'm a user-space developer working on Wayland. Recently we've been
discussing about security considerations related to FD passing between
processes [1].
A Wayland compositor often needs to share read-only data with its
clients. Examples include a keyboard keymap, or a pixel format table.
The clients might be untrusted. The data sharing can happen by having
the compositor send a read-only FD (ie, a FD opened with O_RDONLY) to
clients.
It was assumed that passing such a FD wouldn't allow Wayland clients to
write to the file. However, it was recently discovered that procfs
allows to bypass this restriction. A process can open(2)
"/proc/self/fd/<fd>" with O_RDWR, and that will return a FD suitable for
writing. This also works when running the client inside a user namespace.
A PoC is available at [2] and can be tested inside a compositor which
uses this O_RDONLY strategy (e.g. wlroots compositors).
Question: is this intended behavior, or is this an oversight? If this is
intended behavior, what would be a good way to share a FD to another
process without allowing it to write to the underlying file?
Thanks,
Simon
[1]: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/issues/92
[2]: https://paste.sr.ht/~emersion/eac94b03f286e21f8362354b6af032291c00f8a7
On Thursday, May 12th, 2022 at 14:30, Amir Goldstein <[email protected]> wrote:
> Clients can also readlink("/proc/self/fd/<fd>") to get the path of the file
> and open it from its path (if path is accessible in their mount namespace).
What the compositor does is:
- shm_open with O_RDWR
- Write the kyeboard keymap
- shm_open again the same file with O_RDONLY
- shm_unlink
- Send the O_RDONLY FD to clients
Thus, the file doesn't exist anymore when clients get the FD.
> Would the clients typically have write permission to those files?
> Do they need to?
Compositors need to disallow clients from writing to the shared files.
If a client gets write access to the shared file, they can corrupt the
keyboard keymap (and other data) used by all other clients.
> > intended behavior, what would be a good way to share a FD to another
> > process without allowing it to write to the underlying file?
>
> If wayland can use a read-only bind mount to the location of the files that it
> needs to share, then re-open will get EROFS.
Wayland just uses FD passing via Unix sockets to share memory. It
doesn't (and can't) assume anything regarding the filesystem layout,
because the clients might be running in a separate namespace with a
completely different layout (e.g. Flatpak).
On Thursday, May 12th, 2022 at 12:37, Simon Ser <[email protected]> wrote:
> what would be a good way to share a FD to another
> process without allowing it to write to the underlying file?
(I'm reminded that memfd + seals exist for this purpose. Still, I'd be
interested to know whether that O_RDONLY/O_RDWR behavior is intended,
because it's pretty surprising. The motivation for using O_RDONLY over
memfd seals is that it isn't Linux-specific.)
On Thu, May 12, 2022 at 2:09 PM Simon Ser <[email protected]> wrote:
>
> Hi all,
>
> I'm a user-space developer working on Wayland. Recently we've been
> discussing about security considerations related to FD passing between
> processes [1].
>
> A Wayland compositor often needs to share read-only data with its
> clients. Examples include a keyboard keymap, or a pixel format table.
> The clients might be untrusted. The data sharing can happen by having
> the compositor send a read-only FD (ie, a FD opened with O_RDONLY) to
> clients.
>
> It was assumed that passing such a FD wouldn't allow Wayland clients to
> write to the file. However, it was recently discovered that procfs
> allows to bypass this restriction. A process can open(2)
> "/proc/self/fd/<fd>" with O_RDWR, and that will return a FD suitable for
> writing. This also works when running the client inside a user namespace.
> A PoC is available at [2] and can be tested inside a compositor which
> uses this O_RDONLY strategy (e.g. wlroots compositors).
>
> Question: is this intended behavior, or is this an oversight? If this is
Clients can also readlink("/proc/self/fd/<fd>") to get the path of the file
and open it from its path (if path is accessible in their mount namespace).
Would the clients typically have write permission to those files?
Do they need to?
> intended behavior, what would be a good way to share a FD to another
> process without allowing it to write to the underlying file?
>
If wayland can use a read-only bind mount to the location of the files that it
needs to share, then re-open will get EROFS.
Thanks,
Amir.
On Thu, 12 May 2022 at 14:41, Simon Ser <[email protected]> wrote:
>
> On Thursday, May 12th, 2022 at 12:37, Simon Ser <[email protected]> wrote:
>
> > what would be a good way to share a FD to another
> > process without allowing it to write to the underlying file?
>
> (I'm reminded that memfd + seals exist for this purpose. Still, I'd be
> interested to know whether that O_RDONLY/O_RDWR behavior is intended,
> because it's pretty surprising. The motivation for using O_RDONLY over
> memfd seals is that it isn't Linux-specific.)
Yes, this is intended. The /proc/$PID/fd/$FD file represents the
inode pointed to by $FD. So the open flags for $FD are irrelevant
when operating on the proc fd file.
Thanks,
Miklos
On Thu, May 12, 2022 at 3:38 PM Simon Ser <[email protected]> wrote:
>
> On Thursday, May 12th, 2022 at 14:30, Amir Goldstein <[email protected]> wrote:
>
> > Clients can also readlink("/proc/self/fd/<fd>") to get the path of the file
> > and open it from its path (if path is accessible in their mount namespace).
>
> What the compositor does is:
>
> - shm_open with O_RDWR
> - Write the kyeboard keymap
> - shm_open again the same file with O_RDONLY
> - shm_unlink
> - Send the O_RDONLY FD to clients
>
> Thus, the file doesn't exist anymore when clients get the FD.
From system POV, a readonly bind mount of /dev/shm
could be created (e.g. at /dev/shm-ro) and then wayland could open
the shm rdonly file from that path.
If wayland cannot rely on system to create the bind mount for it,
it could also clone its own mount namespace and create the
bind mount in its own namespace for opening the rdonly file.
But that implementation would be Linux specific an Linux has many
other APIs that were suggested on the linked gitlab issue.
You did not mention them in your question and did not say why
those solutions are not a good enough for your needs.
Thanks,
Amir.
On Thu, May 12, 2022 at 02:56:22PM +0200, Miklos Szeredi wrote:
> On Thu, 12 May 2022 at 14:41, Simon Ser <[email protected]> wrote:
> >
> > On Thursday, May 12th, 2022 at 12:37, Simon Ser <[email protected]> wrote:
> >
> > > what would be a good way to share a FD to another
> > > process without allowing it to write to the underlying file?
> >
> > (I'm reminded that memfd + seals exist for this purpose. Still, I'd be
> > interested to know whether that O_RDONLY/O_RDWR behavior is intended,
> > because it's pretty surprising. The motivation for using O_RDONLY over
> > memfd seals is that it isn't Linux-specific.)
>
> Yes, this is intended. The /proc/$PID/fd/$FD file represents the
> inode pointed to by $FD. So the open flags for $FD are irrelevant
> when operating on the proc fd file.
Fwiw, the original openat2() patchset contained upgrade masks which we
decided to split it out into a separate patchset.
The idea is that struct open_how would be extended with an upgrade mask
field which allows the opener to specify with which permissions a file
descriptor is allowed to be re-opened. This has quite a lot of
use-cases, especially in container runtimes. So one could open an fd and
restrict it from being re-opened with O_WRONLY. For container runtimes
this is a huge security win and for userspace in general it would
provide a backwards compatible way of e.g., making O_PATH fds
non-upgradable. The plan is to resend the extension at some point in the
not too distant future.
Christian
On 12/05/2022 14.38, Simon Ser wrote:
> On Thursday, May 12th, 2022 at 14:30, Amir Goldstein <[email protected]> wrote:
>
>> Clients can also readlink("/proc/self/fd/<fd>") to get the path of the file
>> and open it from its path (if path is accessible in their mount namespace).
>
> What the compositor does is:
>
> - shm_open with O_RDWR
> - Write the kyeboard keymap
> - shm_open again the same file with O_RDONLY
> - shm_unlink
> - Send the O_RDONLY FD to clients
>
> Thus, the file doesn't exist anymore when clients get the FD.
So, what happens if you do fchmod(fd, 0400) on the fd before passing it
to the client [1].
I assume the client is not running as the same uid as the compositor (so
it can't fchmod() the inode back); if it is, then it could just ptrace
you and all bets are off.
[1] or for that matter, simply specify 0400 as the mode argument when
creating the file - that's perfectly legal to do in conjunction with
O_RDWR | O_CREAT | O_EXCL, and should probably be done to prevent
anybody else from opening the same shm file with write permission before
it gets shm_unlinked.
Rasmus
On 2022-05-12, Simon Ser <[email protected]> wrote:
> Hi all,
>
> I'm a user-space developer working on Wayland. Recently we've been
> discussing about security considerations related to FD passing between
> processes [1].
>
> A Wayland compositor often needs to share read-only data with its
> clients. Examples include a keyboard keymap, or a pixel format table.
> The clients might be untrusted. The data sharing can happen by having
> the compositor send a read-only FD (ie, a FD opened with O_RDONLY) to
> clients.
>
> It was assumed that passing such a FD wouldn't allow Wayland clients to
> write to the file. However, it was recently discovered that procfs
> allows to bypass this restriction. A process can open(2)
> "/proc/self/fd/<fd>" with O_RDWR, and that will return a FD suitable for
> writing. This also works when running the client inside a user namespace.
> A PoC is available at [2] and can be tested inside a compositor which
> uses this O_RDONLY strategy (e.g. wlroots compositors).
>
> Question: is this intended behavior, or is this an oversight? If this is
> intended behavior, what would be a good way to share a FD to another
> process without allowing it to write to the underlying file?
This is currently intended behaviour, but I am working on a patchset to
fix it. This was originally meant to be included with openat2(2) along
with some other hardenings in order to add safe O_EMPTYPATH support (as
well as having the ability for you to open an O_PATH descriptor and
restrict how it can be re-opened).
The WIP patchset is in my repo[1]. The main issue at the moment is how
to deal with directories (for parity with *at(2) semantics as well as
our own sanity, using a /proc/self/fd/$n as a path component can't be
blocked so there's some more access mode fiddling necessary to make this
all cleaner). I should have an RFC version ready in a couple of weeks.
[1]: https://github.com/cyphar/linux/tree/magiclink/main
--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>
Hi!
> I'm a user-space developer working on Wayland. Recently we've been
> discussing about security considerations related to FD passing between
> processes [1].
>
> A Wayland compositor often needs to share read-only data with its
> clients. Examples include a keyboard keymap, or a pixel format table.
> The clients might be untrusted. The data sharing can happen by having
> the compositor send a read-only FD (ie, a FD opened with O_RDONLY) to
> clients.
>
> It was assumed that passing such a FD wouldn't allow Wayland clients to
> write to the file. However, it was recently discovered that procfs
> allows to bypass this restriction. A process can open(2)
> "/proc/self/fd/<fd>" with O_RDWR, and that will return a FD suitable for
> writing. This also works when running the client inside a user namespace.
> A PoC is available at [2] and can be tested inside a compositor which
> uses this O_RDONLY strategy (e.g. wlroots compositors).
>
> Question: is this intended behavior, or is this an oversight? If this is
> intended behavior, what would be a good way to share a FD to another
> process without allowing it to write to the underlying file?
Sounds like a bug. Not all world is Linux, and 'mount /proc' changing
security characteristics of fd passing is nasty and surprising.
We should not surprise people when it has security implications.
Best regards,
Pavel
--
On 2022-05-13, Christian Brauner <[email protected]> wrote:
> On Thu, May 12, 2022 at 02:56:22PM +0200, Miklos Szeredi wrote:
> > On Thu, 12 May 2022 at 14:41, Simon Ser <[email protected]> wrote:
> > >
> > > On Thursday, May 12th, 2022 at 12:37, Simon Ser <[email protected]> wrote:
> > >
> > > > what would be a good way to share a FD to another
> > > > process without allowing it to write to the underlying file?
> > >
> > > (I'm reminded that memfd + seals exist for this purpose. Still, I'd be
> > > interested to know whether that O_RDONLY/O_RDWR behavior is intended,
> > > because it's pretty surprising. The motivation for using O_RDONLY over
> > > memfd seals is that it isn't Linux-specific.)
> >
> > Yes, this is intended. The /proc/$PID/fd/$FD file represents the
> > inode pointed to by $FD. So the open flags for $FD are irrelevant
> > when operating on the proc fd file.
>
> Fwiw, the original openat2() patchset contained upgrade masks which we
> decided to split it out into a separate patchset.
>
> The idea is that struct open_how would be extended with an upgrade mask
> field which allows the opener to specify with which permissions a file
> descriptor is allowed to be re-opened. This has quite a lot of
> use-cases, especially in container runtimes. So one could open an fd and
> restrict it from being re-opened with O_WRONLY. For container runtimes
> this is a huge security win and for userspace in general it would
> provide a backwards compatible way of e.g., making O_PATH fds
> non-upgradable. The plan is to resend the extension at some point in the
> not too distant future.
I am currently working on reviving this patchset.
The main issue at the moment is that the semantics for how we should
deal with directories is a little difficult to define (we want to ignore
modes for directories because of *at(2) semantics but there's no fmode_t
bits at the moment representing that the flip is a directory), but I'm
working on it.
This is going to be included along with the O_EMPTYPATH feature (since
making this safe is IMHO a pre-requisite for O_EMPTYPATH).
--
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>