2021-10-21 05:56:31

by Alistair Francis

[permalink] [raw]
Subject: [PATCH v2] uapi: futex: Add a futex syscall

From: Alistair Francis <[email protected]>

This commit adds two futex syscall wrappers that are exposed to
userspace.

Neither the kernel or glibc currently expose a futex wrapper, so
userspace is left performing raw syscalls. This has mostly been becuase
the overloading of one of the arguments makes it impossible to provide a
single type safe function.

Until recently the single syscall has worked fine. With the introduction
of a 64-bit time_t futex call on 32-bit architectures, this has become
more complex. The logic of handling the two possible futex syscalls is
complex and often implemented incorrectly.

This patch adds two futux syscall functions that correctly handle the
time_t complexity for userspace.

This idea is based on previous discussions: https://lkml.org/lkml/2021/9/21/143

Signed-off-by: Alistair Francis <[email protected]>
---
include/uapi/linux/futex_syscall.h | 81 ++++++++++++++++++++++++++++++
1 file changed, 81 insertions(+)
create mode 100644 include/uapi/linux/futex_syscall.h

diff --git a/include/uapi/linux/futex_syscall.h b/include/uapi/linux/futex_syscall.h
new file mode 100644
index 0000000000000..f84a0c68baf78
--- /dev/null
+++ b/include/uapi/linux/futex_syscall.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_FUTEX_SYSCALL_H
+#define _UAPI_LINUX_FUTEX_SYSCALL_H
+
+#include <asm/unistd.h>
+#include <errno.h>
+#include <linux/types.h>
+#include <linux/time_types.h>
+#include <sys/syscall.h>
+
+/**
+ * futex_syscall_timeout() - __NR_futex/__NR_futex_time64 syscall wrapper
+ * @uaddr: address of first futex
+ * @op: futex op code
+ * @val: typically expected value of uaddr, but varies by op
+ * @timeout: an absolute struct timespec
+ * @uaddr2: address of second futex for some ops
+ * @val3: varies by op
+ */
+static inline int
+__kernel_futex_syscall_timeout(volatile u_int32_t *uaddr, int op, u_int32_t val,
+ struct timespec *timeout, volatile u_int32_t *uaddr2, int val3)
+{
+#if defined(__NR_futex_time64)
+ if (sizeof(*timeout) != sizeof(struct __kernel_old_timespec)) {
+ int ret = syscall(__NR_futex_time64, uaddr, op, val, timeout, uaddr2, val3);
+
+ if (ret == 0 || errno != ENOSYS)
+ return ret;
+ }
+#endif
+
+#if defined(__NR_futex)
+ if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
+ return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
+
+ if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
+ struct __kernel_old_timespec ts32;
+
+ ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;
+ ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
+
+ return syscall(__NR_futex, uaddr, op, val, &ts32, uaddr2, val3);
+ } else if (!timeout) {
+ return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
+ }
+#endif
+
+ errno = ENOSYS;
+ return -1;
+}
+
+/**
+ * futex_syscall_nr_requeue() - __NR_futex/__NR_futex_time64 syscall wrapper
+ * @uaddr: address of first futex
+ * @op: futex op code
+ * @val: typically expected value of uaddr, but varies by op
+ * @nr_requeue: an op specific meaning
+ * @uaddr2: address of second futex for some ops
+ * @val3: varies by op
+ */
+static inline int
+__kernel_futex_syscall_nr_requeue(volatile u_int32_t *uaddr, int op, u_int32_t val,
+ u_int32_t nr_requeue, volatile u_int32_t *uaddr2, int val3)
+{
+#if defined(__NR_futex_time64)
+ int ret = syscall(__NR_futex_time64, uaddr, op, val, nr_requeue, uaddr2, val3);
+
+ if (ret == 0 || errno != ENOSYS)
+ return ret;
+#endif
+
+#if defined(__NR_futex)
+ return syscall(__NR_futex, uaddr, op, val, nr_requeue, uaddr2, val3);
+#endif
+
+ errno = ENOSYS;
+ return -1;
+}
+
+#endif /* _UAPI_LINUX_FUTEX_SYSCALL_H */
--
2.31.1


2021-10-21 06:34:09

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v2] uapi: futex: Add a futex syscall

On Thu, Oct 21, 2021 at 7:54 AM Alistair Francis
<[email protected]> wrote:
>
> From: Alistair Francis <[email protected]>
>
> This commit adds two futex syscall wrappers that are exposed to
> userspace.
>
> Neither the kernel or glibc currently expose a futex wrapper, so
> userspace is left performing raw syscalls. This has mostly been becuase
> the overloading of one of the arguments makes it impossible to provide a
> single type safe function.
>
> Until recently the single syscall has worked fine. With the introduction
> of a 64-bit time_t futex call on 32-bit architectures, this has become
> more complex. The logic of handling the two possible futex syscalls is
> complex and often implemented incorrectly.
>
> This patch adds two futux syscall functions that correctly handle the
> time_t complexity for userspace.
>
> This idea is based on previous discussions: https://lkml.org/lkml/2021/9/21/143
>
> Signed-off-by: Alistair Francis <[email protected]>

This looks good to me, it addresses my earlier feedback, but I think we
need others to look into the question of whether we want this to be a
single function (as I suggested last time) or a pair of them (as you did).

I just replied to your email about this at
https://lore.kernel.org/lkml/CAK8P3a1CxFfHze6id1sQbQXV-x8DXkEdfqh51MwabzwhKAoTdQ@mail.gmail.com/

I added the futex maintainers and the linux-api list to Cc for them to
reply. Full patch quoted below, no further comments from me.

Arnd

> ---
> include/uapi/linux/futex_syscall.h | 81 ++++++++++++++++++++++++++++++
> 1 file changed, 81 insertions(+)
> create mode 100644 include/uapi/linux/futex_syscall.h
>
> diff --git a/include/uapi/linux/futex_syscall.h b/include/uapi/linux/futex_syscall.h
> new file mode 100644
> index 0000000000000..f84a0c68baf78
> --- /dev/null
> +++ b/include/uapi/linux/futex_syscall.h
> @@ -0,0 +1,81 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +#ifndef _UAPI_LINUX_FUTEX_SYSCALL_H
> +#define _UAPI_LINUX_FUTEX_SYSCALL_H
> +
> +#include <asm/unistd.h>
> +#include <errno.h>
> +#include <linux/types.h>
> +#include <linux/time_types.h>
> +#include <sys/syscall.h>
> +
> +/**
> + * futex_syscall_timeout() - __NR_futex/__NR_futex_time64 syscall wrapper
> + * @uaddr: address of first futex
> + * @op: futex op code
> + * @val: typically expected value of uaddr, but varies by op
> + * @timeout: an absolute struct timespec
> + * @uaddr2: address of second futex for some ops
> + * @val3: varies by op
> + */
> +static inline int
> +__kernel_futex_syscall_timeout(volatile u_int32_t *uaddr, int op, u_int32_t val,
> + struct timespec *timeout, volatile u_int32_t *uaddr2, int val3)
> +{
> +#if defined(__NR_futex_time64)
> + if (sizeof(*timeout) != sizeof(struct __kernel_old_timespec)) {
> + int ret = syscall(__NR_futex_time64, uaddr, op, val, timeout, uaddr2, val3);
> +
> + if (ret == 0 || errno != ENOSYS)
> + return ret;
> + }
> +#endif
> +
> +#if defined(__NR_futex)
> + if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
> + return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
> +
> + if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
> + struct __kernel_old_timespec ts32;
> +
> + ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;
> + ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
> +
> + return syscall(__NR_futex, uaddr, op, val, &ts32, uaddr2, val3);
> + } else if (!timeout) {
> + return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
> + }
> +#endif
> +
> + errno = ENOSYS;
> + return -1;
> +}
> +
> +/**
> + * futex_syscall_nr_requeue() - __NR_futex/__NR_futex_time64 syscall wrapper
> + * @uaddr: address of first futex
> + * @op: futex op code
> + * @val: typically expected value of uaddr, but varies by op
> + * @nr_requeue: an op specific meaning
> + * @uaddr2: address of second futex for some ops
> + * @val3: varies by op
> + */
> +static inline int
> +__kernel_futex_syscall_nr_requeue(volatile u_int32_t *uaddr, int op, u_int32_t val,
> + u_int32_t nr_requeue, volatile u_int32_t *uaddr2, int val3)
> +{
> +#if defined(__NR_futex_time64)
> + int ret = syscall(__NR_futex_time64, uaddr, op, val, nr_requeue, uaddr2, val3);
> +
> + if (ret == 0 || errno != ENOSYS)
> + return ret;
> +#endif
> +
> +#if defined(__NR_futex)
> + return syscall(__NR_futex, uaddr, op, val, nr_requeue, uaddr2, val3);
> +#endif
> +
> + errno = ENOSYS;
> + return -1;
> +}
> +
> +#endif /* _UAPI_LINUX_FUTEX_SYSCALL_H */
> --
> 2.31.1
>

2021-10-21 11:56:12

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2] uapi: futex: Add a futex syscall

Hi Alistair,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linux/master]
[also build test ERROR on soc/for-next linus/master v5.15-rc6 next-20211021]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Alistair-Francis/uapi-futex-Add-a-futex-syscall/20211021-135527
base: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 2f111a6fd5b5297b4e92f53798ca086f7c7d33a4
config: i386-randconfig-a004-20211021 (attached as .config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 3cea2505fd8d99a9ba0cb625aecfe28a47c4e3f8)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/6dcb960761dacb92295496cefad0a38e19d4c8ba
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Alistair-Francis/uapi-futex-Add-a-futex-syscall/20211021-135527
git checkout 6dcb960761dacb92295496cefad0a38e19d4c8ba
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from <built-in>:1:
./usr/include/linux/futex_syscall.h:21:45: error: unknown type name 'u_int32_t'
__kernel_futex_syscall_timeout(__volatile__ u_int32_t *uaddr, int op, u_int32_t val,
^
./usr/include/linux/futex_syscall.h:21:71: error: unknown type name 'u_int32_t'
__kernel_futex_syscall_timeout(__volatile__ u_int32_t *uaddr, int op, u_int32_t val,
^
./usr/include/linux/futex_syscall.h:22:16: warning: declaration of 'struct timespec' will not be visible outside of this function [-Wvisibility]
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:22:48: error: unknown type name 'u_int32_t'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
>> ./usr/include/linux/futex_syscall.h:25:12: error: invalid application of 'sizeof' to an incomplete type 'struct timespec'
if (sizeof(*timeout) != sizeof(struct __kernel_old_timespec)) {
^~~~~~~~~~
./usr/include/linux/futex_syscall.h:22:16: note: forward declaration of 'struct timespec'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
>> ./usr/include/linux/futex_syscall.h:26:14: error: implicit declaration of function 'syscall' [-Werror,-Wimplicit-function-declaration]
int ret = syscall(__NR_futex_time64, uaddr, op, val, timeout, uaddr2, val3);
^
./usr/include/linux/futex_syscall.h:34:12: error: invalid application of 'sizeof' to an incomplete type 'struct timespec'
if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
^~~~~~~~~~
./usr/include/linux/futex_syscall.h:22:16: note: forward declaration of 'struct timespec'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:35:10: error: implicit declaration of function 'syscall' [-Werror,-Wimplicit-function-declaration]
return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
^
>> ./usr/include/linux/futex_syscall.h:37:24: error: incomplete definition of type 'struct timespec'
if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
~~~~~~~^
./usr/include/linux/futex_syscall.h:22:16: note: forward declaration of 'struct timespec'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:37:49: error: incomplete definition of type 'struct timespec'
if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
~~~~~~~^
./usr/include/linux/futex_syscall.h:22:16: note: forward declaration of 'struct timespec'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:40:42: error: incomplete definition of type 'struct timespec'
ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;
~~~~~~~^
./usr/include/linux/futex_syscall.h:22:16: note: forward declaration of 'struct timespec'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:41:43: error: incomplete definition of type 'struct timespec'
ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
~~~~~~~^
./usr/include/linux/futex_syscall.h:22:16: note: forward declaration of 'struct timespec'
struct timespec *timeout, __volatile__ u_int32_t *uaddr2, int val3)
^
>> ./usr/include/linux/futex_syscall.h:45:46: error: use of undeclared identifier 'NULL'
return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
^
./usr/include/linux/futex_syscall.h:63:48: error: unknown type name 'u_int32_t'
__kernel_futex_syscall_nr_requeue(__volatile__ u_int32_t *uaddr, int op, u_int32_t val,
^
./usr/include/linux/futex_syscall.h:63:74: error: unknown type name 'u_int32_t'
__kernel_futex_syscall_nr_requeue(__volatile__ u_int32_t *uaddr, int op, u_int32_t val,
^
./usr/include/linux/futex_syscall.h:64:5: error: unknown type name 'u_int32_t'
u_int32_t nr_requeue, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:64:40: error: unknown type name 'u_int32_t'
u_int32_t nr_requeue, __volatile__ u_int32_t *uaddr2, int val3)
^
./usr/include/linux/futex_syscall.h:67:13: error: implicit declaration of function 'syscall' [-Werror,-Wimplicit-function-declaration]
int ret = syscall(__NR_futex_time64, uaddr, op, val, nr_requeue, uaddr2, val3);
^
1 warning and 17 errors generated.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (7.33 kB)
.config.gz (28.68 kB)
Download all attachments

2021-10-25 16:35:40

by André Almeida

[permalink] [raw]
Subject: Re: [PATCH v2] uapi: futex: Add a futex syscall

Hi Alistair,

Às 02:54 de 21/10/21, Alistair Francis escreveu:
> From: Alistair Francis <[email protected]>
>
> This commit adds two futex syscall wrappers that are exposed to
> userspace.
>
> Neither the kernel or glibc currently expose a futex wrapper, so
> userspace is left performing raw syscalls. This has mostly been becuase

because

> the overloading of one of the arguments makes it impossible to provide a
> single type safe function.
>
> Until recently the single syscall has worked fine. With the introduction
> of a 64-bit time_t futex call on 32-bit architectures, this has become
> more complex. The logic of handling the two possible futex syscalls is
> complex and often implemented incorrectly.
>
> This patch adds two futux syscall functions that correctly handle the
> time_t complexity for userspace.
>
> This idea is based on previous discussions: https://lkml.org/lkml/2021/9/21/143

I would use lore
https://lore.kernel.org/lkml/CAK8P3a3x_EyCiPDpMK54y=Rtm-Wb08ym2TNiuAZgXhYrThcWTw@mail.gmail.com/

>
> Signed-off-by: Alistair Francis <[email protected]>

Thanks for working on that :)

> ---
> include/uapi/linux/futex_syscall.h | 81 ++++++++++++++++++++++++++++++
> 1 file changed, 81 insertions(+)
> create mode 100644 include/uapi/linux/futex_syscall.h
>
> diff --git a/include/uapi/linux/futex_syscall.h b/include/uapi/linux/futex_syscall.h
> new file mode 100644
> index 0000000000000..f84a0c68baf78
> --- /dev/null
> +++ b/include/uapi/linux/futex_syscall.h
> @@ -0,0 +1,81 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +#ifndef _UAPI_LINUX_FUTEX_SYSCALL_H
> +#define _UAPI_LINUX_FUTEX_SYSCALL_H
> +
> +#include <asm/unistd.h>
> +#include <errno.h>
> +#include <linux/types.h>
> +#include <linux/time_types.h>
> +#include <sys/syscall.h>
> +
> +/**
> + * futex_syscall_timeout() - __NR_futex/__NR_futex_time64 syscall wrapper
> + * @uaddr: address of first futex
> + * @op: futex op code
> + * @val: typically expected value of uaddr, but varies by op
> + * @timeout: an absolute struct timespec
> + * @uaddr2: address of second futex for some ops
> + * @val3: varies by op
> + */
> +static inline int
> +__kernel_futex_syscall_timeout(volatile u_int32_t *uaddr, int op, u_int32_t val,
> + struct timespec *timeout, volatile u_int32_t *uaddr2, int val3)

I tried to write an example[0] that uses this header, but I can't
compile given that u_int32_t isn't defined. Maybe change to uint32_t and
include <stdint.h>?

Also, I got some invalid use of undefined type 'struct timespec', and
#include <time.h> solved.

[0] https://paste.debian.net/1216834/

> +{
> +#if defined(__NR_futex_time64)
> + if (sizeof(*timeout) != sizeof(struct __kernel_old_timespec)) {
> + int ret = syscall(__NR_futex_time64, uaddr, op, val, timeout, uaddr2, val3);
> +
> + if (ret == 0 || errno != ENOSYS)
> + return ret;
> + }
> +#endif
> +
> +#if defined(__NR_futex)
> + if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
> + return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
> +
> + if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
> + struct __kernel_old_timespec ts32;
> +
> + ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;> + ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
> +
> + return syscall(__NR_futex, uaddr, op, val, &ts32, uaddr2, val3);
> + } else if (!timeout) {
> + return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
> + }
> +#endif

If I read this part right, you will always use ts32 for __NR_futex. I
know that it can be misleading, but __NR_futex uses ts64 in 64-bit
archs, so they shouldn't be converted to ts32 in those cases.

Just to make it clear, there's no __NR_futex_time64 at 64-bit archs.

> +
> + errno = ENOSYS;
> + return -1;
> +}
> +
> +/**
> + * futex_syscall_nr_requeue() - __NR_futex/__NR_futex_time64 syscall wrapper
> + * @uaddr: address of first futex
> + * @op: futex op code
> + * @val: typically expected value of uaddr, but varies by op
> + * @nr_requeue: an op specific meaning
> + * @uaddr2: address of second futex for some ops
> + * @val3: varies by op
> + */
> +static inline int
> +__kernel_futex_syscall_nr_requeue(volatile u_int32_t *uaddr, int op, u_int32_t val,
> + u_int32_t nr_requeue, volatile u_int32_t *uaddr2, int val3)

I would always assume that op is FUTEX_CMP_REQUEUE, given that
FUTEX_REQUEUE is racy. From `man futex`:

The FUTEX_CMP_REQUEUE operation was added as a replacement for the
earlier FUTEX_REQUEUE. The difference is that the check of the value at
uaddr can be used to ensure that requeueing happens only under certain
conditions, which allows race conditions to be avoided in certain use cases.

And then we can drop `int op` from the args and give defined
descriptions for the args.

> +{
> +#if defined(__NR_futex_time64)
> + int ret = syscall(__NR_futex_time64, uaddr, op, val, nr_requeue, uaddr2, val3);
> +
> + if (ret == 0 || errno != ENOSYS)
> + return ret;
> +#endif
> +
> +#if defined(__NR_futex)
> + return syscall(__NR_futex, uaddr, op, val, nr_requeue, uaddr2, val3);
> +#endif
> +
> + errno = ENOSYS;
> + return -1;
> +}
> +
> +#endif /* _UAPI_LINUX_FUTEX_SYSCALL_H */
>

Sorry if this question was already asked but I didn't find it in the
thread: Should we go with wrappers for the most common op? Like:

__kernel_futex_wait(volatile uint32_t *uaddr, uint32_t val, struct
timespec *timeout)

__kernel_futex_wake(volatile uint32_t *uaddr, uint32_t nr_wake)

Thanks!
André

2021-10-25 17:39:11

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v2] uapi: futex: Add a futex syscall

On Mon, Oct 25, 2021 at 6:33 PM André Almeida <[email protected]> wrote:
> Às 02:54 de 21/10/21, Alistair Francis escreveu:
> > From: Alistair Francis <[email protected]>

> > +#if defined(__NR_futex)
> > + if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
> > + return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
> > +
> > + if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
> > + struct __kernel_old_timespec ts32;
> > +
> > + ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;> + ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
> > +
> > + return syscall(__NR_futex, uaddr, op, val, &ts32, uaddr2, val3);
> > + } else if (!timeout) {
> > + return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
> > + }
> > +#endif
>
> If I read this part right, you will always use ts32 for __NR_futex. I
> know that it can be misleading, but __NR_futex uses ts64 in 64-bit
> archs, so they shouldn't be converted to ts32 in those cases.

__kernel_old_timespec is the correct type for sys_futex() on all
architectures.

Maybe name the local variable 'ts' or 'ts_old' instead of 'ts32' then?

> > +{
> > +#if defined(__NR_futex_time64)
> > + int ret = syscall(__NR_futex_time64, uaddr, op, val, nr_requeue, uaddr2, val3);
> > +
> > + if (ret == 0 || errno != ENOSYS)
> > + return ret;
> > +#endif
> > +
> > +#if defined(__NR_futex)
> > + return syscall(__NR_futex, uaddr, op, val, nr_requeue, uaddr2, val3);
> > +#endif
> > +
> > + errno = ENOSYS;
> > + return -1;
> > +}
> > +
> > +#endif /* _UAPI_LINUX_FUTEX_SYSCALL_H */
> >
>
> Sorry if this question was already asked but I didn't find it in the
> thread: Should we go with wrappers for the most common op? Like:
>
> __kernel_futex_wait(volatile uint32_t *uaddr, uint32_t val, struct
> timespec *timeout)
>
> __kernel_futex_wake(volatile uint32_t *uaddr, uint32_t nr_wake)

I had suggested having just a single function definition here, but having one
per argument type seems reasonable as well. Having one definition
per futex_op value would also be possible, but in that case I suppose
we need all 13 of them, not just two.

Arnd

2021-11-23 05:45:04

by Alistair Francis

[permalink] [raw]
Subject: Re: [PATCH v2] uapi: futex: Add a futex syscall

On Tue, Oct 26, 2021 at 3:33 AM Arnd Bergmann <[email protected]> wrote:
>
> On Mon, Oct 25, 2021 at 6:33 PM André Almeida <[email protected]> wrote:
> > Às 02:54 de 21/10/21, Alistair Francis escreveu:
> > > From: Alistair Francis <[email protected]>
>
> > > +#if defined(__NR_futex)
> > > + if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
> > > + return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
> > > +
> > > + if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
> > > + struct __kernel_old_timespec ts32;
> > > +
> > > + ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;> + ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
> > > +
> > > + return syscall(__NR_futex, uaddr, op, val, &ts32, uaddr2, val3);
> > > + } else if (!timeout) {
> > > + return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
> > > + }
> > > +#endif
> >
> > If I read this part right, you will always use ts32 for __NR_futex. I
> > know that it can be misleading, but __NR_futex uses ts64 in 64-bit
> > archs, so they shouldn't be converted to ts32 in those cases.
>
> __kernel_old_timespec is the correct type for sys_futex() on all
> architectures.
>
> Maybe name the local variable 'ts' or 'ts_old' instead of 'ts32' then?
>
> > > +{
> > > +#if defined(__NR_futex_time64)
> > > + int ret = syscall(__NR_futex_time64, uaddr, op, val, nr_requeue, uaddr2, val3);
> > > +
> > > + if (ret == 0 || errno != ENOSYS)
> > > + return ret;
> > > +#endif
> > > +
> > > +#if defined(__NR_futex)
> > > + return syscall(__NR_futex, uaddr, op, val, nr_requeue, uaddr2, val3);
> > > +#endif
> > > +
> > > + errno = ENOSYS;
> > > + return -1;
> > > +}
> > > +
> > > +#endif /* _UAPI_LINUX_FUTEX_SYSCALL_H */
> > >
> >
> > Sorry if this question was already asked but I didn't find it in the
> > thread: Should we go with wrappers for the most common op? Like:
> >
> > __kernel_futex_wait(volatile uint32_t *uaddr, uint32_t val, struct
> > timespec *timeout)
> >
> > __kernel_futex_wake(volatile uint32_t *uaddr, uint32_t nr_wake)
>
> I had suggested having just a single function definition here, but having one
> per argument type seems reasonable as well. Having one definition
> per futex_op value would also be possible, but in that case I suppose
> we need all 13 of them, not just two.

Sorry I have taken so long to look at this again. I have addressed
your other comments.

I don't have a strong preference here. I do like that by having a
generic __kernel_futex_syscall_timeout() it should be easier to
convert existing uses of SYSCALL() to the new function (the requeue is
it's own thing anyway).

Alistair

>
> Arnd

2021-11-24 06:10:36

by Alistair Francis

[permalink] [raw]
Subject: Re: [PATCH v2] uapi: futex: Add a futex syscall

On Tue, Oct 26, 2021 at 2:34 AM André Almeida <[email protected]> wrote:
>
> Hi Alistair,
>
> Às 02:54 de 21/10/21, Alistair Francis escreveu:
> > From: Alistair Francis <[email protected]>
> >
> > This commit adds two futex syscall wrappers that are exposed to
> > userspace.
> >
> > Neither the kernel or glibc currently expose a futex wrapper, so
> > userspace is left performing raw syscalls. This has mostly been becuase
>
> because
>
> > the overloading of one of the arguments makes it impossible to provide a
> > single type safe function.
> >
> > Until recently the single syscall has worked fine. With the introduction
> > of a 64-bit time_t futex call on 32-bit architectures, this has become
> > more complex. The logic of handling the two possible futex syscalls is
> > complex and often implemented incorrectly.
> >
> > This patch adds two futux syscall functions that correctly handle the
> > time_t complexity for userspace.
> >
> > This idea is based on previous discussions: https://lkml.org/lkml/2021/9/21/143
>
> I would use lore
> https://lore.kernel.org/lkml/CAK8P3a3x_EyCiPDpMK54y=Rtm-Wb08ym2TNiuAZgXhYrThcWTw@mail.gmail.com/
>
> >
> > Signed-off-by: Alistair Francis <[email protected]>
>
> Thanks for working on that :)
>
> > ---
> > include/uapi/linux/futex_syscall.h | 81 ++++++++++++++++++++++++++++++
> > 1 file changed, 81 insertions(+)
> > create mode 100644 include/uapi/linux/futex_syscall.h
> >
> > diff --git a/include/uapi/linux/futex_syscall.h b/include/uapi/linux/futex_syscall.h
> > new file mode 100644
> > index 0000000000000..f84a0c68baf78
> > --- /dev/null
> > +++ b/include/uapi/linux/futex_syscall.h
> > @@ -0,0 +1,81 @@
> > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> > +#ifndef _UAPI_LINUX_FUTEX_SYSCALL_H
> > +#define _UAPI_LINUX_FUTEX_SYSCALL_H
> > +
> > +#include <asm/unistd.h>
> > +#include <errno.h>
> > +#include <linux/types.h>
> > +#include <linux/time_types.h>
> > +#include <sys/syscall.h>
> > +
> > +/**
> > + * futex_syscall_timeout() - __NR_futex/__NR_futex_time64 syscall wrapper
> > + * @uaddr: address of first futex
> > + * @op: futex op code
> > + * @val: typically expected value of uaddr, but varies by op
> > + * @timeout: an absolute struct timespec
> > + * @uaddr2: address of second futex for some ops
> > + * @val3: varies by op
> > + */
> > +static inline int
> > +__kernel_futex_syscall_timeout(volatile u_int32_t *uaddr, int op, u_int32_t val,
> > + struct timespec *timeout, volatile u_int32_t *uaddr2, int val3)
>
> I tried to write an example[0] that uses this header, but I can't
> compile given that u_int32_t isn't defined. Maybe change to uint32_t and
> include <stdint.h>?
>
> Also, I got some invalid use of undefined type 'struct timespec', and
> #include <time.h> solved.
>
> [0] https://paste.debian.net/1216834/
>
> > +{
> > +#if defined(__NR_futex_time64)
> > + if (sizeof(*timeout) != sizeof(struct __kernel_old_timespec)) {
> > + int ret = syscall(__NR_futex_time64, uaddr, op, val, timeout, uaddr2, val3);
> > +
> > + if (ret == 0 || errno != ENOSYS)
> > + return ret;
> > + }
> > +#endif
> > +
> > +#if defined(__NR_futex)
> > + if (sizeof(*timeout) == sizeof(struct __kernel_old_timespec))
> > + return syscall(__NR_futex, uaddr, op, val, timeout, uaddr2, val3);
> > +
> > + if (timeout && timeout->tv_sec == (long)timeout->tv_sec) {
> > + struct __kernel_old_timespec ts32;
> > +
> > + ts32.tv_sec = (__kernel_long_t) timeout->tv_sec;> + ts32.tv_nsec = (__kernel_long_t) timeout->tv_nsec;
> > +
> > + return syscall(__NR_futex, uaddr, op, val, &ts32, uaddr2, val3);
> > + } else if (!timeout) {
> > + return syscall(__NR_futex, uaddr, op, val, NULL, uaddr2, val3);
> > + }
> > +#endif
>
> If I read this part right, you will always use ts32 for __NR_futex. I
> know that it can be misleading, but __NR_futex uses ts64 in 64-bit
> archs, so they shouldn't be converted to ts32 in those cases.
>
> Just to make it clear, there's no __NR_futex_time64 at 64-bit archs.
>
> > +
> > + errno = ENOSYS;
> > + return -1;
> > +}
> > +
> > +/**
> > + * futex_syscall_nr_requeue() - __NR_futex/__NR_futex_time64 syscall wrapper
> > + * @uaddr: address of first futex
> > + * @op: futex op code
> > + * @val: typically expected value of uaddr, but varies by op
> > + * @nr_requeue: an op specific meaning
> > + * @uaddr2: address of second futex for some ops
> > + * @val3: varies by op
> > + */
> > +static inline int
> > +__kernel_futex_syscall_nr_requeue(volatile u_int32_t *uaddr, int op, u_int32_t val,
> > + u_int32_t nr_requeue, volatile u_int32_t *uaddr2, int val3)
>
> I would always assume that op is FUTEX_CMP_REQUEUE, given that
> FUTEX_REQUEUE is racy. From `man futex`:

There are other ops that this could be though. From just the kernel
futex self tests it could be FUTEX_WAKE_OP, FUTEX_WAIT_REQUEUE_PI or
FUTEX_CMP_REQUEUE_PI

Alistair

>
> The FUTEX_CMP_REQUEUE operation was added as a replacement for the
> earlier FUTEX_REQUEUE. The difference is that the check of the value at
> uaddr can be used to ensure that requeueing happens only under certain
> conditions, which allows race conditions to be avoided in certain use cases.
>
> And then we can drop `int op` from the args and give defined
> descriptions for the args.
>