From: Peter Oskolkov
To: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	linux-kernel@vger.kernel.org, linux-api@vger.kernel.org
Cc: Paul Turner, Ben Segall, Peter Oskolkov, Andrei Vagin, Jann Horn,
	Thierry Delisle
Subject: [PATCH 2/4 v0.5] sched/umcg: RFC: add userspace atomic helpers
Date: Wed, 8 Sep 2021 11:49:03 -0700
Message-Id: <20210908184905.163787-3-posk@google.com>
In-Reply-To: <20210908184905.163787-1-posk@google.com>
References: <20210908184905.163787-1-posk@google.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

Add helper functions to work atomically with userspace 32/64 bit
values - there are some .*futex.* named helpers, but they are not
exactly what is needed for UMCG; I haven't found what else I could
use, so I rolled these.

At the moment only X86_64 is supported.

Note: the helpers should probably go into arch/ somewhere; I have them
in kernel/sched/umcg.h temporarily for convenience. Please let me know
where I should put them.

Note: the current code follows the suggestions here:
https://lore.kernel.org/lkml/YOgCdMWE9OXvqczk@hirez.programming.kicks-ass.net/
with the exception that I couldn't combine the __try_cmpxchg_user_32/64
functions into a macro, as my asm-fu is not too strong. I'll continue
trying to make the macro work, but for the moment I've decided to post
this RFC so that other areas of the patchset can be reviewed.

Changelog:
v0.4->v0.5:
 - added xchg_user_** helpers;
v0.3->v0.4:
 - added put_user_nosleep;
 - removed linked list/stack operations patch;
v0.2->v0.3:
 - renamed and refactored the helpers a bit, as described above;
 - moved linked list/stack operations into a separate patch.

Signed-off-by: Peter Oskolkov
---
 kernel/sched/umcg.h | 312 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 312 insertions(+)
 create mode 100644 kernel/sched/umcg.h

diff --git a/kernel/sched/umcg.h b/kernel/sched/umcg.h
new file mode 100644
index 000000000000..89ba84afa977
--- /dev/null
+++ b/kernel/sched/umcg.h
@@ -0,0 +1,312 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+#ifndef _KERNEL_SCHED_UMCG_H
+#define _KERNEL_SCHED_UMCG_H
+
+#ifdef CONFIG_X86_64
+
+#include <linux/uaccess.h>
+
+#include <asm/asm.h>
+#include <linux/atomic.h>
+
+/* TODO: move atomic operations below into arch/ headers */
+static inline int __try_cmpxchg_user_32(u32 *uval, u32 __user *uaddr,
+						u32 oldval, u32 newval)
+{
+	int ret = 0;
+
+	asm volatile("\n"
+		"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
+		"2:\n"
+		"\t.section .fixup, \"ax\"\n"
+		"3:\tmov %3, %0\n"
+		"\tjmp 2b\n"
+		"\t.previous\n"
+		_ASM_EXTABLE_UA(1b, 3b)
+		: "+r" (ret), "=a" (oldval), "+m" (*uaddr)
+		: "i" (-EFAULT), "r" (newval), "1" (oldval)
+		: "memory"
+	);
+	*uval = oldval;
+	return ret;
+}
+
+static inline int __try_cmpxchg_user_64(u64 *uval, u64 __user *uaddr,
+						u64 oldval, u64 newval)
+{
+	int ret = 0;
+
+	asm volatile("\n"
+		"1:\t" LOCK_PREFIX "cmpxchgq %4, %2\n"
+		"2:\n"
+		"\t.section .fixup, \"ax\"\n"
+		"3:\tmov %3, %0\n"
+		"\tjmp 2b\n"
+		"\t.previous\n"
+		_ASM_EXTABLE_UA(1b, 3b)
+		: "+r" (ret), "=a" (oldval), "+m" (*uaddr)
+		: "i" (-EFAULT), "r" (newval), "1" (oldval)
+		: "memory"
+	);
+	*uval = oldval;
+	return ret;
+}
+
+static inline int fix_pagefault(unsigned long uaddr, bool write_fault)
+{
+	struct mm_struct *mm = current->mm;
+	int ret;
+
+	mmap_read_lock(mm);
+	ret = fixup_user_fault(mm, uaddr, write_fault ? FAULT_FLAG_WRITE : 0,
+			NULL);
+	mmap_read_unlock(mm);
+
+	return ret < 0 ? ret : 0;
+}
+
+/**
+ * cmpxchg_user_32 - compare-and-exchange a 32-bit userspace value
+ *
+ * Return:
+ * 0 - OK
+ * -EFAULT: memory access error
+ * -EAGAIN: *@old did not match; *@old now holds the value found at @uaddr
+ */
+static inline int cmpxchg_user_32(u32 __user *uaddr, u32 *old, u32 new)
+{
+	int ret = -EFAULT;
+	u32 __old = *old;
+
+	if (unlikely(!access_ok(uaddr, sizeof(*uaddr))))
+		return -EFAULT;
+
+	pagefault_disable();
+
+	while (true) {
+		__uaccess_begin_nospec();
+		ret = __try_cmpxchg_user_32(old, uaddr, __old, new);
+		user_access_end();
+
+		if (!ret) {
+			ret = *old == __old ? 0 : -EAGAIN;
+			break;
+		}
+
+		if (fix_pagefault((unsigned long)uaddr, true) < 0)
+			break;
+	}
+
+	pagefault_enable();
+	return ret;
+}
+
+/**
+ * cmpxchg_user_64 - compare-and-exchange a 64-bit userspace value
+ *
+ * Return:
+ * 0 - OK
+ * -EFAULT: memory access error
+ * -EAGAIN: *@old did not match; *@old now holds the value found at @uaddr
+ */
+static inline int cmpxchg_user_64(u64 __user *uaddr, u64 *old, u64 new)
+{
+	int ret = -EFAULT;
+	u64 __old = *old;
+
+	if (unlikely(!access_ok(uaddr, sizeof(*uaddr))))
+		return -EFAULT;
+
+	pagefault_disable();
+
+	while (true) {
+		__uaccess_begin_nospec();
+		ret = __try_cmpxchg_user_64(old, uaddr, __old, new);
+		user_access_end();
+
+		if (!ret) {
+			ret = *old == __old ? 0 : -EAGAIN;
+			break;
+		}
+
+		if (fix_pagefault((unsigned long)uaddr, true) < 0)
+			break;
+	}
+
+	pagefault_enable();
+
+	return ret;
+}
+
+static inline int __try_xchg_user_32(u32 *oval, u32 __user *uaddr, u32 newval)
+{
+	u32 oldval = 0;
+	int ret = 0;
+
+	asm volatile("\n"
+		"1:\txchgl %0, %2\n"
+		"2:\n"
+		"\t.section .fixup, \"ax\"\n"
+		"3:\tmov %3, %1\n"
+		"\tjmp 2b\n"
+		"\t.previous\n"
+		_ASM_EXTABLE_UA(1b, 3b)
+		: "=r" (oldval), "=r" (ret), "+m" (*uaddr)
+		: "i" (-EFAULT), "0" (newval), "1" (0)
+	);
+
+	if (ret)
+		return ret;
+
+	*oval = oldval;
+	return 0;
+}
+
+static inline int __try_xchg_user_64(u64 *oval, u64 __user *uaddr, u64 newval)
+{
+	u64 oldval = 0;
+	int ret = 0;
+
+	asm volatile("\n"
+		"1:\txchgq %0, %2\n"
+		"2:\n"
+		"\t.section .fixup, \"ax\"\n"
+		"3:\tmov %3, %1\n"
+		"\tjmp 2b\n"
+		"\t.previous\n"
+		_ASM_EXTABLE_UA(1b, 3b)
+		: "=r" (oldval), "=r" (ret), "+m" (*uaddr)
+		: "i" (-EFAULT), "0" (newval), "1" (0)
+	);
+
+	if (ret)
+		return ret;
+
+	*oval = oldval;
+	return 0;
+}
+
+/**
+ * xchg_user_32 - atomically exchange 32-bit values
+ *
+ * Return:
+ * 0 - OK
+ * -EFAULT: memory access error
+ */
+static inline int xchg_user_32(u32 __user *uaddr, u32 *val)
+{
+	int ret = -EFAULT;
+
+	if (unlikely(!access_ok(uaddr, sizeof(*uaddr))))
+		return -EFAULT;
+
+	pagefault_disable();
+
+	while (true) {
+		__uaccess_begin_nospec();
+		ret = __try_xchg_user_32(val, uaddr, *val);
+		user_access_end();
+
+		if (!ret)
+			break;
+
+		if (fix_pagefault((unsigned long)uaddr, true) < 0)
+			break;
+	}
+
+	pagefault_enable();
+
+	return ret;
+}
+
+/**
+ * xchg_user_64 - atomically exchange 64-bit values
+ *
+ * Return:
+ * 0 - OK
+ * -EFAULT: memory access error
+ */
+static inline int xchg_user_64(u64 __user *uaddr, u64 *val)
+{
+	int ret = -EFAULT;
+
+	if (unlikely(!access_ok(uaddr, sizeof(*uaddr))))
+		return -EFAULT;
+
+	pagefault_disable();
+
+	while (true) {
+		__uaccess_begin_nospec();
+		ret = __try_xchg_user_64(val, uaddr, *val);
+		user_access_end();
+
+		if (!ret)
+			break;
+
+		if (fix_pagefault((unsigned long)uaddr, true) < 0)
+			break;
+	}
+
+	pagefault_enable();
+
+	return ret;
+}
+
+/**
+ * get_user_nosleep - get user value with inline fixup without sleeping.
+ *
+ * get_user() might sleep and therefore cannot be used in preempt-disabled
+ * regions.
+ */
+#define get_user_nosleep(out, uaddr)					\
+({									\
+	int ret = -EFAULT;						\
+									\
+	if (access_ok((uaddr), sizeof(*(uaddr)))) {			\
+		pagefault_disable();					\
+									\
+		while (true) {						\
+			if (!__get_user((out), (uaddr))) {		\
+				ret = 0;				\
+				break;					\
+			}						\
+									\
+			if (fix_pagefault((unsigned long)(uaddr), false) < 0) \
+				break;					\
+		}							\
+									\
+		pagefault_enable();					\
+	}								\
+	ret;								\
+})
+
+/**
+ * put_user_nosleep - put user value with inline fixup without sleeping.
+ *
+ * put_user() might sleep and therefore cannot be used in preempt-disabled
+ * regions.
+ */
+#define put_user_nosleep(out, uaddr)					\
+({									\
+	int ret = -EFAULT;						\
+									\
+	if (access_ok((uaddr), sizeof(*(uaddr)))) {			\
+		pagefault_disable();					\
+									\
+		while (true) {						\
+			if (!__put_user((out), (uaddr))) {		\
+				ret = 0;				\
+				break;					\
+			}						\
+									\
+			if (fix_pagefault((unsigned long)(uaddr), true) < 0) \
+				break;					\
+		}							\
+									\
+		pagefault_enable();					\
+	}								\
+	ret;								\
+})
+
+#endif  /* CONFIG_X86_64 */
+#endif  /* _KERNEL_SCHED_UMCG_H */
-- 
2.25.1
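
For readers who want to experiment with the return-value contract of
the cmpxchg_user_32() and xchg_user_32() helpers outside the kernel,
here is a minimal userspace sketch built on C11 atomics. It is only a
model under stated assumptions: the model_* names are hypothetical and
not part of the patch, ordinary process memory stands in for the
__user pointer, and the -EFAULT / fix_pagefault() retry path is not
modelled at all.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of cmpxchg_user_32():
 *   returns 0 if *addr equalled *old and was replaced by new_val;
 *   returns -EAGAIN otherwise, with *old updated to the value found.
 * The kernel helper can also return -EFAULT; that is not modelled.
 */
int model_cmpxchg_user_32(_Atomic uint32_t *addr, uint32_t *old,
			  uint32_t new_val)
{
	uint32_t expected = *old;

	if (atomic_compare_exchange_strong(addr, &expected, new_val))
		return 0;

	/* Mismatch: report what was actually stored at addr. */
	*old = expected;
	return -EAGAIN;
}

/*
 * Hypothetical userspace model of xchg_user_32():
 *   stores *val at addr and returns the previous contents in *val.
 * Always returns 0 here because faults are not modelled.
 */
int model_xchg_user_32(_Atomic uint32_t *addr, uint32_t *val)
{
	*val = atomic_exchange(addr, *val);
	return 0;
}
```

A caller of the compare-exchange model would typically loop while the
result is -EAGAIN, recomputing the new value from the refreshed *old on
each pass; whether UMCG callers of the real helpers do the same is up
to the rest of the patchset.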