Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp5160472pxj; Wed, 9 Jun 2021 10:29:19 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyFCscb0LrgxQNDrK70xQu/q2TEAGirC9i3PyH3l2ZvFPOAmrFIAGGXckbRWnoJVZZ6rfAZ X-Received: by 2002:aa7:c844:: with SMTP id g4mr518221edt.257.1623259759503; Wed, 09 Jun 2021 10:29:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1623259759; cv=none; d=google.com; s=arc-20160816; b=svKhMKGARp0yQBBxeDjunfmMYKSgP+DZvpDaT8gf9PLA27hCAHUVYlNqCdJagyvu23 ZSuvKWngb0SS4bdb5K6EK3ZKIMqRDAMqF16VfxKeNQ6SwEW9tkD28kVfWisZHkjpDUs/ JH5XaabQ/my6UaDrR7zMdbjmmiM0EtnZcsZwRJDonp2zmOpzMhlnHKP6dRG1wD7VSXs+ Wbe5kGQCp2IO19F7JLlGjtyfhZpP3XGgEuTExoB9JCa3bbB9grqwUgZiawv4x9wPLYUH 8/WSRE0LrXd5jqFx6sjMWIXJNner+sUauyVm7qqxIX8B7if+exwGn7tqqz5jtmE0rFmc lfpw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=hAerKgjIq4kS6ywfhPEqKwcMuutGzkYxDRK8aN/LbKc=; b=EuZaE8mtppxe17jFI96hzz2rkvONazz6bXn+kUG+of/NaIR+If2SMPnaErmyFZoLhh X7J1cDDgvKlIH3Rvl9kR4k1IwhVt5LR+Ejb7pPdHsWYXMvLjSSyvPx2D450pCsLxMV09 pmrYQUqqUOCMWSYkDxmachiWpNZyyJx8bnYel2Wfy6yJUsHdu0CpNoRqcUZxi84L7XC1 CDQoTFZudXGNYMKAb0TysNp9OCl6MiU6RtNs5RDYQXFMrw3UeaV65aMctiTUhNZDqNcA QyrrYvxgrzC/G+BvpF1ZuWqKHllJPG75rbMg5S1Cwyf9ivA6sEt2Op0PmBc9mFsP5+28 HwjQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=e6XPOybt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id h22si279714ejl.384.2021.06.09.10.28.39; Wed, 09 Jun 2021 10:29:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@infradead.org header.s=casper.20170209 header.b=e6XPOybt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235728AbhFINET (ORCPT + 99 others); Wed, 9 Jun 2021 09:04:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47498 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231663AbhFINEP (ORCPT ); Wed, 9 Jun 2021 09:04:15 -0400 Received: from casper.infradead.org (casper.infradead.org [IPv6:2001:8b0:10b:1236::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5AA10C061574; Wed, 9 Jun 2021 06:02:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=casper.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=hAerKgjIq4kS6ywfhPEqKwcMuutGzkYxDRK8aN/LbKc=; b=e6XPOybtBMWkskNlTkinyrzGq3 zK59VJwsz7R4qFITyiIJ+XekQShaB0xFryNoU4WJycnF6o4C+oCCKj9esgE07OGfjCR2cxs3HgEwY rfqJITMByfLpH5WLnr4HZaxcYRNMI3K/Syn+gi5QrLPVaHT6MKkuR4eDhPbqkNhOqqiPneffmB0HL gbhy5/1dxcXJaPGgUsDHwggAYXl+5mtIBzG7bFu3L99WaIVgVE8bfoG8JiTbYCkps4NnzxqOjnvas RQGUgPsybamxniloaGAOkdYcyPVd/X5gpkwNC0GyO4JLFxcpm+J5TiyGa1tVLnSGWhl42qeijfouU mqLFdPCg==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=worktop.programming.kicks-ass.net) by casper.infradead.org with esmtpsa (Exim 4.94 #2 (Red Hat Linux)) id 1lqxpQ-000WJA-Gc; Wed, 09 Jun 2021 13:01:17 +0000 Received: by worktop.programming.kicks-ass.net (Postfix, from userid 1000) id 0229E9867D0; Wed, 9 Jun 2021 15:01:03 +0200 (CEST) Date: Wed, 9 Jun 2021 15:01:03 +0200 From: Peter Zijlstra To: Jann Horn Cc: Peter Oskolkov , Ingo Molnar , Thomas Gleixner , kernel list , Linux API , Paul Turner , Ben Segall , Peter Oskolkov , Joel Fernandes , Andrew Morton , Andrei Vagin , Jim Newsome Subject: Re: [RFC PATCH v0.1 4/9] sched/umcg: implement core UMCG API Message-ID: <20210609130103.GB68187@worktop.programming.kicks-ass.net> References: <20210520183614.1227046-1-posk@google.com> <20210520183614.1227046-5-posk@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 21, 2021 at 11:33:14PM +0200, Jann Horn wrote: > > SYSCALL_DEFINE2(umcg_wake, u32, flags, u32, next_tid) > > { > > - return -ENOSYS; > > + struct umcg_task_data *next_utd; > > + struct task_struct *next; > > + int ret = -EINVAL; > > + > > + if (!next_tid) > > + return -EINVAL; > > + if (flags) > > + return -EINVAL; > > + > > + next = find_get_task_by_vpid(next_tid); > > + if (!next) > > + return -ESRCH; > > + rcu_read_lock(); > > Wouldn't it be more efficient to replace the last 4 lines with the following? > > rcu_read_lock(); > next = find_task_by_vpid(next_tid); > if (!next) { > err = -ESRCH; > goto out; > } This wakeup crud needs to modify the umcg->state, which is a user variable. That can't be done under RCU. Weirdly the proposed code doesn't actually do any of that for undocumented raisins :/ > Then you don't need to use refcounting here... > > > + next_utd = rcu_dereference(next->umcg_task_data); > > + if (!next_utd) > > + goto out; > > + > > + if (!READ_ONCE(next_utd->in_wait)) { > > + ret = -EAGAIN; > > + goto out; > > + } > > + > > + ret = wake_up_process(next); > > + put_task_struct(next); > > ... and you'd be able to drop this put_task_struct(), too. > > > + if (ret) > > + ret = 0; > > + else > > + ret = -EAGAIN; > > + > > +out: > > + rcu_read_unlock(); > > + return ret; > > } > > > > /** > > @@ -139,5 +325,44 @@ SYSCALL_DEFINE2(umcg_wake, u32, flags, u32, next_tid) > > SYSCALL_DEFINE4(umcg_swap, u32, wake_flags, u32, next_tid, u32, wait_flags, > > const struct __kernel_timespec __user *, timeout) > > { > > - return -ENOSYS; > > + struct umcg_task_data *curr_utd; > > + struct umcg_task_data *next_utd; > > + struct task_struct *next; > > + int ret = -EINVAL; > > + > > + rcu_read_lock(); > > + curr_utd = rcu_dereference(current->umcg_task_data); > > + > > + if (!next_tid || wake_flags || wait_flags || !curr_utd) > > + goto out; > > + > > + if (timeout) { > > + ret = -EOPNOTSUPP; > > + goto out; > > + } > > + > > + next = find_get_task_by_vpid(next_tid); > > + if (!next) { > > + ret = -ESRCH; > > + goto out; > > + } > > There isn't any type of access check here, right? Any task can wake up > any other task? That feels a bit weird to me - and if you want to keep > it as-is, it should probably at least be documented that any task on > the system can send you spurious wakeups if you opt in to umcg. You can only send wakeups to other UMCG thingies, per the next->umcg_task_data check below. That said.. > In contrast, shared futexes can avoid this because they get their > access control implicitly from the VMA. Every task must expect spurious wakups at all times, always (for TASK_NORMAL wakeups that is). There's plenty ways to generate them. > > + next_utd = rcu_dereference(next->umcg_task_data); > > + if (!next_utd) { > > + ret = -EINVAL; > > + goto out; > > + }