Received: by 2002:a05:6a10:5bc5:0:0:0:0 with SMTP id os5csp6105271pxb; Mon, 8 Nov 2021 03:17:37 -0800 (PST) X-Google-Smtp-Source: ABdhPJzQmALKevUeZoE4rBkEuqKhnnHxsOeyX+LafMGZvhwW2meHbKU146vN8I8B14XUPNYWDKr+ X-Received: by 2002:a92:cd8f:: with SMTP id r15mr46827202ilb.278.1636370256890; Mon, 08 Nov 2021 03:17:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1636370256; cv=none; d=google.com; s=arc-20160816; b=wfAseUNr/5S5XMhZU8dpPA5AKb/s1+CzDRTVRaJ4CFiaX6T0vwZKZlXiY0sLDNRAAa SVzXoKQsDhwRa2gMqcMWnzWW+QF05twaqR538SR5ZQugl7zveM0CRMijA7sDwcAKlZR6 fSPlKSVQ02TjIVJBaEHZJo7YvW8pr6UxKCoxMoM/1cNO6cWiGKeo0BWcgvIiCb3uSwZb ETnaUwMyj8ccgkdu6MdJi/GP1ewDyfAMNYzLG1FpiVYhNAakLTRvGwW86yKCqwKSWhK5 H7vJw/VtUcXgUHmksajeZpop+iDMPej5hnqbK9ux2DPkBkAimI7giPuMQ8sXE3LRD9fN pBvA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:dkim-signature:date; bh=WQbmR1jcqvCuL03TSf5CWqZohB7XsDdyWdjXB2Dny1k=; b=qe+qUUj9bN75Tbyr/QzQwvH3D5F9RLGw3C4PRIzsZx9w4ljD1szTLxmklXMGch62n4 UVFCg6kAYIuMQBxRXWOgxIofbejezgl1GByzYUUupKm+7HicBalWCnB5t4KFs9EeqD4i bR8F1QAyonHKyxshmbNilLdC/X+5s9YVh3AgvgvdEn6tqFmeeInnE2MZ5oZpKUP5gwtT Swtm+7wwL6vjvP9JADBSsGVu2U+S5UDeMS9LyKxSZ//45q1HmoXNkgplu1l+X7uJoPzG IwzozakQx3Yp6DuXkRRW6dJey0l92+F/hki/5ohh/3eSDW18SUAdwC8mvRNtgEEjY9xw vauA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=lO14FLeK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id g14si24257480ilf.33.2021.11.08.03.17.22; Mon, 08 Nov 2021 03:17:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=lO14FLeK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236968AbhKHG7j (ORCPT + 99 others); Mon, 8 Nov 2021 01:59:39 -0500 Received: from out2.migadu.com ([188.165.223.204]:17132 "EHLO out2.migadu.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236168AbhKHG7i (ORCPT ); Mon, 8 Nov 2021 01:59:38 -0500 Date: Mon, 8 Nov 2021 14:57:10 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1636354613; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=WQbmR1jcqvCuL03TSf5CWqZohB7XsDdyWdjXB2Dny1k=; b=lO14FLeKWK5O+gFiCixgy4SySNPk9B8XwXXOceG7THtu3vjx2BmuTRi7nchx/FqCG4fpDQ X10Yr/NceMn0Hg6CKyUg5Gt1aQaWifadewGQ8dKCJqS2/SBgK6V+RW8z5z1CN4NYdyBPCu VWww6e4Nif4FMD+FHsF7vLCkH0EejS0= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Tao Zhou To: Peter Oskolkov Cc: Peter Zijlstra , Ingo Molnar , Thomas Gleixner , Andrew Morton , Dave Hansen , Andy Lutomirski , linux-mm@kvack.org, Linux Kernel Mailing List , linux-api@vger.kernel.org, Paul Turner , Ben Segall , Peter Oskolkov , Andrei Vagin , Jann Horn , Thierry Delisle , Tao Zhou Subject: Re: [PATCH v0.8 3/6] sched/umcg: implement UMCG syscalls Message-ID: References: <20211104195804.83240-1-posk@google.com> <20211104195804.83240-4-posk@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Migadu-Flow: FLOW_OUT X-Migadu-Auth-User: tao.zhou@linux.dev Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Nov 07, 2021 at 10:26:34AM -0800, Peter Oskolkov wrote: > On Sat, Nov 6, 2021 at 10:19 AM Tao Zhou wrote: > > > > Hi Peter, > > > > On Thu, Nov 04, 2021 at 12:58:01PM -0700, Peter Oskolkov wrote: > > > > > +/** > > > + * umcg_update_state: atomically update umcg_task.state_ts, set new timestamp. > > > + * @state_ts - points to the state_ts member of struct umcg_task to update; > > > + * @expected - the expected value of state_ts, including the timestamp; > > > + * @desired - the desired value of state_ts, state part only; > > > + * @may_fault - whether to use normal or _nofault cmpxchg. > > > + * > > > + * The function is basically cmpxchg(state_ts, expected, desired), with extra > > > + * code to set the timestamp in @desired. > > > + */ > > > +static int umcg_update_state(u64 __user *state_ts, u64 *expected, u64 desired, > > > + bool may_fault) > > > +{ > > > + u64 curr_ts = (*expected) >> (64 - UMCG_STATE_TIMESTAMP_BITS); > > > + u64 next_ts = ktime_get_ns() >> UMCG_STATE_TIMESTAMP_GRANULARITY; > > > + > > > + /* Cut higher order bits. */ > > > + next_ts &= UMCG_TASK_STATE_MASK_FULL; > > > > next_ts &= (1 << UMCG_STATE_TIMESTAMP_BITS) - 1; or am I wrong. > > Right, thanks. I'll fix it in the next patchset version, if any. But > at the moment I don't think this is bad enough to prevent merging, if > the maintainers feel like it - basically, the condition below will > always be false, so if the state is updated within 16 nanoseconds, the > timestamps may sometimes match. For this to be an issue, this should > result in ABA updates, so two state changes should happen in 16 > nanoseconds, which is extremely unlikely (impossible?), as most state The task state occupy 0-5 bits and use 2 bits(00, 01, 10, 11) to denote NONE, RUNNING, IDLE, BLOCK. Is it possible to grasp 4 bits from here to used to extend the timestamp resolution. > changes are accompanied by other atomic ops. > > > > > > + if (next_ts == curr_ts) > > > + ++next_ts; > > > + > > > + /* Remove an old timestamp, if any. */ > > > + desired &= UMCG_TASK_STATE_MASK_FULL; > > > + > > > + /* Set the new timestamp. */ > > > + desired |= (next_ts << (64 - UMCG_STATE_TIMESTAMP_BITS)); > > > + > > > + if (may_fault) > > > + return cmpxchg_user_64(state_ts, expected, desired); > > > + > > > + return cmpxchg_user_64_nofault(state_ts, expected, desired); > > > +} > > > + > > > +/** > > > + * sys_umcg_ctl: (un)register the current task as a UMCG task. > > > + * @flags: ORed values from enum umcg_ctl_flag; see below; > > > + * @self: a pointer to struct umcg_task that describes this > > > + * task and governs the behavior of sys_umcg_wait if > > > + * registering; must be NULL if unregistering. > > > + * > > > + * @flags & UMCG_CTL_REGISTER: register a UMCG task: > > > + * UMCG workers: > > > + * - @flags & UMCG_CTL_WORKER > > > + * - self->state must be UMCG_TASK_BLOCKED > > > + * UMCG servers: > > > + * - !(@flags & UMCG_CTL_WORKER) > > > + * - self->state must be UMCG_TASK_RUNNING > > > + * > > > + * All tasks: > > > + * - self->next_tid must be zero > > > + * > > > + * If the conditions above are met, sys_umcg_ctl() immediately returns > > > + * if the registered task is a server; a worker will be added to > > > + * idle_workers_ptr, and the worker put to sleep; an idle server > > > + * from idle_server_tid_ptr will be woken, if present. > > > + * > > > + * @flags == UMCG_CTL_UNREGISTER: unregister a UMCG task. If the current task > > > + * is a UMCG worker, the userspace is responsible for waking its > > > + * server (before or after calling sys_umcg_ctl). > > > + * > > > + * Return: > > > + * 0 - success > > > + * -EFAULT - failed to read @self > > > + * -EINVAL - some other error occurred > > > + */ > > > +SYSCALL_DEFINE2(umcg_ctl, u32, flags, struct umcg_task __user *, self) > > > +{ > > > + struct umcg_task ut; > > > + > > > + if (flags == UMCG_CTL_UNREGISTER) { > > > + if (self || !current->umcg_task) > > > + return -EINVAL; > > > + > > > + if (current->flags & PF_UMCG_WORKER) > > > + umcg_handle_exiting_worker(); > > > + else > > > + umcg_clear_task(current); > > > + > > > + return 0; > > > + } > > > + > > > + if (!(flags & UMCG_CTL_REGISTER)) > > > + return -EINVAL; > > > + > > > + flags &= ~UMCG_CTL_REGISTER; > > > + if (flags && flags != UMCG_CTL_WORKER) > > > + return -EINVAL; > > > + > > > + if (current->umcg_task || !self) > > > + return -EINVAL; > > > + > > > + if (copy_from_user(&ut, self, sizeof(ut))) > > > + return -EFAULT; > > > + > > > + if (ut.next_tid) > > > + return -EINVAL; > > > + > > > + if (flags == UMCG_CTL_WORKER) { > > > + if ((ut.state_ts & UMCG_TASK_STATE_MASK_FULL) != UMCG_TASK_BLOCKED) > > > > Or use UMCG_TASK_STATE_MASK that is enough. > > Do you have a use case for this (i.e. when state flags can be > legitimately set here)? At the moment I can't think of it, and I'd > rather keep things more strict to avoid dealing with unexpected use > cases in the future. When read through this thread, I am not realize that this time the state flags should not be set. But I need to go other round to be more clear like I'm now reading the doc again.. > > > + return -EINVAL; > > > + > > > + WRITE_ONCE(current->umcg_task, self); > > > + current->flags |= PF_UMCG_WORKER; > > > + > > > + /* Trigger umcg_handle_resuming_worker() */ > > > + set_tsk_thread_flag(current, TIF_NOTIFY_RESUME); > > > + } else { > > > + if ((ut.state_ts & UMCG_TASK_STATE_MASK_FULL) != UMCG_TASK_RUNNING) > > > > The same here. > > Yes, the same here - why do you think task state flags should be allowed here? > > > > > > > + return -EINVAL; > > > + > > > + WRITE_ONCE(current->umcg_task, self); > > > + } > > > + > > > + return 0; > > > +} > > > + > > > > > > Thanks, > > Tao