Date: Tue, 7 Jul 2020 10:34:47 +0100
From: Qais Yousef
To: Valentin Schneider
Cc: Ingo Molnar, Peter Zijlstra, Doug Anderson, Jonathan Corbet,
    Juri Lelli, Vincent Guittot, Dietmar Eggemann, Steven Rostedt,
    Ben Segall, Mel Gorman, Luis Chamberlain, Kees Cook, Iurii Zaikin,
    Quentin Perret, Patrick Bellasi, Pavan Kondeti,
    linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v6 1/2] sched/uclamp: Add a new sysctl to control RT default boost value
Message-ID: <20200707093447.4t6eqjy4fkt747fo@e107158-lin.cambridge.arm.com>
References: <20200706142839.26629-1-qais.yousef@arm.com> <20200706142839.26629-2-qais.yousef@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/06/20 16:49, Valentin Schneider wrote:
>
> On 06/07/20 15:28, Qais Yousef wrote:
> > CC: linux-fsdevel@vger.kernel.org
> > ---
> >
> > Peter
> >
> > I didn't do the
> >
> >       read_lock(&tasklist_lock);
> >       smp_mb__after_spinlock();
> >       read_unlock(&tasklist_lock);
> >
> > dance you suggested on IRC as it didn't seem necessary. But maybe I missed
> > something.
> >
>
> So the annoying bit with just uclamp_fork() is that it happens *before* the
> task is appended to the tasklist. This means without too much care we
> would have (if we'd do a sync at uclamp_fork()):
>
>   CPU0 (sysctl write)                CPU1 (concurrent forker)
>
>                                        copy_process()
>                                          uclamp_fork()
>                                            p.uclamp_min = state
>     state = foo
>
>     for_each_process_thread(p, t)
>       update_state(t);
>                                          list_add(p)
>
> i.e. that newly forked process would entirely sidestep the update. Now,
> with Peter's suggested approach we can be in a much better situation. If we
> have this in the sysctl update:
>
>   state = foo;
>
>   read_lock(&tasklist_lock);
>   smp_mb__after_spinlock();
>   read_unlock(&tasklist_lock);
>
>   for_each_process_thread(p, t)
>     update_state(t);
>
> While having this in the fork:
>
>   write_lock(&tasklist_lock);
>   list_add(p);
>   write_unlock(&tasklist_lock);
>
>   sched_post_fork(p); // state re-read here; probably wants an mb first
>
> Then we can no longer miss an update. If the forked p doesn't see the new
> value, it *must* have been added to the tasklist before the updater loops
> over it, so the loop will catch it. If it sees the new value, we're done.

uclamp_fork() has nothing to do with the race. If copy_process() duplicates the
task_struct of an RT task, it'll copy the old value.

I'd expect the newly introduced sched_post_fork() (also in copy_process() after
the list update) to prevent this race altogether.

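
To make the case analysis above concrete, here is a minimal sketch: a
single-threaded userspace model that replays the interleavings one step at a
time. It is not kernel code. Every model_* name, the fixed-size array standing
in for the tasklist and the value 1024 are assumptions made for illustration.
It deliberately ignores locking and memory ordering, so it only shows the
"either the loop catches the task or the re-read does" argument, not that the
barriers are sufficient.

```c
/* Userspace, single-threaded model; NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_TASKS 8

struct model_task {
	int uclamp_min;                 /* per-task copy of the default */
};

static int model_state;                 /* stands in for the sysctl value */
static struct model_task *model_tasklist[MODEL_MAX_TASKS];
static size_t model_nr_tasks;

/* Stand-in for list_add(p) done by copy_process(). */
static void model_list_add(struct model_task *p)
{
	assert(model_nr_tasks < MODEL_MAX_TASKS);
	model_tasklist[model_nr_tasks++] = p;
}

/* Stand-in for sched_post_fork(): re-read the state after the list update. */
static void model_post_fork(struct model_task *p)
{
	p->uclamp_min = model_state;
}

/* Stand-in for the sysctl write: set the state, then walk every listed task. */
static void model_sysctl_write(int new_state)
{
	model_state = new_state;
	for (size_t i = 0; i < model_nr_tasks; i++)
		model_tasklist[i]->uclamp_min = new_state;
}

/* Interleaving A: the child is on the list before the updater walks it. */
static int model_fork_then_update(void)
{
	struct model_task parent = { .uclamp_min = 0 }, child;

	model_state = 0;
	model_nr_tasks = 0;
	model_list_add(&parent);
	child = parent;                 /* task_struct duplicated: old value */
	model_list_add(&child);
	model_post_fork(&child);        /* re-read still sees the old state */
	model_sysctl_write(1024);       /* the walk catches the child */
	return child.uclamp_min;
}

/* Interleaving B: the updater walks the list before the child is added. */
static int model_update_before_list(bool with_post_fork)
{
	struct model_task parent = { .uclamp_min = 0 }, child;

	model_state = 0;
	model_nr_tasks = 0;
	model_list_add(&parent);
	child = parent;                 /* task_struct duplicated: old value */
	model_sysctl_write(1024);       /* child not listed yet: walk misses it */
	model_list_add(&child);
	if (with_post_fork)
		model_post_fork(&child); /* re-read picks up the new state */
	return child.uclamp_min;
}
```

In this model the child only keeps the old value in interleaving B when the
re-read after the list update is left out.
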
Now we could end up with a problem if for_each_process_thread() doesn't see the
newly forked task _after_ sched_post_fork(). Hence my question to Peter.

>
> AIUI, the above strategy doesn't require any use of RCU. The update_state()
> and sched_post_fork() can race, but as per the above they should both be
> writing the same value.

for_each_process_thread() must be protected by either tasklist_lock or
rcu_read_lock().

The other RCU logic I added is not to protect against the race above. I
describe the other race condition in a comment. Basically another updater on a
different cpu via fork() and sched_setattr() might read an old value and get
preempted. The rcu synchronization will ensure concurrent updaters have
finished before iterating the list.

Thanks

--
Qais Yousef
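
The second race (an updater that reads the old value and is then preempted)
can be replayed the same way. This is a rough sketch only: a single-threaded
userspace model with hypothetical stale_* names, where the wait_for_updaters
flag stands in for the RCU synchronization by letting the in-flight updater
finish before the walk. It does not use or imitate the real RCU API.

```c
/* Userspace, single-threaded model; NOT kernel code. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

static int stale_rt_default;    /* stands in for the sysctl-controlled default */
static int stale_task_value;    /* stands in for one RT task's uclamp_min */

/*
 * The updater (the fork()/sched_setattr() side) is split in two halves so
 * the model can "preempt" it between the read and the write.
 */
static int stale_updater_read(void)
{
	return stale_rt_default;
}

static void stale_updater_write(int snapshot)
{
	stale_task_value = snapshot;
}

/* Stand-in for the sysctl side iterating the list and updating the task. */
static void stale_sysctl_walk(void)
{
	stale_task_value = stale_rt_default;
}

static int stale_model_run(bool wait_for_updaters)
{
	int snapshot;

	stale_rt_default = 0;
	stale_task_value = 0;

	snapshot = stale_updater_read();  /* reads old value, then preempted */
	stale_rt_default = 1024;          /* sysctl write happens meanwhile */

	if (wait_for_updaters) {
		stale_updater_write(snapshot);  /* updater finishes first */
		stale_sysctl_walk();            /* walk overwrites the stale value */
	} else {
		stale_sysctl_walk();
		stale_updater_write(snapshot);  /* stale write lands last */
	}
	return stale_task_value;
}
```

Without the wait the task ends up with the old value; with it, the walk always
runs last and the new value sticks.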