Received: by 2002:a25:8b12:0:0:0:0:0 with SMTP id i18csp1447867ybl; Thu, 22 Aug 2019 14:57:24 -0700 (PDT) X-Google-Smtp-Source: APXvYqwW20YDTlt3YiGo3o1VWK58MUN93u4lVNtg+URt4y5eocYB3o54iKURiXM1tH5Rosx3mD2k X-Received: by 2002:a63:8dc9:: with SMTP id z192mr1110433pgd.151.1566511043809; Thu, 22 Aug 2019 14:57:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1566511043; cv=none; d=google.com; s=arc-20160816; b=maeYtpau3L/Lf4KpG8FW99TPgM9jPszNEwMvcBmBRfzIdZCXo/UeWgXH474eRxtERx Krkc6gnMQIzzvHCWzBNlm24vtTZ0YT/0j/HzErWjw3wPdpGBWCUuDEypDHLLCvsDqBLX lxxoobXKYMQvG7Ij39KpPQ4aBfNfQx5YFhAWTj83kkPG2Zi0YEovHBYqBghN9isLkOwN ggNk76nSUfAAQSk9Ab7GfutnpvNFG6b0G1SKkx4wbRyaV3Nw8qUwfPkI0umyOeZJVeBK a3t2Qeq572fzmpdYTdlb5DlTz47cm3r8QLzpaGjJL/ELonbbCZ6SrFgNsumEWVnFTpHp sqCg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=pTQuSaZ0y064hBW3JcT9xVAG9FaJRln7HthFHLIaxKw=; b=UA0FOLuFWL6y4eHFcDRO7sZX/CFN93mH2gsm3FrIsuMt8ab+QtmXZhuPZjwhR2+JSB tMtAzI55kq1ufKUqrW6sAk/cHkQRLSSIIpo3wvzkLkaoNq1sEirh0uby/HoJKnTXiFry jE8qQEIHbXRV8dQiB3vvBwr0gJTQCscK292fNWBS4ZjJSkVPk3bSY4/w2Cyhi4DAp62a 8zuDvkJld2Bxg69G/fPXaiupm4mfBzzPESy2t76J5hQrmDAB/TjGWe6H2r1ClEeQ5ihd vV13RYak6CCyiufAl4yavZAkP/02YqFicLKjboj6o87Q0dLVtCawFqKTo8l+K5JWDP0y +MNQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=rE3mh1Kh; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 185si704261pfg.261.2019.08.22.14.57.07; Thu, 22 Aug 2019 14:57:22 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=rE3mh1Kh; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390175AbfHVQv6 (ORCPT + 99 others); Thu, 22 Aug 2019 12:51:58 -0400 Received: from merlin.infradead.org ([205.233.59.134]:58768 "EHLO merlin.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730719AbfHVQv6 (ORCPT ); Thu, 22 Aug 2019 12:51:58 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id: List-Help:List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive; bh=pTQuSaZ0y064hBW3JcT9xVAG9FaJRln7HthFHLIaxKw=; b=rE3mh1KhRD5albGjLJLcHMxXz uQNpAHKi9fUnp//XhXuJoL1qG+Jm/bqYzByuMFhMko+u1R4QDlgY0r+Ywvs2wPJz0BKm2xaJp8blp 2qGz6+QHP1KNkeJR9hOvymZfHbQ8lfGAD6rR5OW39+chPUKnsFI6NuQ8aRoxzAarhX9QfxphtsaWS FHn73cIlD3SxBrvcCOw7SqjQPZL6yMt/5FzvKbAn96rwlAV8FDUTU62D+fzyZm969qbbsRI5xmV67 T7xCftbICv7arb4obbaajCzBga5+sgNfPJ9x5B29qcwCY3iALom+N7kFNDJhaRhjnagrs6/pcmF9E DAtVXFinA==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.92 #3 (Red Hat Linux)) id 1i0qJ7-0006C3-3W; Thu, 22 Aug 2019 16:51:29 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id 6AA87307145; Thu, 22 Aug 2019 18:50:53 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id 7EA2720B74844; Thu, 22 Aug 2019 18:51:25 +0200 (CEST) Date: Thu, 22 Aug 2019 18:51:25 +0200 From: Peter Zijlstra To: Alessio Balsini Cc: mingo@kernel.org, juri.lelli@redhat.com, linux-kernel@vger.kernel.org, dietmar.eggemann@arm.com, luca.abeni@santannapisa.it, bristot@redhat.com, dvyukov@google.com, tglx@linutronix.de, vpillai@digitalocean.com, rostedt@goodmis.org, kernel-team@android.com Subject: Re: [RFC][PATCH 01/13] sched/deadline: Impose global limits on sched_attr::sched_period Message-ID: <20190822165125.GW2369@hirez.programming.kicks-ass.net> References: <20190726145409.947503076@infradead.org> <20190726161357.397880775@infradead.org> <20190802172104.GA134279@google.com> <20190805115309.GJ2349@hirez.programming.kicks-ass.net> <20190822122949.GA245353@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20190822122949.GA245353@google.com> User-Agent: Mutt/1.10.1 (2018-07-13) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Aug 22, 2019 at 01:29:49PM +0100, Alessio Balsini wrote: > Yes, that's exactly what I meant! > What about this refactoring? Makes sense; and now I spot a race (and sched_rt_handler() from which I copied this is susceptible too): > +int sched_dl_period_handler(struct ctl_table *table, int write, > + void __user *buffer, size_t *lenp, > + loff_t *ppos) > +{ > + unsigned int old_max, old_min; > + static DEFINE_MUTEX(mutex); > + int ret; > + > + mutex_lock(&mutex); > + old_max = sysctl_sched_dl_period_max; > + old_min = sysctl_sched_dl_period_min; > + > + ret = proc_douintvec(table, write, buffer, lenp, ppos); Here the sysctl_* values have been changed, interleave with: > + if (!ret && write) { > + u64 max = (u64)sysctl_sched_dl_period_max * NSEC_PER_USEC; > + u64 min = (u64)sysctl_sched_dl_period_min * NSEC_PER_USEC; > + > + if (min < 1ULL << DL_SCALE || max < min) { > + sysctl_sched_dl_period_max = old_max; > + sysctl_sched_dl_period_min = old_min; > + ret = -EINVAL; > + } > + } > + mutex_unlock(&mutex); > + > + return ret; > +} > @@ -2675,12 +2713,21 @@ bool __checkparam_dl(const struct sched_attr *attr) > attr->sched_period & (1ULL << 63)) > return false; > > + period = attr->sched_period; > + if (!period) > + period = attr->sched_deadline; > + > /* runtime <= deadline <= period (if period != 0) */ > - if ((attr->sched_period != 0 && > - attr->sched_period < attr->sched_deadline) || > + if (period < attr->sched_deadline || > attr->sched_deadline < attr->sched_runtime) > return false; > > + max = (u64)READ_ONCE(sysctl_sched_dl_period_max) * NSEC_PER_USEC; > + min = (u64)READ_ONCE(sysctl_sched_dl_period_min) * NSEC_PER_USEC; this, and we're using unvalidated numbers. > + if (period < min || period > max) > + return false; > + > return true; > } Something like the completely untested (again, sorry) below ought to cure that I think; the same needs doing to sched_rt_handler() I'm thinking. --- Subject: sched/deadline: Impose global limits on sched_attr::sched_period From: Alessio Balsini Date: Thu, 22 Aug 2019 13:29:49 +0100 There are two DoS scenarios with SCHED_DEADLINE related to sched_attr::sched_period: - since access-control only looks at utilization and density, a very large period can allow a very large runtime, which in turn can incur a very large latency to lower priority tasks. - for very short periods we can end up spending more time programming the hardware timer than actually running the task. Mitigate these by imposing limits on the period. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Alessio Balsini Cc: rostedt@goodmis.org Cc: mingo@kernel.org Cc: luca.abeni@santannapisa.it Cc: bristot@redhat.com Cc: vpillai@digitalocean.com Cc: kernel-team@android.com Cc: juri.lelli@redhat.com Cc: dietmar.eggemann@arm.com Cc: dvyukov@google.com Cc: tglx@linutronix.de --- include/linux/sched/sysctl.h | 7 +++++ kernel/sched/deadline.c | 58 +++++++++++++++++++++++++++++++++++++++++-- kernel/sysctl.c | 14 ++++++++++ 3 files changed, 77 insertions(+), 2 deletions(-) --- a/include/linux/sched/sysctl.h +++ b/include/linux/sched/sysctl.h @@ -56,6 +56,13 @@ int sched_proc_update_handler(struct ctl extern unsigned int sysctl_sched_rt_period; extern int sysctl_sched_rt_runtime; +extern unsigned int sysctl_sched_dl_period_max; +extern unsigned int sysctl_sched_dl_period_min; + +extern int sched_dl_period_handler(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, + loff_t *ppos); + #ifdef CONFIG_UCLAMP_TASK extern unsigned int sysctl_sched_uclamp_util_min; extern unsigned int sysctl_sched_uclamp_util_max; --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2633,6 +2633,49 @@ void __getparam_dl(struct task_struct *p } /* + * Default limits for DL period: on the top end we guard against small util + * tasks still getting ridiculous long effective runtimes, on the bottom end we + * guard against timer DoS. + */ +unsigned int sysctl_sched_dl_period_max = 1 << 22; /* ~4 seconds */ +unsigned int sysctl_sched_dl_period_min = 100; /* 100 us */ + +int sched_dl_period_handler(struct ctl_table *table, int write, + void __user *buffer, size_t *lenp, + loff_t *ppos) +{ + unsigned int new_max, new_min; + struct ctl_table new_table; + static DEFINE_MUTEX(mutex); + int ret; + + mutex_lock(&mutex); + new_max = sysctl_sched_dl_period_max; + new_min = sysctl_sched_dl_period_min; + new_table = *table; + if (new_table.data == &sysctl_sched_dl_period_max) + new_table.data = &new_max; + else + new_table.data = &new_min; + + ret = proc_douintvec(&new_table, write, buffer, lenp, ppos); + if (!ret && write) { + u64 max = (u64)new_max * NSEC_PER_USEC; + u64 min = (u64)new_min * NSEC_PER_USEC; + + if (min > 1ULL << DL_SCALE && max > min) { + WRITE_ONCE(sysctl_sched_dl_period_max, new_max); + WRITE_ONCE(sysctl_sched_dl_period_min, new_min); + } else { + ret = -EINVAL; + } + } + mutex_unlock(&mutex); + + return ret; +} + +/* * This function validates the new parameters of a -deadline task. * We ask for the deadline not being zero, and greater or equal * than the runtime, as well as the period of being zero or @@ -2644,6 +2687,8 @@ void __getparam_dl(struct task_struct *p */ bool __checkparam_dl(const struct sched_attr *attr) { + u64 period, max, min; + /* special dl tasks don't actually use any parameter */ if (attr->sched_flags & SCHED_FLAG_SUGOV) return true; @@ -2667,12 +2712,21 @@ bool __checkparam_dl(const struct sched_ attr->sched_period & (1ULL << 63)) return false; + period = attr->sched_period; + if (!period) + period = attr->sched_deadline; + /* runtime <= deadline <= period (if period != 0) */ - if ((attr->sched_period != 0 && - attr->sched_period < attr->sched_deadline) || + if (period < attr->sched_deadline || attr->sched_deadline < attr->sched_runtime) return false; + max = (u64)READ_ONCE(sysctl_sched_dl_period_max) * NSEC_PER_USEC; + min = (u64)READ_ONCE(sysctl_sched_dl_period_min) * NSEC_PER_USEC; + + if (period < min || period > max) + return false; + return true; } --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -443,6 +443,20 @@ static struct ctl_table kern_table[] = { .proc_handler = sched_rt_handler, }, { + .procname = "sched_deadline_period_max_us", + .data = &sysctl_sched_dl_period_max, + .maxlen = sizeof(unsigned int), + .mode = 0644, + .proc_handler = sched_dl_period_handler, + }, + { + .procname = "sched_deadline_period_min_us", + .data = &sysctl_sched_dl_period_min, + .maxlen = sizeof(unsigned int), + .mode = 0644, + .proc_handler = sched_dl_period_handler, + }, + { .procname = "sched_rr_timeslice_ms", .data = &sysctl_sched_rr_timeslice, .maxlen = sizeof(int),