Received: by 2002:a25:31c3:0:0:0:0:0 with SMTP id x186csp696033ybx; Wed, 6 Nov 2019 07:06:47 -0800 (PST) X-Google-Smtp-Source: APXvYqwvLzt4JHz3hPNgb3tyIEsZfADd8HTWRw3bqULGVKJWkIwL/Ub0mIdsQPEo4Ncl02bRVsAX X-Received: by 2002:a17:906:6403:: with SMTP id d3mr9467090ejm.258.1573052807578; Wed, 06 Nov 2019 07:06:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1573052807; cv=none; d=google.com; s=arc-20160816; b=eARvS9VzLXdZvTVFmYMn/4niTADIeBfmXDEUvxDKGIkVEuJuoEEO9K+qcMT4mVTmva dqPNow2ZCWMMnp2jNbi16J+cA//2QcTGF7O4QjSRAoCT2qkaUcT5CTa/zwUnoPctfe2R +99nEHPCKppxBONxXklbPTedaLZ3IDr/gXAti4SOijj3RIN1CSDfxUEWGK071uJObrnE q5jMT545CkMrPwWlqAq5UML2uhtXTQwdMNbGK1kRAiCUQzLmBc2sIFP83keGq2X/u43P Qg8PJf5DxSV0SmOysYsa63ZrOtXrSrDPRt3/aH6GlQJlfG+YbGmLXxhc+3uPyOvHOhV6 AfPw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=7jRaRoS6+mP8m3o2PpATUS37dVZwWiVNUX/LET2BrQI=; b=STs3sAYwuqiZj0TlGbYha6AYGTrnYTfjo7yaasHWIfHG6yS0Puv4M0qnWHdGbq84/O IS/kDOHn+Zns8NmLLwG/j08qMAZ+B9snt6dgPAZz8GlNFs3vzewD8BObcJpkU0TebjCZ TmHPg0i4TXZ6gjqhmy8d1kHjV/mmyxn8Gm98oR8APxffpX9URRILohLIGsr8/2joBjf4 ehQ5EvjizBw+859qAK24eo/xuPvYSaKmHCMQbtwtjxyu/XWLUookSnjMKiG4f+1Rv7Fy VeYxfm3o4m+46G0/jc9yzYNCMRAv1nKpTeHgt5xa9re/pUzs7msKqR6hFoA45MSR+oan VF8Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b17si13074289edj.417.2019.11.06.07.06.23; Wed, 06 Nov 2019 07:06:47 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727824AbfKFPE4 (ORCPT + 99 others); Wed, 6 Nov 2019 10:04:56 -0500 Received: from foss.arm.com ([217.140.110.172]:41416 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727548AbfKFPE4 (ORCPT ); Wed, 6 Nov 2019 10:04:56 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 625E97CD; Wed, 6 Nov 2019 07:04:55 -0800 (PST) Received: from e107158-lin.cambridge.arm.com (e107158-lin.cambridge.arm.com [10.1.195.21]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 442CB3F71A; Wed, 6 Nov 2019 07:04:53 -0800 (PST) Date: Wed, 6 Nov 2019 15:04:50 +0000 From: Qais Yousef To: Peter Zijlstra Cc: Quentin Perret , linux-kernel@vger.kernel.org, aaron.lwe@gmail.com, valentin.schneider@arm.com, mingo@kernel.org, pauld@redhat.com, jdesfossez@digitalocean.com, naravamudan@digitalocean.com, vincent.guittot@linaro.org, dietmar.eggemann@arm.com, juri.lelli@redhat.com, rostedt@goodmis.org, bsegall@google.com, mgorman@suse.de, kernel-team@android.com, john.stultz@linaro.org Subject: Re: NULL pointer dereference in pick_next_task_fair Message-ID: <20191106150450.fa5ppdejiggsb46a@e107158-lin.cambridge.arm.com> References: <20191028174603.GA246917@google.com> <20191106120525.GX4131@hirez.programming.kicks-ass.net> <20191106130838.GL5671@hirez.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20191106130838.GL5671@hirez.programming.kicks-ass.net> User-Agent: NeoMutt/20171215 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/06/19 14:08, Peter Zijlstra wrote: > On Wed, Nov 06, 2019 at 01:05:25PM +0100, Peter Zijlstra wrote: > > On Mon, Oct 28, 2019 at 05:46:03PM +0000, Quentin Perret wrote: > > > > > > After digging a bit, the offending commit seems to be: > > > > > > 67692435c411 ("sched: Rework pick_next_task() slow-path") > > > > > > By 'offending' I mean that reverting it makes the issue go away. The > > > issue comes from the fact that pick_next_entity() returns a NULL se in > > > the 'simple' path of pick_next_task_fair(), which causes obvious > > > problems in the subsequent call to set_next_entity(). > > > > > > I'll dig more, but if anybody understands the issue in the meatime feel > > > free to send me a patch to try out :) > > > > So for all those who didn't follow along on IRC, the below seems to cure > > things. > > > > The only thing I'm now considering is if we shouldn't be setting > > ->on_cpu=2 _before_ calling put_prev_task(). I'll go audit the RT/DL > > cases. > > So I think it all works, but that's more by accident than anything else. > I'll move the ->on_cpu=2 assignment earlier. That clearly avoids calling > put_prev_task() while we're in put_prev_task(). Did you mean avoids calling *set_next_task()* while we're in put_prev_task()? So what you're saying is that put_prev_task_{rt,dl}() could drop the rq_lock() too and the race could happen while we're inside these functions, correct? Or is it a different reason? By the way, is all reads/writes to ->on_cpu happen when a lock is held? Ie: we don't need to use any smp read/write barriers? Cheers -- Qais Yousef