Received: by 2002:a05:6a10:16a7:0:0:0:0 with SMTP id gp39csp1471914pxb; Fri, 20 Nov 2020 10:15:27 -0800 (PST) X-Google-Smtp-Source: ABdhPJwJKFQjKBOZlu99xfziGY2+yupvcRZvVPKKx6DG6R/KCEoCu5XKNUXvNhIVCNQ7ItZ3+zqG X-Received: by 2002:a50:d490:: with SMTP id s16mr18112691edi.187.1605896127459; Fri, 20 Nov 2020 10:15:27 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1605896127; cv=none; d=google.com; s=arc-20160816; b=PXMZ9sp+rBbIJY02jwLq9++r0GUAyBKmoVTed+zhcpIAKLDi53rEYSjiTK8NOy9oeT YsoaLGg7pZR2WBXVQwQiGSj4RMSl9FUvRr9AYcbRgMDLj1UlY4Q8XvsKtVIibPUEOUKu kp9BTK/dSKRAVr5m73aAhxNwkFarwDEivKLubw5QNUe9ORHa3Ozdi0gB5m6y0ZN4Mt86 iasd39bCU5OTBO0QsTzCSujkjoRqzuDKdRhuLwyTUSKcU6FxzhH6mXI5DVRlcKmnKLil QhGPNe4SzOY/rQ22ggfuZpDrDeBs/vDcPTFJAZyozMgPdTnYkNPkygkTiVTc/bbYKzeX GvCw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-language:content-transfer-encoding :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject:dkim-signature:dkim-filter; bh=o3HGfEV+KvcBH59b0ShX6ILxpiauc3u3nnu8jBOhqu8=; b=c/3l8Td+S9O2Fhewc9HEnoE03fk895KyxzdOE0NDXDAS5zYWFg7jE2YcojoLFVrXv7 BWXFqrUw2dictKhMgcWeXPdLpP8P8T1ZbZZzysh5LFm4cxBVa5Ibr8tfJJPaky+11C5T 5YNoGqE2P8XIhXPdVIrw79GakFghbqi9FqubNDC1gGjF/CEnt6wXGrKBU+ZT89F8TvZp LIezihfZ/kk9PDXj8LRF9b3SQkAj3BAxUO7s0YpiZDOuCycxDJWgh/jD8iUTEkxvOdJM 5r9tiTwi3HrlhV2Km0qs3bm0u8aZ9rDrKPGFDcluwEklEarzuIQu0RbmLHWQ9vgSafIi PGtw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=mChXe2Xy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.microsoft.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id t2si2269327edv.271.2020.11.20.10.15.02; Fri, 20 Nov 2020 10:15:27 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=mChXe2Xy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.microsoft.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729305AbgKTSLQ (ORCPT + 99 others); Fri, 20 Nov 2020 13:11:16 -0500 Received: from linux.microsoft.com ([13.77.154.182]:34522 "EHLO linux.microsoft.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729508AbgKTSLN (ORCPT ); Fri, 20 Nov 2020 13:11:13 -0500 Received: from [192.168.86.179] (c-73-38-52-84.hsd1.vt.comcast.net [73.38.52.84]) by linux.microsoft.com (Postfix) with ESMTPSA id D301E20B717A; Fri, 20 Nov 2020 10:11:07 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com D301E20B717A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1605895872; bh=o3HGfEV+KvcBH59b0ShX6ILxpiauc3u3nnu8jBOhqu8=; h=Subject:To:Cc:References:From:Date:In-Reply-To:From; b=mChXe2XyjeoACPf6tie7qgKERp54WJ9C483uVNUX2LsivEAa30sUYo41XFsixb0Za ceqjBlDYfyAHhLZNldwbHFYf+TO36eNNjsSHz44VvPSoVDNNo8C8/rQlXjIUQrZZV7 jDol1/79uRl+bRC189L4xfbZwZnqppaUMjjG8gJo= Subject: Re: [PATCH -tip 03/32] sched/fair: Fix pick_task_fair crashes due to empty rbtree To: "Singh, Balbir" , "Joel Fernandes (Google)" , Nishanth Aravamudan , Julien Desfossez , Peter Zijlstra , Tim Chen , Aaron Lu , Aubrey Li , tglx@linutronix.de, linux-kernel@vger.kernel.org Cc: mingo@kernel.org, torvalds@linux-foundation.org, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Phil Auld , Valentin Schneider , Mel Gorman , Pawan Gupta , Paolo Bonzini , vineeth@bitbyteword.org, Chen Yu , Christian Brauner , Agata Gruza , Antonio Gomez Iglesias , graf@amazon.com, konrad.wilk@oracle.com, dfaggioli@suse.com, pjt@google.com, rostedt@goodmis.org, derkling@google.com, benbjiang@tencent.com, Alexandre Chartre , James.Bottomley@hansenpartnership.com, OWeisse@umich.edu, Dhaval Giani , Junaid Shahid , jsbarnes@google.com, chris.hyser@oracle.com, Ben Segall , Josh Don , Hao Luo , Tom Lendacky , Aubrey Li , "Paul E. McKenney" , Tim Chen References: <20201117232003.3580179-1-joel@joelfernandes.org> <20201117232003.3580179-4-joel@joelfernandes.org> From: Vineeth Pillai Message-ID: Date: Fri, 20 Nov 2020 13:11:06 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.2 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Content-Language: en-US Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Balbir, On 11/20/20 5:15 AM, Singh, Balbir wrote: > On 18/11/20 10:19 am, Joel Fernandes (Google) wrote: >> From: Peter Zijlstra >> >> pick_next_entity() is passed curr == NULL during core-scheduling. Due to >> this, if the rbtree is empty, the 'left' variable is set to NULL within >> the function. This can cause crashes within the function. >> >> This is not an issue if put_prev_task() is invoked on the currently >> running task before calling pick_next_entity(). However, in core >> scheduling, it is possible that a sibling CPU picks for another RQ in >> the core, via pick_task_fair(). This remote sibling would not get any >> opportunities to do a put_prev_task(). >> >> Fix it by refactoring pick_task_fair() such that pick_next_entity() is >> called with the cfs_rq->curr. This will prevent pick_next_entity() from >> crashing if its rbtree is empty. >> >> Also this fixes another possible bug where update_curr() would not be >> called on the cfs_rq hierarchy if the rbtree is empty. This could effect >> cross-cpu comparison of vruntime. >> > It is not clear from the changelog as to what does put_prev_task() do to prevent > the crash from occuring? Why did we pass NULL as curr in the first place to > pick_next_entity? A little more context on this crash in v8 is here: https://lwn.net/ml/linux-kernel/8230ada7-839f-2335-9a55-b09f6a813e91@linux.microsoft.com/ The issue here arises from the fact that, we try to pick task for a sibling while sibling is running a task. Running tasks are not in the cfs_rq and pick_next_entity can return NULL if there is only one cfs task in the cfs_rq. This would not happen normally because put_prev_task is called before pick_task and put_prev_task adds the task back to cfs_rq. But for coresched, pick_task is called on a remote sibling's cfs_rq without calling put_prev_task and this can lead to pick_next_entity returning NULL. The initial logic of passing NULL would work fine as long as we call put_prev_task before calling pick_task_fair. But for coresched, we call pick_task_fair on siblings while the task is running and would not be able to call put_prev_task. So this refactor of the code fixes the crash by explicitly passing curr. Hope this clarifies.. Thanks, Vineeth