From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Mel Gorman, "Peter Zijlstra (Intel)", Valentin Schneider, Linus Torvalds, Mike Galbraith, Thomas Gleixner, Ingo Molnar
Subject: [PATCH 4.19 084/101] sched/fair: Do not re-read ->h_load_next during hierarchical load calculation
Date: Mon, 15 Apr 2019 20:59:22 +0200
Message-Id:
<20190415183744.847322543@linuxfoundation.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190415183740.341577907@linuxfoundation.org>
References: <20190415183740.341577907@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Mel Gorman

commit 0e9f02450da07fc7b1346c8c32c771555173e397 upstream.

A NULL pointer dereference bug was reported on a distribution kernel but
the same issue should be present on the mainline kernel. It occurred on
s390 but should not be arch-specific. A partial oops looks like:

  Unable to handle kernel pointer dereference in virtual kernel address space
  ...
  Call Trace:
    ...
    try_to_wake_up+0xfc/0x450
    vhost_poll_wakeup+0x3a/0x50 [vhost]
    __wake_up_common+0xbc/0x178
    __wake_up_common_lock+0x9e/0x160
    __wake_up_sync_key+0x4e/0x60
    sock_def_readable+0x5e/0x98

The bug takes anywhere between 1 hour and 3 days to hit. The dereference
occurs in update_cfs_rq_h_load() when accumulating h_load. The problem is
that cfs_rq->h_load_next is not protected by any locking and can be
updated by parallel calls to task_h_load(). Depending on the compiler,
code may be generated that re-reads cfs_rq->h_load_next after the check
for NULL and then oopses when reading se->avg.load_avg. The disassembly
showed that it was possible to re-read h_load_next after the check for
NULL.

While this does not appear to be an issue for later compilers, it's still
an accident if the correct code is generated. Full locking in this path
would have high overhead so this patch uses READ_ONCE() to read
h_load_next only once and check for NULL before dereferencing. It was
confirmed that there were no further oopses after 10 days of testing.

As Peter pointed out, it is also necessary to use WRITE_ONCE() to avoid
any potential problems with store tearing.

Signed-off-by: Mel Gorman
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Valentin Schneider
Cc: Linus Torvalds
Cc: Mike Galbraith
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc:
Fixes: 685207963be9 ("sched: Move h_load calculation to task_h_load()")
Link: https://lkml.kernel.org/r/20190319123610.nsivgf3mjbjjesxb@techsingularity.net
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman
---
 kernel/sched/fair.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7437,10 +7437,10 @@ static void update_cfs_rq_h_load(struct
 	if (cfs_rq->last_h_load_update == now)
 		return;
 
-	cfs_rq->h_load_next = NULL;
+	WRITE_ONCE(cfs_rq->h_load_next, NULL);
 	for_each_sched_entity(se) {
 		cfs_rq = cfs_rq_of(se);
-		cfs_rq->h_load_next = se;
+		WRITE_ONCE(cfs_rq->h_load_next, se);
 		if (cfs_rq->last_h_load_update == now)
 			break;
 	}
@@ -7450,7 +7450,7 @@ static void update_cfs_rq_h_load(struct
 		cfs_rq->last_h_load_update = now;
 	}
 
-	while ((se = cfs_rq->h_load_next) != NULL) {
+	while ((se = READ_ONCE(cfs_rq->h_load_next)) != NULL) {
 		load = cfs_rq->h_load;
 		load = div64_ul(load * se->avg.load_avg,
 			cfs_rq_load_avg(cfs_rq) + 1);
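
For readers unfamiliar with the hazard, the re-read problem described in
the changelog can be sketched in user space roughly as follows. This is
only an illustration under stated assumptions: the READ_ONCE()/WRITE_ONCE()
macros below are simplified stand-ins (the kernel's real definitions are
more elaborate), and struct entity, struct runqueue, scale_load_plain()
and scale_load_once() are hypothetical names that do not exist in
kernel/sched/fair.c.

```c
#include <stddef.h>

/*
 * Simplified user-space stand-ins for the kernel macros (assumption: the
 * real kernel definitions differ). The volatile-qualified access asks the
 * compiler to emit exactly one load or store for the expression.
 * __typeof__ is a GCC/Clang extension.
 */
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, cut-down stand-ins for the scheduler structures. */
struct entity {
	unsigned long load_avg;
};

struct runqueue {
	struct entity *h_load_next;	/* may be updated by another CPU */
	unsigned long h_load;
};

/*
 * Plain access: the compiler is allowed to discard the local copy and
 * load rq->h_load_next from memory a second time for the dereference.
 * If another CPU stores NULL between the two loads, the NULL check has
 * passed but the dereference faults.
 */
unsigned long scale_load_plain(struct runqueue *rq)
{
	struct entity *se = rq->h_load_next;

	if (se != NULL)
		return rq->h_load * se->load_avg;
	return 0;
}

/*
 * READ_ONCE() access: the pointer is loaded exactly once, so the value
 * that was checked against NULL is the same value that is dereferenced.
 */
unsigned long scale_load_once(struct runqueue *rq)
{
	struct entity *se = READ_ONCE(rq->h_load_next);

	if (se != NULL)
		return rq->h_load * se->load_avg;
	return 0;
}
```

Single-threaded, both functions return the same result; the difference
only matters when a concurrent writer (which would use WRITE_ONCE() on
h_load_next, as in the patch) races with the reader. The sketch does not
attempt to reproduce the race itself.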