Received: by 2002:a05:6a10:22f:0:0:0:0 with SMTP id 15csp3705385pxk; Tue, 29 Sep 2020 04:18:24 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw3rd1qfsvQzwS8IPm3rkQ7ewk0a2EQlzMrce6Ywx/VR5Bi3Quu2WPgKVWHVjgkYEndsOPN X-Received: by 2002:aa7:dd8d:: with SMTP id g13mr2739919edv.324.1601378304681; Tue, 29 Sep 2020 04:18:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1601378304; cv=none; d=google.com; s=arc-20160816; b=S5yW9FaBoex2ooBiYMhuK0iwL3ELL5+a6i4kasorh0+r9/+XvQ67VIK0GlZl9t+Hpp IZu3I+6axfjRQLzs/Lv1OZPejHmlqEike66D9oLLKb7dg8cpq2kU2XTg40zC/y4lFeg3 nWclGYyvoZQlkhpyMAP2F6BfjwakkkmC/Yd4qd7mxaLJ8ATTCy0neiDh3stSzfERGn80 VpvHU5B1o66sqPLwJYTPalstdJ4NaPx2s4ns/vC+nSqU9756dzbGVaN0MoJ7rz259qXt FJtYO1ZLpatQOqSRYh+A29wG18kJfF2bfPKkUvryiyoJaQkmvnuLERwcGVzto8krc+7X kj8g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature; bh=t07ramQHMnQJOO6yuU/B52JzZsEyOQV7VkA0rjKhRGQ=; b=mP6u2AxD1a9LxL0C71ZDVOKX2e7nULGuLysz1pNWdrScIOOUUsUlz8jj8BEjH6mTLV FnAI3ACi1kFUyUAMLfH7O1EfMpuwzvNWgQUCPX81hbt1BSzs4bs7PJdeWAMIHL/kPrh3 gU2j2VCaLewWHwxjrgCFa3n4fpAphoODkwaZZMYCSWrVr+Twmk/eWOg93KxoFDT0mqSP DgYoE8hqNjD+fVknugaCnXpkxnxmQqLtjtPQLj0se03s5uIPxBQfjbtUcQA05Pt+kzQL pTV3QnANYIglN2xJXN83vibM7fXhvNt2BvESPTy+3mmTXV4Z2ESArQqOYdAPxHFbs2Aw 6hCQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=REtZmKdl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id c33si2894111edf.530.2020.09.29.04.18.02; Tue, 29 Sep 2020 04:18:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=REtZmKdl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linuxfoundation.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729348AbgI2LRK (ORCPT + 99 others); Tue, 29 Sep 2020 07:17:10 -0400 Received: from mail.kernel.org ([198.145.29.99]:33502 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729611AbgI2LQs (ORCPT ); Tue, 29 Sep 2020 07:16:48 -0400 Received: from localhost (83-86-74-64.cable.dynamic.v4.ziggo.nl [83.86.74.64]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 5F98E206A5; Tue, 29 Sep 2020 11:16:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1601378207; bh=coewqD0ZjIG/fSjV2ryhd1Cfjd2vswYmnGMbu0EDkPg=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=REtZmKdlCZ1NC/j19Lpz71i8IZWr0GLwrmPYm0NKQGQ2KazapJRWqbd1jpTvJ5kbS srCbhnIACoKBgR2FyMttPqOOrn3qOOslZ6aX3Z1bwezG9qKkNMBnLh2cnPI8JEVFHs 18Uabr4+YC5cLi2EY5ab4NzNQYoKCI1qK9Mmlb7w= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Qian Cai , Andrew Morton , Marco Elver , Matthew Wilcox , Linus Torvalds , Sasha Levin Subject: [PATCH 4.14 100/166] mm/vmscan.c: fix data races using kswapd_classzone_idx Date: Tue, 29 Sep 2020 13:00:12 +0200 Message-Id: <20200929105940.204483429@linuxfoundation.org> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200929105935.184737111@linuxfoundation.org> References: <20200929105935.184737111@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Qian Cai [ Upstream commit 5644e1fbbfe15ad06785502bbfe5751223e5841d ] pgdat->kswapd_classzone_idx could be accessed concurrently in wakeup_kswapd(). Plain writes and reads without any lock protection result in data races. Fix them by adding a pair of READ|WRITE_ONCE() as well as saving a branch (compilers might well optimize the original code in an unintentional way anyway). While at it, also take care of pgdat->kswapd_order and non-kswapd threads in allow_direct_reclaim(). The data races were reported by KCSAN, BUG: KCSAN: data-race in wakeup_kswapd / wakeup_kswapd write to 0xffff9f427ffff2dc of 4 bytes by task 7454 on cpu 13: wakeup_kswapd+0xf1/0x400 wakeup_kswapd at mm/vmscan.c:3967 wake_all_kswapds+0x59/0xc0 wake_all_kswapds at mm/page_alloc.c:4241 __alloc_pages_slowpath+0xdcc/0x1290 __alloc_pages_slowpath at mm/page_alloc.c:4512 __alloc_pages_nodemask+0x3bb/0x450 alloc_pages_vma+0x8a/0x2c0 do_anonymous_page+0x16e/0x6f0 __handle_mm_fault+0xcd5/0xd40 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x6f9 page_fault+0x34/0x40 1 lock held by mtest01/7454: #0: ffff9f425afe8808 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x143/0x6f9 do_user_addr_fault at arch/x86/mm/fault.c:1405 (inlined by) do_page_fault at arch/x86/mm/fault.c:1539 irq event stamp: 6944085 count_memcg_event_mm+0x1a6/0x270 count_memcg_event_mm+0x119/0x270 __do_softirq+0x34c/0x57c irq_exit+0xa2/0xc0 read to 0xffff9f427ffff2dc of 4 bytes by task 7472 on cpu 38: wakeup_kswapd+0xc8/0x400 wake_all_kswapds+0x59/0xc0 __alloc_pages_slowpath+0xdcc/0x1290 __alloc_pages_nodemask+0x3bb/0x450 alloc_pages_vma+0x8a/0x2c0 do_anonymous_page+0x16e/0x6f0 __handle_mm_fault+0xcd5/0xd40 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x6f9 page_fault+0x34/0x40 1 lock held by mtest01/7472: #0: ffff9f425a9ac148 (&mm->mmap_sem#2){++++}, at: do_page_fault+0x143/0x6f9 irq event stamp: 6793561 count_memcg_event_mm+0x1a6/0x270 count_memcg_event_mm+0x119/0x270 __do_softirq+0x34c/0x57c irq_exit+0xa2/0xc0 BUG: KCSAN: data-race in kswapd / wakeup_kswapd write to 0xffff90973ffff2dc of 4 bytes by task 820 on cpu 6: kswapd+0x27c/0x8d0 kthread+0x1e0/0x200 ret_from_fork+0x27/0x50 read to 0xffff90973ffff2dc of 4 bytes by task 6299 on cpu 0: wakeup_kswapd+0xf3/0x450 wake_all_kswapds+0x59/0xc0 __alloc_pages_slowpath+0xdcc/0x1290 __alloc_pages_nodemask+0x3bb/0x450 alloc_pages_vma+0x8a/0x2c0 do_anonymous_page+0x170/0x700 __handle_mm_fault+0xc9f/0xd00 handle_mm_fault+0xfc/0x2f0 do_page_fault+0x263/0x6f9 page_fault+0x34/0x40 Signed-off-by: Qian Cai Signed-off-by: Andrew Morton Reviewed-by: Andrew Morton Cc: Marco Elver Cc: Matthew Wilcox Link: http://lkml.kernel.org/r/1582749472-5171-1-git-send-email-cai@lca.pw Signed-off-by: Linus Torvalds Signed-off-by: Sasha Levin --- mm/vmscan.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index c6962aa5ddb40..5ee6fbdec8a8d 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -2950,8 +2950,9 @@ static bool allow_direct_reclaim(pg_data_t *pgdat) /* kswapd must be awake if processes are being throttled */ if (!wmark_ok && waitqueue_active(&pgdat->kswapd_wait)) { - pgdat->kswapd_classzone_idx = min(pgdat->kswapd_classzone_idx, - (enum zone_type)ZONE_NORMAL); + if (READ_ONCE(pgdat->kswapd_classzone_idx) > ZONE_NORMAL) + WRITE_ONCE(pgdat->kswapd_classzone_idx, ZONE_NORMAL); + wake_up_interruptible(&pgdat->kswapd_wait); } @@ -3451,9 +3452,9 @@ out: static enum zone_type kswapd_classzone_idx(pg_data_t *pgdat, enum zone_type prev_classzone_idx) { - if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES) - return prev_classzone_idx; - return pgdat->kswapd_classzone_idx; + enum zone_type curr_idx = READ_ONCE(pgdat->kswapd_classzone_idx); + + return curr_idx == MAX_NR_ZONES ? prev_classzone_idx : curr_idx; } static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order, @@ -3497,8 +3498,11 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o * the previous request that slept prematurely. */ if (remaining) { - pgdat->kswapd_classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx); - pgdat->kswapd_order = max(pgdat->kswapd_order, reclaim_order); + WRITE_ONCE(pgdat->kswapd_classzone_idx, + kswapd_classzone_idx(pgdat, classzone_idx)); + + if (READ_ONCE(pgdat->kswapd_order) < reclaim_order) + WRITE_ONCE(pgdat->kswapd_order, reclaim_order); } finish_wait(&pgdat->kswapd_wait, &wait); @@ -3580,12 +3584,12 @@ static int kswapd(void *p) tsk->flags |= PF_MEMALLOC | PF_SWAPWRITE | PF_KSWAPD; set_freezable(); - pgdat->kswapd_order = 0; - pgdat->kswapd_classzone_idx = MAX_NR_ZONES; + WRITE_ONCE(pgdat->kswapd_order, 0); + WRITE_ONCE(pgdat->kswapd_classzone_idx, MAX_NR_ZONES); for ( ; ; ) { bool ret; - alloc_order = reclaim_order = pgdat->kswapd_order; + alloc_order = reclaim_order = READ_ONCE(pgdat->kswapd_order); classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx); kswapd_try_sleep: @@ -3593,10 +3597,10 @@ kswapd_try_sleep: classzone_idx); /* Read the new order and classzone_idx */ - alloc_order = reclaim_order = pgdat->kswapd_order; + alloc_order = reclaim_order = READ_ONCE(pgdat->kswapd_order); classzone_idx = kswapd_classzone_idx(pgdat, classzone_idx); - pgdat->kswapd_order = 0; - pgdat->kswapd_classzone_idx = MAX_NR_ZONES; + WRITE_ONCE(pgdat->kswapd_order, 0); + WRITE_ONCE(pgdat->kswapd_classzone_idx, MAX_NR_ZONES); ret = try_to_freeze(); if (kthread_should_stop()) @@ -3638,20 +3642,23 @@ kswapd_try_sleep: void wakeup_kswapd(struct zone *zone, int order, enum zone_type classzone_idx) { pg_data_t *pgdat; + enum zone_type curr_idx; if (!managed_zone(zone)) return; if (!cpuset_zone_allowed(zone, GFP_KERNEL | __GFP_HARDWALL)) return; + pgdat = zone->zone_pgdat; + curr_idx = READ_ONCE(pgdat->kswapd_classzone_idx); + + if (curr_idx == MAX_NR_ZONES || curr_idx < classzone_idx) + WRITE_ONCE(pgdat->kswapd_classzone_idx, classzone_idx); + + if (READ_ONCE(pgdat->kswapd_order) < order) + WRITE_ONCE(pgdat->kswapd_order, order); - if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES) - pgdat->kswapd_classzone_idx = classzone_idx; - else - pgdat->kswapd_classzone_idx = max(pgdat->kswapd_classzone_idx, - classzone_idx); - pgdat->kswapd_order = max(pgdat->kswapd_order, order); if (!waitqueue_active(&pgdat->kswapd_wait)) return; -- 2.25.1