From: songyuanzheng
To: Dennis Zhou, Christoph Lameter
Cc: tj@kernel.org, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: RE: [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages
Date: Wed, 27 Oct 2021 07:12:06 +0000
Message-ID: <4be3bce19c1d44c4a04bb411dfa26182@huawei.com>
References: <20211025070015.553813-1-songyuanzheng@huawei.com>

Hello,

Thanks for the advice, Dennis Zhou and Christoph Lameter. I really appreciate it.

I have revised the patch to change pcpu_nr_empty_pop_pages to an atomic_t variable. Here is the v2 patch:
https://patchwork.kernel.org/project/linux-mm/patch/20211026084312.2138852-1-songyuanzheng@huawei.com/

Would you mind reviewing it again?

Thanks,
Yuanzheng Song

-----Original Message-----
From: Dennis Zhou [mailto:dennis@kernel.org]
Sent: Tuesday, October 26, 2021 10:42 AM
To: Christoph Lameter
Cc: songyuanzheng; dennis@kernel.org; tj@kernel.org; akpm@linux-foundation.org; linux-mm@kvack.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH -next] mm/percpu: fix data-race with pcpu_nr_empty_pop_pages

Hello,

On Mon, Oct 25, 2021 at 09:50:48AM +0200, Christoph Lameter wrote:
> On Mon, 25 Oct 2021, Yuanzheng Song wrote:
>
> > When reading pcpu_nr_empty_pop_pages in pcpu_alloc() and writing
> > pcpu_nr_empty_pop_pages in pcpu_update_empty_pages() at the same
> > time, a data race occurs.
>
> Looks like a use case for the atomic RMW instructions.
>

Yeah. I see two options: switch the variable over to an atomic, or move the read behind pcpu_lock. All the writes are already behind it; otherwise that would actually be problematic.
In this particular case, reading a wrong # of empty pages isn't a big deal, as eventually the background work will get scheduled.

Thanks,
Dennis

> > To fix this issue, use READ_ONCE() and WRITE_ONCE() to read and
> > write pcpu_nr_empty_pop_pages.
>
> Never thought that READ_ONCE and WRITE_ONCE can fix races like this.
> Really?
>
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index 293009cc03ef..e8ef92e698ab 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -574,7 +574,9 @@ static void pcpu_isolate_chunk(struct pcpu_chunk *chunk)
> >
> >  	if (!chunk->isolated) {
> >  		chunk->isolated = true;
> > -		pcpu_nr_empty_pop_pages -= chunk->nr_empty_pop_pages;
> > +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> > +			   READ_ONCE(pcpu_nr_empty_pop_pages) -
> > +			   chunk->nr_empty_pop_pages);
>
> atomic_sub()?
>
> >  	}
> >  	list_move(&chunk->list, &pcpu_chunk_lists[pcpu_to_depopulate_slot]);
> >  }
> > @@ -585,7 +587,9 @@ static void pcpu_reintegrate_chunk(struct pcpu_chunk *chunk)
> >
> >  	if (chunk->isolated) {
> >  		chunk->isolated = false;
> > -		pcpu_nr_empty_pop_pages += chunk->nr_empty_pop_pages;
> > +		WRITE_ONCE(pcpu_nr_empty_pop_pages,
> > +			   READ_ONCE(pcpu_nr_empty_pop_pages) +
> > +			   chunk->nr_empty_pop_pages);
>
> atomic_add()?
>