Subject: Re: [PATCH 2/5] swap: fix do_swap_page() race with swapoff
From: Miaohe Lin
To: "Huang, Ying"
Cc: Tim Chen, ...
Date: Mon, 12 Apr 2021 11:24:58 +0800
Message-ID: <2268e7f9-cfdf-92f3-9a32-04e103e132fc@huawei.com>
References: <20210408130820.48233-1-linmiaohe@huawei.com>
 <20210408130820.48233-3-linmiaohe@huawei.com>
 <7684b3de-2824-9b1f-f033-d4bc14f9e195@linux.intel.com>
 <50d34b02-c155-bad7-da1f-03807ad31275@huawei.com>
 <995a130b-f07a-4771-1fe3-477d2f3c1e8e@linux.intel.com>
 <87r1jgwa2w.fsf@yhuang6-desk1.ccr.corp.intel.com>
In-Reply-To: <87r1jgwa2w.fsf@yhuang6-desk1.ccr.corp.intel.com>

On 2021/4/12 9:44, Huang, Ying wrote:
> Miaohe Lin writes:
>
>> On 2021/4/10 1:17, Tim Chen wrote:
>>>
>>>
>>> On 4/9/21 1:42 AM, Miaohe Lin wrote:
>>>> On 2021/4/9 5:34, Tim Chen wrote:
>>>>>
>>>>>
>>>>> On 4/8/21 6:08 AM, Miaohe Lin wrote:
>>>>>> When I was investigating the swap code, I found the below possible race
>>>>>> window:
>>>>>>
>>>>>> CPU 1                                  CPU 2
>>>>>> -----                                  -----
>>>>>> do_swap_page
>>>>>>   synchronous swap_readpage
>>>>>>     alloc_page_vma
>>>>>>                                        swapoff
>>>>>>                                          release swap_file, bdev, or ...
>>>>>
>>>>
>>>> Many thanks for the quick review and reply!
>>>>
>>>>> Perhaps I'm missing something. The release of swap_file, bdev etc.
>>>>> happens after we have cleared the SWP_VALID bit in si->flags in
>>>>> destroy_swap_extents(), if I read the swapoff code correctly.
>>>>
>>>> Agree. Let's look at this more closely:
>>>>
>>>> CPU1                                     CPU2
>>>> -----                                    -----
>>>> swap_readpage
>>>>   if (data_race(sis->flags & SWP_FS_OPS)) {
>>>>                                          swapoff
>>>>                                            p->swap_file = NULL;
>>>>   struct file *swap_file = sis->swap_file;
>>>>   struct address_space *mapping = swap_file->f_mapping; [oops!]
>>>>                                            ...
>>>>                                            p->flags = 0;
>>>>   ...
>>>>
>>>> Does this make sense to you?
>>>
>>> p->swap_file = NULL happens after the
>>> p->flags &= ~SWP_VALID, synchronize_rcu(), destroy_swap_extents()
>>> sequence in swapoff().
>>>
>>> So I don't think the sequence you illustrated on CPU2 is in the right order.
>>> That said, without get_swap_device()/put_swap_device() in swap_readpage(),
>>> you could potentially blow past synchronize_rcu() on CPU2 and cause a
>>> problem. So I think the problematic race looks something like the following:
>>>
>>> CPU1                                      CPU2
>>> -----                                     -----
>>> swap_readpage
>>>   if (data_race(sis->flags & SWP_FS_OPS)) {
>>>                                           swapoff
>>>                                             p->flags &= ~SWP_VALID;
>>>                                             ..
>>>                                             synchronize_rcu();
>>>                                             ..
>>>                                             p->swap_file = NULL;
>>>   struct file *swap_file = sis->swap_file;
>>>   struct address_space *mapping = swap_file->f_mapping; [oops!]
>>>   ...
>>>                                             ...
>>>
>>
>> Agree. This is also what I meant to illustrate, and you provided a better
>> one. Many thanks!
>
> For the pages that are swapped in through the swap cache, that isn't an
> issue: because the page is locked, the swap entry will be marked with
> SWAP_HAS_CACHE, so swapoff() cannot proceed until the page has been
> unlocked.
>
> So the race is for the fast path, as follows:
>
>     if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
>         __swap_count(entry) == 1)
>
> I found it in your original patch description, but please make it more
> explicit to reduce the potential confusion.

Sure. Should I rephrase the commit log to clarify this, or add a comment in the code? Thanks.

>
> Best Regards,
> Huang, Ying
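
For readers following along, the guard pattern under discussion boils down to pinning the swap device around the synchronous fast path. Below is a minimal sketch of that pattern, assuming the kernel's existing get_swap_device()/put_swap_device() helpers; the wrapper function name is hypothetical and the page setup is simplified compared to the real do_swap_page(), so treat it as an illustration rather than the actual hunk from the patch:

	/*
	 * Minimal sketch of the guard pattern discussed above.  The wrapper
	 * name and the simplified page setup are illustrative only; the real
	 * fix lives inside do_swap_page() itself.
	 */
	#include <linux/swap.h>
	#include <linux/swapops.h>
	#include <linux/mm.h>
	#include <linux/gfp.h>

	static struct page *swapin_fastpath_sketch(struct vm_fault *vmf,
						   swp_entry_t entry)
	{
		struct swap_info_struct *si;
		struct page *page = NULL;

		/*
		 * Pin the swap device.  Once this returns non-NULL, swapoff()
		 * cannot release swap_file/bdev until put_swap_device() runs,
		 * which closes the race shown in the diagrams above.
		 */
		si = get_swap_device(entry);
		if (!si)
			return NULL;	/* device already swapped off: fall back */

		/* The fast path only applies to synchronous, non-shared entries. */
		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
		    __swap_count(entry) == 1) {
			page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vmf->vma,
					      vmf->address);
			if (page) {
				__SetPageLocked(page);
				__SetPageSwapBacked(page);
				set_page_private(page, entry.val);
				/* Safe: the device cannot vanish while pinned. */
				swap_readpage(page, true);
			}
		}

		put_swap_device(si);	/* drop the reference taken above */
		return page;
	}

The key point is that a NULL return from get_swap_device() means the entry's device has already been (or is being) swapped off, so the caller must fall back to the regular path instead of touching si->swap_file or si->bdev.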