Date: Wed, 16 Sep 2020 12:34:46 +0200
From: Jan Kara
To: Linus Torvalds
Cc: Matthieu Baerts, Michael Larabel, Matthew Wilcox, Amir Goldstein,
	Ted Ts'o, Andreas Dilger, Ext4 Developers List, Jan Kara,
	linux-fsdevel
Subject: Re: Kernel Benchmarking
Message-ID: <20200916103446.GB3607@quack2.suse.cz>
References: <9550725a-2d3f-fa35-1410-cae912e128b9@tessares.net>
	<37989469-f88c-199b-d779-ed41bc65fe56@tessares.net>
X-Mailing-List: linux-ext4@vger.kernel.org

On Tue 15-09-20 16:35:45, Linus Torvalds wrote:
> On Tue, Sep 15, 2020 at 12:56 PM Matthieu Baerts
> wrote:
> >
> > I am sorry, I am not sure how to verify this.
> > I guess it was one processor because I removed "-smp 2" option from
> > qemu. So I guess it switched to a uniprocessor mode.
>
> Ok, that all sounds fine. So yes, your problem happens even with just
> one CPU, and it's not any subtle SMP race.
>
> Which is all good - apart from the bug existing in the first place, of
> course. It just reinforces the "it's probably a latent deadlock"
> thing.

So from the traces another theory that occurred to me is that it could be
a "missed wakeup" problem. Looking at the code in
wait_on_page_bit_common() I found one suspicious thing (which isn't a
great match because the problem seems to happen on UP as well, and I think
it's mostly a theoretical issue, but I'll still write it here):

wait_on_page_bit_common() has:

	spin_lock_irq(&q->lock);
	SetPageWaiters(page);
	if (!trylock_page_bit_common(page, bit_nr, wait))
	  - which expands to:
	  (
		if (wait->flags & WQ_FLAG_EXCLUSIVE) {
			if (test_and_set_bit(bit_nr, &page->flags))
				return false;
		} else if (test_bit(bit_nr, &page->flags))
			return false;
	  )

		__add_wait_queue_entry_tail(q, wait);
	spin_unlock_irq(&q->lock);

Now the suspicious thing is the ordering here. What prevents the compiler
(or the CPU for that matter) from reordering the SetPageWaiters() call
after the __add_wait_queue_entry_tail() call? I know SetPageWaiters() and
test_and_set_bit() operate on the same long, but is it really guaranteed
that nothing reorders these?

In unlock_page() we have:

	if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
		wake_up_page_bit(page, PG_locked);

So if the reordering happens, clear_bit_unlock_is_negative_byte() could
return false even though we have a waiter queued.

And this seems to be something that commit 2a9127fcf22 ("mm: rewrite
wait_on_page_bit_common() logic") introduced, because before it we had
set_current_state() between SetPageWaiters() and test_bit(), which implies
a memory barrier.

								Honza
-- 
Jan Kara
SUSE Labs, CR
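
To make the suspected interleaving concrete, here is a minimal user-space sketch. It is not kernel code: struct model_page, the helper names, and the bit positions are all hypothetical stand-ins, and the "reordering" is replayed by hand in a single thread rather than produced by a compiler or CPU. It only shows why an unlock that samples the waiters bit before that bit becomes visible would leave a queued waiter without a wakeup.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit positions; only "two bits in one word" matters here. */
enum { PG_LOCKED = 1u << 0, PG_WAITERS = 1u << 7 };

/* Hypothetical model of the state involved, not the kernel's struct page. */
struct model_page {
	unsigned long flags;	/* models page->flags */
	int queued;		/* models entries on the wait queue */
	int wakeups;		/* models wake_up_page_bit() calls */
};

/* Waiter-side steps, split up so the caller can pick their order. */
static void set_waiters(struct model_page *p) { p->flags |= PG_WAITERS; }
static bool lock_is_held(const struct model_page *p) { return p->flags & PG_LOCKED; }
static void enqueue(struct model_page *p) { p->queued++; }

/* Models unlock_page(): drop the lock bit, wake only if waiters bit is seen. */
static void model_unlock(struct model_page *p)
{
	assert(p->flags & PG_LOCKED);
	p->flags &= ~(unsigned long)PG_LOCKED;
	if (p->flags & PG_WAITERS) {
		p->wakeups++;
		p->queued = 0;
	}
}

/* Returns true if a waiter stays queued on an unlocked page. */
bool missed_wakeup(bool reorder)
{
	struct model_page p = { .flags = PG_LOCKED };

	if (!reorder) {
		/* Program order: waiters bit, then test, then enqueue. */
		set_waiters(&p);
		if (lock_is_held(&p))
			enqueue(&p);
		model_unlock(&p);
	} else {
		/* Waiters bit becomes visible only after the enqueue ... */
		if (lock_is_held(&p))
			enqueue(&p);
		model_unlock(&p);	/* ... and the unlock runs in between. */
		set_waiters(&p);
	}
	return p.queued > 0 && !(p.flags & PG_LOCKED);
}
```

Calling missed_wakeup(false) replays the program order and the waiter is woken; missed_wakeup(true) replays the hypothetical reordering and the waiter is left on the queue. Whether real hardware or a compiler may actually produce the second order is exactly the open question above; the model does not answer it.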