Date: Tue, 12 Apr 2022 13:04:05 -0400
Subject: Re: [PATCH v5] locking/rwsem: Make handoff bit handling more consistent
To: john.p.donnelly@oracle.com, chenguanyou, gregkh@linuxfoundation.org
Cc: dave@stgolabs.net, hdanton@sina.com, linux-kernel@vger.kernel.org, mazhenhua@xiaomi.com, mingo@redhat.com, peterz@infradead.org, quic_aiquny@quicinc.com, will@kernel.org, sashal@kernel.org, stable@vger.kernel.org
From: Waiman Long
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/12/22 12:28, john.p.donnelly@oracle.com wrote:
> On 4/11/22 4:07 PM, Waiman Long wrote:
>>
>> On 4/11/22 17:03, john.p.donnelly@oracle.com wrote:
>>>
>>>>>
>>>>> I have reached out to Waiman and he suggested this for our next
>>>>> test pass:
>>>>>
>>>>> 1ee326196c6658 locking/rwsem: Always try to wake waiters in
>>>>> out_nolock path
>>>>
>>>> Does this commit help to avoid the lockup problem?
>>>>
>>>> Commit 1ee326196c6658 fixes a potential missed wakeup problem when
>>>> a reader first in the wait queue is interrupted out without
>>>> acquiring the lock. It is actually not a fix for commit
>>>> d257cc8cb8d5. However, this commit changes the out_nolock path
>>>> behavior of writers by leaving the handoff bit set when the wait
>>>> queue isn't empty. That likely makes the missed wakeup problem
>>>> easier to reproduce.
>>>>
>>>> Cheers,
>>>> Longman
>>>>
>>>
>>> Hi,
>>>
>>> We are testing now
>>>
>>> ETA for fio soak test completion is ~15hr from now.
>>>
>>> I wanted to share the stack traces for future reference + occurrences.
>>>
>> I am looking forward to your testing results tomorrow.
>>
>> Cheers,
>> Longman
>>
> Hi
>
> Our 24hr fio soak test with :
>
> 1ee326196c6658 locking/rwsem: Always try to wake waiters in
> out_nolock path
>
> applied to 5.15.30 passed.
>
> I suggest you append 1ee326196c6658 with :
>
> cc: stable
>
> Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more
> consistent")
>
> I'll leave the implementation details up to the core maintainers how
> to do that ;-)

Thanks for the test. The patch has already been in the tip tree.
It may not be easy to add a Fixes tag to it. Anyway, I will encourage the stable tree maintainers to take it, as it does fix a problem, as shown in your test.

Cheers,
Longman