From: Doug Anderson
Date: Tue, 30 Aug 2022 09:18:09 -0700
Subject: Re: [PATCH v5] locking/rwsem: Make handoff bit handling more consistent
To: Waiman Long, Hillf Danton
Cc: Peter Zijlstra, Will Deacon, Davidlohr Bueso, MM, LKML
References: <20211116012912.723980-1-longman@redhat.com>
 <20220719104104.1634-1-hdanton@sina.com>
 <20220722115510.2101-1-hdanton@sina.com>
 <20220723001713.2156-1-hdanton@sina.com>
 <2fcf84e6-168b-4ee7-bc9e-5b1c3c9a3d4e@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Fri, Aug 5, 2022 at 12:16 PM Doug Anderson wrote:
>
> Hi,
>
> On Fri, Aug 5, 2022 at 12:02 PM Waiman Long wrote:
> >
> >
> > On 8/5/22 13:14, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Fri, Jul 22, 2022 at 5:17 PM Hillf Danton wrote:
> > >> On Fri, 22 Jul 2022 07:02:42 -0700 Doug Anderson wrote:
> > >>> Thanks! I added this diff to your previous diff and my simple test
> > >>> still passes and I don't see your WARN_ON triggered.
> > >> Thanks!
> > >>> How do we move forward? Are you going to officially submit a patch
> > >>> with both of your diffs squashed together? Are we waiting for
> > >>> additional review from someone?
> > >> Given it is not unusual for us to miss anything important, lets take
> > >> a RWSEM_WAIT_TIMEOUT nap now or two.
> > > It appears that another fix has landed in the meantime. Commit
> > > 6eebd5fb2083 ("locking/rwsem: Allow slowpath writer to ignore handoff
> > > bit if not set by first waiter").
> > >
> > > ...unfortunately with that patch my test cases still hangs. :(
> >
> > The aim of commit 6eebd5fb2083 ("locking/rwsem: Allow slowpath writer to
> > ignore handoff bit if not set by first waiter") is to restore slowpath
> > writer behavior to be the same as before commit d257cc8cb8d5
> > ("locking/rwsem: Make handoff bit handling more consistent").
>
> Ah, OK. I just saw another fix to the same commit and assumed that
> perhaps it was intended to address the same issue.
>
>
> > If the hang still exists, there may be other cause for it. Could you
> > share more information about what the test case is doing and any kernel
> > splat that you have?
>
> It's all described in my earlier reply including my full test case:
>
> https://lore.kernel.org/r/CAD=FV=URCo5xv3k3jWbxV1uRkUU5k6bcnuB1puZhxayEyVc6-A@mail.gmail.com
>
> Previously I tested Hillf's patches and they fixed it for me.

Hillf: do you have any plans for your patches? I spent some time
re-testing this today atop mainline, specifically atop commit
dcf8e5633e2e ("tracing: Define the is_signed_type() macro once"). Some
notes:

1. I can confirm that my test case still reproduces a hang on
mainline, though it seems a bit harder to reproduce (sometimes I have
to run for a few minutes). I didn't spend lots of time confirming that
the hang is exactly the same, but the same test case reproduces it so
it probably is. If it's important I can drop into kgdb and dig around
to confirm.

2. Blindly applying either just the first of your proposed patches
(after resolving the trivial merge conflict) or both of them no longer
fixes the hang on mainline.

3. Reverting Waiman's commit 6eebd5fb2083 ("locking/rwsem: Allow
slowpath writer to ignore handoff bit if not set by first waiter") and
then applying your two fixes _does_ still fix the hang on mainline. I
ran for 20 minutes w/ no reproduction.

So it seems like Waiman's recent commit interacts with your fix in a
bad way. :(

-Doug