Date: Tue, 2 Aug 2022 18:50:02 +0200
From: Willy Tarreau
To: Dipanjan Das
Cc: Lukas Bulwahn, Denis Efremov, Jens Axboe, linux-block@vger.kernel.org, Linux Kernel Mailing List, syzkaller, fleischermarius@googlemail.com, its.priyanka.bose@gmail.com
Subject: Re: INFO: task hung in __floppy_read_block_0
Message-ID: <20220802165002.GA21797@1wt.eu>
References: <20220731095307.GA12211@1wt.eu>

Hi,

On Mon, Aug 01, 2022 at 10:04:46PM -0700, Dipanjan Das wrote:
> On Sun, Jul 31, 2022 at 2:53 AM Willy Tarreau wrote:
> >
> > Thus I'm a bit confused about what to look for. It's very likely that
> > there are still bugs left in this driver, but trying to identify them
> > and to validate a fix will be difficult if they cannot be reproduced.
> > Maybe they only happen under emulation due to timing issues.
> >
> > As such, any hint about the exact setup and how long to wait to get
> > the error would be much appreciated.
>
> We can confirm that we were able to trigger the issue on the latest
> 5.19 (commit: 3d7cb6b04c3f3115719235cc6866b10326de34cd) with the
> C-repro within a VM. We use this:
> https://syzkaller.appspot.com/text?tag=KernelConfig&x=cd73026ceaed1402
> config to build the kernel. The issue triggers after around 143
> seconds. For all the five times we tried, we were able to reproduce
> the issue deterministically every time. Please let us know if you need
> any other information.

Yep, I could reproduce it under qemu as well. I've added traces, and
ugly things are happening with the lock (but I haven't understood what
yet). What I saw was that process_fd_request() is first called under
lock, then we drop the lock, then __floppy_read_block_0() is called
under lock, which calls process_fd_request(), then the lock is dropped,
wait_for_completion() is called, then process_fd_request() is called
again, without the lock this time, and from there we're looping in
fd_wait_for_completion.

I need to dig into more details, but it doesn't seem right to me that
process_fd_request() is sometimes called under the lock and sometimes
without it, nor that __floppy_read_block_0() is called with the lock
held and the lock is then released underneath it. I could have missed
certain things due to the concurrent accesses, but in any case I should
probably not be observing this. I'll try to dig deeper.
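
To make the sequence above concrete, here is a minimal userspace sketch
of the kind of trace that exposes it. This is not the driver code and
not the traces that were actually added: demo_lock,
demo_process_request() and replay_observed_sequence() are hypothetical
stand-ins invented for illustration, and the lock is a plain flag rather
than a real kernel lock. All it does is count how often a function is
entered with and without the lock, which is the inconsistency described.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver lock: a flag, not a real lock. */
struct traced_lock {
    bool held;
};

/* How often the traced function was entered with/without the lock. */
struct call_stats {
    unsigned int locked_calls;
    unsigned int unlocked_calls;
};

static struct traced_lock demo_lock;
static struct call_stats request_stats;

void demo_lock_acquire(struct traced_lock *l)
{
    assert(!l->held);   /* catch a double acquire in this sketch */
    l->held = true;
}

void demo_lock_release(struct traced_lock *l)
{
    assert(l->held);    /* catch a release without a prior acquire */
    l->held = false;
}

/* Stand-in for a function expected to always run under the lock. */
void demo_process_request(void)
{
    if (demo_lock.held)
        request_stats.locked_calls++;
    else
        request_stats.unlocked_calls++;
}

/* True if the function was seen both with and without the lock. */
bool inconsistent_locking(const struct call_stats *s)
{
    return s->locked_calls > 0 && s->unlocked_calls > 0;
}

/* Replays the order of events described in the message (assumed order). */
void replay_observed_sequence(void)
{
    demo_lock_acquire(&demo_lock);
    demo_process_request();         /* first call, under lock */
    demo_lock_release(&demo_lock);

    demo_lock_acquire(&demo_lock);  /* read-block-0 path entered under lock */
    demo_process_request();         /* called from that path, under lock */
    demo_lock_release(&demo_lock);  /* lock dropped before waiting */

    demo_process_request();         /* called again, this time without lock */
}
```

Calling replay_observed_sequence() and then checking
inconsistent_locking(&request_stats) flags the mixed usage. In the
kernel itself, lockdep_assert_held() serves a similar purpose, but only
for locks that lockdep tracks.
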
I really don't know that area and I must confess it's not the most
exciting to rediscover each time :-)

Thanks,
Willy