Subject: Re: Lockdep warning on io_file_data_ref_zero() with 5.10-rc5
From: Xiaoguang Wang
To: Jens Axboe, Pavel Begunkov, Nadav Amit
Cc: linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org, LKML, Alexander Viro
Date: Tue, 15 Dec 2020 14:58:10 +0800
In-Reply-To: <13baf2c4-a403-41fc-87ca-6f5cb7999692@kernel.dk>
References: <13baf2c4-a403-41fc-87ca-6f5cb7999692@kernel.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

hi,

> On 11/28/20 5:13 PM, Pavel Begunkov wrote:
>> On 28/11/2020 23:59, Nadav Amit wrote:
>>> Hello Pavel,
>>>
>>> I got the following lockdep splat while rebasing my work on 5.10-rc5 on the
>>> kernel (based on 5.10-rc5+).
>>>
>>> I did not actually confirm that the problem is triggered without my changes,
>>> as my iouring workload requires some kernel changes (not iouring changes),
>>> yet IMHO it seems pretty clear that this is a result of your commit
>>> e297822b20e7f ("io_uring: order refnode recycling"), that acquires a lock in
>>> io_file_data_ref_zero() inside a softirq context.
>>
>> Yeah, that's true. It was already reported by syzkaller and fixed by Jens, but
>> queued for 5.11. Thanks for letting know anyway!
>>
>> https://lore.kernel.org/io-uring/948d2d3b-5f36-034d-28e6-7490343a5b59@kernel.dk/T/#t
>>
>>
>> Jens, I think it's for the best to add it for 5.10, at least so that lockdep
>> doesn't complain.
>
> Yeah maybe, though it's "just" a lockdep issue, it can't trigger any
> deadlocks. I'd rather just keep it in 5.11 and ensure it goes to stable.
> This isn't new in this series.

Sorry, I'm not familiar with the lockdep implementation, so I wonder why you
say it can't trigger any deadlocks. Looking at the syzbot report, it seems
that a deadlock may happen.

I also wonder whether the spin lock bh variants are enough. Normal I/Os are
completed in interrupt context:

==> io_complete_rw
====> __io_complete_rw
======> io_complete_rw_common
========> __io_req_complete
==========> io_put_req
============> io_free_req
==============> __io_free_req
================> io_dismantle_req
==================> io_put_file
====================> percpu_ref_put(req->fixed_file_refs);

If we drop the last reference here, io_file_data_ref_zero() will be called,
and we'll then call spin_lock(&data->lock) in interrupt context.

Should we use the spin lock irq variants?

Regards,
Xiaoguang Wang
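
To make the question above concrete, here is a minimal userspace sketch of the general rule behind it. It is only a toy model, not kernel code: the names `enum ctx`, `enum lock_variant` and `can_self_deadlock()` are hypothetical and exist only for this illustration. It assumes the usual single-CPU picture in which a plain spin_lock() masks nothing, the bh variants keep softirqs from running on the local CPU, and the irq variants mask hard interrupts as well. It does not settle whether io_file_data_ref_zero() can really be reached from hard interrupt context; that is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>

/* Execution contexts on one CPU; a higher value can interrupt a lower one. */
enum ctx { CTX_PROCESS = 0, CTX_SOFTIRQ = 1, CTX_HARDIRQ = 2 };

/* Hypothetical labels for how the current holder took the lock. */
enum lock_variant {
	LOCK_PLAIN, /* models spin_lock() */
	LOCK_BH,    /* models spin_lock_bh() */
	LOCK_IRQ    /* models spin_lock_irq() / spin_lock_irqsave() */
};

/*
 * Returns true if, on the same CPU, code running in 'intruder' context
 * could interrupt a holder running in 'holder' context and then spin
 * forever on the lock that the holder still owns.
 */
static bool can_self_deadlock(enum ctx holder, enum lock_variant variant,
			      enum ctx intruder)
{
	assert(holder <= CTX_HARDIRQ && intruder <= CTX_HARDIRQ);

	/* Code at the same or a lower level cannot interrupt the holder. */
	if (intruder <= holder)
		return false;

	switch (variant) {
	case LOCK_PLAIN:
		/* Nothing is masked, so any higher context can cut in. */
		return true;
	case LOCK_BH:
		/* Softirqs are held off, hard interrupts are not. */
		return intruder == CTX_HARDIRQ;
	case LOCK_IRQ:
		/* Hard interrupts are masked on this CPU. */
		return false;
	}
	return false;
}
```

In this model, can_self_deadlock(CTX_PROCESS, LOCK_BH, CTX_SOFTIRQ) is false, which matches the bh-based fix for the softirq case, while can_self_deadlock(CTX_PROCESS, LOCK_BH, CTX_HARDIRQ) is true. That second case is the scenario the question is about, and it only matters if the lock can actually be taken from hard interrupt context.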