From: Miklos Szeredi
Date: Tue, 13 Apr 2021 10:57:38 +0200
Subject: Re: [PATCH v2 1/2] fuse: Fix possible deadlock when writing back dirty pages
To: Baolin Wang
Cc: Peng Tao, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org

On Mon, Apr 12, 2021 at 3:23 PM Baolin Wang wrote:
>
> Hi Miklos,
>
> On 2021/3/27 14:36, Baolin Wang wrote:
> > We can hit the deadlock scenario below when writing back dirty pages
> > and writing to the file at the same time. The deadlock can be
> > reproduced as follows:
> >
> > - A writeback worker thread A is trying to write a bunch of dirty pages
> > via fuse_writepages(). fuse_writepages() locks one page (named page 1),
> > adds it to the rb_tree with the writeback flag set, unlocks page 1,
> > and then tries to lock the next page (named page 2).
> >
> > - At the same time, another process B triggers a file write of
> > several pages via fuse_perform_write(). fuse_perform_write() first
> > locks all required pages, then waits for writeback of all those pages
> > to complete via fuse_wait_on_page_writeback().
> >
> > - Now process B may already hold the locks on page 1 and page 2 and be
> > waiting for the writeback of page 1 to complete (page 1 was put under
> > writeback by process A). But process A cannot complete the writeback
> > of page 1, since it is still waiting to lock page 2, which process B
> > has already locked.
> >
> > A deadlock occurs.
> >
> > To fix this issue, we should make sure each page's writeback has
> > completed right after locking that page in fuse_fill_write_pages(),
> > one page at a time, and then write them together once all pages are
> > stable.
> >
> > [1450578.772896] INFO: task kworker/u259:6:119885 blocked for more than 120 seconds.
> > [1450578.796179] kworker/u259:6 D 0 119885 2 0x00000028
> > [1450578.796185] Workqueue: writeback wb_workfn (flush-0:78)
> > [1450578.796188] Call trace:
> > [1450578.798804] __switch_to+0xd8/0x148
> > [1450578.802458] __schedule+0x280/0x6a0
> > [1450578.806112] schedule+0x34/0xe8
> > [1450578.809413] io_schedule+0x20/0x40
> > [1450578.812977] __lock_page+0x164/0x278
> > [1450578.816718] write_cache_pages+0x2b0/0x4a8
> > [1450578.820986] fuse_writepages+0x84/0x100 [fuse]
> > [1450578.825592] do_writepages+0x58/0x108
> > [1450578.829412] __writeback_single_inode+0x48/0x448
> > [1450578.834217] writeback_sb_inodes+0x220/0x520
> > [1450578.838647] __writeback_inodes_wb+0x50/0xe8
> > [1450578.843080] wb_writeback+0x294/0x3b8
> > [1450578.846906] wb_do_writeback+0x2ec/0x388
> > [1450578.850992] wb_workfn+0x80/0x1e0
> > [1450578.854472] process_one_work+0x1bc/0x3f0
> > [1450578.858645] worker_thread+0x164/0x468
> > [1450578.862559] kthread+0x108/0x138
> > [1450578.865960] INFO: task doio:207752 blocked for more than 120 seconds.
> > [1450578.888321] doio D 0 207752 207740 0x00000000
> > [1450578.888329] Call trace:
> > [1450578.890945] __switch_to+0xd8/0x148
> > [1450578.894599] __schedule+0x280/0x6a0
> > [1450578.898255] schedule+0x34/0xe8
> > [1450578.901568] fuse_wait_on_page_writeback+0x8c/0xc8 [fuse]
> > [1450578.907128] fuse_perform_write+0x240/0x4e0 [fuse]
> > [1450578.912082] fuse_file_write_iter+0x1dc/0x290 [fuse]
> > [1450578.917207] do_iter_readv_writev+0x110/0x188
> > [1450578.921724] do_iter_write+0x90/0x1c8
> > [1450578.925598] vfs_writev+0x84/0xf8
> > [1450578.929071] do_writev+0x70/0x110
> > [1450578.932552] __arm64_sys_writev+0x24/0x30
> > [1450578.936727] el0_svc_common.constprop.0+0x80/0x1f8
> > [1450578.941694] el0_svc_handler+0x30/0x80
> > [1450578.945606] el0_svc+0x10/0x14
> >
> > Suggested-by: Peng Tao
> > Signed-off-by: Baolin Wang
>
> Do you have any comments for this patch set? Thanks.

Hi,

I guess this is related:

https://lore.kernel.org/linux-fsdevel/20210209100115.GB1208880@miu.piliscsaba.redhat.com/

Can you verify that the patch at the above link fixes your issue?

Thanks,
Miklos
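
The lock ordering problem in the quoted patch description can be checked with a small userspace model. The Python sketch below is not kernel code and does not call any FUSE API. It reduces the two tasks to sequences of abstract operations on two pages and searches every interleaving for a state where neither task can make progress. The operation names (lock, set_wb, wait_wb, send) and the three programs are hypothetical simplifications. In particular, the model assumes that writeback on a page only completes once the worker has walked all pages and sent its batch, which is how the description explains why page 1 stays under writeback.

```python
# Abstract programs. Each op is (kind, page) or ("send",).
# Worker A: lock a page, mark it under writeback, unlock, move to the next
# page, and only send the batch (completing writeback) at the end.
WRITEBACK_WORKER = (
    ("lock", 1), ("set_wb", 1), ("unlock", 1),
    ("lock", 2), ("set_wb", 2), ("unlock", 2),
    ("send",),
)
# Writer B as described in the report: lock every page, then wait.
WRITER_LOCK_ALL_FIRST = (
    ("lock", 1), ("lock", 2), ("wait_wb", 1), ("wait_wb", 2),
    ("unlock", 1), ("unlock", 2),
)
# Writer B as proposed in the patch description: wait right after each lock.
WRITER_WAIT_PER_PAGE = (
    ("lock", 1), ("wait_wb", 1), ("lock", 2), ("wait_wb", 2),
    ("unlock", 1), ("unlock", 2),
)


def step(op, locked, writeback):
    """Apply op; return the new (locked, writeback), or None if op blocks."""
    kind = op[0]
    if kind == "lock":
        return None if op[1] in locked else (locked | {op[1]}, writeback)
    if kind == "unlock":
        return (locked - {op[1]}, writeback)
    if kind == "set_wb":
        return (locked, writeback | {op[1]})
    if kind == "wait_wb":
        return None if op[1] in writeback else (locked, writeback)
    if kind == "send":
        # Assumption of this model: sending the batch completes writeback
        # for every page queued so far.
        return (locked, frozenset())
    raise ValueError(f"unknown op: {kind}")


def can_deadlock(prog_a, prog_b):
    """Explore all interleavings; True if some state has no runnable task."""
    seen = set()
    stack = [(0, 0, frozenset(), frozenset())]
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        pc_a, pc_b, locked, writeback = state
        done_a, done_b = pc_a == len(prog_a), pc_b == len(prog_b)
        if done_a and done_b:
            continue
        successors = []
        if not done_a:
            nxt = step(prog_a[pc_a], locked, writeback)
            if nxt is not None:
                successors.append((pc_a + 1, pc_b, *nxt))
        if not done_b:
            nxt = step(prog_b[pc_b], locked, writeback)
            if nxt is not None:
                successors.append((pc_a, pc_b + 1, *nxt))
        if not successors:
            return True  # at least one task is unfinished and none can run
        stack.extend(successors)
    return False


if __name__ == "__main__":
    print("lock all, then wait:", can_deadlock(WRITEBACK_WORKER, WRITER_LOCK_ALL_FIRST))
    print("wait after each lock:", can_deadlock(WRITEBACK_WORKER, WRITER_WAIT_PER_PAGE))
```

Under these assumptions the search finds a stuck state for the lock-all-first writer (B holds both pages and waits on page 1 while A waits for the lock on page 2) and none for the wait-per-page writer. This only illustrates the ordering argument. It says nothing about the patch at the lore link above or about other code paths in fuse.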