Subject: Re: [f2fs-dev] [f2fs bug] infinite loop in f2fs_get_meta_page_nofail()
CC: Eric Biggers
References: <000000000000432c5405b1113296@google.com>
 <20201007213253.GD1530638@gmail.com>
 <20201007215305.GA714500@google.com>
 <20201009015015.GA1931838@google.com>
 <8fa4f9fe-5ca5-f3a3-c8f4-e800373c1e46@huawei.com>
 <20201009043237.GB1973455@google.com>
 <20201009145626.GA2186792@google.com>
From: Chao Yu
Message-ID: <70faa161-bcd7-64a3-4a6c-04963c0784b6@huawei.com>
Date: Tue, 13 Oct 2020 10:30:36 +0800
In-Reply-To: <20201009145626.GA2186792@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Jaegeuk,

I guess you missed sending last applied patch to mailing list?

Thanks,

On 2020/10/9 22:56, jaegeuk@kernel.org wrote:
> On 10/09, Chao Yu wrote:
>> On 2020/10/9 12:32, jaegeuk@kernel.org wrote:
>>> On 10/09, Chao Yu wrote:
>>>> On 2020/10/9 9:50, jaegeuk@kernel.org wrote:
>>>>> On 10/09, Chao Yu wrote:
>>>>>> On 2020/10/8 5:53, jaegeuk@kernel.org wrote:
>>>>>>> On 10/07, Eric Biggers wrote:
>>>>>>>> [moved linux-fsdevel to Bcc]
>>>>>>>>
>>>>>>>> On Wed, Oct 07, 2020 at 02:18:19AM -0700, syzbot wrote:
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> syzbot found the following issue on:
>>>>>>>>>
>>>>>>>>> HEAD commit:    a804ab08 Add linux-next specific files for 20201006
>>>>>>>>> git tree:       linux-next
>>>>>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=17fe30bf900000
>>>>>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=26c1b4cc4a62ccb
>>>>>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=ee250ac8137be41d7b13
>>>>>>>>> compiler:       gcc (GCC) 10.1.0-syz 20200507
>>>>>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1336413b900000
>>>>>>>>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=12f7392b900000
>>>>>>>>>
>>>>>>>>> The issue was bisected to:
>>>>>>>>>
>>>>>>>>> commit eede846af512572b1f30b34f9889d7df64c017d4
>>>>>>>>> Author: Jaegeuk Kim
>>>>>>>>> Date:   Fri Oct 2 21:17:35 2020 +0000
>>>>>>>>>
>>>>>>>>>     f2fs: f2fs_get_meta_page_nofail should not be failed
>>>>>>>>>
>>>>>>>>
>>>>>>>> Jaegeuk, it looks like the loop you added in the above commit doesn't terminate
>>>>>>>> if the requested page is beyond the end of the device.
>>>>>>>
>>>>>>> Yes, that will go infinite loop. Otherwise, it will trigger a panic during
>>>>>>> the device reboot. Let me think how to avoid that before trying to get the
>>>>>>> wrong lba access.
>>>>>>
>>>>>> Delivering f2fs_get_sum_page()'s return value needs a lot of codes change, I think
>>>>>> we can just zeroing sum_page in error case, as we have already shutdown f2fs via
>>>>>> calling f2fs_stop_checkpoint(), then f2fs_cp_error() will stop all updates to
>>>>>> filesystem data including summary pages.
>>>>>
>>>>> That sounds like one solution tho, I'm afraid of getting another panic by
>>>>> wrong zero'ed summary page.
>>>>
>>>> What case do you mean? maybe I missed some corner cases?
>>>
>>> I sent v2 to fix syzbot issue, which fixes wrong use of
>>> f2fs_get_meta_page_nofail.
>>
>> I agreed to fix that case, however we may encounter deadloop in other
>> places where we call f2fs_get_meta_page_nofail()? like the case that
>> filesystem will always see EIO after we shutdown device via dmflakey?
>
> We may need another option to deal with this. At least, however, it's literally
> _nofail function which should guarantee no error, instead of hiding the error
> with zero'ed page.
>
>>
>> Thanks,
>>
>>>
>>>>
>>>> Thanks,
>>>>
>>>>>
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> - Eric
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Linux-f2fs-devel mailing list
>>>>>>> Linux-f2fs-devel@lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
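
The concern raised in the thread (a _nofail style retry loop that never
terminates when every read keeps returning EIO, for example because the
requested block lies beyond the end of the device) can be illustrated with a
small user-space model. This is only a sketch under stated assumptions:
fake_dev, fake_read_block(), read_block_bounded() and MAX_IO_RETRIES are
hypothetical names invented for this illustration, the retry budget is an
arbitrary value, and none of it is f2fs code or the patch discussed above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical retry budget; an arbitrary value, not taken from the thread. */
#define MAX_IO_RETRIES 8

/* Hypothetical stand-in for a block device with a fixed number of blocks. */
struct fake_dev {
    unsigned long nr_blocks;     /* valid block numbers are 0..nr_blocks-1 */
    unsigned long read_attempts; /* counts every read issued to the device */
};

/*
 * Model of a read that fails persistently: any block at or beyond the end
 * of the device returns -EIO on every attempt, so retrying cannot help.
 */
static int fake_read_block(struct fake_dev *dev, unsigned long blk)
{
    assert(dev != NULL);
    dev->read_attempts++;
    if (blk >= dev->nr_blocks)
        return -EIO;
    return 0;
}

/*
 * An unbounded "nofail" loop would look like
 *     while (fake_read_block(dev, blk) != 0)
 *         ;
 * and would spin forever on a persistent error. The bounded variant below
 * gives up after MAX_IO_RETRIES attempts and reports the error instead.
 */
static int read_block_bounded(struct fake_dev *dev, unsigned long blk)
{
    int err = -EIO;
    int tries;

    for (tries = 0; tries < MAX_IO_RETRIES; tries++) {
        err = fake_read_block(dev, blk);
        if (err != -EIO)
            return err; /* success, or an error that retrying won't fix */
    }
    return err; /* still -EIO after exhausting the retry budget */
}
```

With a struct fake_dev whose nr_blocks is 16, read_block_bounded(&dev, 100)
returns -EIO after MAX_IO_RETRIES attempts instead of spinning. What the caller
should do with that error (propagate it, or stop further updates as suggested
in the thread) is exactly the open question in the discussion, and this sketch
does not answer it.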