From: Jari Ruusu
Date: Tue, 2 Apr 2019 13:08:45 +0300
Subject: ext3 file system livelock and file system corruption, 4.9.166 stable kernel
To: Greg Kroah-Hartman
Cc: "zhangyi (F)", "Theodore Ts'o", Jan Kara, linux-kernel@vger.kernel.org

To trigger this ext4 file system bug, you need a sparse file with the
right sparse pattern on an old-school ext3 file system. I tried simpler
ways to trigger it, but those attempts did not trigger the bug. I have
provided a compressed sparse file that reliably triggers the bug.

Size of the compressed sparse file: 1667256 bytes.
Size of the uncompressed sparse file: 7369850880 bytes.

The following commands demonstrate the problem.

 wget http://www.elisanet.fi/jariruusu/123/sparse-demo.data.xz
 xz -d sparse-demo.data.xz
 mkfs -t ext3 -b 4096 -e remount-ro -O "^dir_index" /dev/sdc1
 mount -t ext3 /dev/sdc1 /mnt
 cp -v --sparse=always sparse-demo.data /mnt/aa
 cp -v --sparse=always sparse-demo.data /mnt/bb
 umount /mnt
 mount -t ext3 /dev/sdc1 /mnt
 cp -v --sparse=always /mnt/bb /mnt/aa

That last cp command reliably triggers the bug: the system livelocks,
and after a reset you have file system corruption to deal with. Deeply
unfunny.
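Since the reproduction depends on the copied files actually being sparse, it may help to check that before the final cp. The following is only a minimal sketch, not part of the original report: the function names are made up for illustration, and it assumes Linux, where st_blocks is counted in 512-byte units.

```c
#include <sys/stat.h>

/* Hypothetical helper: a file looks sparse when fewer bytes are
 * allocated than its apparent size. Assumes 512-byte st_blocks
 * units, which holds on Linux but is not guaranteed elsewhere. */
int allocation_is_sparse(long long size_bytes, long long blocks_512)
{
    return blocks_512 * 512 < size_bytes;
}

/* Hypothetical wrapper: returns 1 if the file at path looks sparse,
 * 0 if it does not, and -1 if stat() fails. */
int looks_sparse(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return allocation_is_sparse((long long)st.st_size,
                                (long long)st.st_blocks);
}
```

Calling looks_sparse("/mnt/bb") after the second mount would be one way to confirm that the file kept its holes. A freshly written file can report fewer blocks than it will eventually occupy, so the result is a hint rather than proof.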

The bug is caused by "ext4: brelse all indirect buffer in
ext4_ind_remove_space()", upstream commit
674a2b27234d1b7afcb0a9162e81b2e53aeef217, whose author also provided a
follow-up patch "ext4: cleanup bh release code in
ext4_ind_remove_space()", upstream commit
5e86bdda41534e17621d5a071b294943cae4376e. The problem with that
follow-up patch is that it is almost criminally mislabeled. It should
have said "fixes ext3 livelock and file system corrupting bug" or
something like that, so that Greg KH & Co would have understood that it
must be backported to stable kernels too. Now the bug appears to be in
all/most stable kernels already.

Below is the buggy patch that causes the problem. Look at those new
while loops. Once the while condition is true, it is ALWAYS true,
because nothing in the loop body changes partial or partial2, so it
livelocks.

> --- a/fs/ext4/indirect.c
> +++ b/fs/ext4/indirect.c
> @@ -1385,10 +1385,14 @@ end_range:
>                                            partial->p + 1,
>                                            partial2->p,
>                                            (chain+n-1) - partial);
> -                       BUFFER_TRACE(partial->bh, "call brelse");
> -                       brelse(partial->bh);
> -                       BUFFER_TRACE(partial2->bh, "call brelse");
> -                       brelse(partial2->bh);
> +                       while (partial > chain) {
> +                               BUFFER_TRACE(partial->bh, "call brelse");
> +                               brelse(partial->bh);
> +                       }
> +                       while (partial2 > chain2) {
> +                               BUFFER_TRACE(partial2->bh, "call brelse");
> +                               brelse(partial2->bh);
> +                       }
>                         return 0;
>                 }
>

Greg & Co,
Please revert the above patch from stable kernels or backport the
follow-up patch that fixes the problem.

--
Jari Ruusu  4096R/8132F189 12D6 4C3A DCDA 0AA4 27BD  ACDF F073 3C80 8132 F189
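
The loop problem in the quoted patch can be illustrated in user space. This is only a sketch under stated assumptions, not kernel code: struct link and both function names are hypothetical stand-ins, incrementing a counter stands in for brelse(), and max_iter exists only so the demonstration returns instead of spinning forever. The second function assumes that the cure is to step the pointer back toward the chain head on each pass; the actual follow-up commit may be structured differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one element of the indirect block chain. */
struct link {
    int released;   /* how many times this element was "released" */
};

/* Same shape as the loops in the quoted patch: the pointer is never
 * moved, so once the condition holds it holds forever. Only max_iter
 * lets this function return. */
size_t release_without_step(struct link *chain, struct link *partial,
                            size_t max_iter)
{
    size_t iters = 0;

    assert(partial >= chain);
    while (partial > chain && iters < max_iter) {
        partial->released++;    /* stands in for brelse(partial->bh) */
        iters++;
    }
    return iters;
}

/* Assumed repair: walk the pointer back one element per pass, so the
 * loop ends when it reaches the head of the chain. */
size_t release_with_step(struct link *chain, struct link *partial,
                         size_t max_iter)
{
    size_t iters = 0;

    assert(partial >= chain);
    while (partial > chain && iters < max_iter) {
        partial->released++;
        partial--;
        iters++;
    }
    return iters;
}
```

With a four-element chain and partial pointing at element 2, the first function uses up whatever iteration budget it is given and releases the same element every time, while the second stops after two passes, having released elements 2 and 1 once each.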