Date: Tue, 1 Jun 2021 15:23:17 +0200
From: Vincent Whitchurch
Subject: __buffer_migrate_page() vs ll_rw_block()
Message-ID: <20210601132316.GA27976@axis.com>
X-Mailing-List: linux-kernel@vger.kernel.org

I'm seeing occasional squashfs read failures ("squashfs_read_data failed
to read block") when compaction runs at the same time as reads from
squashfs, with something like the commands below.  The kernel is the
latest stable/v5.4 kernel, v5.4.123.

	while :; do echo 1 > /proc/sys/vm/compact_memory; done &
	while :; do echo 3 > /proc/sys/vm/drop_caches; find fs/ > /dev/null; done &

On this kernel, squashfs uses ll_rw_block().  The problem is that
ll_rw_block() skips any BH it cannot immediately lock, while
__buffer_migrate_page() takes the lock on the BHs in order to check
whether they can be migrated.  If __buffer_migrate_page() happens to hold
the lock at the moment ll_rw_block() wants it, the BH is skipped and no
I/O is issued for that block, so squashfs ends up seeing
!buffer_uptodate() and erroring out.

On newer kernels, squashfs doesn't use ll_rw_block() anymore, but I still
see other users of that function in other filesystems, and AFAICS the
underlying problem of the race with __buffer_migrate_page() has not yet
been fixed.

I'd be happy to receive any suggestions about the right way to fix this.
Thank you.
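
For reference, here is a condensed sketch of the two sides of the race as
I read them in v5.4 (not verbatim kernel code: the write path, refcount
checks and error handling are left out):

	/* fs/buffer.c, condensed: only the READ side is shown */
	void ll_rw_block(int op, int op_flags, int nr, struct buffer_head *bhs[])
	{
		int i;

		for (i = 0; i < nr; i++) {
			struct buffer_head *bh = bhs[i];

			/*
			 * If someone else holds the buffer lock -- e.g.
			 * __buffer_migrate_page() checking whether the
			 * buffers can be migrated -- the BH is silently
			 * skipped and no I/O is ever submitted for it.
			 */
			if (!trylock_buffer(bh))
				continue;

			if (!buffer_uptodate(bh)) {
				bh->b_end_io = end_buffer_read_sync;
				get_bh(bh);
				submit_bh(op, op_flags, bh);
				continue;
			}
			unlock_buffer(bh);
		}
	}

	/*
	 * mm/migrate.c, in outline: for sync compaction,
	 * __buffer_migrate_page() lock_buffer()s every buffer on the page
	 * (via buffer_migrate_lock_buffers()) before deciding whether the
	 * page can be migrated, and unlock_buffer()s them again at the end.
	 * That window is enough for the trylock_buffer() above to fail.
	 *
	 * squashfs_read_data() then does, roughly:
	 *
	 *	ll_rw_block(REQ_OP_READ, 0, b, bh);
	 *	...
	 *	wait_on_buffer(bh[k]);
	 *	if (!buffer_uptodate(bh[k]))
	 *		goto block_release;	// -> "failed to read block"
	 */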