From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman , stable@vger.kernel.org, "Darrick J.
 Wong" , Liu Bo , Zubin Mithra
Subject: [PATCH 4.14 67/70] iomap: report collisions between directio and buffered writes to userspace
Date: Wed, 24 Apr 2019 19:10:27 +0200
Message-Id: <20190424170919.032089132@linuxfoundation.org>
X-Mailer: git-send-email 2.21.0
In-Reply-To: <20190424170906.751869122@linuxfoundation.org>
References: <20190424170906.751869122@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

From: Darrick J. Wong

commit 5a9d929d6e13278df62bd9e3d3ceae8c87ad1eea upstream.

If two programs simultaneously try to write to the same part of a file
via direct IO and buffered IO, there's a chance that the post-diowrite
pagecache invalidation will fail on the dirty page. When this happens,
the dio write succeeded, which means that the page cache is no longer
coherent with the disk! Programs are not supposed to mix IO types and
this is a clear case of data corruption, so store an EIO which will be
reflected to userspace during the next fsync. Replace the WARN_ON with a
ratelimited pr_crit so that the developers have /some/ kind of breadcrumb
to track down the offending program(s) and file(s) involved.

Signed-off-by: Darrick J. Wong
Reviewed-by: Liu Bo
Signed-off-by: Greg Kroah-Hartman
Cc: Zubin Mithra
---
 fs/direct-io.c     |   24 +++++++++++++++++++++++-
 fs/iomap.c         |   12 ++++++++++--
 include/linux/fs.h |    1 +
 3 files changed, 34 insertions(+), 3 deletions(-)

--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -219,6 +219,27 @@ static inline struct page *dio_get_page(
 	return dio->pages[sdio->head];
 }
 
+/*
+ * Warn about a page cache invalidation failure during a direct io write.
+ */
+void dio_warn_stale_pagecache(struct file *filp)
+{
+	static DEFINE_RATELIMIT_STATE(_rs, 86400 * HZ, DEFAULT_RATELIMIT_BURST);
+	char pathname[128];
+	struct inode *inode = file_inode(filp);
+	char *path;
+
+	errseq_set(&inode->i_mapping->wb_err, -EIO);
+	if (__ratelimit(&_rs)) {
+		path = file_path(filp, pathname, sizeof(pathname));
+		if (IS_ERR(path))
+			path = "(unknown)";
+		pr_crit("Page cache invalidation failure on direct I/O. Possible data corruption due to collision with buffered I/O!\n");
+		pr_crit("File: %s PID: %d Comm: %.20s\n", path, current->pid,
+			current->comm);
+	}
+}
+
 /**
  * dio_complete() - called when all DIO BIO I/O has been completed
  * @offset: the byte offset in the file of the completed operation
@@ -290,7 +311,8 @@ static ssize_t dio_complete(struct dio *
 		err = invalidate_inode_pages2_range(dio->inode->i_mapping,
 					offset >> PAGE_SHIFT,
 					(offset + ret - 1) >> PAGE_SHIFT);
-		WARN_ON_ONCE(err);
+		if (err)
+			dio_warn_stale_pagecache(dio->iocb->ki_filp);
 	}
 
 	if (!(dio->flags & DIO_SKIP_DIO_COUNT))
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -753,7 +753,8 @@ static ssize_t iomap_dio_complete(struct
 		err = invalidate_inode_pages2_range(inode->i_mapping,
 				offset >> PAGE_SHIFT,
 				(offset + dio->size - 1) >> PAGE_SHIFT);
-		WARN_ON_ONCE(err);
+		if (err)
+			dio_warn_stale_pagecache(iocb->ki_filp);
 	}
 
 	inode_dio_end(file_inode(iocb->ki_filp));
@@ -1010,9 +1011,16 @@ iomap_dio_rw(struct kiocb *iocb, struct
 	if (ret)
 		goto out_free_dio;
 
+	/*
+	 * Try to invalidate cache pages for the range we're direct
+	 * writing. If this invalidation fails, tough, the write will
+	 * still work, but racing two incompatible write paths is a
+	 * pretty crazy thing to do, so we don't support it 100%.
+	 */
 	ret = invalidate_inode_pages2_range(mapping,
 			start >> PAGE_SHIFT, end >> PAGE_SHIFT);
-	WARN_ON_ONCE(ret);
+	if (ret)
+		dio_warn_stale_pagecache(iocb->ki_filp);
 	ret = 0;
 
 	if (iov_iter_rw(iter) == WRITE && !dio->wait_for_completion &&
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2965,6 +2965,7 @@ enum {
 };
 
 void dio_end_io(struct bio *bio);
+void dio_warn_stale_pagecache(struct file *filp);
 
 ssize_t __blockdev_direct_IO(struct kiocb *iocb, struct inode *inode,
 			     struct block_device *bdev, struct iov_iter *iter,
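
Note for userspace callers: the patch records -EIO in the mapping's wb_err
via errseq_set(), and the commit message says that error is reflected to
userspace during the next fsync. The following is a minimal userspace
sketch of how an application might notice it. The helper names
flush_and_check() and write_and_sync() are hypothetical and are not part of
this patch. The sketch only shows checking the fsync() return value; it does
not try to reproduce the direct/buffered collision itself.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): flush fd to stable storage.
 * Returns 0 on success or a negative errno value, e.g. -EIO if the kernel
 * recorded a writeback error on the file's mapping.
 */
static int flush_and_check(int fd)
{
	if (fsync(fd) == 0)
		return 0;
	return -errno;
}

/*
 * Hypothetical helper: write len bytes from buf with ordinary buffered
 * I/O, retrying short writes and EINTR, then fsync the file.
 * Returns 0 on success or a negative errno value.
 */
static int write_and_sync(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		p += n;
		len -= (size_t)n;
	}
	return flush_and_check(fd);
}
```

With this patch applied, a caller that gets -EIO back from write_and_sync()
on a file that is also being written with O_DIRECT elsewhere has a reason to
treat the file contents as suspect, and the pr_crit lines in the kernel log
name the file and the direct-I/O writer.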