Received: by 2002:a25:c205:0:0:0:0:0 with SMTP id s5csp2365630ybf; Mon, 2 Mar 2020 07:14:44 -0800 (PST) X-Google-Smtp-Source: APXvYqwrzy7Su8FvOB8p6ZchhfOgaq8e+iloFdvA+6KNKhHKlf7rUmvq+w3Yf0e22SJAmDMYL05k X-Received: by 2002:a05:6830:10d7:: with SMTP id z23mr13571639oto.114.1583162084243; Mon, 02 Mar 2020 07:14:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1583162084; cv=none; d=google.com; s=arc-20160816; b=qV1Eb4Pf60rGyq4M+WS0p+Qjx7gH2TPoR5qQ1aRsRdxAwhpJFpxPW/B5gaM63HYa9A QpdTRJSdXumxioflLAfo8PEKvdkkZrenT95+TrjG4fCbTESoppR1II1XXtguDrsuk1aP W4HQCN65ZvBrTQ80Ddf0O3nXkMq0fyIE8lH9k7UGXECk+pCqPZpZI/PoKFjTjlc7DXTQ nIs16dShC9qjgsV+VhJUqgsNSmBAUmH7fIwb89ohqbKVS6rdpSyZw0plG0K9H3vYbRU6 jMFTM1pb8QLHG5+XY1jb6gESvL6H4vA62Z5iAEIv75NaSh03DomUXiZ4RMAmnc+i8hea gnFA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:to:from :dkim-signature; bh=hbtuA6sJCGwZmeCm5dVANr4qYyH1rkLKX1gUP6zj/ZE=; b=mmmtglaNgduaL/Ij5hqZdElSZ4QEwIKzwv2CstO/qbxtz1To9ZdpcHxaaWigIzd8oe FtFmRBGaLgpSRSwQ5pYfFlSukesfQf2ZOiDkjAtASFKGGyUZvHj2/r1aNrEiu7ltfDvh XQGBWPAXEqD8k39PNYYcGufRWr6HyMWEJjR+wer0iRPEKtbZuawabVbOIfkuzymTUjpm mvpyoSLcXvYrE6UBai3BEx/lfcV25zZsLD4gwCJJ9fgvXeXcfkGo+Bw20ajSq9uqhwxG LALTWHtNdUCnWQX3wz/iFUcLg4tV83WJ2KCkS73jb0cgtCjbKgYQReetddYtbeHoUT/s g7xA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@126.com header.s=s110527 header.b=g4G7EghD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=126.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a21si5301742oib.166.2020.03.02.07.14.31; Mon, 02 Mar 2020 07:14:44 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@126.com header.s=s110527 header.b=g4G7EghD; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=126.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727060AbgCBPOV (ORCPT + 99 others); Mon, 2 Mar 2020 10:14:21 -0500 Received: from m15-113.126.com ([220.181.15.113]:47910 "EHLO m15-113.126.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726451AbgCBPOV (ORCPT ); Mon, 2 Mar 2020 10:14:21 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=126.com; s=s110527; h=From:Subject:Date:Message-Id; bh=hbtuA6sJCGwZmeCm5d VANr4qYyH1rkLKX1gUP6zj/ZE=; b=g4G7EghDT40FCio/gpHJxKjigaPPGy2SPR NrhT6bdIRNw6RZvRao7UPkcscl7lb4vRVfRLanCKSWqs8PB6GTgHxiK5xsBQtzjj GyZvk4nvAq4xR55tHFW92tQJ8aWkBBx01VuQM6kOm1VLiD/dHxBJEKSNrN9epCxr HLWgrgQKA= Received: from 192.168.137.250 (unknown [112.10.84.98]) by smtp3 (Coremail) with SMTP id DcmowACXJcLBIl1e0wO_Bw--.47949S3; Mon, 02 Mar 2020 23:14:10 +0800 (CST) From: Xianting Tian To: linux-kernel@vger.kernel.org Subject: [PATCH] mm/filemap.c: clear page error before actual read Date: Mon, 2 Mar 2020 10:14:09 -0500 Message-Id: <1583162049-17623-1-git-send-email-xianting_tian@126.com> X-Mailer: git-send-email 1.8.3.1 X-CM-TRANSID: DcmowACXJcLBIl1e0wO_Bw--.47949S3 X-Coremail-Antispam: 1Uf129KBjvJXoWxWF1DKFWkWF43JF13AF13Jwb_yoW5WFykpr ZxJ3WxKr4DGFnIkF4vy3Z7Cr1rGr4vyay5Way8W3srZwn8GFn3u34SkayDWrW3Jr1FyF42 qrsYqas8CrZYqaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDUYxBIdaVFxhVjvjDU0xZFpf9x07jSE__UUUUU= X-Originating-IP: [112.10.84.98] X-CM-SenderInfo: h0ld03plqjs3xldqqiyswou0bp/1tbi7QDapFpD+Fix5QABs- Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Mount failure issue happens under the circumstances: Application totally forked 38 threads to mount 38 cramfs images in docker, but several mounts failed with high probability. Mount failed due to the checking result of the page(read from the superblock of loop dev) is not uptodate after wait_on_page_locked(page) returned in function cramfs_read: wait_on_page_locked(page); if (!PageUptodate(page)) { ... } The reason of the checking result of the page not uptodate: systemd-udevd read the loopX dev before mount, because the status of loopX is Lo_unbound at this time, so loop_make_request directly trigger the calling of io_end handler end_buffer_async_read, which called SetPageError(page). So It caused the page can't be set to uptodate in function end_buffer_saync_read: if(page_uptodate && !PageError(page)) { SetPageUptodate(page); } Then mount operation is performed, it used the same page which is just used by systemd-udevd above, Because this page is not uptodate, it will launch a actual read via submit_bh, then wait on this page by calling wait_on_page_locked(page). When the I/O of the page done, io_end handler end_buffer_async_read is called, because no one cleared the page error(during the whole read path), which is caused by systemd-udevd, so this page is still in "PageError" staus, which is can't be set to uptodate in function end_buffer_async_read, then caused mount failure. But sometimes mount succeed even through systemd-udeved read loop dev just before, The reason is systemd-udevd launched other loopX read just between step 3.1 and 3.2, the steps as below: 1, loopX dev default status is Lo_unbound; 2, systemd-udved read loopX dev (page is set to PageError); 3, mount operation 1) set loopX status to Lo_bound; ==>systemd-udevd read loopX dev<== 2) read loopX dev(page has no error) 3) mount succeed As the loopX dev status is set to Lo_bound after step 3.1, so the other loopX dev read by systemd-udevd will go through the whole I/O stack, part of the call trace as below: SYS_read vfs_read do_sync_read blkdev_aio_read generic_file_aio_read do_generic_file_read ClearPageError(page);mapping->a_ops->readpage(filp, page) here, mapping->a_ops_readpage() is blkdev_readpage. We can see the ClearPageError(page) is called, then the read in step 3.2 succeed. The fix is to add ClearPageError just before the actual read of other read path. Signed-off-by: Xianting Tian --- mm/filemap.c | 1 + 1 file changed, 1 insertion(+) diff --git a/mm/filemap.c b/mm/filemap.c index 1784478..c2cf0fd 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2755,6 +2755,7 @@ static struct page *do_read_cache_page(struct address_space *mapping, } filler: + ClearPageError(page); if (filler) err = filler(data, page); else -- 1.8.3.1