To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-kernel, Joseph Qi, Tejun Heo, jiufei.xue@linux.alibaba.com, Caspar Zhang
From: "Maciej S. Szmigiero"
Subject: [PATCH] cfq: clear queue pointers from cfqg after unpinning them in cfq_pd_offline
Date: Fri, 17 Aug 2018 19:28:39 +0200
X-Mailing-List: linux-kernel@vger.kernel.org

The current linux-block, 4.18 and 4.17 can reliably be crashed within a few
minutes by running the following bash snippet:

mkfs.ext4 -v /dev/sda3 && mount /dev/sda3 /mnt/test/ -t ext4;
while true; do
	mkdir /sys/fs/cgroup/unified/test/;
	echo $$ >/sys/fs/cgroup/unified/test/cgroup.procs;
	dd if=/dev/zero of=/mnt/test/test-$(( RANDOM * 10 / 32768 )) bs=1M count=1024 &
	echo $$ >/sys/fs/cgroup/unified/cgroup.procs;
	sleep 1;
	kill -KILL $!; wait $!;
	rmdir /sys/fs/cgroup/unified/test;
done

# cat /sys/block/sda/queue/scheduler
noop [cfq]
# cat /sys/block/sda/queue/rotational
1
# cat /sys/fs/cgroup/unified/cgroup.subtree_control
cpu io memory pids

The backtraces vary, but often they are NULL pointer dereferences caused by
various cfqq fields being NULL. In other cases BUG_ON(cfqq->ref <= 0) in
cfq_put_queue() triggers because the cfqq reference count is already zero.

Bisection points at
commit 4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()").
The prime suspect was the .pd_offline_fn() method being called multiple
times, but analysis of the mentioned commit suggested this was not possible,
and runtime trials have confirmed that.

However, CFQ's cfq_pd_offline() implementation of the above method was
leaving queue pointers intact in cfqg after unpinning them.
After making sure that they are cleared to NULL in this function I can no
longer reproduce the crash.

Signed-off-by: Maciej S. Szmigiero
Fixes: 4c6994806f70 ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()")
Cc: stable@vger.kernel.org
---
 block/cfq-iosched.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
index 2eb87444b157..ed41aa978c4a 100644
--- a/block/cfq-iosched.c
+++ b/block/cfq-iosched.c
@@ -1644,14 +1644,20 @@ static void cfq_pd_offline(struct blkg_policy_data *pd)
 	int i;
 
 	for (i = 0; i < IOPRIO_BE_NR; i++) {
-		if (cfqg->async_cfqq[0][i])
+		if (cfqg->async_cfqq[0][i]) {
 			cfq_put_queue(cfqg->async_cfqq[0][i]);
-		if (cfqg->async_cfqq[1][i])
+			cfqg->async_cfqq[0][i] = NULL;
+		}
+		if (cfqg->async_cfqq[1][i]) {
 			cfq_put_queue(cfqg->async_cfqq[1][i]);
+			cfqg->async_cfqq[1][i] = NULL;
+		}
 	}
 
-	if (cfqg->async_idle_cfqq)
+	if (cfqg->async_idle_cfqq) {
 		cfq_put_queue(cfqg->async_idle_cfqq);
+		cfqg->async_idle_cfqq = NULL;
+	}
 
 	/*
 	 * @blkg is going offline and will be ignored by
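
Illustration (not part of the patch): the "put, then clear the pointer" pattern
applied above can be sketched as a small user-space C model. Everything here is
hypothetical and exists only for illustration: the toy_* names, NR_PRIO (a
stand-in for IOPRIO_BE_NR) and the live_queues counter are not kernel
identifiers. The lookup helper also assumes that a later user of the group
reads the cached slot and allocates a fresh queue when the slot is NULL. That
is an assumption about how a stale pointer could be reused, not something the
patch description states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_PRIO 8 /* hypothetical stand-in for IOPRIO_BE_NR */

/* Toy refcounted queue; NOT the kernel's struct cfq_queue. */
struct toy_queue {
	int ref;
};

/* Toy group that caches queue pointers, loosely modelled on cfqg. */
struct toy_group {
	struct toy_queue *async[2][NR_PRIO];
	struct toy_queue *async_idle;
};

static int live_queues; /* queues allocated and not yet freed */

/* Allocate a queue holding one reference; returns NULL on failure. */
static struct toy_queue *toy_queue_new(void)
{
	struct toy_queue *q = calloc(1, sizeof(*q));

	if (!q)
		return NULL;
	q->ref = 1;
	live_queues++;
	return q;
}

/* Drop one reference and free the queue when the last one goes away. */
static void toy_queue_put(struct toy_queue *q)
{
	assert(q->ref > 0); /* plays the role of BUG_ON(cfqq->ref <= 0) */
	if (--q->ref == 0) {
		free(q);
		live_queues--;
	}
}

/* Drop the reference cached in *slot and clear the slot. */
static void toy_put_and_clear(struct toy_queue **slot)
{
	if (*slot) {
		toy_queue_put(*slot);
		*slot = NULL; /* without this the slot keeps a stale pointer */
	}
}

/* Unpin every cached queue, leaving only NULL slots behind. */
static void toy_group_offline(struct toy_group *g)
{
	for (int i = 0; i < NR_PRIO; i++) {
		toy_put_and_clear(&g->async[0][i]);
		toy_put_and_clear(&g->async[1][i]);
	}
	toy_put_and_clear(&g->async_idle);
}

/*
 * Assumed lookup path: reuse the cached queue if the slot is set,
 * otherwise allocate one (the group's reference), then take an extra
 * reference for the caller.
 */
static struct toy_queue *toy_group_get_async(struct toy_group *g,
					     int rw, int prio)
{
	struct toy_queue **slot = &g->async[rw][prio];

	if (!*slot)
		*slot = toy_queue_new();
	if (*slot)
		(*slot)->ref++;
	return *slot;
}
```

In this model, calling toy_group_offline() and then toy_group_get_async() on
the same group allocates a fresh queue, because the slot reads as NULL. If the
slot were left set, the lookup would instead take a reference on a queue whose
group reference was already dropped, and which may already have been freed.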