Received: by 10.192.165.156 with SMTP id m28csp2060249imm; Sat, 14 Apr 2018 12:50:56 -0700 (PDT) X-Google-Smtp-Source: AIpwx49fdgQZrHSUBUQO+YT6IvvuHpdapx96UgbRhMT2iC3Wm6MkVKFX/1ih5kYYDTJo1MaJn3a2 X-Received: by 2002:a17:902:2f03:: with SMTP id s3-v6mr9770900plb.274.1523735456524; Sat, 14 Apr 2018 12:50:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1523735456; cv=none; d=google.com; s=arc-20160816; b=k9BKmIhRUIWjT/aEdf1xPo652BLEk/IJYgaxtFXM2ixjSeafIC4D2RmLV9wQf5Aimo qCmX/VN9vc+r8Mqqa3gyAsAlYkLu9d9M+eVBG1zRWnDSsObtJL+AUD+wVWuCBAm0OEkV zmNbzZNFza0wegOKnM0KAsnMmbcVBqFGne07L3trFY303gtWzeiGKu4BZ8FfiiYldLkP IetBuruHAmA3j7sy2FHM/c/gy48K1FQFbuUBzGJEDH8fEhI4O67p5h2HM4atd1opQ7Fj iSkdfd/ndgOfLgVb9XxyClHLziOYq/jxT0ui4iZMuYuYpPi4EOANljagKTW6RIhLRxbU Xmtw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature :arc-authentication-results; bh=M9aceykpfUK7IXEVvRPQmE/L8xvpXVkO3GvvvShvW+M=; b=GoQav2aWsaGA0w+yAwM3wW9WodpAqwKo9JYjaGkTtRHOeLTVzlCN1i41qEqkX7xv25 zQGPD263OHzrpAmokwmZNl9kcoLFz9W/TlOmkrLCUBLUSyyO6QMfwm4OAxgND9JQ7eq6 AULxjKWbbBv7vwsfWtgpVgZcW9mhJQOmTDTk80rADpBwdr5T45uhFFwYBkE19LV0dx9H 646xZWKe8q0dmqW8Yqst5DBfbSBeWPvGbKQX8lxD6K7SL2NncpvOpz0QCWj9sgp2zK7Z KZVqzoQ5hHhPDFA16p4iemTaTNMimWlK0t8MLOcGIwGv1PvGpQhdhdJJYKeBhpNcyNtM W7fw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=qKooEvNF; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id i128si7525262pfg.343.2018.04.14.12.50.06; Sat, 14 Apr 2018 12:50:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=qKooEvNF; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751805AbeDNTqn (ORCPT + 99 others); Sat, 14 Apr 2018 15:46:43 -0400 Received: from mail-wr0-f194.google.com ([209.85.128.194]:34049 "EHLO mail-wr0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751590AbeDNTqk (ORCPT ); Sat, 14 Apr 2018 15:46:40 -0400 Received: by mail-wr0-f194.google.com with SMTP id d19so15935499wre.1; Sat, 14 Apr 2018 12:46:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=subject:to:cc:references:from:message-id:date:user-agent :mime-version:in-reply-to:content-transfer-encoding:content-language; bh=M9aceykpfUK7IXEVvRPQmE/L8xvpXVkO3GvvvShvW+M=; b=qKooEvNFQDurR8CM6pL2X1x7lB0nW1l1efpAe4iE+sg6STK+cPNtJL+2wqgHPvFc8s smVuW+Q5qxyYaVawFzUbEP4j0eewj2VM15pN6/tAH9zofcLqEe6TIIsKXG6sO68ahObo I7gvF5MpzloQflt5HdS+SbvZb8BdQOux2MbalSKBR2ZdVzSbv6zYd0LAz/jPDk54yKAx MbgNlPecCTs9wY3OEfn18tkJfxZwz++ofaS6AqndRrWu7s3Y9hLg4Aqnjm7KoBXQK4W4 AN0nXGKQlhlkLe8zzFmep8uG9J553siI1Q9XBcWaKfZzpTnuPfa70CjzFuCvjS1XE7J+ IOew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-transfer-encoding :content-language; bh=M9aceykpfUK7IXEVvRPQmE/L8xvpXVkO3GvvvShvW+M=; b=CH17uZAMRwLVAUfF8JCVMDXRnrFRyCHND1eMEt26zQlNKmfJWIHM90wACD898nr0xd cUKnx3+SIiJZye6Z/ptLpw93tnA/f/pYJePSplXxInB85pQBWK2FAitnXi8HiK3e8m7K 4iXZgZRek97KazpmcdyrnXQeRpYDX7sjAy/H2ScL7dhsuAhhoMp9yxfFGAqOAkc1d4jp KBq6UjF2uUJIyx/pzNuP/3m3qDy7uFBm834W86GizTtfXRVCjdy1pMEW7di+ZQXOwYKd 3Ndx/sGpXyoaVJFbL6lC0aJ09zgQ0qaAnLg/CdUfJj3E9R+N3Qayj9qF35Uf2KIQh35L 9hAQ== X-Gm-Message-State: ALQs6tAAZcdUOeD2eMZ+BCcVIdWo2gNYttK20oxr8tYVhIjpqb8EwIR9 wVEWTHgnH8THd4JR86zHfxT8/fVG X-Received: by 10.28.195.67 with SMTP id t64mr7148288wmf.113.1523735197945; Sat, 14 Apr 2018 12:46:37 -0700 (PDT) Received: from [172.16.1.192] (host-89-243-165-90.as13285.net. [89.243.165.90]) by smtp.gmail.com with ESMTPSA id f2sm5750639wre.76.2018.04.14.12.46.36 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 14 Apr 2018 12:46:37 -0700 (PDT) Subject: Re: blktest for [PATCH v2] block: do not use interruptible wait anywhere To: Johannes Thumshirn , linux-block@vger.kernel.org Cc: Jens Axboe , Bart Van Assche , linux-kernel@vger.kernel.org, stable@vger.kernel.org References: <13e67e1243ebd96f3f56a150d47444ada47ebde0.camel@wdc.com> <20180412181158.8884-1-alan.christopher.jenkins@gmail.com> <1523608284.7787.3.camel@suse.de> From: Alan Jenkins Message-ID: Date: Sat, 14 Apr 2018 20:46:36 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.7.0 MIME-Version: 1.0 In-Reply-To: <1523608284.7787.3.camel@suse.de> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-GB Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 13/04/18 09:31, Johannes Thumshirn wrote: > Hi Alan, > > On Thu, 2018-04-12 at 19:11 +0100, Alan Jenkins wrote: >> # dd if=/dev/sda of=/dev/null iflag=direct & \ >> while killall -SIGUSR1 dd; do sleep 0.1; done & \ >> echo mem > /sys/power/state ; \ >> sleep 5; killall dd # stop after 5 seconds > Can you please also add a regression test to blktests[1] for this? > > [1] https://github.com/osandov/blktests > > Thanks, > Johannes Good question. It would be nice to promote this test. Template looks like I need the commit (sha1) first. I had some ideas about automating it, so I wrote a standalone (see end).  I can automate the wakeup by using pm_test, but this is still a system suspend test.  Unfortunately I don't think there's any alternative. To give the most dire example # This test is non-destructive, but it exercises suspend in all drivers. # If your system has a problem with suspend, it might not wake up again. So I'm not sure if it would be acceptable for the default set? How useful is this going to be? Is there an expanded/full set of tests that gets run somewhere? If you can't guarantee it's going to be run somewhere, I'd worry the cost/benefit  feels a little narrow :-(. There were one or two further "interesting" details, and it might theoretically bitrot if it's not run periodically. If you look at the diff and title for the fix, I don't think it's at high risk of being reversed unintentionally. And I think you can trust users will notice if the fix gets merged away accidentally, before it hits -stable releases :-). The issue kills the entire GUI session on resume from suspend, say once every three days, on gnome-shell (due to Xwayland). One unfortunate user switched to Xorg only to find that was also affected.  I honestly assume the issue applies generally to laptop systems.  The only mitigating factor is if you have RAM to spare, so you don't hit the major pagefaults during resume. #!/bin/bash # This test is non-destructive, but it exercises suspend in all drivers. # If your system has a problem with suspend, it might not wake up again. # TEST_DEV must be SCSI (inc. libata). # # Additionally, this test will abort if $TEST_DEV is too tiny # and we finish reading it within 3 seconds. Sorry. TEST_DEV=sda # RATIONALE # # The original root cause issue was the behaviour around blk_queue_freeze(). # It put tasks into an interruptible wait, which is wrong for block devices. # # XXX Insert reference to fix commit XXX # # The freeze feature is not directly exposed to userspace, so I can not test # it directly :(. (It's used to "guarantee no request is in use, so we can # change any data structure of the queue afterward". I.e. freeze, modify the # queue structure, unfreeze). # # However, this lead to a regression with a decent reproducer. In v4.15 the # same interruptible wait was also used for SCSI suspend/resume. SCSI resume # can take a second or so... hence we like to do it asynchronously. This # means we can observe the wait at resume time, and we can test if it is # interruptible. # # Note `echo quiesce > /sys/class/scsi_device/*/device/state` can *not* # trigger the specific wait in the block layer. That code path only # sets the SCSI device state; it does not set any block device state. # (It does not call into blk_queue_freeze() or blk_set_preempt_only(); # it literally just sets sdev->sdev_state to SDEV_QUIESCE). set -o nounset abort() { echo "$*" echo "=== Test ERROR ===" exit 2 } SYSFS_PM_TEST_DELAY=/sys/module/suspend/parameters/pm_test_delay SAVED_PM_TEST_DELAY= # Child process IDs DD= SUBSHELL= cleanup() { # In many cases the subshell will already have exited... # and semantics for `wait` are crappy in shell. # Failure will be harmless in most cases. # Just try to provide enough context for the user to guess. echo "Cleaning up" if [ -n "$SUBSHELL" ]; then echo "Killing sub-shell PID $SUBSHELL..." kill $SUBSHELL wait $SUBSHELL fi if [ -n "$DD" ]; then echo "Killing 'dd' PID $DD..." kill $DD wait $DD fi echo "Resetting pm_test" echo none > /sys/power/pm_test echo "Resetting pm_test_delay" if [ -n "$SAVED_PM_TEST_DELAY" ]; then echo "$SAVED_PM_TEST_DELAY" > "$SYSFS_PM_TEST_DELAY" fi } trap cleanup EXIT # "If a user has disabled async probing a likely reason # is due to a storage enclosure that does not inject # staggered spin-ups. For safety, make resume # synchronous as well in that case." if ! SCAN="$(cat /sys/module/scsi_mod/parameters/scan)"; then abort "error reading '/sys/module/scsi_mod/parameters/scan' ?" fi if [ "$SCAN" != "async" ]; then abort "This test does not work if you have set 'scsi_mod.scan=sync'" fi # Ignore USR1, in the hope that this applies to child processes. # This allows us to safely `kill -USR1 $DD`, when we don't know # whether the child process has fully started yet. # # I think this is the only place I relied on the specific # shell (bash) behaviour. trap "" USR1 # Check dd can work if ! dd iflag=direct if="/dev/$TEST_DEV" of=/dev/null count=1 status=none; then abort "'dd'" fi # Start dd, as a background process which submits IOs and yells when one fails. # We want to hit the block layer, so use direct IO to avoid being served from # page cache. dd iflag=direct if="/dev/$TEST_DEV" of=/dev/null status=none & DD=$! if ! echo devices > /sys/power/pm_test; then abort "Setting pm_test failed, does your kernel lack CONFIG_PM_TEST?" fi if ! SAVED_PM_TEST_DELAY="$(cat "$SYSFS_PM_TEST_DELAY")"; then abort "error reading pm_test_delay" fi if ! echo 0 > "$SYSFS_PM_TEST_DELAY"; then abort "error setting pm_test_delay" fi # Just keep sending signals to 'dd' as long as it's alive. # dd accepts USR1 signal to print status. It doesn't seem to be a problem # that we told dd not to actually *print* anything ('status=none'). # # In theory this script is probably subject to various pid re-use races. # But I started in sh... so far blktests does not depend on python... # also direct IO is best to reproduce this, which is not built in to python. # ( while kill -USR1 $DD 2>/dev/null; do true; done ) & SUBSHELL=$! # Wait a second without suspending, it might pick up typos # or other unexpected errors. sleep 1 if ! kill -0 $DD; then DD= wait $DD || echo "'dd' exited with error" abort "'dd' exited early?" fi if ! kill -0 $SUBSHELL; then SUBSHELL= abort "subshell exited early?" fi # Log that we're suspending. User might not have guessed, # or maybe suspend (or pm_test suspend) is broken on this system. echo "Now simulating suspend/resume" if ! echo mem > /sys/power/state; then abort "system suspend failed or not supported?" fi # Now wait for TEST_DEV to resume asynchronously if ! dd iflag=direct if="/dev/$TEST_DEV" of=/dev/null count=1 status=none; then abort "'dd'" fi # Wait another second. This might be useful in the case dd got blocked on a # page fault during the suspend; it will have a second to get sorted out, # while potentially receiving an IO error and exiting. sleep 1 if ! kill -0 $DD 2>/dev/null; then if wait $DD; then DD= abort "'dd' exited early, without error. Device too tiny?" fi echo "'dd' exited with error" echo "=== Test FAIL ===" DD= exit 1 fi echo "=== Test PASS ==="