Date: Tue, 13 Mar 2018 22:32:28 +0800
From: Ming Lei
To: Martin Steigerwald
Cc: Hans de Goede, Linux Kernel Mailing List, Thorsten Leemhuis, Tejun Heo, linux-block@vger.kernel.org, Bart Van Assche, linux-scsi@vger.kernel.org, "Martin K. Petersen", "James E.J. Bottomley"
Subject: Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts
Message-ID: <20180313143222.GA10883@ming.t460p>
References: <27165802.vQ9JbjrmvU@merkaba> <2276139.2HCKFmVDEL@merkaba>
In-Reply-To: <2276139.2HCKFmVDEL@merkaba>

On Tue, Mar 13, 2018 at 02:08:23PM +0100, Martin Steigerwald wrote:
> Hans de Goede - 11.03.18, 15:37:
> > Hi Martin,
> >
> > On 11-03-18 09:20, Martin Steigerwald wrote:
> > > Hello.
> > >
> > > Since 4.16-rc4 (upgraded from 4.15.2 which worked) I have an issue
> > > with SMART checks occasionally failing like this:
> > >
> > > smartd[28017]: Device: /dev/sdb [SAT], is in SLEEP mode, suspending checks
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/INTEL_SSDSA2CW300G3_[…]: Error updating
> > > SMART data: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#0120000: 0e 09 0c 00 00 00 ff 00 00 00 00 00 00 00 50
> > > 00 ..............P.#0120010: 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > 00 00 00 ................#012 (g-io-error-quark, 0) merkaba
> > > udisksd[24408]: Error performing housekeeping for drive
> > > /org/freedesktop/UDisks2/drives/Crucial_CT480M500SSD3_[…]: Error updating
> > > SMART dat a: Error sending ATA command CHECK POWER MODE: Unexpected sense
> > > data returned:#0120000: 01 00 1d 00 00 00 0e 09 0c 00 00 00 ff 00 00
> > > 00 ................#0120010: 00 0 0 00 00 50 00 00 00 00 00 00 00
> > > 00 00 00 00 ....P...........#012 (g-io-error-quark, 0)
> > >
> > > (Intel SSD is connected via SATA, Crucial via mSATA in a ThinkPad T520)
> > >
> > > However, when I then check manually with smartctl -a | -x | -H the device
> > > reports SMART data just fine.
> > >
> > > As smartd correctly detects that the device is in sleep mode, this may be
> > > a userspace issue in udisksd.
> > >
> > > Also at some boot attempts the boot hangs with a message like "could not
> > > connect to lvmetad, scanning manually for devices". I use BTRFS RAID 1
> > > on two LVs (each on one of the SSDs). A configuration that requires a
> > > manual adaptation to InitRAMFS in order to boot (basically vgchange -ay
> > > before btrfs device scan).
> > >
> > > I wonder whether that has to do with the new SATA LPM policy stuff, but as
> > > I had issues with
> > >
> > > 3 => Medium power with Device Initiated PM enabled
> > >
> > > (machine did not boot, which could also have been caused by me
> > > accidentally removing all TCP/IP network support in the kernel with
> > > that setting)
> > >
> > > I set it back to
> > >
> > > CONFIG_SATA_MOBILE_LPM_POLICY=0
> > >
> > > (firmware settings)
> >
> > Right, so at that setting the LPM policy changes are effectively
> > disabled and cannot explain your SMART issues.
>
> Yes, I now got a photo of one of those boot failures I mentioned, and it seems
> to be related to blk-mq, as the backtrace contains "blk_mq_terminate_expired".
>
> I add the screenshot to my bug report.
>
> [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and
> boot failures with blk_mq_terminate_expired in backtrace
> https://bugzilla.kernel.org/show_bug.cgi?id=199077
>
> Hans, I will test your LPM policy horkage for Crucial m500 patch at a later
> time. I first wanted to add the photo of the boot failure to the bug report.
>
> Ming and Bart, I added you to cc, cause I had to do with you about another
> blk-mq report, please feel free to adapt.

Looks like RIP points to scsi_times_out+0x17/0x1d0, maybe a SCSI regression?

Thanks,
Ming