From: "Rafael J. Wysocki"
Date: Fri, 17 May 2024 21:08:40 +0200
Subject: Re: [regression] suspend stress test stalls within 30 minutes
To: Kalle Valo
Cc: Dave Hansen, Borislav Petkov, Pawan Gupta, Thomas Gleixner, Ingo Molnar, Dave Hansen, "Rafael J. Wysocki", x86@kernel.org, linux-pm@vger.kernel.org, linux-kernel@vger.kernel.org, regressions@lists.linux.dev, Jeff Johnson
In-Reply-To: <87eda0cljg.fsf@kernel.org>

On Fri, May 17, 2024 at 8:59 PM Kalle Valo wrote:
>
> Dave Hansen writes:
>
> > On 5/17/24 11:37, Kalle Valo wrote:
> >> While writing this email I found another way to continue the suspend
> >> after a stall: terminate rtcwake with CTRL-C in the ssh session running
> >> the for loop. That explains why 'sudo shutdown -h now' makes the suspend
> >> go forward, it most likely kills the stalled rtcwake process.
> >
> > Could we try and figure out what rtcwake is doing during its stall? A
> > couple of ideas:
> >
> > You could strace it to see if it's hung in the kernel:
> >
> >     strace -o strace.log rtcwake ...
> >
> > You could look at its stack in /proc, like this:
> >
> >     # cat /proc/`pidof sleep`/stack
> >     [<0>] hrtimer_nanosleep+0xb5/0x190
> >     [<0>] common_nsleep+0x44/0x50
> >     [<0>] __x64_sys_clock_nanosleep+0xcb/0x140
> >     [<0>] do_syscall_64+0x65/0x140
> >     [<0>] entry_SYSCALL_64_after_hwframe+0x6e/0x76
> >
> > Or you can use sysrq:
> >
> >     echo t > /proc/sysrq-trigger
> >
> > to get *all* tasks' stacks dumped out to dmesg.
> >
> > I'd probably do all three in that order.
> >
> > Getting a function-graph trace of rtcwake during the stall would also be
> > nice, but that's a lot of data so let's try the easier things first.
>
> I can do all that but most probably not this week. Luckily it's quite
> easy to reproduce the bug, one time I even saw it in the first iteration
> and usually within 15 minutes or so.
>
> And do let me know if there's anything else I should try.

My somewhat educated guess is that pm_notifier_call_chain_robust()
blocks for you, so you can add debug printk()s around the call to this
in suspend_prepare().

It is also possible that pm_prepare_console() does something weird and
your description of the problem indicates that it doesn't get to user
space freezing.
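
For convenience, the /proc and sysrq checks quoted above could be wrapped
in a small helper script along the following lines. This is only a rough
sketch, not something posted in the thread: the function names and the
PROC_ROOT override are made up for illustration (PROC_ROOT exists only so
the helpers can be pointed at a scratch directory), and on a real system
both functions need root.

```shell
#!/bin/sh
# Sketch only: dump_kernel_stack, dump_all_tasks and PROC_ROOT are
# hypothetical names, not part of rtcwake, procps or the kernel.

# Defaults to the real /proc; override to try the helpers on dummy data.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the in-kernel stack of a single PID, i.e. the contents of
# /proc/<pid>/stack. Reading that file normally requires root.
dump_kernel_stack() {
    pid="$1"
    stack_file="$PROC_ROOT/$pid/stack"
    if [ ! -r "$stack_file" ]; then
        echo "cannot read $stack_file (not root, or process gone?)" >&2
        return 1
    fi
    echo "== kernel stack of pid $pid =="
    cat "$stack_file"
}

# Ask the kernel to dump *all* tasks' stacks to the kernel log
# (sysrq 't'); the result is then read back with dmesg.
dump_all_tasks() {
    echo t > "$PROC_ROOT/sysrq-trigger"
}
```

During a stall one would then run something like
dump_kernel_stack "$(pidof -s rtcwake)" from a second ssh session,
followed by dump_all_tasks and dmesg if the single stack is not
conclusive.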