Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp3348968ybg; Mon, 28 Oct 2019 11:23:48 -0700 (PDT) X-Google-Smtp-Source: APXvYqw0f5nUmvi1Erv4I3icAyCokkBhYgOpzci3MGfXUaEBrGDyB6JOl3esRBr+AWJvWC3Zp7QE X-Received: by 2002:aa7:d95a:: with SMTP id l26mr11579636eds.12.1572287028787; Mon, 28 Oct 2019 11:23:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1572287028; cv=none; d=google.com; s=arc-20160816; b=hYhlq9zrqB5F3FQuTohyPh54H8phdblmMWLV+DUYtnN+928fzCkpttbWjZiWrf3ZY4 mfB1ZfgZm1x3cx8Uyd6thcnbAknUaCT3GiaZl9ItdDAxjIPr55yhmsaQV+tfuTMWM3ie gFKuMpTtubN8SVP5gQwqDiCjaaMj5sSv5vAsKzblFS4Lj5SGqNXkhwzcTFe0/R4Wo6r8 d4gYeQ1xNPAnZaan/LhSR9dtawghd/6Ionaj/RxnCsnQDwAWVJr9n/8xWnq2MYapj4NK PPRU5itR86ChgETVjcH2rUHjUpe8ri6BO1yPuloem+BZ+AhXZSd87uzggbbiS4ZfmXqj k+NQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:to:subject:message-id:date:from :mime-version:dkim-signature; bh=+J6HJzJSTXLwW+2B+amFF2JyA8VeAPbIhBSjpfVOGwQ=; b=Y41Ac1pnviH837P8tvZ2zAE0mcyujcH1PXxObO9xzWLD6xXu7snvKQ9cg1rHBa2dcJ VfvzRUpDSL18gefl1zin6RdY4VwQpsWoglhRL1OnbQN97GthIeSoG+FIZWeuYV0un2YY 9n4mrXw5ZgJJ3TnSSXIbNLgBQhI+TxThmFF5627JRy6W9ZI2l/pRdEOVZHBGNWNGBJbJ Wtdbt/Pr8PXRrIi91ToIouPVvqr/A7+WcHAn4k6lE1ne4xN39ha7Bvt8W1RLpZwSGhQq 4LhuIgMRJEHiHQSJ7/gfpH3j87x90bJo1ox6oI0FnUsrZhutkOiOYtbL8pm0UD3174s4 UhPQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@googlemail.com header.s=20161025 header.b="Xk/vMcjp"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=googlemail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id k21si6372263ejx.384.2019.10.28.11.23.24; Mon, 28 Oct 2019 11:23:48 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@googlemail.com header.s=20161025 header.b="Xk/vMcjp"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=googlemail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1733228AbfJ1IJx (ORCPT + 99 others); Mon, 28 Oct 2019 04:09:53 -0400 Received: from mail-ua1-f65.google.com ([209.85.222.65]:46362 "EHLO mail-ua1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729786AbfJ1IJw (ORCPT ); Mon, 28 Oct 2019 04:09:52 -0400 Received: by mail-ua1-f65.google.com with SMTP id o4so1147328uat.13 for ; Mon, 28 Oct 2019 01:09:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=20161025; h=mime-version:from:date:message-id:subject:to; bh=+J6HJzJSTXLwW+2B+amFF2JyA8VeAPbIhBSjpfVOGwQ=; b=Xk/vMcjpXfBASXEhn6lcrfhQjnAOagsIbQPgsDgTb87Pe1de50ed8ql9SZpqlDSi5e kiE3yzi16D58CjTS/0DUZ9heft3huPvobpdhuD0xwA2xp3WS4fiKZj23D+CMGkLHMilr vsfCmDMfBFdbaVXJ6qKtMuULt9vSifXDZpaOFkJqwOy8gtuqSyl7fgebJkXCpcBg9z4U QFsP+z4pLnY7T8pNJ9uwqOrrU1/4wonTNq6puqBHqAbRoATkAKaUDUcEYRJNxUVyAMDa FkoBGQOocs2H/CS4W0rDJxu+2KWUmg0sgvOCeECnDsQVv20WlvcUkLgW9aI9lGPfSN1E XJ1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:from:date:message-id:subject:to; bh=+J6HJzJSTXLwW+2B+amFF2JyA8VeAPbIhBSjpfVOGwQ=; b=O6aWrYsv1KMvIFr+E9iKNNqsmGwcEtVgurmNULrBeyLtx5VujAiXEiSfK0zVPEeYTf R9TUY5iQKvtGqejjqyUOJdUZBlTJOdGRYCtyR7xBKnYaW0Vd+/kFJ9mBZ0LOTuusxTnV 8wi1E292++31DcGwWvSWTUnkp0snpdmfhiIwZvYvRtbCoWlRnXyjQYyu8vw6dDs/Fzkk tNJKHkJjaIOFBxVoem5MsB3lh/hVvXIZDNLWyzYWxyZyEzugkKUFWUPY+viZy29+mNMk fHB5dyMd2OPhwOgh4ztCOlfCWlE7V2upvcZzqDLMj2uL5IAJ7F8ciDzzEe8ab1HLbqWM 4EjQ== X-Gm-Message-State: APjAAAXx38fXAEpzDB2g22DnT6BDwIRzQdJm/BJnc1eCqmeI1ucn2ATy fyzIFQU6y0pkKgHsgpG6rHWud/Y2FZwbJcfX80K3oINT X-Received: by 2002:ab0:55c8:: with SMTP id w8mr7291486uaa.66.1572250189669; Mon, 28 Oct 2019 01:09:49 -0700 (PDT) MIME-Version: 1.0 From: Stephan Date: Mon, 28 Oct 2019 09:09:38 +0100 Message-ID: Subject: Process waiting on NFS transitions to uninterruptable sleep when receiving a signal with custom signal handler To: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello everyone, I have asked this question on Stackoverflow a while ago but unfortunately nobody had an idea on this. I am currently doing some research on how we can extend the monitoring solution for Linux in our datacenter in order to detect inaccessible NFS mounts. My idea was to look for NFS mounts in /proc/self/mountinfo and then for each mount, call alarm(), issue a syncronous interruptible call via stat()/fsstat() or similar, and in case of an alarm, return an error in the signal handler. However, I experienced the following behaviour which I am not sure how to explain or debug. It turned out that when a process waiting in the stat system call on a mountpoint of a diconnected NFS server, it responds to signals as expected. For example, one can exit it pressing Strc+C, or it displays "Alarm clock" and ends when the alarm timer fires. The same applies e.g. to SIGUSR1/2, leading the program to display "User defined signal 1" (or "2") and end. I suspect these messages come from a general signal dispatcher inside glibc, but it would be nice to hear some details on how this works. In all cases in which a custom signal handler was registered, the process transitions to an uninterruptible sleep state when a signal for this custom handler is scheduled; leading to no other signal being processed anymore. Of course this applies to SIGALRM as well when the alarm() timer sends the signal. All signals show up in /proc/PID/status as below: Threads: 1 SigQ: 4/31339 SigPnd: 0000000000000000 ShdPnd: 0000000000002a02 SigBlk: 0000000000000000 SigIgn: 0000000000000000 SigCgt: 0000000000000200 I looked at the information from "echo w > /proc/sysrq-trigger" but there is nothing of help to me: [26099350.815187] signal D 0000000000000000 0 49633 39989 0x00000084 [26099350.815193] ffff880001d27b88 0000000000000046 ffff88008a8184c0 ffff880001d28000 [26099350.815199] ffff880001d27c18 ffffffff81e0a168 ffffffffa03d1df0 0000000000000000 [26099350.815204] ffff880001d27ba0 ffffffff81619dd5 ffff88008a8184c0 0000000000000082 [26099350.815209] Call Trace: [26099350.815213] [] schedule+0x35/0x80 [26099350.815223] [] rpc_wait_bit_killable+0x1e/0xa0 [sunrpc] [26099350.815227] [] __wait_on_bit+0x5a/0x90 [26099350.815231] [] out_of_line_wait_on_bit+0x6e/0x80 [26099350.815242] [] __rpc_execute+0x14e/0x450 [sunrpc] [26099350.815251] [] rpc_run_task+0x69/0x80 [sunrpc] [26099350.815259] [] nfs4_call_sync_sequence+0x56/0x80 [nfsv4] [26099350.815267] [] _nfs4_proc_getattr+0xb0/0xc0 [nfsv4] [26099350.815279] [] nfs4_proc_getattr+0x53/0xd0 [nfsv4] [26099350.815288] [] __nfs_revalidate_inode+0x94/0x2a0 [nfs] [26099350.815296] [] nfs_getattr+0x7e/0x250 [nfs] [26099350.815303] [] vfs_fstatat+0x5a/0x90 [26099350.815306] [] SYSC_newstat+0x1a/0x40 [26099350.815312] [] entry_SYSCALL_64_fastpath+0x20/0xe9 [26099350.817782] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x20/0xe9 It is also not possible to access anything in usermode as it is not possible to attach a debugger. The development happened on SLES 12 SP4, Kernel version 4.4.162-94.72-default. I am attaching some C and bash code for reproduction, the issue can be triggered with SIGUSR1 (kill -USR1 PID) or any other one with changes to the code. As for C, there is no difference in using signal() or sigaction() to install the handler. The handlers are deliberately left empty to be sure the is no "forbidden" function called inside. Regards, Stephan C-Code: ================ #include #include void sig_handler(int sig) { } int main(void) { int ret; struct stat buf; signal(SIGUSR1, sig_handler); alarm(30); ret = stat("/a", &buf); return 0; } bash-Code: ================= #!/bin/bash sighandler() { declare unused } trap sighandler USR1 [[ -d /a ]] && echo "stat() returned"