Date: Mon, 28 May 2007 11:32:00 +0900
Message-ID: <87lkf9brvj.wl%takeuchi_satoru@jp.fujitsu.com>
From: Satoru Takeuchi
To: linux-kernel@vger.kernel.org, Ingo Molnar
Subject: [BUG] signal: multithread program returns with wrong errno on receiving SIGSTOP

Hi,

I found a bug in the signal subsystem. If a multithreaded program has a
thread blocked in a system call, that system call returns with a wrong errno
when the program receives SIGSTOP followed by SIGCONT.

Arch dependency
===============

 Reproduced: i386, ia64
 Unknown:    any other arch

# I suspect that this problem occurs on any arch...

How to reproduce
================

This procedure needs 2 terminals, term A and term B.

 1. Run the test program (attached to this mail) from term A.

    $ ./mt-wrong-errno

 2. Send SIGSTOP to the test program from term B.

    $ ps a
    ...
     9685 pts/3    Sl+    0:00 ./mt-wrong-errno
     9688 pts/4    R+     0:00 ps a
    $ kill -STOP 9685

 3. Send SIGCONT to the test program from term B.

    $ kill -CONT 9685

Expected Result
===============

`./mt-wrong-errno' restarts the read syscall (and is stopped again on
receiving SIGTTIN).

Actual Result
=============

`./mt-wrong-errno' returns with the wrong errno 512, which means ERESTARTSYS.
The comment in include/linux/errno.h says "this errno should never be seen by
user program." So it's definitely a bug.

output of term A
----------------

    $ ./mt-wrong-errno

    [1]+  Stopped                 ./mt-wrong-errno
    $ errno = 512

output of term B
----------------

    $ ps a
    ...
     9685 pts/3    Sl+    0:00 ./mt-wrong-errno
     9688 pts/4    R+     0:00 ps a
    $ kill -STOP 9685
    $ kill -CONT 9685

This problem can always be reproduced, so it is not a timing issue. I suspect
that something is wrong with the group_stop_count counting mechanism and that
signal handling is canceled erroneously.

Any hints, or a patch itself, are welcome.

Thanks,

Satoru

-------------------------------------------------------------------------------
/*
 * mt-wrong-errno.c
 *
 * Copyright 2007 Satoru Takeuchi
 *
 * This software may be used and distributed according to the terms
 * of the GNU General Public License, incorporated herein by reference.
 *
 */

#include <err.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Block in read() on stdin and report the errno if the call fails. */
void *thread_fn(void *arg)
{
	char c;
	int ret;

	ret = read(STDIN_FILENO, &c, 1);
	if (ret < 0)
		printf("errno = %d\n", errno);

	return NULL;
}

int main(int argc, char *argv[])
{
	pthread_t t;

	/* err() and warn() append a newline themselves. */
	if (pthread_create(&t, NULL, thread_fn, NULL))
		err(EXIT_FAILURE, "pthread_create() failed");
	if (pthread_join(t, NULL))
		warn("pthread_join() failed");

	return 0;
}