Received: by 10.223.185.111 with SMTP id b44csp1403336wrg; Sat, 10 Mar 2018 05:15:42 -0800 (PST) X-Google-Smtp-Source: AG47ELtEzsGpIviyQigErxIVxEowi3NqjCobIoVb3tkSco/2Lq4kriv31sJAYz/hzzZN9C0EGBjp X-Received: by 2002:a17:902:20e5:: with SMTP id v34-v6mr2131282plg.66.1520687742432; Sat, 10 Mar 2018 05:15:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520687742; cv=none; d=google.com; s=arc-20160816; b=ipFrMdWog3KcGjIx40ivRS8hBohHeNDPp+CDiqY18qtJbfgESKSFuwuVJ82UCklxeE clz8B+o5Sa2jxKZ4R+ye8fCLSWIoNp30oMuW91Ptp+WkIbJYHtMRewxHUTrtAyzYYVOG sxi7IIAVKOwWEj5RdqYXjfy3AvAPTty92Hy2sMHoHpIhH0vYHuWUgdiVxZjz8BJyxQwF /XgVxT3Es+paCRZp30ZG726a4v+Ml4fX7RkQwKs2ZX8kzzKXxG9gFTsL6YEOlgvBJLqu X6y7cnidAys1mb++i/ILLxH81IsIzid4tpv3k3INb5XWULQwC6mXZedhHxrC0dbqF1If Ytyw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :arc-authentication-results; bh=2fg9s1GeRco5sp8UoTUF3YOCOxySNX1vCiA8qYvdgB0=; b=NonwJZ/APrUsR0X3ZLC5ymy5DkPQPfJoulep9B1QI2prgLUIEsT9M+BFstyI/6nPEe bnR1owtYkBzYxqFQsW0/OOuUDo9UQXkjICkceOA7GAw3/AzI1lkT3hCcFi8AxGk3/bmh the9WndPnFqvVBvRq12nlju/sbrB0lLp3Gl5oBspaHf987jvGBIBF+okicQz5njS3/7g +iV/8aIROXdyeNiFtP/mNBdRj3psyCuydAL6NNnWRbb6HD683l2xibmzaBXxxa2Cc8L7 mOK3j8VQ1mPEcP4lNEuC0f0i2ZGjZUmvjxzRuYFB+ViHXvw65CYHwrfNOvtoDFMbISbb Pcdg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j2si2640411pff.214.2018.03.10.05.14.43; Sat, 10 Mar 2018 05:15:42 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932998AbeCJNMj (ORCPT + 99 others); Sat, 10 Mar 2018 08:12:39 -0500 Received: from 9.mo173.mail-out.ovh.net ([46.105.72.44]:40688 "EHLO 9.mo173.mail-out.ovh.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932976AbeCJNMi (ORCPT ); Sat, 10 Mar 2018 08:12:38 -0500 X-Greylist: delayed 1800 seconds by postgrey-1.27 at vger.kernel.org; Sat, 10 Mar 2018 08:12:38 EST Received: from player728.ha.ovh.net (unknown [10.109.105.52]) by mo173.mail-out.ovh.net (Postfix) with ESMTP id 40323AFA57 for ; Sat, 10 Mar 2018 13:32:57 +0100 (CET) Received: from bahia.lan (lns-bzn-46-82-253-208-248.adsl.proxad.net [82.253.208.248]) (Authenticated sender: groug@kaod.org) by player728.ha.ovh.net (Postfix) with ESMTPSA id B2D6954009A; Sat, 10 Mar 2018 13:32:45 +0100 (CET) Date: Sat, 10 Mar 2018 13:32:45 +0100 From: Greg Kurz To: Andrew Morton Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, v9fs-developer@lists.sourceforge.net, Eric Van Hensbergen , Ron Minnich , Latchesar Ionkov , "David S. Miller" , Yiwen Jiang Subject: Re: [PATCH] net/9p: avoid -ERESTARTSYS leak to userspace Message-ID: <20180310133245.2fedabe6@bahia.lan> In-Reply-To: <20180309141252.ee4aea14ec7c32226c480b23@linux-foundation.org> References: <152062809886.10599.7361006774123053312.stgit@bahia.lan> <20180309141252.ee4aea14ec7c32226c480b23@linux-foundation.org> X-Mailer: Claws Mail 3.15.1-dirty (GTK+ 2.24.31; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Ovh-Tracer-Id: 15021756558728993267 X-VR-SPAMSTATE: OK X-VR-SPAMSCORE: -100 X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedtfedrkeelgdefjecutefuodetggdotefrodftvfcurfhrohhfihhlvgemucfqggfjpdevjffgvefmvefgnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Andrew, Thank you very much for taking care of this. Please find my answers to your remarks below. On Fri, 9 Mar 2018 14:12:52 -0800 Andrew Morton wrote: > On Fri, 09 Mar 2018 21:41:38 +0100 Greg Kurz wrote: > > > If it was interrupted by a signal, the 9p client may need to send some > > more requests to the server for cleanup before returning to userspace. > > > > To avoid such a last minute request to be interrupted right away, the > > client memorizes if a signal is pending, clears TIF_SIGPENDING, handles > > the request and calls recalc_sigpending() before returning. > > > > Unfortunately, if the transmission of this cleanup request fails for any > > reason, the transport returns an error and the client propagates it right > > away, without calling recalc_sigpending(). > > > > This ends up with -ERESTARTSYS from the initially interrupted request > > crawling up to syscall exit, with TIF_SIGPENDING cleared by the cleanup > > request. The specific signal handling code, which is responsible for > > converting -ERESTARTSYS to -EINTR is not called, and userspace receives > > the confusing errno value: > > > > open: Unknown error 512 (512) > > > > This is really hard to hit in real life. I discovered the issue while > > working on hot-unplug of a virtio-9p-pci device with an instrumented > > QEMU allowing to control request completion. > > > > Both p9_client_zc_rpc() and p9_client_rpc() functions have this buggy > > error path actually. Their code flow is a bit obscure and the best > > thing to do would probably be a full rewrite: to really ensure this > > situation of clearing TIF_SIGPENDING and returning -ERESTARTSYS can > > never happen. > > > > But given the general lack of interest for the 9p code, I won't risk > > breaking more things. So this patch simply fixes the buggy paths in > > both functions with a trivial label+goto. > > > > Thanks to Laurent Dufour for his help and suggestions on how to find > > the root cause and how to fix it. > > That's a fairly straightforward-looking bug. However the code still > looks a bit racy: > > > : if (signal_pending(current)) { > : sigpending = 1; > : clear_thread_flag(TIF_SIGPENDING); > : } else > : sigpending = 0; > : > > what happens if signal_pending(current) becomes true right here? > Depending on the transport c->trans_mod->request() could possibly return -ERESTARTSYS, and we would then recalc_sigpending() and propagate -ERESTARTSYS to the caller. > : err = c->trans_mod->request(c, req); > > > I'm surprised that the 9p client is mucking with signals at all. > Signals are a userspace IPC scheme and kernel code should instead be > using the more powerful messaging mechanisms which we've developed. > Ones which don't disrupt userspace state. > > Why is this happening? Is there some userspace/kernel interoperation > involved? > Before commit 9523feac272ccad2ad8186ba4fcc89103754de52, we used to rely on wait_event_interruptible() instead of wait_event_killable(). IIUC, the purpose of all this sigpending tweaking was just to allow subsequent cleanup related requests to be issued, without them being interrupted right away because of the initial signal. BTW the issue tentatively fixed by commit 9523feac272ccad2ad8186ba4fcc89103754de52 was more deeply investigated since then. It was caused by a bug in QEMU that got fixed, and will be available in the next release (2.12). And to cope with existing buggy QEMUs, we can assume that a -EINTR response from the server necessarily means the corresponding request has been successfully canceled. Cheers, -- Greg