Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751677AbdFFQeJ (ORCPT ); Tue, 6 Jun 2017 12:34:09 -0400 Received: from mx2.suse.de ([195.135.220.15]:40153 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751453AbdFFQeG (ORCPT ); Tue, 6 Jun 2017 12:34:06 -0400 Date: Tue, 6 Jun 2017 18:34:01 +0200 From: "Luis R. Rodriguez" To: Martin Fuzzey , Matthew Wilcox , Alan Cox , Jonathan Corbet , "Michael Kerrisk (man-pages)" , Linux API , fsdevel@vger.kernel.org Cc: "Luis R. Rodriguez" , David Howells , Dmitry Torokhov , Peter Zijlstra , "Eric W. Biederman" , Andy Lutomirski , Greg KH , Daniel Wagner , David Woodhouse , jewalt@lgsinnovations.com, rafal@milecki.pl, Arend Van Spriel , "Rafael J. Wysocki" , "Li, Yi" , atull@opensource.altera.com, Moritz Fischer , Petr Mladek , Johannes Berg , Emmanuel Grumbach , Luca Coelho , Kalle Valo , Linus Torvalds , Kees Cook , AKASHI Takahiro , Peter Jones , Hans de G oede , "Ted Ts'o" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH v2] firmware: fix sending -ERESTARTSYS due to signal on fallback Message-ID: <20170606163401.GA27288@wotan.suse.de> References: <87fufr3mdy.fsf@xmission.com> <20170526194640.GS8951@wotan.suse.de> <20170526215518.GB40877@dtor-ws> <20170605202410.GQ8951@wotan.suse.de> <59367025.3020901@parkeon.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <59367025.3020901@parkeon.com> User-Agent: Mutt/1.6.0 (2016-04-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 6139 Lines: 140 Adding fsdevel for review on the correct semantics of handling signals on write(), in this case a sysfs write which triggered a sync request firmware call and what the firmware API should return in such case of a signal (I gather this should be -EINTR and not -ERESTARTSYS). Also whether or not SIGINT should be followed or if only allowing SIGKILL is fine (fine by me, but it would change old behaviour). Hoping between fsdevel and linux-api folks we can hash this out. On Tue, Jun 06, 2017 at 11:04:37AM +0200, Martin Fuzzey wrote: > On 05/06/17 22:24, Luis R. Rodriguez wrote: > > > > > > For these two reasons then it would seem best we do two things actually: > > > > 1) return -EINTR instead of -EAGAIN when we detect swait_event_interruptible_timeout() > > got interrupted by a signal (it returns -ERESTARTSYS) > > > I disagree. That would force userspace to handle the signal rather than > having the kernel retry. > > From Documentation/DocBook/kernel-hacking.tmpl: > > After you slept you should check if a signal occurred: the > Unix/Linux way of handling signals is to temporarily exit the > system call with the -ERESTARTSYS error. The > system call entry code will switch back to user context, process > the signal handler and then your system call will be restarted > (unless the user disabled that). So you should be prepared to > process the restart, e.g. if you're in the middle of manipulating > some data structure. This applies but you are missing my point that the LWN article [0] I referred to also stated "Kernel code which uses interruptible sleeps must always check to see whether it woke up as a result of a signal, and, if so, clean up whatever it was doing and return -EINTR back to user space." -- I realize there may be contradiction with above documentation -- this perhaps can be clarified with fsdevel folks *but* regardless of that the same article notes Alan Cox explains that "Unix tradition (and thus almost all applications) believe file store writes to be non signal interruptible. It would not be safe or practical to change that guarantee." So for this reason alone there does seem to be an exemption to the above documentation worth noting for file store writes, and the patch which you tested below *moves* the sysfs write op for firmware in that direction by adding a new killable swait. [0] https://lwn.net/Articles/288056/ > > 2) Do as you note below and add wait_event_killable_timeout() > > Hum, I do think that would be better but, (please correct me if I'm wrong) > the _killable_ variants only allow SIGKILL (and not SIGINT). That seems correct given a TASK_KILLABLE is also TASK_UNINTERRUPTIBLE. > 0cb64249ca "firmware_loader: abort request if wait_for_completion is > interrupted" > > specifically mentrions ctrl-c (SIGINT) in the commit message so that would > no longer work. Great point, but it *also* allowed SIGKILL, so I do feel the goal was also to allow it to be killable. I'm afraid that patch probably did not get proper review from sufficient folks and its worth now asking ourselves what we'd like to do. I'm fine with letting go of SIGINT for firmware sysfs calls for the sake of keeping with the long standing unix tradition on write, given we *still have SIGKILL*. > Myself I think having to use kill -9 to interrupt firmware loading by a > usespace helper is OK but others may disagree. Its why I added fsdevel as well. This is really a semantics and uapi question. Between fsdevel and linux-api folks I would hope we can come to a sensible resolution. > > I do not see why we could not introduce wait_event_killable_timeout() > > and swait_event_killable_timeout() into -stables. > > After seeing how simple it is to do so I tend to agree. Greg, Peter, > > what are your thoughts ? > > > > Martin Fuzzey can you test this patch as an alternative to your issue ? > > > > diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c > > index b9f907eedbf7..70fc42e5e0da 100644 > > --- a/drivers/base/firmware_class.c > > +++ b/drivers/base/firmware_class.c > > @@ -131,7 +131,7 @@ static int __fw_state_wait_common(struct fw_state *fw_st, long timeout) > > { > > long ret; > > - ret = swait_event_interruptible_timeout(fw_st->wq, > > + ret = swait_event_killable_timeout(fw_st->wq, > > __fw_state_is_done(READ_ONCE(fw_st->status)), > > timeout); > > if (ret != 0 && fw_st->status == FW_STATUS_ABORTED) > > diff --git a/include/linux/swait.h b/include/linux/swait.h > > index c1f9c62a8a50..9c5ca2898b2f 100644 > > --- a/include/linux/swait.h > > +++ b/include/linux/swait.h > > @@ -169,4 +169,29 @@ do { \ > > __ret; \ > > }) > > +#define __swait_event_killable(wq, condition) \ > > + (void)___swait_event(wq, condition, TASK_KILLABLE, 0, schedule()) > > + > > +#define swait_event_killable(wq, condition) \ > > +({ \ > > + int __ret = 0; \ > > + if (!(condition)) \ > > + __ret = __swait_event_killable(wq, condition); \ > > + __ret; \ > > +}) > > + > > +#define __swait_event_killable_timeout(wq, condition, timeout) \ > > + ___swait_event(wq, ___wait_cond_timeout(condition), \ > > + TASK_INTERRUPTIBLE, timeout, \ > > + __ret = schedule_timeout(__ret)) > > + > > Should be TASK_KILLABLE above Oops yes sorry. > > +#define swait_event_killable_timeout(wq, condition, timeout) \ > > +({ \ > > + long __ret = timeout; \ > > + if (!___wait_cond_timeout(condition)) \ > > + __ret = __swait_event_killable_timeout(wq, \ > > + condition, timeout); \ > > + __ret; \ > > +}) > > + > > #endif /* _LINUX_SWAIT_H */ > > > > Luis > > After replacing TASK_INTERRUPTIBLE with TASK_KILLABLE above it works for me. Great, thanks for testing. Luis