Date: Sat, 27 Apr 2019 09:33:19 +0000
From: Eric Wong
To: Deepa Dinamani, Arnd Bergmann, Davidlohr Bueso, Al Viro, Jason Baron
Cc: linux-kernel@vger.kernel.org, Omar Kilani, linux-fsdevel@vger.kernel.org
Subject: Re: Strange issues with epoll since 5.0
Message-ID: <20190427093319.sgicqik2oqkez3wk@dcvr>
References: <20190424193903.swlfmfuo6cqnpkwa@dcvr>
In-Reply-To: <20190424193903.swlfmfuo6cqnpkwa@dcvr>

Eric Wong wrote:
> Omar Kilani wrote:
> > Hi there,
> >
> > I’m still trying to piece together a reproducible test that triggers
> > this, but I wanted to post in case someone goes “hmmm... change X
> > might have done this”.
>
> Maybe Davidlohr knows, since he's responsible for most of the
> epoll changes in 5.0.

Well, I am not sure if I am hitting the same problem Omar is
hitting. But I did find an epoll_pwait regression in 5.0:

epoll_pwait seems unresponsive to SIGURG in my heavily-parallelized
use case[1] on 5.0.9. I bisected it to commit
854a6ed56839a40f6b5d02a2962f48841482eec4
("signal: Add restore_user_sigmask()")

Just reverting the fs/eventpoll.c change in 854a6ed56 seems
enough to fix the non-responsive epoll_pwait for me.

I have not looked deeply into this, but perhaps the signal_pending
check in restore_user_sigmask is racy w.r.t. epoll. It's been
a while since I've looked at kernel stuff, myself.

Anyways, this revert works; but I'm not 100% sure why...

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index a5d219d920e7..151739d76801 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2247,7 +2247,20 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct epoll_event __user *, events,
 
 	error = do_epoll_wait(epfd, events, maxevents, timeout);
 
-	restore_user_sigmask(sigmask, &sigsaved);
+	/*
+	 * If we changed the signal mask, we need to restore the original one.
+	 * In case we've got a signal while waiting, we do not restore the
+	 * signal mask yet, and we allow do_signal() to deliver the signal on
+	 * the way back to userspace, before the signal mask is restored.
+	 */
+	if (sigmask) {
+		if (error == -EINTR) {
+			memcpy(&current->saved_sigmask, &sigsaved,
+			       sizeof(sigsaved));
+			set_restore_sigmask();
+		} else
+			set_current_blocked(&sigsaved);
+	}
 
 	return error;
 }
@@ -2272,7 +2285,20 @@ COMPAT_SYSCALL_DEFINE6(epoll_pwait, int, epfd,
 
 	err = do_epoll_wait(epfd, events, maxevents, timeout);
 
-	restore_user_sigmask(sigmask, &sigsaved);
+	/*
+	 * If we changed the signal mask, we need to restore the original one.
+	 * In case we've got a signal while waiting, we do not restore the
+	 * signal mask yet, and we allow do_signal() to deliver the signal on
+	 * the way back to userspace, before the signal mask is restored.
+	 */
+	if (sigmask) {
+		if (err == -EINTR) {
+			memcpy(&current->saved_sigmask, &sigsaved,
+			       sizeof(sigsaved));
+			set_restore_sigmask();
+		} else
+			set_current_blocked(&sigsaved);
+	}
 
 	return err;
 }

Comments and/or a proper fix would be greatly appreciated.

[1] my test case is running the cmogstored 1.7.0 test suite in
    an amd64 Debian stable environment. test/mgmt_auto_adjust
    would get stuck and time out after 60s on vanilla v5.0.9

tgz: https://bogomips.org/cmogstored/files/cmogstored-1.7.0.tar.gz

	# Standard autotools install, N=32 or some high-ish number
	./configure
	make -j$N
	make check -j$N

# OR git clone https://bogomips.org/cmogstored.git

So, requoting the rest of Omar's original report here, since
I am not sure if his use case involves epoll_pwait like mine does:

> Omar Kilani wrote:
> > Basically, something’s broken (or at least, has changed enough to
> > cause problems in user space) in epoll since 5.0. It’s still broken in
> > 5.1-rc5.
> >
> > It doesn’t happen 100% of the time. It’s sort of hard to pin down but
> > I’ve observed the following:
> >
> > * nginx not accepting connections under load
> > * A java app which uses netty / NIO having strange writability
> > semantics on channels, which confuses netty / java enough to not
> > properly flush written data on the socket.
> >
> > I went and tested these Linux kernels:
> >
> > 4.20.17
> > 4.19.32
> > 4.14.111
> >
> > And the issue(s) do not show up there.
> >
> > I’m still actively chasing this up, and will report back -- I haven’t
> > touched kernel code in 15 years so I’m a little rusty. :)
> >
> > Regards,
> > Omar
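
For anyone who wants to poke at the userspace side of the epoll_pwait/SIGURG
behaviour described above without building cmogstored, here is a minimal
sketch. It is not the cmogstored test and is not known to reproduce the
regression, which showed up in a heavily-parallelized run. It only shows the
contract the revert restores: SIGURG is blocked except inside epoll_pwait(),
and when it arrives the call should fail with EINTR and the handler should
run. The names epoll_pwait_sees_sigurg and on_sigurg, the 100ms delay, and
the use of fork() to send the signal are all assumptions made for
illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical flag, set from the signal handler. */
static volatile sig_atomic_t got_sigurg;

static void on_sigurg(int signo)
{
	(void)signo;
	got_sigurg = 1;
}

/*
 * Hypothetical helper: returns 1 if epoll_pwait() failed with EINTR and
 * the SIGURG handler ran, 0 if it timed out or returned otherwise, and
 * -1 on a setup error.
 */
int epoll_pwait_sees_sigurg(int timeout_ms)
{
	struct sigaction sa;
	sigset_t block, orig, during_wait;
	struct epoll_event ev;
	pid_t child;
	int epfd, rc, saved_errno;

	/* A finite timeout keeps this from hanging if the wakeup is lost. */
	assert(timeout_ms > 0);

	got_sigurg = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigurg;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGURG, &sa, NULL) < 0)
		return -1;

	/* Block SIGURG everywhere except inside epoll_pwait(). */
	sigemptyset(&block);
	sigaddset(&block, SIGURG);
	if (sigprocmask(SIG_BLOCK, &block, &orig) < 0)
		return -1;
	during_wait = orig;
	sigdelset(&during_wait, SIGURG);

	epfd = epoll_create1(EPOLL_CLOEXEC);
	child = epfd < 0 ? -1 : fork();
	if (child == 0) {
		/* Child: give the parent time to block, then signal it. */
		usleep(100 * 1000);
		kill(getppid(), SIGURG);
		_exit(0);
	}
	if (child < 0) {
		if (epfd >= 0)
			close(epfd);
		sigprocmask(SIG_SETMASK, &orig, NULL);
		return -1;
	}

	/* The epoll set is empty, so only the signal or the timeout ends this. */
	rc = epoll_pwait(epfd, &ev, 1, timeout_ms, &during_wait);
	saved_errno = errno;

	waitpid(child, NULL, 0);
	close(epfd);
	sigprocmask(SIG_SETMASK, &orig, NULL);

	return rc < 0 && saved_errno == EINTR && got_sigurg;
}
```

Calling epoll_pwait_sees_sigurg(2000) from a small main() should return 1 on
a kernel that delivers the signal as the epoll_pwait man page describes.
Whether this simple single-threaded case ever returns 0 on an affected 5.0.x
kernel is not established here.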