Subject: Re: Weird issue with epoll and kernel >= 5.0
To: Omar Kilani, linux-kernel@vger.kernel.org
From: Randy Dunlap
Cc: Davidlohr Bueso
Message-ID: <34206eb5-1280-4aac-9a50-76f967646ca1@infradead.org>
Date: Sat, 28 Mar 2020 12:21:55 -0700

On 3/28/20 11:10 AM, Omar Kilani wrote:
> Hi there,
>
> I've observed an issue with epoll and kernels 5.0 and above when a
> system is generating a lot of epoll events.
>
> I see this issue with nginx and jvm / netty based apps (using the
> jvm's native epoll support as well as netty's own optimized epoll
> support) but *not* with haproxy (?).
>
> I'm not really sure what the actual problem is (nginx complains about
> epoll_wait with a generic error), but it doesn't happen on 4.19.x and
> lower.
>
> I thought it was a netty problem at first and opened this ticket:
>
> https://github.com/netty/netty/issues/8999
>
> But then saw the same issue in nginx.
>
> I haven't debugged a kernel issue in something like 20 years so I'm
> not really sure where to start myself.
>
> I'd be more than happy to provide my test case that has a very quick
> repro to anyone who needs it.

Hi,
Please do.

> Also happy to provide a VM/machine with enough CPUs to trigger it
> easily (it seems to happen quicker with more CPUs present) to test
> with.

There have been around 10 changes in fs/eventpoll.c since v5.0 was
released in March, 2019, so it would be helpful if you could test
the latest mainline kernel to see if the problem is still present.

Hm, it looks like you have identified this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.1-rc5&id=c5a282e9635e9c7382821565083db5d260085e3e
as the/a problem.

I have Cc-ed the patch author also.

-- 
~Randy