Date: Wed, 18 Feb 2015 22:18:08 +0000
From: Eric Wong
To: Ingo Molnar
Cc: Jason Baron, peterz@infradead.org, mingo@redhat.com,
	viro@zeniv.linux.org.uk, akpm@linux-foundation.org,
	davidel@xmailserver.org, mtk.manpages@gmail.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, Thomas Gleixner, Linus Torvalds,
	Peter Zijlstra
Subject: Re: [PATCH v2 2/2] epoll: introduce EPOLLEXCLUSIVE and EPOLLROUNDROBIN
Message-ID: <20150218221808.GA3799@dcvr.yhbt.net>
References: <7956874bfdc7403f37afe8a75e50c24221039bd2.1424200151.git.jbaron@akamai.com>
	<20150218080740.GA10199@gmail.com>
	<54E4B2D0.8020706@akamai.com>
	<20150218163300.GA28007@gmail.com>
	<54E4CE14.5010708@akamai.com>
	<20150218174533.GB31566@gmail.com>
	<20150218175123.GA31878@gmail.com>
In-Reply-To: <20150218175123.GA31878@gmail.com>

Ingo Molnar wrote:
>
> * Ingo Molnar wrote:
>
> > > [...] However, I think the userspace API change is less
> > > clear since epoll_wait() doesn't currently have an
> > > 'input' events argument as epoll_ctl() does.
> >
> > ... but the change would be a bit clearer and somewhat
> > more flexible: LIFO or FIFO queueing, right?
> >
> > But having the queueing model as part of the epoll
> > context is a legitimate approach as well.
>
> Btw., there's another optimization that the networking code
> already does when processing incoming packets: waking up a
> thread on the local CPU, where the wakeup is running.
>
> Doing the same on epoll would have real scalability
> advantages where incoming events are IRQ driven and are
> distributed amongst multiple CPUs.

Right.  One thing in the back of my mind has been to have CPU
affinity for epoll.  Either having everything in an epoll set
favor a certain CPU or even having affinity down to the epitem
level (so concurrent epoll_wait callers end up favoring the
same epitems).

I'm not convinced this series is worth doing without a comparison
against my previous suggestion to use a dedicated thread which
only makes blocking accept4 + EPOLL_CTL_ADD calls.

The majority of epoll events in a typical server should not be
for listen sockets, so I'd rather not bloat existing code paths
for them.

For web servers nowadays, the benefit of maintaining long-lived
connections to avoid handshakes is even greater with increasing
HTTPS and HTTP/2 adoption, so listen socket events should become
less common.
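
For reference, the dedicated-thread approach mentioned above could look
roughly like the sketch below.  It is only an illustration under stated
assumptions, not code from this thread or from any existing server: the
names struct acceptor, accept_and_add() and acceptor_thread() are
hypothetical, and the choice of EPOLLIN | EPOLLONESHOT for new
connections is an assumption.  The point is that the listen socket stays
blocking and is never added to the epoll set, so worker threads in
epoll_wait() only ever see events for established connections.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical bundle of the two descriptors the acceptor thread needs. */
struct acceptor {
	int listen_fd;	/* blocking listen socket, never added to epoll */
	int epfd;	/* epoll instance shared by the worker threads */
};

/*
 * Block in accept4() until one connection arrives, then register it
 * with the epoll set.  Returns the new fd, or -1 with errno set.
 */
int accept_and_add(const struct acceptor *a)
{
	int fd;

	do {
		fd = accept4(a->listen_fd, NULL, NULL,
			     SOCK_NONBLOCK | SOCK_CLOEXEC);
	} while (fd < 0 && (errno == EINTR || errno == ECONNABORTED));
	if (fd < 0)
		return -1;

	/* Assumption: workers want one-shot readability notifications. */
	struct epoll_event ev = {
		.events = EPOLLIN | EPOLLONESHOT,
		.data = { .fd = fd },
	};
	if (epoll_ctl(a->epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		int saved = errno;

		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}

/* Thread body: loop until accept4() or epoll_ctl() fails for good. */
void *acceptor_thread(void *arg)
{
	const struct acceptor *a = arg;

	assert(a != NULL);
	while (accept_and_add(a) >= 0)
		;
	return NULL;
}
```

A server would start this with something like
pthread_create(&tid, NULL, acceptor_thread, &a) while the worker threads
call epoll_wait() on a.epfd.  A real implementation would likely also
handle EMFILE/ENFILE instead of exiting the loop on the first such error.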