Subject: RE: [RFC] Introduce to batch variants of accept() and epoll_ctl() syscall
Date: Fri, 15 Jun 2012 09:35:29 +0100
From: "David Laight"
To: "Li Yu", "Linux Netdev List"
Cc: "Linux Kernel Mailing List"
In-Reply-To: <4FDAB652.6070201@gmail.com>

> We encounter a performance problem in a large scale computer
> cluster, which needs to handle a lot of incoming concurrent TCP
> connection requests.
>
> The top shows the kernel is most cpu hog, the testing is simple,
> just a accept() -> epoll_ctl(ADD) loop, the ratio of cpu util sys% to
> si% is about 2:5.
>
> I also asked some experienced webserver/proxy developers in my team
> for suggestions, it seem that behavior of many userland programs
> already called accept() multiple times after it is waked up by
> epoll_wait(). And the common action is adding the fd that accept()
> return into epoll interface by epoll_ctl() syscall then.
>
> Therefore, I think that we'd better to introduce to batch variants of
> accept() and epoll_ctl() syscall, just like sendmmsg() or recvmmsg().
...

Having seen the support added to NetBSD for sendmmsg() and recvmmsg()
(and I'm told the Linux code is much the same), I'm surprised that
just cutting out a system call entry/exit and fd lookup is significant
above the rest of the costs involved in sending a message (which I
presume is UDP here).

I'd be even more surprised if it is significant for an incoming
connection.

	David
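
For readers following the thread, the accept() -> epoll_ctl(ADD) loop
described in the quoted message can be sketched roughly as below. This
is a minimal illustration in Python on Linux, not code from the thread:
the names accept_and_register and demo are hypothetical, and the
loopback listener exists only to make the sketch self-contained. Each
accepted connection costs at least one accept() call and one
epoll_ctl(EPOLL_CTL_ADD) call, which is the per-connection pair the RFC
proposes to batch.

```python
import select
import socket


def accept_and_register(listener, ep):
    """Drain the listen backlog, adding every new fd to the epoll set.

    Hypothetical helper: one accept() and one epoll_ctl(ADD) per
    connection, repeated until accept() reports EAGAIN.
    """
    conns = []
    while True:
        try:
            conn, _addr = listener.accept()          # accept() syscall
        except BlockingIOError:                      # EAGAIN: backlog is empty
            break
        conn.setblocking(False)
        ep.register(conn.fileno(), select.EPOLLIN)   # epoll_ctl(EPOLL_CTL_ADD)
        conns.append(conn)
    return conns


def demo(n_clients=3):
    """Connect n_clients loopback clients and return how many were accepted."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))                  # any free loopback port
    listener.listen(16)
    listener.setblocking(False)

    ep = select.epoll()
    ep.register(listener.fileno(), select.EPOLLIN)

    clients = [socket.create_connection(listener.getsockname())
               for _ in range(n_clients)]

    accepted = []
    for _ in range(10):                              # bounded wait, no busy loop
        if len(accepted) >= n_clients:
            break
        for fd, _events in ep.poll(1.0):             # epoll_wait()
            if fd == listener.fileno():
                accepted.extend(accept_and_register(listener, ep))

    count = len(accepted)
    for sock in accepted + clients + [listener]:
        sock.close()
    ep.close()
    return count
```

Calling demo(3) wakes up once or more from epoll_wait() and then
accepts all three pending connections; the work done inside
accept_and_register is what a batched variant would fold into fewer
kernel entries.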