Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261528AbVAXQpd (ORCPT ); Mon, 24 Jan 2005 11:45:33 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261529AbVAXQpc (ORCPT ); Mon, 24 Jan 2005 11:45:32 -0500 Received: from stat16.steeleye.com ([209.192.50.48]:58010 "EHLO hancock.sc.steeleye.com") by vger.kernel.org with ESMTP id S261528AbVAXQpY (ORCPT ); Mon, 24 Jan 2005 11:45:24 -0500 Subject: Re: [PATCH] Avoiding fragmentation through different allocator From: James Bottomley To: Marcelo Tosatti Cc: Mel Gorman , William Lee Irwin III , Linux Memory Management List , Linux Kernel , Grant Grundler In-Reply-To: <20050124122952.GA5739@logos.cnet> References: <20050120101300.26FA5E598@skynet.csn.ul.ie> <20050121142854.GH19973@logos.cnet> <20050122215949.GD26391@logos.cnet> <20050124122952.GA5739@logos.cnet> Content-Type: text/plain Date: Mon, 24 Jan 2005 10:44:12 -0600 Message-Id: <1106585052.5513.26.camel@mulgrave> Mime-Version: 1.0 X-Mailer: Evolution 2.0.2 (2.0.2-3) Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2274 Lines: 48 On Mon, 2005-01-24 at 10:29 -0200, Marcelo Tosatti wrote: > Since the pages which compose IO operations are most likely sparse (not physically contiguous), > the driver+device has to perform scatter-gather IO on the pages. > > The idea is that if we can have larger memory blocks scatter-gather IO can use less SG list > elements (decreased CPU overhead, decreased device overhead, faster). > > Best scenario is where only one sg element is required (ie one huge physically contiguous block). > > Old devices/unprepared drivers which are not able to perform SG/IO > suffer with sequential small sized operations. > > I'm far away from being a SCSI/ATA knowledgeable person, the storage people can > help with expertise here. > > Grant Grundler and James Bottomley have been working on this area, they might want to > add some comments to this discussion. > > It seems HP (Grant et all) has pursued using big pages on IA64 (64K) for this purpose. Well, the basic advice would be not to worry too much about fragmentation from the point of view of I/O devices. They mostly all do scatter gather (SG) onboard as an intelligent processing operation and they're very good at it. No one has ever really measured an effect we can say "This is due to the card's SG engine". So, the rule we tend to follow is that if SG element reduction comes for free, we take it. The issue that actually causes problems isn't the reduction in processing overhead, it's that the device's SG list is usually finite in size and so it's worth conserving if we can; however it's mostly not worth conserving at the expense of processor cycles. The bottom line is that the I/O (block) subsystem is very efficient at coalescing (both in block space and in physical memory space) and we've got it to the point where it's about as efficient as it can be. If you're going to give us better physical contiguity properties, we'll take them, but if you spend extra cycles doing it, the chances are you'll slow down the I/O throughput path. James - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/