Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1757211AbaAITyr (ORCPT ); Thu, 9 Jan 2014 14:54:47 -0500
Received: from smtp.infotech.no ([82.134.31.41]:60677 "EHLO smtp.infotech.no"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755082AbaAITyh (ORCPT ); Thu, 9 Jan 2014 14:54:37 -0500
Message-ID: <52CEFE62.7070009@interlog.com>
Date: Thu, 09 Jan 2014 14:54:10 -0500
From: Douglas Gilbert
Reply-To: dgilbert@interlog.com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Sergey Meirovich, James Smart
CC: Jan Kara, linux-scsi, Linux Kernel Mailing List, Gluk
Subject: Re: Terrible performance of sequential O_DIRECT 4k writes in SAN environment. ~3 times slower then Solars 10 with the same HBA/Storage.
References: <20140106201032.GA13491@quack.suse.cz> <52CC6A53.9010508@emulex.com>
In-Reply-To:
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 14-01-08 08:57 AM, Sergey Meirovich wrote:
> Hi James,
>
> On 7 January 2014 22:57, James Smart wrote:
>> Sergey,
>>
>> The Thor chipset is a bit old - a 4Gig adapter. Most of our performance
>> improvements, including parallelization, have gone into the 8G and 16G
>> adapters. But you still should have seen significantly beyond what you
>> reported.
>
> First of all - thanks a lot!
>
> I took Thor because we have exactly the same Thors in some of our
> Solaris servers. I've also tried 6 different qlogics (mostly 8G) and
> fnic (10G) as well. Surprisingly enough Thor was the fastest one for
> seqwr 4k. Though in most of the cases machines were from our different
> DCs and hence each one connected to yet another storage.
>
>>
>> We did a sanity check some hardware we already had set up with a Thor
>> adapter. We saw 23555 iop/s and 92.1 MB/s without needing to do much, well
>> beyond what you've reported, and still not up to what we know the card can
>> do. There are some inefficiencies from the linux kernel and some locking
>> deltas between our solaris and linux drivers - but not enough to account for
>> what you are seeing.
>>
>> I expect the Direct IO filesystem behavior is the root issue.
>
> The strangest thing to me that this is the problem with sequential
> write. For example the fnic one machine is zoned to EMC XtremIO and
> had results: 14.43Mb/sec 3693.65 Requests/sec for sequential 4k. The
> same fnic machine performed rather impressive for random 4k
> 451.11Mb/sec 115485.02 Requests/sec

You could bypass O_DIRECT and use ddpt together with a bsg
pass-through (bsg is a little faster than sg for these
purposes). For example:

# lsscsi -g
[0:0:0:0]    disk    ATA      INTEL SSDSC2CW12 400i  /dev/sda   /dev/sg0
[14:0:0:0]   disk    Linux    scsi_debug       0004  -          /dev/sg1

# ddpt if=/dev/bsg/14:0:0:0 bs=512 bpt=128 count=1m
Output file not specified so no copy, just reading input
1048576+0 records in
0+0 records out
time to read data: 0.283566 secs at 1893.28 MB/sec

bs= should match the block size of the storage device
and the size of each SCSI READ is dictated by bpt=
(so 64 KB in this case).

Such a test should show you if your performance problem
is in the block layer or below, or above the block layer
(at least the point where pass-through commands are
injected).

Doug Gilbert

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
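
As a rough cross-check of the ddpt figures quoted above, the arithmetic can be
sketched in a few lines of Python. This is only an illustration, not part of
ddpt: the function names are made up here, and it assumes that "MB" in the
ddpt output means 10^6 bytes, which is what the reported rate works out to.

```python
# Hypothetical helpers (names invented for this sketch) that redo the
# arithmetic behind the ddpt example: bs=512 bpt=128 count=1m.

def bytes_per_command(bs: int, bpt: int) -> int:
    """Bytes moved by one SCSI READ: block size times blocks per transfer."""
    return bs * bpt


def throughput_mb_per_s(blocks: int, bs: int, seconds: float) -> float:
    """Throughput in MB/s, assuming 1 MB = 10**6 bytes."""
    return blocks * bs / seconds / 1e6


# Values taken from the ddpt run in the message.
xfer = bytes_per_command(bs=512, bpt=128)
rate = throughput_mb_per_s(blocks=1048576, bs=512, seconds=0.283566)

print(f"{xfer} bytes per READ ({xfer // 1024} KiB)")
print(f"{rate:.2f} MB/s")
```

The same helpers can be pointed at a run with a different bpt= value to see how
large each pass-through command is before comparing rates.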