From: Daniel Phillips
To: Evgeniy Polyakov
Cc: Jens Axboe, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Peter Zijlstra
Subject: Re: Block device throttling [Re: Distributed storage.]
Date: Mon, 13 Aug 2007 06:04:06 -0700
Message-Id: <200708130604.07154.phillips@phunq.net>
In-Reply-To: <20070813121802.GB5992@2ka.mipt.ru>
References: <20070731171347.GA14267@2ka.mipt.ru>
    <200708130418.03667.phillips@phunq.net>
    <20070813121802.GB5992@2ka.mipt.ru>

On Monday 13 August 2007 05:18, Evgeniy Polyakov wrote:
> > Say you have a device mapper device with some physical device
> > sitting underneath, the classic use case for this throttle code.
> > Say 8,000 threads each submit an IO in parallel. The device mapper
> > mapping function will be called 8,000 times with associated
> > resource allocations, regardless of any throttling on the physical
> > device queue.
>
> Each thread will sleep in generic_make_request(), if limit is
> specified correctly, then allocated number of bios will be enough to
> have a progress.

The problem is, the sleep does not occur before the virtual device
mapping function is called.
Let's consider two devices, a physical device named pdev and a virtual
device sitting on top of it called vdev. vdev's throttle limit is just
one element, but we will see that in spite of this, two bios can be
handled by the vdev's mapping method before any IO completes, which
violates the throttling rules. According to your patch it works like
this:

    Thread 1                            Thread 2

    <no wait because
     vdev->bio_queued is zero>

    vdev->q->bio_queued++

    blk_set_bdev(bio, pdev)
       vdev->bio_queued--

                                        <no wait because
                                         vdev->bio_queued is zero>

                                        vdev->q->bio_queued++

                                        whoops!

Our virtual device mapping function has now allocated resources for two
in-flight bios in spite of having its throttle limit set to 1.

Perhaps you never worried about the resources that the device mapper
mapping function allocates to handle each bio and so did not consider
this hole significant. These resources can be significant, as is the
case with ddsnap. It is essential to close that window through which
the virtual device's queue limit may be violated. Not doing so will
allow deadlock.

Regards,

Daniel
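
For anyone who wants to step through the interleaving described above,
here is a small userspace sketch of it. It is not kernel code and not
the patch under discussion: struct queue, may_submit(), vdev_map(),
submit() and simulate_interleaving() are all hypothetical names made up
for illustration, and the only behavior modeled is what this mail
states, namely that the throttle count is dropped when the bio is
retargeted to pdev while the mapping resources stay allocated until the
IO completes.

```c
#include <assert.h>

/* Hypothetical stand-in for a request queue with a bio throttle. */
struct queue {
    int bio_queued;  /* bios currently charged against the limit */
    int bio_limit;   /* throttle limit */
};

struct sim {
    struct queue vdev_q;
    int map_resources;       /* mapping allocations not yet released */
    int peak_map_resources;  /* highest value map_resources reached */
};

/* Throttle check: a real submitter would sleep when this returns 0. */
static int may_submit(const struct queue *q)
{
    return q->bio_queued < q->bio_limit;
}

/* Models the virtual device's mapping method. */
static void vdev_map(struct sim *s)
{
    /* Per-bio resources are held until the IO completes. */
    s->map_resources++;
    if (s->map_resources > s->peak_map_resources)
        s->peak_map_resources = s->map_resources;

    /* Models blk_set_bdev(bio, pdev): vdev's count drops right away,
     * long before the IO completes. */
    assert(s->vdev_q.bio_queued > 0);
    s->vdev_q.bio_queued--;
}

/* Returns 1 if the bio got through, 0 if it would have been throttled. */
static int submit(struct sim *s)
{
    if (!may_submit(&s->vdev_q))
        return 0;
    s->vdev_q.bio_queued++;
    vdev_map(s);
    return 1;
}

/* Two submissions with no IO completion in between. */
int simulate_interleaving(void)
{
    struct sim s;

    s.vdev_q.bio_queued = 0;
    s.vdev_q.bio_limit = 1;
    s.map_resources = 0;
    s.peak_map_resources = 0;

    submit(&s);  /* thread 1 */
    submit(&s);  /* thread 2: the check passes again */

    return s.peak_map_resources;
}
```

With a limit of 1, simulate_interleaving() reports that two sets of
mapping resources were outstanding at once. Under the same assumptions,
if the decrement in vdev_map() were deferred until IO completion, the
second submit() would be refused instead.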