Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp160120imu; Mon, 26 Nov 2018 18:45:49 -0800 (PST) X-Google-Smtp-Source: AJdET5cMlTV89xKey+VUd6fy11QvFKPg8MF+QyspOIrln37/kfhR11r1V/LLsCEQk1HVnMP+uMv/ X-Received: by 2002:a62:868b:: with SMTP id x133mr32523040pfd.252.1543286749228; Mon, 26 Nov 2018 18:45:49 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1543286749; cv=none; d=google.com; s=arc-20160816; b=d4vkstajk2XXIaBNkOzw7t85e4W2B24NEHPBX/1qOPCj0o1U+pDzWZJ4MQdMiEqgM/ +9eRKDEZRSRrQXwyXMlOKTWc6kATTOphPX41Hsqlb4QUt9yS8gjlFKQWlnK/G4TUjAdS GuZcPAPE6gVsPEqL1+AM8Q7a6USjA3UQdIsTq11NeMCmGdrSTUCXiu8uQTFQN4LtnHNW iA6DTwLhfFvF2h+tAisUUgOjqAJsO3AA7+AxrPBbtRr5wImpTNptAqGjjMM3IpzBeCrg UeKsYvBQPMIfYWG/iZS+a+FO3x8FLTVsq3cS9/drxhCZDWh6hb7YTpjIxNCDEAZRAN1h 7B/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:dkim-signature; bh=HceQpm+Y1ZbD1rpfaGMY9nQ/A+0TiH92egWM3unopz8=; b=OW+CJGqeEVkjyYP3DNfdOOcicRxsxv9rrhLAL7YjYc38T7jvrARXTSOcd9PhImCcnm L/yz/8Q/a9fmSF/iAsZleR7GDsmBaRLQojPubGqnE8Mg1wsa1mYb8yr6YwlVVNG1vcx4 MI6zL/9XVNKG/iTVVXmEzYxvN+c1sHgBdpF0c94jZyrIM6LHAgYUOFfnA5hy3jHv8+/m APklt5KrGQU7nawPMrNnsFx9N8SogREp9CEAlSaawSuVC8UBfB8/B2nLz/3sXRpSQcin WDxvTq4hMx6jRgHXS4fI6SsNEZzQr7SaoipEw2cOGbFVI9pAAukFZIFbT41CtLKVZPFe 15YA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=A6pFxlxG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id h19si2368400plr.67.2018.11.26.18.45.33; Mon, 26 Nov 2018 18:45:49 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=fail header.i=@gmail.com header.s=20161025 header.b=A6pFxlxG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728147AbeK0NB0 (ORCPT + 99 others); Tue, 27 Nov 2018 08:01:26 -0500 Received: from mail-pg1-f195.google.com ([209.85.215.195]:46360 "EHLO mail-pg1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727544AbeK0NBZ (ORCPT ); Tue, 27 Nov 2018 08:01:25 -0500 Received: by mail-pg1-f195.google.com with SMTP id w7so7057109pgp.13; Mon, 26 Nov 2018 18:05:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=sender:date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=HceQpm+Y1ZbD1rpfaGMY9nQ/A+0TiH92egWM3unopz8=; b=A6pFxlxGVVfx2b+td0C8B5SdPqtGo/XtZQiRHENP40kBUg6yHAJPj4jdiWFRpVRaIn 908iPT9CfyhrKMX4hMfEYzcidqINIem7vrAv/JEPSCLcLGQNQ+5hzUPLKRIWx8jNU89J IpQEDUjTFuaXC5gT5JJJ2MKHZ4+hjvQfWRVV1j6OswjzgTteLgTy52qzJ5FGxIgw0FHR JGbwpAdolgoqWkb9Pj+aNdDSo+j0h8YTd6yJrdQwVVwh9d5uRGg94jLhTCHgHuUqHSEs D0uoIyjzHp4H2HUkAfQaAbi7Z+lCSpE9l98LZZMGEbvo0re3gqZblp8llV47s7x4XBmF vz+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:sender:date:from:to:cc:subject:message-id :references:mime-version:content-disposition:in-reply-to:user-agent; bh=HceQpm+Y1ZbD1rpfaGMY9nQ/A+0TiH92egWM3unopz8=; b=ktHd9SB0qTn0+nw/2/q8X2jnJdede6hxe8+EBiuQU2B+IMTyKOFZdH960edke8+1YR aGcNCJMe4m7b2iYG1etAEfm/yce+azkR0kmlaAcZ3p2ioFfzuIS3D2DWskW9n3KkO48J aIbZWq72U9nKwXIQg+OoL2LxvgHbOwFWAb3cBz6Al2e+U2TiKA8q+VGcT+yIZJh0X4vy /z7imeUQtX8TdN5XFCJXPidQwlbd8omdyEjVNuiYUz+hUNtMcDK4uOccSdfe92FO1ZnZ TTGsu7B0Ztp1rUM33lVKkb/1JdKeyyMgi38s3eTvjDCgIhrwkSW0DcnzkRTZ3MQ0ce2a eq/w== X-Gm-Message-State: AGRZ1gLLBEqSqrw6bN+N6/DAN/DQJ7w1HJnLg9msInEWltowNOx3P5Kh Pd1mDmyM1MjaWEu8u4ys4jh3XrmSvMk= X-Received: by 2002:a62:c21c:: with SMTP id l28mr30562786pfg.74.1543284312016; Mon, 26 Nov 2018 18:05:12 -0800 (PST) Received: from google.com ([2401:fa00:d:0:98f1:8b3d:1f37:3e8]) by smtp.gmail.com with ESMTPSA id s37sm3061284pgm.19.2018.11.26.18.05.09 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Mon, 26 Nov 2018 18:05:10 -0800 (PST) Date: Tue, 27 Nov 2018 11:05:06 +0900 From: Minchan Kim To: Andrew Morton Cc: LKML , Sergey Senozhatsky , stable@vger.kernel.org Subject: Re: [PATCH v2 1/7] zram: fix lockdep warning of free block handling Message-ID: <20181127020506.GA237537@google.com> References: <20181126082813.81977-1-minchan@kernel.org> <20181126082813.81977-2-minchan@kernel.org> <20181126124928.8fbbf01966b741ac79a3d003@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20181126124928.8fbbf01966b741ac79a3d003@linux-foundation.org> User-Agent: Mutt/1.10.1+60 (6df12dc1) (2018-08-07) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Nov 26, 2018 at 12:49:28PM -0800, Andrew Morton wrote: > On Mon, 26 Nov 2018 17:28:07 +0900 Minchan Kim wrote: > > > > > ... > > > > With writeback feature, zram_slot_free_notify could be called > > in softirq context by end_swap_bio_read. However, bitmap_lock > > is not aware of that so lockdep yell out. Thanks. > > > > The problem is not only bitmap_lock but it is also zram_slot_lock > > so straightforward solution would disable irq on zram_slot_lock > > which covers every bitmap_lock, too. > > Although duration disabling the irq is short in many places > > zram_slot_lock is used, a place(ie, decompress) is not fast > > enough to hold irqlock on relying on compression algorithm > > so it's not a option. > > > > The approach in this patch is just "best effort", not guarantee > > "freeing orphan zpage". If the zram_slot_lock contention may happen, > > kernel couldn't free the zpage until it recycles the block. However, > > such contention between zram_slot_free_notify and other places to > > hold zram_slot_lock should be very rare in real practice. > > To see how often it happens, this patch adds new debug stat > > "miss_free". > > > > It also adds irq lock in get/put_block_bdev to prevent deadlock > > lockdep reported. The reason I used irq disable rather than bottom > > half is swap_slot_free_notify could be called with irq disabled > > so it breaks local_bh_enable's rule. The irqlock works on only > > writebacked zram slot entry so it should be not frequent lock. > > > > Cc: stable@vger.kernel.org # 4.14+ > > Signed-off-by: Minchan Kim > > --- > > drivers/block/zram/zram_drv.c | 56 +++++++++++++++++++++++++---------- > > drivers/block/zram/zram_drv.h | 1 + > > 2 files changed, 42 insertions(+), 15 deletions(-) > > > > diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c > > index 4879595200e1..472027eaed60 100644 > > --- a/drivers/block/zram/zram_drv.c > > +++ b/drivers/block/zram/zram_drv.c > > @@ -53,6 +53,11 @@ static size_t huge_class_size; > > > > static void zram_free_page(struct zram *zram, size_t index); > > > > +static int zram_slot_trylock(struct zram *zram, u32 index) > > +{ > > + return bit_spin_trylock(ZRAM_LOCK, &zram->table[index].value); > > +} > > + > > static void zram_slot_lock(struct zram *zram, u32 index) > > { > > bit_spin_lock(ZRAM_LOCK, &zram->table[index].value); > > @@ -443,29 +448,45 @@ static ssize_t backing_dev_store(struct device *dev, > > > > static unsigned long get_entry_bdev(struct zram *zram) > > { > > - unsigned long entry; > > + unsigned long blk_idx; > > + unsigned long ret = 0; > > > > - spin_lock(&zram->bitmap_lock); > > /* skip 0 bit to confuse zram.handle = 0 */ > > - entry = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1); > > - if (entry == zram->nr_pages) { > > - spin_unlock(&zram->bitmap_lock); > > - return 0; > > + blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1); > > + if (blk_idx == zram->nr_pages) > > + goto retry; > > + > > + spin_lock_irq(&zram->bitmap_lock); > > + if (test_bit(blk_idx, zram->bitmap)) { > > + spin_unlock_irq(&zram->bitmap_lock); > > + goto retry; > > } > > > > - set_bit(entry, zram->bitmap); > > - spin_unlock(&zram->bitmap_lock); > > + set_bit(blk_idx, zram->bitmap); > > Here we could do > > if (test_and_set_bit(...)) { > spin_unlock(...); > goto retry; > > But it's weird to take the spinlock on behalf of bitops which are > already atomic! > > It seems rather suspicious to me. Why are we doing this? What I need is look_up_and_set operation. I don't see there is an atomic operation for that. But I want to minimize irq disabled area so first, it scans the bit lockless and if race happens, i can try under the lock. It seems __set_bit is enough under the lock. > > > + ret = blk_idx; > > + goto out; > > +retry: > > + spin_lock_irq(&zram->bitmap_lock); > > + blk_idx = find_next_zero_bit(zram->bitmap, zram->nr_pages, 1); > > + if (blk_idx == zram->nr_pages) > > + goto out; > > + > > + set_bit(blk_idx, zram->bitmap); > > + ret = blk_idx; > > +out: > > + spin_unlock_irq(&zram->bitmap_lock); > > > > - return entry; > > + return ret; > > } > > > > static void put_entry_bdev(struct zram *zram, unsigned long entry) > > { > > int was_set; > > + unsigned long flags; > > > > - spin_lock(&zram->bitmap_lock); > > + spin_lock_irqsave(&zram->bitmap_lock, flags); > > was_set = test_and_clear_bit(entry, zram->bitmap); > > - spin_unlock(&zram->bitmap_lock); > > + spin_unlock_irqrestore(&zram->bitmap_lock, flags); > > Here's another one. Surely that locking is unnecessary. Indeed! although get_entry_bdev side can miss some bits, it's not a critical problem. Benefit is we might remove irq disable for the lockdep problem. Yes, I will cook and test. Thanks, Andrew.