From: Nikos Tsironis
To: snitzer@redhat.com, agk@redhat.com, dm-devel@redhat.com
Cc: mpatocka@redhat.com, paulmck@linux.ibm.com, hch@infradead.org, iliastsi@arrikto.com, linux-kernel@vger.kernel.org
Subject: [PATCH v3 0/6] dm snapshot: Improve performance using a more fine-grained locking scheme
Date: Sun, 17 Mar 2019 14:22:52 +0200
Message-Id: <20190317122258.21760-1-ntsironis@arrikto.com>

dm-snapshot uses a single mutex to serialize every access to the
snapshot state, including accesses to the exception hash tables. This
mutex is a bottleneck that prevents dm-snapshot from scaling as the
number of threads doing IO increases.

The major contention points are __origin_write()/snapshot_map() and
pending_complete(), i.e., the submission and completion of pending
exceptions.

This patchset substitutes the single mutex with:

  * A read-write semaphore, which protects the mostly-read fields of the
    snapshot structure.

  * Per-bucket bit spinlocks, which protect accesses to the exception
    hash tables.

fio benchmarks using the null_blk device show significant performance
improvements as the number of worker processes increases. Write latency
is almost halved and write IOPS are nearly doubled.
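
As a rough illustration of the per-bucket bit spinlock idea, here is a
minimal userspace sketch in C11. It is not the code from this series and
does not use the kernel's list_bl.h API; every name in it (struct bucket,
bucket_lock(), table_insert(), NBUCKETS, ...) is hypothetical. It only
shows the general technique, assuming node pointers are at least 2-byte
aligned: bit 0 of each bucket's head pointer doubles as that bucket's
lock, so threads working on different buckets never contend.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define NBUCKETS 16u                 /* hypothetical table size */
#define LOCK_BIT ((uintptr_t)1)      /* bit 0 of the head word is the lock */

struct node {
	struct node *next;
	unsigned long key;
};

/* One word per bucket: the list head pointer with the lock in bit 0. */
struct bucket {
	_Atomic uintptr_t head;
};

static struct bucket table[NBUCKETS]; /* zero-initialized: empty, unlocked */

static void bucket_lock(struct bucket *b)
{
	/* Spin until this thread is the one that flips bit 0 from 0 to 1. */
	while (atomic_fetch_or_explicit(&b->head, LOCK_BIT,
					memory_order_acquire) & LOCK_BIT)
		;
}

static void bucket_unlock(struct bucket *b)
{
	/* Clear only the lock bit; the head pointer bits are preserved. */
	atomic_fetch_and_explicit(&b->head, ~LOCK_BIT, memory_order_release);
}

/* Caller must hold the bucket lock. Masks the lock bit off the pointer. */
static struct node *bucket_first(struct bucket *b)
{
	uintptr_t v = atomic_load_explicit(&b->head, memory_order_relaxed);

	return (struct node *)(v & ~LOCK_BIT);
}

/* Caller must hold the bucket lock. Keeps the lock bit set in the head. */
static void bucket_set_first(struct bucket *b, struct node *n)
{
	assert(((uintptr_t)n & LOCK_BIT) == 0); /* alignment assumption */
	atomic_store_explicit(&b->head, (uintptr_t)n | LOCK_BIT,
			      memory_order_relaxed);
}

/* Insert a key at the front of its bucket; returns 0 on success. */
static int table_insert(unsigned long key)
{
	struct bucket *b = &table[key % NBUCKETS];
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->key = key;

	bucket_lock(b);
	n->next = bucket_first(b);
	bucket_set_first(b, n);
	bucket_unlock(b);
	return 0;
}

/* Look a key up while holding only its own bucket's lock. */
static int table_contains(unsigned long key)
{
	struct bucket *b = &table[key % NBUCKETS];
	int found = 0;

	bucket_lock(b);
	for (struct node *n = bucket_first(b); n; n = n->next) {
		if (n->key == key) {
			found = 1;
			break;
		}
	}
	bucket_unlock(b);
	return found;
}
```

Folding the lock into the head pointer keeps each bucket at one word,
which matters when a table has many buckets. The cost is that every
update of the head must preserve the lock bit, as bucket_set_first()
does in this sketch.
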

The relevant patch provides detailed benchmark results.

A summary of the patchset follows:

  1. The first patch removes an unnecessary use of WRITE_ONCE() in
     hlist_add_behind().

  2. The second patch adds two helper functions to linux/list_bl.h,
     which are used to implement the per-bucket bit spinlocks in
     dm-snapshot.

  3. The third patch removes the need to sleep holding the snapshot lock
     in pending_complete(), thus allowing us to replace the mutex with
     the per-bucket bit spinlocks.

  4. Patches 4, 5 and 6 change the locking scheme, as described
     previously.

Changes in v3:
  - Don't use WRITE_ONCE() in hlist_bl_add_behind(), as it's not needed.
  - Fix hlist_add_behind() to also not use WRITE_ONCE().
  - Use uintptr_t instead of unsigned long in hlist_bl_add_before().

v2: https://www.redhat.com/archives/dm-devel/2019-March/msg00007.html

Changes in v2:
  - Split third patch of v1 into three patches: 3/5, 4/5, 5/5.

v1: https://www.redhat.com/archives/dm-devel/2018-December/msg00161.html

Nikos Tsironis (6):
  list: Don't use WRITE_ONCE() in hlist_add_behind()
  list_bl: Add hlist_bl_add_before/behind helpers
  dm snapshot: Don't sleep holding the snapshot lock
  dm snapshot: Replace mutex with rw semaphore
  dm snapshot: Make exception tables scalable
  dm snapshot: Use fine-grained locking scheme

 drivers/md/dm-exception-store.h |   3 +-
 drivers/md/dm-snap.c            | 359 +++++++++++++++++++++++++++-------------
 include/linux/list.h            |   2 +-
 include/linux/list_bl.h         |  26 +++
 4 files changed, 269 insertions(+), 121 deletions(-)

-- 
2.11.0