Message-ID: <23358.29981.262730.50194@quad.stoffel.home>
Date: Thu, 5 Jul 2018 15:44:29 -0400
From: "John Stoffel"
To: Mikulas Patocka
Cc: Jens Axboe, linux-mm@kvack.org, linux-block@vger.kernel.org,
    Mike Snitzer, linux-kernel@vger.kernel.org, mhocko@kernel.org,
    dm-devel@redhat.com, xia, "Alasdair G. Kergon"
Subject: Re: [dm-devel] [PATCH] mm: set PF_LESS_THROTTLE when allocating memory for i/o
X-Mailing-List: linux-kernel@vger.kernel.org

>>>>> "Mikulas" == Mikulas Patocka writes:

Mikulas> It has been noticed that congestion throttling can slow down
Mikulas> allocations path that participate in the IO and thus help the
Mikulas> memory reclaim. Stalling those allocation is therefore not
Mikulas> productive. Moreover mempool allocator and md variants of the
Mikulas> same already implement their own throttling which has a
Mikulas> better way to be feedback driven. Stalling at the page
Mikulas> allocator is therefore even counterproductive.

Can you show numbers for this claim? I would think that throttling
needs to be done as close to the disk as possible, and propagate back
up the layers to have this all work well, so that faster devices (and
the layers stacked on them) will work better without stalling on a
slow device.

Mikulas> PF_LESS_THROTTLE is a task flag denoting allocation context that is
Mikulas> participating in the memory reclaim which fits into these IO paths
Mikulas> model, so use the flag and make the page allocator aware they are
Mikulas> special and they do not really want any dirty data throttling.

Mikulas> The throttling causes stalls on Android - it uses the dm-verity driver
Mikulas> that uses dm-bufio. Allocations in dm-bufio were observed to sleep in
Mikulas> wait_iff_congested repeatedly.
Mikulas> Signed-off-by: Mikulas Patocka
Mikulas> Acked-by: Michal Hocko # mempool_alloc and bvec_alloc
Mikulas> Cc: stable@vger.kernel.org
Mikulas> ---
Mikulas>  block/bio.c                   |    4 ++++
Mikulas>  drivers/md/dm-bufio.c         |   14 +++++++++++---
Mikulas>  drivers/md/dm-crypt.c         |    8 ++++++++
Mikulas>  drivers/md/dm-integrity.c     |    4 ++++
Mikulas>  drivers/md/dm-kcopyd.c        |    3 +++
Mikulas>  drivers/md/dm-verity-target.c |    4 ++++
Mikulas>  drivers/md/dm-writecache.c    |    4 ++++
Mikulas>  mm/mempool.c                  |    4 ++++
Mikulas>  8 files changed, 42 insertions(+), 3 deletions(-)
Mikulas> Index: linux-2.6/mm/mempool.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/mm/mempool.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/mm/mempool.c 2018-06-29 03:47:16.270000000 +0200
Mikulas> @@ -369,6 +369,7 @@ void *mempool_alloc(mempool_t *pool, gfp
Mikulas>  	unsigned long flags;
Mikulas>  	wait_queue_entry_t wait;
Mikulas>  	gfp_t gfp_temp;
Mikulas> +	unsigned old_flags;
Mikulas>  	VM_WARN_ON_ONCE(gfp_mask & __GFP_ZERO);
Mikulas>  	might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
Mikulas> @@ -381,7 +382,10 @@ void *mempool_alloc(mempool_t *pool, gfp
Mikulas>  repeat_alloc:
Mikulas> +	old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +	current->flags |= PF_LESS_THROTTLE;
Mikulas>  	element = pool->alloc(gfp_temp, pool->pool_data);
Mikulas> +	current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  	if (likely(element != NULL))
Mikulas>  		return element;
Mikulas> Index: linux-2.6/block/bio.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/block/bio.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/block/bio.c 2018-06-29 03:47:16.270000000 +0200
Mikulas> @@ -217,6 +217,7 @@ fallback:
Mikulas>  	} else {
Mikulas>  		struct biovec_slab *bvs = bvec_slabs + *idx;
Mikulas>  		gfp_t __gfp_mask = gfp_mask & ~(__GFP_DIRECT_RECLAIM | __GFP_IO);
Mikulas> +		unsigned old_flags;
Mikulas>  		/*
Mikulas>  		 * Make this allocation restricted and don't dump info on
Mikulas> @@ -229,7 +230,10 @@ fallback:
Mikulas>  		 * Try a slab allocation. If this fails and __GFP_DIRECT_RECLAIM
Mikulas>  		 * is set, retry with the 1-entry mempool
Mikulas>  		 */
Mikulas> +		old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas>  		bvl = kmem_cache_alloc(bvs->slab, __gfp_mask);
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  		if (unlikely(!bvl && (gfp_mask & __GFP_DIRECT_RECLAIM))) {
Mikulas>  			*idx = BVEC_POOL_MAX;
Mikulas>  			goto fallback;
Mikulas> Index: linux-2.6/drivers/md/dm-bufio.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-bufio.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/drivers/md/dm-bufio.c 2018-06-29 03:47:16.270000000 +0200
Mikulas> @@ -356,6 +356,7 @@ static void __cache_size_refresh(void)
Mikulas>  static void *alloc_buffer_data(struct dm_bufio_client *c, gfp_t gfp_mask,
Mikulas>  			       unsigned char *data_mode)
Mikulas>  {
Mikulas> +	void *ptr;
Mikulas>  	if (unlikely(c->slab_cache != NULL)) {
Mikulas>  		*data_mode = DATA_MODE_SLAB;
Mikulas>  		return kmem_cache_alloc(c->slab_cache, gfp_mask);
Mikulas> @@ -363,9 +364,14 @@ static void *alloc_buffer_data(struct dm
Mikulas>  	if (c->block_size <= KMALLOC_MAX_SIZE &&
Mikulas>  	    gfp_mask & __GFP_NORETRY) {
Mikulas> +		unsigned old_flags;
Mikulas>  		*data_mode = DATA_MODE_GET_FREE_PAGES;
Mikulas> -		return (void *)__get_free_pages(gfp_mask,
Mikulas> +		old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas> +		ptr = (void *)__get_free_pages(gfp_mask,
Mikulas>  						c->sectors_per_block_bits - (PAGE_SHIFT - SECTOR_SHIFT));
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas> +		return ptr;
Mikulas>  	}
Mikulas>  	*data_mode = DATA_MODE_VMALLOC;
Mikulas> @@ -381,8 +387,10 @@ static void *alloc_buffer_data(struct dm
Mikulas>  	 */
Mikulas>  	if (gfp_mask & __GFP_NORETRY) {
Mikulas>  		unsigned noio_flag = memalloc_noio_save();
Mikulas> -		void *ptr = __vmalloc(c->block_size, gfp_mask, PAGE_KERNEL);
Mikulas> -
Mikulas> +		unsigned old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas> +		ptr = __vmalloc(c->block_size, gfp_mask, PAGE_KERNEL);
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  		memalloc_noio_restore(noio_flag);
Mikulas>  		return ptr;
Mikulas>  	}
Mikulas> Index: linux-2.6/drivers/md/dm-integrity.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-integrity.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/drivers/md/dm-integrity.c 2018-06-29 03:47:16.270000000 +0200
Mikulas> @@ -1318,6 +1318,7 @@ static void integrity_metadata(struct wo
Mikulas>  	int r;
Mikulas>  	if (ic->internal_hash) {
Mikulas> +		unsigned old_flags;
Mikulas>  		struct bvec_iter iter;
Mikulas>  		struct bio_vec bv;
Mikulas>  		unsigned digest_size = crypto_shash_digestsize(ic->internal_hash);
Mikulas> @@ -1331,8 +1332,11 @@ static void integrity_metadata(struct wo
Mikulas>  		if (unlikely(ic->mode == 'R'))
Mikulas>  			goto skip_io;
Mikulas> +		old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas>  		checksums = kmalloc((PAGE_SIZE >> SECTOR_SHIFT >> ic->sb->log2_sectors_per_block) * ic->tag_size + extra_space,
Mikulas>  				    GFP_NOIO | __GFP_NORETRY | __GFP_NOWARN);
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  		if (!checksums)
Mikulas>  			checksums = checksums_onstack;
Mikulas> Index: linux-2.6/drivers/md/dm-kcopyd.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-kcopyd.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/drivers/md/dm-kcopyd.c 2018-06-29 03:47:16.270000000 +0200
Mikulas> @@ -245,7 +245,10 @@ static int kcopyd_get_pages(struct dm_kc
Mikulas>  	*pages = NULL;
Mikulas>  	do {
Mikulas> +		unsigned old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas>  		pl = alloc_pl(__GFP_NOWARN | __GFP_NORETRY | __GFP_KSWAPD_RECLAIM);
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  		if (unlikely(!pl)) {
Mikulas>  			/* Use reserved pages */
Mikulas>  			pl = kc->pages;
Mikulas> Index: linux-2.6/drivers/md/dm-verity-target.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-verity-target.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/drivers/md/dm-verity-target.c 2018-06-29 03:47:16.280000000 +0200
Mikulas> @@ -596,9 +596,13 @@ no_prefetch_cluster:
Mikulas>  static void verity_submit_prefetch(struct dm_verity *v, struct dm_verity_io *io)
Mikulas>  {
Mikulas>  	struct dm_verity_prefetch_work *pw;
Mikulas> +	unsigned old_flags;
Mikulas> +	old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +	current->flags |= PF_LESS_THROTTLE;
Mikulas>  	pw = kmalloc(sizeof(struct dm_verity_prefetch_work),
Mikulas>  		GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
Mikulas> +	current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  	if (!pw)
Mikulas>  		return;
Mikulas> Index: linux-2.6/drivers/md/dm-writecache.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-writecache.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/drivers/md/dm-writecache.c 2018-06-29 03:47:16.280000000 +0200
Mikulas> @@ -1473,6 +1473,7 @@ static void __writecache_writeback_pmem(
Mikulas>  	unsigned max_pages;
Mikulas>  	while (wbl->size) {
Mikulas> +		unsigned old_flags;
Mikulas>  		wbl->size--;
Mikulas>  		e = container_of(wbl->list.prev, struct wc_entry, lru);
Mikulas>  		list_del(&e->lru);
Mikulas> @@ -1486,6 +1487,8 @@ static void __writecache_writeback_pmem(
Mikulas>  		bio_set_dev(&wb->bio, wc->dev->bdev);
Mikulas>  		wb->bio.bi_iter.bi_sector = read_original_sector(wc, e);
Mikulas>  		wb->page_offset = PAGE_SIZE;
Mikulas> +		old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas>  		if (max_pages <= WB_LIST_INLINE ||
Mikulas>  		    unlikely(!(wb->wc_list = kmalloc(max_pages * sizeof(struct wc_entry *),
Mikulas>  							   GFP_NOIO | __GFP_NORETRY |
Mikulas> @@ -1493,6 +1496,7 @@ static void __writecache_writeback_pmem(
Mikulas>  			wb->wc_list = wb->wc_list_inline;
Mikulas>  			max_pages = WB_LIST_INLINE;
Mikulas>  		}
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  		BUG_ON(!wc_add_block(wb, e, GFP_NOIO));
Mikulas> Index: linux-2.6/drivers/md/dm-crypt.c
Mikulas> ===================================================================
Mikulas> --- linux-2.6.orig/drivers/md/dm-crypt.c 2018-06-29 03:47:16.290000000 +0200
Mikulas> +++ linux-2.6/drivers/md/dm-crypt.c 2018-06-29 03:47:16.280000000 +0200
Mikulas> @@ -2181,12 +2181,16 @@ static void *crypt_page_alloc(gfp_t gfp_
Mikulas>  {
Mikulas>  	struct crypt_config *cc = pool_data;
Mikulas>  	struct page *page;
Mikulas> +	unsigned old_flags;
Mikulas>  	if (unlikely(percpu_counter_compare(&cc->n_allocated_pages, dm_crypt_pages_per_client) >= 0) &&
Mikulas>  	    likely(gfp_mask & __GFP_NORETRY))
Mikulas>  		return NULL;
Mikulas> +	old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +	current->flags |= PF_LESS_THROTTLE;
Mikulas>  	page = alloc_page(gfp_mask);
Mikulas> +	current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  	if (likely(page != NULL))
Mikulas>  		percpu_counter_add(&cc->n_allocated_pages, 1);
Mikulas> @@ -2893,7 +2897,10 @@ static int crypt_map(struct dm_target *t
Mikulas>  	if (cc->on_disk_tag_size) {
Mikulas>  		unsigned tag_len = cc->on_disk_tag_size * (bio_sectors(bio) >> cc->sector_shift);
Mikulas> +		unsigned old_flags;
Mikulas> +		old_flags = current->flags & PF_LESS_THROTTLE;
Mikulas> +		current->flags |= PF_LESS_THROTTLE;
Mikulas>  		if (unlikely(tag_len > KMALLOC_MAX_SIZE) ||
Mikulas>  		    unlikely(!(io->integrity_metadata = kmalloc(tag_len,
Mikulas>  				GFP_NOIO | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN)))) {
Mikulas> @@ -2902,6 +2909,7 @@ static int crypt_map(struct dm_target *t
Mikulas>  			io->integrity_metadata = mempool_alloc(&cc->tag_pool, GFP_NOIO);
Mikulas>  			io->integrity_metadata_from_pool = true;
Mikulas>  		}
Mikulas> +		current_restore_flags(old_flags, PF_LESS_THROTTLE);
Mikulas>  	}
Mikulas>  	if (crypt_integrity_aead(cc))
Mikulas> --
Mikulas> dm-devel mailing list
Mikulas> dm-devel@redhat.com
Mikulas> https://www.redhat.com/mailman/listinfo/dm-devel
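
For readers following the thread: every hunk in the quoted patch repeats the same three steps around an allocation (save the current state of PF_LESS_THROTTLE, set the flag, restore the saved state afterwards). The sketch below models that pattern in user space so it can be compiled and stepped through outside the kernel. It is only an illustration under stated assumptions: `struct task`, `less_throttle_save()`, `less_throttle_restore()`, `mock_alloc()` and `alloc_less_throttled()` are hypothetical names that do not appear in the patch, the flag value is illustrative, and `restore_flags()` is a stand-in that mirrors what the kernel's `current_restore_flags()` does (clear the masked bits, then put back the saved ones).

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative flag value for this user-space model only. */
#define PF_LESS_THROTTLE 0x00100000u

/* Hypothetical stand-in for the kernel's task_struct; only the flags word. */
struct task {
	unsigned int flags;
};

/* Stand-in for current_restore_flags(): clear the masked bits, then
 * restore whichever of them were set in the saved value. */
static void restore_flags(struct task *t, unsigned int orig, unsigned int mask)
{
	t->flags &= ~mask;
	t->flags |= orig & mask;
}

/* Hypothetical helper: remember whether the flag was set, then set it. */
static unsigned int less_throttle_save(struct task *t)
{
	unsigned int old = t->flags & PF_LESS_THROTTLE;

	t->flags |= PF_LESS_THROTTLE;
	return old;
}

/* Hypothetical helper: put the flag back to its saved state. */
static void less_throttle_restore(struct task *t, unsigned int old)
{
	/* The saved value must carry nothing but the one flag bit. */
	assert((old & ~PF_LESS_THROTTLE) == 0);
	restore_flags(t, old, PF_LESS_THROTTLE);
}

/* Records whether the flag was visible while the allocator ran. */
static int flag_seen_during_alloc;

/* Mock allocator standing in for kmalloc(), alloc_page() and friends. */
static void *mock_alloc(struct task *t, size_t size)
{
	flag_seen_during_alloc = (t->flags & PF_LESS_THROTTLE) != 0;
	return malloc(size);
}

/* The pattern from the patch: save, set, allocate, restore. */
static void *alloc_less_throttled(struct task *t, size_t size)
{
	unsigned int old = less_throttle_save(t);
	void *p = mock_alloc(t, size);

	less_throttle_restore(t, old);
	return p;
}
```

Because only the single bit is saved and restored, nesting is safe in this model: a caller that already had the flag set still has it set after the inner allocation returns, and a caller that did not has it cleared again.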