Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp34030ybl; Thu, 9 Jan 2020 16:30:05 -0800 (PST) X-Google-Smtp-Source: APXvYqwajJxkDR7z1RH78rEty/5pYLiMg3Ocgl/Pk82s+EWdXoBtfBjGEGPHOV8lZt+/bjgyN3sr X-Received: by 2002:a05:6830:20d3:: with SMTP id z19mr403154otq.330.1578616205402; Thu, 09 Jan 2020 16:30:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1578616205; cv=none; d=google.com; s=arc-20160816; b=FQLdI5nxl4kkDt/WNJb7jYDYDH0C31mQgFrbdcm+blSiMJ8e8VYDcd7fK95CroZ1Fu aEP4dPVHvjhlhNF8jmrS/RtjMHQf96xZ8e1xEwAABFctQvFa1K+vyFe250Iwnwz5DV/g CSInt6ZPcqrzAc8/Sj/6s1bIafc3NQhn0oJPHJBaPuI41HjPFM6Iafpbq6nNZdYBRBkB dZICJanhB32+J0bgGvTtMY9bhQSS8xonOVYF/zGFR+FUASVp/mRZ9ACQqSzPb/5oZEKK T6XhSYECbT9HXYGXD37N6dlytDf76lYKackfn+K+dlVQlFsgKL/R5FBiZ4aWoeIqCewn vw/g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=YK07dG9mS/i6t5DE8qmxV0Tg9yfN7MAs/hz7qzpudL4=; b=Mckml8Wl5E4smlVHrP6aUWv6nci7aE2jxeirSOGCw906BUHwHcX1ymQQsvGP9HTPbD 9wnxsjwZ9mUmtdsp8lV1QUXCMck4tGFqvOjLZkgpswK9J9OTm0i9vXh70gAmwpJ+37Dh nITmjgYn4r2QIiao/8qApVeGWzveECBpAO4cXjrd+DiHAyjc2IVJvc+mbGhdvG96/H3O H1RWYNRQQ37paf1wxX9YUyBQUoVDxR9vthpFc5ZoMPQvl21f4pug15L6PXzBHeBeNRtC Yxu/ns8nP8M3q7x740zM7S/evZgEdB0sE021CHUvYfMs7fRUcG3qz83vZp0+GxmG+7Ic mtUw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=UPsxAOdU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c21si129641oto.176.2020.01.09.16.29.54; Thu, 09 Jan 2020 16:30:05 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=UPsxAOdU; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730259AbgAJA2z (ORCPT + 99 others); Thu, 9 Jan 2020 19:28:55 -0500 Received: from mail-ed1-f67.google.com ([209.85.208.67]:39245 "EHLO mail-ed1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730214AbgAJA2y (ORCPT ); Thu, 9 Jan 2020 19:28:54 -0500 Received: by mail-ed1-f67.google.com with SMTP id t17so52261eds.6 for ; Thu, 09 Jan 2020 16:28:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=YK07dG9mS/i6t5DE8qmxV0Tg9yfN7MAs/hz7qzpudL4=; b=UPsxAOdUQON+FR7w22H7DQOXhIuTtYc6PmzaFpKapcLEEDcu8VMwaxdTNV+Xq1yAOx fA4KYSLvgUDS7iMJ66tdLniB4FfwqFrDa4vvjNbLauE3aycCDyJfvhIm07PTLVycCsnf 1g2VHo7KNWC4MunVPYa6VL0r7BtNiVpI567rgr0xL+GhDC3IN9S4ckIdqZAnO1oxCQTL bWM3i5B9Bg8usNme7XWiwfLMoFYOmPR844+Rw1dKSLUtudJfzAW2IXLVzCMpCWQzhf1c BCbG1myvcy3dzB23Zs0QZ+N7skreF52WJCOXbEQjOshrfrwRMnM7XtTLVPT0boqtyLKX ld0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=YK07dG9mS/i6t5DE8qmxV0Tg9yfN7MAs/hz7qzpudL4=; b=b/c4BU1Rjl7VE44gAa7zhFnaEH9rc+zoQpU8u0anTb9sZISu8HwXtkLvo1ABp0IM51 M06TieW+iJ8jeUNr0oEtbK6QSjgaWGECmsPe9PIOxBnXNMO6bVC3LphDbbWETWxID//r gwr7vm4UtZ2x1Ndx26wnbnS773ylw3ZcU8NrdE5fVvoieb/pw4rx4TiumBpERq4QSKZT vTgEklEFq5IM5v9Aq3i0iIYHXpuMADhOEkrKW006C+zaqDThrOcCETvpBWXA88fSGo5e SnaUHl2/8ZcRJokns9wrGAv9seHz9M7HBMTNNfw+2znckhQ4FFVFdLOGuq5MYLzyeZj8 Mnkw== X-Gm-Message-State: APjAAAUxVp8Z6sbLrc/z9x8FJeEf0HsbFUCEpWGdIiEwGUuBtyV/z0s+ YdTAdKD7zn0bnshPrt3/uNm8fvy890uo8JZzDZA= X-Received: by 2002:aa7:c694:: with SMTP id n20mr501233edq.95.1578616133224; Thu, 09 Jan 2020 16:28:53 -0800 (PST) MIME-Version: 1.0 References: <20200109225646.22983-1-xiyou.wangcong@gmail.com> In-Reply-To: <20200109225646.22983-1-xiyou.wangcong@gmail.com> From: Yang Shi Date: Thu, 9 Jan 2020 16:28:40 -0800 Message-ID: Subject: Re: [PATCH] mm: avoid blocking lock_page() in kcompactd To: Cong Wang Cc: Linux Kernel Mailing List , Andrew Morton , Michal Hocko , Linux MM Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jan 9, 2020 at 2:57 PM Cong Wang wrote: > > We observed kcompactd hung at __lock_page(): > > INFO: task kcompactd0:57 blocked for more than 120 seconds. > Not tainted 4.19.56.x86_64 #1 > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > kcompactd0 D 0 57 2 0x80000000 > Call Trace: > ? __schedule+0x236/0x860 > schedule+0x28/0x80 > io_schedule+0x12/0x40 > __lock_page+0xf9/0x120 > ? page_cache_tree_insert+0xb0/0xb0 > ? update_pageblock_skip+0xb0/0xb0 > migrate_pages+0x88c/0xb90 > ? isolate_freepages_block+0x3b0/0x3b0 > compact_zone+0x5f1/0x870 > kcompactd_do_work+0x130/0x2c0 > ? __switch_to_asm+0x35/0x70 > ? __switch_to_asm+0x41/0x70 > ? kcompactd_do_work+0x2c0/0x2c0 > ? kcompactd+0x73/0x180 > kcompactd+0x73/0x180 > ? finish_wait+0x80/0x80 > kthread+0x113/0x130 > ? kthread_create_worker_on_cpu+0x50/0x50 > ret_from_fork+0x35/0x40 > > which faddr2line maps to: > > migrate_pages+0x88c/0xb90: > lock_page at include/linux/pagemap.h:483 > (inlined by) __unmap_and_move at mm/migrate.c:1024 > (inlined by) unmap_and_move at mm/migrate.c:1189 > (inlined by) migrate_pages at mm/migrate.c:1419 > > Sometimes kcompactd eventually got out of this situation, sometimes not. > > I think for memory compaction, it is a best effort to migrate the pages, > so it doesn't have to wait for I/O to complete. It is fine to call > trylock_page() here, which is pretty much similar to > buffer_migrate_lock_buffers(). > > Given MIGRATE_SYNC_LIGHT is used on compaction path, just relax the > check for it. But this changed the semantics of MIGRATE_SYNC_LIGHT which means blocking on most operations but not ->writepage. When MIGRATE_SYNC_LIGHT is used it means compaction priority is increased (the initial priority is ASYNC) due to whatever reason (i.e. not enough clean, non-writeback and non-locked pages to migrate). So, it has to wait for some pages to try to not backoff pre-maturely. If I read the code correctly, buffer_migrate_lock_buffers() also blocks on page lock with non-ASYNC mode. Since v5.1 Mel Gorman improved compaction a lot. So, I'm wondering if this happens on the latest upstream or not. And, did you figure out who is locking the page for such long time? Or there might be too many waiters on the list for this page? > > Cc: Andrew Morton > Cc: Michal Hocko > Cc: linux-mm@kvack.org > Signed-off-by: Cong Wang > --- > mm/migrate.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/mm/migrate.c b/mm/migrate.c > index 86873b6f38a7..df60026779d2 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -1010,7 +1010,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage, > bool is_lru = !__PageMovable(page); > > if (!trylock_page(page)) { > - if (!force || mode == MIGRATE_ASYNC) > + if (!force || mode == MIGRATE_ASYNC > + || mode == MIGRATE_SYNC_LIGHT) > goto out; > > /* > -- > 2.21.1 > >