Date: Mon, 8 Jul 2019 14:23:33 +0200
From: Max Kellermann
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: Kernel 5.1.15 stuck in compaction
Message-ID: <20190708122333.GA11407@swift.blarg.de>
References: <20190708103543.GA10364@swift.blarg.de>
In-Reply-To: <20190708103543.GA10364@swift.blarg.de>

On 2019/07/08 12:35, Max Kellermann wrote:
> one of our web servers got repeatedly stuck in the memory compaction
> code; two PHP processes have been busy at 100% inside memory
> compaction after a page fault:

This trace may be helpful as well; the first PHP process:

275.846 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.894 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.942 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.989 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0

This is the other PHP process:

188.501 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0
LOST 16 events!
188.600 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0
LOST 5 events!
188.643 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0
LOST 17 events!
188.742 compaction:mm_compaction_isolate_migratepages:range=(0x169f40 ~ 0x169f40) nr_scanned=0 nr_taken=0

No pages are being scanned at all; the start and end of each range are
identical. However, my perf report contains calls to
compact_unlock_should_abort(), which means that the loop in
isolate_migratepages_block() is not skipped completely. The loop is
entered, but it exits too early.
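
For anyone who wants to check a longer capture for the same pattern, here is a rough Python sketch that parses trace lines in the format quoted above and reports whether every captured event has an empty range (start equal to end) at one fixed address with nr_scanned=0. The names TRACE_RE, parse_events() and is_stuck() are made up for this illustration, and the regular expression assumes the exact line layout shown in this mail.

```python
import re

# Hypothetical helper: matches trace lines in the layout quoted in this mail.
TRACE_RE = re.compile(
    r"(?P<ts>\d+\.\d+)\s+compaction:mm_compaction_isolate_migratepages:"
    r"range=\((?P<start>0x[0-9a-f]+) ~ (?P<end>0x[0-9a-f]+)\)\s+"
    r"nr_scanned=(?P<scanned>\d+)\s+nr_taken=(?P<taken>\d+)"
)


def parse_events(text):
    """Return one dict per isolate_migratepages event found in text."""
    events = []
    for m in TRACE_RE.finditer(text):
        events.append({
            "ts": float(m.group("ts")),
            "start": int(m.group("start"), 16),
            "end": int(m.group("end"), 16),
            "scanned": int(m.group("scanned")),
            "taken": int(m.group("taken")),
        })
    return events


def is_stuck(events):
    """True if all events cover the same empty range and scan nothing."""
    if not events:
        return False
    first = events[0]["start"]
    return all(
        e["start"] == e["end"] == first and e["scanned"] == 0
        for e in events
    )


# Trace of the first PHP process, copied from above.
SAMPLE = """\
275.846 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.894 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.942 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
LOST 8 events!
275.989 compaction:mm_compaction_isolate_migratepages:range=(0x8a48e0 ~ 0x8a48e0) nr_scanned=0 nr_taken=0
"""

events = parse_events(SAMPLE)
```

The "LOST n events!" lines are simply ignored, so the check only covers events that were actually captured; it cannot rule out progress during the gaps.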