Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp6947086ybl; Wed, 15 Jan 2020 12:51:09 -0800 (PST) X-Google-Smtp-Source: APXvYqx5uLRlyphl/WI28vwIjOC/lausj6Ls52+U9YbF4b7I+6dSI85RebfZC65BxkCfasDleqlR X-Received: by 2002:a05:6830:114f:: with SMTP id x15mr4018901otq.291.1579121469617; Wed, 15 Jan 2020 12:51:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1579121469; cv=none; d=google.com; s=arc-20160816; b=0kxhWzc0frGAmXp4akHUwSZEtGhsogwApccGMzj29TPyBEWd83HTnpBLrOz2Fm1CRb z2sWwjNK5u29CINZQETqR7J4UVcUNwjFTQkLf+mV+8c47tSA869nq06NUZYxpDyRRmio JRE46ZGjldu5dcMFm8aaKUBxlv5V3UxBORTRRJtRyNvdtct74qRFeyGOiIXefSgsqic3 kmYT0Celx43A2mKdBviB517ht7X+/mxFWixZbN1BwK5jGR4uFFpjK7gyd7kGaEpuDLuz 0hP+gJZsN9Imf0mytNQt3akD+nIvycxV1bT7wlQCKAK3vx7C+oNfmF5uEffrIiHtp6pE nX6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:user-agent:references :message-id:in-reply-to:subject:cc:to:from:date:dkim-signature; bh=kWBj9q//ngCI8CHduUy+zt1arDphaiPl0KmAyH3wXMI=; b=nFoxmqid71amPFhIcDF+mkkSDRxMWs5FODDSUk4Doz2DK1r+If7EyMtl/vahwmi6DX 0f6i636eV/xIsMahS/nfePNGS2F0KppTfXyB7d83bJtSyImk7TviIZSMoo3lQ3dKH0nQ OD3AKhZNIKLgrS8Mr+ibWPAnY7XrWv0k209I1qNwuytqYdxCWrd/EqUwi7UMD51p6Zz2 y7I9adm56xYTIS7RioWa0CdAnhUPmM//XL0eoVml5TpLgaCD8YY0gAKQwUs3A4c933CN /PISTw6oYwCMkaFfIkCIVNaGzgGQxuOJDI2YEOq+yDOsUMRk5a+IaQ3JwBPGXe2uasge g0Aw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=vOS+qbOG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c21si11823054oto.176.2020.01.15.12.50.57; Wed, 15 Jan 2020 12:51:09 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=vOS+qbOG; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729134AbgAOU1L (ORCPT + 99 others); Wed, 15 Jan 2020 15:27:11 -0500 Received: from mail-pg1-f196.google.com ([209.85.215.196]:40629 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726018AbgAOU1L (ORCPT ); Wed, 15 Jan 2020 15:27:11 -0500 Received: by mail-pg1-f196.google.com with SMTP id k25so8724825pgt.7 for ; Wed, 15 Jan 2020 12:27:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:from:to:cc:subject:in-reply-to:message-id:references :user-agent:mime-version; bh=kWBj9q//ngCI8CHduUy+zt1arDphaiPl0KmAyH3wXMI=; b=vOS+qbOG6Ei+ejHHP7brSCEHiRP3WSxVUn0SJkywCLHbCTk2meRAnqcz6CCmeNpPuZ x+5NcR43SKz4az/LB/QeA/hZePBit+amrkLl4DXNcz+tqWGr96jE5EqIpU8T7YKsGqF1 9/+2PhqYvcoJ6e396p1+9uZ+joECShT7FVNWbVXmG19YLYpMuMIoTAEKRoefdJIZL7q0 quXK7Tp7pAv7Zx/SiguSrQcdvFyaP4Xm+0OjB9SNoOUFbmQV3CbThGlSn56Emm8ISzP+ OcbASV6uLDKyly7yXihP0pYSN8g7A/NWps5zy8MVh8FC7W3g8ZJn5EoOtBBukrrfu2oV iaUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:in-reply-to:message-id :references:user-agent:mime-version; bh=kWBj9q//ngCI8CHduUy+zt1arDphaiPl0KmAyH3wXMI=; b=ojWzD5Bpq+sp1VkwgmQ4S5RF43/77QKUZGPDx+cfE5bw3DrniPZN1igprif6QnghWK 9jS6wBOnD4krs/Cvp7TTqfSJ0sM4H1YmQGtOD/CoUdbqBlsQjsI3a70vfJf4zMvwkmey E+cC6CsU7dj45BqCfteDvugMHyBXDBJyhYuDIc5caVzSao83MQP11lp0DzzIjlZIO0UM USVKHz8893y7fKGemGqths5htwUeZt6zG4l8gaF1nQMpb4LtstPvjsyM7bMzwxti0v2H PVtvMVgqPML/0JMi9OA4c1C3GYzQv3yo9n6mwmxIOxwMKpS7VmXuFBlqw4e322XYcxJt g6Xw== X-Gm-Message-State: APjAAAUtLh/6xb8POkWGS3ZZtwf+J1gBCoXHoF4gk5uCYflGKx+qQVUq dSCvRl+fc6mS9Cgglp6jlFmGQKMvtWs= X-Received: by 2002:aa7:946a:: with SMTP id t10mr33616856pfq.165.1579120030646; Wed, 15 Jan 2020 12:27:10 -0800 (PST) Received: from [2620:15c:17:3:3a5:23a7:5e32:4598] ([2620:15c:17:3:3a5:23a7:5e32:4598]) by smtp.gmail.com with ESMTPSA id u11sm277705pjn.2.2020.01.15.12.27.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Jan 2020 12:27:10 -0800 (PST) Date: Wed, 15 Jan 2020 12:27:09 -0800 (PST) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Tetsuo Handa cc: Michal Hocko , Andrew Morton , linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [patch] mm, oom: dump stack of victim when reaping failed In-Reply-To: <9a7cbbf0-4283-f932-e422-84b4fb42a055@I-love.SAKURA.ne.jp> Message-ID: References: <20200115084336.GW19428@dhcp22.suse.cz> <9a7cbbf0-4283-f932-e422-84b4fb42a055@I-love.SAKURA.ne.jp> User-Agent: Alpine 2.21 (DEB 202 2017-01-01) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, 15 Jan 2020, Tetsuo Handa wrote: > >> When a process cannot be oom reaped, for whatever reason, currently the > >> list of locks that are held is currently dumped to the kernel log. > >> > >> Much more interesting is the stack trace of the victim that cannot be > >> reaped. If the stack trace is dumped, we have the ability to find > >> related occurrences in the same kernel code and hopefully solve the > >> issue that is making it wedged. > >> > >> Dump the stack trace when a process fails to be oom reaped. > > > > Yes, this is really helpful. > > tsk would be a thread group leader, but the thread which got stuck is not > always a thread group leader. Maybe dump all threads in that thread group > without PF_EXITING (or something) ? > That's possible, yes. I think it comes down to the classic problem of how much info in the kernel log on oom kill is too much. Stacks for all threads that match the mm being reaped may be *very* verbose. I'm currently tracking a stall in oom reaping where the victim doesn't always have a lock held so we don't know where it's at in the kernel; I'm hoping that a stack for the thread group leader will at least shed some light on it.