Date: Wed, 20 Dec 2017 22:16:06 -0500
From: Dave Jones
To: "Eric W. Biederman"
Cc: Linus Torvalds, Al Viro, Linux Kernel, syzkaller-bugs@googlegroups.com,
    Gargi Sharma, Alexey Dobriyan
Subject: Re: proc_flush_task oops
Message-ID: <20171221031606.GA4636@codemonkey.org.uk>
In-Reply-To: <871sjp1cjz.fsf@xmission.com>

On Wed, Dec 20, 2017 at 12:25:52PM -0600, Eric W. Biederman wrote:

> > > If the warning triggers it means the bug is in alloc_pid and somehow
> > > something has gotten past the is_child_reaper check.
> >
> > You're onto something.
> >
> I am not seeing where things go wrong, but that puts the recent pid bitmap, bit
> hash to idr change in the suspect zone.
>
> Can you try reverting that change:
>
> e8cfbc245e24 ("pid: remove pidhash")
> 95846ecf9dac ("pid: replace pid bitmap implementation with IDR API")
>
> While keeping the warning in place so we can see if this fixes the
> allocation problem?

So I can't trigger this any more with those reverted.
I seem to hit a bunch of other long-standing bugs first.

I'll keep running it overnight, but it looks like this is where the problem lies.

	Dave