From: Alexander Kappner
Subject: Re: [PATCH] usb-core: Fix potential null pointer dereference in xhci-debugfs.c
To: Mathias Nyman, Mathias Nyman, Greg Kroah-Hartman, linux-usb@vger.kernel.org, linux-kernel@vger.kernel.org, Lu Baolu
References: <1512638774-6837-1-git-send-email-agk@godking.net> <98502b84-1a6e-fd18-f290-ab90f0082e55@godking.net> <3c3a199a-8c73-41c8-d105-33199bfb92dc@linux.intel.com>
In-Reply-To: <3c3a199a-8c73-41c8-d105-33199bfb92dc@linux.intel.com>
Date: Sat, 9 Dec 2017 14:38:44 -0800

Hi Mathias,

thanks for the patch! The system now resumes cleanly from hibernate even with
usbmuxd doing its thing.

Tested-by: Alexander Kappner

While testing this I hit some other issues with xhci-debugfs.c but I'll write
these up in a separate bug.

On 12/08/2017 09:01 AM, Mathias Nyman wrote:
> On 08.12.2017 13:06, Alexander Kappner wrote:
>> Hi,
>>
>>> I think we need to dig a bit deeper. It's good to check if spriv is
>>> valid but there are probably other reasons than kzalloc failing.
>>
>> I agree -- this small allocation is unlikely to fail in practice.
>> Also, while my patch prevents the kernel oops, it also prevents the
>> debugfs entries from being created.
>>
>> I've been debugging this more trying to come up with a better
>> solution, but I might need some guidance as I'm not too familiar with
>> the USB subsystem. The immediate cause of the crash was usbmuxd
>> sending a USBDEVFS_SETCONFIGURATION ioctl to a device, which _only if
>> it fails_ calls usb_hcd_alloc_bandwidth to try and reset the device,
>> which in turn calls xhci_debugfs_create_endpoint. The ioctl handler
>> acquires a device-specific lock via usb_lock_device.
>>
>> When the system resumed from hibernate, xhci_resume was called. This
>> in turn called xhci_mem_cleanup to deallocate the device structures,
>> which include setting the debugfs_private pointer to NULL (via
>> xhci_free_virt_devices_depth_first). It thus seems likely that the
>> ioctl is somehow racing with the hibernate. The call to xhci_resume
>> is protected by a host-controller specific lock (xhci->lock) but it
>> doesn't attempt to take the usb_lock_device device-specific lock.
>>
>> Now my suspicion is that xhci_resume freed and zeroed the device
>> structures while racing with the ioctl handler. I can't seem to find
>> any exclusion mechanism that would prevent xhci_resume from racing
>> with the USBDEVFS_SETCONFIGURATION ioctl (or any other ioctl, for
>> that matter). Am I missing something? If not, is there any reason why
>> an ioctl might need to execute in parallel with the xhci_resume? If
>> not, can we just do a busy wait in xhci_resume until all pending
>> ioctls have returned?
>
> I'm not sure, but if I recall correctly then power management is supposed
> to make sure a driver doesn't access usb devices while the host
> controller is still resuming.
>
> The odd thing here is that
> xhci_debugfs_remove_slot(xhci, slot_id), and
> xhci_free_virt_device(xhci, slot_id) are called together when
> xhci_mem_cleanup() calls xhci_free_virt_devices_depth_first()
>
> That means both the xhci_virt_device *dev and dev->debugfs_private
> should both be freed and xhci->devs[slot_id] set NULL for that
> virt_device.
>
> so xhci_add_endpoint() should fail a lot earlier because the
> xhci->devs[slot_id] should be a null pointer as well.
>
> Allocation is also done together in xhci_alloc_dev()
>
> Looking at it more closely there is actually the .free_dev callback that
> first frees the dev->debugfs_private but the virt_dev is only freed
> conditionally later
>
> Attached a patch that frees them together, can you try it out?
>
> If it doesn't help we need to add some elaborate tracing
>
> Thanks
> -Mathias
>
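
For anyone following the thread, here is a rough user-space sketch of the
lifetime mismatch described above. It is not the xhci code and not the
attached patch. Every name in it (slot_priv, virt_device, alloc_dev,
free_dev_split, free_dev_together, create_endpoint, run_scenario) is a
hypothetical stand-in, and the return codes are made up for illustration.
It only models the idea that a slot whose private data is freed while the
device itself survives leaves a later endpoint-creation path looking at a
NULL private pointer, whereas freeing both together makes that path fail
early on the missing slot.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SLOTS 4

/* Hypothetical stand-ins for the per-slot debugfs data and the virt device. */
struct slot_priv { int slot_id; };
struct virt_device { struct slot_priv *debugfs_private; };

static struct virt_device *devs[MAX_SLOTS];

/* Allocate the device and its private data together; 0 on success. */
static int alloc_dev(int slot_id)
{
    struct virt_device *dev;

    assert(slot_id >= 0 && slot_id < MAX_SLOTS);
    dev = calloc(1, sizeof(*dev));
    if (!dev)
        return -1;
    dev->debugfs_private = calloc(1, sizeof(*dev->debugfs_private));
    if (!dev->debugfs_private) {
        free(dev);
        return -1;
    }
    dev->debugfs_private->slot_id = slot_id;
    devs[slot_id] = dev;
    return 0;
}

/* Split teardown: private data is freed now, the device only if free_virt. */
static void free_dev_split(int slot_id, int free_virt)
{
    struct virt_device *dev = devs[slot_id];

    if (!dev)
        return;
    free(dev->debugfs_private);
    dev->debugfs_private = NULL;
    if (free_virt) {
        free(dev);
        devs[slot_id] = NULL;
    }
}

/* Joint teardown: both objects go away and the slot entry is cleared. */
static void free_dev_together(int slot_id)
{
    struct virt_device *dev = devs[slot_id];

    if (!dev)
        return;
    free(dev->debugfs_private);
    free(dev);
    devs[slot_id] = NULL;
}

/*
 * Illustrative return codes (assumptions, not kernel values):
 *    0  slot and private data are both present
 *   -1  slot is gone, so the caller fails early and cleanly
 *   -2  slot exists without private data; unchecked code would
 *       dereference NULL at this point
 */
static int create_endpoint(int slot_id)
{
    struct virt_device *dev = devs[slot_id];

    if (!dev)
        return -1;
    if (!dev->debugfs_private)
        return -2;
    return 0;
}

/* Tear a slot down one way or the other, then try to add an endpoint. */
int run_scenario(int free_together)
{
    const int slot = 1;
    int ret;

    if (alloc_dev(slot))
        return -3;
    if (free_together)
        free_dev_together(slot);
    else
        free_dev_split(slot, 0);
    ret = create_endpoint(slot);
    free_dev_together(slot); /* release whatever is still allocated */
    return ret;
}
```

In this model run_scenario(0) ends in the "slot present, private data
missing" case, while run_scenario(1) fails early because the slot entry is
already NULL. The second outcome is what one would expect once the two
objects are freed together.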