Date: Wed, 31 Jan 2018 11:02:47 -0500 (EST)
From: Alan Stern
To: Haiqing Bai
cc: gregkh@linuxfoundation.org
Subject: Re: [PATCH] ohci-hcd: Fix race condition caused by ohci_urb_enqueue() and io_watchdog_func()
In-Reply-To: <1517390197-32323-1-git-send-email-Haiqing.Bai@windriver.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 31 Jan 2018, Haiqing Bai wrote:

> Running io_watchdog_func() while ohci_urb_enqueue() is running can
> cause a race condition where ohci->prev_frame_no is corrupted and the
> watchdog can mis-detect the following error:
>
>   ohci-platform 664a0800.usb: frame counter not updating; disabled
>   ohci-platform 664a0800.usb: HC died; cleaning up
>
> Specifically, the following
> scenario causes a race condition:
>
>  1. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
>     and enters the critical section
>  2. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
>     returns false
>  3. ohci_urb_enqueue() sets ohci->prev_frame_no to a frame number
>     read by ohci_frame_no(ohci)
>  4. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
>  5. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
>     flags) and exits the critical section
>  6. Later, ohci_urb_enqueue() is called
>  7. ohci_urb_enqueue() calls spin_lock_irqsave(&ohci->lock, flags)
>     and enters the critical section
>  8. The timer scheduled on step 4 expires and io_watchdog_func() runs
>  9. io_watchdog_func() calls spin_lock_irqsave(&ohci->lock, flags)
>     and waits on it because ohci_urb_enqueue() is already in the
>     critical section on step 7
> 10. ohci_urb_enqueue() calls timer_pending(&ohci->io_watchdog) and it
>     returns false
> 11. ohci_urb_enqueue() sets ohci->prev_frame_no to a new frame number
>     read by ohci_frame_no(ohci) because the frame number proceeded
>     between steps 3 and 6
> 12. ohci_urb_enqueue() schedules io_watchdog_func() with mod_timer()
> 13. ohci_urb_enqueue() calls spin_unlock_irqrestore(&ohci->lock,
>     flags) and exits the critical section, then wakes up
>     io_watchdog_func() which is waiting on step 9
> 14. io_watchdog_func() enters the critical section
> 15. io_watchdog_func() calls ohci_frame_no(ohci) and sets the frame_no
>     variable to the frame number
> 16. io_watchdog_func() compares frame_no and ohci->prev_frame_no
>
> On step 16, because this calling of io_watchdog_func() is scheduled on
> step 4, the frame number set in ohci->prev_frame_no is expected to be
> the number set on step 3. However, ohci->prev_frame_no is overwritten
> on step 11. Because step 16 is executed soon after step 11, the frame
> number might not proceed, so ohci->prev_frame_no must equal
> frame_no. That is a nasty bug!
>
> To address above scenario, this patch introduces timer_running flag to
> ohci_hcd structure. Setting true to ohci->timer_running indicates
> io_watchdog_func() is scheduled or is running. ohci_urb_enqueue()
> checks the flag when it schedules the watchdog (step 4 and 12 above),
> so ohci->prev_frame_no is not overwritten while io_watchdog_func() is
> running.

Instead of adding an extra flag variable, which has to be kept in sync
with the timer routine, how about defining a special sentinel value for
prev_frame_no?  For example:

	#define IO_WATCHDOG_OFF		0xffffff00

Then whenever the timer isn't scheduled or running, set
ohci->prev_frame_no to IO_WATCHDOG_OFF.  And instead of testing
timer_pending(), compare prev_frame_no to this special value.

I think that approach will be slightly more robust.

Alan Stern
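
For illustration only, here is a rough user-space sketch of how the
sentinel idea might look.  It is not driver code: struct watchdog_model,
model_init(), model_enqueue() and model_watchdog() are hypothetical
names, the spinlock is assumed to be held by the caller, and the real
watchdog's other checks are left out.  It also assumes frame numbers
fit in 16 bits, so the sentinel can never match a real one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sentinel from the suggestion above.  Assumption: real frame numbers
 * are 16-bit, so this value never collides with one. */
#define IO_WATCHDOG_OFF	0xffffff00u

/* Hypothetical stand-in for the relevant field of struct ohci_hcd. */
struct watchdog_model {
	uint32_t prev_frame_no;	/* IO_WATCHDOG_OFF while the timer is idle */
};

static void model_init(struct watchdog_model *w)
{
	w->prev_frame_no = IO_WATCHDOG_OFF;
}

/* Enqueue path (lock assumed held): record the frame number and "arm"
 * the timer only when the watchdog is neither scheduled nor running.
 * Returns true if the timer was armed by this call. */
static bool model_enqueue(struct watchdog_model *w, uint16_t frame_no)
{
	if (w->prev_frame_no != IO_WATCHDOG_OFF)
		return false;	/* already armed: do not overwrite */
	w->prev_frame_no = frame_no;
	return true;
}

/* Watchdog path (lock assumed held): report whether the frame counter
 * looks stuck, then either keep the timer armed with the new frame
 * number or mark it idle with the sentinel. */
static bool model_watchdog(struct watchdog_model *w, uint16_t frame_no,
			   bool rearm)
{
	bool stuck;

	/* The handler only runs after the timer has been armed. */
	assert(w->prev_frame_no != IO_WATCHDOG_OFF);

	stuck = (w->prev_frame_no == frame_no);
	w->prev_frame_no = rearm ? frame_no : IO_WATCHDOG_OFF;
	return stuck;
}
```

In this model the second enqueue from the scenario (steps 10-12) leaves
prev_frame_no alone because the value is not the sentinel, so the
comparison in step 16 is made against the frame number recorded in
step 3.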