Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754732AbbHJTCR (ORCPT ); Mon, 10 Aug 2015 15:02:17 -0400 Received: from mail-wi0-f175.google.com ([209.85.212.175]:35588 "EHLO mail-wi0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752632AbbHJTCQ (ORCPT ); Mon, 10 Aug 2015 15:02:16 -0400 Subject: Re: [PATCH] ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits To: "Herton R. Krzesinski" References: <1438967375-14877-1-git-send-email-herton@redhat.com> <55C79294.2010006@colorfullife.com> <20150810153147.GA3540@dhcppc4.redhat.com> Cc: linux-kernel@vger.kernel.org, Andrew Morton , Davidlohr Bueso , Rafael Aquini , Joe Perches , Aristeu Rozanski , djeffery@redhat.com From: Manfred Spraul Message-ID: <55C8F533.1090007@colorfullife.com> Date: Mon, 10 Aug 2015 21:02:11 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.1.0 MIME-Version: 1.0 In-Reply-To: <20150810153147.GA3540@dhcppc4.redhat.com> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2668 Lines: 65 Hi Herton, On 08/10/2015 05:31 PM, Herton R. Krzesinski wrote: > Well without the synchronize_rcu() and with the semid list loop fix I was still > able to get issues, and I thought the problem is related to racing with IPC_RMID > on freeary again. This is one scenario I would imagine: > > A B > > freeary() > list_del(&un->list_id) > spin_lock(&un->ulp->lock) > un->semid = -1 > list_del_rcu(&un->list_proc) > __list_del_entry(&un->list_proc) > __list_del(entry->prev, entry->next) exit_sem() > next->prev = prev; ... > prev->next = next; ... > ... un = list_entry_rcu(ulp->list_proc.next...) > (&un->list_proc)->prev = LIST_POISON2 if (&un->list_proc == &ulp->list_proc) > ... kfree(ulp) > spin_unlock(&un->ulp->lock) <---- bug > > Now that is a very tight window, but I had problems even when I tried this patch > first: > > (...) > - if (&un->list_proc == &ulp->list_proc) > - semid = -1; > - else > - semid = un->semid; > + if (&un->list_proc == &ulp->list_proc) { > + rcu_read_unlock(); What about: + spin_unlock_wait(&ulp->lock); > + break; > + } > + spin_lock(&ulp->lock); > + semid = un->semid; > + spin_unlock(&ulp->lock); > > + /* exit_sem raced with IPC_RMID, nothing to do */ > if (semid == -1) { > rcu_read_unlock(); > - break; > + synchronize_rcu(); > + continue; > } > (...) > > So even with the bad/uneeded synchronize_rcu() which I had placed there, I could > still get issues (however the testing on patch above was on an older kernel than > latest upstream, from RHEL 6, I can test without synchronize_rcu() on latest > upstream, however the affected code is the same). That's when I thought of > scenario above. I was able to get this oops: Adding sleep() usually help, too. But it is ugly, so let's try to understand the race and to fix it. Best regards, Manfred -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/