Date: Tue, 7 Jul 2015 16:33:05 +0000 (UTC)
From: Mathieu Desnoyers
To: Peter Zijlstra, Arthur Marsh
Cc: linux-kernel@vger.kernel.org, Rusty Russell, rostedt, Oleg Nesterov, "Paul E. McKenney"
Message-ID: <1736781680.1883.1436286785932.JavaMail.zimbra@efficios.com>
In-Reply-To: <20150707072951.GM3644@twins.programming.kicks-ass.net>
References: <55997889.5020101@internode.on.net>
 <20150706100447.GX3644@twins.programming.kicks-ass.net>
 <559A545A.80508@internode.on.net>
 <20150706103246.GY3644@twins.programming.kicks-ass.net>
 <559B63A2.4030601@internode.on.net>
 <20150707072951.GM3644@twins.programming.kicks-ass.net>
Subject: Re: lock-up with module: Optimize __module_address() using a latched RB-tree
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_1881_850455956.1436286785930"

------=_Part_1881_850455956.1436286785930
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

----- On Jul 7, 2015, at 3:29 AM, Peter Zijlstra peterz@infradead.org wrote:

> On Tue, Jul 07, 2015 at 02:59:06PM +0930, Arthur Marsh wrote:
>> I had a single, non-reproducible case of the same lock-up happening on my
>> other machine running the Linus git head kernel in 64-bit mode.
>
> Hmm, disturbing..
> I've had my machines run this stuff for weeks and not
> had anything like this :/
>
> Do you have a serial cable between those machines? serial console output
> will allow capturing more complete traces than these pictures can and
> might also aid in capturing some extra debug info.
>
> In any case, I'll go try and build some debug code.

Arthur: can you double-check whether you load any module with --force?
That could cause a module header layout mismatch, which matters here
because the identified commit changes the module header layout.

Also, I'm attaching a small patch which serializes both updates and
reads of the module rbtree. Can you try it out? If the problem still
shows up with the spinlock in place, that would mean the issue is
*not* a race between latched rbtree updates and traversals.

Thanks!

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

------=_Part_1881_850455956.1436286785930
Content-Type: text/x-patch; name=0001-TESTING-add-spinlock-to-module.c-rb-latch-tree.patch
Content-Disposition: attachment; filename=0001-TESTING-add-spinlock-to-module.c-rb-latch-tree.patch
Content-Transfer-Encoding: base64

RnJvbSAwZDA0NmYyMDVmYTQ5YzQ3N2JiZjgxYjcyY2QwMzhmYjlmN2U0MGQ2IE1vbiBTZXAgMTcg MDA6MDA6MDAgMjAwMQpGcm9tOiBNYXRoaWV1IERlc25veWVycyA8bWF0aGlldS5kZXNub3llcnNA ZWZmaWNpb3MuY29tPgpEYXRlOiBUdWUsIDcgSnVsIDIwMTUgMTI6MjQ6MzcgLTA0MDAKU3ViamVj dDogW1BBVENIXSBURVNUSU5HOiBhZGQgc3BpbmxvY2sgdG8gbW9kdWxlLmMgcmIgbGF0Y2ggdHJl ZQoKTm90LVNpZ25lZC1vZmYtYnk6IE1hdGhpZXUgRGVzbm95ZXJzIDxtYXRoaWV1LmRlc25veWVy c0BlZmZpY2lvcy5jb20+Ci0tLQoga2VybmVsL21vZHVsZS5jIHwgMjIgKysrKysrKysrKysrKysr KysrKysrKysKIDEgZmlsZSBjaGFuZ2VkLCAyMiBpbnNlcnRpb25zKCspCgpkaWZmIC0tZ2l0IGEvay ZXJuZWwvbW9kdWxlLmMgYi9rZXJuZWwvbW9kdWxlLmMKaW5kZXggM2UwZTE5Ny4uZWI5OTc1MSAx MDA2NDQKLS0tIGEva2VybmVsL21vZHVsZS5jCisrKyBiL2tlcm5lbC9tb2R1bGUuYwpAQCAtMTgx LDYgKzE4MSw5IEBAIHN0YXRpYyBzdHJ1Y3QgbW9kX3RyZWVfcm9vdCB7CiAjZGVmaW5lIG1vZHVs
ZV9hZGRyX21pbiBtb2RfdHJlZS5hZGRyX21pbgogI2RlZmluZSBtb2R1bGVfYWRkcl9tYXggbW9k X3RyZWUuYWRkcl9tYXgKIAorLyogRnVsbHkgc2VyaWFsaXplIHJlYWRlcnMgYW5kIHVwZGF0ZXMg dG8gcmIgbGF0Y2ggdHJlZS4gKi8KK3N0YXRpYyBERUZJTkVfU1BJTkxPQ0sodGVzdF9yYl9sYXRj aF9sb2NrKTsKKwogc3RhdGljIG5vaW5saW5lIHZvaWQgX19tb2RfdHJlZV9pbnNlcnQoc3RydWN0 IG1vZF90cmVlX25vZGUgKm5vZGUpCiB7CiAJbGF0Y2hfdHJlZV9pbnNlcnQoJm5vZGUtPm5vZGUs ICZtb2RfdHJlZS5yb290LCAmbW9kX3RyZWVfb3BzKTsKQEAgLTE5NywzMSArMjAwLDUwIEBAIHN0 YXRpYyB2b2lkIF9fbW9kX3RyZWVfcmVtb3ZlKHN0cnVjdCBtb2RfdHJlZV9ub2RlICpub2RlKQog ICovCiBzdGF0aWMgdm9pZCBtb2RfdHJlZV9pbnNlcnQoc3RydWN0IG1vZHVsZSAqbW9kKQogewor CXVuc2lnbmVkIGxvbmcgZmxhZ3M7CisKKwlzcGluX2xvY2tfaXJxc2F2ZSgmdGVzdF9yYl9sYXRj aF9sb2NrLCBmbGFncyk7CiAJbW9kLT5tdG5fY29yZS5tb2QgPSBtb2Q7CiAJbW9kLT5tdG5faW5p dC5tb2QgPSBtb2Q7CiAKIAlfX21vZF90cmVlX2luc2VydCgmbW9kLT5tdG5fY29yZSk7CiAJaWYg KG1vZC0+aW5pdF9zaXplKQogCQlfX21vZF90cmVlX2luc2VydCgmbW9kLT5tdG5faW5pdCk7CisJ c3Bpbl91bmxvY2tfaXJxcmVzdG9yZSgmdGVzdF9yYl9sYXRjaF9sb2NrLCBmbGFncyk7CiB9CiAK IHN0YXRpYyB2b2lkIG1vZF90cmVlX3JlbW92ZV9pbml0KHN0cnVjdCBtb2R1bGUgKm1vZCkKIHsK Kwl1bnNpZ25lZCBsb25nIGZsYWdzOworCisJc3Bpbl9sb2NrX2lycXNhdmUoJnRlc3RfcmJfbGF0 Y2hfbG9jaywgZmxhZ3MpOwogCWlmIChtb2QtPmluaXRfc2l6ZSkKIAkJX19tb2RfdHJlZV9yZW1v dmUoJm1vZC0+bXRuX2luaXQpOworCXNwaW5fdW5sb2NrX2lycXJlc3RvcmUoJnRlc3RfcmJfbGF0 Y2hfbG9jaywgZmxhZ3MpOwogfQogCiBzdGF0aWMgdm9pZCBtb2RfdHJlZV9yZW1vdmUoc3RydWN0 IG1vZHVsZSAqbW9kKQogeworCXVuc2lnbmVkIGxvbmcgZmxhZ3M7CisKKwlzcGluX2xvY2tfaXJx c2F2ZSgmdGVzdF9yYl9sYXRjaF9sb2NrLCBmbGFncyk7CiAJX19tb2RfdHJlZV9yZW1vdmUoJm1v ZC0+bXRuX2NvcmUpOwogCW1vZF90cmVlX3JlbW92ZV9pbml0KG1vZCk7CisJc3Bpbl91bmxvY2tf aXJxcmVzdG9yZSgmdGVzdF9yYl9sYXRjaF9sb2NrLCBmbGFncyk7CiB9CiAKIHN0YXRpYyBzdHJ1 Y3QgbW9kdWxlICptb2RfZmluZCh1bnNpZ25lZCBsb25nIGFkZHIpCiB7CiAJc3RydWN0IGxhdGNo X3RyZWVfbm9kZSAqbHRuOworCXVuc2lnbmVkIGxvbmcgZmxhZ3M7CiAKKwlpZiAoaW5fbm1pKCkp IHsKKwkJcHJpbnRrKEtFUk5fRVJSICJtb2RfZmluZCBjYWxsZWQgZnJvbSBOTUlcbiIpOworCQly 
ZXR1cm4gTlVMTDsKKwl9CisJc3Bpbl9sb2NrX2lycXNhdmUoJnRlc3RfcmJfbGF0Y2hfbG9jaywg ZmxhZ3MpOwogCWx0biA9IGxhdGNoX3RyZWVfZmluZCgodm9pZCAqKWFkZHIsICZtb2RfdHJlZS5y b290LCAmbW9kX3RyZWVfb3BzKTsKKwlzcGluX3VubG9ja19pcnFyZXN0b3JlKCZ0ZXN0X3JiX2xh dGNoX2xvY2ssIGZsYWdzKTsKIAlpZiAoIWx0bikKIAkJcmV0dXJuIE5VTEw7CiAKLS0gCjEuOS4x Cgo=
------=_Part_1881_850455956.1436286785930--