From: Vegard Nossum
Date: Fri, 6 Jul 2018 10:31:24 +0200
Subject: Re: [PATCH 1/4] cachefiles: Fix assertion "6 == 5 is false" at fs/fscache/operation.c:494
To: NeilBrown
Cc: David Howells, linux-cachefs@redhat.com, kiran.modukuri@gmail.com, Lei Xue, LKML, aderobertis@metrics.net, dja@axtens.net
In-Reply-To: <87601tz1oz.fsf@notabene.neil.brown.name>
References: <153080826773.5496.7106875523806885716.stgit@warthog.procyon.org.uk> <87601tz1oz.fsf@notabene.neil.brown.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On 6 July 2018 at 01:45, NeilBrown wrote:
> On Thu, Jul 05 2018, David Howells wrote:
>
>> From: kiran modukuri
>>
>> There is a potential race in fscache operation enqueuing for reading and
>> copying multiple pages from cachefiles to netfs.
>> Under some heavy load system, it will happen very often.
>>
>> If this race occurs, an oops similar to the following is seen:
>>
>> kernel BUG at fs/fscache/operation.c:69!
>> invalid opcode: 0000 [#1] SMP
>> ...
>> #0 [ffff883fff0838d8] machine_kexec at ffffffff81051beb
>> #1 [ffff883fff083938] crash_kexec at ffffffff810f2542
>> #2 [ffff883fff083a08] oops_end at ffffffff8163e1a8
>> #3 [ffff883fff083a30] die at ffffffff8101859b
>> #4 [ffff883fff083a60] do_trap at ffffffff8163d860
>> #5 [ffff883fff083ab0] do_invalid_op at ffffffff81015204
>> #6 [ffff883fff083b60] invalid_op at ffffffff8164701e
>> [exception RIP: fscache_enqueue_operation+246]
>> RIP: ffffffffa0b793c6 RSP: ffff883fff083c18 RFLAGS: 00010046
>> RAX: 0000000000000019 RBX: ffff8832ed1a9ec0 RCX: 0000000000000006
>> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
>> RBP: ffff883fff083c20 R8: 0000000000000086 R9: 000000000000178f
>> R10: ffffffff816aeb00 R11: ffff883fff08392e R12: ffff8802f0525620
>> R13: ffff88407ffc01d8 R14: 0000000000000000 R15: 0000000000000003
>> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
>> #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
>> #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
>> #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028
>>
>> Reported-by: Lei Xue
>> Reported-by: Vegard Nossum
>> Reported-by: Anthony DeRobertis
>> Reported-by: NeilBrown
>> Reported-by: Daniel Axtens
>> Reported-by: KiranKumar Modukuri
>> Signed-off-by: David Howells
>> ---
[...]
> Thanks - I like this approach. Taking the extra reference makes it a
> lot more clear what is happening and why.

The changelog is a bit sparse, no? We have more info here:

https://lkml.org/lkml/2018/5/8/520
https://lkml.org/lkml/2018/7/3/1184

Why not crib some of that and explain the issue properly (or at minimum
link the previous threads)?

Thanks,

Vegard