From: Olga Kornievskaia
Date: Wed, 14 Jun 2023 16:43:56 -0400
Subject: Re: Handling of BADSESSON error
To: Rick Macklem
Cc: Trond Myklebust, linux-nfs
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, Jun 14, 2023 at 4:43 PM Olga Kornievskaia wrote:
>
> On Wed, Jun 14, 2023 at 4:24 PM Rick Macklem wrote:
> >
> > On Wed, Jun 14, 2023 at 12:58 PM Olga Kornievskaia wrote:
> > >
> > > Hi Trond,
> > >
> > > I'm looking for advice on how to handle the problem that when
> > > BADSESSION is received (on an interrupted slot) we don't increment
> > > the seqid for that slot. The client releases the slot and it's
> > > possible for another thread to use it before the session is frozen.
> > > Here are the (unfiltered sequential) tracepoints showing the problem.
> > > Follow slot_nr=0 and seq_nr=7673
> > >
> > > kworker/u2:26-541 [000] ..... 869.508658: nfs4_sequence_done:
> > > error=-10052 (BADSESSION) session=0x90caa481 slot_nr=4 seq_nr=4259
> > > highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> > > kworker/u2:26-541 [000] ..... 869.508661: nfs4_write:
> > > error=-10052 (BADSESSION) fileid=00:3b:111 fhandle=0x59c8ccff
> > > offset=2304664 count=7992 res=0 stateid=1:0x3f4f04cd
> > > layoutstateid=0:0x00000000
> > > kworker/u2:1-3198 [000] ..... 869.508898: nfs4_xdr_status:
> > > task:0000a2ae@00000011 xid=0x5d0f6dda error=-10052 (BADSESSION)
> > > operation=53
> > > kworker/u2:1-3198 [000] ..... 869.508905: nfs4_sequence_done:
> > > error=-10052 (BADSESSION) session=0x90caa481 slot_nr=0 seq_nr=7673
> > > highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> > > dt-3684 [000] ..... 869.508918: nfs4_set_lock:
> > > error=-10052 (BADSESSION) cmd=SETLK:WRLCK range=1603340:1834535
> > > fileid=00:3b:109 fhandle=0x7c6bc6b4 stateid=1:0x8f5f1fe4
> > > lockstateid=0:0x7bd5c66f
> > >
> > > *** this is use of slot_nr=0 seq_nr=7673 that gets BADSESSION. Slot
> > > gets released without incrementing the seq#. The next tracepoint shows
> > > the use of the slot again by another lock call ***
> > >
> > > kworker/u2:1-3198 [000] ..... 869.508928:
> > > nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7673
> > > highest_used_slotid=1
> > > kworker/u2:29-549 [000] ..... 869.509746: nfs4_sequence_done:
> > > error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7673
> > > highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> > > dt-3672 [000] ..... 869.509770: nfs4_set_lock:
> > > error=0 (OK) cmd=SETLK:WRLCK range=146432:159743 fileid=00:3b:129
> > > fhandle=0x50fa2dd4 stateid=1:0xcf065b31 lockstateid=1:0x5c571804
> > > kworker/u2:26-541 [000] ..... 869.509814:
> > > nfs4_setup_sequence: session=0x90caa481 slot_nr=0 seq_nr=7674
> > > highest_used_slotid=0
> > > kworker/u2:26-541 [000] ..... 869.509857:
> > > nfs4_setup_sequence: session=0x90caa481 slot_nr=1 seq_nr=7805
> > > highest_used_slotid=1
> > >
> > > ** finally the state manager gets to run? But only after 3 "NEW" use
> > > of slots are done **
> > >
> > > 172.28.68.180-m-3751 [000] ..... 869.510267: nfs4_state_mgr:
> > > hostname=172.28.68.180 clp state=MANAGER_RUNNING|CHECK_LEASE|0xc040
> > > kworker/u2:29-549 [000] ..... 869.510977: nfs4_xdr_status:
> > > task:0000a2c8@00000011 xid=0x5e0f6dda error=-10052 (BADSESSION)
> > > operation=53
> > > kworker/u2:29-549 [000] ..... 869.510983: nfs4_sequence_done:
> > > error=-10052 (BADSESSION) session=0x90caa481 slot_nr=1 seq_nr=7805
> > > highest_slotid=0 target_highest_slotid=0 status_flags=0x0 ()
> > > kworker/u2:29-549 [000] ..... 869.510985: nfs4_write:
> > > error=-10052 (BADSESSION) fileid=00:3b:129 fhandle=0x50fa2dd4
> > > offset=146432 count=13312 res=0 stateid=1:0xcf065b31
> > > layoutstateid=0:0x00000000
> > > kworker/u2:26-541 [000] ..... 869.511318: nfs4_sequence_done:
> > > error=0 (OK) session=0x90caa481 slot_nr=0 seq_nr=7674
> > > highest_slotid=63 target_highest_slotid=63 status_flags=0x0 ()
> > > dt-3669 [000] ..... 869.511337: nfs4_set_lock:
> > > error=0 (OK) cmd=SETLK:WRLCK range=2462720:2469375 fileid=00:3b:138
> > > fhandle=0xe30d8cf3 stateid=1:0xe2787aa1 lockstateid=1:0x216421fe
> > > 172.28.68.180-m-3751 [000] ..... 869.511918:
> > > nfs4_destroy_session: error=0 (OK) dstaddr=172.28.68.180
> > > 172.28.68.180-m-3751 [000] ..... 869.513347:
> > > nfs4_create_session: error=0 (OK) dstaddr=172.28.68.180
> > >
> > > To prevent reuse of the same slot/seqid when we receive
> > > BADSESSION, can we perhaps set slot->seq_done? Then, when
> > > nfs41_sequence_process() calls nfs41_sequence_free_slot(), it'd
> > > increment seq_nr. Slot re-use would be prevented.
> > >
> > > Or, perhaps we set the NFS4_SLOT_TBL_DRAINING bit right in
> > > nfs41_sequence_process() for BADSESSION so that nothing else can get
> > > the slot when it's released?
> > >
> > > Or some other way of preventing slots being (re)used after receiving
> > > BADSESSION on that slot. The problem with re-using (interrupted) slots
> > > is that they get a cached reply from the server and those operations
> > > "think" the operation succeeded and they have wrong/invalid stateids,
> > > for instance.
> > >
> > > Here's the sequence of events. First of all this is a session trunking
> > > scenario where one of the servers leaves the group.
> > > NFS OP uses slot=0 seq=0 and sends it to server 1. Server 1 processes
> > > the request and populates its session cache. But the reply never
> > > reaches the client. Connection gets reset.
> > > NFS OP is resent using slot=0 seq=0 to server 2 which just left the
> > > trunking group. It replies with BADSESSION
> > > (session is not frozen on the client yet) new NFS OP uses slot=0 seq=0
> > > and sends it to server 1. Server 1 responds out of the session cache.
> > To me, this sounds like a broken NFSv4.1/4.2 server. Once a session is
> > bad, I do not think there should ever be a reply that was cached in
> > that bad session.
> > Put another way, the server should not leave the "trunking group"
> > (whatever that means?) without making the session bad for all trunks.
> > I do not think a session should ever work on one server and not on
> > another one.
>
> The spec allows a server to leave the group and the session to still be
> valid.
>
> From section 2.10.13.1.4
> "If the SEQUENCE requests fail with NFS4ERR_BADSESSION, then the
> session no longer exists on any of the server network addresses for
> which the client has connections associated with that session ID. It
> is possible the session is still alive and available on other network
> addresses."
>
> Linux server

I meant to say Linux client

> throws away the session on getting the BADSESSION but the
> problem is that it doesn't happen instantly. So some new requests can
> sneak in before the session gets killed. Thus I'm advocating that slotid
> still happens on BADSESSION or I'm suggesting that we freeze the
> session table on BADSESSION which we currently don't do -- which
> allows new requests to go.
>
> >
> > Having said the above, I have no opinion w.r.t. how such a case should
> > be handled.
> > (Except to tell the NFS server vendor that their server is broken.)
> >
> > Just mho, rick
> >
> > > Client destroys the session
> > > Client uses stateid returned from the new OP which is really invalid
> > > for the operation. Server fails the operation. Application failure
> > > occurs.
> > >
> > > Thank you..
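
The sequence of events in the quoted report can be modelled with a small toy, offered only as a rough sketch and not as kernel code. Everything in it (ToyServer, run_scenario, the two flags) is a hypothetical name invented for illustration. It assumes a server that replays its cached reply whenever it sees the same slot and seq_nr again, and it compares the current behaviour with the two ideas raised in the thread: advancing seq_nr on BADSESSION, or draining the slot table on BADSESSION.

```python
from dataclasses import dataclass, field

BADSESSION = -10052  # NFS4ERR_BADSESSION, as printed in the tracepoints


@dataclass
class ToyServer:
    """Hypothetical stand-in for server 1: caches the last reply per slot."""
    reply_cache: dict = field(default_factory=dict)  # slot_nr -> (seq_nr, reply)

    def sequence(self, slot_nr, seq_nr, op):
        cached = self.reply_cache.get(slot_nr)
        if cached is not None and cached[0] == seq_nr:
            # Same slot and seq_nr: treated as a retransmission, replay the cache.
            return cached[1]
        reply = f"reply-to-{op}"
        self.reply_cache[slot_nr] = (seq_nr, reply)
        return reply


def run_scenario(bump_on_badsession=False, drain_on_badsession=False):
    """Return the reply the second operation sees, or None if it is held back."""
    server1 = ToyServer()
    slot_nr, seq_nr = 0, 7673
    draining = False

    # Op A is executed by server 1, but the reply is lost (connection reset).
    server1.sequence(slot_nr, seq_nr, "lock-A")

    # Op A is resent to the server that left the trunking group.
    status = BADSESSION
    if status == BADSESSION:
        if bump_on_badsession:
            seq_nr += 1      # first idea in the thread: still advance seq_nr
        if drain_on_badsession:
            draining = True  # second idea: stop handing out slots

    # Op B wants the released slot before the state manager has run.
    if draining:
        return None
    return server1.sequence(slot_nr, seq_nr, "lock-B")
```

With both flags off, the second operation is answered from the cache with the first operation's reply, which is the failure described above. With either flag on, that replay does not happen in this model. The real client has more state (slot->seq_done, the state manager, session recovery) that this sketch leaves out.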