2007-05-24 09:00:00

by Shunichi Sagawa

Subject: Permission information on the directory is output with ?

Hello, all

I have a problem where the permission information of a directory on an
NFS server cannot be retrieved correctly.

# ls -l /po/spool/postfix
total 52
drwx------ 18 postfix root 4096 May 22 00:45 active
drwx------ 2 postfix root 4096 Feb 11 2005 bounce
drwx------ 2 postfix root 4096 Feb 11 2005 corrupt
drwx------ 5 postfix root 4096 May 22 02:21 defer
drwx------ 5 postfix root 4096 May 22 02:21 deferred
drwx------ 2 postfix root 4096 Feb 11 2005 flush
-rw-r--r-- 1 root root 0 May 24 2007 hogehoge
drwx------ 2 postfix root 4096 Feb 11 2005 hold
?--------- ? ? ? ? ? incoming <== here
drwx-wx--- 2 postfix postdrop 4096 May 23 2007 maildrop
drwxr-xr-x 2 root root 4096 May 21 14:35 pid
drwx------ 2 postfix root 4096 May 21 14:35 private
drwx--x--- 2 postfix postdrop 4096 May 21 14:35 public
drwx------ 2 postfix root 4096 Feb 11 2005 saved
drwx------ 2 postfix root 4096 Feb 11 2005 trace
#
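
A "?" row like the one for incoming means that ls read the name from the directory listing but could not get attributes for that entry. The following is a minimal sketch for listing such entries, assuming a POSIX sh; the function name find_unstattable is hypothetical and is not part of any existing tool.

```sh
#!/bin/sh
# Print the names in DIR that appear in the directory listing but whose
# attributes cannot be read (the entries "ls -l" would show with "?").
# find_unstattable is a hypothetical helper name.
find_unstattable() {
    dir=$1
    # "ls -A" only needs to read the directory itself.
    ls -A -- "$dir" | while IFS= read -r name; do
        # "ls -ld" has to lstat() the entry; a failure means no attributes.
        if ! ls -ld -- "$dir/$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, "find_unstattable /po/spool/postfix" would be expected to print "incoming" while the problem is present, and nothing on a healthy mount.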

[System configuration]

nfs-server(RHEL4UP3)
+- x86 machine ----+
|                  |
+------------------+           nfs-client(RHEL4UP1)
|                  |           +- x86 machine ----------------------------------+
|                  |    nfs    |                                                |
+----disk----------+ <-------> | mount /po (option noac)                        |
| /vol/vol0/jpmail |           |  |-spool                                       |
|                  |           |  |  |-postfix                                  |
|                  |           |  |  |  |-active                                |
|                  |           |  |  |  :                                       |
|                  |           |  |  |  |-incoming (ls -l /po/spool/postfix ?)  |
|                  |           |  |  |  |-maildrop (ls -l /po/spool/postfix ?)  |
|                  |           |  |  |  :                                       |
+------------------+           |  |-vmbox                                       |
                               +------------------------------------------------+


[Reproduction procedure]
The output becomes abnormal after the following steps.

1. To generate high load, run tarloop.sh.
This script accesses only a local disk.

# cat tarloop.sh
#!/bin/sh
count=1
while :
do
    time=$(date)
    echo "$time count=$count"
    tar xfz ./dummydir.tgz
    du -a dummydir
    rm -rf dummydir
    count=$((count+1))
done
# ls -l dummydir.tgz
-rw-r--r-- 1 root root 965148975 May 22 11:59 dummydir.tgz
#

2. To generate NFS access, send mail through postfix like this.

# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
220 hazard.soft.fujitsu.com ESMTP Postfix
helo soft.fujitsu.com -------------------------> (1)
250 hazard.soft.fujitsu.com
mail from: [email protected] -------------> (2)
250 Ok
rcpt to: [email protected] ---------------> (3)
250 Ok
:
(1) - (3) repeat.
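
The repeated steps (1) - (3) could also be generated by a script instead of being typed into telnet. This is only a sketch: the function name smtp_commands is hypothetical, and the addresses are passed in as arguments rather than taken from the session above.

```sh
#!/bin/sh
# Print the SMTP commands for COUNT repetitions of steps (1)-(3),
# followed by "quit". smtp_commands is a hypothetical helper name;
# the domain and addresses are supplied by the caller.
smtp_commands() {
    count=$1 domain=$2 sender=$3 rcpt=$4
    i=0
    while [ "$i" -lt "$count" ]; do
        printf 'helo %s\r\n' "$domain"        # step (1)
        printf 'mail from: %s\r\n' "$sender"  # step (2)
        printf 'rcpt to: %s\r\n' "$rcpt"      # step (3)
        i=$((i + 1))
    done
    printf 'quit\r\n'
}
```

One possible use is "smtp_commands 10 soft.fujitsu.com sender@example.com rcpt@example.com | nc localhost 25" (example.com addresses are placeholders). Note that this sends all commands without waiting for each reply, so it is not identical to the interactive session; a closer reproduction would wait for every "250" line before sending the next command.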

###################################################################
- This problem occurs in about 10% of the attempts.
- It occurs more easily when the same procedure is executed right after
umount/mount.
###################################################################

[OK pattern]
helo soft.fujitsu.com -------------------------> (1)
250 hazard.soft.fujitsu.com
mail from: [email protected] -------------> (2)
250 Ok
rcpt to: [email protected] ---------------> (3)
250 Ok
[NG pattern]
helo soft.fujitsu.com -------------------------> (1)
250 hazard.soft.fujitsu.com
mail from: [email protected] -------------> (2)
250 Ok
rcpt to: [email protected] ---------------> (3)


When "250 Ok" is not returned after (3), this problem has been reproduced.

# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda8 19931708 13363664 5555552 71% /
none 517336 0 517336 0% /dev/shm
/dev/sdb1 70557052 43613868 23359088 66% /work
rhas4up3-part2:/vol/vol0/jpmail
8143616 4978528 2751392 65% /po
#
# ls -i /po/spool/postfix/
339662 active 339665 defer 339719 hogehoge 339670 maildrop 339693 public
339663 bounce 339666 deferred 339668 hold 339671 pid 339699 saved
339664 corrupt 339667 flush 0 incoming 339673 private 339700 trace
#
# cd /po/spool/postfix/incoming
-bash: cd: /po/spool/postfix/incoming: No such file or directory
#

What is wrong? Is there a patch for this?


Regards,
Sagawa

-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-06-19 08:25:21

by Shunichi Sagawa

Subject: Re: Permission information on the directory is output with ?

Hello Phrukphicharn,

On Wed, 13 Jun 2007 15:44:29 +0700
"Phrukphicharn, Anuwat" <[email protected]> wrote:

> Hello Shunichi,
>
> I have run the test case for some time, but could not produce such the
> symptom (may be I did something wrong).

I'm sorry, my information was insufficient.

- The "/var/spool/postfix" directory is on the local disk of the NFS client.
- The entries directly under "/var/spool/postfix" are symbolic links
to directories in the NFS-mounted area.

# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda8 19931708 13541004 5378212 72% /
none 516972 0 516972 0% /dev/shm
/dev/sdb1 70557052 43613868 23359088 66% /work
rhas4up3-part2:/vol/vol0/jpmail
8143616 4978656 2751296 65% /po
# ls -ld /var/spool/postfix/
drwxr-xr-x 16 root root 4096 May 21 14:44 /var/spool/postfix/
# ls -l /var/spool/postfix/ | grep incoming
lrwxrwxrwx 1 postfix root 34 May 21 14:44 incoming -> ../../../po/spool/postfix/incoming
#
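
The layout above (a local queue directory whose entries are links into the NFS mount) can be checked for every entry at once. This is a minimal sketch, assuming a POSIX sh that supports "cd -P" and "pwd -P"; list_links_into is a hypothetical helper name.

```sh
#!/bin/sh
# Print every symbolic link in DIR whose resolved target directory lies
# under PREFIX (for example the NFS mount point).
# list_links_into is a hypothetical helper name.
list_links_into() {
    dir=$1
    # Resolve PREFIX physically so both sides are compared the same way.
    prefix=$(cd -P -- "$2" && pwd -P) || return 1
    for entry in "$dir"/*; do
        [ -L "$entry" ] || continue
        # Skip links whose target is not a reachable directory.
        target=$(cd -P -- "$entry" 2>/dev/null && pwd -P) || continue
        case $target/ in
            "$prefix"/*) printf '%s -> %s\n' "${entry##*/}" "$target" ;;
        esac
    done
}
```

With the setup described here, "list_links_into /var/spool/postfix /po" would be expected to list incoming and the other linked queue directories, but not pid, private, or public if those are local.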

Besides "incoming", the following entries are symbolic links of the same kind:
"active", "bounce", "corrupt", "defer", "deferred", "flush", "hold",
"maildrop", "saved", "trace".

This problem occurs within a few seconds with the following procedure.
(A) Send mail through postfix.
[root@hazard ~]# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
220 hazard.soft.fujitsu.com ESMTP Postfix
helo soft.fujitsu.com
250 hazard.soft.fujitsu.com
mail from: [email protected]
250 Ok
rcpt to: [email protected]
250 Ok
quit

(B) Unmount and remount the NFS file system.
[root@hazard ~]# df -TH
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda8 ext3 21G 14G 5.8G 71% /
none tmpfs 530M 0 530M 0% /dev/shm
/dev/sdb1 ext3 73G 45G 24G 66% /work
rhas4up3-part2:/vol/vol0/jpmail
nfs 8.4G 5.1G 2.9G 65% /po
[root@hazard ~]# umount /po
[root@hazard ~]# df -TH | grep nfs
[root@hazard ~]# mount /po
[root@hazard ~]# df -TH | grep nfs
nfs 8.4G 5.1G 2.9G 65% /po
[root@hazard ~]#

(C) Send mail through postfix again.
[root@hazard ~]# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
220 hazard.soft.fujitsu.com ESMTP Postfix
helo soft.fujitsu.com
250 hazard.soft.fujitsu.com
mail from: [email protected]
250 Ok
rcpt to: [email protected]

If "250 Ok" is not output here, run "ls -l /po/spool/postfix" to check the result.

This problem occurs when both of the following conditions hold.
(1) The NFS directory is accessed through a symbolic link that points to it,
for example "/var/spool/postfix/incoming", and
(2) O_EXCL is specified in the flags argument of open(2).
The problem did not occur when I changed queue_directory in /etc/postfix/main.cf
from "/var/spool/postfix" to "/po/spool/postfix".

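An exclusive create through the link can be tried from a shell without postfix. This is only a sketch and not how postfix creates its queue files: it assumes that the shell implements noclobber ("set -C") by calling open(2) with O_CREAT|O_EXCL when the file does not exist, which should be confirmed with strace on the client. The name excl_create and the test file name are hypothetical.

```sh
#!/bin/sh
# Try to create PATH exclusively. Returns 0 if the file was created,
# non-zero if it already exists or the create failed.
# Assumption: with noclobber set, the shell opens a missing file with
# O_CREAT|O_EXCL. excl_create is a hypothetical helper name.
excl_create() {
    ( set -C; : > "$1" ) 2>/dev/null
}
```

Comparing "excl_create /var/spool/postfix/incoming/excl-test.$$" (through the link) with "excl_create /po/spool/postfix/incoming/excl-test.$$" (direct path) right after a umount/mount may show whether the link is needed to trigger the problem.
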
> Btw, I have worked with this symptom on RHEL4, BZ 228801.

Oh, I see.

> Which the issue is on the client side. You
> may want to check otw attributes of the "incoming" in a GETATTR reply
> (for example) if that is correct. I remember that if 'ls -l' think that
> an attribute value is not valid it will print "?" or other case is it
> cannot retrieve attribute of the file via 'stat("file") call. Which
> latter is my case that the client keeps negative dentry and never
> invalidate it (as a result, a stat() returns ENOENT as the nfs dentry
> revalidation code thinks that it is still valid (so no inode allocated
> for the dentry) while the file is already exist on the server).

Thank you for your detailed information.
The symptom of BZ 228801 (no inode allocated for the dentry) is the same as in my problem.
But I think the cause of my problem is different from that of BZ 228801.

I reproduced my problem and confirmed that it is fixed by the following patches. :-)
(1) Start from the RHEL4 UP1 base kernel.
(2) Apply the following patch (an NFS fix from RHEL4 UP2).
linux-2.6.9-nfs-estale.patch
(3) Apply http://linux-nfs.org/Linux-2.6.x/2.6.12-rc6/linux-2.6.12-02-fix_O_EXCL.dif

Thank you.

Regards,
Sagawa
