2013-02-28 15:22:45

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.

These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.

According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.

So, we have four new flags:
O_DENYREAD - to prevent other opens with read access,
O_DENYWRITE - to prevent other opens with write access,
O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
O_DENYMAND - to switch on/off three flags above.

Also, this version of the patchset fixes the problem of data races on newely created files: open with O_CREAT can return the -ETXTBSY error for a successfully created file if this files was locked with a deny lock by another task.

The #1 patch adds flags to fcntl, the #2 patch implements VFS part. The patches #3, #4, #5 are related to CIFS-specific changes, #6 and #7 describe NFS and NFSD parts.

The preliminary patch for Samba that replaces the existing use of flock/LOCK_MAND mechanism with O_DENY* flags:
http://git.etersoft.ru/people/piastry/packages/?p=samba.git;a=commitdiff;h=0596193832ace26a1dd54160c7380d071a83115b

The future part of open man page patch:

-----
O_DENYMAND - used to inforce a mandatory share reservation scheme of the file access. If this flag is passed, the open fails with -ETXTBSY in following cases:
1) if O_DENYREAD flag is specified and there is another open with O_DENYMAND flag and READ access to the file;
2) if O_DENYWRITE flag is specified and there is another open with O_DENYMAND flag and WRITE access to the file;
3) if READ access is requested and there is another open with O_DENYMAND and O_DENYREAD flags;
4) if WRITE access is requested and there is another open with O_DENYMAND and O_DENYWRITE flags.

If O_DENYDELETE flag is specified and the open succeded, any further unlink operation will fail with -ETXTBSY untill this open is closed. Now this flag is processed by CIFS filesystems only.

This mechanism is done through flocks. If O_DENYMAND is specified and flock syscall is processed after the open. The share modes are translated into flock according the following rules:
1) O_DENYMAND -> LOCK_MAND
2) !O_DENYREAD -> LOCK_READ
3) !O_DENYWRITE -> LOCK_WRITE
-----

Pavel Shilovsky (7):
fcntl: Introduce new O_DENY* open flags
vfs: Add O_DENYREAD/WRITE flags support for open syscall
CIFS: Add O_DENY* open flags support
CIFS: Use NT_CREATE_ANDX command for forcemand mounts
CIFS: Translate SHARING_VIOLATION to -ETXTBSY error code for SMB2
NFSv4: Add O_DENY* open flags support
NFSD: Pass share reservations flags to VFS

fs/cifs/cifsacl.c | 6 +-
fs/cifs/cifsglob.h | 12 +++-
fs/cifs/cifsproto.h | 9 +--
fs/cifs/cifssmb.c | 47 ++++++++--------
fs/cifs/dir.c | 14 +++--
fs/cifs/file.c | 18 ++++--
fs/cifs/inode.c | 11 ++--
fs/cifs/link.c | 10 ++--
fs/cifs/readdir.c | 2 +-
fs/cifs/smb1ops.c | 15 ++---
fs/cifs/smb2file.c | 10 ++--
fs/cifs/smb2inode.c | 4 +-
fs/cifs/smb2maperror.c | 2 +-
fs/cifs/smb2ops.c | 10 ++--
fs/cifs/smb2pdu.c | 6 +-
fs/cifs/smb2proto.h | 14 +++--
fs/fcntl.c | 5 +-
fs/locks.c | 117 +++++++++++++++++++++++++++++++++++----
fs/namei.c | 44 ++++++++++++++-
fs/nfs/nfs4xdr.c | 24 ++++++--
fs/nfsd/nfs4state.c | 46 ++++++++++++++-
include/linux/fs.h | 6 ++
include/uapi/asm-generic/fcntl.h | 14 +++++
23 files changed, 345 insertions(+), 101 deletions(-)

--
1.8.1.2



2013-02-28 15:22:52

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 3/7] CIFS: Add O_DENY* open flags support

Make CIFSSMBOpen take share_flags as a parm that allows us
to pass new O_DENY* flags to the server.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/cifs/cifsacl.c | 6 ++++--
fs/cifs/cifsglob.h | 12 +++++++++++-
fs/cifs/cifsproto.h | 9 +++++----
fs/cifs/cifssmb.c | 47 +++++++++++++++++++++++++----------------------
fs/cifs/dir.c | 13 ++++++++-----
fs/cifs/file.c | 12 ++++++++----
fs/cifs/inode.c | 11 ++++++-----
fs/cifs/link.c | 10 +++++-----
fs/cifs/readdir.c | 2 +-
fs/cifs/smb1ops.c | 15 ++++++++-------
fs/cifs/smb2file.c | 10 +++++-----
fs/cifs/smb2inode.c | 4 ++--
fs/cifs/smb2ops.c | 10 ++++++----
fs/cifs/smb2pdu.c | 6 +++---
fs/cifs/smb2proto.h | 14 ++++++++------
15 files changed, 105 insertions(+), 76 deletions(-)

diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index 5cbd00e..222b989 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -891,7 +891,8 @@ static struct cifs_ntsd *get_cifs_acl_by_path(struct cifs_sb_info *cifs_sb,
create_options |= CREATE_OPEN_BACKUP_INTENT;

rc = CIFSSMBOpen(xid, tcon, path, FILE_OPEN, READ_CONTROL,
- create_options, &fid, &oplock, NULL, cifs_sb->local_nls,
+ FILE_SHARE_ALL, create_options, &fid, &oplock,
+ NULL, cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
if (!rc) {
rc = CIFSSMBGetCIFSACL(xid, tcon, fid, &pntsd, pacllen);
@@ -952,7 +953,8 @@ int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen,
access_flags = WRITE_DAC;

rc = CIFSSMBOpen(xid, tcon, path, FILE_OPEN, access_flags,
- create_options, &fid, &oplock, NULL, cifs_sb->local_nls,
+ FILE_SHARE_ALL, create_options, &fid, &oplock,
+ NULL, cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
if (rc) {
cERROR(1, "Unable to open file to set ACL");
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index e6899ce..f1d62ed 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -310,7 +310,7 @@ struct smb_version_operations {
struct cifs_sb_info *);
/* open a file for non-posix mounts */
int (*open)(const unsigned int, struct cifs_tcon *, const char *, int,
- int, int, struct cifs_fid *, __u32 *, FILE_ALL_INFO *,
+ int, int, int, struct cifs_fid *, __u32 *, FILE_ALL_INFO *,
struct cifs_sb_info *);
/* set fid protocol-specific info */
void (*set_fid)(struct cifsFileInfo *, struct cifs_fid *, __u32);
@@ -947,6 +947,16 @@ struct cifsFileInfo {
struct work_struct oplock_break; /* work for oplock breaks */
};

+#define CIFS_DENY_RW_FLAGS_SHIFT 22
+#define CIFS_DENY_DEL_FLAG_SHIFT 23
+
+static inline int cifs_get_share_flags(unsigned int flags)
+{
+ return (flags & O_DENYMAND) ?
+ (((~(flags >> CIFS_DENY_RW_FLAGS_SHIFT)) & 3) |
+ ((~(flags >> CIFS_DENY_DEL_FLAG_SHIFT)) & 4)) : FILE_SHARE_ALL;
+}
+
struct cifs_io_parms {
__u16 netfid;
#ifdef CONFIG_CIFS_SMB2
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 1988c1b..9661607 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -361,10 +361,11 @@ extern int CIFSSMBQueryReparseLinkInfo(const unsigned int xid,
const struct nls_table *nls_codepage);
#endif /* temporarily unused until cifs_symlink fixed */
extern int CIFSSMBOpen(const unsigned int xid, struct cifs_tcon *tcon,
- const char *fileName, const int disposition,
- const int access_flags, const int omode,
- __u16 *netfid, int *pOplock, FILE_ALL_INFO *,
- const struct nls_table *nls_codepage, int remap);
+ const char *file_name, const int disposition,
+ const int access_flags, const int share_flags,
+ const int omode, __u16 *netfid, int *oplock,
+ FILE_ALL_INFO *, const struct nls_table *nls_codepage,
+ int remap);
extern int SMBLegacyOpen(const unsigned int xid, struct cifs_tcon *tcon,
const char *fileName, const int disposition,
const int access_flags, const int omode,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 76d0d29..9c4632f 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -1289,10 +1289,11 @@ OldOpenRetry:

int
CIFSSMBOpen(const unsigned int xid, struct cifs_tcon *tcon,
- const char *fileName, const int openDisposition,
- const int access_flags, const int create_options, __u16 *netfid,
- int *pOplock, FILE_ALL_INFO *pfile_info,
- const struct nls_table *nls_codepage, int remap)
+ const char *file_name, const int disposition,
+ const int access_flags, const int share_flags,
+ const int create_options, __u16 *netfid, int *oplock,
+ FILE_ALL_INFO *file_info, const struct nls_table *nls_codepage,
+ int remap)
{
int rc = -EACCES;
OPEN_REQ *pSMB = NULL;
@@ -1313,26 +1314,28 @@ openRetry:
count = 1; /* account for one byte pad to word boundary */
name_len =
cifsConvertToUTF16((__le16 *) (pSMB->fileName + 1),
- fileName, PATH_MAX, nls_codepage, remap);
+ file_name, PATH_MAX, nls_codepage,
+ remap);
name_len++; /* trailing null */
name_len *= 2;
pSMB->NameLength = cpu_to_le16(name_len);
} else { /* BB improve check for buffer overruns BB */
count = 0; /* no pad */
- name_len = strnlen(fileName, PATH_MAX);
+ name_len = strnlen(file_name, PATH_MAX);
name_len++; /* trailing null */
pSMB->NameLength = cpu_to_le16(name_len);
- strncpy(pSMB->fileName, fileName, name_len);
+ strncpy(pSMB->fileName, file_name, name_len);
}
- if (*pOplock & REQ_OPLOCK)
+ if (*oplock & REQ_OPLOCK)
pSMB->OpenFlags = cpu_to_le32(REQ_OPLOCK);
- else if (*pOplock & REQ_BATCHOPLOCK)
+ else if (*oplock & REQ_BATCHOPLOCK)
pSMB->OpenFlags = cpu_to_le32(REQ_BATCHOPLOCK);
pSMB->DesiredAccess = cpu_to_le32(access_flags);
pSMB->AllocationSize = 0;
- /* set file as system file if special file such
- as fifo and server expecting SFU style and
- no Unix extensions */
+ /*
+ * set file as system file if special file such as fifo and server
+ * expecting SFU style and no Unix extensions
+ */
if (create_options & CREATE_OPTION_SPECIAL)
pSMB->FileAttributes = cpu_to_le32(ATTR_SYSTEM);
else
@@ -1347,8 +1350,8 @@ openRetry:
if (create_options & CREATE_OPTION_READONLY)
pSMB->FileAttributes |= cpu_to_le32(ATTR_READONLY);

- pSMB->ShareAccess = cpu_to_le32(FILE_SHARE_ALL);
- pSMB->CreateDisposition = cpu_to_le32(openDisposition);
+ pSMB->ShareAccess = cpu_to_le32(share_flags);
+ pSMB->CreateDisposition = cpu_to_le32(disposition);
pSMB->CreateOptions = cpu_to_le32(create_options & CREATE_OPTIONS_MASK);
/* BB Expirement with various impersonation levels and verify */
pSMB->ImpersonationLevel = cpu_to_le32(SECURITY_IMPERSONATION);
@@ -1366,20 +1369,20 @@ openRetry:
if (rc) {
cFYI(1, "Error in Open = %d", rc);
} else {
- *pOplock = pSMBr->OplockLevel; /* 1 byte no need to le_to_cpu */
+ *oplock = pSMBr->OplockLevel; /* 1 byte no need to le_to_cpu */
*netfid = pSMBr->Fid; /* cifs fid stays in le */
/* Let caller know file was created so we can set the mode. */
/* Do we care about the CreateAction in any other cases? */
if (cpu_to_le32(FILE_CREATE) == pSMBr->CreateAction)
- *pOplock |= CIFS_CREATE_ACTION;
- if (pfile_info) {
- memcpy((char *)pfile_info, (char *)&pSMBr->CreationTime,
+ *oplock |= CIFS_CREATE_ACTION;
+ if (file_info) {
+ memcpy((char *)file_info, (char *)&pSMBr->CreationTime,
36 /* CreationTime to Attributes */);
/* the file_info buf is endian converted by caller */
- pfile_info->AllocationSize = pSMBr->AllocationSize;
- pfile_info->EndOfFile = pSMBr->EndOfFile;
- pfile_info->NumberOfLinks = cpu_to_le32(1);
- pfile_info->DeletePending = 0;
+ file_info->AllocationSize = pSMBr->AllocationSize;
+ file_info->EndOfFile = pSMBr->EndOfFile;
+ file_info->NumberOfLinks = cpu_to_le32(1);
+ file_info->DeletePending = 0;
}
}

diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 8719bbe..6975072 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -203,6 +203,7 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
FILE_ALL_INFO *buf = NULL;
struct inode *newinode = NULL;
int disposition;
+ int share_access;
struct TCP_Server_Info *server = tcon->ses->server;

*oplock = 0;
@@ -293,6 +294,8 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
else
cFYI(1, "Create flag not set in create function");

+ share_access = cifs_get_share_flags(oflags);
+
/*
* BB add processing to set equivalent of mode - e.g. via CreateX with
* ACLs
@@ -320,8 +323,8 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
create_options |= CREATE_OPEN_BACKUP_INTENT;

rc = server->ops->open(xid, tcon, full_path, disposition,
- desired_access, create_options, fid, oplock,
- buf, cifs_sb);
+ desired_access, share_access, create_options,
+ fid, oplock, buf, cifs_sb);
if (rc) {
cFYI(1, "cifs_create returned 0x%x", rc);
goto out;
@@ -626,9 +629,9 @@ int cifs_mknod(struct inode *inode, struct dentry *direntry, umode_t mode,
if (backup_cred(cifs_sb))
create_options |= CREATE_OPEN_BACKUP_INTENT;

- rc = CIFSSMBOpen(xid, pTcon, full_path, FILE_CREATE,
- GENERIC_WRITE, create_options,
- &fileHandle, &oplock, buf, cifs_sb->local_nls,
+ rc = CIFSSMBOpen(xid, pTcon, full_path, FILE_CREATE, GENERIC_WRITE,
+ FILE_SHARE_ALL, create_options, &fileHandle, &oplock,
+ buf, cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
if (rc)
goto mknod_out;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 8ea6ca5..3ad484c 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -174,6 +174,7 @@ cifs_nt_open(char *full_path, struct inode *inode, struct cifs_sb_info *cifs_sb,
{
int rc;
int desired_access;
+ int share_access;
int disposition;
int create_options = CREATE_NOT_DIR;
FILE_ALL_INFO *buf;
@@ -209,6 +210,7 @@ cifs_nt_open(char *full_path, struct inode *inode, struct cifs_sb_info *cifs_sb,
*********************************************************************/

disposition = cifs_get_disposition(f_flags);
+ share_access = cifs_get_share_flags(f_flags);

/* BB pass O_SYNC flag through on file attributes .. BB */

@@ -220,8 +222,8 @@ cifs_nt_open(char *full_path, struct inode *inode, struct cifs_sb_info *cifs_sb,
create_options |= CREATE_OPEN_BACKUP_INTENT;

rc = server->ops->open(xid, tcon, full_path, disposition,
- desired_access, create_options, fid, oplock, buf,
- cifs_sb);
+ desired_access, share_access, create_options,
+ fid, oplock, buf, cifs_sb);

if (rc)
goto out;
@@ -579,6 +581,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
struct inode *inode;
char *full_path = NULL;
int desired_access;
+ int share_access;
int disposition = FILE_OPEN;
int create_options = CREATE_NOT_DIR;
struct cifs_fid fid;
@@ -643,6 +646,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
}

desired_access = cifs_convert_flags(cfile->f_flags);
+ share_access = cifs_get_share_flags(cfile->f_flags);

if (backup_cred(cifs_sb))
create_options |= CREATE_OPEN_BACKUP_INTENT;
@@ -658,8 +662,8 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
* not dirty locally we could do this.
*/
rc = server->ops->open(xid, tcon, full_path, disposition,
- desired_access, create_options, &fid, &oplock,
- NULL, cifs_sb);
+ desired_access, share_access, create_options,
+ &fid, &oplock, NULL, cifs_sb);
if (rc) {
mutex_unlock(&cfile->fh_mutex);
cFYI(1, "cifs_reopen returned 0x%x", rc);
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index ed6208f..a497dfa 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -396,8 +396,8 @@ cifs_sfu_type(struct cifs_fattr *fattr, const unsigned char *path,
tcon = tlink_tcon(tlink);

rc = CIFSSMBOpen(xid, tcon, path, FILE_OPEN, GENERIC_READ,
- CREATE_NOT_DIR, &netfid, &oplock, NULL,
- cifs_sb->local_nls,
+ FILE_SHARE_ALL, CREATE_NOT_DIR, &netfid, &oplock,
+ NULL, cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags &
CIFS_MOUNT_MAP_SPECIAL_CHR);
if (rc == 0) {
@@ -987,8 +987,9 @@ cifs_rename_pending_delete(const char *full_path, struct dentry *dentry,
tcon = tlink_tcon(tlink);

rc = CIFSSMBOpen(xid, tcon, full_path, FILE_OPEN,
- DELETE|FILE_WRITE_ATTRIBUTES, CREATE_NOT_DIR,
- &netfid, &oplock, NULL, cifs_sb->local_nls,
+ DELETE|FILE_WRITE_ATTRIBUTES, FILE_SHARE_ALL,
+ CREATE_NOT_DIR, &netfid, &oplock, NULL,
+ cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
if (rc != 0)
goto out;
@@ -1509,7 +1510,7 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry,

/* open the file to be renamed -- we need DELETE perms */
rc = CIFSSMBOpen(xid, tcon, from_path, FILE_OPEN, DELETE,
- CREATE_NOT_DIR, &srcfid, &oplock, NULL,
+ FILE_SHARE_ALL, CREATE_NOT_DIR, &srcfid, &oplock, NULL,
cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
CIFS_MOUNT_MAP_SPECIAL_CHR);
if (rc == 0) {
diff --git a/fs/cifs/link.c b/fs/cifs/link.c
index 51dc2fb..9b4f0db 100644
--- a/fs/cifs/link.c
+++ b/fs/cifs/link.c
@@ -212,7 +212,7 @@ CIFSCreateMFSymLink(const unsigned int xid, struct cifs_tcon *tcon,
create_options |= CREATE_OPEN_BACKUP_INTENT;

rc = CIFSSMBOpen(xid, tcon, fromName, FILE_CREATE, GENERIC_WRITE,
- create_options, &netfid, &oplock, NULL,
+ FILE_SHARE_ALL, create_options, &netfid, &oplock, NULL,
nls_codepage, remap);
if (rc != 0) {
kfree(buf);
@@ -254,8 +254,8 @@ CIFSQueryMFSymLink(const unsigned int xid, struct cifs_tcon *tcon,
FILE_ALL_INFO file_info;

rc = CIFSSMBOpen(xid, tcon, searchName, FILE_OPEN, GENERIC_READ,
- CREATE_NOT_DIR, &netfid, &oplock, &file_info,
- nls_codepage, remap);
+ FILE_SHARE_ALL, CREATE_NOT_DIR, &netfid, &oplock,
+ &file_info, nls_codepage, remap);
if (rc != 0)
return rc;

@@ -332,8 +332,8 @@ CIFSCheckMFSymlink(struct cifs_fattr *fattr,
pTcon = tlink_tcon(tlink);

rc = CIFSSMBOpen(xid, pTcon, path, FILE_OPEN, GENERIC_READ,
- CREATE_NOT_DIR, &netfid, &oplock, &file_info,
- cifs_sb->local_nls,
+ FILE_SHARE_ALL, CREATE_NOT_DIR, &netfid, &oplock,
+ &file_info, cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags &
CIFS_MOUNT_MAP_SPECIAL_CHR);
if (rc != 0)
diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
index cdd6ff4..726b52e 100644
--- a/fs/cifs/readdir.c
+++ b/fs/cifs/readdir.c
@@ -224,7 +224,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb,
char *tmpbuffer;

rc = CIFSSMBOpen(xid, ptcon, full_path, FILE_OPEN, GENERIC_READ,
- OPEN_REPARSE_POINT, &fid, &oplock, NULL,
+ FILE_SHARE_ALL, OPEN_REPARSE_POINT, &fid, &oplock, NULL,
cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
if (!rc) {
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
index 47bc5a8..d782209 100644
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -671,9 +671,9 @@ cifs_mkdir_setinfo(struct inode *inode, const char *full_path,

static int
cifs_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
- int disposition, int desired_access, int create_options,
- struct cifs_fid *fid, __u32 *oplock, FILE_ALL_INFO *buf,
- struct cifs_sb_info *cifs_sb)
+ int disposition, int desired_access, int share_access,
+ int create_options, struct cifs_fid *fid, __u32 *oplock,
+ FILE_ALL_INFO *buf, struct cifs_sb_info *cifs_sb)
{
if (!(tcon->ses->capabilities & CAP_NT_SMBS))
return SMBLegacyOpen(xid, tcon, path, disposition,
@@ -682,8 +682,8 @@ cifs_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
cifs_sb->local_nls, cifs_sb->mnt_cifs_flags
& CIFS_MOUNT_MAP_SPECIAL_CHR);
return CIFSSMBOpen(xid, tcon, path, disposition, desired_access,
- create_options, &fid->netfid, oplock, buf,
- cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
+ share_access, create_options, &fid->netfid, oplock,
+ buf, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
CIFS_MOUNT_MAP_SPECIAL_CHR);
}

@@ -779,8 +779,9 @@ smb_set_file_info(struct inode *inode, const char *full_path,
cFYI(1, "calling SetFileInfo since SetPathInfo for times not supported "
"by this server");
rc = CIFSSMBOpen(xid, tcon, full_path, FILE_OPEN,
- SYNCHRONIZE | FILE_WRITE_ATTRIBUTES, CREATE_NOT_DIR,
- &netfid, &oplock, NULL, cifs_sb->local_nls,
+ SYNCHRONIZE | FILE_WRITE_ATTRIBUTES, FILE_SHARE_ALL,
+ CREATE_NOT_DIR, &netfid, &oplock, NULL,
+ cifs_sb->local_nls,
cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);

if (rc != 0) {
diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c
index 71e6aed..7dfb50c 100644
--- a/fs/cifs/smb2file.c
+++ b/fs/cifs/smb2file.c
@@ -58,9 +58,9 @@ smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock)

int
smb2_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
- int disposition, int desired_access, int create_options,
- struct cifs_fid *fid, __u32 *oplock, FILE_ALL_INFO *buf,
- struct cifs_sb_info *cifs_sb)
+ int disposition, int desired_access, int share_access,
+ int create_options, struct cifs_fid *fid, __u32 *oplock,
+ FILE_ALL_INFO *buf, struct cifs_sb_info *cifs_sb)
{
int rc;
__le16 *smb2_path;
@@ -87,8 +87,8 @@ smb2_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
memcpy(smb2_oplock + 1, fid->lease_key, SMB2_LEASE_KEY_SIZE);

rc = SMB2_open(xid, tcon, smb2_path, &fid->persistent_fid,
- &fid->volatile_fid, desired_access, disposition,
- 0, 0, smb2_oplock, smb2_data);
+ &fid->volatile_fid, desired_access, share_access,
+ disposition, 0, 0, smb2_oplock, smb2_data);
if (rc)
goto out;

diff --git a/fs/cifs/smb2inode.c b/fs/cifs/smb2inode.c
index 7064824..6af174a 100644
--- a/fs/cifs/smb2inode.c
+++ b/fs/cifs/smb2inode.c
@@ -54,8 +54,8 @@ smb2_open_op_close(const unsigned int xid, struct cifs_tcon *tcon,
return -ENOMEM;

rc = SMB2_open(xid, tcon, utf16_path, &persistent_fid, &volatile_fid,
- desired_access, create_disposition, file_attributes,
- create_options, &oplock, NULL);
+ desired_access, FILE_SHARE_ALL, create_disposition,
+ file_attributes, create_options, &oplock, NULL);
if (rc) {
kfree(utf16_path);
return rc;
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index c9c7aa7..dc38434 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -222,7 +222,8 @@ smb2_is_path_accessible(const unsigned int xid, struct cifs_tcon *tcon,
return -ENOMEM;

rc = SMB2_open(xid, tcon, utf16_path, &persistent_fid, &volatile_fid,
- FILE_READ_ATTRIBUTES, FILE_OPEN, 0, 0, &oplock, NULL);
+ FILE_READ_ATTRIBUTES, FILE_SHARE_ALL, FILE_OPEN, 0, 0,
+ &oplock, NULL);
if (rc) {
kfree(utf16_path);
return rc;
@@ -432,8 +433,8 @@ smb2_query_dir_first(const unsigned int xid, struct cifs_tcon *tcon,
return -ENOMEM;

rc = SMB2_open(xid, tcon, utf16_path, &persistent_fid, &volatile_fid,
- FILE_READ_ATTRIBUTES | FILE_READ_DATA, FILE_OPEN, 0, 0,
- &oplock, NULL);
+ FILE_READ_ATTRIBUTES | FILE_READ_DATA, FILE_SHARE_ALL,
+ FILE_OPEN, 0, 0, &oplock, NULL);
kfree(utf16_path);
if (rc) {
cERROR(1, "open dir failed");
@@ -515,7 +516,8 @@ smb2_queryfs(const unsigned int xid, struct cifs_tcon *tcon,
u8 oplock = SMB2_OPLOCK_LEVEL_NONE;

rc = SMB2_open(xid, tcon, &srch_path, &persistent_fid, &volatile_fid,
- FILE_READ_ATTRIBUTES, FILE_OPEN, 0, 0, &oplock, NULL);
+ FILE_READ_ATTRIBUTES, FILE_SHARE_ALL, FILE_OPEN, 0, 0,
+ &oplock, NULL);
if (rc)
return rc;
buf->f_type = SMB2_MAGIC_NUMBER;
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 41d9d07..45fa2dc 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -910,8 +910,8 @@ parse_lease_state(struct smb2_create_rsp *rsp)
int
SMB2_open(const unsigned int xid, struct cifs_tcon *tcon, __le16 *path,
u64 *persistent_fid, u64 *volatile_fid, __u32 desired_access,
- __u32 create_disposition, __u32 file_attributes, __u32 create_options,
- __u8 *oplock, struct smb2_file_all_info *buf)
+ __u32 share_access, __u32 create_disposition, __u32 file_attributes,
+ __u32 create_options, __u8 *oplock, struct smb2_file_all_info *buf)
{
struct smb2_create_req *req;
struct smb2_create_rsp *rsp;
@@ -940,7 +940,7 @@ SMB2_open(const unsigned int xid, struct cifs_tcon *tcon, __le16 *path,
req->DesiredAccess = cpu_to_le32(desired_access);
/* File attributes ignored on open (used in create though) */
req->FileAttributes = cpu_to_le32(file_attributes);
- req->ShareAccess = FILE_SHARE_ALL_LE;
+ req->ShareAccess = cpu_to_le32(share_access);
req->CreateDisposition = cpu_to_le32(create_disposition);
req->CreateOptions = cpu_to_le32(create_options);
uni_path_len = (2 * UniStrnlen((wchar_t *)path, PATH_MAX)) + 2;
diff --git a/fs/cifs/smb2proto.h b/fs/cifs/smb2proto.h
index 2aa3535..edff8f6 100644
--- a/fs/cifs/smb2proto.h
+++ b/fs/cifs/smb2proto.h
@@ -86,9 +86,10 @@ extern int smb2_create_hardlink(const unsigned int xid, struct cifs_tcon *tcon,

extern int smb2_open_file(const unsigned int xid, struct cifs_tcon *tcon,
const char *full_path, int disposition,
- int desired_access, int create_options,
- struct cifs_fid *fid, __u32 *oplock,
- FILE_ALL_INFO *buf, struct cifs_sb_info *cifs_sb);
+ int desired_access, int share_access,
+ int create_options, struct cifs_fid *fid,
+ __u32 *oplock, FILE_ALL_INFO *buf,
+ struct cifs_sb_info *cifs_sb);
extern void smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock);
extern int smb2_unlock_range(struct cifsFileInfo *cfile,
struct file_lock *flock, const unsigned int xid);
@@ -108,9 +109,10 @@ extern int SMB2_tcon(const unsigned int xid, struct cifs_ses *ses,
extern int SMB2_tdis(const unsigned int xid, struct cifs_tcon *tcon);
extern int SMB2_open(const unsigned int xid, struct cifs_tcon *tcon,
__le16 *path, u64 *persistent_fid, u64 *volatile_fid,
- __u32 desired_access, __u32 create_disposition,
- __u32 file_attributes, __u32 create_options,
- __u8 *oplock, struct smb2_file_all_info *buf);
+ __u32 desired_access, __u32 share_access,
+ __u32 create_disposition, __u32 file_attributes,
+ __u32 create_options, __u8 *oplock,
+ struct smb2_file_all_info *buf);
extern int SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
u64 persistent_file_id, u64 volatile_file_id);
extern int SMB2_flush(const unsigned int xid, struct cifs_tcon *tcon,
--
1.8.1.2


2013-02-28 15:23:02

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

that maps them into O_DENY flags and make them visible for
applications that use O_DENYMAND opens.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/locks.c | 1 +
fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 0cc7d1b..593d464 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -874,6 +874,7 @@ deny_lock_file(struct file *filp)
locks_free_lock(lock);
return error;
}
+EXPORT_SYMBOL(deny_lock_file);

static int __posix_lock_file(struct inode *inode, struct file_lock *request, struct file_lock *conflock)
{
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index ac8ed96c..766256a 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -476,6 +476,19 @@ test_deny(u32 access, struct nfs4_ol_stateid *stp)
return test_bit(access, &stp->st_deny_bmap);
}

+static int nfs4_deny_to_odeny(u32 access)
+{
+ switch (access & NFS4_SHARE_DENY_BOTH) {
+ case NFS4_SHARE_DENY_READ:
+ return O_DENYMAND | O_DENYREAD;
+ case NFS4_SHARE_DENY_WRITE:
+ return O_DENYWRITE | O_DENYMAND;
+ case NFS4_SHARE_DENY_BOTH:
+ return O_DENYREAD | O_DENYWRITE | O_DENYMAND;
+ }
+ return O_DENYMAND;
+}
+
static int nfs4_access_to_omode(u32 access)
{
switch (access & NFS4_SHARE_ACCESS_BOTH) {
@@ -2793,6 +2806,21 @@ nfsd4_truncate(struct svc_rqst *rqstp, struct svc_fh *fh,
}

static __be32
+nfs4_vfs_set_deny(struct nfs4_file *fp, unsigned long share_access,
+ unsigned long deny_access)
+{
+ int oflag, rc;
+ __be32 status = nfs_ok;
+
+ oflag = nfs4_access_to_omode(share_access);
+ fp->fi_fds[oflag]->f_flags |= nfs4_deny_to_odeny(deny_access);
+ rc = deny_lock_file(fp->fi_fds[oflag]);
+ if (rc)
+ status = nfserrno(rc);
+ return status;
+}
+
+static __be32
nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
{
u32 op_share_access = open->op_share_access;
@@ -2813,6 +2841,14 @@ nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *c
}
return status;
}
+ status = nfs4_vfs_set_deny(fp, op_share_access, open->op_share_deny);
+ if (status) {
+ if (new_access) {
+ int oflag = nfs4_access_to_omode(op_share_access);
+ nfs4_file_put_access(fp, oflag);
+ }
+ return status;
+ }
/* remember the open */
set_access(op_share_access, stp);
set_deny(open->op_share_deny, stp);
@@ -3046,7 +3082,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf

/*
* OPEN the file, or upgrade an existing OPEN.
- * If truncate fails, the OPEN fails.
+ * If truncate or setting deny fails, the OPEN fails.
*/
if (stp) {
/* Stateid was found, this is an OPEN upgrade */
@@ -3060,6 +3096,10 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
status = nfsd4_truncate(rqstp, current_fh, open);
if (status)
goto out;
+ status = nfs4_vfs_set_deny(fp, open->op_share_access,
+ open->op_share_deny);
+ if (status)
+ goto out;
stp = open->op_stp;
open->op_stp = NULL;
init_open_stateid(stp, fp, open);
@@ -3758,6 +3798,10 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
}
nfs4_stateid_downgrade(stp, od->od_share_access);

+ status = nfs4_vfs_set_deny(stp->st_file, od->od_share_access,
+ od->od_share_deny);
+ if (status)
+ goto out;
reset_union_bmap_deny(od->od_share_deny, stp);

update_stateid(&stp->st_stid.sc_stateid);
--
1.8.1.2


2013-02-28 15:22:54

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 4/7] CIFS: Use NT_CREATE_ANDX command for forcemand mounts

forcemand mount option now lets us use Windows mandatory style of
byte-range locks even if server supports posix ones - switches on
Windows locking mechanism. Share flags is another locking mehanism
provided by Windows semantic that can be used by NT_CREATE_ANDX
command. This patch combines all Windows locking mechanism in one
mount option by using NT_CREATE_ANDX to open files if forcemand is on.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/cifs/dir.c | 1 +
fs/cifs/file.c | 6 ++++--
2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 6975072..139c8bc 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -217,6 +217,7 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
}

if (tcon->unix_ext && cap_unix(tcon->ses) && !tcon->broken_posix_open &&
+ ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOPOSIXBRL) == 0) &&
(CIFS_UNIX_POSIX_PATH_OPS_CAP &
le64_to_cpu(tcon->fsUnixInfo.Capability))) {
rc = cifs_posix_open(full_path, &newinode, inode->i_sb, mode,
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 3ad484c..05191da 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -454,8 +454,9 @@ int cifs_open(struct inode *inode, struct file *file)
else
oplock = 0;

- if (!tcon->broken_posix_open && tcon->unix_ext &&
- cap_unix(tcon->ses) && (CIFS_UNIX_POSIX_PATH_OPS_CAP &
+ if (!tcon->broken_posix_open && tcon->unix_ext && cap_unix(tcon->ses)
+ && ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOPOSIXBRL) == 0) &&
+ (CIFS_UNIX_POSIX_PATH_OPS_CAP &
le64_to_cpu(tcon->fsUnixInfo.Capability))) {
/* can not refresh inode info since size could be stale */
rc = cifs_posix_open(full_path, &inode, inode->i_sb,
@@ -623,6 +624,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
oplock = 0;

if (tcon->unix_ext && cap_unix(tcon->ses) &&
+ ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOPOSIXBRL) == 0) &&
(CIFS_UNIX_POSIX_PATH_OPS_CAP &
le64_to_cpu(tcon->fsUnixInfo.Capability))) {
/*
--
1.8.1.2


2013-02-28 15:22:47

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 1/7] fcntl: Introduce new O_DENY* open flags

This patch adds 4 flags:
1) O_DENYREAD that doesn't permit read access,
2) O_DENYWRITE that doesn't permit write access,
3) O_DENYDELETE that doesn't permit delete or rename,
4) O_DENYMAND that enables O_DENY* flags checks.

Network filesystems CIFS, SMB2.0, SMB3.0 and NFSv4 have such flags -
this change can benefit cifs and nfs modules as well as Samba and
NFS file servers. These flags are only take affect for opens with
O_DENYMAND flags - there is no impact on native Linux opens.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/fcntl.c | 5 +++--
include/uapi/asm-generic/fcntl.h | 14 ++++++++++++++
2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/fcntl.c b/fs/fcntl.c
index 71a600a..d88a548 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -730,14 +730,15 @@ static int __init fcntl_init(void)
* Exceptions: O_NONBLOCK is a two bit define on parisc; O_NDELAY
* is defined as O_NONBLOCK on some platforms and not on others.
*/
- BUILD_BUG_ON(19 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32(
+ BUILD_BUG_ON(23 - 1 /* for O_RDONLY being 0 */ != HWEIGHT32(
O_RDONLY | O_WRONLY | O_RDWR |
O_CREAT | O_EXCL | O_NOCTTY |
O_TRUNC | O_APPEND | /* O_NONBLOCK | */
__O_SYNC | O_DSYNC | FASYNC |
O_DIRECT | O_LARGEFILE | O_DIRECTORY |
O_NOFOLLOW | O_NOATIME | O_CLOEXEC |
- __FMODE_EXEC | O_PATH
+ __FMODE_EXEC | O_PATH | O_DENYREAD |
+ O_DENYWRITE | O_DENYDELETE | O_DENYMAND
));

fasync_cache = kmem_cache_create("fasync_cache",
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index a48937d..6e4e979 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -84,6 +84,20 @@
#define O_PATH 010000000
#endif

+#ifndef O_DENYREAD
+#define O_DENYREAD 020000000 /* Do not permit read access */
+#endif
+#ifndef O_DENYWRITE
+#define O_DENYWRITE 040000000 /* Do not permit write access */
+#endif
+/* FMODE_NONOTIFY 0100000000 */
+#ifndef O_DENYDELETE
+#define O_DENYDELETE 0200000000 /* Do not permit delete or rename */
+#endif
+#ifndef O_DENYMAND
+#define O_DENYMAND 0400000000 /* Enable O_DENY flags checks */
+#endif
+
#ifndef O_NDELAY
#define O_NDELAY O_NONBLOCK
#endif
--
1.8.1.2


2013-02-28 15:22:56

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 5/7] CIFS: Translate SHARING_VIOLATION to -ETXTBSY error code for SMB2

to make it match CIFS and VFS variants.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/cifs/smb2maperror.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c
index 494c912..11e589e 100644
--- a/fs/cifs/smb2maperror.c
+++ b/fs/cifs/smb2maperror.c
@@ -356,7 +356,7 @@ static const struct status_to_posix_error smb2_error_map_table[] = {
{STATUS_PORT_CONNECTION_REFUSED, -ECONNREFUSED,
"STATUS_PORT_CONNECTION_REFUSED"},
{STATUS_INVALID_PORT_HANDLE, -EIO, "STATUS_INVALID_PORT_HANDLE"},
- {STATUS_SHARING_VIOLATION, -EBUSY, "STATUS_SHARING_VIOLATION"},
+ {STATUS_SHARING_VIOLATION, -ETXTBSY, "STATUS_SHARING_VIOLATION"},
{STATUS_QUOTA_EXCEEDED, -EDQUOT, "STATUS_QUOTA_EXCEEDED"},
{STATUS_INVALID_PAGE_PROTECTION, -EIO,
"STATUS_INVALID_PAGE_PROTECTION"},
--
1.8.1.2


2013-02-28 15:22:58

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 6/7] NFSv4: Add O_DENY* open flags support

by passing these flags to NFSv4 open request.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/nfs/nfs4xdr.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 26b1439..58ddc74 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1325,7 +1325,8 @@ static void encode_lookup(struct xdr_stream *xdr, const struct qstr *name, struc
encode_string(xdr, name->len, name->name);
}

-static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
+static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode,
+ int open_flags)
{
__be32 *p;

@@ -1343,7 +1344,22 @@ static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
default:
*p++ = cpu_to_be32(0);
}
- *p = cpu_to_be32(0); /* for linux, share_deny = 0 always */
+ if (open_flags & O_DENYMAND) {
+ switch (open_flags & (O_DENYREAD|O_DENYWRITE)) {
+ case O_DENYREAD:
+ *p = cpu_to_be32(NFS4_SHARE_DENY_READ);
+ break;
+ case O_DENYWRITE:
+ *p = cpu_to_be32(NFS4_SHARE_DENY_WRITE);
+ break;
+ case O_DENYREAD|O_DENYWRITE:
+ *p = cpu_to_be32(NFS4_SHARE_DENY_BOTH);
+ break;
+ default:
+ *p = cpu_to_be32(0);
+ }
+ } else
+ *p = cpu_to_be32(0);
}

static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
@@ -1354,7 +1370,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
* owner 4 = 32
*/
encode_nfs4_seqid(xdr, arg->seqid);
- encode_share_access(xdr, arg->fmode);
+ encode_share_access(xdr, arg->fmode, arg->open_flags);
p = reserve_space(xdr, 36);
p = xdr_encode_hyper(p, arg->clientid);
*p++ = cpu_to_be32(24);
@@ -1491,7 +1507,7 @@ static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_close
encode_op_hdr(xdr, OP_OPEN_DOWNGRADE, decode_open_downgrade_maxsz, hdr);
encode_nfs4_stateid(xdr, arg->stateid);
encode_nfs4_seqid(xdr, arg->seqid);
- encode_share_access(xdr, arg->fmode);
+ encode_share_access(xdr, arg->fmode, 0);
}

static void
--
1.8.1.2


2013-02-28 21:53:28

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

[possible resend -- sorry]

On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
> This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.
>
> These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.
>
> According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.
>
> So, we have four new flags:
> O_DENYREAD - to prevent other opens with read access,
> O_DENYWRITE - to prevent other opens with write access,
> O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
> O_DENYMAND - to switch on/off three flags above.

O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
better?

Other than that, this seems like a sensible mechanism.

--Andy


2013-02-28 15:22:50

by Pavel Shilovsky

[permalink] [raw]
Subject: [PATCH v3 2/7] vfs: Add O_DENYREAD/WRITE flags support for open syscall

If O_DENYMAND flag is specified, O_DENYREAD/WRITE/MAND flags are
translated to flock's flags:

!O_DENYREAD -> LOCK_READ
!O_DENYWRITE -> LOCK_WRITE
O_DENYMAND -> LOCK_MAND

and set through flock_lock_file on a file.

This change affects opens that use O_DENYMAND flag - all other
native Linux opens don't care about these flags. It allow us to
enable this feature for applications that need it (e.g. NFS and
Samba servers that export the same directory for Windows clients,
or Wine applications that access the same files simultaneously).

Create codepath is slightly changed to prevent data races on
newely created files: when open with O_CREAT can return with -ETXTBSY
error for successfully created files due to a deny lock set by
another task.

Signed-off-by: Pavel Shilovsky <[email protected]>
---
fs/locks.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++------
fs/namei.c | 44 ++++++++++++++++++--
include/linux/fs.h | 6 +++
3 files changed, 151 insertions(+), 15 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index a94e331..0cc7d1b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -605,20 +605,81 @@ static int posix_locks_conflict(struct file_lock *caller_fl, struct file_lock *s
return (locks_conflict(caller_fl, sys_fl));
}

-/* Determine if lock sys_fl blocks lock caller_fl. FLOCK specific
- * checking before calling the locks_conflict().
+static unsigned int
+deny_flags_to_cmd(unsigned int flags)
+{
+ unsigned int cmd = LOCK_MAND;
+
+ if (!(flags & O_DENYREAD))
+ cmd |= LOCK_READ;
+ if (!(flags & O_DENYWRITE))
+ cmd |= LOCK_WRITE;
+
+ return cmd;
+}
+
+/*
+ * locks_mand_conflict - Determine if there's a share reservation conflict
+ * @caller_fl: lock we're attempting to acquire
+ * @sys_fl: lock already present on system that we're checking against
+ *
+ * Check to see if there's a share_reservation conflict. LOCK_READ/LOCK_WRITE
+ * tell us whether the reservation allows other readers and writers.
+ *
+ * We only check against other LOCK_MAND locks, so applications that want to
+ * use share mode locking will only conflict against one another. "normal"
+ * applications that open files won't be affected by and won't themselves
+ * affect the share reservations.
*/
-static int flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
+static int
+locks_mand_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
{
- /* FLOCK locks referring to the same filp do not conflict with
+ unsigned char caller_type = caller_fl->fl_type;
+ unsigned char sys_type = sys_fl->fl_type;
+ fmode_t caller_fmode = caller_fl->fl_file->f_mode;
+ fmode_t sys_fmode = sys_fl->fl_file->f_mode;
+
+ /* they can only conflict if they're both LOCK_MAND */
+ if (!(caller_type & LOCK_MAND) || !(sys_type & LOCK_MAND))
+ return 0;
+
+ if (!(caller_type & LOCK_READ) && (sys_fmode & FMODE_READ))
+ return 1;
+ if (!(caller_type & LOCK_WRITE) && (sys_fmode & FMODE_WRITE))
+ return 1;
+ if (!(sys_type & LOCK_READ) && (caller_fmode & FMODE_READ))
+ return 1;
+ if (!(sys_type & LOCK_WRITE) && (caller_fmode & FMODE_WRITE))
+ return 1;
+
+ return 0;
+}
+
+/*
+ * Determine if lock sys_fl blocks lock caller_fl. FLOCK specific checking
+ * before calling the locks_conflict(). Boolean is_mand indicates whether
+ * we should use a share reservation scheme or not.
+ */
+static int
+flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl,
+ bool is_mand)
+{
+ /*
+ * FLOCK locks referring to the same filp do not conflict with
* each other.
*/
- if (!IS_FLOCK(sys_fl) || (caller_fl->fl_file == sys_fl->fl_file))
- return (0);
- if ((caller_fl->fl_type & LOCK_MAND) || (sys_fl->fl_type & LOCK_MAND))
+ if (!IS_FLOCK(sys_fl))
+ return 0;
+ if ((caller_fl->fl_type & LOCK_MAND) || (sys_fl->fl_type & LOCK_MAND)) {
+ if (is_mand)
+ return locks_mand_conflict(caller_fl, sys_fl);
+ else
+ return 0;
+ }
+ if (caller_fl->fl_file == sys_fl->fl_file)
return 0;

- return (locks_conflict(caller_fl, sys_fl));
+ return locks_conflict(caller_fl, sys_fl);
}

void
@@ -697,14 +758,19 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
return 0;
}

-/* Try to create a FLOCK lock on filp. We always insert new FLOCK locks
+/*
+ * Try to create a FLOCK lock on filp. We always insert new FLOCK locks
* after any leases, but before any posix locks.
*
* Note that if called with an FL_EXISTS argument, the caller may determine
* whether or not a lock was successfully freed by testing the return
* value for -ENOENT.
+ *
+ * Take @is_conflict callback that determines how to check if locks have
+ * conflicts or not.
*/
-static int flock_lock_file(struct file *filp, struct file_lock *request)
+static int
+flock_lock_file(struct file *filp, struct file_lock *request, bool is_mand)
{
struct file_lock *new_fl = NULL;
struct file_lock **before;
@@ -760,7 +826,7 @@ find_conflict:
break;
if (IS_LEASE(fl))
continue;
- if (!flock_locks_conflict(request, fl))
+ if (!flock_locks_conflict(request, fl, is_mand))
continue;
error = -EAGAIN;
if (!(request->fl_flags & FL_SLEEP))
@@ -783,6 +849,32 @@ out:
return error;
}

+/*
+ * Determine if a file is allowed to be opened with specified access and deny
+ * modes. Lock the file and return 0 if checks passed, otherwise return a error
+ * code.
+ */
+int
+deny_lock_file(struct file *filp)
+{
+ struct file_lock *lock;
+ int error = 0;
+
+ if (!(filp->f_flags & O_DENYMAND))
+ return error;
+
+ error = flock_make_lock(filp, &lock, deny_flags_to_cmd(filp->f_flags));
+ if (error)
+ return error;
+
+ error = flock_lock_file(filp, lock, true);
+ if (error == -EAGAIN)
+ error = -ETXTBSY;
+
+ locks_free_lock(lock);
+ return error;
+}
+
static int __posix_lock_file(struct inode *inode, struct file_lock *request, struct file_lock *conflock)
{
struct file_lock *fl;
@@ -1589,7 +1681,7 @@ int flock_lock_file_wait(struct file *filp, struct file_lock *fl)
int error;
might_sleep();
for (;;) {
- error = flock_lock_file(filp, fl);
+ error = flock_lock_file(filp, fl, false);
if (error != FILE_LOCK_DEFERRED)
break;
error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
diff --git a/fs/namei.c b/fs/namei.c
index 43a97ee..c1f7e08 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2559,9 +2559,14 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
* here.
*/
error = may_open(&file->f_path, acc_mode, open_flag);
- if (error)
+ if (error) {
fput(file);
+ goto out;
+ }

+ error = deny_lock_file(file);
+ if (error)
+ fput(file);
out:
dput(dentry);
return error;
@@ -2771,9 +2776,9 @@ retry_lookup:
}
mutex_lock(&dir->d_inode->i_mutex);
error = lookup_open(nd, path, file, op, got_write, opened);
- mutex_unlock(&dir->d_inode->i_mutex);

if (error <= 0) {
+ mutex_unlock(&dir->d_inode->i_mutex);
if (error)
goto out;

@@ -2791,8 +2796,32 @@ retry_lookup:
will_truncate = false;
acc_mode = MAY_OPEN;
path_to_nameidata(path, nd);
- goto finish_open_created;
+
+ /*
+ * Unlock parent i_mutex later when the open finishes - prevent
+ * races when a file can be locked with a deny lock by another
+ * task that opens the file.
+ */
+ error = may_open(&nd->path, acc_mode, open_flag);
+ if (error) {
+ mutex_unlock(&dir->d_inode->i_mutex);
+ goto out;
+ }
+ file->f_path.mnt = nd->path.mnt;
+ error = finish_open(file, nd->path.dentry, NULL, opened);
+ if (error) {
+ mutex_unlock(&dir->d_inode->i_mutex);
+ if (error == -EOPENSTALE)
+ goto stale_open;
+ goto out;
+ }
+ error = deny_lock_file(file);
+ mutex_unlock(&dir->d_inode->i_mutex);
+ if (error)
+ goto exit_fput;
+ goto opened;
}
+ mutex_unlock(&dir->d_inode->i_mutex);

/*
* create/update audit record if it already exists.
@@ -2885,6 +2914,15 @@ finish_open_created:
goto stale_open;
goto out;
}
+ /*
+ * Lock parent i_mutex to prevent races with deny locks on newely
+ * created files.
+ */
+ mutex_lock(&dir->d_inode->i_mutex);
+ error = deny_lock_file(file);
+ mutex_unlock(&dir->d_inode->i_mutex);
+ if (error)
+ goto exit_fput;
opened:
error = open_check_o_direct(file);
if (error)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7617ee0..347e1de 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1005,6 +1005,7 @@ extern int lease_modify(struct file_lock **, int);
extern int lock_may_read(struct inode *, loff_t start, unsigned long count);
extern int lock_may_write(struct inode *, loff_t start, unsigned long count);
extern void locks_delete_block(struct file_lock *waiter);
+extern int deny_lock_file(struct file *);
extern void lock_flocks(void);
extern void unlock_flocks(void);
#else /* !CONFIG_FILE_LOCKING */
@@ -1153,6 +1154,11 @@ static inline void locks_delete_block(struct file_lock *waiter)
{
}

+static inline int deny_lock_file(struct file *filp)
+{
+ return 0;
+}
+
static inline void lock_flocks(void)
{
}
--
1.8.1.2


2013-03-11 18:46:15

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] vfs: Add O_DENYREAD/WRITE flags support for open syscall

On Thu, 28 Feb 2013 19:25:28 +0400
Pavel Shilovsky <[email protected]> wrote:

> If O_DENYMAND flag is specified, O_DENYREAD/WRITE/MAND flags are
> translated to flock's flags:
>
> !O_DENYREAD -> LOCK_READ
> !O_DENYWRITE -> LOCK_WRITE
> O_DENYMAND -> LOCK_MAND
>
> and set through flock_lock_file on a file.
>
> This change affects opens that use O_DENYMAND flag - all other
> native Linux opens don't care about these flags. It allow us to
> enable this feature for applications that need it (e.g. NFS and
> Samba servers that export the same directory for Windows clients,
> or Wine applications that access the same files simultaneously).
>
> Create codepath is slightly changed to prevent data races on
> newely created files: when open with O_CREAT can return with -ETXTBSY
> error for successfully created files due to a deny lock set by
> another task.
>
> Signed-off-by: Pavel Shilovsky <[email protected]>
> ---
> fs/locks.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++------
> fs/namei.c | 44 ++++++++++++++++++--
> include/linux/fs.h | 6 +++
> 3 files changed, 151 insertions(+), 15 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index a94e331..0cc7d1b 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -605,20 +605,81 @@ static int posix_locks_conflict(struct file_lock *caller_fl, struct file_lock *s
> return (locks_conflict(caller_fl, sys_fl));
> }
>
> -/* Determine if lock sys_fl blocks lock caller_fl. FLOCK specific
> - * checking before calling the locks_conflict().
> +static unsigned int
> +deny_flags_to_cmd(unsigned int flags)
> +{
> + unsigned int cmd = LOCK_MAND;
> +
> + if (!(flags & O_DENYREAD))
> + cmd |= LOCK_READ;
> + if (!(flags & O_DENYWRITE))
> + cmd |= LOCK_WRITE;
> +
> + return cmd;
> +}
> +
> +/*
> + * locks_mand_conflict - Determine if there's a share reservation conflict
> + * @caller_fl: lock we're attempting to acquire
> + * @sys_fl: lock already present on system that we're checking against
> + *
> + * Check to see if there's a share_reservation conflict. LOCK_READ/LOCK_WRITE
> + * tell us whether the reservation allows other readers and writers.
> + *
> + * We only check against other LOCK_MAND locks, so applications that want to
> + * use share mode locking will only conflict against one another. "normal"
> + * applications that open files won't be affected by and won't themselves
> + * affect the share reservations.
> */
> -static int flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
> +static int
> +locks_mand_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
> {
> - /* FLOCK locks referring to the same filp do not conflict with
> + unsigned char caller_type = caller_fl->fl_type;
> + unsigned char sys_type = sys_fl->fl_type;
> + fmode_t caller_fmode = caller_fl->fl_file->f_mode;
> + fmode_t sys_fmode = sys_fl->fl_file->f_mode;
> +
> + /* they can only conflict if they're both LOCK_MAND */
> + if (!(caller_type & LOCK_MAND) || !(sys_type & LOCK_MAND))
> + return 0;
> +
> + if (!(caller_type & LOCK_READ) && (sys_fmode & FMODE_READ))
> + return 1;
> + if (!(caller_type & LOCK_WRITE) && (sys_fmode & FMODE_WRITE))
> + return 1;
> + if (!(sys_type & LOCK_READ) && (caller_fmode & FMODE_READ))
> + return 1;
> + if (!(sys_type & LOCK_WRITE) && (caller_fmode & FMODE_WRITE))
> + return 1;
> +
> + return 0;
> +}
> +
> +/*
> + * Determine if lock sys_fl blocks lock caller_fl. FLOCK specific checking
> + * before calling the locks_conflict(). Boolean is_mand indicates whether
> + * we should use a share reservation scheme or not.
> + */
> +static int
> +flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl,
> + bool is_mand)

I'm not sure you really need to add this new is_mand bool. Won't that
be equivalent to (caller_fl->fl_type & LOCK_MAND)?

> +{
> + /*
> + * FLOCK locks referring to the same filp do not conflict with
> * each other.
> */
> - if (!IS_FLOCK(sys_fl) || (caller_fl->fl_file == sys_fl->fl_file))
> - return (0);
> - if ((caller_fl->fl_type & LOCK_MAND) || (sys_fl->fl_type & LOCK_MAND))
> + if (!IS_FLOCK(sys_fl))
> + return 0;
> + if ((caller_fl->fl_type & LOCK_MAND) || (sys_fl->fl_type & LOCK_MAND)) {
> + if (is_mand)
> + return locks_mand_conflict(caller_fl, sys_fl);
> + else
> + return 0;
> + }
> + if (caller_fl->fl_file == sys_fl->fl_file)
> return 0;
>
> - return (locks_conflict(caller_fl, sys_fl));
> + return locks_conflict(caller_fl, sys_fl);
> }
>
> void
> @@ -697,14 +758,19 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
> return 0;
> }
>
> -/* Try to create a FLOCK lock on filp. We always insert new FLOCK locks
> +/*
> + * Try to create a FLOCK lock on filp. We always insert new FLOCK locks
> * after any leases, but before any posix locks.
> *
> * Note that if called with an FL_EXISTS argument, the caller may determine
> * whether or not a lock was successfully freed by testing the return
> * value for -ENOENT.
> + *
> + * Take @is_conflict callback that determines how to check if locks have
> + * conflicts or not.
> */
> -static int flock_lock_file(struct file *filp, struct file_lock *request)
> +static int
> +flock_lock_file(struct file *filp, struct file_lock *request, bool is_mand)

Ditto here on the is_mand bool. I think you can determine that from
request->fl_type. Right?

> {
> struct file_lock *new_fl = NULL;
> struct file_lock **before;
> @@ -760,7 +826,7 @@ find_conflict:
> break;
> if (IS_LEASE(fl))
> continue;
> - if (!flock_locks_conflict(request, fl))
> + if (!flock_locks_conflict(request, fl, is_mand))
> continue;
> error = -EAGAIN;
> if (!(request->fl_flags & FL_SLEEP))
> @@ -783,6 +849,32 @@ out:
> return error;
> }
>
> +/*
> + * Determine if a file is allowed to be opened with specified access and deny
> + * modes. Lock the file and return 0 if checks passed, otherwise return a error
> + * code.
> + */
> +int
> +deny_lock_file(struct file *filp)
> +{
> + struct file_lock *lock;
> + int error = 0;
> +
> + if (!(filp->f_flags & O_DENYMAND))
> + return error;
> +
> + error = flock_make_lock(filp, &lock, deny_flags_to_cmd(filp->f_flags));
> + if (error)
> + return error;
> +
> + error = flock_lock_file(filp, lock, true);
> + if (error == -EAGAIN)
> + error = -ETXTBSY;
> +

I think EBUSY would be a better return code here. ETXTBSY is returned
in more specific circumstances -- mostly when you're opening a file for
write that is being executed.

> + locks_free_lock(lock);
> + return error;
> +}
> +
> static int __posix_lock_file(struct inode *inode, struct file_lock *request, struct file_lock *conflock)
> {
> struct file_lock *fl;
> @@ -1589,7 +1681,7 @@ int flock_lock_file_wait(struct file *filp, struct file_lock *fl)
> int error;
> might_sleep();
> for (;;) {
> - error = flock_lock_file(filp, fl);
> + error = flock_lock_file(filp, fl, false);
> if (error != FILE_LOCK_DEFERRED)
> break;
> error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
> diff --git a/fs/namei.c b/fs/namei.c
> index 43a97ee..c1f7e08 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2559,9 +2559,14 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
> * here.
> */
> error = may_open(&file->f_path, acc_mode, open_flag);
> - if (error)
> + if (error) {
> fput(file);
> + goto out;
> + }
>
> + error = deny_lock_file(file);
> + if (error)
> + fput(file);
> out:
> dput(dentry);
> return error;
> @@ -2771,9 +2776,9 @@ retry_lookup:
> }
> mutex_lock(&dir->d_inode->i_mutex);
> error = lookup_open(nd, path, file, op, got_write, opened);
> - mutex_unlock(&dir->d_inode->i_mutex);
>
> if (error <= 0) {
> + mutex_unlock(&dir->d_inode->i_mutex);
> if (error)
> goto out;
>
> @@ -2791,8 +2796,32 @@ retry_lookup:
> will_truncate = false;
> acc_mode = MAY_OPEN;
> path_to_nameidata(path, nd);
> - goto finish_open_created;
> +
> + /*
> + * Unlock parent i_mutex later when the open finishes - prevent
> + * races when a file can be locked with a deny lock by another
> + * task that opens the file.
> + */
> + error = may_open(&nd->path, acc_mode, open_flag);
> + if (error) {
> + mutex_unlock(&dir->d_inode->i_mutex);
> + goto out;
> + }
> + file->f_path.mnt = nd->path.mnt;
> + error = finish_open(file, nd->path.dentry, NULL, opened);
> + if (error) {
> + mutex_unlock(&dir->d_inode->i_mutex);
> + if (error == -EOPENSTALE)
> + goto stale_open;
> + goto out;
> + }
> + error = deny_lock_file(file);
> + mutex_unlock(&dir->d_inode->i_mutex);
> + if (error)
> + goto exit_fput;
> + goto opened;
> }
> + mutex_unlock(&dir->d_inode->i_mutex);
>
> /*
> * create/update audit record if it already exists.
> @@ -2885,6 +2914,15 @@ finish_open_created:
> goto stale_open;
> goto out;
> }
> + /*
> + * Lock parent i_mutex to prevent races with deny locks on newely
> + * created files.
> + */
> + mutex_lock(&dir->d_inode->i_mutex);
> + error = deny_lock_file(file);
> + mutex_unlock(&dir->d_inode->i_mutex);
> + if (error)
> + goto exit_fput;
> opened:
> error = open_check_o_direct(file);
> if (error)
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 7617ee0..347e1de 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1005,6 +1005,7 @@ extern int lease_modify(struct file_lock **, int);
> extern int lock_may_read(struct inode *, loff_t start, unsigned long count);
> extern int lock_may_write(struct inode *, loff_t start, unsigned long count);
> extern void locks_delete_block(struct file_lock *waiter);
> +extern int deny_lock_file(struct file *);
> extern void lock_flocks(void);
> extern void unlock_flocks(void);
> #else /* !CONFIG_FILE_LOCKING */
> @@ -1153,6 +1154,11 @@ static inline void locks_delete_block(struct file_lock *waiter)
> {
> }
>
> +static inline int deny_lock_file(struct file *filp)
> +{
> + return 0;
> +}
> +
> static inline void lock_flocks(void)
> {
> }


--
Jeff Layton <[email protected]>

2013-03-01 06:44:12

by Pavel Shilovsky

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

2013/3/1 Andy Lutomirski <[email protected]>:
> [possible resend -- sorry]
>
> On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
>> This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.
>>
>> These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.
>>
>> According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.
>>
>> So, we have four new flags:
>> O_DENYREAD - to prevent other opens with read access,
>> O_DENYWRITE - to prevent other opens with write access,
>> O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
>> O_DENYMAND - to switch on/off three flags above.
>
> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
> better?
>
> Other than that, this seems like a sensible mechanism.

I don't mind to rename it. Your suggestion looks ok to me, thanks.

--
Best regards,
Pavel Shilovsky.

2013-03-12 12:35:23

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] NFSv4: Add O_DENY* open flags support

On Mon, 11 Mar 2013 14:54:34 -0400
Jeff Layton <[email protected]> wrote:

> On Thu, 28 Feb 2013 19:25:32 +0400
> Pavel Shilovsky <[email protected]> wrote:
>
> > by passing these flags to NFSv4 open request.
> >
> > Signed-off-by: Pavel Shilovsky <[email protected]>
> > ---
> > fs/nfs/nfs4xdr.c | 24 ++++++++++++++++++++----
> > 1 file changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> > index 26b1439..58ddc74 100644
> > --- a/fs/nfs/nfs4xdr.c
> > +++ b/fs/nfs/nfs4xdr.c
> > @@ -1325,7 +1325,8 @@ static void encode_lookup(struct xdr_stream *xdr, const struct qstr *name, struc
> > encode_string(xdr, name->len, name->name);
> > }
> >
> > -static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
> > +static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode,
> > + int open_flags)
> > {
> > __be32 *p;
> >
> > @@ -1343,7 +1344,22 @@ static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
> > default:
> > *p++ = cpu_to_be32(0);
> > }
> > - *p = cpu_to_be32(0); /* for linux, share_deny = 0 always */
> > + if (open_flags & O_DENYMAND) {
>
>
> As Bruce mentioned, I think a mount option to enable this on a per-fs
> basis would be a better approach than this new O_DENYMAND flag.
>
>
> > + switch (open_flags & (O_DENYREAD|O_DENYWRITE)) {
> > + case O_DENYREAD:
> > + *p = cpu_to_be32(NFS4_SHARE_DENY_READ);
> > + break;
> > + case O_DENYWRITE:
> > + *p = cpu_to_be32(NFS4_SHARE_DENY_WRITE);
> > + break;
> > + case O_DENYREAD|O_DENYWRITE:
> > + *p = cpu_to_be32(NFS4_SHARE_DENY_BOTH);
> > + break;
> > + default:
> > + *p = cpu_to_be32(0);
> > + }
> > + } else
> > + *p = cpu_to_be32(0);
> > }
> >
> > static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
> > @@ -1354,7 +1370,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
> > * owner 4 = 32
> > */
> > encode_nfs4_seqid(xdr, arg->seqid);
> > - encode_share_access(xdr, arg->fmode);
> > + encode_share_access(xdr, arg->fmode, arg->open_flags);
> > p = reserve_space(xdr, 36);
> > p = xdr_encode_hyper(p, arg->clientid);
> > *p++ = cpu_to_be32(24);
> > @@ -1491,7 +1507,7 @@ static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_close
> > encode_op_hdr(xdr, OP_OPEN_DOWNGRADE, decode_open_downgrade_maxsz, hdr);
> > encode_nfs4_stateid(xdr, arg->stateid);
> > encode_nfs4_seqid(xdr, arg->seqid);
> > - encode_share_access(xdr, arg->fmode);
> > + encode_share_access(xdr, arg->fmode, 0);
> > }
> >
> > static void
>
>
> Other than that, this seems reasonable.
>
> Acked-by: Jeff Layton <[email protected]>

Oh duh...

Please ignore my comment on patch #7 to add a patch for the NFS client.
This one does that. That said, there may be a potential problem here
that you need to consider.

In the case of a local filesystem you'll want to set deny locks using
deny_lock_file(). For a network filesystem like CIFS or NFS though,
the server will handle that atomically during the open. You need to
ensure that you don't go trying to set LOCK_MAND locks on the file once
that's done.

Perhaps you can use a fstype flag to indicate that the filesystem
handles this during the open and doesn't need to try and set a flock
lock?

--
Jeff Layton <[email protected]>

2013-03-11 18:54:38

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] NFSv4: Add O_DENY* open flags support

On Thu, 28 Feb 2013 19:25:32 +0400
Pavel Shilovsky <[email protected]> wrote:

> by passing these flags to NFSv4 open request.
>
> Signed-off-by: Pavel Shilovsky <[email protected]>
> ---
> fs/nfs/nfs4xdr.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> index 26b1439..58ddc74 100644
> --- a/fs/nfs/nfs4xdr.c
> +++ b/fs/nfs/nfs4xdr.c
> @@ -1325,7 +1325,8 @@ static void encode_lookup(struct xdr_stream *xdr, const struct qstr *name, struc
> encode_string(xdr, name->len, name->name);
> }
>
> -static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
> +static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode,
> + int open_flags)
> {
> __be32 *p;
>
> @@ -1343,7 +1344,22 @@ static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
> default:
> *p++ = cpu_to_be32(0);
> }
> - *p = cpu_to_be32(0); /* for linux, share_deny = 0 always */
> + if (open_flags & O_DENYMAND) {


As Bruce mentioned, I think a mount option to enable this on a per-fs
basis would be a better approach than this new O_DENYMAND flag.


> + switch (open_flags & (O_DENYREAD|O_DENYWRITE)) {
> + case O_DENYREAD:
> + *p = cpu_to_be32(NFS4_SHARE_DENY_READ);
> + break;
> + case O_DENYWRITE:
> + *p = cpu_to_be32(NFS4_SHARE_DENY_WRITE);
> + break;
> + case O_DENYREAD|O_DENYWRITE:
> + *p = cpu_to_be32(NFS4_SHARE_DENY_BOTH);
> + break;
> + default:
> + *p = cpu_to_be32(0);
> + }
> + } else
> + *p = cpu_to_be32(0);
> }
>
> static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
> @@ -1354,7 +1370,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
> * owner 4 = 32
> */
> encode_nfs4_seqid(xdr, arg->seqid);
> - encode_share_access(xdr, arg->fmode);
> + encode_share_access(xdr, arg->fmode, arg->open_flags);
> p = reserve_space(xdr, 36);
> p = xdr_encode_hyper(p, arg->clientid);
> *p++ = cpu_to_be32(24);
> @@ -1491,7 +1507,7 @@ static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_close
> encode_op_hdr(xdr, OP_OPEN_DOWNGRADE, decode_open_downgrade_maxsz, hdr);
> encode_nfs4_stateid(xdr, arg->stateid);
> encode_nfs4_seqid(xdr, arg->seqid);
> - encode_share_access(xdr, arg->fmode);
> + encode_share_access(xdr, arg->fmode, 0);
> }
>
> static void


Other than that, this seems reasonable.

Acked-by: Jeff Layton <[email protected]>

2013-03-11 20:11:10

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

On Mon, Mar 11, 2013 at 04:08:44PM -0400, Jeff Layton wrote:
> On Mon, 11 Mar 2013 15:36:38 -0400
> "J. Bruce Fields" <[email protected]> wrote:
>
> > On Mon, Mar 11, 2013 at 03:05:40PM -0400, Jeff Layton wrote:
> > > knfsd has some code already to handle share reservations internally.
> > > Nothing outside of knfsd is aware of these reservations, of course so
> > > moving to a vfs-level object for it would be a marked improvement.
> > >
> > > It doesn't look like this patch removes any of that old code though. I
> > > think it probably should, or there ought to be some consideration of
> > > how this new stuff will mesh with it.
> > >
> > > I think you have 2 choices here:
> > >
> > > 1/ rip out the old share reservation code altogether and require that
> > > filesystems mount with -o sharemand or whatever if they want to allow
> > > their enforcement
> > >
> > > 2/ make knfsd fall back to using the internal share reservation code
> > > when the mount option isn't enabled
> > >
> > > Personally, I think #1 would be fine, but Bruce may want to weigh in on
> > > what he'd prefer.
> >
> > #1 sounds good. Clients that use deny bits are few. My preference
> > would be to return an error to such clients in the case share locks
> > aren't available.
> >
> > (We're a little out of spec there, so I'm not sure which error. I think
> > the goal is to notify a human there's a problem with minimal collateral
> > damange.
> >
> > NFS4ERR_SERVERFAULT ("I'm a buggy server, sorry about that!") would
> > probably result in an IO error to the application.
> >
> > SHARE_DENIED strikes me as unsafe: an application would be in its rights
> > not to even check for that e.g. in the case of an exclusive create.
> >
> > Maybe DELAY? Kind of ridiculous, but blocking the application
> > indefinitely would probably get someone's attention quickly enough
> > without doing any damnage.)
> >
>
> I agree that we should return an error, but hadn't considered what
> error. Given that hardly any NFS clients use them, I'd probably just go
> with NFS4ERR_SERVERFAULT, and maybe throw a printk or something on the
> server about enabling share reservations for superblock x:y.

Sounds reasonable.

> Pavel, as a side note, you may want to consider adding a patch to hook
> this stuff up in the NFS client as well.

Definitely.

--b.

2013-03-11 14:00:00

by Pavel Shilovsky

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

2013/3/5 Simo <[email protected]>:
> On 03/05/2013 01:13 PM, J. Bruce Fields wrote:
>>
>> On Mon, Mar 04, 2013 at 05:49:46PM -0500, Simo wrote:
>>>
>>> On 03/04/2013 04:19 PM, J. Bruce Fields wrote:
>>>> I'm a little more worried: these are mandatory locks, and applications
>>>> that use them are used to the locks being enforced correctly. Are we
>>>> sure that an application that opens a file O_DENYWRITE won't crash if it
>>>> sees the file data change while it holds the open?
>>>
>>> The redirector may simply assume it has full control of that part of
>>> the file and not read nor send data until the lock is released too,
>>> so you get conflicting views of the file contents between different
>>> clients if you let a mandatory lock not be mandatory.
>>>
>>>> In general the idea of making a mandatory lock opt-in makes me nervous.
>>>> I'd prefer something like a mount option, so that we know that everyone
>>>> on that one filesystem is playing by the same rules, but we can still
>>>> mount filesystems like / without the option.
>>>
>>> +1
>>>
>>>> But I'll admit I'm definitely not an expert on Windows locking and may
>>>> be missing something about how these locks are meant to work.
>>>
>>> Mandatory locks really are mandatory in Windows.
>>> That may not be nice to a Unix system but what can you do ?
>>
>> I wonder if we could repurpose the existing -omand mount option?
>>
>> That would be a problem for anyone that wants to allow mandatory fcntl
>> locks without allowing share locks. I doubt anyone sane actually uses
>> mandatory fcntl locks, but still I suppose it would probably be better
>> to play it safe and use a new mount option.
>
>
> Maybe we should have a -o win_semantics option :-)
>
> /me runs
>

(CC'ing Al Viro, since these patches should go through his tree)

I don't mind to introduce a new mount option for turning this feature
on/off - something like '-o denylock' to make it mathing names of new
flags would be ok.

Al, what do you think about this feature overall?

--
Best regards,
Pavel Shilovsky.

2013-03-11 18:50:36

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 3/7] CIFS: Add O_DENY* open flags support

On Thu, 28 Feb 2013 19:25:29 +0400
Pavel Shilovsky <[email protected]> wrote:

> Make CIFSSMBOpen take share_flags as a parm that allows us
> to pass new O_DENY* flags to the server.
>
> Signed-off-by: Pavel Shilovsky <[email protected]>
> ---
> fs/cifs/cifsacl.c | 6 ++++--
> fs/cifs/cifsglob.h | 12 +++++++++++-
> fs/cifs/cifsproto.h | 9 +++++----
> fs/cifs/cifssmb.c | 47 +++++++++++++++++++++++++----------------------
> fs/cifs/dir.c | 13 ++++++++-----
> fs/cifs/file.c | 12 ++++++++----
> fs/cifs/inode.c | 11 ++++++-----
> fs/cifs/link.c | 10 +++++-----
> fs/cifs/readdir.c | 2 +-
> fs/cifs/smb1ops.c | 15 ++++++++-------
> fs/cifs/smb2file.c | 10 +++++-----
> fs/cifs/smb2inode.c | 4 ++--
> fs/cifs/smb2ops.c | 10 ++++++----
> fs/cifs/smb2pdu.c | 6 +++---
> fs/cifs/smb2proto.h | 14 ++++++++------
> 15 files changed, 105 insertions(+), 76 deletions(-)
>
> diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
> index 5cbd00e..222b989 100644
> --- a/fs/cifs/cifsacl.c
> +++ b/fs/cifs/cifsacl.c
> @@ -891,7 +891,8 @@ static struct cifs_ntsd *get_cifs_acl_by_path(struct cifs_sb_info *cifs_sb,
> create_options |= CREATE_OPEN_BACKUP_INTENT;
>
> rc = CIFSSMBOpen(xid, tcon, path, FILE_OPEN, READ_CONTROL,
> - create_options, &fid, &oplock, NULL, cifs_sb->local_nls,
> + FILE_SHARE_ALL, create_options, &fid, &oplock,
> + NULL, cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (!rc) {
> rc = CIFSSMBGetCIFSACL(xid, tcon, fid, &pntsd, pacllen);
> @@ -952,7 +953,8 @@ int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen,
> access_flags = WRITE_DAC;
>
> rc = CIFSSMBOpen(xid, tcon, path, FILE_OPEN, access_flags,
> - create_options, &fid, &oplock, NULL, cifs_sb->local_nls,
> + FILE_SHARE_ALL, create_options, &fid, &oplock,
> + NULL, cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (rc) {
> cERROR(1, "Unable to open file to set ACL");
> diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
> index e6899ce..f1d62ed 100644
> --- a/fs/cifs/cifsglob.h
> +++ b/fs/cifs/cifsglob.h
> @@ -310,7 +310,7 @@ struct smb_version_operations {
> struct cifs_sb_info *);
> /* open a file for non-posix mounts */
> int (*open)(const unsigned int, struct cifs_tcon *, const char *, int,
> - int, int, struct cifs_fid *, __u32 *, FILE_ALL_INFO *,
> + int, int, int, struct cifs_fid *, __u32 *, FILE_ALL_INFO *,
> struct cifs_sb_info *);
> /* set fid protocol-specific info */
> void (*set_fid)(struct cifsFileInfo *, struct cifs_fid *, __u32);
> @@ -947,6 +947,16 @@ struct cifsFileInfo {
> struct work_struct oplock_break; /* work for oplock breaks */
> };
>
> +#define CIFS_DENY_RW_FLAGS_SHIFT 22
> +#define CIFS_DENY_DEL_FLAG_SHIFT 23
> +
> +static inline int cifs_get_share_flags(unsigned int flags)
> +{
> + return (flags & O_DENYMAND) ?
> + (((~(flags >> CIFS_DENY_RW_FLAGS_SHIFT)) & 3) |
> + ((~(flags >> CIFS_DENY_DEL_FLAG_SHIFT)) & 4)) : FILE_SHARE_ALL;
> +}
> +
> struct cifs_io_parms {
> __u16 netfid;
> #ifdef CONFIG_CIFS_SMB2
> diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
> index 1988c1b..9661607 100644
> --- a/fs/cifs/cifsproto.h
> +++ b/fs/cifs/cifsproto.h
> @@ -361,10 +361,11 @@ extern int CIFSSMBQueryReparseLinkInfo(const unsigned int xid,
> const struct nls_table *nls_codepage);
> #endif /* temporarily unused until cifs_symlink fixed */
> extern int CIFSSMBOpen(const unsigned int xid, struct cifs_tcon *tcon,
> - const char *fileName, const int disposition,
> - const int access_flags, const int omode,
> - __u16 *netfid, int *pOplock, FILE_ALL_INFO *,
> - const struct nls_table *nls_codepage, int remap);
> + const char *file_name, const int disposition,
> + const int access_flags, const int share_flags,
> + const int omode, __u16 *netfid, int *oplock,
> + FILE_ALL_INFO *, const struct nls_table *nls_codepage,
> + int remap);

Yuck...maybe it's time for a struct cifs_openargs? At the very least,
while you're in here, it would be good to just pass in a cifs_sb
instead of a nls_table and a remap flag.

> extern int SMBLegacyOpen(const unsigned int xid, struct cifs_tcon *tcon,
> const char *fileName, const int disposition,
> const int access_flags, const int omode,
> diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
> index 76d0d29..9c4632f 100644
> --- a/fs/cifs/cifssmb.c
> +++ b/fs/cifs/cifssmb.c
> @@ -1289,10 +1289,11 @@ OldOpenRetry:
>
> int
> CIFSSMBOpen(const unsigned int xid, struct cifs_tcon *tcon,
> - const char *fileName, const int openDisposition,
> - const int access_flags, const int create_options, __u16 *netfid,
> - int *pOplock, FILE_ALL_INFO *pfile_info,
> - const struct nls_table *nls_codepage, int remap)
> + const char *file_name, const int disposition,
> + const int access_flags, const int share_flags,
> + const int create_options, __u16 *netfid, int *oplock,
> + FILE_ALL_INFO *file_info, const struct nls_table *nls_codepage,
> + int remap)
> {
> int rc = -EACCES;
> OPEN_REQ *pSMB = NULL;
> @@ -1313,26 +1314,28 @@ openRetry:
> count = 1; /* account for one byte pad to word boundary */
> name_len =
> cifsConvertToUTF16((__le16 *) (pSMB->fileName + 1),
> - fileName, PATH_MAX, nls_codepage, remap);
> + file_name, PATH_MAX, nls_codepage,
> + remap);
> name_len++; /* trailing null */
> name_len *= 2;
> pSMB->NameLength = cpu_to_le16(name_len);
> } else { /* BB improve check for buffer overruns BB */
> count = 0; /* no pad */
> - name_len = strnlen(fileName, PATH_MAX);
> + name_len = strnlen(file_name, PATH_MAX);
> name_len++; /* trailing null */
> pSMB->NameLength = cpu_to_le16(name_len);
> - strncpy(pSMB->fileName, fileName, name_len);
> + strncpy(pSMB->fileName, file_name, name_len);
> }
> - if (*pOplock & REQ_OPLOCK)
> + if (*oplock & REQ_OPLOCK)
> pSMB->OpenFlags = cpu_to_le32(REQ_OPLOCK);
> - else if (*pOplock & REQ_BATCHOPLOCK)
> + else if (*oplock & REQ_BATCHOPLOCK)
> pSMB->OpenFlags = cpu_to_le32(REQ_BATCHOPLOCK);
> pSMB->DesiredAccess = cpu_to_le32(access_flags);
> pSMB->AllocationSize = 0;
> - /* set file as system file if special file such
> - as fifo and server expecting SFU style and
> - no Unix extensions */
> + /*
> + * set file as system file if special file such as fifo and server
> + * expecting SFU style and no Unix extensions
> + */
> if (create_options & CREATE_OPTION_SPECIAL)
> pSMB->FileAttributes = cpu_to_le32(ATTR_SYSTEM);
> else
> @@ -1347,8 +1350,8 @@ openRetry:
> if (create_options & CREATE_OPTION_READONLY)
> pSMB->FileAttributes |= cpu_to_le32(ATTR_READONLY);
>
> - pSMB->ShareAccess = cpu_to_le32(FILE_SHARE_ALL);
> - pSMB->CreateDisposition = cpu_to_le32(openDisposition);
> + pSMB->ShareAccess = cpu_to_le32(share_flags);
> + pSMB->CreateDisposition = cpu_to_le32(disposition);
> pSMB->CreateOptions = cpu_to_le32(create_options & CREATE_OPTIONS_MASK);
> /* BB Expirement with various impersonation levels and verify */
> pSMB->ImpersonationLevel = cpu_to_le32(SECURITY_IMPERSONATION);
> @@ -1366,20 +1369,20 @@ openRetry:
> if (rc) {
> cFYI(1, "Error in Open = %d", rc);
> } else {
> - *pOplock = pSMBr->OplockLevel; /* 1 byte no need to le_to_cpu */
> + *oplock = pSMBr->OplockLevel; /* 1 byte no need to le_to_cpu */
> *netfid = pSMBr->Fid; /* cifs fid stays in le */
> /* Let caller know file was created so we can set the mode. */
> /* Do we care about the CreateAction in any other cases? */
> if (cpu_to_le32(FILE_CREATE) == pSMBr->CreateAction)
> - *pOplock |= CIFS_CREATE_ACTION;
> - if (pfile_info) {
> - memcpy((char *)pfile_info, (char *)&pSMBr->CreationTime,
> + *oplock |= CIFS_CREATE_ACTION;
> + if (file_info) {
> + memcpy((char *)file_info, (char *)&pSMBr->CreationTime,
> 36 /* CreationTime to Attributes */);
> /* the file_info buf is endian converted by caller */
> - pfile_info->AllocationSize = pSMBr->AllocationSize;
> - pfile_info->EndOfFile = pSMBr->EndOfFile;
> - pfile_info->NumberOfLinks = cpu_to_le32(1);
> - pfile_info->DeletePending = 0;
> + file_info->AllocationSize = pSMBr->AllocationSize;
> + file_info->EndOfFile = pSMBr->EndOfFile;
> + file_info->NumberOfLinks = cpu_to_le32(1);
> + file_info->DeletePending = 0;
> }
> }
>
> diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
> index 8719bbe..6975072 100644
> --- a/fs/cifs/dir.c
> +++ b/fs/cifs/dir.c
> @@ -203,6 +203,7 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
> FILE_ALL_INFO *buf = NULL;
> struct inode *newinode = NULL;
> int disposition;
> + int share_access;
> struct TCP_Server_Info *server = tcon->ses->server;
>
> *oplock = 0;
> @@ -293,6 +294,8 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
> else
> cFYI(1, "Create flag not set in create function");
>
> + share_access = cifs_get_share_flags(oflags);
> +
> /*
> * BB add processing to set equivalent of mode - e.g. via CreateX with
> * ACLs
> @@ -320,8 +323,8 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
> create_options |= CREATE_OPEN_BACKUP_INTENT;
>
> rc = server->ops->open(xid, tcon, full_path, disposition,
> - desired_access, create_options, fid, oplock,
> - buf, cifs_sb);
> + desired_access, share_access, create_options,
> + fid, oplock, buf, cifs_sb);
> if (rc) {
> cFYI(1, "cifs_create returned 0x%x", rc);
> goto out;
> @@ -626,9 +629,9 @@ int cifs_mknod(struct inode *inode, struct dentry *direntry, umode_t mode,
> if (backup_cred(cifs_sb))
> create_options |= CREATE_OPEN_BACKUP_INTENT;
>
> - rc = CIFSSMBOpen(xid, pTcon, full_path, FILE_CREATE,
> - GENERIC_WRITE, create_options,
> - &fileHandle, &oplock, buf, cifs_sb->local_nls,
> + rc = CIFSSMBOpen(xid, pTcon, full_path, FILE_CREATE, GENERIC_WRITE,
> + FILE_SHARE_ALL, create_options, &fileHandle, &oplock,
> + buf, cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (rc)
> goto mknod_out;
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 8ea6ca5..3ad484c 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -174,6 +174,7 @@ cifs_nt_open(char *full_path, struct inode *inode, struct cifs_sb_info *cifs_sb,
> {
> int rc;
> int desired_access;
> + int share_access;
> int disposition;
> int create_options = CREATE_NOT_DIR;
> FILE_ALL_INFO *buf;
> @@ -209,6 +210,7 @@ cifs_nt_open(char *full_path, struct inode *inode, struct cifs_sb_info *cifs_sb,
> *********************************************************************/
>
> disposition = cifs_get_disposition(f_flags);
> + share_access = cifs_get_share_flags(f_flags);
>
> /* BB pass O_SYNC flag through on file attributes .. BB */
>
> @@ -220,8 +222,8 @@ cifs_nt_open(char *full_path, struct inode *inode, struct cifs_sb_info *cifs_sb,
> create_options |= CREATE_OPEN_BACKUP_INTENT;
>
> rc = server->ops->open(xid, tcon, full_path, disposition,
> - desired_access, create_options, fid, oplock, buf,
> - cifs_sb);
> + desired_access, share_access, create_options,
> + fid, oplock, buf, cifs_sb);
>
> if (rc)
> goto out;
> @@ -579,6 +581,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
> struct inode *inode;
> char *full_path = NULL;
> int desired_access;
> + int share_access;
> int disposition = FILE_OPEN;
> int create_options = CREATE_NOT_DIR;
> struct cifs_fid fid;
> @@ -643,6 +646,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
> }
>
> desired_access = cifs_convert_flags(cfile->f_flags);
> + share_access = cifs_get_share_flags(cfile->f_flags);
>
> if (backup_cred(cifs_sb))
> create_options |= CREATE_OPEN_BACKUP_INTENT;
> @@ -658,8 +662,8 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
> * not dirty locally we could do this.
> */
> rc = server->ops->open(xid, tcon, full_path, disposition,
> - desired_access, create_options, &fid, &oplock,
> - NULL, cifs_sb);
> + desired_access, share_access, create_options,
> + &fid, &oplock, NULL, cifs_sb);
> if (rc) {
> mutex_unlock(&cfile->fh_mutex);
> cFYI(1, "cifs_reopen returned 0x%x", rc);
> diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
> index ed6208f..a497dfa 100644
> --- a/fs/cifs/inode.c
> +++ b/fs/cifs/inode.c
> @@ -396,8 +396,8 @@ cifs_sfu_type(struct cifs_fattr *fattr, const unsigned char *path,
> tcon = tlink_tcon(tlink);
>
> rc = CIFSSMBOpen(xid, tcon, path, FILE_OPEN, GENERIC_READ,
> - CREATE_NOT_DIR, &netfid, &oplock, NULL,
> - cifs_sb->local_nls,
> + FILE_SHARE_ALL, CREATE_NOT_DIR, &netfid, &oplock,
> + NULL, cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags &
> CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (rc == 0) {
> @@ -987,8 +987,9 @@ cifs_rename_pending_delete(const char *full_path, struct dentry *dentry,
> tcon = tlink_tcon(tlink);
>
> rc = CIFSSMBOpen(xid, tcon, full_path, FILE_OPEN,
> - DELETE|FILE_WRITE_ATTRIBUTES, CREATE_NOT_DIR,
> - &netfid, &oplock, NULL, cifs_sb->local_nls,
> + DELETE|FILE_WRITE_ATTRIBUTES, FILE_SHARE_ALL,
> + CREATE_NOT_DIR, &netfid, &oplock, NULL,
> + cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (rc != 0)
> goto out;
> @@ -1509,7 +1510,7 @@ cifs_do_rename(const unsigned int xid, struct dentry *from_dentry,
>
> /* open the file to be renamed -- we need DELETE perms */
> rc = CIFSSMBOpen(xid, tcon, from_path, FILE_OPEN, DELETE,
> - CREATE_NOT_DIR, &srcfid, &oplock, NULL,
> + FILE_SHARE_ALL, CREATE_NOT_DIR, &srcfid, &oplock, NULL,
> cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
> CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (rc == 0) {
> diff --git a/fs/cifs/link.c b/fs/cifs/link.c
> index 51dc2fb..9b4f0db 100644
> --- a/fs/cifs/link.c
> +++ b/fs/cifs/link.c
> @@ -212,7 +212,7 @@ CIFSCreateMFSymLink(const unsigned int xid, struct cifs_tcon *tcon,
> create_options |= CREATE_OPEN_BACKUP_INTENT;
>
> rc = CIFSSMBOpen(xid, tcon, fromName, FILE_CREATE, GENERIC_WRITE,
> - create_options, &netfid, &oplock, NULL,
> + FILE_SHARE_ALL, create_options, &netfid, &oplock, NULL,
> nls_codepage, remap);
> if (rc != 0) {
> kfree(buf);
> @@ -254,8 +254,8 @@ CIFSQueryMFSymLink(const unsigned int xid, struct cifs_tcon *tcon,
> FILE_ALL_INFO file_info;
>
> rc = CIFSSMBOpen(xid, tcon, searchName, FILE_OPEN, GENERIC_READ,
> - CREATE_NOT_DIR, &netfid, &oplock, &file_info,
> - nls_codepage, remap);
> + FILE_SHARE_ALL, CREATE_NOT_DIR, &netfid, &oplock,
> + &file_info, nls_codepage, remap);
> if (rc != 0)
> return rc;
>
> @@ -332,8 +332,8 @@ CIFSCheckMFSymlink(struct cifs_fattr *fattr,
> pTcon = tlink_tcon(tlink);
>
> rc = CIFSSMBOpen(xid, pTcon, path, FILE_OPEN, GENERIC_READ,
> - CREATE_NOT_DIR, &netfid, &oplock, &file_info,
> - cifs_sb->local_nls,
> + FILE_SHARE_ALL, CREATE_NOT_DIR, &netfid, &oplock,
> + &file_info, cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags &
> CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (rc != 0)
> diff --git a/fs/cifs/readdir.c b/fs/cifs/readdir.c
> index cdd6ff4..726b52e 100644
> --- a/fs/cifs/readdir.c
> +++ b/fs/cifs/readdir.c
> @@ -224,7 +224,7 @@ int get_symlink_reparse_path(char *full_path, struct cifs_sb_info *cifs_sb,
> char *tmpbuffer;
>
> rc = CIFSSMBOpen(xid, ptcon, full_path, FILE_OPEN, GENERIC_READ,
> - OPEN_REPARSE_POINT, &fid, &oplock, NULL,
> + FILE_SHARE_ALL, OPEN_REPARSE_POINT, &fid, &oplock, NULL,
> cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
> if (!rc) {
> diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
> index 47bc5a8..d782209 100644
> --- a/fs/cifs/smb1ops.c
> +++ b/fs/cifs/smb1ops.c
> @@ -671,9 +671,9 @@ cifs_mkdir_setinfo(struct inode *inode, const char *full_path,
>
> static int
> cifs_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
> - int disposition, int desired_access, int create_options,
> - struct cifs_fid *fid, __u32 *oplock, FILE_ALL_INFO *buf,
> - struct cifs_sb_info *cifs_sb)
> + int disposition, int desired_access, int share_access,
> + int create_options, struct cifs_fid *fid, __u32 *oplock,
> + FILE_ALL_INFO *buf, struct cifs_sb_info *cifs_sb)
> {
> if (!(tcon->ses->capabilities & CAP_NT_SMBS))
> return SMBLegacyOpen(xid, tcon, path, disposition,
> @@ -682,8 +682,8 @@ cifs_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
> cifs_sb->local_nls, cifs_sb->mnt_cifs_flags
> & CIFS_MOUNT_MAP_SPECIAL_CHR);
> return CIFSSMBOpen(xid, tcon, path, disposition, desired_access,
> - create_options, &fid->netfid, oplock, buf,
> - cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
> + share_access, create_options, &fid->netfid, oplock,
> + buf, cifs_sb->local_nls, cifs_sb->mnt_cifs_flags &
> CIFS_MOUNT_MAP_SPECIAL_CHR);
> }
>
> @@ -779,8 +779,9 @@ smb_set_file_info(struct inode *inode, const char *full_path,
> cFYI(1, "calling SetFileInfo since SetPathInfo for times not supported "
> "by this server");
> rc = CIFSSMBOpen(xid, tcon, full_path, FILE_OPEN,
> - SYNCHRONIZE | FILE_WRITE_ATTRIBUTES, CREATE_NOT_DIR,
> - &netfid, &oplock, NULL, cifs_sb->local_nls,
> + SYNCHRONIZE | FILE_WRITE_ATTRIBUTES, FILE_SHARE_ALL,
> + CREATE_NOT_DIR, &netfid, &oplock, NULL,
> + cifs_sb->local_nls,
> cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
>
> if (rc != 0) {
> diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c
> index 71e6aed..7dfb50c 100644
> --- a/fs/cifs/smb2file.c
> +++ b/fs/cifs/smb2file.c
> @@ -58,9 +58,9 @@ smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock)
>
> int
> smb2_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
> - int disposition, int desired_access, int create_options,
> - struct cifs_fid *fid, __u32 *oplock, FILE_ALL_INFO *buf,
> - struct cifs_sb_info *cifs_sb)
> + int disposition, int desired_access, int share_access,
> + int create_options, struct cifs_fid *fid, __u32 *oplock,
> + FILE_ALL_INFO *buf, struct cifs_sb_info *cifs_sb)
> {
> int rc;
> __le16 *smb2_path;
> @@ -87,8 +87,8 @@ smb2_open_file(const unsigned int xid, struct cifs_tcon *tcon, const char *path,
> memcpy(smb2_oplock + 1, fid->lease_key, SMB2_LEASE_KEY_SIZE);
>
> rc = SMB2_open(xid, tcon, smb2_path, &fid->persistent_fid,
> - &fid->volatile_fid, desired_access, disposition,
> - 0, 0, smb2_oplock, smb2_data);
> + &fid->volatile_fid, desired_access, share_access,
> + disposition, 0, 0, smb2_oplock, smb2_data);
> if (rc)
> goto out;
>
> diff --git a/fs/cifs/smb2inode.c b/fs/cifs/smb2inode.c
> index 7064824..6af174a 100644
> --- a/fs/cifs/smb2inode.c
> +++ b/fs/cifs/smb2inode.c
> @@ -54,8 +54,8 @@ smb2_open_op_close(const unsigned int xid, struct cifs_tcon *tcon,
> return -ENOMEM;
>
> rc = SMB2_open(xid, tcon, utf16_path, &persistent_fid, &volatile_fid,
> - desired_access, create_disposition, file_attributes,
> - create_options, &oplock, NULL);
> + desired_access, FILE_SHARE_ALL, create_disposition,
> + file_attributes, create_options, &oplock, NULL);
> if (rc) {
> kfree(utf16_path);
> return rc;
> diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
> index c9c7aa7..dc38434 100644
> --- a/fs/cifs/smb2ops.c
> +++ b/fs/cifs/smb2ops.c
> @@ -222,7 +222,8 @@ smb2_is_path_accessible(const unsigned int xid, struct cifs_tcon *tcon,
> return -ENOMEM;
>
> rc = SMB2_open(xid, tcon, utf16_path, &persistent_fid, &volatile_fid,
> - FILE_READ_ATTRIBUTES, FILE_OPEN, 0, 0, &oplock, NULL);
> + FILE_READ_ATTRIBUTES, FILE_SHARE_ALL, FILE_OPEN, 0, 0,
> + &oplock, NULL);
> if (rc) {
> kfree(utf16_path);
> return rc;
> @@ -432,8 +433,8 @@ smb2_query_dir_first(const unsigned int xid, struct cifs_tcon *tcon,
> return -ENOMEM;
>
> rc = SMB2_open(xid, tcon, utf16_path, &persistent_fid, &volatile_fid,
> - FILE_READ_ATTRIBUTES | FILE_READ_DATA, FILE_OPEN, 0, 0,
> - &oplock, NULL);
> + FILE_READ_ATTRIBUTES | FILE_READ_DATA, FILE_SHARE_ALL,
> + FILE_OPEN, 0, 0, &oplock, NULL);
> kfree(utf16_path);
> if (rc) {
> cERROR(1, "open dir failed");
> @@ -515,7 +516,8 @@ smb2_queryfs(const unsigned int xid, struct cifs_tcon *tcon,
> u8 oplock = SMB2_OPLOCK_LEVEL_NONE;
>
> rc = SMB2_open(xid, tcon, &srch_path, &persistent_fid, &volatile_fid,
> - FILE_READ_ATTRIBUTES, FILE_OPEN, 0, 0, &oplock, NULL);
> + FILE_READ_ATTRIBUTES, FILE_SHARE_ALL, FILE_OPEN, 0, 0,
> + &oplock, NULL);
> if (rc)
> return rc;
> buf->f_type = SMB2_MAGIC_NUMBER;
> diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
> index 41d9d07..45fa2dc 100644
> --- a/fs/cifs/smb2pdu.c
> +++ b/fs/cifs/smb2pdu.c
> @@ -910,8 +910,8 @@ parse_lease_state(struct smb2_create_rsp *rsp)
> int
> SMB2_open(const unsigned int xid, struct cifs_tcon *tcon, __le16 *path,
> u64 *persistent_fid, u64 *volatile_fid, __u32 desired_access,
> - __u32 create_disposition, __u32 file_attributes, __u32 create_options,
> - __u8 *oplock, struct smb2_file_all_info *buf)
> + __u32 share_access, __u32 create_disposition, __u32 file_attributes,
> + __u32 create_options, __u8 *oplock, struct smb2_file_all_info *buf)
> {
> struct smb2_create_req *req;
> struct smb2_create_rsp *rsp;
> @@ -940,7 +940,7 @@ SMB2_open(const unsigned int xid, struct cifs_tcon *tcon, __le16 *path,
> req->DesiredAccess = cpu_to_le32(desired_access);
> /* File attributes ignored on open (used in create though) */
> req->FileAttributes = cpu_to_le32(file_attributes);
> - req->ShareAccess = FILE_SHARE_ALL_LE;
> + req->ShareAccess = cpu_to_le32(share_access);
> req->CreateDisposition = cpu_to_le32(create_disposition);
> req->CreateOptions = cpu_to_le32(create_options);
> uni_path_len = (2 * UniStrnlen((wchar_t *)path, PATH_MAX)) + 2;
> diff --git a/fs/cifs/smb2proto.h b/fs/cifs/smb2proto.h
> index 2aa3535..edff8f6 100644
> --- a/fs/cifs/smb2proto.h
> +++ b/fs/cifs/smb2proto.h
> @@ -86,9 +86,10 @@ extern int smb2_create_hardlink(const unsigned int xid, struct cifs_tcon *tcon,
>
> extern int smb2_open_file(const unsigned int xid, struct cifs_tcon *tcon,
> const char *full_path, int disposition,
> - int desired_access, int create_options,
> - struct cifs_fid *fid, __u32 *oplock,
> - FILE_ALL_INFO *buf, struct cifs_sb_info *cifs_sb);
> + int desired_access, int share_access,
> + int create_options, struct cifs_fid *fid,
> + __u32 *oplock, FILE_ALL_INFO *buf,
> + struct cifs_sb_info *cifs_sb);
> extern void smb2_set_oplock_level(struct cifsInodeInfo *cinode, __u32 oplock);
> extern int smb2_unlock_range(struct cifsFileInfo *cfile,
> struct file_lock *flock, const unsigned int xid);
> @@ -108,9 +109,10 @@ extern int SMB2_tcon(const unsigned int xid, struct cifs_ses *ses,
> extern int SMB2_tdis(const unsigned int xid, struct cifs_tcon *tcon);
> extern int SMB2_open(const unsigned int xid, struct cifs_tcon *tcon,
> __le16 *path, u64 *persistent_fid, u64 *volatile_fid,
> - __u32 desired_access, __u32 create_disposition,
> - __u32 file_attributes, __u32 create_options,
> - __u8 *oplock, struct smb2_file_all_info *buf);
> + __u32 desired_access, __u32 share_access,
> + __u32 create_disposition, __u32 file_attributes,
> + __u32 create_options, __u8 *oplock,
> + struct smb2_file_all_info *buf);
> extern int SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
> u64 persistent_file_id, u64 volatile_file_id);
> extern int SMB2_flush(const unsigned int xid, struct cifs_tcon *tcon,


--
Jeff Layton <[email protected]>

2013-03-11 19:36:40

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

On Mon, Mar 11, 2013 at 03:05:40PM -0400, Jeff Layton wrote:
> knfsd has some code already to handle share reservations internally.
> Nothing outside of knfsd is aware of these reservations, of course so
> moving to a vfs-level object for it would be a marked improvement.
>
> It doesn't look like this patch removes any of that old code though. I
> think it probably should, or there ought to be some consideration of
> how this new stuff will mesh with it.
>
> I think you have 2 choices here:
>
> 1/ rip out the old share reservation code altogether and require that
> filesystems mount with -o sharemand or whatever if they want to allow
> their enforcement
>
> 2/ make knfsd fall back to using the internal share reservation code
> when the mount option isn't enabled
>
> Personally, I think #1 would be fine, but Bruce may want to weigh in on
> what he'd prefer.

#1 sounds good. Clients that use deny bits are few. My preference
would be to return an error to such clients in the case share locks
aren't available.

(We're a little out of spec there, so I'm not sure which error. I think
the goal is to notify a human there's a problem with minimal collateral
damange.

NFS4ERR_SERVERFAULT ("I'm a buggy server, sorry about that!") would
probably result in an IO error to the application.

SHARE_DENIED strikes me as unsafe: an application would be in its rights
not to even check for that e.g. in the case of an exclusive create.

Maybe DELAY? Kind of ridiculous, but blocking the application
indefinitely would probably get someone's attention quickly enough
without doing any damnage.)

--b.

2013-03-11 20:08:52

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

On Mon, 11 Mar 2013 15:36:38 -0400
"J. Bruce Fields" <[email protected]> wrote:

> On Mon, Mar 11, 2013 at 03:05:40PM -0400, Jeff Layton wrote:
> > knfsd has some code already to handle share reservations internally.
> > Nothing outside of knfsd is aware of these reservations, of course so
> > moving to a vfs-level object for it would be a marked improvement.
> >
> > It doesn't look like this patch removes any of that old code though. I
> > think it probably should, or there ought to be some consideration of
> > how this new stuff will mesh with it.
> >
> > I think you have 2 choices here:
> >
> > 1/ rip out the old share reservation code altogether and require that
> > filesystems mount with -o sharemand or whatever if they want to allow
> > their enforcement
> >
> > 2/ make knfsd fall back to using the internal share reservation code
> > when the mount option isn't enabled
> >
> > Personally, I think #1 would be fine, but Bruce may want to weigh in on
> > what he'd prefer.
>
> #1 sounds good. Clients that use deny bits are few. My preference
> would be to return an error to such clients in the case share locks
> aren't available.
>
> (We're a little out of spec there, so I'm not sure which error. I think
> the goal is to notify a human there's a problem with minimal collateral
> damange.
>
> NFS4ERR_SERVERFAULT ("I'm a buggy server, sorry about that!") would
> probably result in an IO error to the application.
>
> SHARE_DENIED strikes me as unsafe: an application would be in its rights
> not to even check for that e.g. in the case of an exclusive create.
>
> Maybe DELAY? Kind of ridiculous, but blocking the application
> indefinitely would probably get someone's attention quickly enough
> without doing any damnage.)
>

I agree that we should return an error, but hadn't considered what
error. Given that hardly any NFS clients use them, I'd probably just go
with NFS4ERR_SERVERFAULT, and maybe throw a printk or something on the
server about enabling share reservations for superblock x:y.

Pavel, as a side note, you may want to consider adding a patch to hook
this stuff up in the NFS client as well.

--
Jeff Layton <[email protected]>

2013-03-05 19:07:51

by Simo

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On 03/05/2013 01:13 PM, J. Bruce Fields wrote:
> On Mon, Mar 04, 2013 at 05:49:46PM -0500, Simo wrote:
>> On 03/04/2013 04:19 PM, J. Bruce Fields wrote:
>>> On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
>>>> [possible resend -- sorry]
>>>>
>>>> On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
>>>>> This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.
>>>>>
>>>>> These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.
>>>>>
>>>>> According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.
>>>>>
>>>>> So, we have four new flags:
>>>>> O_DENYREAD - to prevent other opens with read access,
>>>>> O_DENYWRITE - to prevent other opens with write access,
>>>>> O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
>>>>> O_DENYMAND - to switch on/off three flags above.
>>>> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
>>>> better?
>>>>
>>>> Other than that, this seems like a sensible mechanism.
>>> I'm a little more worried: these are mandatory locks, and applications
>>> that use them are used to the locks being enforced correctly. Are we
>>> sure that an application that opens a file O_DENYWRITE won't crash if it
>>> sees the file data change while it holds the open?
>> The redirector may simply assume it has full control of that part of
>> the file and not read nor send data until the lock is released too,
>> so you get conflicting views of the file contents between different
>> clients if you let a mandatory lock not be mandatory.
>>
>>> In general the idea of making a mandatory lock opt-in makes me nervous.
>>> I'd prefer something like a mount option, so that we know that everyone
>>> on that one filesystem is playing by the same rules, but we can still
>>> mount filesystems like / without the option.
>> +1
>>
>>> But I'll admit I'm definitely not an expert on Windows locking and may
>>> be missing something about how these locks are meant to work.
>> Mandatory locks really are mandatory in Windows.
>> That may not be nice to a Unix system but what can you do ?
> I wonder if we could repurpose the existing -omand mount option?
>
> That would be a problem for anyone that wants to allow mandatory fcntl
> locks without allowing share locks. I doubt anyone sane actually uses
> mandatory fcntl locks, but still I suppose it would probably be better
> to play it safe and use a new mount option.

Maybe we should have a -o win_semantics option :-)

/me runs

Simo.

2013-03-11 19:10:54

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] vfs: Add O_DENYREAD/WRITE flags support for open syscall

On Mon, 11 Mar 2013 22:57:27 +0400
Pavel Shilovsky <[email protected]> wrote:

> 2013/3/11 Jeff Layton <[email protected]>:
> > On Thu, 28 Feb 2013 19:25:28 +0400
> > Pavel Shilovsky <[email protected]> wrote:
> >
> >> If O_DENYMAND flag is specified, O_DENYREAD/WRITE/MAND flags are
> >> translated to flock's flags:
> >>
> >> !O_DENYREAD -> LOCK_READ
> >> !O_DENYWRITE -> LOCK_WRITE
> >> O_DENYMAND -> LOCK_MAND
> >>
> >> and set through flock_lock_file on a file.
> >>
> >> This change affects opens that use O_DENYMAND flag - all other
> >> native Linux opens don't care about these flags. It allow us to
> >> enable this feature for applications that need it (e.g. NFS and
> >> Samba servers that export the same directory for Windows clients,
> >> or Wine applications that access the same files simultaneously).
> >>
> >> Create codepath is slightly changed to prevent data races on
> >> newely created files: when open with O_CREAT can return with -ETXTBSY
> >> error for successfully created files due to a deny lock set by
> >> another task.
> >>
> >> Signed-off-by: Pavel Shilovsky <[email protected]>
> >> ---
> >> fs/locks.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++------
> >> fs/namei.c | 44 ++++++++++++++++++--
> >> include/linux/fs.h | 6 +++
> >> 3 files changed, 151 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/fs/locks.c b/fs/locks.c
> >> index a94e331..0cc7d1b 100644
> >> --- a/fs/locks.c
> >> +++ b/fs/locks.c
> >> @@ -605,20 +605,81 @@ static int posix_locks_conflict(struct file_lock *caller_fl, struct file_lock *s
> >> return (locks_conflict(caller_fl, sys_fl));
> >> }
> >>
> >> -/* Determine if lock sys_fl blocks lock caller_fl. FLOCK specific
> >> - * checking before calling the locks_conflict().
> >> +static unsigned int
> >> +deny_flags_to_cmd(unsigned int flags)
> >> +{
> >> + unsigned int cmd = LOCK_MAND;
> >> +
> >> + if (!(flags & O_DENYREAD))
> >> + cmd |= LOCK_READ;
> >> + if (!(flags & O_DENYWRITE))
> >> + cmd |= LOCK_WRITE;
> >> +
> >> + return cmd;
> >> +}
> >> +
> >> +/*
> >> + * locks_mand_conflict - Determine if there's a share reservation conflict
> >> + * @caller_fl: lock we're attempting to acquire
> >> + * @sys_fl: lock already present on system that we're checking against
> >> + *
> >> + * Check to see if there's a share_reservation conflict. LOCK_READ/LOCK_WRITE
> >> + * tell us whether the reservation allows other readers and writers.
> >> + *
> >> + * We only check against other LOCK_MAND locks, so applications that want to
> >> + * use share mode locking will only conflict against one another. "normal"
> >> + * applications that open files won't be affected by and won't themselves
> >> + * affect the share reservations.
> >> */
> >> -static int flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
> >> +static int
> >> +locks_mand_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
> >> {
> >> - /* FLOCK locks referring to the same filp do not conflict with
> >> + unsigned char caller_type = caller_fl->fl_type;
> >> + unsigned char sys_type = sys_fl->fl_type;
> >> + fmode_t caller_fmode = caller_fl->fl_file->f_mode;
> >> + fmode_t sys_fmode = sys_fl->fl_file->f_mode;
> >> +
> >> + /* they can only conflict if they're both LOCK_MAND */
> >> + if (!(caller_type & LOCK_MAND) || !(sys_type & LOCK_MAND))
> >> + return 0;
> >> +
> >> + if (!(caller_type & LOCK_READ) && (sys_fmode & FMODE_READ))
> >> + return 1;
> >> + if (!(caller_type & LOCK_WRITE) && (sys_fmode & FMODE_WRITE))
> >> + return 1;
> >> + if (!(sys_type & LOCK_READ) && (caller_fmode & FMODE_READ))
> >> + return 1;
> >> + if (!(sys_type & LOCK_WRITE) && (caller_fmode & FMODE_WRITE))
> >> + return 1;
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +/*
> >> + * Determine if lock sys_fl blocks lock caller_fl. FLOCK specific checking
> >> + * before calling the locks_conflict(). Boolean is_mand indicates whether
> >> + * we should use a share reservation scheme or not.
> >> + */
> >> +static int
> >> +flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl,
> >> + bool is_mand)
> >
> > I'm not sure you really need to add this new is_mand bool. Won't that
> > be equivalent to (caller_fl->fl_type & LOCK_MAND)?
>
> This function is already used by flock system call that can pass
> LOCK_MAND flag to caller_fl->fl_type. I don't want to affect existing
> flock behavior by introduing new denylocking strategy - so, we need to
> let flock_locks_conflict function know if we are in flock or open
> codepath - in open codepath it will call locks_mand_conflict to check
> if there is any other open that prevents us.
>

Right, but if you move to a mount option for this, then enforcing these
locks in the flock() codepath should be ok. It seems reasonable that
anyone who wants enforcement of O_DENY* would want to enforce LOCK_MAND
locks as well.

--
Jeff Layton <[email protected]>

2013-03-11 18:59:17

by Pavel Shilovsky

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] CIFS: Translate SHARING_VIOLATION to -ETXTBSY error code for SMB2

2013/3/11 Jeff Layton <[email protected]>:
> On Thu, 28 Feb 2013 19:25:31 +0400
> Pavel Shilovsky <[email protected]> wrote:
>
>> to make it match CIFS and VFS variants.
>>
>> Signed-off-by: Pavel Shilovsky <[email protected]>
>> ---
>> fs/cifs/smb2maperror.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c
>> index 494c912..11e589e 100644
>> --- a/fs/cifs/smb2maperror.c
>> +++ b/fs/cifs/smb2maperror.c
>> @@ -356,7 +356,7 @@ static const struct status_to_posix_error smb2_error_map_table[] = {
>> {STATUS_PORT_CONNECTION_REFUSED, -ECONNREFUSED,
>> "STATUS_PORT_CONNECTION_REFUSED"},
>> {STATUS_INVALID_PORT_HANDLE, -EIO, "STATUS_INVALID_PORT_HANDLE"},
>> - {STATUS_SHARING_VIOLATION, -EBUSY, "STATUS_SHARING_VIOLATION"},
>> + {STATUS_SHARING_VIOLATION, -ETXTBSY, "STATUS_SHARING_VIOLATION"},
>> {STATUS_QUOTA_EXCEEDED, -EDQUOT, "STATUS_QUOTA_EXCEEDED"},
>> {STATUS_INVALID_PAGE_PROTECTION, -EIO,
>> "STATUS_INVALID_PAGE_PROTECTION"},
>
> Actually, I think Sachin is converting the CIFS
> STATUS_SHARING_VIOLATION to translate to EBUSY, since that seems to
> better reflect the situation. I'd suggest dropping this patch, unless
> you have a specific need for this error return here.

Yes, I am ok to drop this patch - accoring to the previous discussion
in linux-cifs@ it is no suitable.

--
Best regards,
Pavel Shilovsky.

2013-03-04 22:57:01

by Simo

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On 03/04/2013 04:19 PM, J. Bruce Fields wrote:
> On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
>> [possible resend -- sorry]
>>
>> On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
>>> This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.
>>>
>>> These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.
>>>
>>> According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.
>>>
>>> So, we have four new flags:
>>> O_DENYREAD - to prevent other opens with read access,
>>> O_DENYWRITE - to prevent other opens with write access,
>>> O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
>>> O_DENYMAND - to switch on/off three flags above.
>> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
>> better?
>>
>> Other than that, this seems like a sensible mechanism.
> I'm a little more worried: these are mandatory locks, and applications
> that use them are used to the locks being enforced correctly. Are we
> sure that an application that opens a file O_DENYWRITE won't crash if it
> sees the file data change while it holds the open?

The redirector may simply assume it has full control of that part of the
file and not read nor send data until the lock is released too, so you
get conflicting views of the file contents between different clients if
you let a mandatory lock not be mandatory.

> In general the idea of making a mandatory lock opt-in makes me nervous.
> I'd prefer something like a mount option, so that we know that everyone
> on that one filesystem is playing by the same rules, but we can still
> mount filesystems like / without the option.

+1

> But I'll admit I'm definitely not an expert on Windows locking and may
> be missing something about how these locks are meant to work.

Mandatory locks really are mandatory in Windows.
That may not be nice to a Unix system but what can you do ?

Simo.

2013-03-11 18:35:12

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 5/7] CIFS: Translate SHARING_VIOLATION to -ETXTBSY error code for SMB2

On Thu, 28 Feb 2013 19:25:31 +0400
Pavel Shilovsky <[email protected]> wrote:

> to make it match CIFS and VFS variants.
>
> Signed-off-by: Pavel Shilovsky <[email protected]>
> ---
> fs/cifs/smb2maperror.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/cifs/smb2maperror.c b/fs/cifs/smb2maperror.c
> index 494c912..11e589e 100644
> --- a/fs/cifs/smb2maperror.c
> +++ b/fs/cifs/smb2maperror.c
> @@ -356,7 +356,7 @@ static const struct status_to_posix_error smb2_error_map_table[] = {
> {STATUS_PORT_CONNECTION_REFUSED, -ECONNREFUSED,
> "STATUS_PORT_CONNECTION_REFUSED"},
> {STATUS_INVALID_PORT_HANDLE, -EIO, "STATUS_INVALID_PORT_HANDLE"},
> - {STATUS_SHARING_VIOLATION, -EBUSY, "STATUS_SHARING_VIOLATION"},
> + {STATUS_SHARING_VIOLATION, -ETXTBSY, "STATUS_SHARING_VIOLATION"},
> {STATUS_QUOTA_EXCEEDED, -EDQUOT, "STATUS_QUOTA_EXCEEDED"},
> {STATUS_INVALID_PAGE_PROTECTION, -EIO,
> "STATUS_INVALID_PAGE_PROTECTION"},

Actually, I think Sachin is converting the CIFS
STATUS_SHARING_VIOLATION to translate to EBUSY, since that seems to
better reflect the situation. I'd suggest dropping this patch, unless
you have a specific need for this error return here.

--
Jeff Layton <[email protected]>

2013-03-05 18:13:11

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On Mon, Mar 04, 2013 at 05:49:46PM -0500, Simo wrote:
> On 03/04/2013 04:19 PM, J. Bruce Fields wrote:
> >On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
> >>[possible resend -- sorry]
> >>
> >>On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
> >>>This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.
> >>>
> >>>These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.
> >>>
> >>>According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.
> >>>
> >>>So, we have four new flags:
> >>>O_DENYREAD - to prevent other opens with read access,
> >>>O_DENYWRITE - to prevent other opens with write access,
> >>>O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
> >>>O_DENYMAND - to switch on/off three flags above.
> >>O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
> >>better?
> >>
> >>Other than that, this seems like a sensible mechanism.
> >I'm a little more worried: these are mandatory locks, and applications
> >that use them are used to the locks being enforced correctly. Are we
> >sure that an application that opens a file O_DENYWRITE won't crash if it
> >sees the file data change while it holds the open?
>
> The redirector may simply assume it has full control of that part of
> the file and not read nor send data until the lock is released too,
> so you get conflicting views of the file contents between different
> clients if you let a mandatory lock not be mandatory.
>
> >In general the idea of making a mandatory lock opt-in makes me nervous.
> >I'd prefer something like a mount option, so that we know that everyone
> >on that one filesystem is playing by the same rules, but we can still
> >mount filesystems like / without the option.
>
> +1
>
> >But I'll admit I'm definitely not an expert on Windows locking and may
> >be missing something about how these locks are meant to work.
>
> Mandatory locks really are mandatory in Windows.
> That may not be nice to a Unix system but what can you do ?

I wonder if we could repurpose the existing -omand mount option?

That would be a problem for anyone that wants to allow mandatory fcntl
locks without allowing share locks. I doubt anyone sane actually uses
mandatory fcntl locks, but still I suppose it would probably be better
to play it safe and use a new mount option.

--b.

2013-03-11 18:57:28

by Pavel Shilovsky

[permalink] [raw]
Subject: Re: [PATCH v3 2/7] vfs: Add O_DENYREAD/WRITE flags support for open syscall

2013/3/11 Jeff Layton <[email protected]>:
> On Thu, 28 Feb 2013 19:25:28 +0400
> Pavel Shilovsky <[email protected]> wrote:
>
>> If O_DENYMAND flag is specified, O_DENYREAD/WRITE/MAND flags are
>> translated to flock's flags:
>>
>> !O_DENYREAD -> LOCK_READ
>> !O_DENYWRITE -> LOCK_WRITE
>> O_DENYMAND -> LOCK_MAND
>>
>> and set through flock_lock_file on a file.
>>
>> This change affects opens that use O_DENYMAND flag - all other
>> native Linux opens don't care about these flags. It allow us to
>> enable this feature for applications that need it (e.g. NFS and
>> Samba servers that export the same directory for Windows clients,
>> or Wine applications that access the same files simultaneously).
>>
>> Create codepath is slightly changed to prevent data races on
>> newely created files: when open with O_CREAT can return with -ETXTBSY
>> error for successfully created files due to a deny lock set by
>> another task.
>>
>> Signed-off-by: Pavel Shilovsky <[email protected]>
>> ---
>> fs/locks.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++------
>> fs/namei.c | 44 ++++++++++++++++++--
>> include/linux/fs.h | 6 +++
>> 3 files changed, 151 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/locks.c b/fs/locks.c
>> index a94e331..0cc7d1b 100644
>> --- a/fs/locks.c
>> +++ b/fs/locks.c
>> @@ -605,20 +605,81 @@ static int posix_locks_conflict(struct file_lock *caller_fl, struct file_lock *s
>> return (locks_conflict(caller_fl, sys_fl));
>> }
>>
>> -/* Determine if lock sys_fl blocks lock caller_fl. FLOCK specific
>> - * checking before calling the locks_conflict().
>> +static unsigned int
>> +deny_flags_to_cmd(unsigned int flags)
>> +{
>> + unsigned int cmd = LOCK_MAND;
>> +
>> + if (!(flags & O_DENYREAD))
>> + cmd |= LOCK_READ;
>> + if (!(flags & O_DENYWRITE))
>> + cmd |= LOCK_WRITE;
>> +
>> + return cmd;
>> +}
>> +
>> +/*
>> + * locks_mand_conflict - Determine if there's a share reservation conflict
>> + * @caller_fl: lock we're attempting to acquire
>> + * @sys_fl: lock already present on system that we're checking against
>> + *
>> + * Check to see if there's a share_reservation conflict. LOCK_READ/LOCK_WRITE
>> + * tell us whether the reservation allows other readers and writers.
>> + *
>> + * We only check against other LOCK_MAND locks, so applications that want to
>> + * use share mode locking will only conflict against one another. "normal"
>> + * applications that open files won't be affected by and won't themselves
>> + * affect the share reservations.
>> */
>> -static int flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
>> +static int
>> +locks_mand_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl)
>> {
>> - /* FLOCK locks referring to the same filp do not conflict with
>> + unsigned char caller_type = caller_fl->fl_type;
>> + unsigned char sys_type = sys_fl->fl_type;
>> + fmode_t caller_fmode = caller_fl->fl_file->f_mode;
>> + fmode_t sys_fmode = sys_fl->fl_file->f_mode;
>> +
>> + /* they can only conflict if they're both LOCK_MAND */
>> + if (!(caller_type & LOCK_MAND) || !(sys_type & LOCK_MAND))
>> + return 0;
>> +
>> + if (!(caller_type & LOCK_READ) && (sys_fmode & FMODE_READ))
>> + return 1;
>> + if (!(caller_type & LOCK_WRITE) && (sys_fmode & FMODE_WRITE))
>> + return 1;
>> + if (!(sys_type & LOCK_READ) && (caller_fmode & FMODE_READ))
>> + return 1;
>> + if (!(sys_type & LOCK_WRITE) && (caller_fmode & FMODE_WRITE))
>> + return 1;
>> +
>> + return 0;
>> +}
>> +
>> +/*
>> + * Determine if lock sys_fl blocks lock caller_fl. FLOCK specific checking
>> + * before calling the locks_conflict(). Boolean is_mand indicates whether
>> + * we should use a share reservation scheme or not.
>> + */
>> +static int
>> +flock_locks_conflict(struct file_lock *caller_fl, struct file_lock *sys_fl,
>> + bool is_mand)
>
> I'm not sure you really need to add this new is_mand bool. Won't that
> be equivalent to (caller_fl->fl_type & LOCK_MAND)?

This function is already used by flock system call that can pass
LOCK_MAND flag to caller_fl->fl_type. I don't want to affect existing
flock behavior by introduing new denylocking strategy - so, we need to
let flock_locks_conflict function know if we are in flock or open
codepath - in open codepath it will call locks_mand_conflict to check
if there is any other open that prevents us.

>
>> +{
>> + /*
>> + * FLOCK locks referring to the same filp do not conflict with
>> * each other.
>> */
>> - if (!IS_FLOCK(sys_fl) || (caller_fl->fl_file == sys_fl->fl_file))
>> - return (0);
>> - if ((caller_fl->fl_type & LOCK_MAND) || (sys_fl->fl_type & LOCK_MAND))
>> + if (!IS_FLOCK(sys_fl))
>> + return 0;
>> + if ((caller_fl->fl_type & LOCK_MAND) || (sys_fl->fl_type & LOCK_MAND)) {
>> + if (is_mand)
>> + return locks_mand_conflict(caller_fl, sys_fl);
>> + else
>> + return 0;
>> + }
>> + if (caller_fl->fl_file == sys_fl->fl_file)
>> return 0;
>>
>> - return (locks_conflict(caller_fl, sys_fl));
>> + return locks_conflict(caller_fl, sys_fl);
>> }
>>
>> void
>> @@ -697,14 +758,19 @@ static int posix_locks_deadlock(struct file_lock *caller_fl,
>> return 0;
>> }
>>
>> -/* Try to create a FLOCK lock on filp. We always insert new FLOCK locks
>> +/*
>> + * Try to create a FLOCK lock on filp. We always insert new FLOCK locks
>> * after any leases, but before any posix locks.
>> *
>> * Note that if called with an FL_EXISTS argument, the caller may determine
>> * whether or not a lock was successfully freed by testing the return
>> * value for -ENOENT.
>> + *
>> + * Take @is_conflict callback that determines how to check if locks have
>> + * conflicts or not.
>> */
>> -static int flock_lock_file(struct file *filp, struct file_lock *request)
>> +static int
>> +flock_lock_file(struct file *filp, struct file_lock *request, bool is_mand)
>
> Ditto here on the is_mand bool. I think you can determine that from
> request->fl_type. Right?

The same suggestions are applied to this place too.

>
>> {
>> struct file_lock *new_fl = NULL;
>> struct file_lock **before;
>> @@ -760,7 +826,7 @@ find_conflict:
>> break;
>> if (IS_LEASE(fl))
>> continue;
>> - if (!flock_locks_conflict(request, fl))
>> + if (!flock_locks_conflict(request, fl, is_mand))
>> continue;
>> error = -EAGAIN;
>> if (!(request->fl_flags & FL_SLEEP))
>> @@ -783,6 +849,32 @@ out:
>> return error;
>> }
>>
>> +/*
>> + * Determine if a file is allowed to be opened with specified access and deny
>> + * modes. Lock the file and return 0 if checks passed, otherwise return a error
>> + * code.
>> + */
>> +int
>> +deny_lock_file(struct file *filp)
>> +{
>> + struct file_lock *lock;
>> + int error = 0;
>> +
>> + if (!(filp->f_flags & O_DENYMAND))
>> + return error;
>> +
>> + error = flock_make_lock(filp, &lock, deny_flags_to_cmd(filp->f_flags));
>> + if (error)
>> + return error;
>> +
>> + error = flock_lock_file(filp, lock, true);
>> + if (error == -EAGAIN)
>> + error = -ETXTBSY;
>> +
>
> I think EBUSY would be a better return code here. ETXTBSY is returned
> in more specific circumstances -- mostly when you're opening a file for
> write that is being executed.

Yes, I agree. This work was done before the discussion in linux-cifs@
about a error code for STATUS_SHARING_VIOLATION.

>
>> + locks_free_lock(lock);
>> + return error;
>> +}
>> +
>> static int __posix_lock_file(struct inode *inode, struct file_lock *request, struct file_lock *conflock)
>> {
>> struct file_lock *fl;
>> @@ -1589,7 +1681,7 @@ int flock_lock_file_wait(struct file *filp, struct file_lock *fl)
>> int error;
>> might_sleep();
>> for (;;) {
>> - error = flock_lock_file(filp, fl);
>> + error = flock_lock_file(filp, fl, false);
>> if (error != FILE_LOCK_DEFERRED)
>> break;
>> error = wait_event_interruptible(fl->fl_wait, !fl->fl_next);
>> diff --git a/fs/namei.c b/fs/namei.c
>> index 43a97ee..c1f7e08 100644
>> --- a/fs/namei.c
>> +++ b/fs/namei.c
>> @@ -2559,9 +2559,14 @@ static int atomic_open(struct nameidata *nd, struct dentry *dentry,
>> * here.
>> */
>> error = may_open(&file->f_path, acc_mode, open_flag);
>> - if (error)
>> + if (error) {
>> fput(file);
>> + goto out;
>> + }
>>
>> + error = deny_lock_file(file);
>> + if (error)
>> + fput(file);
>> out:
>> dput(dentry);
>> return error;
>> @@ -2771,9 +2776,9 @@ retry_lookup:
>> }
>> mutex_lock(&dir->d_inode->i_mutex);
>> error = lookup_open(nd, path, file, op, got_write, opened);
>> - mutex_unlock(&dir->d_inode->i_mutex);
>>
>> if (error <= 0) {
>> + mutex_unlock(&dir->d_inode->i_mutex);
>> if (error)
>> goto out;
>>
>> @@ -2791,8 +2796,32 @@ retry_lookup:
>> will_truncate = false;
>> acc_mode = MAY_OPEN;
>> path_to_nameidata(path, nd);
>> - goto finish_open_created;
>> +
>> + /*
>> + * Unlock parent i_mutex later when the open finishes - prevent
>> + * races when a file can be locked with a deny lock by another
>> + * task that opens the file.
>> + */
>> + error = may_open(&nd->path, acc_mode, open_flag);
>> + if (error) {
>> + mutex_unlock(&dir->d_inode->i_mutex);
>> + goto out;
>> + }
>> + file->f_path.mnt = nd->path.mnt;
>> + error = finish_open(file, nd->path.dentry, NULL, opened);
>> + if (error) {
>> + mutex_unlock(&dir->d_inode->i_mutex);
>> + if (error == -EOPENSTALE)
>> + goto stale_open;
>> + goto out;
>> + }
>> + error = deny_lock_file(file);
>> + mutex_unlock(&dir->d_inode->i_mutex);
>> + if (error)
>> + goto exit_fput;
>> + goto opened;
>> }
>> + mutex_unlock(&dir->d_inode->i_mutex);
>>
>> /*
>> * create/update audit record if it already exists.
>> @@ -2885,6 +2914,15 @@ finish_open_created:
>> goto stale_open;
>> goto out;
>> }
>> + /*
>> + * Lock parent i_mutex to prevent races with deny locks on newely
>> + * created files.
>> + */
>> + mutex_lock(&dir->d_inode->i_mutex);
>> + error = deny_lock_file(file);
>> + mutex_unlock(&dir->d_inode->i_mutex);
>> + if (error)
>> + goto exit_fput;
>> opened:
>> error = open_check_o_direct(file);
>> if (error)
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 7617ee0..347e1de 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -1005,6 +1005,7 @@ extern int lease_modify(struct file_lock **, int);
>> extern int lock_may_read(struct inode *, loff_t start, unsigned long count);
>> extern int lock_may_write(struct inode *, loff_t start, unsigned long count);
>> extern void locks_delete_block(struct file_lock *waiter);
>> +extern int deny_lock_file(struct file *);
>> extern void lock_flocks(void);
>> extern void unlock_flocks(void);
>> #else /* !CONFIG_FILE_LOCKING */
>> @@ -1153,6 +1154,11 @@ static inline void locks_delete_block(struct file_lock *waiter)
>> {
>> }
>>
>> +static inline int deny_lock_file(struct file *filp)
>> +{
>> + return 0;
>> +}
>> +
>> static inline void lock_flocks(void)
>> {
>> }
>
>
> --
> Jeff Layton <[email protected]>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-cifs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html



--
Best regards,
Pavel Shilovsky.

2013-03-11 18:52:06

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 4/7] CIFS: Use NT_CREATE_ANDX command for forcemand mounts

On Thu, 28 Feb 2013 19:25:30 +0400
Pavel Shilovsky <[email protected]> wrote:

> forcemand mount option now lets us use Windows mandatory style of
> byte-range locks even if server supports posix ones - switches on
> Windows locking mechanism. Share flags is another locking mehanism
> provided by Windows semantic that can be used by NT_CREATE_ANDX
> command. This patch combines all Windows locking mechanism in one
> mount option by using NT_CREATE_ANDX to open files if forcemand is on.
>
> Signed-off-by: Pavel Shilovsky <[email protected]>
> ---
> fs/cifs/dir.c | 1 +
> fs/cifs/file.c | 6 ++++--
> 2 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
> index 6975072..139c8bc 100644
> --- a/fs/cifs/dir.c
> +++ b/fs/cifs/dir.c
> @@ -217,6 +217,7 @@ cifs_do_create(struct inode *inode, struct dentry *direntry, unsigned int xid,
> }
>
> if (tcon->unix_ext && cap_unix(tcon->ses) && !tcon->broken_posix_open &&
> + ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOPOSIXBRL) == 0) &&
> (CIFS_UNIX_POSIX_PATH_OPS_CAP &
> le64_to_cpu(tcon->fsUnixInfo.Capability))) {
> rc = cifs_posix_open(full_path, &newinode, inode->i_sb, mode,
> diff --git a/fs/cifs/file.c b/fs/cifs/file.c
> index 3ad484c..05191da 100644
> --- a/fs/cifs/file.c
> +++ b/fs/cifs/file.c
> @@ -454,8 +454,9 @@ int cifs_open(struct inode *inode, struct file *file)
> else
> oplock = 0;
>
> - if (!tcon->broken_posix_open && tcon->unix_ext &&
> - cap_unix(tcon->ses) && (CIFS_UNIX_POSIX_PATH_OPS_CAP &
> + if (!tcon->broken_posix_open && tcon->unix_ext && cap_unix(tcon->ses)
> + && ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOPOSIXBRL) == 0) &&
> + (CIFS_UNIX_POSIX_PATH_OPS_CAP &
> le64_to_cpu(tcon->fsUnixInfo.Capability))) {
> /* can not refresh inode info since size could be stale */
> rc = cifs_posix_open(full_path, &inode, inode->i_sb,
> @@ -623,6 +624,7 @@ cifs_reopen_file(struct cifsFileInfo *cfile, bool can_flush)
> oplock = 0;
>
> if (tcon->unix_ext && cap_unix(tcon->ses) &&
> + ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_NOPOSIXBRL) == 0) &&
> (CIFS_UNIX_POSIX_PATH_OPS_CAP &
> le64_to_cpu(tcon->fsUnixInfo.Capability))) {
> /*

Sounds reasonable...

Acked-by: Jeff Layton <[email protected]>

2013-03-04 21:19:26

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
> [possible resend -- sorry]
>
> On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
> > This patchset adds support of O_DENY* flags for Linux fs layer. These flags can be used by any application that needs share reservations to organize a file access. VFS already has some sort of this capability - now it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic. This patchset build new capabilities on top of the existing one but doesn't bring any changes into the flock call semantic.
> >
> > These flags can be used by NFS (built-in-kernel) and CIFS (Samba) servers and Wine applications through VFS (for local filesystems) or CIFS/NFS modules. This will help when e.g. Samba and NFS server share the same directory for Windows and Linux users or Wine applications use Samba/NFS share to access the same data from different clients.
> >
> > According to the previous discussions the most problematic question is how to prevent situations like DoS attacks where e.g /lib/liba.so file can be open with DENYREAD, or smth like this. That's why one extra flag O_DENYMAND is added. It indicates to underlying layer that an application want to use O_DENY* flags semantic. It allows us not affect native Linux applications (that don't use O_DENYMAND flag) - so, these flags (and the semantic of open syscall that they bring) are used only for those applications that really want it proccessed that way.
> >
> > So, we have four new flags:
> > O_DENYREAD - to prevent other opens with read access,
> > O_DENYWRITE - to prevent other opens with write access,
> > O_DENYDELETE - to prevent delete operations (this flag is not implemented in VFS and NFS part and only suitable for CIFS module),
> > O_DENYMAND - to switch on/off three flags above.
>
> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
> better?
>
> Other than that, this seems like a sensible mechanism.

I'm a little more worried: these are mandatory locks, and applications
that use them are used to the locks being enforced correctly. Are we
sure that an application that opens a file O_DENYWRITE won't crash if it
sees the file data change while it holds the open?

In general the idea of making a mandatory lock opt-in makes me nervous.
I'd prefer something like a mount option, so that we know that everyone
on that one filesystem is playing by the same rules, but we can still
mount filesystems like / without the option.

But I'll admit I'm definitely not an expert on Windows locking and may
be missing something about how these locks are meant to work.

--b.

2013-03-11 18:21:26

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On Mon, Mar 11, 2013 at 11:18:15AM -0700, Andy Lutomirski wrote:
> On Tue, Mar 5, 2013 at 11:07 AM, Simo <[email protected]> wrote:
> > On 03/05/2013 01:13 PM, J. Bruce Fields wrote:
> >>
> >> On Mon, Mar 04, 2013 at 05:49:46PM -0500, Simo wrote:
> >>>
> >>> On 03/04/2013 04:19 PM, J. Bruce Fields wrote:
> >>>>
> >>>> On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
> >>>>>
> >>>>> [possible resend -- sorry]
> >>>>>
> >>>>> On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
> >>>>>>
> >>>>>> This patchset adds support of O_DENY* flags for Linux fs layer. These
> >>>>>> flags can be used by any application that needs share reservations to
> >>>>>> organize a file access. VFS already has some sort of this capability - now
> >>>>>> it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic.
> >>>>>> This patchset build new capabilities on top of the existing one but doesn't
> >>>>>> bring any changes into the flock call semantic.
> >>>>>>
> >>>>>> These flags can be used by NFS (built-in-kernel) and CIFS (Samba)
> >>>>>> servers and Wine applications through VFS (for local filesystems) or
> >>>>>> CIFS/NFS modules. This will help when e.g. Samba and NFS server share the
> >>>>>> same directory for Windows and Linux users or Wine applications use
> >>>>>> Samba/NFS share to access the same data from different clients.
> >>>>>>
> >>>>>> According to the previous discussions the most problematic question is
> >>>>>> how to prevent situations like DoS attacks where e.g /lib/liba.so file can
> >>>>>> be open with DENYREAD, or smth like this. That's why one extra flag
> >>>>>> O_DENYMAND is added. It indicates to underlying layer that an application
> >>>>>> want to use O_DENY* flags semantic. It allows us not affect native Linux
> >>>>>> applications (that don't use O_DENYMAND flag) - so, these flags (and the
> >>>>>> semantic of open syscall that they bring) are used only for those
> >>>>>> applications that really want it proccessed that way.
> >>>>>>
> >>>>>> So, we have four new flags:
> >>>>>> O_DENYREAD - to prevent other opens with read access,
> >>>>>> O_DENYWRITE - to prevent other opens with write access,
> >>>>>> O_DENYDELETE - to prevent delete operations (this flag is not
> >>>>>> implemented in VFS and NFS part and only suitable for CIFS module),
> >>>>>> O_DENYMAND - to switch on/off three flags above.
> >>>>>
> >>>>> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
> >>>>> better?
> >>>>>
> >>>>> Other than that, this seems like a sensible mechanism.
> >>>>
> >>>> I'm a little more worried: these are mandatory locks, and applications
> >>>> that use them are used to the locks being enforced correctly. Are we
> >>>> sure that an application that opens a file O_DENYWRITE won't crash if it
> >>>> sees the file data change while it holds the open?
> >>>
> >>> The redirector may simply assume it has full control of that part of
> >>> the file and not read nor send data until the lock is released too,
> >>> so you get conflicting views of the file contents between different
> >>> clients if you let a mandatory lock not be mandatory.
> >>>
> >>>> In general the idea of making a mandatory lock opt-in makes me nervous.
> >>>> I'd prefer something like a mount option, so that we know that everyone
> >>>> on that one filesystem is playing by the same rules, but we can still
> >>>> mount filesystems like / without the option.
> >>>
> >>> +1
> >>>
> >>>> But I'll admit I'm definitely not an expert on Windows locking and may
> >>>> be missing something about how these locks are meant to work.
> >>>
> >>> Mandatory locks really are mandatory in Windows.
> >>> That may not be nice to a Unix system but what can you do ?
> >>
> >> I wonder if we could repurpose the existing -omand mount option?
> >>
> >> That would be a problem for anyone that wants to allow mandatory fcntl
> >> locks without allowing share locks. I doubt anyone sane actually uses
> >> mandatory fcntl locks, but still I suppose it would probably be better
> >> to play it safe and use a new mount option.
> >
> >
> > Maybe we should have a -o win_semantics option :-)
> >
>
> It's not entirely obvious to me that allowing programs to bypass this
> kind of locking is a bad idea. It's hard to do on Windows, but
> presumably network filesystems on Windows already have this issue.

Could be, but I'd like to see evidence of that.

--b.

2013-03-11 18:18:36

by Andy Lutomirski

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On Tue, Mar 5, 2013 at 11:07 AM, Simo <[email protected]> wrote:
> On 03/05/2013 01:13 PM, J. Bruce Fields wrote:
>>
>> On Mon, Mar 04, 2013 at 05:49:46PM -0500, Simo wrote:
>>>
>>> On 03/04/2013 04:19 PM, J. Bruce Fields wrote:
>>>>
>>>> On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
>>>>>
>>>>> [possible resend -- sorry]
>>>>>
>>>>> On 02/28/2013 07:25 AM, Pavel Shilovsky wrote:
>>>>>>
>>>>>> This patchset adds support of O_DENY* flags for Linux fs layer. These
>>>>>> flags can be used by any application that needs share reservations to
>>>>>> organize a file access. VFS already has some sort of this capability - now
>>>>>> it's done through flock/LOCK_MAND mechanis, but that approach is non-atomic.
>>>>>> This patchset build new capabilities on top of the existing one but doesn't
>>>>>> bring any changes into the flock call semantic.
>>>>>>
>>>>>> These flags can be used by NFS (built-in-kernel) and CIFS (Samba)
>>>>>> servers and Wine applications through VFS (for local filesystems) or
>>>>>> CIFS/NFS modules. This will help when e.g. Samba and NFS server share the
>>>>>> same directory for Windows and Linux users or Wine applications use
>>>>>> Samba/NFS share to access the same data from different clients.
>>>>>>
>>>>>> According to the previous discussions the most problematic question is
>>>>>> how to prevent situations like DoS attacks where e.g /lib/liba.so file can
>>>>>> be open with DENYREAD, or smth like this. That's why one extra flag
>>>>>> O_DENYMAND is added. It indicates to underlying layer that an application
>>>>>> want to use O_DENY* flags semantic. It allows us not affect native Linux
>>>>>> applications (that don't use O_DENYMAND flag) - so, these flags (and the
>>>>>> semantic of open syscall that they bring) are used only for those
>>>>>> applications that really want it proccessed that way.
>>>>>>
>>>>>> So, we have four new flags:
>>>>>> O_DENYREAD - to prevent other opens with read access,
>>>>>> O_DENYWRITE - to prevent other opens with write access,
>>>>>> O_DENYDELETE - to prevent delete operations (this flag is not
>>>>>> implemented in VFS and NFS part and only suitable for CIFS module),
>>>>>> O_DENYMAND - to switch on/off three flags above.
>>>>>
>>>>> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
>>>>> better?
>>>>>
>>>>> Other than that, this seems like a sensible mechanism.
>>>>
>>>> I'm a little more worried: these are mandatory locks, and applications
>>>> that use them are used to the locks being enforced correctly. Are we
>>>> sure that an application that opens a file O_DENYWRITE won't crash if it
>>>> sees the file data change while it holds the open?
>>>
>>> The redirector may simply assume it has full control of that part of
>>> the file and not read nor send data until the lock is released too,
>>> so you get conflicting views of the file contents between different
>>> clients if you let a mandatory lock not be mandatory.
>>>
>>>> In general the idea of making a mandatory lock opt-in makes me nervous.
>>>> I'd prefer something like a mount option, so that we know that everyone
>>>> on that one filesystem is playing by the same rules, but we can still
>>>> mount filesystems like / without the option.
>>>
>>> +1
>>>
>>>> But I'll admit I'm definitely not an expert on Windows locking and may
>>>> be missing something about how these locks are meant to work.
>>>
>>> Mandatory locks really are mandatory in Windows.
>>> That may not be nice to a Unix system but what can you do ?
>>
>> I wonder if we could repurpose the existing -omand mount option?
>>
>> That would be a problem for anyone that wants to allow mandatory fcntl
>> locks without allowing share locks. I doubt anyone sane actually uses
>> mandatory fcntl locks, but still I suppose it would probably be better
>> to play it safe and use a new mount option.
>
>
> Maybe we should have a -o win_semantics option :-)
>

It's not entirely obvious to me that allowing programs to bypass this
kind of locking is a bad idea. It's hard to do on Windows, but
presumably network filesystems on Windows already have this issue.

--Andy

2013-03-11 20:31:14

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

On Mon, Mar 11, 2013 at 01:25:20PM -0700, Frank S Filz wrote:
>
> "J. Bruce Fields" <[email protected]> wrote
> > On Mon, Mar 11, 2013 at 04:08:44PM -0400, Jeff Layton wrote:
> > > On Mon, 11 Mar 2013 15:36:38 -0400
> > > "J. Bruce Fields" <[email protected]> wrote:
> > > > > It doesn't look like this patch removes any of that old code
> though. I
> > > > > think it probably should, or there ought to be some consideration
> of
> > > > > how this new stuff will mesh with it.
> > > > >
> > > > > I think you have 2 choices here:
> > > > >
> > > > > 1/ rip out the old share reservation code altogether and require
> that
> > > > > filesystems mount with -o sharemand or whatever if they want to
> allow
> > > > > their enforcement
> > > > >
> > > > > 2/ make knfsd fall back to using the internal share reservation
> code
> > > > > when the mount option isn't enabled
> > > > >
> > > > > Personally, I think #1 would be fine, but Bruce may want to weigh
> in on
> > > > > what he'd prefer.
> > > >
> > > > #1 sounds good. Clients that use deny bits are few. My preference
> > > > would be to return an error to such clients in the case share locks
> > > > aren't available.
> > > >
> > > > (We're a little out of spec there, so I'm not sure which error. I
> think
> > > > the goal is to notify a human there's a problem with minimal
> collateral
> > > > damange.
> > > >
> > > > NFS4ERR_SERVERFAULT ("I'm a buggy server, sorry about that!") would
> > > > probably result in an IO error to the application.
> > > >
> > > > SHARE_DENIED strikes me as unsafe: an application would be in its
> rights
> > > > not to even check for that e.g. in the case of an exclusive create.
>
> Hmm, shouldn't the client catch that with a "default" case at least?
>
> > > > Maybe DELAY? Kind of ridiculous, but blocking the application
> > > > indefinitely would probably get someone's attention quickly enough
> > > > without doing any damnage.)
> > > >
> > >
> > > I agree that we should return an error, but hadn't considered what
> > > error. Given that hardly any NFS clients use them, I'd probably just go
> > > with NFS4ERR_SERVERFAULT, and maybe throw a printk or something on the
> > > server about enabling share reservations for superblock x:y.
> >
> > Sounds reasonable.
>
> If I'm understanding, the suggestion is a mount option to enable share
> reservations and if so, they will be mandatory?
>
> In that case, perhaps we want to keep the existing knfsd code as a
> fallback, someone might want to support them, but not have them be
> mandatory (if nothing else, you may cause consternation from folks running
> pynfs against a default configured knfsd server....).

Understood, but the benefit is slight and the cost (in complexity) is
rather large. On balance I'd far prefer to get rid of any fallback code
entirely.

> In the Ganesha project, we provide an internal implementation of share
> reservations for when the underlying system can not support them.
>
> Another bit to consider, does lockd provide share reservations for NLM?

Yes. I don't know if anyone's tested them in recent memory! But it
might be interesting to write a few simple tests for them and hook them
up to this on the server side. (I don't know if they'd be worth
implementing on the client side?)

--b.

2013-03-01 08:26:13

by David Laight

[permalink] [raw]
Subject: Re: [PATCH v3 0/7] Add O_DENY* support for VFS and CIFS/NFS

On Thu, Feb 28, 2013 at 01:53:25PM -0800, Andy Lutomirski wrote:
> > O_DENYMAND - to switch on/off three flags above.
>
> O_DENYMAND doesn't deny anything. Would a name like O_RESPECT_DENY be
> better?

Possibly rename to O_CHECK_DENY ?

David

--
David Laight: [email protected]

2013-03-11 19:05:48

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

On Thu, 28 Feb 2013 19:25:33 +0400
Pavel Shilovsky <[email protected]> wrote:

> that maps them into O_DENY flags and make them visible for
> applications that use O_DENYMAND opens.
>
> Signed-off-by: Pavel Shilovsky <[email protected]>
> ---
> fs/locks.c | 1 +
> fs/nfsd/nfs4state.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 46 insertions(+), 1 deletion(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index 0cc7d1b..593d464 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -874,6 +874,7 @@ deny_lock_file(struct file *filp)
> locks_free_lock(lock);
> return error;
> }
> +EXPORT_SYMBOL(deny_lock_file);
>
> static int __posix_lock_file(struct inode *inode, struct file_lock *request, struct file_lock *conflock)
> {
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index ac8ed96c..766256a 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -476,6 +476,19 @@ test_deny(u32 access, struct nfs4_ol_stateid *stp)
> return test_bit(access, &stp->st_deny_bmap);
> }
>
> +static int nfs4_deny_to_odeny(u32 access)
> +{
> + switch (access & NFS4_SHARE_DENY_BOTH) {
> + case NFS4_SHARE_DENY_READ:
> + return O_DENYMAND | O_DENYREAD;
> + case NFS4_SHARE_DENY_WRITE:
> + return O_DENYWRITE | O_DENYMAND;
> + case NFS4_SHARE_DENY_BOTH:
> + return O_DENYREAD | O_DENYWRITE | O_DENYMAND;
> + }
> + return O_DENYMAND;
> +}
> +
> static int nfs4_access_to_omode(u32 access)
> {
> switch (access & NFS4_SHARE_ACCESS_BOTH) {
> @@ -2793,6 +2806,21 @@ nfsd4_truncate(struct svc_rqst *rqstp, struct svc_fh *fh,
> }
>
> static __be32
> +nfs4_vfs_set_deny(struct nfs4_file *fp, unsigned long share_access,
> + unsigned long deny_access)
> +{
> + int oflag, rc;
> + __be32 status = nfs_ok;
> +
> + oflag = nfs4_access_to_omode(share_access);
> + fp->fi_fds[oflag]->f_flags |= nfs4_deny_to_odeny(deny_access);
> + rc = deny_lock_file(fp->fi_fds[oflag]);
> + if (rc)
> + status = nfserrno(rc);
> + return status;
> +}
> +
> +static __be32
> nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *cur_fh, struct nfs4_ol_stateid *stp, struct nfsd4_open *open)
> {
> u32 op_share_access = open->op_share_access;
> @@ -2813,6 +2841,14 @@ nfs4_upgrade_open(struct svc_rqst *rqstp, struct nfs4_file *fp, struct svc_fh *c
> }
> return status;
> }
> + status = nfs4_vfs_set_deny(fp, op_share_access, open->op_share_deny);
> + if (status) {
> + if (new_access) {
> + int oflag = nfs4_access_to_omode(op_share_access);
> + nfs4_file_put_access(fp, oflag);
> + }
> + return status;
> + }
> /* remember the open */
> set_access(op_share_access, stp);
> set_deny(open->op_share_deny, stp);
> @@ -3046,7 +3082,7 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
>
> /*
> * OPEN the file, or upgrade an existing OPEN.
> - * If truncate fails, the OPEN fails.
> + * If truncate or setting deny fails, the OPEN fails.
> */
> if (stp) {
> /* Stateid was found, this is an OPEN upgrade */
> @@ -3060,6 +3096,10 @@ nfsd4_process_open2(struct svc_rqst *rqstp, struct svc_fh *current_fh, struct nf
> status = nfsd4_truncate(rqstp, current_fh, open);
> if (status)
> goto out;
> + status = nfs4_vfs_set_deny(fp, open->op_share_access,
> + open->op_share_deny);
> + if (status)
> + goto out;
> stp = open->op_stp;
> open->op_stp = NULL;
> init_open_stateid(stp, fp, open);
> @@ -3758,6 +3798,10 @@ nfsd4_open_downgrade(struct svc_rqst *rqstp,
> }
> nfs4_stateid_downgrade(stp, od->od_share_access);
>
> + status = nfs4_vfs_set_deny(stp->st_file, od->od_share_access,
> + od->od_share_deny);
> + if (status)
> + goto out;
> reset_union_bmap_deny(od->od_share_deny, stp);
>
> update_stateid(&stp->st_stid.sc_stateid);

knfsd has some code already to handle share reservations internally.
Nothing outside of knfsd is aware of these reservations, of course so
moving to a vfs-level object for it would be a marked improvement.

It doesn't look like this patch removes any of that old code though. I
think it probably should, or there ought to be some consideration of
how this new stuff will mesh with it.

I think you have 2 choices here:

1/ rip out the old share reservation code altogether and require that
filesystems mount with -o sharemand or whatever if they want to allow
their enforcement

2/ make knfsd fall back to using the internal share reservation code
when the mount option isn't enabled

Personally, I think #1 would be fine, but Bruce may want to weigh in on
what he'd prefer.

--
Jeff Layton <[email protected]>

2013-03-11 20:25:20

by Frank S Filz

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS


"J. Bruce Fields" <[email protected]> wrote
> On Mon, Mar 11, 2013 at 04:08:44PM -0400, Jeff Layton wrote:
> > On Mon, 11 Mar 2013 15:36:38 -0400
> > "J. Bruce Fields" <[email protected]> wrote:
> > > > It doesn't look like this patch removes any of that old code
though. I
> > > > think it probably should, or there ought to be some consideration
of
> > > > how this new stuff will mesh with it.
> > > >
> > > > I think you have 2 choices here:
> > > >
> > > > 1/ rip out the old share reservation code altogether and require
that
> > > > filesystems mount with -o sharemand or whatever if they want to
allow
> > > > their enforcement
> > > >
> > > > 2/ make knfsd fall back to using the internal share reservation
code
> > > > when the mount option isn't enabled
> > > >
> > > > Personally, I think #1 would be fine, but Bruce may want to weigh
in on
> > > > what he'd prefer.
> > >
> > > #1 sounds good. Clients that use deny bits are few. My preference
> > > would be to return an error to such clients in the case share locks
> > > aren't available.
> > >
> > > (We're a little out of spec there, so I'm not sure which error. I
think
> > > the goal is to notify a human there's a problem with minimal
collateral
> > > damange.
> > >
> > > NFS4ERR_SERVERFAULT ("I'm a buggy server, sorry about that!") would
> > > probably result in an IO error to the application.
> > >
> > > SHARE_DENIED strikes me as unsafe: an application would be in its
rights
> > > not to even check for that e.g. in the case of an exclusive create.

Hmm, shouldn't the client catch that with a "default" case at least?

> > > Maybe DELAY? Kind of ridiculous, but blocking the application
> > > indefinitely would probably get someone's attention quickly enough
> > > without doing any damnage.)
> > >
> >
> > I agree that we should return an error, but hadn't considered what
> > error. Given that hardly any NFS clients use them, I'd probably just go
> > with NFS4ERR_SERVERFAULT, and maybe throw a printk or something on the
> > server about enabling share reservations for superblock x:y.
>
> Sounds reasonable.

If I'm understanding, the suggestion is a mount option to enable share
reservations and if so, they will be mandatory?

In that case, perhaps we want to keep the existing knfsd code as a
fallback, someone might want to support them, but not have them be
mandatory (if nothing else, you may cause consternation from folks running
pynfs against a default configured knfsd server....).

In the Ganesha project, we provide an internal implementation of share
reservations for when the underlying system can not support them.

Another bit to consider, does lockd provide share reservations for NLM?

Frank


Attachments:
(No filename) (0.00 B)
(No filename) (1.00 B)
Download all attachments

2013-03-11 20:37:41

by Frank S Filz

[permalink] [raw]
Subject: Re: [PATCH v3 7/7] NFSD: Pass share reservations flags to VFS

"J. Bruce Fields" <[email protected]> wrote> >
> > Another bit to consider, does lockd provide share reservations for NLM?
>
> Yes. I don't know if anyone's tested them in recent memory! But it
> might be interesting to write a few simple tests for them and hook them
> up to this on the server side. (I don't know if they'd be worth
> implementing on the client side?)

Oh, they would be REALLY NICE to have on the client side... Then we can
test NLM_SHARE with an open source client and not only with Windows...

Frank


Attachments:
(No filename) (0.00 B)
(No filename) (1.00 B)
Download all attachments

2013-04-04 10:30:13

by Pavel Shilovsky

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] NFSv4: Add O_DENY* open flags support

2013/3/12 Jeff Layton <[email protected]>:
> On Mon, 11 Mar 2013 14:54:34 -0400
> Jeff Layton <[email protected]> wrote:
>
>> On Thu, 28 Feb 2013 19:25:32 +0400
>> Pavel Shilovsky <[email protected]> wrote:
>>
>> > by passing these flags to NFSv4 open request.
>> >
>> > Signed-off-by: Pavel Shilovsky <[email protected]>
>> > ---
>> > fs/nfs/nfs4xdr.c | 24 ++++++++++++++++++++----
>> > 1 file changed, 20 insertions(+), 4 deletions(-)
>> >
>> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
>> > index 26b1439..58ddc74 100644
>> > --- a/fs/nfs/nfs4xdr.c
>> > +++ b/fs/nfs/nfs4xdr.c
>> > @@ -1325,7 +1325,8 @@ static void encode_lookup(struct xdr_stream *xdr, const struct qstr *name, struc
>> > encode_string(xdr, name->len, name->name);
>> > }
>> >
>> > -static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
>> > +static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode,
>> > + int open_flags)
>> > {
>> > __be32 *p;
>> >
>> > @@ -1343,7 +1344,22 @@ static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
>> > default:
>> > *p++ = cpu_to_be32(0);
>> > }
>> > - *p = cpu_to_be32(0); /* for linux, share_deny = 0 always */
>> > + if (open_flags & O_DENYMAND) {
>>
>>
>> As Bruce mentioned, I think a mount option to enable this on a per-fs
>> basis would be a better approach than this new O_DENYMAND flag.
>>
>>
>> > + switch (open_flags & (O_DENYREAD|O_DENYWRITE)) {
>> > + case O_DENYREAD:
>> > + *p = cpu_to_be32(NFS4_SHARE_DENY_READ);
>> > + break;
>> > + case O_DENYWRITE:
>> > + *p = cpu_to_be32(NFS4_SHARE_DENY_WRITE);
>> > + break;
>> > + case O_DENYREAD|O_DENYWRITE:
>> > + *p = cpu_to_be32(NFS4_SHARE_DENY_BOTH);
>> > + break;
>> > + default:
>> > + *p = cpu_to_be32(0);
>> > + }
>> > + } else
>> > + *p = cpu_to_be32(0);
>> > }
>> >
>> > static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
>> > @@ -1354,7 +1370,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
>> > * owner 4 = 32
>> > */
>> > encode_nfs4_seqid(xdr, arg->seqid);
>> > - encode_share_access(xdr, arg->fmode);
>> > + encode_share_access(xdr, arg->fmode, arg->open_flags);
>> > p = reserve_space(xdr, 36);
>> > p = xdr_encode_hyper(p, arg->clientid);
>> > *p++ = cpu_to_be32(24);
>> > @@ -1491,7 +1507,7 @@ static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_close
>> > encode_op_hdr(xdr, OP_OPEN_DOWNGRADE, decode_open_downgrade_maxsz, hdr);
>> > encode_nfs4_stateid(xdr, arg->stateid);
>> > encode_nfs4_seqid(xdr, arg->seqid);
>> > - encode_share_access(xdr, arg->fmode);
>> > + encode_share_access(xdr, arg->fmode, 0);
>> > }
>> >
>> > static void
>>
>>
>> Other than that, this seems reasonable.
>>
>> Acked-by: Jeff Layton <[email protected]>
>
> Oh duh...
>
> Please ignore my comment on patch #7 to add a patch for the NFS client.
> This one does that. That said, there may be a potential problem here
> that you need to consider.
>
> In the case of a local filesystem you'll want to set deny locks using
> deny_lock_file(). For a network filesystem like CIFS or NFS though,
> the server will handle that atomically during the open. You need to
> ensure that you don't go trying to set LOCK_MAND locks on the file once
> that's done.
>
> Perhaps you can use a fstype flag to indicate that the filesystem
> handles this during the open and doesn't need to try and set a flock
> lock?

Also, we can simply mask off O_DENY* flags in open (and atomic_open)
codepath of filesystems that support these flags:

...
do open request to the storage
...
file->f_flags &= ~(O_DENYREAD | O_DENYWRITE | O_DENYDELETE);
...
return to VFS
...

Thoughts?

--
Best regards,
Pavel Shilovsky.

2013-04-04 13:02:40

by Jeff Layton

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] NFSv4: Add O_DENY* open flags support

On Thu, 4 Apr 2013 14:30:12 +0400
Pavel Shilovsky <[email protected]> wrote:

> 2013/3/12 Jeff Layton <[email protected]>:
> > On Mon, 11 Mar 2013 14:54:34 -0400
> > Jeff Layton <[email protected]> wrote:
> >
> >> On Thu, 28 Feb 2013 19:25:32 +0400
> >> Pavel Shilovsky <[email protected]> wrote:
> >>
> >> > by passing these flags to NFSv4 open request.
> >> >
> >> > Signed-off-by: Pavel Shilovsky <[email protected]>
> >> > ---
> >> > fs/nfs/nfs4xdr.c | 24 ++++++++++++++++++++----
> >> > 1 file changed, 20 insertions(+), 4 deletions(-)
> >> >
> >> > diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
> >> > index 26b1439..58ddc74 100644
> >> > --- a/fs/nfs/nfs4xdr.c
> >> > +++ b/fs/nfs/nfs4xdr.c
> >> > @@ -1325,7 +1325,8 @@ static void encode_lookup(struct xdr_stream *xdr, const struct qstr *name, struc
> >> > encode_string(xdr, name->len, name->name);
> >> > }
> >> >
> >> > -static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
> >> > +static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode,
> >> > + int open_flags)
> >> > {
> >> > __be32 *p;
> >> >
> >> > @@ -1343,7 +1344,22 @@ static void encode_share_access(struct xdr_stream *xdr, fmode_t fmode)
> >> > default:
> >> > *p++ = cpu_to_be32(0);
> >> > }
> >> > - *p = cpu_to_be32(0); /* for linux, share_deny = 0 always */
> >> > + if (open_flags & O_DENYMAND) {
> >>
> >>
> >> As Bruce mentioned, I think a mount option to enable this on a per-fs
> >> basis would be a better approach than this new O_DENYMAND flag.
> >>
> >>
> >> > + switch (open_flags & (O_DENYREAD|O_DENYWRITE)) {
> >> > + case O_DENYREAD:
> >> > + *p = cpu_to_be32(NFS4_SHARE_DENY_READ);
> >> > + break;
> >> > + case O_DENYWRITE:
> >> > + *p = cpu_to_be32(NFS4_SHARE_DENY_WRITE);
> >> > + break;
> >> > + case O_DENYREAD|O_DENYWRITE:
> >> > + *p = cpu_to_be32(NFS4_SHARE_DENY_BOTH);
> >> > + break;
> >> > + default:
> >> > + *p = cpu_to_be32(0);
> >> > + }
> >> > + } else
> >> > + *p = cpu_to_be32(0);
> >> > }
> >> >
> >> > static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_openargs *arg)
> >> > @@ -1354,7 +1370,7 @@ static inline void encode_openhdr(struct xdr_stream *xdr, const struct nfs_opena
> >> > * owner 4 = 32
> >> > */
> >> > encode_nfs4_seqid(xdr, arg->seqid);
> >> > - encode_share_access(xdr, arg->fmode);
> >> > + encode_share_access(xdr, arg->fmode, arg->open_flags);
> >> > p = reserve_space(xdr, 36);
> >> > p = xdr_encode_hyper(p, arg->clientid);
> >> > *p++ = cpu_to_be32(24);
> >> > @@ -1491,7 +1507,7 @@ static void encode_open_downgrade(struct xdr_stream *xdr, const struct nfs_close
> >> > encode_op_hdr(xdr, OP_OPEN_DOWNGRADE, decode_open_downgrade_maxsz, hdr);
> >> > encode_nfs4_stateid(xdr, arg->stateid);
> >> > encode_nfs4_seqid(xdr, arg->seqid);
> >> > - encode_share_access(xdr, arg->fmode);
> >> > + encode_share_access(xdr, arg->fmode, 0);
> >> > }
> >> >
> >> > static void
> >>
> >>
> >> Other than that, this seems reasonable.
> >>
> >> Acked-by: Jeff Layton <[email protected]>
> >
> > Oh duh...
> >
> > Please ignore my comment on patch #7 to add a patch for the NFS client.
> > This one does that. That said, there may be a potential problem here
> > that you need to consider.
> >
> > In the case of a local filesystem you'll want to set deny locks using
> > deny_lock_file(). For a network filesystem like CIFS or NFS though,
> > the server will handle that atomically during the open. You need to
> > ensure that you don't go trying to set LOCK_MAND locks on the file once
> > that's done.
> >
> > Perhaps you can use a fstype flag to indicate that the filesystem
> > handles this during the open and doesn't need to try and set a flock
> > lock?
>
> Also, we can simply mask off O_DENY* flags in open (and atomic_open)
> codepath of filesystems that support these flags:
>
> ...
> do open request to the storage
> ...
> file->f_flags &= ~(O_DENYREAD | O_DENYWRITE | O_DENYDELETE);
> ...
> return to VFS
> ...
>
> Thoughts?
>

I'd probably still stick with a FS_* flag for this...

That sort of mechanism would work (for now) but sounds like the sort of
subtle behavior that's difficult for filesystem authors to get right.
It would also be subject to subtle breakage later.

Also, suppose there are changes in the future that require you to
determine this before calling into ->open? Then you'll have to go back
and somehow mark the fs anyway...

--
Jeff Layton <[email protected]>

2013-04-04 17:45:21

by Pavel Shilovsky

[permalink] [raw]
Subject: Re: [PATCH v3 6/7] NFSv4: Add O_DENY* open flags support

2013/4/4 Jeff Layton <[email protected]>:
> I'd probably still stick with a FS_* flag for this...
>
> That sort of mechanism would work (for now) but sounds like the sort of
> subtle behavior that's difficult for filesystem authors to get right.
> It would also be subject to subtle breakage later.
>
> Also, suppose there are changes in the future that require you to
> determine this before calling into ->open? Then you'll have to go back
> and somehow mark the fs anyway...

Ok, this makes sense, thanks. Will do it this way and repost.

--
Best regards,
Pavel Shilovsky.