From: Dmitry Monakhov
To: Jens Axboe, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: ming.l@ssi.samsung.com, david@fromorbit.com, Jens Axboe
Subject: Re: [PATCH 2/7] Add support for per-file stream ID
In-Reply-To: <1427296070-8472-3-git-send-email-axboe@fb.com>
References: <1427296070-8472-1-git-send-email-axboe@fb.com> <1427296070-8472-3-git-send-email-axboe@fb.com>
Date: Thu, 09 Apr 2015 12:30:40 +0300
Message-ID: <87h9sp643j.fsf@openvz.org>

Jens Axboe writes:

One small question. You state that all IDs are equal, but can we
reserve some IDs for internal kernel purposes? For example, for very
short-lived data (files opened with O_TEMP) and so on.
Also, a small nitpick; see below.

> Writing on flash devices can be much more efficient, if we can
> inform the device what kind of data can be grouped together. If
> the device is able to group data together with similar lifetimes,
> then it can be more efficient in garbage collection. This, in turn,
> leads to lower write amplification, which is a win on both device
> wear and performance.
>
> Add a new fadvise hint, POSIX_FADV_STREAMID, which sets the file
> and inode streamid. The file streamid is used if we have the file
> available at the time of the write (O_DIRECT), we use the inode
> streamid if not (buffered writeback). The fadvise hint uses the
> 'offset' field to specify a stream ID.
>
> Signed-off-by: Jens Axboe
> ---
>  fs/inode.c                   |  1 +
>  fs/open.c                    |  1 +
>  include/linux/fs.h           | 23 +++++++++++++++++++++++
>  include/uapi/linux/fadvise.h |  2 ++
>  mm/fadvise.c                 | 17 +++++++++++++++++
>  5 files changed, 44 insertions(+)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index f00b16f45507..41885322ba64 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -149,6 +149,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>  	inode->i_blocks = 0;
>  	inode->i_bytes = 0;
>  	inode->i_generation = 0;
> +	inode->i_streamid = 0;
>  	inode->i_pipe = NULL;
>  	inode->i_bdev = NULL;
>  	inode->i_cdev = NULL;
> diff --git a/fs/open.c b/fs/open.c
> index 33f9cbf2610b..4a9b2be1a674 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -743,6 +743,7 @@ static int do_dentry_open(struct file *f,
>  	f->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
>
>  	file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
> +	f->f_streamid = 0;
>
>  	return 0;
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index b4d71b5e1ff2..43dde70c1d0d 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -631,6 +631,7 @@ struct inode {
>  	};
>
>  	__u32			i_generation;
> +	unsigned int		i_streamid;
>
>  #ifdef CONFIG_FSNOTIFY
>  	__u32			i_fsnotify_mask; /* all events this inode cares about */
> @@ -640,6 +641,14 @@ struct inode {
>  	void			*i_private; /* fs or device private pointer */
>  };
>
> +static inline unsigned int inode_streamid(struct inode *inode)
> +{
> +	if (inode)
> +		return inode->i_streamid;
> +
> +	return 0;
> +}
> +
>  static inline int inode_unhashed(struct inode *inode)
>  {
>  	return hlist_unhashed(&inode->i_hash);
> @@ -820,6 +829,8 @@ struct file {
>  	const struct cred	*f_cred;
>  	struct file_ra_state	f_ra;
>
> +	unsigned int		f_streamid;
> +
>  	u64			f_version;
>  #ifdef CONFIG_SECURITY
>  	void			*f_security;
> @@ -842,6 +853,18 @@ struct file_handle {
>  	unsigned char f_handle[0];
>  };
>
> +/*
> + * If the file doesn't have a stream ID set, return the inode stream ID
> + * in case that has been set.
> + */
> +static inline unsigned int file_streamid(struct file *f)
> +{
> +	if (f->f_streamid)
> +		return f->f_streamid;
> +
> +	return inode_streamid(f->f_inode);
> +}
> +
>  static inline struct file *get_file(struct file *f)
>  {
>  	atomic_long_inc(&f->f_count);
> diff --git a/include/uapi/linux/fadvise.h b/include/uapi/linux/fadvise.h
> index e8e747139b9a..3dc8a1ff1422 100644
> --- a/include/uapi/linux/fadvise.h
> +++ b/include/uapi/linux/fadvise.h
> @@ -18,4 +18,6 @@
>  #define POSIX_FADV_NOREUSE	5 /* Data will be accessed once.  */
>  #endif
>
> +#define POSIX_FADV_STREAMID	8 /* associate stream ID with file */
> +
>  #endif	/* FADVISE_H_INCLUDED */
> diff --git a/mm/fadvise.c b/mm/fadvise.c
> index 4a3907cf79f8..b111a8899fb7 100644
> --- a/mm/fadvise.c
> +++ b/mm/fadvise.c
> @@ -60,6 +60,7 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
>  		case POSIX_FADV_WILLNEED:
>  		case POSIX_FADV_NOREUSE:
>  		case POSIX_FADV_DONTNEED:
> +		case POSIX_FADV_STREAMID:
>  			/* no bad return value, but ignore advice */
>  			break;
>  		default:
> @@ -144,6 +145,22 @@ SYSCALL_DEFINE4(fadvise64_64, int, fd, loff_t, offset, loff_t, len, int, advice)
>  			}
>  		}
>  		break;
> +	case POSIX_FADV_STREAMID:
> +		/*
> +		 * streamid is stored in offset... we don't limit or check
> +		 * if the device supports streams, or if it does, if the
> +		 * stream nr is within the limits. 1 is the lowest valid
> +		 * stream id, 0 is "don't care/know".
> +		 */
> +		if (offset != (unsigned int) offset)
> +			ret = EINVAL;

Should be negative: ret = -EINVAL;

> +		else {
> +			f.file->f_streamid = offset;
> +			spin_lock(&inode->i_lock);
> +			inode->i_streamid = offset;
> +			spin_unlock(&inode->i_lock);
> +		}
> +		break;
>  	default:
>  		ret = -EINVAL;
>  	}
> --
> 1.9.1
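
To make the nitpick concrete, here is a minimal userspace sketch of the
quoted logic with the sign fix applied. It is not the kernel code: the
sketch_* names and the two structs are hypothetical stand-ins for
struct file / struct inode, the i_lock locking is left out, and
long long stands in for loff_t.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two fields the patch adds. */
struct sketch_inode {
	unsigned int i_streamid;	/* 0 means "don't care/know" */
};

struct sketch_file {
	unsigned int f_streamid;	/* 0 means "not set on this file" */
	struct sketch_inode *f_inode;
};

/* Same shape as inode_streamid() in the patch: NULL inode yields 0. */
static unsigned int sketch_inode_streamid(const struct sketch_inode *inode)
{
	return inode ? inode->i_streamid : 0;
}

/* Same shape as file_streamid(): prefer the file's ID, else the inode's. */
static unsigned int sketch_file_streamid(const struct sketch_file *f)
{
	assert(f != NULL);
	if (f->f_streamid)
		return f->f_streamid;
	return sketch_inode_streamid(f->f_inode);
}

/*
 * Same shape as the POSIX_FADV_STREAMID case, but returning a negative
 * errno: 0 on success, -EINVAL if offset does not fit in an unsigned int
 * (negative values and values above UINT_MAX both fail the round trip).
 * The kernel version also takes inode->i_lock; this sketch does not.
 */
static int sketch_set_streamid(struct sketch_file *f, long long offset)
{
	assert(f != NULL && f->f_inode != NULL);
	if (offset != (long long)(unsigned int)offset)
		return -EINVAL;
	f->f_streamid = (unsigned int)offset;
	f->f_inode->i_streamid = (unsigned int)offset;
	return 0;
}
```

With the patch as quoted, userspace would pass the stream ID in the
'offset' argument, i.e. posix_fadvise(fd, stream_id, 0,
POSIX_FADV_STREAMID). The round-trip cast is what rejects out-of-range
IDs, and the negative return keeps that path consistent with the
existing "default: ret = -EINVAL;" branch.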