2006-03-05 21:36:21

by Jon Masters

Subject: [OT] inotify hack for locate

Folks,

I'm fed up with those finds running whenever I power on. Has anyone
written an equivalent of the Microsoft indexing service to update
locate's database?

I know about Beagle and friends but I'd be interested in whatever I'm
missing that specifically solves the above problem - I'm sure it's
been done :-)

Jon.


2006-03-05 21:42:40

by Jesper Juhl

Subject: Re: [OT] inotify hack for locate

On 3/5/06, Jon Masters <[email protected]> wrote:
> Folks,
>
> I'm fed up with those finds running whenever I power on.

You run updatedb at boot time?
Why not run it from cron at night like most people do?
Personally I run it at 04:40.



--
Jesper Juhl <[email protected]>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html

2006-03-05 21:43:08

by Lee Revell

Subject: Re: [OT] inotify hack for locate

On Sun, 2006-03-05 at 21:36 +0000, Jon Masters wrote:
> Folks,
>
> I'm fed up with those finds running whenever I power on. Has anyone
> written an equivalent of the Microsoft indexing service to update
> locate's database?
>
> I know about Beagle and friends but I'd be interested in whatever I'm
> missing that specifically solves the above problem - I'm sure it's
> been done :-)

updatedb runs at nice 20 on most distros, and with the CFQ scheduler the
IO priority follows the nice value, so why does it still kill the
machine?

Lee

2006-03-05 21:50:48

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/5/06, Jesper Juhl <[email protected]> wrote:

> On 3/5/06, Jon Masters <[email protected]> wrote:

> > I'm fed up with those finds running whenever I power on.

> You run updatedb at boot time?

No, but said box will catch up cron jobs on boot.

> Why not run it from cron at night like most people do?

That's not the point. It usually does. I'm interested to know if
anyone has written a daemon that can sit and just do this
synchronously on my desktop - then not only do I /not/ have to run
updatedb every day but I can also have a locate that is always up to
the minute.

Jon.

2006-03-05 21:53:34

by Robin Holt

Subject: Re: [OT] inotify hack for locate

On Sun, Mar 05, 2006 at 10:42:39PM +0100, Jesper Juhl wrote:
> On 3/5/06, Jon Masters <[email protected]> wrote:
> > Folks,
> >
> > I'm fed up with those finds running whenever I power on.
>
> You run updatedb at boot time?
> Why not run it from cron at night like most people do?
> Personally I run it at 04:40.

I use suspend to disk on my laptop. When I power it back up in the
morning, updatedb starts.

Thanks,
Robin

2006-03-05 22:02:45

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/5/06, Robin Holt <[email protected]> wrote:

> I use suspend to disk on my laptop. When I power it back up in the
> morning, updatedb starts.

It just seems to me that things like Beagle are all well and good, but
what would be really useful to /me/ :-) is a hack for locate. It's
probably been done and I'm rambling for nothing - someone put me out
of my misery with a link?

Or I can look at fixing it for myself otherwise. This is something
Microsoft almost "get right" with their fast indexing service.

Jon.

2006-03-05 23:13:34

by Benjamin LaHaise

Subject: Re: [OT] inotify hack for locate

On Sun, Mar 05, 2006 at 04:43:03PM -0500, Lee Revell wrote:
> updatedb runs at nice 20 on most distros, and with the CFQ scheduler the
> IO priority follows the nice value, so why does it still kill the
> machine?

Running updatedb on a laptop when you're sitting in an airplane running
off of batteries is Not Nice to the user. I know, I've had it happen far
too many times.

-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[email protected]>.

2006-03-05 23:30:48

by Måns Rullgård

Subject: Re: [OT] inotify hack for locate

Benjamin LaHaise <[email protected]> writes:

> On Sun, Mar 05, 2006 at 04:43:03PM -0500, Lee Revell wrote:
>> updatedb runs at nice 20 on most distros, and with the CFQ scheduler the
>> IO priority follows the nice value, so why does it still kill the
>> machine?
>
> Running updatedb on a laptop when you're sitting in an airplane running
> off of batteries is Not Nice to the user. I know, I've had it happen far
> too many times.

Running updatedb only if AC powered shouldn't be too difficult.

--
Måns Rullgård
[email protected]

2006-03-05 23:41:50

by Chris Ball

Subject: Re: [OT] inotify hack for locate

>> On 5 Mar 2006 21:36:19, Jon Masters <[email protected]> said:

> I'm fed up with those finds running whenever I power on. Has
> anyone written an equivalent of the Microsoft indexing service to
> update locate's database?

I think the reason this hasn't been done is that inotify_add_watch()es
are non-recursive: you'd need a watch over every directory, and you'd
need a crawling step (churn, churn) to enumerate the directories to
add watches for.

Beagle (which only indexes home directories, by default) uses an
algorithm for placing watches as it crawls, such that by the end
of the crawl you can guarantee not to have lost a race on new
directories being created while the crawl was happening:

http://mail.gnome.org/archives/dashboard-hackers/2004-October/msg00022.html

- Chris.
--
Chris Ball <[email protected]> <http://www.mrao.cam.ac.uk/~cjb/>
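
The crawl-and-watch ordering described in this message can be sketched
roughly as follows. This is not Beagle's code: add_watch is a hypothetical
callback standing in for inotify_add_watch(), and the function name is
made up for illustration. The only point it tries to show is the ordering:
a directory is watched before it is listed, so a subdirectory created
during the crawl is either seen by the listing or would be reported by
the watch.

```python
import os
import tempfile


def crawl_and_watch(root, add_watch):
    """Walk `root` breadth-first, calling add_watch(path) on every directory.

    add_watch is a hypothetical stand-in for inotify_add_watch(). It is
    called BEFORE the directory is listed, so anything created afterwards
    would be reported by the watch rather than silently missed.
    """
    pending = [root]
    watched = []
    while pending:
        path = pending.pop(0)
        add_watch(path)                       # 1. watch first ...
        watched.append(path)
        try:
            entries = list(os.scandir(path))  # 2. ... then list
        except OSError:
            continue                          # vanished or unreadable
        for entry in entries:
            # Do not follow symlinks: they could loop or leave the tree.
            if entry.is_dir(follow_symlinks=False):
                pending.append(entry.path)
    return watched


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "a", "b"))
        os.makedirs(os.path.join(top, "c"))
        seen = []
        crawl_and_watch(top, seen.append)
        print(len(seen), "watches needed")
```

One watch per directory is what makes this expensive on a whole
filesystem: the number of watches grows with the number of directories,
and the crawl itself is the same disk churn that updatedb causes.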

2006-03-06 01:04:26

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/5/06, Chris Ball <[email protected]> wrote:
> >> On 5 Mar 2006 21:36:19, Jon Masters <[email protected]> said:
>
> > I'm fed up with those finds running whenever I power on. Has
> > anyone written an equivalent of the Microsoft indexing service to
> > update locate's database?

> I think the reason this hasn't been done is that inotify_add_watch()es
> are non-recursive: you'd need a watch over every directory, and you'd
> need a crawling step (churn, churn) to enumerate the directories to
> add watches for.

You're right. What I want really is to be able to bind to a netlink
socket and get told about particular file IO operations I'm interested
in for the /whole/ of a filesystem. The same kind of thing that real
time anti-virus/anti-spam people want to do anyway.

Thanks for the links, Chris. I've not been following Beagle
development (lost interest after the OLS talk got cancelled) very
closely so wasn't aware of the current implementation.

Jon.

2006-03-06 07:49:51

by Arjan van de Ven

Subject: Re: [OT] inotify hack for locate

On Sun, 2006-03-05 at 21:36 +0000, Jon Masters wrote:
> Folks,
>
> I'm fed up with those finds running whenever I power on. Has anyone
> written an equivalent of the Microsoft indexing service to update
> locate's database?


There are both rlocate and mlocate to replace whatever variant of locate
you are using.

But this is obviously offtopic for lkml

2006-03-06 09:24:06

by Helge Hafting

Subject: Re: [OT] inotify hack for locate

Jon Masters wrote:

>On 3/5/06, Jesper Juhl <[email protected]> wrote:
>
>>On 3/5/06, Jon Masters <[email protected]> wrote:
>>
>>>I'm fed up with those finds running whenever I power on.
>>
>>You run updatedb at boot time?
>
>No, but said box will catch up cron jobs on boot.
>
>>Why not run it from cron at night like most people do?
>
>That's not the point. It usually does. I'm interested to know if
>anyone has written a daemon that can sit and just do this
>synchronously on my desktop - then not only do I /not/ have to run
>updatedb every day but I can also have a locate that is always up to
>the minute.
>
I haven't heard about anyone doing this. You could modify
the VFS to notify you every time a file is created, moved or deleted.
That should give you what you want, but at the cost of delaying
those operations.

Another option would be to make a filesystem that stores its
directory structure (or a copy of it) in a single file, so that
a locate-like program can do quick lookups of the always-correct
data.

Helge Hafting
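
To make the notification idea above concrete, here is a minimal sketch of
how a locate-style index could be kept current from a stream of
create/move/delete events. The event tuples, the function names and the
in-memory set are all assumptions made for illustration; no existing
kernel interface or locate implementation is being described.

```python
def apply_event(index, event):
    """Update a set of known paths from one (kind, path, new_path) event."""
    kind, path, new_path = event
    if kind == "create":
        index.add(path)
    elif kind == "delete":
        index.discard(path)
    elif kind == "move":
        # A moved directory takes everything beneath it along.
        prefix = path + "/"
        moved = [p for p in index if p == path or p.startswith(prefix)]
        for old in moved:
            index.discard(old)
            index.add(new_path + old[len(path):])
    else:
        raise ValueError(f"unknown event kind: {kind!r}")
    return index


def locate(index, pattern):
    """Return indexed paths containing `pattern`, sorted for stable output."""
    return sorted(p for p in index if pattern in p)


if __name__ == "__main__":
    idx = set()
    for ev in [
        ("create", "/home/jon/src", None),
        ("create", "/home/jon/src/main.c", None),
        ("move", "/home/jon/src", "/home/jon/code"),
    ]:
        apply_event(idx, ev)
    print(locate(idx, "main.c"))
```

The lookup side stays as cheap as locate is today; the cost moves to the
write path, which is the delay the message above warns about.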

2006-03-06 09:31:42

by Helge Hafting

Subject: Re: [OT] inotify hack for locate

Jon Masters wrote:

>You're right. What I want really is to be able to bind to a netlink
>socket and get told about particular file IO operations I'm interested
>in for the /whole/ of a filesystem. The same kind of thing that real
>time anti-virus/anti-spam people want to do anyway.
>
Do they?
I thought all this mail processing could be done in the mailserver
and/or mail reader. Why detect spam by looking for generic file
creation when you can trivially tap into mail as it arrives?

As for the non-existent virus problem - it is mostly prevented
by users not being administrators. And you can go further
with a readonly /usr and a noexec /home.

Helge Hafting

2006-03-06 13:10:41

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/6/06, Helge Hafting <[email protected]> wrote:
> Jon Masters wrote:
>
> >You're right. What I want really is to be able to bind to a netlink
> >socket and get told about particular file IO operations I'm interested
> >in for the /whole/ of a filesystem. The same kind of thing that real
> >time anti-virus/anti-spam people want to do anyway.

> Do they?
> I thought all this mail processing could be done in the mailserver
> and/or mail reader. Why detect spam by looking for generic file
> creation when you can trivially tap into mail as it arrives?

Because it's not just email :-) These guys want to be able to filter
/every/ file no matter how it is accessed.

> As for the non-existent virus problem - it is mostly prevented
> by users not being administrators. And you can go further
> with a readonly /usr and a noexec /home.

That's definitely OT - I was simply saying that there are
anti-spam/anti-virus products which run on Linux that use hooks to do
this at the VFS level. So that you don't need to modify
Samba/Mailserver/NFS/everything else.

Jon.

2006-03-06 13:12:37

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/6/06, Arjan van de Ven <[email protected]> wrote:

> On Sun, 2006-03-05 at 21:36 +0000, Jon Masters wrote:

> > I'm fed up with those finds running whenever I power on. Has anyone
> > written an equivalent of the Microsoft indexing service to update
> > locate's database?

> There are both rlocate and mlocate to replace whatever variant of locate
> you are using.

Interesting.

> But this is obviously offtopic for lkml

Not entirely - because I'm asking about VFS functionality. I'm going
to look at the rlocate kernel module with a view to doing something
generic that communicates over netlink like I want. Thanks.

Jon.

2006-03-06 21:54:05

by Pavel Machek

Subject: Re: [OT] inotify hack for locate

On Sun 05-03-06 23:30:09, Måns Rullgård wrote:
> Benjamin LaHaise <[email protected]> writes:
>
> > On Sun, Mar 05, 2006 at 04:43:03PM -0500, Lee Revell wrote:
> >> updatedb runs at nice 20 on most distros, and with the CFQ scheduler the
> >> IO priority follows the nice value, so why does it still kill the
> >> machine?
> >
> > Running updatedb on a laptop when you're sitting in an airplane running
> > off of batteries is Not Nice to the user. I know, I've had it happen far
> > too many times.
>
> Running updatedb only if AC powered shouldn't be too difficult.

That makes locate useless on some machines. I have a Sharp Zaurus C3000
here... It is either powered on *or* connected to AC, but very rarely
connected to AC while turned on. Well, its power plug being located in a
weird place, and an old software version that prevents charging while
turned on, are contributing factors, but...

Pavel
--
Web maintainer for suspend.sf.net (http://www.sf.net/projects/suspend) wanted...

2006-03-06 21:59:36

by Pavel Machek

Subject: Re: [OT] inotify hack for locate

Hi!

> >That's not the point. It usually does. I'm interested to know if
> >anyone has written a daemon that can sit and just do this
> >synchronously on my desktop - then not only do I /not/ have to run
> >updatedb every day but I can also have a locate that is always up to
> >the minute.
> >
> >
> I haven't heard about anyone doing this. You could modify
> the VFS to notify you every time a file is created, moved or deleted.
> That should give you what you want, but at the cost of delaying
> those operations.
>
> Another option would be to make a filesystem that stores its
> directory structure (or a copy of it) in a single file, so that
> a locate-like program can do quick lookups of the always-correct
> data.

A better way is probably to create an M-RECURSIVE-TIME field in the inode --
similar to MTIME but counting modifications in subdirectories too. There are
many applications that would like to watch file modifications, and
some of them (like locate) are not running all the time.
Pavel
--
Web maintainer for suspend.sf.net (http://www.sf.net/projects/suspend) wanted...
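
A rough illustration of why a recursive modification time would help a
program like updatedb that is not running all the time. Nothing below is
a real inode field or kernel interface: the Dir class, its rtime
attribute and changed_dirs() are hypothetical, and the tree lives purely
in memory. The sketch only shows the pruning that such a field would
allow.

```python
class Dir:
    """In-memory directory with a hypothetical recursive-mtime field."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = {}
        self.rtime = 0      # time of the latest change anywhere below
        if parent is not None:
            parent.children[name] = self

    def touch(self, now):
        """Record a change here and propagate it to every ancestor."""
        node = self
        while node is not None:
            node.rtime = max(node.rtime, now)
            node = node.parent


def changed_dirs(root, since):
    """Return names of directories to rescan, pruning unchanged subtrees."""
    if root.rtime <= since:
        return []           # nothing below changed: skip the whole subtree
    found = [root.name]
    for child in root.children.values():
        found.extend(changed_dirs(child, since))
    return found


if __name__ == "__main__":
    root = Dir("/")
    home = Dir("home", root)
    usr = Dir("usr", root)
    jon = Dir("jon", home)
    jon.touch(10)                    # a file changed under /home/jon
    print(changed_dirs(root, 5))     # /usr is never visited
```

The trade-off is visible in touch(): every modification has to write to
each ancestor directory, so the saving at scan time is paid for on the
write path.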

2006-03-06 22:09:03

by Måns Rullgård

Subject: Re: [OT] inotify hack for locate

Pavel Machek <[email protected]> writes:

> On Sun 05-03-06 23:30:09, Måns Rullgård wrote:
>> Benjamin LaHaise <[email protected]> writes:
>>
>> > On Sun, Mar 05, 2006 at 04:43:03PM -0500, Lee Revell wrote:
>> >> updatedb runs at nice 20 on most distros, and with the CFQ scheduler the
>> >> IO priority follows the nice value, so why does it still kill the
>> >> machine?
>> >
>> > Running updatedb on a laptop when you're sitting in an airplane running
>> > off of batteries is Not Nice to the user. I know, I've had it happen far
>> > too many times.
>>
>> Running updatedb only if AC powered shouldn't be too difficult.
>
> That makes locate useless on some machines. I have a Sharp Zaurus C3000
> here... It is either powered on *or* connected to AC, but very rarely
> connected to AC while turned on. Well, its power plug being located in a
> weird place, and an old software version that prevents charging while
> turned on, are contributing factors, but...

OK, although that surely must be an exception. Most laptops run
happily with AC connected, and the current power source is easily
obtained from some file in /proc that I've forgotten the name of.

--
Måns Rullgård
[email protected]

2006-03-06 22:23:02

by Pavel Machek

Subject: Re: [OT] inotify hack for locate

On Mon 06-03-06 22:08:28, Måns Rullgård wrote:
> Pavel Machek <[email protected]> writes:
>
> > On Sun 05-03-06 23:30:09, Måns Rullgård wrote:
> >> Benjamin LaHaise <[email protected]> writes:
> >>
> >> > On Sun, Mar 05, 2006 at 04:43:03PM -0500, Lee Revell wrote:
> >> >> updatedb runs at nice 20 on most distros, and with the CFQ scheduler the
> >> >> IO priority follows the nice value, so why does it still kill the
> >> >> machine?
> >> >
> >> > Running updatedb on a laptop when you're sitting in an airplane running
> >> > off of batteries is Not Nice to the user. I know, I've had it happen far
> >> > too many times.
> >>
> >> Running updatedb only if AC powered shouldn't be too difficult.
> >
> > That makes locate useless on some machines. I have a Sharp Zaurus C3000
> > here... It is either powered on *or* connected to AC, but very rarely
> > connected to AC while turned on. Well, its power plug being located in a
> > weird place, and an old software version that prevents charging while
> > turned on, are contributing factors, but...
>
> OK, although that surely must be an exception. Most laptops run
> happily with AC connected, and the current power source is easily
> obtained from some file in /proc that I've forgotten the name of.

This is a really small machine. Yes, it is an exception...

...and you could modify cron to know about AC power. Something like "do
every day at 4 am if you are on AC power; delay it if you are on battery
power, but for no more than 3 days"... should do the trick.

Actually, that modification to cron would probably be useful for
non-updatedb stuff, too...
Pavel
--
Web maintainer for suspend.sf.net (http://www.sf.net/projects/suspend) wanted...
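
The scheduling rule proposed in this message is small enough to write
down. The sketch below is only the decision function, not a cron patch:
should_run() and MAX_DELAY are names invented here, and how the caller
learns whether the machine is on AC power is left out entirely.

```python
from datetime import datetime, timedelta

MAX_DELAY = timedelta(days=3)   # "no more than 3 days", as proposed above


def should_run(now, scheduled, on_ac, max_delay=MAX_DELAY):
    """Decide whether a job scheduled for `scheduled` should run at `now`."""
    if now < scheduled:
        return False              # not due yet
    if on_ac:
        return True               # due and on mains power
    # Due but on battery: keep postponing until the delay limit is hit.
    return now - scheduled >= max_delay


if __name__ == "__main__":
    due = datetime(2006, 3, 6, 4, 0)
    print(should_run(due, due, on_ac=False))
    print(should_run(due + timedelta(days=3), due, on_ac=False))
```

A machine that is almost never on AC while running still gets its job
run once the limit expires, which covers the exceptional case discussed
in this exchange.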

2006-03-07 00:11:56

by Lee Revell

Subject: Re: [OT] inotify hack for locate

On Mon, 2006-03-06 at 10:31 +0100, Helge Hafting wrote:
> As for the non-existent virus problem - it is mostly prevented
> by users not being administrators. And you can go further
> with a readonly /usr and a noexec /home.
>

I believe he is referring to using Linux systems to provide virus
scanning services for mail, NFS, SMB etc. clients, rather than to virus
scanning for the Linux desktop (which is indeed a non problem).

Lee

2006-03-07 00:33:45

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/7/06, Lee Revell <[email protected]> wrote:
> On Mon, 2006-03-06 at 10:31 +0100, Helge Hafting wrote:
> > As for the non-existent virus problem - it is mostly prevented
> > by users not being administrators. And you can go further
> > with a readonly /usr and a noexec /home.

> I believe he is referring to using Linux systems to provide virus
> scanning services for mail, NFS, SMB etc. clients, rather than to virus
> scanning for the Linux desktop (which is indeed a non problem).

Sure. I wasn't hand-waving and muttering about virus problems on Linux
desktops everywhere.

Anyway. Seems a couple of us are interested in having something more
generic at the VFS level to notify userspace about particular events
of interest (recursively registering a watcher on every directory is
silly). I really should go scope out some of the existing projects
that cover this before I decide what to do. This kind of thing should
be in mainline IMHO.

Jon.

2006-03-07 10:26:20

by Steven Rostedt

Subject: Re: [OT] inotify hack for locate



On Tue, 7 Mar 2006, Jon Masters wrote:

> On 3/7/06, Lee Revell <[email protected]> wrote:
> > On Mon, 2006-03-06 at 10:31 +0100, Helge Hafting wrote:
> > > As for the non-existent virus problem - it is mostly prevented
> > > by users not being administrators. And you can go further
> > > with a readonly /usr and a noexec /home.
>
> > I believe he is referring to using Linux systems to provide virus
> > scanning services for mail, NFS, SMB etc. clients, rather than to virus
> > scanning for the Linux desktop (which is indeed a non problem).
>
> Sure. I wasn't hand-waving and muttering about virus problems on Linux
> desktops everywhere.
>
> Anyway. Seems a couple of us are interested in having something more
> generic at the VFS level to notify userspace about particular events
> of interest (recursively registering a watcher on every directory is
> silly). I really should go scope out some of the existing projects
> that cover this before I decide what to do. This kind of thing should
> be in mainline IMHO.
>

Hmm, this could also be very useful for change management systems and
especially backup utilities. Imagine having a daemon that records all the
changes on a filesystem, and then backs them up periodically. Could very
well be useful.

-- Steve

2006-03-08 13:56:35

by Jan Engelhardt

Subject: Re: [OT] inotify hack for locate

>
>OK, although that surely must be an exception. Most laptops run
>happily with AC connected, and the current power source is easily
>obtained from some file in /proc that I've forgotten the name of.
>
/proc/acpi/battery/BAT1/state for me.


Jan Engelhardt
--
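
For a script that wants to skip updatedb on battery, the check could look
something like the sketch below. It parses the text of an ACPI battery
state file rather than opening a fixed path, because the location varies
between machines. The "key: value" layout and the "charging state" field
name are assumptions about that file's format, and on_battery() and
SAMPLE are names made up for this example.

```python
def on_battery(state_text):
    """Return True if the text reports a discharging battery.

    Assumes 'key: value' lines with a 'charging state' key; this layout
    is an assumption and may differ between kernels and machines.
    """
    for line in state_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "charging state":
            return value.strip() == "discharging"
    return False    # field missing: treat as not on battery


# Hypothetical file contents, for illustration only.
SAMPLE = """\
present:                 yes
charging state:          discharging
"""

if __name__ == "__main__":
    print(on_battery(SAMPLE))
```

Returning False when the field is missing means a machine without a
readable battery file behaves like a desktop and runs the job as usual;
a more cautious script might prefer the opposite default.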

2006-03-09 02:44:36

by Jon Masters

Subject: Re: [OT] inotify hack for locate

On 3/7/06, Steven Rostedt <[email protected]> wrote:

> Hmm, this could also be very useful for change management systems and
> especially backup utilities. Imagine having a daemon that records all the
> changes on a filesystem, and then backs them up periodically. Could very
> well be useful.

Some people have already done similar things with FUSE based
filesystems but my point is that there's nothing cool and useful in
mainline to do what I personally want :-) Anyway, I was catching up
with an old friend as part of this thread and we'll spend some time
over at LinuxWorld Boston hacking on some ideas. Then I'll see what's
worth doing.

Jon.