Dan Maas wrote:
> This may be true for environments where people mostly run a handful of
> monolithic applications (*ahem* windows) but look at typical Linuxy things
> like:
>
> make (compiler, assembler, linker processes...)
> forking servers (Apache 1.x...)
> *./configure scripts* (a big one!!!)
> etc...
>
> Startup cost is likely to be a big factor here...

Btw, a little story about startup times and Linux.

I once wrote a Perl script that needed to know the current directory.
It did:

use POSIX 'getcwd'
getcwd(...)

After a few months, I was annoyed by the slowness of this script
(compared with other scripts) and decided to try speeding it up. It
turns out that the above two lines took about 0.25 of a second, and that
was the dominant running time of the script.

I replaced getcwd() with `/bin/pwd`. Lo! It took about 0.0075 second.

Says very good things about Linux' fork, exec and mmap times, and about
Glibc's dynamic loading time, I think.

-- Jamie

On Sat, 26 Jan 2002, Jamie Lokier wrote:
> I once wrote a Perl script that needed to know the current directory.
> It did:
>
> use POSIX 'getcwd'
> getcwd(...)
>
> After a few months, I was annoyed by the slowness of this script
> (compared with other scripts) and decided to try speeding it up. It
> turns out that the above two lines took about 0.25 of a second, and that
> was the dominant running time of the script.
>
> I replaced getcwd() with `/bin/pwd`. Lo! It took about 0.0075 second.
>
> Says very good things about Linux' fork, exec and mmap times, and about
> Glibc's dynamic loading time, I think.

Most likely it says very bad things about getcwd() implementation in Perl
compared to sys_getcwd() in the kernel. The latter just walks the chain
of dentries copying ->d_name.name into the buffer. The former... my guess
would be stat ".", open "..", readdir from it, stat every damn object in
there until you find one with the right ->st_ino, put its name as the
last component and repeat the whole thing until you reach root...
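A rough sketch of the userspace directory walk guessed at above, written in Python rather than Perl so it is easy to run. This is not the Perl library's actual code; the function name getcwd_by_walking is made up for illustration, and comparing st_dev together with st_ino is an assumption added here so the match stays correct across filesystems.

```python
import os


def getcwd_by_walking():
    """Rebuild the current directory's absolute path by climbing ".." links."""
    components = []
    path = "."
    here = os.lstat(path)
    while True:
        parent_path = path + "/.."
        parent = os.lstat(parent_path)
        # At the root, "." and ".." are the same directory, so stop there.
        if (parent.st_dev, parent.st_ino) == (here.st_dev, here.st_ino):
            break
        # stat every entry of the parent until one matches the child.
        for name in os.listdir(parent_path):
            try:
                st = os.lstat(parent_path + "/" + name)
            except OSError:
                continue  # entry vanished or is not accessible
            if (st.st_dev, st.st_ino) == (here.st_dev, here.st_ino):
                components.append(name)
                break
        else:
            raise OSError("current directory not found in its parent")
        path, here = parent_path, parent
    # Names were collected leaf-first, so reverse them to build the path.
    return "/" + "/".join(reversed(components))


if __name__ == "__main__":
    print(getcwd_by_walking())
```

The cost is one stat per directory entry at every level, which is why a deep path under large directories gets slow, while the kernel only has to follow the dentry chain it already holds.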

On Fri, Jan 25, 2002 at 11:33:44PM -0500, Alexander Viro wrote:
>
> On Sat, 26 Jan 2002, Jamie Lokier wrote:
>
> > I once wrote a Perl script that needed to know the current directory.
> > It did:
> >
> > use POSIX 'getcwd'
> > getcwd(...)
> >
> > After a few months, I was annoyed by the slowness of this script
> > (compared with other scripts) and decided to try speeding it up. It
> > turns out that the above two lines took about 0.25 of a second, and that
> > was the dominant running time of the script.
> >
> > I replaced getcwd() with `/bin/pwd`. Lo! It took about 0.0075 second.
> >
> > Says very good things about Linux' fork, exec and mmap times, and about
> > Glibc's dynamic loading time, I think.
>
> Most likely it says very bad things about getcwd() implementation in Perl
> > compared to sys_getcwd() in the kernel.

No no no--it says very bad things about 'use POSIX', and in general
about overhead-creep in the perl library.

Andrew

Andrew Pimlott wrote:
> > Most likely it says very bad things about getcwd() implementation in Perl
> > compared to sys_getcwd() in the kernel.
>
> No no no--it says very bad things about 'use POSIX', and in general
> about overhead-creep in the perl library.

I ended up calling sys_getcwd from Perl as it's extremely fast. Even
faster if you hard code the syscall number instead of reading the header
file in Perl :-)
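For illustration, here is roughly what calling the system call directly with a hard-coded number could look like, sketched in Python with ctypes instead of Perl's syscall(). The table SYS_GETCWD and the function getcwd_syscall are names invented for this sketch, and the numbers cover only the Linux ABIs listed; any other platform would need its own entry.

```python
import ctypes
import os
import platform

# Hard-coded Linux syscall numbers for getcwd.
# Assumption: only these ABIs are covered, and platform.machine() matches
# the ABI the interpreter was built for.
SYS_GETCWD = {"x86_64": 79, "i386": 183, "i686": 183, "aarch64": 17}


def getcwd_syscall(bufsize=4096):
    """Ask the kernel for the current directory via a raw syscall."""
    number = SYS_GETCWD[platform.machine()]
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    buf = ctypes.create_string_buffer(bufsize)
    # libc's syscall() returns -1 and sets errno on failure.
    ret = libc.syscall(ctypes.c_long(number), buf, ctypes.c_size_t(bufsize))
    if ret < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    # The kernel wrote a NUL-terminated path into buf.
    return os.fsdecode(buf.value)


if __name__ == "__main__":
    if platform.system() == "Linux" and platform.machine() in SYS_GETCWD:
        print(getcwd_syscall())
```

Hard-coding the number trades portability for startup time: nothing has to be parsed or loaded to find it, but the script is then tied to one architecture's syscall table.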

However, I was still impressed by the 0.0075s for a pipe/fork/exec.

'use POSIX' is very slow, but Perl's getcwd() is pretty slow too -- it
does the old-fashioned directory walk. We cannot blame it for being
portable.

-- Jamie

Alexander Viro wrote:
> Most likely it says very bad things about getcwd() implementation in Perl
> compared to sys_getcwd() in the kernel. The latter just walks the chain
> of dentries copying ->d_name.name into the buffer. The former... my guess
> would be stat ".", open "..", readdir from it, stat every damn object in
> there until you find one with the right ->st_ino, put its name as the
> last component and repeat the whole thing until you reach root...

I'd better correct my statement. Just had a look at the old Perl script
in question.

I was originally using 'use Cwd; getcwd()'. That was really slow, and it
does exactly what Alex says, even though there is no need to call stat()
on each entry: the readdir system call already returns usable inode
numbers for all directories that aren't mount points, but Perl doesn't
give that information to its libraries (for the sake of portability).
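To make the readdir point concrete, here is a sketch in Python, whose os.scandir() does expose the inode number that readdir returned (DirEntry.inode()). The helper names are invented for this example. Confirming each hit with a single lstat(), and falling back to the stat-everything scan when nothing matches (for example at a mount point), are assumptions added here for safety rather than anything described in this thread.

```python
import os


def _same_dir(a, b):
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def name_in_parent(parent_path, child):
    """Find the entry of parent_path that is the directory described by child."""
    names = []
    with os.scandir(parent_path) as entries:
        for entry in entries:
            names.append(entry.name)
            # Fast path: inode() is the number readdir returned, no stat needed.
            if entry.inode() == child.st_ino:
                try:
                    # One lstat to confirm the candidate really is the child.
                    if _same_dir(os.lstat(entry.path), child):
                        return entry.name
                except OSError:
                    pass
    # Slow path, e.g. when the child is a mount point: stat every entry.
    for name in names:
        try:
            st = os.lstat(parent_path + "/" + name)
        except OSError:
            continue
        if _same_dir(st, child):
            return name
    return None


def getcwd_with_d_ino():
    """Directory walk that normally needs one stat per level, not per entry."""
    components = []
    path = "."
    here = os.lstat(path)
    while True:
        parent_path = path + "/.."
        parent = os.lstat(parent_path)
        if _same_dir(parent, here):
            break  # reached the root
        name = name_in_parent(parent_path, here)
        if name is None:
            raise OSError("current directory not found in its parent")
        components.append(name)
        path, here = parent_path, parent
    return "/" + "/".join(reversed(components))


if __name__ == "__main__":
    print(getcwd_with_d_ino())
```

In the common case this replaces a stat per entry with an integer comparison per entry, which is the saving being described; the walk itself is otherwise unchanged.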

Then I switched to 'use POSIX; getcwd()'. That was faster, but still
distressingly slow (i.e. noticeable in human terms). POSIX::getcwd()
forks and execs to call /bin/pwd, and that part is quite fast. But
'use POSIX' is quite slow. Shame really, because the implementation is
in a shared library; it's just the 'use POSIX' importing part that's slow.

Then I tried `/bin/pwd` myself since the POSIX function uses it. That
was pretty fast and I was impressed with Linux for being that fast. Of
course it is still 1.7 million cycles which is not brilliant, but it was
faster than I'd expected for a pipe/fork/exec.
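A small sketch for anyone who wants to repeat the pipe/fork/exec measurement, in Python rather than Perl. The helper names are invented here, and the numbers you get will of course depend on the machine. The last function is only back-of-envelope arithmetic: taken together, 1.7 million cycles and 0.0075 s would imply a clock of roughly 227 MHz, which is an inference from the two figures above, not something stated in this thread.

```python
import os
import subprocess
import time


def pwd_via_subprocess():
    """Run /bin/pwd through a pipe, roughly what Perl backticks do."""
    result = subprocess.run(["/bin/pwd"], stdout=subprocess.PIPE, check=True)
    return os.fsdecode(result.stdout).rstrip("\n")


def best_time(fn, repeat=20):
    """Return the fastest of `repeat` timed calls, in seconds."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def implied_clock_hz(cycles, seconds):
    """Clock rate implied by a cycle count and an elapsed time."""
    return cycles / seconds


if __name__ == "__main__":
    print("pipe/fork/exec of /bin/pwd: %.6f s" % best_time(pwd_via_subprocess))
    print("in-process getcwd:          %.6f s" % best_time(os.getcwd))
    print("implied clock: %.0f MHz" % (implied_clock_hz(1.7e6, 0.0075) / 1e6))
```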

Finally I just did the getcwd() system call myself. Used a hard-coded
system call number, because reading it from a header file was slow.

Linus was right about startup times in this case.

enjoy,
-- Jamie