Difference between revisions of "Locate"

From Helpful

Revision as of 19:44, 22 June 2021

This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)

tl;dr:

  • locate lets you find files by name, without going through your filesystem like with find
  • ...from an index typically rebuilt every night (updatedb)


  • of the (GNU, slocate, mlocate) set, you probably want mlocate
  • you probably have mlocate
  • locate will be symlinked to mlocate/slocate if you use them


Variants

GNU locate

(part of GNU findutils)

Builds an index of readable files (verify) (when run as root, this typically means everything (verify))

slocate and mlocate

(s for secure, m for merging)

slocate stores file permissions and ownership as seen at index time, and slocate's locate command will filter out entries that the invoking user couldn't read.


mlocate extends slocate, in that its updatedb only rereads a directory's contents if that directory's mtime changed, which often avoids stat()ing most files on the filesystem, so it finishes much faster and does less IO.
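
The mtime trick works because adding, removing or renaming an entry updates the mtime of the directory that contains it. A minimal sketch of that idea, not mlocate's actual code; it assumes GNU coreutils (touch -d, stat -c) and uses a throwaway temp directory:

```shell
# Sketch: a directory's mtime is enough to notice added/removed entries.
dir=$(mktemp -d)
touch -d '2020-01-01 00:00:00' "$dir"   # pretend the last index run was long ago
before=$(stat -c %Y "$dir")             # directory mtime in epoch seconds
touch "$dir/newfile"                    # creating an entry bumps the directory mtime
after=$(stat -c %Y "$dir")
if [ "$after" -gt "$before" ]; then
    echo "directory changed, reread it"
else
    echo "directory unchanged, reuse the old entries"
fi
rm -r "$dir"
```

Note that changing only a file's contents does not touch the parent directory's mtime, which is fine here because the index stores names, not contents.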


https://wiki.gentoo.org/wiki/Mlocate


rlocate

Hooks into the kernel via a kernel module, which in turn hooks into open, mkdir, mknod, link, rename, symlink.

Its search command checks both the database and the kernel list, so you'll see new files within seconds of their creation.


See also:

  • http://rlocate.sourceforge.net/

Behaviour and configuration

pruning

Can be told not to look in certain places.

Details vary slightly between implementations, but most support at least:

  • don't go under certain absolute paths
      will not add the directory's contents, but will still record the directory name itself
      avoid trailing slashes when specifying these paths
  • don't index certain filesystem types


mlocate/slocate add a few more, including:

  • directory basename
      e.g. .git .bzr .hg .svn, because their existence may be interesting but their filenames not so much
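
For mlocate these three kinds of pruning are typically set in /etc/updatedb.conf. A sketch of what that can look like; the variable names are mlocate's, but the values are only illustrative assumptions and not a recommended default:

```shell
# /etc/updatedb.conf (mlocate-style; example values only)
PRUNEPATHS="/tmp /var/spool /media"   # absolute paths, no trailing slashes
PRUNEFS="nfs tmpfs proc sysfs"        # filesystem types to skip
PRUNENAMES=".git .bzr .hg .svn"       # directory basenames to skip
```

Other implementations may use different names or only command-line options, so check the updatedb man page for the one you have.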


security notes

Historical/GNU locate would let you see filenames that you couldn't otherwise see yourself.


slocate/mlocate hide filenames the calling user couldn't read, by checking the permissions they also store in the database. (Note that this is only really more secure if only the tools, and not the user, can read that database, which is why mlocate/slocate usually have their own user account / group and their locate is SUID/SGID.)


Keep in mind that indexed permissions can be a day behind, so if your permissions were previously too lax, these files will still be listed until the next indexing run.

If you want to avoid that, you can make updatedb store a flag into the database that means "locate should stat() the parent directories to see if the calling user could list this(verify)". This comes at the cost of more IO on every locate.
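
In mlocate this flag appears to be what updatedb's --require-visibility option sets. A rough sketch of what the resulting check amounts to, not mlocate's actual code: visible is a hypothetical helper, and it assumes "could list this" means read and execute permission on every parent directory.

```shell
# visible PATH: succeed if the current user can list every parent directory of PATH
visible() {
    dir=$(dirname -- "$1")
    while :; do
        # listing a directory needs read permission, passing through it needs execute
        if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
            return 1
        fi
        case "$dir" in
            /|.) return 0 ;;      # reached the top without being refused
        esac
        dir=$(dirname -- "$dir")
    done
}
```

For example, `visible /home/someuser/private/notes.txt && echo shown` would print nothing for a user who cannot list /home/someuser/private (the path is made up). Doing this for every match is where the extra IO comes from.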

some inspection

The database may be in one of various places, depending on which implementation you use.

Perhaps the simplest summary comes from

locate -S

Which in my case says something like:

Database /var/lib/mlocate/mlocate.db:
       552,900 directories
       5,639,641 files
       428,849,911 bytes in file names
       181,205,838 bytes used to store database

...yes, that's 5+ million files and I need to clean up :P though the database is only ~170MB.


Listing all entries could be done like

sudo locate /

...but keep in mind that if the security flag mentioned above applies, it means access() calls for everything.
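
If the goal is to see where all those entries come from, one option is to count them per top-level directory. A small sketch; top_dirs is a made-up helper name, and it only assumes one absolute path per line on stdin:

```shell
# top_dirs: read absolute paths on stdin, print entry counts per top-level directory
top_dirs() {
    cut -d/ -f2 | sort | uniq -c | sort -rn
}
```

Used as `sudo locate / | top_dirs | head`, this lists the busiest top-level directories first.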