Filesystem links on different OSes



Linux

Hardlinks

For context: in the internals of most unix-style filesystems (still simplified):

  • directories contain (name,type,metadata,inode)
  • inodes are used to look up the content

...where

inode is just a unique number, with little other meaning.
type can be file, directory (and a few other special things)

Conceptually, this separates directories from each other, and separates the name-and-directory-and-permission stuff from the "and now read a large file's contents" stuff.


So what happens when you have more than one directory entry to point to the same inode?

It's allowed -- for files (not for directories or other types).

Such additional entries are called hardlinks.


To you, it looks like two different filenames that point to the same stored content. (Edit it under one name, and the change shows up under the other name as well)

Note that

  • because inode handling is done entirely by the filesystem code (it has to be), this is not a special case that you might have to check for or read specially, the way many other kinds of links are.

  • they are not really links at all.
Yes, users can only really create the additional entries by referring to the first, but once they exist, they are all equal
Hardlinks were not usually called hardlinks until symlinks existed. (verify)
  • since inodes are unique only per filesystem, hardlinks cannot point to things on other filesystems.
  • Since hardlinks need extra bookkeeping (in particular "we can only delete content after there are 0 directory entries resolving to this inode"), a filesystem has to implement this specifically.
There are some that don't implement/allow it, but most filesystems that have POSIX in their background will.
  • they arguably make the file analogy a little leaky, in that if you see two different directory entries, you don't expect that editing one will also change the other
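
The points above can be poked at from Python. This is a minimal sketch, assuming a filesystem that supports hardlinks; the file names and the hardlink_demo helper are made up for illustration.

```python
import os
import tempfile


def hardlink_demo(workdir):
    """Create a file plus a hardlink to it and report what the filesystem shows."""
    first = os.path.join(workdir, "first.txt")
    second = os.path.join(workdir, "second.txt")

    with open(first, "w") as f:
        f.write("original")

    # Add a second directory entry for the same inode.
    os.link(first, second)

    st_first, st_second = os.stat(first), os.stat(second)
    same_inode = (st_first.st_dev, st_first.st_ino) == (st_second.st_dev, st_second.st_ino)
    link_count = st_first.st_nlink  # number of directory entries for this inode

    # Write through one name, read through the other.
    with open(second, "w") as f:
        f.write("changed")
    with open(first) as f:
        seen_via_first = f.read()

    # Removing one name only drops one entry; the content stays reachable.
    os.remove(first)
    with open(second) as f:
        still_there = f.read()
    remaining_links = os.stat(second).st_nlink

    return same_inode, link_count, seen_via_first, still_there, remaining_links


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(hardlink_demo(d))
```

The link count is the bookkeeping mentioned above: the content can only go away once that count reaches zero.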

Symlinks

Symlinks (sometimes 'softlinks') are, instead, a special type of directory entry.

Conceptually, it stores:

  • "handle this as a symlink" and
  • "this is the path string for that handling to use"


You have to get pretty gritty to even find out how this is stored at a low level (it used to be a small file containing just the path, and is now often stored in the metadata(verify)). Those details also do not matter, because you should only ever go through filesystem code, which will always handle this for you, e.g. resolve what a symlink points to and then open that.

You will run into a few cases where you do care about the distinction between the symlink itself and what it points to, e.g.

  • check that it is a symlink (without trying to open it)
  • check that you have permissions to the symlink and/or what the symlink points to
  • check whether the symlink is broken
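
Those checks might look like this in Python. A small sketch: describe_path and its result keys are hypothetical names, the os calls are standard library.

```python
import os
import tempfile


def describe_path(path):
    """Report on the directory entry itself, without opening what it points to."""
    if not os.path.islink(path):  # islink() uses lstat(), so it inspects the entry itself
        return {"symlink": False}
    return {
        "symlink": True,
        "target": os.readlink(path),                  # the stored path string, unresolved
        "broken": not os.path.exists(path),           # exists() follows the link
        "target_readable": os.access(path, os.R_OK),  # access() follows the link too
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        link = os.path.join(d, "link")
        open(target, "w").close()
        os.symlink("target.txt", link)  # relative to the directory the link is in
        print(describe_path(link))
        os.remove(target)               # the symlink stays, but is now broken
        print(describe_path(link))
```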


Symlinks can point to any path, so can easily point to another filesystem.

For the same reason, a symlink is not necessarily valid at all times, e.g. when you delete the file it points to, or when you do not have that other filesystem mounted in the same place, or at all.

Unlike hardlinks, there is no related filesystem bookkeeping.


Notes:

  • you can have a stat() call stat the symlink entry or its target.
  • Symlinks do not have permissions themselves, they always appear as lrwxrwxrwx, and these permissions are not used.
Typical use just lands on the permissions of the target entry
access to the symlink itself (e.g. to alter it) is based on the permissions on the directory it is in(verify).
  • It is possible for a symlink to point to something that doesn't exist.
  • interesting detail to symlinks and trailing slashes. Consider:
rm symlinktodir         #removes the symlink
rm symlinktodir/        #implies the directory pointed to
If you meant to remove the symlink, but you added -rf out of (bad) habit, and path autocompletion added the slash, then you might've just thrown away a lot of data.
  • A symlink path may be relative or absolute
Absolute can be a little more secure, but potentially more fragile.
Consider e.g. what happens when the target is replaced or the symlink is moved around.
  • symlinks break the abstraction of trees, in that
typical use makes the filesystem a dag, rather than a tree
symlinked directories now have multiple absolute paths that point to them
symlinked directories (arguably) now have multiple parents (so what cd .. should do is ambiguous)
It is possible for a symlink to create loops in the filesystem.
  • symlinks are more fragile in some (unusual) contexts
consider what happens when you present directories outside of their context (e.g. chroot, over network filesystems) - that changes how symlinks should be resolved?
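
A few of these notes (stat on the entry versus its target, the trailing slash, loops) can be seen in a short Python sketch. It assumes Linux-style path resolution, where a trailing slash forces the symlink to be followed; entry_kinds is a made-up helper name.

```python
import errno
import os
import stat
import tempfile


def entry_kinds(path):
    """Return (kind of the entry itself, kind of what it resolves to)."""
    def kind(mode):
        if stat.S_ISLNK(mode):
            return "symlink"
        if stat.S_ISDIR(mode):
            return "directory"
        return "other"

    own = kind(os.lstat(path).st_mode)  # does not follow a final symlink
    try:
        resolved = kind(os.stat(path).st_mode)  # follows symlinks
    except FileNotFoundError:
        resolved = "missing"  # the symlink points to something that doesn't exist
    except OSError as e:
        if e.errno != errno.ELOOP:
            raise
        resolved = "loop"  # symlinks that point at each other
    return own, resolved


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "realdir"))
        link = os.path.join(d, "symlinktodir")
        os.symlink("realdir", link)
        print(entry_kinds(link))        # the symlink entry itself
        print(entry_kinds(link + "/"))  # trailing slash: now it means the directory
```

The second call is the same effect as in the rm example: with the slash, you are no longer talking about the symlink.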


Notes that address both

While there are clear technical differences, in practice there isn't much you can do with hardlinks that you can't do with symlinks.

There are practical details and security issues to both. For example, consider the combination with a chroot jail - symlinks won't work, hardlinks give access to content that is outside the jail.


.desktop (freedesktop)

INI-like text file mostly used for application launchers, directory links, or URL links

Various of its entries (name, icon) can exist in localized form.
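
As a rough illustration, here is a made-up URL-link entry and a sketch of reading it with Python's configparser. The sample content and the read_desktop_entry helper are assumptions for this example, and configparser is only an approximation of the real .desktop syntax, not a full parser for it.

```python
import configparser

# Made-up example content; Name[nl] is a localized form of Name.
SAMPLE_DESKTOP = """\
[Desktop Entry]
Type=Link
Name=Example site
Name[nl]=Voorbeeldsite
URL=http://url.example.com/
"""


def read_desktop_entry(text, locale=None):
    # interpolation=None: values may contain '%' (e.g. in an Exec line)
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case as written instead of lowercasing
    parser.read_string(text)
    entry = parser["Desktop Entry"]
    name = entry.get("Name")
    if locale is not None:
        # Fall back to the unlocalized name when there is no localized one.
        name = entry.get(f"Name[{locale}]", name)
    return {"type": entry.get("Type"), "name": name, "url": entry.get("URL")}


if __name__ == "__main__":
    print(read_desktop_entry(SAMPLE_DESKTOP, locale="nl"))
```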


Windows

Microsoft .URL

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

In Windows, saved internet shortcuts are often .URL files on the filesystem.


.URL files are simple text files, arguably an INI flavour.

The minimal content is:

[InternetShortcut] 
URL=http://url.example.com/

This is also the most interesting content, and often the only. Even if the file stores more metadata, this is still usually the URL you want to open.
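
Since it is INI-like, pulling the URL out takes only a few lines. A sketch using Python's configparser; url_from_shortcut is a hypothetical helper name, and real files may carry quirks this ignores.

```python
import configparser


def url_from_shortcut(text):
    """Return the URL from the text of a .URL file, or None if it has none."""
    # interpolation=None: URLs often contain '%' escapes
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    if not parser.has_section("InternetShortcut"):
        return None
    # Option names are matched case-insensitively by default, so URL= and url= both work.
    return parser["InternetShortcut"].get("URL")


if __name__ == "__main__":
    sample = "[InternetShortcut]\nURL=http://url.example.com/\n"
    print(url_from_shortcut(sample))
```

Handing the result to something like webbrowser.open() would then open it in the default browser.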


In file browsers in linux, you may not be able to directly use these files. It's fairly easy to associate .URL with a simple script that hands the URL to a browser (e.g. with xdg-open, or gnome-open, or kfmclient openURL), but .URL files don't appear in linux unless you copied them from windows, so this may not be set up.




LNK files / Windows Shortcuts

A binary file format that includes a link target and various metadata about itself and the link target.

A lnk file is

  • specific to Microsoft Windows
  • specific to windows explorer (to programs, and command line utilities, it is just another file. Look at the start command, though).
Explorer tends to call these shortcuts, and hides the .lnk extension even when 'show file extensions' is enabled


The stuff you could edit

  • description
  • relative path
  • working directory
  • command line arguments
  • icon location

and things like

  • file attributes
  • whether it points to a network location, and if so, what kind
  • whether it's pointing to a particular kind of device
  • extra environment variables location to set(verify)
  • location, size, codepage of console window to open
  • shell item identifiers (what for?)
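
Which of these optional parts a given file carries is recorded as flag bits in its fixed-size header. The sketch below reads just that header; the layout and flag values are as I recall them from Microsoft's published shell link format description, so treat them as assumptions to check rather than a reference. read_lnk_header and the flag labels are made-up names.

```python
import struct

LNK_HEADER_SIZE = 0x4C
# CLSID 00021401-0000-0000-C000-000000000046, in its on-disk byte order
LNK_CLSID = bytes.fromhex("0114020000000000c000000000000046")

# Low bits of the flags field (assumed values, see the note above).
LINK_FLAGS = {
    0x01: "has target ID list (shell item identifiers)",
    0x02: "has link info",
    0x04: "has description",
    0x08: "has relative path",
    0x10: "has working directory",
    0x20: "has command line arguments",
    0x40: "has icon location",
    0x80: "strings are unicode",
}


def read_lnk_header(data):
    """Parse the start of a .lnk file; raise ValueError if it does not look like one."""
    if len(data) < LNK_HEADER_SIZE:
        raise ValueError("too short to be a .lnk file")
    # header size, class id, link flags, file attributes of the target
    size, clsid, flags, attributes = struct.unpack_from("<I16sII", data, 0)
    if size != LNK_HEADER_SIZE or clsid != LNK_CLSID:
        raise ValueError("not a .lnk file")
    present = [label for bit, label in LINK_FLAGS.items() if flags & bit]
    return {"flags": present, "file_attributes": attributes}


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(read_lnk_header(f.read(LNK_HEADER_SIZE)))
```

The actual strings (description, arguments and so on) follow the header in variable-length sections, which this does not attempt to read.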



NTFS reparse points

NTFS reparse points (introduced in NTFS 3.0, around Windows 2000) [1] are basically a way of saying "this directory entry needs to be handled in a special way; I will tell you the kind of special here".

It will have a reparse tag (a simple number indicating a type), plus data to be interpreted
The value in the reparse tag implies which filter driver should be interpreting it (where filter drivers are things added to an existing driver stack to handle added cases[2])
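
On Windows, Python (3.8 and later) exposes the tag on stat results, which allows a quick look at what an entry is. A sketch: describe_reparse is a made-up helper, and the numeric values are written out here as assumptions for the two tags discussed further down, so that the function also loads on other systems.

```python
import os

FILE_ATTRIBUTE_REPARSE_POINT = 0x400

# Assumed tag values for the two tags covered later in this section.
REPARSE_TAG_NAMES = {
    0xA0000003: "IO_REPARSE_TAG_MOUNT_POINT",
    0xA000000C: "IO_REPARSE_TAG_SYMLINK",
}


def describe_reparse(st):
    """Name the reparse tag of a stat result, or return None if it is not a reparse point."""
    # These fields only exist on Windows; elsewhere nothing is a reparse point.
    attributes = getattr(st, "st_file_attributes", 0)
    if not attributes & FILE_ATTRIBUTE_REPARSE_POINT:
        return None
    tag = getattr(st, "st_reparse_tag", 0)
    return REPARSE_TAG_NAMES.get(tag, f"unknown tag 0x{tag:08X}")


if __name__ == "__main__":
    import sys
    # lstat() so that the entry itself is inspected rather than what it leads to
    print(describe_reparse(os.lstat(sys.argv[1])))
```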


Since their introduction, many reparse tags have appeared[3][4], and for varied uses (e.g. making directories point elsewhere, some drive management in servers, caching in IIS, syncing protocols, metadata to launch UWP apps, and more).

A few of them also have useful everyday names, for example:


IO_REPARSE_TAG_MOUNT_POINT

The most common case is also a confusing case.


A single tag, IO_REPARSE_TAG_MOUNT_POINT, covers what the documentation describes as two distinct things:

  • Volume mount points
  • Junctions

In both cases, opening a directory that has this reparse tag will actually lead the OS to fetch data from a directory elsewhere.


The concept of a volume mount point is basically that you can "mount" the root of another volume inside one directory of another. Put another way, you can have that directory point at the root of another volume.

The concept of a directory junction, often just junction, is very similar, but can point to a directory within a filesystem.

Slightly different restrictions apply(verify).


Different practicalities apply too.

Say, it would have no practical use to have a volume mount point point at the same volume (it would expose the same data, and various utilities might recurse endlessly)...

...but it is potentially very useful to have a junction point to a directory within the same volume.

For example, it is used to have applications looking in one place go elsewhere, without them really being aware of it - often for compatibility reasons.

For example, windows has reorganized things over time, so in your %USERPROFILE% there is probably e.g. an Application Data pointing to AppData\Roaming (see also AppData).

To see that, run cmd, and type:

cd %USERPROFILE%
DIR /A

(You can get a little more information with utilities like junction)


See also:

https://en.wikipedia.org/wiki/NTFS_reparse_point#Directory_junctions
https://en.wikipedia.org/wiki/NTFS_volume_mount_point

IO_REPARSE_TAG_SYMLINK

NTFS symbolic links, like junctions but can also point to files, and are resolved a little differently.

can point to either files or directories (decision made at link creation time)
can be relative paths
can be UNC paths, but apparently doing so requires support by the remote host as well?(verify)
added in NTFS 3.1 (around XP) but were never easy to use
XP only allowed them for kernel mode(verify)
Vista enabled user mode use and added a mklink tool to create them, but required administrator privileges to create them(verify).
Windows 10 has similar restrictions, though if you enable Developer Mode, the mklink tool becomes usable by any user [5][6]
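
From Python, the file-or-directory decision shows up as the target_is_directory argument of os.symlink (ignored on non-Windows platforms). A sketch; make_symlink and its return strings are made up for illustration.

```python
import os
import tempfile


def make_symlink(target, link_path):
    """Create a symlink, telling the OS up front whether the target is a directory."""
    # A relative target is resolved against the directory the link lives in.
    resolved = os.path.join(os.path.dirname(link_path), target)
    is_dir = os.path.isdir(resolved)
    try:
        os.symlink(target, link_path, target_is_directory=is_dir)
    except OSError as e:
        # On Windows this is typically the missing privilege described above.
        return f"could not create symlink: {e}"
    return "directory symlink" if is_dir else "file symlink"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        os.mkdir(os.path.join(d, "sub"))
        print(make_symlink("sub", os.path.join(d, "link_to_sub")))
```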


some other links

There's also

  • WSL symlink
not sure yet how this is different, but it's different.
Probably related to how there are also WSL domain sockets, named pipes, character and block device types.
  • WCI links
MS does containers now, so needed to add more secure isolation. This is part of that.


Hardlinks aren't reparse points(verify) (refer to MFT entries?(verify)) and are created with the CreateHardLink call.


"How do I see?"

Apple