Filesystem links on different OSes



Linux

Hardlinks

In most Unix-style filesystems, a file is an (inode, content) pair, where the inode is identified by a number that is unique within that filesystem.

Directories are collections of (name,inode) entries.

That means you can theoretically have more than one directory entry to point to the same inode - two filenames pointing to the same stored data.

This is usually allowed, and such additional entries are called hardlinks.
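As a minimal sketch of the above (assuming a filesystem that allows hardlinks; the file names and the hardlink_demo helper are made up for illustration), Python's os.link adds a second directory entry, and os.stat shows that both names report the same inode number:

```python
import os
import tempfile

def hardlink_demo():
    """Create a file plus a second directory entry for it; return both stat results."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first.txt")    # hypothetical names
        second = os.path.join(d, "second.txt")
        with open(first, "w") as f:
            f.write("hello")
        os.link(first, second)  # add a second (name, inode) entry
        return os.stat(first), os.stat(second)

a, b = hardlink_demo()
print(a.st_ino == b.st_ino)  # do both names resolve to the same inode?
print(a.st_nlink)            # how many directory entries point at that inode
```

st_nlink is the link count the filesystem keeps per inode, which is the bookkeeping that hardlinks rely on.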

Note that

  • they are not really links at all.
Yes, you can only really create the additional entries by referring to the first, but once they exist, they are all equal.
Hardlinks were not usually called hardlinks until symlinks existed. (verify)
  • since inodes are unique only per filesystem, hardlinks cannot point to things on other filesystems.
  • Since hardlinks need extra bookkeeping (in particular "only delete contents once there are 0 directory entries resolving to this inode"), a filesystem has to implement them specifically.
There are some that don't implement/allow it, but most filesystems that have POSIX in their background will.
  • they arguably make the file analogy a little leaky, in that if you see two different directory entries, you don't expect that editing one will also change the other
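The last two points can be tried out with a small sketch (again assuming hardlink support; names and the helper are hypothetical). It writes through one name, reads through the other, then removes the original name and checks that the content is still there:

```python
import os
import tempfile

def edit_and_unlink_demo():
    """Show that hardlinked names share content, and that content outlives the first name."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first.txt")
        second = os.path.join(d, "second.txt")
        with open(first, "w") as f:
            f.write("v1")
        os.link(first, second)

        # Write in place through the second name, then read via the first.
        with open(second, "w") as f:
            f.write("v2")
        with open(first) as f:
            seen_via_first = f.read()

        # Remove the original name; one directory entry still resolves to the inode.
        os.unlink(first)
        links_left = os.stat(second).st_nlink
        with open(second) as f:
            seen_after_unlink = f.read()
        return seen_via_first, links_left, seen_after_unlink

print(edit_and_unlink_demo())
```

Note that this only holds for in-place writes. Tools that save by writing a new file and renaming it over the old name leave the other entry pointing at the old content.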


Symlinks

Symlinks (sometimes 'softlinks') are, instead, a special type of directory entry.

On a low level these entries just store a path string (in early implementations in the file data, now optionally in the metadata, as that's faster). All file APIs know about this: when they meet a directory entry marked as a symlink, they interpret it as a kind of proxy/redirect, resolving the path it stores and then opening that.


Symlinks will work across filesystems, because they use path strings instead of inodes.

For the same reason, a symlink is not necessarily valid at all times, e.g. after you delete the file it points to. There is no related filesystem bookkeeping.
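A rough sketch of both properties, assuming a POSIX-style system where the current user may create symlinks (the names and the helper are invented for the example). os.readlink returns the stored path string, and after the target is deleted the link entry is still there but no longer resolves:

```python
import os
import tempfile

def dangling_symlink_demo():
    """Create a symlink, delete its target, and report what is left."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")  # hypothetical names
        link = os.path.join(d, "link")
        with open(target, "w") as f:
            f.write("data")
        os.symlink(target, link)

        stored = os.readlink(link)  # the path string kept in the symlink
        os.unlink(target)           # nothing updates or removes the symlink
        return {
            "stored_is_target_path": stored == target,
            "link_entry_exists": os.path.lexists(link),  # looks at the entry itself
            "target_resolves": os.path.exists(link),     # follows the symlink
        }

print(dangling_symlink_demo())
```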



Notes:

  • you can have a stat() call stat the symlink entry or its target.
  • Symlinks do not have permissions themselves, they always appear as lrwxrwxrwx, and these permissions are not used.
Typical use just lands on the permissions of the target entry
Access to the symlink itself (e.g. to alter it) is based on the permissions of the directory it is in (verify).
  • It is possible for a symlink to point to something that doesn't exist.
  • interesting detail to symlinks and trailing slashes. Consider:
rm symlinktodir         #removes the symlink
rm symlinktodir/        #implies the directory pointed to
If you meant to remove the symlink, but you added -rf out of (bad) habit, and path autocompletion added the slash, then you might've just thrown away a lot of data.
  • A symlink path may be relative or absolute
Absolute can be a little more secure, but potentially more fragile.
Consider e.g. what happens when the target is replaced or the symlink is moved around.
  • symlinks break the abstraction of trees, in that
typical use makes the filesystem a DAG (directed acyclic graph), rather than a tree
symlinked directories now have multiple absolute paths that point to them
symlinked directories (arguably) now have multiple parents (so cd .. i)
It is possible for a symlink to create loops in the filesystem.
  • symlinks are more fragile in some (unusual) contexts
consider what happens when you present directories outside of their original context (e.g. in a chroot, or over a network filesystem): the path a symlink stores may then resolve differently, or not at all.
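A few of these notes can be poked at from Python. This is a sketch assuming Linux-like path resolution; the directory and link names, and the helper itself, are made up. It compares lstat (the entry) with stat (the target), shows that a trailing slash makes even lstat land on the directory, and creates a self-referencing symlink to see how a loop is reported:

```python
import errno
import os
import stat
import tempfile

def symlink_notes_demo():
    """Check lstat vs stat, the trailing-slash detail, and a symlink loop."""
    with tempfile.TemporaryDirectory() as d:
        real_dir = os.path.join(d, "realdir")
        link = os.path.join(d, "linktodir")
        os.mkdir(real_dir)
        # Relative target: resolved from the directory the symlink lives in.
        os.symlink("realdir", link)

        results = {
            "lstat_is_link": stat.S_ISLNK(os.lstat(link).st_mode),
            "stat_is_dir": stat.S_ISDIR(os.stat(link).st_mode),
            # With a trailing slash the path is resolved through the symlink.
            "lstat_trailing_slash_is_dir": stat.S_ISDIR(os.lstat(link + "/").st_mode),
        }

        # A symlink that points at itself cannot be resolved.
        loop = os.path.join(d, "loop")
        os.symlink("loop", loop)
        try:
            os.stat(loop)
            results["loop_errno"] = None
        except OSError as exc:
            results["loop_errno"] = exc.errno
        return results

res = symlink_notes_demo()
print(res)
print(res["loop_errno"] == errno.ELOOP)
```

The trailing-slash result is the same mechanism behind the rm example: with the slash, tools end up operating on the directory rather than on the link.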


Notes that address both

While there are clear technical differences, in practice there isn't much you can do with hardlinks that you can't do with symlinks.

There are practical details and security issues to both. For example, consider the combination with a chroot jail: symlinks that point outside the jail won't resolve, while hardlinks give access to content that is outside the jail.



Windows

NTFS reparse points (introduced in NTFS 3.0, with Windows 2000) [1] are basically a way of saying "this directory entry needs to be handled in a special way".

Each has a reparse tag, plus data to be interpreted.

The value of the reparse tag determines which filter driver should interpret that data.
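To look at a reparse tag from user code, something like the following sketch may work. It assumes Python 3.8 or later on Windows, where os.lstat results carry st_file_attributes and st_reparse_tag; on other platforms those fields are absent and the function just says so. The numeric values are the ones published in the Windows SDK headers, written out as literals here; describe_reparse_point is a made-up helper name.

```python
import os

# Values from the Windows SDK headers, repeated as literals so the sketch
# does not depend on which constants a given Python build exports.
FILE_ATTRIBUTE_REPARSE_POINT = 0x400
KNOWN_TAGS = {
    0xA0000003: "mount point / directory junction",  # IO_REPARSE_TAG_MOUNT_POINT
    0xA000000C: "symbolic link",                      # IO_REPARSE_TAG_SYMLINK
}

def describe_reparse_point(path):
    """Return a short description of the reparse tag on path, if any."""
    st = os.lstat(path)  # lstat, so the entry itself is inspected
    attrs = getattr(st, "st_file_attributes", None)
    if attrs is None:
        return "no reparse information on this platform"
    if not attrs & FILE_ATTRIBUTE_REPARSE_POINT:
        return "not a reparse point"
    tag = getattr(st, "st_reparse_tag", 0)
    return KNOWN_TAGS.get(tag, f"other reparse tag 0x{tag:08X}")

print(describe_reparse_point(os.getcwd()))
```

Only two tags are named here; anything else is reported by its numeric value rather than guessed at.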


Since their introduction, quite a few reparse tags have appeared[2][3], for varied uses (e.g. some DFS stuff, some drive management in servers, caching in IIS, syncing protocols, metadata to launch UWP apps, and more).


In the context of regular-looking localish links, the more interesting are:

  • volume mount points
like *nix mounts, where the root of another filesystem can be placed under a specific path (rather than a specific drive letter)
does not allow UNC paths(verify)
  • directory junction points
point to a directory within another volume, rather than its root (but actually implemented with the same tag as volume mount points)
  • NTFS symbolic links [4]
can point to either files or directories (decision at link time)
can be relative paths, and UNC paths (so e.g. SMB network paths) (but apparently requires support by the remote host as well?(verify))
added in NTFS 3.1 (around XP) but were never very easy to use
XP only allowed them for kernel mode(verify)
Vista enabled user mode use and added a mklink tool to create them, but required administrator privileges to do so(verify).
Windows 10 has similar restrictions, though if you enable Developer Mode, the mklink tool becomes usable by any user [5][6]
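From a script, the "decision at link time" and the privilege requirement look roughly like this. A hedged sketch: try_symlink and the names used are hypothetical, and the exact error raised on an unprivileged Windows account is not checked here, only that it is an OSError. Python's os.symlink takes a target_is_directory argument, which matters on Windows and is ignored elsewhere:

```python
import os
import tempfile

def try_symlink(target, link):
    """Attempt to create a symlink; report failure as text instead of raising."""
    try:
        # File-vs-directory is decided here, when the link is created.
        os.symlink(target, link, target_is_directory=os.path.isdir(target))
    except OSError as exc:
        # On Windows, likely a missing privilege (not admin, Developer Mode off).
        return f"failed: {exc}"
    return "created"

def symlink_creation_demo():
    """Try to link to a freshly made directory and report what happened."""
    with tempfile.TemporaryDirectory() as d:
        target_dir = os.path.join(d, "realdir")  # hypothetical names
        os.mkdir(target_dir)
        link = os.path.join(d, "linktodir")
        outcome = try_symlink(target_dir, link)
        return outcome, os.path.islink(link)

print(symlink_creation_demo())
```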


There's also

  • WSL symlink
not sure yet how this is different, but it's different.
Probably related to how there are also WSL domain sockets, named pipes, character and block device types.
  • WCI links
MS does containers now, so needed to add more secure isolation. This is part of that.


Hardlinks aren't reparse points(verify) (refer to MFT entries?(verify)) and are created with the CreateHardLink call.



Apple