Virtualization, emulation, simulation


This article/section is a stub — probably a pile of half-sorted notes, is not well-checked so may have incorrect bits. (Feel free to ignore, fix, or tell me)
Notes related to (mostly whole-computer) virtualization, emulation and simulation.

Some overview · Docker notes · Qemu notes

Virtualization, jails, application containers, OS containers

chroot jails

chroot() changes a process's apparent filesystem-root directory to a given path, so it only really affects how path lookups work.

This is useful for things that want to run in an isolated set of files, such as a clean build system, which seems to have been its original use.


But you should assume people who want to break out of these will manage, so this is not a security feature.

See chroot for more words about that.
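
As a sketch of the syscall itself (assuming a /srv/build directory populated with enough of a filesystem to start a shell -- the path is just an example; this needs root):

/* Minimal chroot demo: run a shell with / remapped to a build tree. */
#include <stdio.h>
#include <unistd.h>

int main(void) {
    if (chroot("/srv/build") != 0) { perror("chroot"); return 1; }
    if (chdir("/") != 0) { perror("chdir"); return 1; }  /* don't leave cwd outside the new root -- the classic escape */
    execl("/bin/sh", "sh", (char *)NULL);                /* path now resolves inside /srv/build */
    perror("execl");
    return 1;
}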

FreeBSD Jails

Inspired by early chroot() and the need to compartmentalize, BSD jails actually did have some security in mind.

It only lets processes talk to others in the same jail, and considers syscalls, sockets/networking and some other things.

It's been around for a while, is mature, and allows decently fine-grained control.

Solaris/illumos Zones

If you come from a linux angle: these are mostly like LXC (verify), and were mature before it.

Due to solaris pedigree, this combines well with things like ZFS's snapshots and cloning.


Linux containers

There are a few kernel features that isolate, monitor, and limit processes on a linux host.


So when we say containers are just host processes, that's meant literally.

While these isolations are a "pick which you want" sliding scale, 'Linux containers' basically refers to using all of them, to isolate things in all the ways that matter to security.

And often with a toolset to keep your admin person sane, which is why docker (and similar) are so closely related.


The main two building blocks that make this possible are namespaces and cgroups.

Just that would probably still be easier to break out of than a classical VM, so when you care about security you typically supplement that with

capabilities, basically fine-grained superuser rights
seccomp, which filters allowed syscalls (a minimal sketch follows below)
SELinux/AppArmor/other MAC, which applies separately, as protection for your host that any specific container's own setup might not provide (often not necessary, but good for peace of mind)

(Note that a good number of these fundamentals resemble BSD jails and illumos zones, at the conceptual "let's do a pretty hard split and build from there" level)
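
For a taste of the seccomp layer mentioned above, a minimal sketch using the kernel's original "strict" mode: after the prctl() call, only read(), write(), _exit() and sigreturn() are allowed, and any other syscall kills the process. (Real container runtimes use the more flexible filter mode, with BPF programs, instead.)

/* seccomp strict mode: after the prctl(), any syscall other than
   read/write/_exit/sigreturn gets the process killed. */
#define _GNU_SOURCE
#include <linux/seccomp.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void) {
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) {
        perror("prctl");
        return 1;
    }

    const char msg[] = "write() is still allowed\n";
    write(1, msg, sizeof msg - 1);

    syscall(SYS_getpid);              /* not on the allowed list: SIGKILL here */
    write(1, "never reached\n", 14);
    return 0;
}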

Docker's focus on copy-on-write storage to build images mostly isn't necessary, but is quite useful because it's fast and manageable.


Uses

Various other names around this area - docker, LXC, kubernetes, rkt, runC, systemd-nspawn, OpenVZ - are various degrees of runtimes and management around the above.

Some of them are the gears and duct tape, some the nice clicky interface, and some are aimed at different typical uses, or different scales.

For example, where Docker is aimed at making single-purpose containers, at building them, and at portability, LXC is aimed more at being a multitenant machine virtualisation thing.

They share history, libraries, and other code -- but their design and ecosystem ended up making them different specializations. While you can do microservices with LXC, and a fleshed-out system in docker, both take more work and tweaking to get quite right (e.g. docker avoids /sbin/init not only because it's not necessary, but also because init does some stuff that you have to work around, like setting the default gateway route).

rkt and runC focus just on running containers (minimal wrappers around libcontainer)

kubernetes focuses almost purely on orchestrating (automated deployment, scaling, and other management of) systems within one or more hosts.



Linux OpenVZ



Hardware emulation, simulation, and virtualization - VM stuff

"Here is a thing that acts like hardware X. Put on that whatever you want" can be done in various ways.


Simulation and emulation

Whenever you can't run the code directly on the silicon, you're talking about simulation and emulation.

Generally,

  • simulation means imitating surface behaviour
      Simulation usually points to implementing one thing in another at the highest level you can get away with (often because it's less work).
  • emulation means replicating exactly how the internals work
      Emulation usually points to implementing one thing in another at the lowest level (often because you need to).

There is argument about these distinctions -- and, since many things are a bit of both, about how useful they are.


For example, After Dark in CSS (https://www.bryanbraun.com/after-dark-css/) shows you flying toasters without caring about the code the original screensaver used, or the Windows 3.1 it usually ran on. This is a clean example of simulation.

Running classic console games on a PC sits largely on the emulation end, mainly because there is little to no hardware overlap between those consoles and the PC you probably want to run them on. Everything has to be imitated, and in most cases the details mean large chunks of that have to be emulated at hardware level.


Somewhere in the middle is the question of "running a program in another environment", e.g.

  • Wine (https://www.winehq.org/) letting you run windows binaries on linux
      the code is for the same architecture, so at opcode level it basically just runs
      but there's no windows kernel, so the main thing left to do is take care of the system calls the program (and any libraries it uses) makes
      so this is simulating an operating system
  • WSL1 (https://en.wikipedia.org/wiki/Windows_Subsystem_for_Linux#WSL_1) running linux binaries on windows is basically the same thing (simulating an operating system) the other way around: no linux kernel, but a translation layer to windows
      (though note WSL2 works differently: it runs a modified linux kernel in a VM)


To go more into how this is often a sliding scale, consider:

  • classical game consoles tended to use CPUs we no longer use, and custom hardware setups -- and their code often talks to that hardware directly
      so the code has to be run indirectly/differently, and hardware access has to be redirected
      while you could emulate each piece of hardware down to the silicon, that's often slow (easily one or two orders of magnitude slower), so it can be a lot more efficient to simulate what it does
      e.g. memory management is just bookkeeping
      e.g. a sound chip that mostly makes square waves can be simulated -- though it takes proper care, because people used these chips in interesting ways (see the sketches below)
      in general, a game that doesn't (yet) run on an emulator usually does something the simulation code did not consider
  • VirtualBox and other VMs are a different case
      modern PCs do hardware virtualization, meaning some care was taken at hardware level so that distinct OSes can run on the same CPU without being aware of, or able to affect, each other
      before that was standard, VMs did selective interception of what they hosted, to do that protection at software level -- you could call that thin emulation
      arguably neither is truly simulation or emulation; it's special-cased handling within the same architecture
      at the same time, the network card, sound card, mouse, etc. presented to the guest are simulations -- mostly because you're transporting these things between OSes, so you have no other choice
  • note that some things, e.g. a CPU, are so complex that simulating and emulating come down to almost the same thing anyway


The differences aren't really relevant for e.g. "run this program made for this other CPU", in that it must do everything according to spec or expect to fail wildly, which typically means emulation is your only option (or, from another view, simulating and emulating a CPU is almost the same thing anyway).

...most of this doesn't apply to VMs, though.
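
To make the emulation end concrete, here is a minimal sketch of the fetch-decode-execute loop at the heart of a CPU emulator -- for a made-up 8-bit machine, so the opcodes and registers are invented for illustration, not any real console's:

/* Fetch-decode-execute loop for an invented 8-bit accumulator machine. */
#include <stdint.h>
#include <stdio.h>

enum { OP_HALT = 0x00, OP_LDA = 0x01, OP_ADD = 0x02, OP_STA = 0x03 };

int main(void) {
    uint8_t mem[256] = {
        /* program: A = mem[16]; A += mem[17]; mem[18] = A; halt */
        OP_LDA, 16, OP_ADD, 17, OP_STA, 18, OP_HALT,
    };
    mem[16] = 20;
    mem[17] = 22;

    uint8_t a = 0;   /* accumulator */
    uint8_t pc = 0;  /* program counter */

    for (;;) {
        uint8_t op = mem[pc++];                         /* fetch */
        switch (op) {                                   /* decode + execute */
        case OP_LDA:  a = mem[mem[pc++]]; break;
        case OP_ADD:  a = (uint8_t)(a + mem[mem[pc++]]); break;  /* 8-bit wraparound, like the hardware */
        case OP_STA:  mem[mem[pc++]] = a; break;
        case OP_HALT: printf("mem[18] = %d\n", mem[18]); return 0;
        default:      fprintf(stderr, "bad opcode 0x%02x\n", op); return 1;
        }
    }
}

A real emulator adds per-instruction cycle counts, flags, interrupts, and hardware quirks, because software depends on them -- that is the "replicating the internals" part.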

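And the simulation end, using the square-wave sound chip mentioned above: rather than modeling the chip's internal counters, just compute the samples its registers describe. The frequency and duty-cycle "registers" here are invented, not any real chip's layout:

/* Behavioural simulation of an imaginary square-wave sound channel:
   generate the samples the registers describe, instead of modeling
   the chip's internal counters. */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double sample_rate = 48000.0;
    const double freq = 440.0;   /* value the game wrote to the (invented) frequency register */
    const double duty = 0.5;     /* (invented) duty-cycle register: fraction of each period spent high */

    const double period = sample_rate / freq;           /* samples per wave period */
    for (int i = 0; i < 200; i++) {
        double phase = fmod(i, period) / period;        /* 0..1 within the current period */
        printf("%d\n", phase < duty ? 1 : -1);          /* one square-wave sample */
    }
    return 0;
}

This breaks down exactly where the prose above says it does: games that twiddle the registers mid-frame, or abuse the chip's behaviour, push you back toward emulating the internals.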

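And for the Wine/WSL1 style of "simulating an operating system", a toy sketch of a syscall-translation layer. The guest syscall numbers and the guest_syscall() entry point are invented for illustration; real translation layers hook the actual syscall mechanism:

/* Toy version of the idea behind Wine and WSL1: the code runs natively,
   but its system calls are translated onto what the host actually has. */
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

enum { GUEST_WRITE = 4, GUEST_GETPID = 20 };   /* invented guest ABI */

/* The translation layer: one guest syscall in, host behaviour out. */
static long guest_syscall(int nr, ...) {
    va_list ap;
    va_start(ap, nr);
    long ret = -1;
    switch (nr) {
    case GUEST_WRITE: {
        int fd = va_arg(ap, int);
        const char *buf = va_arg(ap, const char *);
        size_t len = va_arg(ap, size_t);
        ret = write(fd, buf, len);             /* the host syscall does the real work */
        break;
    }
    case GUEST_GETPID:
        ret = (long)getpid();
        break;
    }
    va_end(ap);
    return ret;
}

int main(void) {
    /* a "guest" program that only knows the invented ABI */
    guest_syscall(GUEST_WRITE, 1, "hello from the guest\n", (size_t)21);
    printf("guest pid: %ld\n", guest_syscall(GUEST_GETPID));
    return 0;
}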
Same-architecture emulation, virtualization

Virtual machines

See Comparison of virtual machines (wikipedia).


LXC notes

A few notes on...

namespaces

Namespaces limit what you can see of a specific type of resource, by implementing a mapping between within-a-container resources and the host's.


This is a cheap way to have each container see its own set -- and have the host manage these as distinct subsets.

Linux has grown approximately six of these so far:

  • PID - allows containers to have distinct process trees
  • user - user and group IDs (e.g. allows UID 0 (root) inside a container to be non-root on the host)
  • mount - can have its own filesystem root (chroot-alike), can have own mounts (e.g. useful for /tmp)
  • network - network devices, addresses and routing, sockets, ports, etc.; some interesting variations/uses
  • UTS - mostly for nodename and domainname
  • IPC - for SysV IPC and POSIX message queues


For example, you can

sudo unshare --fork --pid --mount-proc bash

which separates only the process's PID namespace. You can run top (because your filesystem is still there as before), you can talk to the network as before, etc. -- but no matter what you do, you'll only see the processes you started under this bash.
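
The same thing from C, roughly what that command does under the hood (error handling trimmed; needs root or CAP_SYS_ADMIN):

/* Roughly `unshare --fork --pid --mount-proc bash`: unshare the PID (and
   mount) namespaces, fork so the child becomes PID 1 in the new namespace,
   and remount /proc so tools like top see only this namespace. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    if (unshare(CLONE_NEWPID | CLONE_NEWNS) != 0) { perror("unshare"); return 1; }

    pid_t child = fork();   /* the first child after unshare() gets PID 1 */
    if (child == 0) {
        /* keep mount changes private to us, then remount /proc for this namespace */
        mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL);
        mount("proc", "/proc", "proc", 0, NULL);
        execl("/bin/bash", "bash", (char *)NULL);
        perror("execl");
        return 1;
    }
    waitpid(child, NULL, 0);
    return 0;
}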


See also namespaces [1]

cgroups

The control groups concept is about balancing and metering resources.


Note that cgroups apply to all of the host, not just containers (they just effectively default to no limits).

They're a useful tool to limit how crazy a known-to-misbehave app can go, without going near anything resembling namespaces or containers.


Resource types (which cgroups calls subsystems) include:

  • memory - heap and stack and more (interacts with page cache(verify))
      allows the OOM killer to work per group, which is part of the "put one service in a container" suggestion
  • cpu - sets weights, not limits
  • cpuset - pin to a CPU
  • blockIO - limits and/or weights
  • network - exists too, but does little more than tagging, for egress
  • devices
  • hugeTLB
  • freezer - allows pausing processes in a cgroup


Cgroups set up a hierarchy for each of these; nodes in each hierarchy refer to (a group of) processes.

The API for cgroups is a filesystem, /sys/fs/cgroup, which is verbose and finicky to use directly, so there are now nicer tools for it.
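
For example, capping a process's memory through that filesystem API directly. A minimal sketch, assuming the cgroup v2 unified hierarchy mounted at /sys/fs/cgroup with the memory controller enabled; the "demo" group name is just an example, and this needs root:

/* Cap our own memory use via the cgroup filesystem API (cgroup v2). */
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static int write_file(const char *path, const char *text) {
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fprintf(f, "%s", text);
    return fclose(f);
}

int main(void) {
    char pid[32];

    mkdir("/sys/fs/cgroup/demo", 0755);                         /* create the group */
    write_file("/sys/fs/cgroup/demo/memory.max", "268435456");  /* 256 MiB cap */

    snprintf(pid, sizeof pid, "%d", getpid());
    write_file("/sys/fs/cgroup/demo/cgroup.procs", pid);        /* move ourselves into it */

    /* from here on, the kernel accounts and limits this process's memory;
       allocate past the cap and the OOM killer acts within this group */
    return 0;
}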


See also

cgroups (wikipedia) - https://en.wikipedia.org/wiki/Cgroups
https://www.kernel.org/doc/Documentation/cgroup-v1/cgroups.txt
https://www.youtube.com/watch?v=sK5i-N34im8

Cloudy environments

Openstack notes

Openstack is designed (by Rackspace+NASA) as a coherent set of components to set up a compute cloud.


libcloud notes