Unsorted webdev notes

📃 These are primarily notes, intended to be a collection of useful fragments, that will probably never be complete in any sense.

Communicating both ways between client and server

Sometimes the server has things to say.

Polling

"Hey server, do you have something new to say since last I asked?"

Is the laziest option, and will work.

...but not an efficient one if you care about the server's message being here sooner rather than later.

Want it on screen within roughly 100ms?

Well then, you need ten XHR-style queries per second.

And hope it doesn't get backlogged or throttled.
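The arithmetic behind that is simple enough to sketch. The function below is illustrative only (the name pollingCost and its parameters are made up here), and it ignores network and server time, so real delays come out somewhat higher.

```javascript
// Cost of fixed-interval polling for a target worst-case delay.
// Hypothetical helper, not part of any library.
function pollingCost(targetDelayMs, clients = 1) {
  const requestsPerSecond = (1000 / targetDelayMs) * clients;
  return {
    requestsPerSecond,
    requestsPerDay: requestsPerSecond * 86400,
    // A message arrives, on average, halfway through a polling interval.
    averageDelayMs: targetDelayMs / 2,
  };
}

console.log(pollingCost(100));       // one client polling every 100ms
console.log(pollingCost(100, 1000)); // a thousand such clients
```

Most of those requests come back empty, which is the inefficiency the options below try to avoid.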

Hanging GET

(See also COMET[1])

This is an XHR-style call on a URL where the server accepts the connection and keeps it open, but only responds once it has something to say.

This works, and in some ways this is a cleverer variant of polling, but:

is a bit hacky on the browser side

for robustness you must deal with timeouts, routers that kill idle connections, etc. so your code must be ready to re-establish these when needed.

occupies a lot of connections on the server side.

which at scale can run it out of available sockets

is one-way (server to client), mostly.
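A minimal sketch of the client side of a hanging GET, assuming a hypothetical async function request(cursor) that resolves with { events, cursor } once the server has something to say, and rejects on a timeout or a dropped connection. In a browser that function would wrap fetch() or XHR; it is injected here so the loop itself stays visible.

```javascript
// Long-poll loop: each request stays open until the server has news,
// then we immediately ask again. `request` is an assumed, caller-supplied function.
async function longPoll(request, onEvent, { maxRounds = Infinity, retryDelayMs = 1000 } = {}) {
  let cursor = null; // what we saw last, so the server can skip older events
  for (let round = 0; round < maxRounds; round++) {
    try {
      const reply = await request(cursor);
      reply.events.forEach(onEvent);
      cursor = reply.cursor;
    } catch (err) {
      // Timeouts and killed idle connections are expected: wait, then re-establish.
      await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
    }
  }
  return cursor;
}
```

The try/catch is the "be ready to re-establish" part; without the cursor, anything the server said between two requests would be lost.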

Server-Sent Events

Supported by most browsers since around 2015; IE never supported it, and Edge only since 2020.

Could be considered a formalisation of the hanging GET with a cleanish browser API.

One-way (server to client), mostly.

Useful for notifications and such, and avoids polling and its latency and connection overhead.

Still plain HTTP.

What you get is serialization, some browser hooks, automatic reconnection, and you can define fairly arbitrary events.

Optionally, event IDs let the browser know what it saw last before a reconnect, so it's easier for servers to support sending an appropriate backlog.
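A sketch of what the server sends, assuming numeric event IDs that start at 1. The id:, event: and data: field names and the blank line that ends an event are part of the text/event-stream format; the helper names formatSseEvent and backlogSince are made up for illustration. On reconnect the browser reports the last ID it saw in a Last-Event-ID request header.

```javascript
// Serialize one event in text/event-stream form.
function formatSseEvent({ id, event, data }) {
  const lines = [];
  if (id !== undefined) lines.push(`id: ${id}`);
  if (event !== undefined) lines.push(`event: ${event}`);
  // Multi-line data becomes several data: lines; the client rejoins them with newlines.
  for (const line of String(data).split('\n')) lines.push(`data: ${line}`);
  return lines.join('\n') + '\n\n'; // the blank line terminates the event
}

// Events the client has not seen yet, given the Last-Event-ID it reported (if any).
function backlogSince(events, lastEventId) {
  const last = Number(lastEventId ?? 0);
  return events.filter((e) => e.id > last);
}

console.log(formatSseEvent({ id: 7, event: 'note', data: 'hello\nworld' }));
```

On the browser side this pairs with new EventSource(url) and addEventListener('note', ...), where 'note' is whatever event name you chose.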

WebSockets

WebSockets are

two-way
kept open
message-based: the browser-side API is event-based and delivers whole messages (text or binary)

...so allow both push as well as pull systems.

The initial handshake is HTTP-like (in part so that you can use the same port, and you can sometimes get the same webserver/proxy to deal with both similarly(verify)), but the communication after that switches to a protocol that is entirely its own thing.

Fairly widely supported since around 2014 [2]

Upsides

server push of arbitrary messages
lower latency than request-response, more standard than hanging GET and more flexible than Server-Sent Events

Arguables / downsides

it's basically only a network layer - hence 'socket' (though it carries whole messages rather than a byte stream)

you don't even get to add HTTP headers for the HTTP phase of the setup

you must implement your own protocol (events back and forth tend to be simple enough)

you must implement your own semantics

you must implement your own caching

they don't reconnect automatically(verify)

though it's not very hard to set up reconnection

that does have implications if and when the interchange you put on top must not miss events

you have to avoid common design pitfalls in the process

you may need to implement your own DDoS alleviation / put some software in front to do so for you

and you basically can't do that with HTTP auth; WS doesn't allow that

holds open a connection; there's a limit per server

so it's likely something you use for logged in users, not arbitrary pages
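Since reconnection is left to you, one common approach is exponential backoff with jitter, so that a server restart doesn't have every client reconnecting at the same instant. A minimal sketch; the function name, defaults, and the choice of "full jitter" are assumptions here, not anything the WebSocket API prescribes.

```javascript
// Delay before reconnect attempt number `attempt` (0-based).
// `random` is injectable so the behaviour can be tested deterministically.
function reconnectDelay(attempt, { baseMs = 500, maxMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Full jitter: pick a delay in [0, ceiling) so clients don't reconnect in lockstep.
  return Math.floor(random() * ceiling);
}

// Intended use in a browser (not executed here):
//   ws.onclose = () => setTimeout(connect, reconnectDelay(attempt++));
//   ws.onopen  = () => { attempt = 0; };
for (let attempt = 0; attempt < 5; attempt++) {
  console.log(attempt, reconnectDelay(attempt));
}
```

If the protocol on top must not miss events, the reconnect also has to tell the server what was seen last, much like the event IDs in Server-Sent Events.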

Messages and frames

Frames are header + body.

A message is made up of one or more frames, in that frames can be marked as a continuation fragment of the previous - and so of an overall message, to be reassembled as a whole.

Websocket proxies are free to reframe messages.
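To make "header + body" concrete, here is a sketch that reads the fixed part of a frame header, following the layout in RFC 6455: a FIN bit and 4-bit opcode in the first byte, a mask bit and 7-bit length in the second, with the length values 126 and 127 meaning "the real length follows in 2 or 8 bytes". The function name and the shape of the returned object are made up for illustration; browsers never expose frames, so this is only relevant when writing a server or a tool.

```javascript
// Parse the start of a WebSocket frame from a Uint8Array.
function parseFrameHeader(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const fin = (bytes[0] & 0x80) !== 0;   // last frame of its message?
  const opcode = bytes[0] & 0x0f;        // 0 = continuation, 1 = text, 2 = binary, 8 = close
  const masked = (bytes[1] & 0x80) !== 0; // client-to-server frames are masked
  let payloadLength = bytes[1] & 0x7f;
  let headerLength = 2;
  if (payloadLength === 126) {
    payloadLength = view.getUint16(2);      // big-endian, as DataView defaults to
    headerLength = 4;
  } else if (payloadLength === 127) {
    payloadLength = view.getBigUint64(2);   // note: a BigInt in this case
    headerLength = 10;
  }
  if (masked) headerLength += 4;            // the masking key follows the length
  return { fin, opcode, masked, payloadLength, headerLength };
}

// An unmasked, final text frame carrying 5 bytes:
console.log(parseFrameHeader(Uint8Array.of(0x81, 0x05)));
```

A fragmented message is then a first frame with FIN unset and a text/binary opcode, followed by continuation frames (opcode 0), the last of which has FIN set.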

Messages or streaming?

Websockets are often explained as an API sending and receiving messages as a whole, suggesting you create whole messages, calculate their size, and then send.

And that's done for you. It's often considered whole-message based, because the API on top usually does that.

The specs are more interesting than that -- but in messy ways that you probably don't care for, because it tried to meet people with varied interests.

So the specs allow sending data without buffering, and without knowing the message size when you start sending, by marking the last frame of a message with the FIN flag[3] instead. These are provisions to use it as a stream-based protocol as well.

Practically, there are a number of footnotes that mean you probably want to

avoid infinitely long messages (streaming allows this)

avoid streaming in general, except in cases you really want it

avoid huge messages

one reason being that APIs typically expose messages (and not frames).

exposing a frame API is not required by specs

avoid huge frames within messages

frames can be exabytes in theory (the payload length field goes up to 63 bits)

While receiving streams is required to be in-spec, exposing a frame API is not, so standard implementations will receive a message before letting you consume data from it.

Receiving data as a stream would have to be a lowish level API change at both sides, so in the real world, e.g. streaming media with websockets would require you to chop it not only into reasonably sized frames (which a proxy could do) but also into reasonably sized messages.

That is, if you want a standard websocket client to consume your stream. And if you meant 'browser', then you do.

Maybe they figured we could figure out streaming by a later-standardized extension? That doesn't seem to have happened, though.

A smaller reason for 'have small messages (and by extension small frames)' is compression.

While the basic websocket spec doesn't have compression, it has extensions (and negotiation of such), meaning it effectively allows it. A few extensions have popped up[4], but the only remotely standard one is permessage-deflate[5].

Which implies compression is effectively exclusive with streaming - unless you implement it on top. (This isn't a huge issue: you'd generally only want to stream media, and probably only compress text, which is likely to never be very large and is rarely streamed.)
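The chopping into reasonably sized messages mentioned above can be sketched as a small generator. The 64KiB default is an arbitrary assumption, and chunkMessages is a made-up name; the receiver needs its own convention for knowing when the stream ends, since each chunk arrives as an independent message.

```javascript
// Yield slices of `bytes` (a Uint8Array) that are at most maxMessageSize long.
// subarray() creates views, so no data is copied.
function* chunkMessages(bytes, maxMessageSize = 64 * 1024) {
  for (let offset = 0; offset < bytes.length; offset += maxMessageSize) {
    yield bytes.subarray(offset, offset + maxMessageSize);
  }
}

// Intended use in a browser (not executed here):
//   for (const chunk of chunkMessages(data)) ws.send(chunk);
const sizes = [...chunkMessages(new Uint8Array(10), 4)].map((chunk) => chunk.length);
console.log(sizes); // [ 4, 4, 2 ]
```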

Practical issues

While intended to support a browser, and the connection should come from a browser, the ws:// / wss:// location is public, so anyone can connect to a websocket.

This is both a security issue and potentially a DoS issue.

Neither is new on the web at all; it's just that basically no existing solutions apply, so you have to do it yourself.

That said, you would probably only use WS on authenticated users, and not expose it to the world.

Auth tickets plus using WS over TLS goes a long way. And while you have to implement the auth yourself, that's not the worst.

On DoS

To deal with DoS attacks, implementations should probably drop connections with

very high message size (and implicitly fragment size) leading to high memory allocation

mentioned in the standard (10.4).

very high fragment size and/or very high fragments per message (to deal with streaming)
very low transfer rate (...while actually sending)
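As a sketch, those checks could be one function that a server calls while a message is being received. Everything here is hypothetical: the limit values, the field names in stats, and the five-second grace period before judging transfer rate are placeholders to tune, not values from the standard.

```javascript
// Hypothetical limits; tune them to what your application actually sends.
const DEFAULT_LIMITS = {
  maxMessageBytes: 1024 * 1024,
  maxFragmentBytes: 64 * 1024,
  maxFragmentsPerMessage: 256,
  minBytesPerSecond: 128,
};

// Returns a reason to drop the connection, or null if the message in progress looks fine.
// `stats` describes the message currently being received.
function violation(stats, limits = DEFAULT_LIMITS) {
  if (stats.messageBytes > limits.maxMessageBytes) return 'message too large';
  if (stats.largestFragmentBytes > limits.maxFragmentBytes) return 'fragment too large';
  if (stats.fragments > limits.maxFragmentsPerMessage) return 'too many fragments';
  const seconds = stats.elapsedMs / 1000;
  // Only judge the rate once the sender has had a moment to get going.
  if (seconds >= 5 && stats.messageBytes / seconds < limits.minBytesPerSecond) return 'transfer too slow';
  return null;
}

console.log(violation({ messageBytes: 100, largestFragmentBytes: 100, fragments: 1, elapsedMs: 10000 }));
```

Many server libraries have a maximum-payload setting that covers the first check; the others usually need code of your own or a proxy in front.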

On security

there is no auth on this separate connection (10.5)

the HTTP nature of the handshake allows some form of auth that way, but you may want to add more

browsers will help by not allowing willy-nilly connections, but that only helps connections from browsers

a WebSocket server is effectively a separate server open to the internet (exposed via a URL), so even without auth you probably want to somehow verify a connection came from a page

it's up to you to implement that, e.g. a token system

Technically, WebSockets are not necessarily restrained by the same-origin policy

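by default, anyway; the Origin: header is optional for non-browser clients. Browsers do send it, so a server can check it for connections that come from pages.

keep in mind restrictions via CSP (connect-src); CORS does not apply to WebSocket handshakes, so checking Origin is up to the server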

...making Cross-Site WebSocket Hijacking[6] (like CSRF but exposing a potentially interactive protocol) an interesting thing

session cookies help

there are also things you could do that you probably really shouldn't, like tunneling things

Much of this is only really an issue if the page that initiates a WS connection is compromised (at which point most bets are off anyway), but it's still something to keep in mind.
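A sketch of the two server-side checks mentioned above: an Origin allowlist, and a single-use ticket that an authenticated page fetches over regular HTTPS and then presents when opening the socket. All names are hypothetical, the ticket string itself must come from a cryptographically secure random source (not shown), and the 30 second lifetime is an arbitrary assumption.

```javascript
// Accept only handshakes whose Origin header is one we serve pages from.
function originAllowed(originHeader, allowedOrigins) {
  return typeof originHeader === 'string' && allowedOrigins.includes(originHeader);
}

// In-memory store of short-lived, single-use tickets.
// `now` is injectable so expiry can be tested without waiting.
function createTicketStore(now = Date.now) {
  const pending = new Map(); // ticket -> { userId, expiresAt }
  return {
    issue(ticket, userId, ttlMs = 30000) {
      pending.set(ticket, { userId, expiresAt: now() + ttlMs });
    },
    redeem(ticket) {
      const entry = pending.get(ticket);
      pending.delete(ticket); // single use, whether or not it was still valid
      if (!entry || entry.expiresAt < now()) return null;
      return entry.userId;
    },
  };
}

const store = createTicketStore();
store.issue('example-ticket', 'user-1');
console.log(store.redeem('example-ticket'), store.redeem('example-ticket'));
```

The Origin check only constrains browsers, since other clients can send any header they like; the ticket is what ties the connection to a logged-in user.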

Making things work in more places

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Shims, polyfills, and monkey patches

Shim is a generic programming term, often for whatever patch you need to make something work at all in a different environment, or to make it work well, work faster, etc.

...often implying "without altering that original thing directly".

This is not necessarily about APIs. Yet it often is, because we are often talking about making specific libraries behave as we want, and doing that at the call interface is just the most sensible spot to make that change / do that wrapping.

Polyfill seems to be coined around webdev, so polyfill usually means "a shim for a browser API".

In this context we also use 'shim', but people are inconsistent with the terms, and it can turn into one of those semantic arguments, so often both end up meaning "someone else's javascript duct-taped on", but in a nice way.

Relatedly, monkey patching refers to altering code very late, possibly at call time, to make it work as you expect.

Where shims/polyfills often make an effort to present a complete whole (e.g. a library), monkey patching is more duct tape like "hey if I disable error reporting before calling, it won't error out".

Monkey patching is sometimes involved in shims, but we try to avoid that because they can be fragile with changes in the underlying code.
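The difference is easier to see side by side. In this sketch both helper names are made up, and a plain object stands in for the environment, so nothing global is touched: a polyfill only fills in what is missing, while a monkey patch wraps something that already exists.

```javascript
// Polyfill: add an implementation only when the environment lacks one.
function polyfill(target, name, implementation) {
  if (typeof target[name] === 'function') return false; // the native version wins
  Object.defineProperty(target, name, { value: implementation, writable: true, configurable: true });
  return true;
}

// Monkey patch: replace an existing function with a wrapper around it.
// Returns a function that undoes the patch.
function monkeyPatch(target, name, wrap) {
  const original = target[name];
  target[name] = wrap(original);
  return () => { target[name] = original; };
}

const env = {}; // stand-in for window, or for some library object
polyfill(env, 'clamp', (x, lo, hi) => Math.min(hi, Math.max(lo, x)));
const restore = monkeyPatch(env, 'clamp', (original) => (...args) => {
  console.log('clamp called with', args);
  return original(...args);
});
console.log(env.clamp(5, 0, 3));
restore();
```

The patch depends on what the original function looks like today, which is the fragility mentioned above.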

---

Reasons for shims might include:

makes a newer API work in an older environment

e.g. because you want a fancy new feature, and workable fallback wherever that feature isn't available yet

makes an older API work in a newer environment

often because older code relies on it

which is why you would often call this a compatibility layer (though there is some overlap with...)

running programs on different software platforms than they were developed for

e.g. wine has a shim intercepting syscalls.

Microsoft's compatibility layer, ensuring that programs continue to function after major windows version upgrades, involves a whole lot of API interception (based on knowledge of specific programs, and/or the compatibility mode you've selected)

altering performance to an API, without changing its overall function

e.g. wikipedia's example of hardware accelerated APIs

Graceful degradation and progressive enhancement

These two exist mostly in acknowledgment that the web browser's features will never be entirely predictable, and we need a way to deal with that.

graceful degradation:

accept that you may not get a particular feature

ensure that when that feature is missing, the fallback will not break

...though it's allowed to look a little worse, not perform quite as well, be static rather than dynamic, etc.

progressive enhancement

start showing something supported by everything, then add things that make it shinier

since that probably means less-universally-supported things, only when

you know (e.g. detect) it will work

or when it falls back to just not do that shinier thing
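The "detect, or fall back to not doing it" idea can be sketched as a small loop over optional enhancements. The function and the enhancement names are hypothetical; EventSource and IntersectionObserver are real browser APIs used here only as things to detect, and a stand-in object takes the place of window.

```javascript
// Apply each enhancement whose feature test passes; anything unsupported or
// broken is skipped, leaving the baseline page as it was.
function progressivelyEnhance(env, enhancements) {
  const applied = [];
  for (const { name, detect, apply } of enhancements) {
    try {
      if (!detect(env)) continue; // not supported here: keep the baseline
      apply(env);
      applied.push(name);
    } catch (err) {
      // The enhancement broke: fall back to just not doing the shinier thing.
    }
  }
  return applied;
}

const enhancements = [
  { name: 'live-updates', detect: (env) => typeof env.EventSource === 'function', apply: () => {} },
  { name: 'lazy-images', detect: (env) => typeof env.IntersectionObserver === 'function', apply: () => {} },
];
// In a browser you would pass `window`; a stand-in object is used here.
console.log(progressivelyEnhance({ EventSource: function () {} }, enhancements));
```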

The argument for both GD and PE is sort of "having an approach to consistent UX is better than none at all".

In theory PE is an easier user-experience guarantee than GD, in that you start with something that works and is more established.

In practice,

both can be broken, depending on how you do it.

both can be poorer UX by being one reason it may take half a dozen relayouts before people can read your damn webpage text

GD and PE also make you think a little about spiders (and SEO),

e.g. if you're thinking about putting navigation in scripting only. This is why there are tricks like "generate regular HTML links, have JS strip them and convert them into event-based handling"; this has been done for many years, but these days that's often called a router.

Inline data URLs

Format:

data:[<mediatype>][;base64],<data>

Example: (source)

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNkYAAAAAYAAjCB0C8AAAAASUVORK5CYII=

In HTML:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAQAAAC1HAwCAAAAC0lEQVR42mNkYAAAAAYAAjCB0C8AAAAASUVORK5CYII=">

The inline image

will always be larger than the original data (base64 adds about a third),

saving a round trip may outweigh that, but only for extremely tiny images

will not be cached.

though when placed in CSS it's effectively cached as part of that

So arguably this is only useful for very small images, where saving a roundtrip outweighs both of those.
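A sketch of building such a URL from raw bytes, plus the size arithmetic. The helper names are made up; btoa() is the standard base64 function in browsers and is also available as a global in recent Node versions.

```javascript
// Build a base64 data URL from a Uint8Array.
function toDataUrl(mediaType, bytes) {
  let binary = '';
  // btoa() wants a string with one character per byte.
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return `data:${mediaType};base64,${btoa(binary)}`;
}

// base64 encodes every 3 bytes as 4 characters, padding the last group.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(toDataUrl('text/plain', Uint8Array.of(104, 105))); // the two bytes of "hi"
console.log(base64Length(300)); // 400
```

So a 300 byte image costs 400 characters inline, before the data:...;base64, prefix is counted.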

Notes:

Modern browsers will also treat data URLs as having unique origins - not the page's.

IE/Edge only support this for images. But that's probably what most people would use this for.

HTML5 isn't a singular standard (or, "why WHATWG isn't very related to W3C")

Audio and video

Accessibility

WCAG

Web Content Accessibility Guidelines

ARIA

Accessible Rich Internet Applications

aria-* attributes

Designing for Reader Mode

Custom attributes

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

We have had a long-standing question of "can we just add non-standard attributes to HTML?"

Note that in a lot of ways, this is less about whether it's technically valid, and more about whether it's reliable enough. Basically whether we can get away with doing it.

This is arguably four questions:

can we add them to the serialized HTML document?
can we add them to the serialized XHTML document?
can we add them to the DOM with scripting?

this is nicknamed expando attributes

will this clash with something in future standards?

The answers are roughly:

For HTML you can get away with it.

HTML4 won't validate, but nothing else bad will happen

HTML5 has no DTD to validate against; conformance checkers will still flag unknown attributes, but browsers parse them fine.

HTML5 explicitly allows custom attributes with a data- prefix (see below)

For XHTML, browsers are somewhat likelier to actively complain, where for HTML most rarely would.

maybe less so these days? Test that before doing it(verify)

In JS you could basically always get away with it.

Basically no browser will really care about DOM alterations it doesn't understand.

Apparently IE once leaked memory around expando attributes - but who cares about IE anymore?

Potential clashes are a thing, yes.

This is why HTML5 suggests data- prefix, for the practical reason that you would only clash with some other part of your own app / use, never with HTML standards.

data-* attributes

Basically, HTML5 declared that it won't use this prefix for itself in the future.

In modern browsers you can also get to it slightly more easily than an explicit getAttribute(). For example, with:

 <div id="index_html" data-indexno="3" data-col="#0f0">yay</div>

You can do:

 document.querySelector('#index_html').dataset.indexno

(and in theory in CSS with attr(), but only for content: [7])

(You can pass data to CSS more variably - see var())
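One detail worth knowing is how attribute names map to dataset keys: the data- prefix is dropped, and each hyphen followed by a lowercase letter becomes that letter in uppercase. The sketch below mimics that mapping with two made-up helper functions, so it runs without a DOM.

```javascript
// 'data-index-no' -> 'indexNo', as element.dataset would expose it.
function toDatasetKey(attributeName) {
  if (!attributeName.startsWith('data-')) return null; // not a data attribute
  return attributeName.slice(5).replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

// 'indexNo' -> 'data-index-no', the attribute that dataset.indexNo reads and writes.
function toAttributeName(datasetKey) {
  return 'data-' + datasetKey.replace(/[A-Z]/g, (letter) => '-' + letter.toLowerCase());
}

console.log(toDatasetKey('data-indexno'));  // indexno
console.log(toDatasetKey('data-index-no')); // indexNo
console.log(toAttributeName('indexNo'));    // data-index-no
```

Values always come back as strings, so the indexno in the example above would be "3" and needs Number() if you want to calculate with it.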

Practically speaking, data- attributes probably won't cause clashes within any one site or app, because you control it and can fix it.

If you're going to do something interoperable (like frameworks or libraries or web components), document it - and note that that can still be unhelpful when it means you have to change one codebase or other.

Semantic elements

Drawing things

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

CSS

Reasons for:

all in DOM
purely in standards
can do gradients
can do shapes via clipping
can do animations

keyframed from-to-stuff,

and animating properties (beyond positioning: opacity, rotation, skew, scale) and combinations

Reasons against:

can't do much more than that
some things not so implemented yet[8]

HTML5 <canvas>

Controlled by JS

Reasons for:

well supported[9]

Reasons against:

contents are not in the DOM, so avoid putting main website text in it - consider screen readers and search engines
Requires JS
not always many upsides over raster images or SVG
2D only

SVG

Reasons for:

Well supported[10]
part of the DOM means flexible (also meaning e.g. some basic :hover-style interactivity with plain CSS)
basics well supported
may compress well
scales automatically, less worry on high-DPI
can do interactivity

Reasons against

for a long time, not all effects/animations were universally supported
some CSS-SVG was similarly weird

WebGL

Is basically JS-controlled OpenGL ES2 on top of an HTML5 <canvas> (so inherits canvas' upsides/downsides)

Reasons for:

may be very smooth due to hardware acceleration

Reasons against:

those for canvas
can be heavy on CPU/GPU/battery / low-spec machines

and generally doesn't perform nearly as well as you would expect for complex things(verify)

privacy worries (mostly minor - machine fingerprinting and such)

All have basic support in modern browsers

Libraries

✎ This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

D3

D3 notes

three.js

https://threejs.org/

PlayCanvas

https://playcanvas.com/

Construct

https://www.construct.net/en

BabylonJS

Unity web
