Programming in teams, working on larger systems, keeping code healthy
Keeping codebases healthy
Testing in teams and larger scales
Refactoring
Tech debt and code debt
Tech debt (/Technical debt) refers to any decision that is easier now but means doing the same work, and more, later.
Tech debt usually comes up when you decide to apply a simplified, quick-fix, incomplete solution now.
Often, but not necessarily, there is a sense that the future (re)work
will be harder and more time-consuming than doing it properly now would be
(because more things will rely on it by then).
So this lies on a scale between
- quick fix now, polish it up later
- some double work between now and then
- the work later will probably be a complete restructure
Whether committing to tech debt is a good or bad idea depends on context.
Sometimes, the ability to get other people started is worth some extra hours spent overall.
Yet there is a very good argument for more effort up front when:
- postponing means you know that the later change will have to be a complete redesign,
- and/or it will be a core part that other things build on,
- and/or the complexity now is not actually much lower
Economic comparison
Code debt
Permeation debt
Software rot and refactoring
Everything is experimental
Limits of automated testing
Which tests are most interesting, for which type of projects, and why?
Types of tests
There are more possible names for tests than most of us can remember or define on the spot.
And some are fuzzily defined, or treated as interchangeable in some contexts.
Here is some context for some of them.
Code testing (roughly smaller to larger)
Narrow-scope tests (unit test)
"is this piece of code behaving sanely on its own, according to the tests I've thought up?"
Typically for small pieces of code, often functions, behaviour within classes, etc.
Note that the narrowness is about the code being validated, not about the extent of code being executed to do so (this also relates to the 'how much to mock' discussion).
Upsides:
- can be a form of self-documentation
- example cases are often unit tests as well
- and say, if I want to dig into details, a function description saying "percent-escapes for URI use" may tell me less than assert uri_component('http://example.com:8080/foo#bar') == 'http%3A%2F%2Fexample.com%3A8080%2Ffoo%23bar' (a fuller sketch of this follows after the downsides below)
- can be particularly helpful in helper/library functions, because more other code relies on it
- a subset of unit tests are part of regression testing - "this part is fragile, and we expect future tweaks may break this again"
- forces you to think about edge cases at the most basic level
- ...around more dynamic programming
- ...around more dynamic typing
- doing this sooner rather than later avoids some predictable mistakes
- sometimes you discover edge cases you didn't think of, didn't implement correctly, and/or didn't describe very precisely, just by writing tests
- easily overstated, but at the same time, probably everyone has done this
Arguables:
- if you didn't think of a problem when writing your code, you may also not think of it in your tests.
- alleviated by shared tests
- the thoroughness you have gives you more confidence -- the thoroughness you forget will give you a false sense of confidence
- The more OO code dabbles in layers of abstractions, the more black-box it is, and the harder it is to say what or how much a test really does
- a lot of real-world bugs sit in interactions of different code, and unit tests do not test that at all
- sure, that's not their function; the point is that 'write tests' often ends up meaning 'write unit tests', with the focus not really on finding bugs
- while on paper the "try your hardest to think of everything that would break it" idea is great
- ...if you look around, a buttload of unit tests are of the "think of things you know probably work anyway" sort
- ...because a lot of people write unit tests only because someone told them sternly (often by someone who barely understands when they are useful and when not)
Downsides:
- the more it involves locking, IPC, network communication, or concurrency, or interacts with other parts of the program that have state (think OO), or with other programs that have state, the less you really test - or can even say what you have tested or not.
- such things are hard to test even with much fancier techniques
- there is no good measure of how thorough your unit tests are
- if you think code coverage is that thing, you are probably a manager, not a programmer.
- the less dynamic the behaviour, the more that unit testing converges on testing if 1 is still equal to 1
- which wastes time
- which can give a false sense of security
- the more dynamic the behaviour (in the 'execution depends on the actual input' sense), the less that adding a few tests actually prove correctness at all
- In fact, tests rarely prove correctness to begin with (because this is an extremely hard thing to do), even in the forms of TDD that TDDers would find overzealous
- most of the time, they only prove you didn't make the more obvious mistakes that you thought of
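To pick up the self-documentation example from the upsides above, here is a minimal sketch of what such narrow tests can look like. uri_component is a stand-in here (just wrapping the stdlib's urllib.parse.quote), and pytest-style bare asserts are only one of several ways to write these:

    # a sketch; uri_component is a stand-in wrapping the stdlib's quote()
    from urllib.parse import quote

    def uri_component(s):
        return quote(s, safe='')

    def test_basic_escaping():
        # arguably tells you more than a docstring saying "percent-escapes for URI use"
        assert uri_component('http://example.com:8080/foo#bar') == 'http%3A%2F%2Fexample.com%3A8080%2Ffoo%23bar'

    def test_edge_cases():
        assert uri_component('') == ''
        # already-escaped input gets escaped again - a behaviour choice worth pinning down in a test
        assert uri_component('already%20escaped') == 'already%2520escaped'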
Regression testing
"Is this still working as it always was / doing what it always did / not doing a bad thing we had previously patched up?"
Refers to any kind of test that should stay true over time.
- particularly when you expect that code to be often touched/altered,
- particularly when that code is used (implicitly) by a lot of the codebase, so bugs (or breaking changes, or false assumptions) are far reaching
Yes, any test that you do not throw away acts as a sort of regression test,
but when we call it this, it more specifically often means "a test we wrote when we fixed a nasty bug, to ensure we won't regress to that bug later" - hence the name.
Regression tests are often as simple as they need to be, and frequently a smallish set of unit tests is enough.
Upsides:
- should guard well against regressing to that specific bug
- may also help avoid emergence of similar bugs.
Arguables / downsides:
- the same specificity that avoids that regression means it's covering very little else
- ...even similar issues in the same code
- which can lead to a false sense of security
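As a sketch of what "a test we wrote when we fixed a nasty bug" often looks like in practice - the function and the bug here are made up for illustration:

    # hypothetical example: parse_size('') used to raise, which once broke config loading
    def parse_size(s):
        # stand-in for the real function that was patched
        s = s.strip()
        return int(s) if s else 0

    def test_empty_size_regression():
        # added when the bug was fixed, kept so a later refactor can't quietly bring it back
        assert parse_size('') == 0
        assert parse_size('  ') == 0
        assert parse_size(' 42 ') == 42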
Integration testing
"Does this code interact sanely with the other code / parts of the program?"
This is one or more steps up from unit tests - it takes units and smashes them together, to see if they keep working properly and interact sensibly -- rather than just work in complete isolation.
Integration tests are often the medium-sized tests you can do during general development.
You can argue they are more interesting than unit tests at finding broad problems quickly, but at the same time they may not be great at isolating them, or at finding edge cases.
In any case, they usually stop far before testing the product as a whole. (that tends to be later in the process, if at all -- these continuous-delivery days sometimes mean "user tests means deploying to users and see if they complain, right?")
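A small sketch of the idea: two made-up components that might each have their own unit tests, exercised together to check that they actually agree on what they exchange:

    import json, os, tempfile

    def parse_order(text):                      # hypothetical component one
        name, qty = text.split(',')
        return {'name': name.strip(), 'qty': int(qty)}

    class OrderStore:                           # hypothetical component two
        def __init__(self, path):
            self.path = path
        def save(self, order):
            with open(self.path, 'w') as f:
                json.dump(order, f)
        def load(self):
            with open(self.path) as f:
                return json.load(f)

    def test_parse_then_store_roundtrip():
        path = os.path.join(tempfile.mkdtemp(), 'order.json')
        store = OrderStore(path)
        store.save(parse_order('widget, 3'))
        # the interesting part is not either piece alone, but that they agree
        assert store.load() == {'name': 'widget', 'qty': 3}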
Fuzz testing
Fuzz testing, a.k.a. fuzzing, feeds in what is often largely random data, or random variation of existing data.
If the software does anything other than complain about bad input, that may reveal border cases you're not considering, and e.g. the presence of exploitable buffer overflows, injection vectors, ability to DoS, bottlenecks, etc.
Perhaps used more in security reviews, but also in some tests for robustness.
This can apply
- for relatively small bits of code, e.g. "add a random number generator to unit tests and see if it breaks",
- up to "feed stuff into the GUI field and see if it breaks".
See also:
Acceptance testing
"Are we living up to the specific list of requirements in that document over there?"
Said document classically said 'functional design' at the top.
In agile, it's probably the collection of things labeled 'user stories'.
Which can involve any type of test, though in many cases it is a fairly minimal set of tests of its overall function
and basic user interaction, and is largely unrelated to bug testing, security testing, or such.
These draw in some criticism, for various reasons.
A design document tends to have an overly narrow view of what really needs to be tested. You're not necessarily testing whether the whole actually... functions, or even acts as people expect.
The more formally it's treated, the less value is placed on people doing their own useful tests.
End-to-End testing
Basically, testing whether the flow of an application works, with a 'simulated user'.
The goal is still to test the application at mostly functional level - whether information is passed between distinct components, whether database, network, hardware, and other dependencies act as expected.
End-to-end testing is often still quite mechanical, and you might spend time specifying a bunch of test cases expected to cover likely uses and likely bugs.
This is in some ways an extension of integration testing, at a whole-application and real-world interaction level.
While you are creating a setup much as real users might see it,
it's not e.g. letting users loose to see what they break.
Tests in short release cycles
Sanity testing, Smoke testing
Caring about users
Usability testing
Accessibility testing
Load, performance, stress testing
Broadly:
- Load testing is a name that groups various others, broadly meaning 'put load on a system and see what happens'.
- Performance testing:
- seeing how fast it goes
- Stress and endurance testing:
- seeing whether it does something unexpected while doing that.
In any of these areas, as our goals get more specific, the names do too.
Longevity/endurance testing, soak testing
Longevity testing checks whether it stays well-behaved over time, to evaluate system stability under production use.
It doesn't necessarily test stress conditions, yet it gives the system a reasonable load to deal with and runs it long enough to reveal things like memory leaks, hidden problems in transaction processing, locking issues, counter overflows, and whatever other things a sequence of unit tests can't or won't easily test for.
Notes:
- Endurance testing, capacity testing, and some other names are flavour variations of this
- in modern release-often cycles, longevity testing takes too long, and is often replaced with some extrapolation and assumptions
- or, if the only purpose is finding memory leaks, a test that tries to evoke issues more specifically
Soak testing is similar in that it also asks you to run it for a long time if you can, yet specifically asks to simulate a real production load,
figuring that this may reveal issues that targeted (and usually short-term) tests do not test for.
Some people add a sense of "make the tasks realistic rather than just numerous enough", but people vary on the details.
You might go further and think about the contents of the tasks you submit, e.g. adding some repetition and some fuzz testing, to reveal hotspots and such.
The term seems borrowed from electronics, where soak testing means testing it outside of ratings (e.g. temperature) for a while to see when and how it fails.
Stress testing, recovery testing
Stress testing is basically seeing if you can break the system by attempting to overwhelm its resources.
Such conditions may bring up issues related to resource contention, memory leaks, timeouts, overflows, races, non-elegant failure, etc. that a relatively idle system may never meet.
Often done to see whether a system will merely slow down (which is to be expected), or will also start to fail on specific tasks, or become less stable in other ways.
Recovery testing often amounts to simulating limits on certain resources - CPU, but also IO, services, connectivity, memory - often to see
- whether it will recover from specific resource failures at all - seeing if it becomes slow or just falls over.
- whether stress during recovery will lead to repeated failure
- whether it will go back to a useful state afterwards
Useful in general, and perhaps even more so in modern cloudy arrangements, because on someone else's virtualized, shared infrastructure it is harder to estimate resource limits.
Performance testing
So, there is a whole area of "how fast will it go" testing.
This is, more than many other kinds of tests, an artform to do meaningfully.
In part because every benchmark represents specific uses and hidden assumptions. Many performance tests are completely unrealistic workloads, don't test degradation under load, and are in some cases done primarily to please your PR department - you know, "lies, damned lies, and benchmarks".
Performance testing: push it until it won't go faster, to see how fast that is, often related to:
- finding the minimum expectable time to handle any unit of work (e.g. with near-empty tasks)
- finding whether there are unusual variations in that time, and if so where they come from
- finding the maximum rate the system handles for real-world load
- checking whether you get a reasonable response under expectable average load (black box style - not yet caring why or why not)
- finding the maximum rate the system handles under unusually complex load
- finding which parts are the bottleneck
- also to eliminate or relieve them where possible (not so black-box; may involve a lot of detailed system diagnosis)
- finding a baseline speed to compare against in future tests, to see changes in speed
Common pitfalls in benchmarking
Measuring latency / client rate when you wanted to measure system rate
Network clients that do things sequentially are implicitly limited by network latency, because that is the time they spend just sitting around waiting.
So a client's latency is often the limiting factor for a single client's interaction rate, but in a (properly written) server this says nothing about what the server can do.
It may be that any single client can get no more than 50 req/s (because it was elsewhere on the internet), but the system you're requesting things from could reach 5000 req/s if only you pointed enough distinct clients at it.
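As a back-of-the-envelope illustration (all numbers made up):

    round_trip = 0.020                    # say, 20 ms of network latency per request
    one_client_rate = 1 / round_trip      # a sequential client tops out around 50 req/s
    server_rate = 5000                    # what the server might sustain given enough concurrency
    print(one_client_rate)                # ~50 - really a statement about the client's latency
    print(server_rate / one_client_rate)  # ~100 concurrent clients needed to approach server capacity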
Measuring rate when you wanted to measure latency
Say you do 20 small writes to a hard drive.
End-to-end, the act of going to a platter disk might take 10ms for each operation.
Yet if your set of writes was optimizable (e.g. very sequential in nature so they could get merged) it's possible that a set of operations take only a little more than 10ms overall.
If you then divide that 10ms by 20 writes and say each operation took 0.5ms, you will be very disappointed when doing each individually takes 10ms.
That 0.5ms figure may numerically be the average, but it is almost meaningless in modelling operations and their performance.
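The same example as rough arithmetic (numbers illustrative):

    writes = 20
    single_write = 0.010          # ~10 ms for one isolated small write to a platter disk
    merged_batch = 0.010          # the merged, sequential-ish batch as a whole - little more than one seek

    print(merged_batch / writes)  # 0.0005 s - numerically the average, useless as a latency estimate
    print(single_write)           # 0.010 s  - what a single write on its own actually costs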
Measuring the overhead
tl;dr:
- Overhead is very important only when it's on the same order of magnitude as the real work.
- Overhead is probably insignificant if it's an order of magnitude lower
Let's say you are testing how fast your web server requests can go.
One framework points out it can do 15000 req/sec, which is clearly multiples faster than the other one that goes at a mere 5000 req/sec. For shame!
...except via some back-of-napkin calculations, I would actually expect that to be less than a percent faster in any real-world application.
Why?
Because the difference between those two rates is on the order of 0.1 milliseconds.
Maybe those microseconds are spent doing some useful utility work that your hello world test doesn't need, but almost every real app typically does.
Maybe the faster one is a bare-bones, you-have-to-do-absolutely-everything-yourself framework, and once you do all that yourself, there is no longer a difference.
In fact, it might even be faster at that extra work, and the only criticism is that it apparently puts it in there by default.
You don't know.
But even if we assume it is 100 microseconds of pure waste - consider that many real web requests take on the order of 10ms,
because that's how long the calculation or IO for something remotely useful tends to take.
So now we're talking about the difference between responding in 10ms and 10.1ms. That difference is barely measurable - it falls away in the latency jitter.
If I cared about speed, it would be my own code I should be examining, because I can almost certainly improve it by a lot more than 100 microseconds.
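That back-of-napkin calculation, spelled out (the 10 ms of 'real work' per request is the assumption here):

    per_req_fast = 1 / 15000                      # ~0.067 ms of framework time per hello-world request
    per_req_slow = 1 / 5000                       # ~0.2 ms
    overhead_diff = per_req_slow - per_req_fast   # ~0.13 ms difference

    real_work = 0.010                             # assume ~10 ms of actual calculation/IO per real request
    relative = overhead_diff / (real_work + per_req_fast)
    print(overhead_diff * 1000)                   # ~0.13 (ms)
    print(relative)                               # ~0.013 - on the order of a percent, not "three times faster"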
Measuring the startup, hardware tuning, etc.
The more complex the thing being executed, the more likely it is that something is being loaded dynamically.
Say, if there's something as complex as tensorflow under the covers, then maybe the first loop spends 20 seconds loading a hundreds-of-MByte model into the GPU, maybe doing some up-front autotuning, and all successive loops might take 10ms to run.
Even if you pre-loaded the model, the first loop may still be 100ms..1s because of the autotuning or just because some further things weren't loaded.
Or even during the first iterations - some libraries might optimize for your hardware, at a one-time cost.
Or even during later ones - JIT compilers will often do some analysis and tweaking beyond the first iterations.
The low-brainer workaround is often just to run a bunch of loops on real-ish data, before you start actually timing things.
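A minimal sketch of that workaround - keep the warm-up iterations out of the timed loop (the workload here is just stand-in busywork):

    import time

    def predict(batch):
        # stand-in for the real work (model inference, query, ...)
        return sum(x * x for x in batch)

    batches = [list(range(10000)) for _ in range(60)]   # stand-in for realistic input data

    for b in batches[:10]:       # warm-up: pay loading/autotuning/JIT-style one-time costs here
        predict(b)

    t0 = time.perf_counter()
    for b in batches[10:]:       # so that this part measures steady-state behaviour
        predict(b)
    print((time.perf_counter() - t0) / len(batches[10:]))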
Timing inaccuracies
It's easy to do:
    time_started = now()
    do_stuff()
    time_taken = now() - time_started
...and this is perfectly valid.
Yet the shorter that do_stuff takes, the more it matters how fine-grained this now() function is, and how much overhead there is in calling that now().
The cruder and the more overhead, the less that time_taken means anything at all.
For example
- python's time.time() was meant to give the seconds-since-epoch timestamp, it just happens to do that roughly as precisely as it can on a given platform
- ...but assume that it's no better than ~10ms on windows (it can be better, but don't count on it), and ~1ms on most linux and OSX
- there is a _ns variant of time() (and various others), introduced around py3.7
- introduced to deal with precision loss due to time() returning a float, by returning an int instead
- note that while it gives nanosecond resolution, it does not promise that precision
- the precision might be down to the 100ns..250ns range on linux(verify). It's still 1ms on windows. (also note that at this scale, python execution overhead matters a lot)
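For timing code specifically, python also has time.perf_counter() / time.perf_counter_ns(), which use the highest-resolution clock available and make no promise of being a wall-clock timestamp - a better default than time.time() for this purpose. A minimal sketch:

    import time

    t0 = time.perf_counter_ns()
    result = sum(range(100_000))              # whatever you're measuring
    elapsed_ms = (time.perf_counter_ns() - t0) / 1e6
    print(elapsed_ms)                         # for very short work, call overhead and jitter still dominate

For very short fragments, the timeit module exists for exactly this job - though see the next pitfall.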
Measuring your cache more than your code
Okay, you've read the timing inaccuracies above, and wrote "repeat this small code fragment a million times and/or for at least a few seconds, and divide the time taken by the number of runs".
That is better, in that it makes the inaccuracies in the timing around that code negligible.
It is also potentially worse, in that you now easily run into another issue:
The smaller that code fragment is, the more likely it is that running it as a tight loop means it gets the bonus speed from various caches (L1, L2, RAM, page cache, memcache) - a boost that you would never get if this is never actually run in an inner loop.
When optimizing inner-loop number crunching, this may be sensible,
but if this code is never run that way in practice,
you are now testing your cache more than your code
(and not even realistically at that, because of access patterns).
When your aim was to get an idea of the speed of running it once in everyday use, you just didn't do that.
Also, shaving milliseconds off something that rarely gets run
probably isn't worth your time in the first place.
This isn't even a fault in the execution of a test, it's a fault in the setup of that test,
and in thinking about what that test represents.
Micro-benchmarking
See also
- http://agiletesting.blogspot.com/2005/02/performance-vs-load-vs-stress-testing.html
- http://www.soft.com/News/QTN-Online/qtnsep02.html
http://en.wikipedia.org/wiki/System_testing
Also relevant
Black-box versus white-box
Self-testing code
Self-testing code is code that includes some checks inside its own code.
This often amounts to
- assert() statements within a function, e.g.
- testing important invariants
- doing your own regression checks
- intentionally borking out earlier rather than later when a bug could have wide-reaching implications (e.g. around concurrency)
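A minimal sketch of what that can look like - the function and its invariants are made up for illustration:

    def rebalance(weights):
        """Scale a list of non-negative weights so they sum to 1."""
        assert all(w >= 0 for w in weights), "negative weight - likely an upstream bug"
        total = sum(weights)
        assert total > 0, "all-zero weights - bork early, while the cause is still nearby"
        result = [w / total for w in weights]
        assert abs(sum(result) - 1.0) < 1e-9   # invariant the rest of the code relies on
        return result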
https://en.wikipedia.org/wiki/Self-testing_code