Some abstractions around programming

{{#addbodyclass:tag_prog}}
{{programming}}


Consider the classic recursive Fibonacci:
 def fib(n):
   if n < 2:
     return n
   else:
     return fib(n-1) + fib(n-2)
which is basically the mathematical definition in code form.
 
 


Now call <tt>fib(n)</tt> for n in 1, 2, 3, 4 (the n-th number in the series).


This works, but each call spends an increasing amount of work just recursing -- most of which is not only work that was already done for the previous n, but ''many'' calls within any one top-level call are actually duplicates of each other, because the -1 and -2 calls end up asking for the same n.


It turns out it's exponential ({{search|J Roberts, "How Many Recursive Calls Does a Recursive Function Make?"|the actual number is slightly funny to calculate}}), which also means the amount of time taken increases exponentially:


 import time
 for i in range(38):
   start = time.time()
   v = fib(i)
   print( 'took %.3f sec to calculate %d-th number: %d'%(time.time()-start, i, v))


took 0.000 sec to calculate 0-th number: 0
...
took 0.000 sec to calculate 10-th number: 55
...
took 0.005 sec to calculate 20-th number: 6765
...
took 0.036 sec to calculate 25-th number: 75025
...
took 0.455 sec to calculate 30-th number: 832040
...
took 4.454 sec to calculate 35-th number: 9227465
took 7.221 sec to calculate 36-th number: 14930352
took 11.837 sec to calculate 37-th number: 24157817


and soon after you can probably beat the computer with a pencil.


 
And yes, calculating it this way '''is''' somewhat contrived (and it turns out you don't want it ''anyway'') -- but if you wanted to optimize this, you would probably do it via memoization. Consider:


 answers = {}
 def fib(n):
   if n in answers:
     return answers[n]
   if n < 2:
     answer = n
   else:
     answer = fib(n-1) + fib(n-2)
   answers[n] = answer
   return answer


Now when you run the same timing tests, all answers are more or less instant, because almost all of these calls almost immediately hit on a number we already calculated and stored. Roughly, what they end up doing is what you would do on paper, or in a smarter implementation.




{{skipme|
 
The reason you ''still'' might not want to do it ''quite'' that way is that most languages have a maximum recursion depth (as a guard against programming mistakes), so you can only get so far. Say,
  fib(1000)
will complain. You can sort of work around it if you do something like:
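For example (a sketch of one way to do it, not necessarily the only one): fill the memo bottom-up, so that no single call ever recurses deeply:

 for i in range(1001):
   fib(i)            # fib(i-1) and fib(i-2) are already stored, so this stays shallow
 print( fib(1000) )
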
{{skipme|
The above still stores ''n'' things.
This is pretty cheap on PCs, but if you were more constrained, and your task is only to print the ''series'' (and not an arbitrary n-th item from the series), then you can make do with remembering only two numbers. The last two. That's the definition, after all.
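
A minimal sketch of that (<tt>fib_series</tt> is a made-up name; it prints the first ''count'' numbers):

 def fib_series(count):
   a, b = 0, 1
   for _ in range(count):
     print( a )
     a, b = b, a+b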
}}
-->




=Encapsulation, Information Hiding=
<!--
Encapsulation is often explained as bundling data, and the things that alter that data, in the same ''thing''.

OO is a common context ''now'', and classes/objects themselves the means,
yet OO is not the origin, nor the only way of doing encapsulation.
 
 
 
'''Information hiding''' is a similar idea, but more aimed at ''segregating'' the design decisions that are likely to change.
 
In OO, information hiding is often done by nesting types so that you only see the interface of the outermost type (until you dig deeper).
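
A small sketch of that in Python (the names are made up for illustration): callers only deal with <tt>Queue</tt>, so the internal node type can change freely without affecting them:

 class _Node:                  # internal detail, hidden behind Queue's interface
   def __init__(self, value):
     self.value = value
     self.next = None

 class Queue:
   def __init__(self):
     self._head = None
     self._tail = None

   def push(self, value):
     node = _Node(value)
     if self._tail:
       self._tail.next = node
     else:
       self._head = node
     self._tail = node

   def pop(self):              # assumes the queue is not empty
     node = self._head
     self._head = node.next
     if self._head is None:
       self._tail = None
     return node.value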
 
 
The terms overlap and there is no clean separation;
: some argue information hiding is the idea and encapsulation the means (and both are about compartmentalizing),
: others argue they are distinct goals that happen to overlap in how you implement them.
 
It further overlaps with the whole idea of [[interfaces]] and APIs
being the thing you talk to so that implementation changes
have no effect on clients of those interfaces.
 
 
 
 


https://en.wikipedia.org/wiki/Encapsulation_(computer_programming)
https://en.wikipedia.org/wiki/Information_hiding
-->




Some fragmented programming-related notes, not meant as introduction or tutorial.




Mathy concepts, state

Invariants

In math

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

In math, invariants are usually used to point out the concept of 'something that remains unchanged under certain transformations'.

(and in the cases where you care to use one, you often want it to be as broad as possible, saying that entire classes of transformations don't affect it)


This is potentially useful to

  • classify mathematical objects into groups, and name those groups
  • characterize certain objects.
    Say, the Pythagorean theorem is an invariant property of right triangles
    (and is still invariant under any operation that maintains the property of being a right triangle).
    Going a little further: Fundamental invariants of triangles
  • mention what to watch out for in more pragmatic use.
    Say, programmers (around graphics, geography, and more) may care to know things like:
    the area of a triangle is invariant under translation, rotation, and reflection, but not under magnification;
    angles are invariant under all of those.
  • It is sometimes useful to explain the constraints of a given problem or puzzle.
  • It's useful in various parts of topology.


https://mathworld.wolfram.com/Invariant.html


In more practical engineering

In programming

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

Invariants are expressions that do not change in value.

In some cases, that's "things that should test as true no matter when or where I test them".
In other cases, it only matters that they are true at the start and end of a function.

...the difference matters to things like recursion, and to concurrent and/or parallel execution.


Often this is a boolean expression that should stay true (in practice these often contain equality comparisons, sometimes counts, and relatively rarely more complex things).

If the invariant changes, then the code is probably wrong.


Invariants are also useful to

  • describe behaviour in broad terms,
  • describe the correctness of individual operations,
  • test these assumptions, during debugging and sometimes in production (see the sketch below).
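
A minimal sketch of that last point (the function and names are made up for illustration): assert the invariant at the boundaries of an operation, so a violation shows up close to where it was introduced:

 def transfer(accounts, src, dst, amount):
   # invariant: a transfer moves money around, but never creates or destroys it
   total_before = sum(accounts.values())
   accounts[src] -= amount
   accounts[dst] += amount
   assert sum(accounts.values()) == total_before, 'invariant violated'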


Class invariant

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

A class invariant, also known as a type invariant (and apparently an object invariant(verify)), is something that should stay true throughout an object's lifetime, regardless of its state.


This isn't any different from the above concept; it just points out that you can often think of an object as a sort of state machine that you may change with any call - and that you should think about correctness for any operation that changes its state.
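
A minimal sketch (the class and names are made up for illustration):

 class Inventory:
   # class invariant: no item ever has a negative count
   def __init__(self):
     self.counts = {}

   def _check(self):
     assert all(c >= 0 for c in self.counts.values()), 'invariant violated'

   def add(self, item, amount=1):
     self.counts[item] = self.counts.get(item, 0) + amount
     self._check()

   def remove(self, item, amount=1):
     self.counts[item] = self.counts.get(item, 0) - amount
     self._check()   # trips if we remove more than we have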

Memoization

In programming, a memo function is a function that remembers the output for a specific set of input parameters it has previously processed, and can answer from that memory rather than from calculation.

Using this as a speed/memory optimization tradeoff is called memoization.


It becomes effective when the duplicated calculation is nontrivial, and/or when a function is often called with the same input.


A memo function should not have any intentional side effects, or (in general) depend on any external state.


In some cases, memoization is something the language/syntax itself does, particularly in more functional languages (e.g. Haskell). In procedural languages it's often done with a more explicit cache, which requires some management, e.g. to keep the cache of answers in check, and possibly even be smart about retaining entries that are requested more than others.
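
In Python, for example, the standard library has a ready-made memoizing decorator, shown here on the recursive Fibonacci from earlier:

 from functools import lru_cache

 @lru_cache(maxsize=None)   # maxsize=None means the cache of answers is unbounded
 def fib(n):
   return n if n < 2 else fib(n-1) + fib(n-2)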

Side effects

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

In computing, a side effect tends to mean altering some state outside the environment under consideration.


You may see this most often in programming, where e.g.

  • a function that can only refer to its own local variables can never have side effects (...on other code) - (see also pure function)
  • a function that alters global state may easily have an effect on other runs of that same function -- or on anything else that happens to rely on that state.
  • a function that shares state with others by design tends to alter that shared state more intentionally (a small sketch below).
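
A minimal sketch of the difference (names made up for illustration):

 counter = 0

 def count_call():
   global counter     # alters state outside the function: a side effect
   counter += 1
   return counter

 def add(a, b):
   return a + b       # depends only on its arguments, alters nothing: no side effects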



More practically, you usually talk about side effects when you share state, particularly when you unintentionally share state, and particularly when this can lead to issues like race conditions.


...and also when discussing the abstractions we introduce to avoid such issues.


Such abstractions include

  • clearer/singular owners of state,
  • exclusive access to state,
  • or e.g. concurrent data structures, which exist to tame such issues.


Side effects are not necessarily negative, provided there is an implicit/explicit contract pointing out how to deal with them. For example, concurrent data structures exist in part to have things work predictably even in the presence of side effects.



https://en.wikipedia.org/wiki/Side_effect_(computer_science)

Idempotency

This article/section is a stub — some half-sorted notes, not necessarily checked, not necessarily correct. Feel free to ignore, or tell me about it.

In most contexts, the point of an idempotent operation is that it has no additional effect if it is done more than once.


In general, in machines

For a tactile example, consider an industrial machine that has a separate 'on' button and a separate 'off' button.

Regardless of the state before, if you press the 'on' button, it is going to be on afterwards.
Regardless of the state before, if you press the 'off' button, it is going to be off afterwards.
...also regardless of how often you press that same button afterwards.


In comparison, the power button on most computers toggles (and is only a request to change - unless held for 4 seconds, as per ATX specs),

...so you won't know its state afterwards entirely for sure,
so it's not idempotent.
(The fact that the action depends on the current state happens to be the reason here, but such state-dependence is not itself the defining part of non-idempotency.)


  • Stoplight crosswalk buttons are idempotent (with few exceptions),
    in that you can press them more than once, but it'll just stay in 'requested' mode until the system acts on it
    (and sometimes nullipotent, meaning the light will schedule you regardless of whether you pressed, and the button is there just to make you feel in control).
  • Elevator call and floor buttons are idempotent (with few exceptions),
    ...even if people like to hold on to their magical thinking about special modes that make it go faster.


In math, in computing


In math, a function f is idempotent when applying it again doesn't change anything: f(f(x)) = f(x). In computing, you're usually saying the effect and output don't change if the operation is done more than once (with the same input parameters).

For example

  • anything that makes no changes,
    e.g. HTTP GET, HEAD, OPTIONS, TRACE
    (though people do abuse GET - e.g. allowing state changes via GET requests, not just POST, is moderately common)
  • adding an item to, or removing an item from, a set,
    e.g. HTTP DELETE
  • various string operations, e.g. uppercasing a string

...and many more that we usually don't name specifically.
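
A quick illustration of those last two (in Python):

 s = 'Mixed Case'
 assert s.upper() == s.upper().upper()   # uppercasing twice changes nothing more

 items = {1, 2}
 items.add(3)
 items.add(3)                            # adding the same element again has no further effect
 assert items == {1, 2, 3}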


Such a property happens to be useful in design, and in talking about design, in particular around APIs, because you're typically modeling what should happen on the other end, and are usually not inspecting or policing every step.

Say, it's really useful to know that if an identical operation is sent to a server several times, it leaves the server in the same intended state.

Around updating state, this can e.g. simplify complex state management to 'just retry until it says OK'.
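
A sketch of that (do_request is a hypothetical function performing one operation against a server) - the blind retry is only safe because the operation is idempotent, so repeats cannot leave the server in a different state:

 import time

 def send_with_retry(do_request, attempts=5):
   for attempt in range(attempts):
     try:
       return do_request()
     except IOError:
       time.sleep(2 ** attempt)   # back off, then just send the same thing again
   raise RuntimeError('giving up')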


For the HTTP examples, it's just defined that way, so that HTTP clients can assume certain things.

It's not that you can't break that, by mistake or even by choice. In particular, GET is sometimes not idempotent in practice.



Around parallelism, this relates to things like reentrancy.


Scope and namespaces

Encapsulation, Information Hiding

Exceptions