Visuals DIY


Analog video notes

VGA hacking

You can mess with the pixel voltages directly.


Of the many pins, the only things that are truly required are

  • Hsync
  • Vsync
TTL signals (0..5V)
  • red
  • green
  • blue
0V for black, 0.7V for full intensity


You will typically use hsync and vsync only to set up a specific resolution, and mess only with R, G, and B.


Ideally you also account for the back porch and front porch. Not doing so will make CRTs look extra janky (which can be a feature), while LCDs may actually clean that up for you, and/or have their auto adjust fail.


Feeding in audio is mostly just a diode away (to protect from negative voltages), also because it's roughly the right voltage (consumer audio is approx 0.3V RMS).

That said, baseband audio will mostly look like lots of horizontal lines, mainly because audio frequencies are orders of magnitude slower than the pixel rate.


You can generate a basic digital VGA signal from a microcontroller, see e.g. https://hackaday.com/2013/03/29/avr-vga-generator/


More interesting things:

ewa justka, e.g. https://ewajustka.tumblr.com/post/141525276739/audio-video-synth-based-on-arduino-based-vga

http://www.jameshconnolly.com/rgb-vga-volt.html


When modes are sort of weird things

Let's assume for a moment that we're a CRT. We're bending a ray of electrons that light up the phosphors, line by line.

Hsync is the time the beam has to go to the next line.

Vsync is the time in which the beam goes from bottom right to top left for the next frame.


The porches blank the beam a little more than necessary, adding breathing room to make it just a little easier to ensure the beam isn't visible while moving.


How does a video card know what a monitor supports?

Old monitors would only support a smallish number of modes, sometimes just one.


Before DDC methods, in particular EDID, a PC would choose fairly standard modes and assume they worked, or ask the user which mode to use so it would be your fault if it didn't.

EDID in particular made things easier: the PC reads out an EEPROM in the monitor that contains basic information about timing, and that information implies a set of resolutions that are likely to work.


How does a monitor detect the mode of a signal?

With hsync and vsync being logic signals on separate pins, it is relatively simple to measure what's happening on them, so a monitor can guess which mode a signal corresponds to, or at least which of its supported modes matches most closely (monitors obviously know the timings of what they support themselves).
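
As a rough illustration of that guessing, here is a minimal sketch that picks the closest entry from a small table of sync rates. The table, the function name guess_mode, and the error measure are all made up for this example; a real monitor's firmware will do its own thing.

```python
# Hypothetical table of supported modes: name -> (hsync in kHz, vsync in Hz).
# The numbers are the commonly published timings, rounded.
SUPPORTED_MODES = {
    "640x480@60": (31.469, 59.94),
    "800x600@56": (35.156, 56.25),
    "800x600@60": (37.879, 60.32),
    "1024x768@60": (48.363, 60.00),
}

def guess_mode(hsync_khz, vsync_hz, modes=SUPPORTED_MODES):
    """Return the name of the mode whose sync rates are closest to the measured ones."""
    def relative_error(item):
        h, v = item[1]
        # Sum of relative errors; an arbitrary choice for this sketch.
        return abs(hsync_khz - h) / h + abs(vsync_hz - v) / v
    name, _ = min(modes.items(), key=relative_error)
    return name

# A slightly-off signal still lands on the nearest known mode.
print(guess_mode(31.25, 60.0))
```

Matching this loosely is also how you get the weird results mentioned below: anything vaguely near a known mode gets treated as that mode.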

CRTs would not like being driven at unusual timings, whereas LCDs are more forgiving, also in detection.


(Yes, matching the closest one too loosely sometimes makes for weird results)

(for those old enough to remember setting up and fiddling with the timings in X windows, this is why)


On sync and pixel speed - and hobby electronics applications

Most halfway decent resolutions are clocked above 50MHz, and modern resolutions a few hundred MHz. (Yes, VGA can carry 1080p -- over a short wire, anyway.)


Yet 640x480 at 60Hz is one of the slowest-clocked of the historically standard modes (namely at 25.175MHz), and also a bit of an industry standard, so you can expect even very old monitors will accept it. see timing details here


Additionally interesting is that 640x480@60's timings are closely related to NTSC. This is by design, and potentially useful for some electronics projects, in that with fairly little extra work you could output to a monitor, to e.g. composite, or to both.


For PICs and AVRs, even ~25MHz is too fast, so sending out all individual pixels can't be done directly. There's also no AVR/PIC hardware we can fully delegate such output to, so

you have to directly bit-bang GPIO,
the overhead of doing that at all means you can't practically do more than about half your own clock rate to start with,
and the rate during pixel output is fixed so you spend most CPU getting the timing right.

There's one saving grace to that: for the 640x480@60Hz mode, the time that hsync and vsync blank (meaning you don't need to send pixels) is roughly 27% of the overall time, which you can spend thinking of what the next frame will be.
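
To see where that blanking figure comes from, a quick back-of-envelope calculation. It assumes the usual totals for this mode of 800 pixel clocks per line and 525 lines per frame; the function name is just for this sketch.

```python
# Totals for 640x480@60: 800 pixel clocks per line and 525 lines per frame
# (assumed here; these are the commonly published figures).
H_VISIBLE, H_TOTAL = 640, 800
V_VISIBLE, V_TOTAL = 480, 525

def blanking_fraction(h_visible, h_total, v_visible, v_total):
    """Fraction of each frame during which no visible pixel is being sent."""
    visible = h_visible * v_visible
    total = h_total * v_total
    return 1.0 - visible / total

fraction = blanking_fraction(H_VISIBLE, H_TOTAL, V_VISIBLE, V_TOTAL)
print(f"blanked: {fraction:.1%}")
```

Note that not all of that is free time, since you still have to produce the sync pulses during it.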


The blanking region is spent with the RGB pins blanked.

On CRT, not doing that would put stripes all over the screen. LCDs are more forgiving if you don't actually blank, but blanking helps the 'auto adjust' feature to, well, work. Similar for the color burst in PAL and NTSC - you probably want to do this.


So what do we do on <25MHz microcontrollers?

HSync and VSync happen in the kHz range, so getting the monitor to go to the right mode is well within your capabilities here (possibly using timed interrupts, for accuracy).

It's only the pixels that go faster. Because the color signal is supposed to be voltages back to back, and analog, screens don't care (or know) if you hold the voltage that happens to represent a pixel for longer. It's only when you want pixels to be distinct and non-bleeding that exact timing matters.

So if you can accept wider pixels, just go at a multiple slower. You can easily change the output at half or a quarter of the rate; you just get wider pixels.

If you want to display a stored image, then the RAM you have limits the amount of pixels you can store anyway (a full screen of pixels would take ~300KB), so many people decide to also repeat lines, to get larger square pixels, and use less memory.
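
A small sketch of that memory tradeoff, assuming one stored value per displayed block of repeated pixels. The function and parameter names are made up for illustration.

```python
def framebuffer_bytes(width, height, bits_per_pixel, h_repeat=1, v_repeat=1):
    """Bytes needed when each stored pixel is shown h_repeat wide and v_repeat tall."""
    stored_w = width // h_repeat
    stored_h = height // v_repeat
    total_bits = stored_w * stored_h * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

print(framebuffer_bytes(640, 480, 8))                             # full resolution, 8 bits per pixel
print(framebuffer_bytes(640, 480, 8, h_repeat=4, v_repeat=4))     # 160x120 stored pixels
print(framebuffer_bytes(640, 480, 1, h_repeat=4, v_repeat=4))     # same, monochrome
```

The full-resolution case is the ~300KB mentioned above; the 4x-repeated cases come down to sizes that start to look plausible next to a small uC's RAM.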

Showing text is easier, because it's effectively a lookup table of ~100 possible bitmaps.
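
The lookup-table idea can be sketched like this: per scanline, you only need the text buffer plus one row byte per glyph. The two-glyph 8x8 font here is hypothetical, purely to keep the example short.

```python
# Hypothetical 8x8 font with only two glyphs.
# Each glyph is 8 row bytes; the most significant bit is the leftmost pixel.
FONT = {
    "H": [0x66, 0x66, 0x66, 0x7E, 0x66, 0x66, 0x66, 0x00],
    "I": [0x3C, 0x18, 0x18, 0x18, 0x18, 0x18, 0x3C, 0x00],
}
GLYPH_W = 8
GLYPH_H = 8

def scanline_bits(text_rows, y):
    """Return the pixels (0 or 1) of scanline y for a screen made of text rows."""
    row_of_text = text_rows[y // GLYPH_H]   # which row of characters
    glyph_row = y % GLYPH_H                 # which row within each glyph
    pixels = []
    for ch in row_of_text:
        bits = FONT[ch][glyph_row]
        pixels.extend((bits >> (GLYPH_W - 1 - x)) & 1 for x in range(GLYPH_W))
    return pixels

# Draw the text "HI" as ASCII art, one scanline at a time.
for y in range(GLYPH_H):
    print("".join("#" if p else "." for p in scanline_bits(["HI"], y)))
```

On a real uC the inner loop would typically be a table read and a shift per pixel (or an SPI peripheral doing the shifting), but the memory picture is the same: characters rather than pixels.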

For a lot of these tricks, look also to early TV-connected home computers - they did a lot of these things too, even if they usually separated video output from main work.

Another reason you can't get perfectly crisp images is that an AVR clocked at 16MHz or 20MHz doesn't divide evenly into 25.175MHz, so the timing won't align so well, but call it retro and it's fine.


More on timing

The 640x480@60Hz mode describes just the visible pixels.

Timing-wise, you could see it as something like 800x525@60Hz (which is where almost all of that 25.175MHz comes from) with a whole bunch of pixels intentionally blanked.

From that view, horizontal lines are split like, in pixel-clock units:

  • 96 sync
  • 48 back porch
  • 640 visible
  • 16 front porch
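
Summing those segments gives the line length, and from that the line time and line rate. A minimal sketch, with names made up for the example:

```python
PIXEL_CLOCK_HZ = 25_175_000

# Horizontal segments of one 640x480@60 line, in pixel-clock units.
LINE_SEGMENTS = {"sync": 96, "back_porch": 48, "visible": 640, "front_porch": 16}

def line_timing(segments, pixel_clock_hz):
    """Return (total pixel clocks, line time in microseconds, line rate in kHz)."""
    total = sum(segments.values())
    line_time_us = total / pixel_clock_hz * 1e6
    line_rate_khz = pixel_clock_hz / total / 1e3
    return total, line_time_us, line_rate_khz

total, line_time_us, line_rate_khz = line_timing(LINE_SEGMENTS, PIXEL_CLOCK_HZ)
print(total, round(line_time_us, 2), round(line_rate_khz, 2))
```

This is where the ~31kHz line rate used elsewhere on this page comes from.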


Say you make an AVR go at 20MHz.

That same line, in terms of wallclock time, sums to ~636 clocks at 20MHz, which you can e.g. split up like 76+36+512+12 (which steals some time from the porch so that pixel generation can be regular)

Since an AVR can output a value every two clocks, this gives you an effective horizontal resolution of 256 pixels
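
The conversion from pixel-clock units to CPU clocks can be checked with a few lines. This is just the arithmetic, assuming a 20MHz CPU clock and the two-clocks-per-output figure from the text; the hand-picked split is the one mentioned above.

```python
PIXEL_CLOCK_HZ = 25_175_000
CPU_HZ = 20_000_000

# One 640x480@60 line, in pixel-clock units.
LINE_SEGMENTS = {"sync": 96, "back_porch": 48, "visible": 640, "front_porch": 16}

def to_cpu_clocks(segments, pixel_clock_hz, cpu_hz):
    """Scale each segment from pixel clocks to (fractional) CPU clocks."""
    return {name: px * cpu_hz / pixel_clock_hz for name, px in segments.items()}

exact = to_cpu_clocks(LINE_SEGMENTS, PIXEL_CLOCK_HZ, CPU_HZ)
exact_total = sum(exact.values())

# The hand-picked integer split from the text, which borrows a little from the porches.
chosen = {"sync": 76, "back_porch": 36, "visible": 512, "front_porch": 12}
chosen_total = sum(chosen.values())

CLOCKS_PER_OUTPUT = 2  # assumed cost of one port write in a tight loop
effective_pixels = chosen["visible"] // CLOCKS_PER_OUTPUT

for name in LINE_SEGMENTS:
    print(f"{name}: exact {exact[name]:.1f}, chosen {chosen[name]}")
print(round(exact_total, 1), chosen_total, effective_pixels)
```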


You could also get some color, because AVR GPIO lets you set an 8-pin port at once, so with some resistor-ladder work you could do e.g. 2-bit color for the RGB channels (or basically 8-bit color if you're clever).
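
For a feel of what such a resistor ladder does, here is a sketch that computes the voltage at a color pin for a 2-bit weighted-resistor DAC. It assumes ideal 5V push-pull GPIO outputs and the monitor's 75 ohm termination to ground; the 680 and 1500 ohm values are hypothetical picks for this example, not taken from any particular project.

```python
def dac_voltage(bits, resistors, vcc=5.0, load=75.0):
    """Node voltage for GPIO pins driving one color line through resistors.

    bits[i] is 1 if the pin behind resistors[i] is high (vcc), 0 if low (ground).
    The monitor input is modeled as a resistor of `load` ohms to ground.
    """
    drive = sum((vcc if bit else 0.0) / r for bit, r in zip(bits, resistors))
    conductance = sum(1.0 / r for r in resistors) + 1.0 / load
    return drive / conductance

RESISTORS = (680.0, 1500.0)  # (MSB, LSB), hypothetical values

levels = [dac_voltage((msb, lsb), RESISTORS) for msb in (0, 1) for lsb in (0, 1)]
for code, volts in enumerate(levels):
    print(f"code {code:02b}: {volts:.3f} V")
```

With these values the four codes spread over roughly 0 to 0.7V, which is the range the color pins want.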

For a sense of magnitude: a pixel takes ~40ns, a line takes ~30us, a frame takes ~17ms


Output to TV?

NTSC is 720x480@60i, PAL is 720x576@50i.

The timing is close to that of 640x480@60; this was intentional at the time.

They're different enough that there are no simple passive conversions between the two. Yet in theory, if you respect the constraints of both, you could make something that can output to either of the two, or if you're clever, to both at the same time.

Sync signals are similar.

There are two differences: the image signal is luma and chroma (though you can get away with just luma), and on TV there's a color burst, a ~3.58MHz (NTSC) or ~4.43MHz (PAL) signal you have to generate at the right time. That takes some time away from the idle time, but is well within your capabilities.


See e.g.



On slightly faster uCs

On things like STM32s, even the simpler STM32F103, we have

72MHz of speed,
more RAM for a framebuffer,

This lets us

do 640x480 with more colors and/or detail,
manage 800x600@56Hz (36MHz pixel clock, conveniently half of a 72MHz STM32's main clock)

Particularly for the latter you now want to learn about DMA, and you still won't have enough RAM for a full res framebuffer, so projects seem to do either monochrome at higher res, or color at an effective 400x300 or so.


See e.g.

VGA (and sometimes TV) hacking

Microcontrollers don't have much memory, or spare time, to generate fine graphics.

You can have external RAM, from a few hundred KB of SPI SRAM, to more from things like e.g. http://tinyvga.com/avr-sdram-vga or the later stages of http://www.lucidscience.com/pro-vga%20video%20generator-14.aspx. But even then, putting something interesting in that buffer is challenging because of the limited CPU in simple uCs, and the fact you're using much of the time outputting pixels. This is one of the reasons that video cards quickly moved to using RAMDACs.

(Mind you, the Due can do better, see e.g. https://stimmer.github.io/DueVGA/)


Just the hsync and vsync is fairly easy to do.


For the R,G,B most take one of three approaches:

  • use PWM outputs for R,G,B
e.g. vgax uses two for just two of the colors
  • use a digital port and resistor ladder for R,G,B
typically a basic resistor ladder giving 2 bits on each channel (sometimes cleverer)
e.g. [1]
e.g. [2]
  • Monochrome is somewhat easier
e.g. http://tinyvga.com/avr-vga


They could feed through some other data, though, or just be used to set the mode without the need for a computer.


VGA and TV:

http://www.serasidis.gr/circuits/AVR_VGA/avr_vga.htm
http://www.lucidscience.com/pro-vga%20video%20generator-1.aspx


Note that for composite PAL/NTSC TV output (only) you may like things like the AD724



https://electronics.stackexchange.com/questions/23579/whats-the-simplest-way-to-generate-a-vga-signal-for-a-totally-white-screen-pre


Audio as image

The basic idea is to put audio straight onto the color pins of VGA, while having something else (uC or computer) set the mode via sync signals.


Why is it darkish

For starters, consumer audio swings roughly -0.3V .. 0.3V, while VGA color voltages are 0 .. 0.7V. So you're probably clipping the negative half of your signal, and the other half tends to be strongest in low to mid frequencies.

You'll have to bias the signal a few hundred millivolts upwards to see less black. An op amp per channel would help, and once you're at that you may be able to mix something for the third color too.
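
How much bias and gain that is can be worked out directly. This sketch assumes the audio swings -0.3V .. 0.3V and that you want to fill 0 .. 0.7V; it only models the ideal transfer function (what an op amp stage would be set up to do), not an actual circuit, and the function names are made up.

```python
def shift_params(in_min=-0.3, in_max=0.3, out_min=0.0, out_max=0.7):
    """Gain and offset (in volts) that map the input range onto the output range."""
    gain = (out_max - out_min) / (in_max - in_min)
    offset = out_min - gain * in_min
    return gain, offset

def level_shift(v_in, out_min=0.0, out_max=0.7):
    """Apply gain and offset, then clamp to what the color pin should see."""
    gain, offset = shift_params(out_min=out_min, out_max=out_max)
    v_out = gain * v_in + offset
    return min(max(v_out, out_min), out_max)

gain, offset = shift_params()
print(f"gain {gain:.3f}, offset {offset:.3f} V")
for v in (-0.3, 0.0, 0.3):
    print(f"{v:+.1f} V in -> {level_shift(v):.3f} V out")
```

So silence ends up mid-gray rather than black, and both halves of the waveform become visible.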


Why is audio horizontal stripes?

tl;dr: you get no nice patterns out of the box, mainly because

anything not a multiple of 60 will walk around like crazy
also meaning even if it is there, it will be much less visible
and music and speech, which do most of their stuff in the first few kHz, will mostly be horizontal lines a few pixels thick.
audio output is filtered to not contain anything faster than that


This one's nice to intuit, along with the math, and some physical experimentation.

So hook up a tone generator - a website will do for basic experiments. You can pretty much wire audio directly into one or two of the R,G,B lines, when you also have a 'duino or such generating just the sync pins.


Now, consider that in 640x480@60, the screen updates sixty times per second.

That means a 60Hz sine wave takes exactly one screen (1/60th of a second, ~17ms), and would look like a single overall vertical gradient.

In part because there's not enough between-line difference to be noticeable. At higher frequencies things start mattering in fewer lines, though for two adjacent lines to differ, you need a signal faster than the audible range (since lines happen at ~31kHz).


At 60Hz it will also be entirely still. Have it slightly off, e.g. 59 and 61Hz, and that gradient will seem to move around. (fiddling with a tone generator that also does sweeps, perhaps binaural combinations, may be nice here).

Multiples of 60Hz will be a multiple complete waves/gradients per frame, which'll look thinner because, well, it completes in the time of fewer lines for each full wave.


Music is mostly in the few-kHz region, so this will be many multiples of complete waves per frame, and 99+% of them won't sync up with a frame. Or with a line, because the start of each line happens at 31kHz, above audible.

This means most audio content will be a few scanlines long.
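
To put numbers on that, a small sketch that works out how a pure tone lands on the screen. It rounds the frame rate to 60Hz as the text does, and assumes 525 total lines per frame (blanked ones included); the function name is made up.

```python
FRAME_RATE_HZ = 60.0   # rounded, as in the text
TOTAL_LINES = 525      # assumed total lines per frame, including blanked ones
LINE_RATE_HZ = FRAME_RATE_HZ * TOTAL_LINES

def stripe_geometry(tone_hz):
    """Return (cycles per frame, scanlines per cycle, drift in cycles per frame)."""
    cycles_per_frame = tone_hz / FRAME_RATE_HZ
    lines_per_cycle = LINE_RATE_HZ / tone_hz
    # How far the pattern shifts between frames, as a fraction of one cycle.
    drift = cycles_per_frame - round(cycles_per_frame)
    return cycles_per_frame, lines_per_cycle, drift

for tone in (60, 61, 120, 1000, 3000):
    cycles, lines, drift = stripe_geometry(tone)
    print(f"{tone:5d} Hz: {cycles:7.2f} cycles/frame, "
          f"{lines:6.1f} lines/cycle, drift {drift:+.3f}")
```

A 1kHz tone comes out at about 31 lines per cycle, and a few kHz at around ten, which matches the "horizontal lines a few pixels thick" observation. Any nonzero drift means the stripes crawl up or down the screen.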

This is also why audio has no apparent vertical content -- a vertical line would have to be regular on the scale of 31kHz, which audio-geared DACs intentionally filter out. Even if they wouldn't, 99+% of that content won't line up with a line.

If you want thin vertical content, that also implies it needs to pulse. E.g. a single-pixel vertical line would be a ~40ns pulse (remember the pixel clock is ~25MHz) with ~30us (799px*~40ns) of nothing in between.


TV, Composite and CRT hacking

This article/section is a stub — probably a pile of half-sorted notes and is probably a first version, is not well-checked, so may have incorrect bits. (Feel free to ignore, or tell me)


Similar to VGA, but with a few differences.

https://en.wikipedia.org/wiki/Analog_television#Structure_of_a_video_signal



http://gieskes.nl/instruments/?file=3TrinsRGB1


http://www.sabinegruffat.com/Arduino-Video-Synth.html


http://hotchk155.blogspot.com/


Mixed output

HSS3jb

https://bleeplabs.com/hss3jb/
http://forum.vintagesynth.com/viewtopic.php?f=1&t=70757


http://www.coppertracesmusic.com/videodrone.html

Unsorted

https://www.maximintegrated.com/en/design/technical-documents/tutorials/1/1184.html



Modular adjacent

Visual output of CV

If you go very modular, then you may want to see your CV.

This can be as simple as a LED, or as complex as a

LED

Level meter

LM3914 - linear, e.g. for level indication

LM3915 - 3dB log, e.g. for VU

LM3916


Video modular

https://www.modulargrid.net/e/tags/view/48

https://www.muffwiggler.com/forum/viewforum.php?f=48&sid=5cef807773e20d30ca802a5e3357e964



Unsorted (devices)

Atari Video Music (C240)

A few op amps, and a custom chip that does all the work.


http://www.atarimuseum.com/videogames/dedicated/videomusic/videomusic.html

https://www.youtube.com/watch?v=INnpnJvDXDg

https://technabob.com/blog/2007/08/24/atari-video-music-forgotten-1970s-tech/#