A decade ago I was cleaning out some old stuff and ran across my (broken) PXL2000 and a box full of old cassette videos.
I don't know if you remember these, but the PXL2000 is a handheld camcorder which was unique in that it recorded onto standard audio cassette tapes.
I thought it would be nice if there were some software that would convert or decode the analog signal on the tapes to a modern movie file. Since I couldn't find any, I decided to put a little research into making some. I could just restore my PXL2000, but I'm also curious how the PXL works. :)
First I spent a couple hours reverse-engineering the raw analog signal format on my cassette tapes. Fortunately I still have a working cassette player. I just plugged this into my computer and digitized a section of tape.
I sampled the PXL2000 video data at 192 kHz and viewed it in a wave editor (a 44.1 kHz sample rate will not work). The signal was VERY high pitched.
(I boosted the video channel as much as possible without clipping.) Here are screenshots of the wave in a wave editor:
(192 kHz sample points in relation to signal size)
Summary of signal format:
1. It looks like one of the stereo channels is used for video, one is used for audio (standard analog wav format).
3. A long pulse every 92 packets probably demarcates an image frame (roughly every 0.5 sec at regular tape speed). This matches what is known about the frame rate: if the video runs at about 15 fps and the tape runs about 9x faster in the camcorder, then the data for a 92x110 video frame must fit within roughly 9/15 seconds of tape at regular speed...just not enough room for any fancy encoding. Note that the long pulse is as long as two AM packets. Note also that the sync signals are proportional to the surrounding amplitude (they seem to be exactly 5x larger than the regular signal). They may carry usable video data, but I'd think that's unlikely.
4. Looking at the medium zoom, the small pulse signal probably demarcates a row of pixels (there look to be 110 oscillations in between). The amplitude modulation between these sync signals probably describes the brightness/darkness of 110 pixels, likely brightness(i) = posterize(amplitude(i), 8). These are probably all painted/recorded in real time, as opposed to buffering the pixels for a single time-slice: if you look closely, there are occasionally sharp changes in the signal from one row to the next.
5. It is possible the rows are interlaced (note that some pixel rows appear to repeat a pattern...the halves sometimes look like they could align). The 110-oscillation packet could be split in the middle, with each half describing even and odd rows. Also, the image frames transition into the next very smoothly, which would suggest interlacing (perhaps an s-shaped path down and up?).
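The posterize guess in item 4 can be written down concretely. This is only a sketch of that guess: it assumes each pixel's amplitude has already been measured, that `full_scale` is the largest amplitude a pixel can reach, and that a larger amplitude means a brighter pixel (it could just as well be the reverse). The function names are made up for illustration.

```python
def posterize(amplitude, levels=8, full_scale=1.0):
    """Quantize an amplitude in [0, full_scale] to an integer level in [0, levels - 1]."""
    if full_scale <= 0:
        raise ValueError("full_scale must be positive")
    # Clamp so that noise or clipping cannot push a pixel outside the valid range.
    normalized = min(max(amplitude / full_scale, 0.0), 1.0)
    # The outer min() keeps amplitude == full_scale in the top level.
    return min(int(normalized * levels), levels - 1)


def row_brightness(amplitudes, levels=8, full_scale=1.0):
    """Posterize one row of per-pixel amplitudes (assumption: bigger means brighter)."""
    return [posterize(a, levels, full_scale) for a in amplitudes]
```

For example, `row_brightness([0.0, 0.5, 1.0])` gives one gray level per pixel, with 0 as the darkest and 7 as the brightest of the 8 levels.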
The video signal does not look exactly like NTSC to me...although it seems similar. The signal looks roughly like:
[long pulse signal about 230 oscillations long]
[ [AM signal 110 oscillations] [5 small pulses] [AM signal 110 oscillations] [5 small pulses] ... 92 total AM packets... ]
[long pulse signal about 230 oscillations long]
[ [AM signal 110 oscillations] [5 small pulses] [AM signal 110 oscillations] [5 small pulses] ... 92 total AM packets... ]
...repeats....
So the video signal probably maps to:
[image frame sync signal] [ [row of 110 pixels] [sync signal] [row of 110 pixels] [sync signal] ... 92 rows total ... ] [next image frame sync signal] ...and so on...
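If this layout is right, pulling the pixel rows out of one frame is mostly index arithmetic. Here is a minimal sketch, assuming the signal has already been reduced to one amplitude value per oscillation and that the list starts exactly at the long frame pulse. Both of those are assumptions, and `split_frame` is a hypothetical helper name, not part of any existing tool.

```python
def split_frame(oscillations, rows=92, pixels=110, row_sync=5, frame_sync=230):
    """Split one frame's per-oscillation amplitudes into rows of pixel amplitudes.

    Assumed layout: [frame sync] then, for each row, [pixels][row sync].
    The sync pulses after the last row are optional here, since it is not
    clear whether the final packet is followed by small pulses.
    """
    needed = frame_sync + rows * pixels + (rows - 1) * row_sync
    if len(oscillations) < needed:
        raise ValueError(f"need at least {needed} oscillations, got {len(oscillations)}")

    out = []
    pos = frame_sync  # skip the long frame pulse
    for _ in range(rows):
        out.append(list(oscillations[pos:pos + pixels]))
        pos += pixels + row_sync  # skip this row's pixels and its sync pulses
    return out
```

With the default counts, a frame needs 230 + 92 x 110 + 91 x 5 oscillations at minimum, and the result is a list of 92 rows of 110 values each.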
I was puzzled, though, why there are only 110 oscillations, as several people have reported 90x120 video. If that is true, I'd expect 90 packets of 120 oscillations -- unless I just can't count :).
Also, I looked closer at the sampled wave (@ 44.1 kHz) and noticed an odd pattern. The first two packets and the last packet of the frame have a regular wave pattern (which is more easily seen at the lower sample rate).
(Close up...the regular patterns are unlikely to hold interesting data, or they are black due to frame edge bleed.)
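As a rough sanity check on these counts, the layout implies an oscillation rate, and from that the number of sample points that land on each oscillation. This is only back-of-envelope arithmetic that takes the counts in this post at face value (230, 92, 110, 5) along with the two frame-period estimates of about 0.5 and 0.6 seconds; none of these are measured specs. It works out to roughly 2 samples per oscillation at 44.1 kHz versus roughly 9 to 11 at 192 kHz, which would fit with 44.1 kHz not being enough to resolve the signal.

```python
# Layout figures as counted in this post; they are rough counts, not measured specs.
FRAME_SYNC_OSC = 230   # long pulse, in oscillations
ROWS = 92              # AM packets per frame
ROW_OSC = 110          # oscillations per AM packet
ROW_SYNC_OSC = 5       # small pulses after each packet


def oscillations_per_frame():
    """Total oscillations in one frame, including sync pulses."""
    return FRAME_SYNC_OSC + ROWS * (ROW_OSC + ROW_SYNC_OSC)


def carrier_hz(frame_period_s):
    """Implied oscillation rate if one frame takes frame_period_s on playback."""
    return oscillations_per_frame() / frame_period_s


def samples_per_oscillation(sample_rate_hz, frame_period_s):
    """How many sample points land on each oscillation at a given sample rate."""
    return sample_rate_hz / carrier_hz(frame_period_s)


if __name__ == "__main__":
    for period in (0.5, 0.6):  # the two frame-period estimates quoted in the post
        for rate in (44_100, 192_000):
            print(f"period={period}s rate={rate}Hz: "
                  f"{samples_per_oscillation(rate, period):.1f} samples/oscillation")
```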
It looks like the long pulse might hold data, but that would be kinda silly (in my opinion). Maybe what is happening is that the tape speed changes slightly as the circuit prepares to ramp up for the large signal. Or, an edge of the image may always be dark, due to the camera.
In some cases (see the figure below) the signal is so weak that no peaks are present (perhaps my recording is just bad, but I've tried to boost the signal as much as possible), so only the large sync peaks can be reliably detected. It will be necessary to keep an average time between sync peaks for when the signal vanishes, or to always divide the packet into 110 parts.
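Both fallbacks can be sketched in a few lines. This assumes the sync peaks have already been located as sample positions by some earlier step, and that `row_start` and `row_end` bracket just the AM packet. It also uses the median spacing rather than a plain average, since a mean would be skewed by the very gaps it is meant to fill; that is a substitution on my part. The function names are hypothetical.

```python
from statistics import median


def fill_missing_syncs(sync_positions):
    """Insert estimated sync positions where a gap spans several row periods."""
    if len(sync_positions) < 3:
        return list(sync_positions)  # too few peaks to estimate a period

    gaps = [b - a for a, b in zip(sync_positions, sync_positions[1:])]
    period = median(gaps)  # typical spacing between detected sync peaks

    filled = [sync_positions[0]]
    for prev, cur in zip(sync_positions, sync_positions[1:]):
        # How many syncs appear to be missing inside this gap?
        missing = round((cur - prev) / period) - 1
        for k in range(1, missing + 1):
            # Spread the estimated syncs evenly across the gap.
            filled.append(prev + k * (cur - prev) / (missing + 1))
        filled.append(cur)
    return filled


def pixel_centers(row_start, row_end, pixels=110):
    """Divide a row into equal parts and return the center position of each part."""
    width = (row_end - row_start) / pixels
    return [row_start + (i + 0.5) * width for i in range(pixels)]
```

For example, `fill_missing_syncs([0, 100, 200, 400, 500])` estimates one extra sync at position 300, and `pixel_centers` then gives 110 positions at which to read an amplitude even when no individual peaks are visible.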