Random Thoughts: pxl2000 ntsc video signal

For this post, we will look at the analog audio/video signals for the PXL 2000 camcorder, reverse engineer the signal formats, and build a working decoder.

Decoding the PXL 2000 Audio/Video Signal

I don't know if you remember these, but the PXL 2000 is a handheld camcorder which was unique in that it recorded onto standard audio cassette tapes.

My camcorder no longer works, so I thought it would be nice if there were some software that would convert or decode the analog signal on the tapes to a modern movie file. Granted, I could just fix my PXL 2000 camcorder, but I was curious how the PXL worked. :)

Since I couldn't find any existing software, I decided to put a little research into creating a decoder.

The first step is to reverse-engineer the raw analog signal format on my cassette tapes.

Fortunately I still have a working cassette player. I just plugged this into my computer and digitized a section of tape.

I sampled the PXL 2000 video data at 192 kHz and viewed it in a wave editor. (A 44.1 kHz sample rate will not work.) The signal was VERY high pitched.

I boosted the video channel as much as possible without clipping. Here are screen shots of the wave in a wave editor:

(wide zoom)

(medium zoom)

(close zoom)

(192 kHz sample points in relation to signal size)

This is good. There is a definite repeating pattern in the signal. I was a little bit familiar with NTSC (which I expected), but the signal didn't look like anything I had seen before. It looked like the PXL used its own proprietary video signal.

Summary of signal format (rough guesses):

1. One of the stereo channels is used for video, one is used for audio (looks like standard analog wav format).

2. Looking at a wide zoom, it looks like amplitude is used to store video data. The entire video signal is at a roughly constant frequency. (On my sample, there are occasionally small DC offsets in the AM... maybe because the tape is so old and it's bleeding over from the audio channel?)

3. A long pulse every 92 packets probably demarcates an image frame (roughly every 0.5 sec at regular tape speed). This matches what is known about the frame rate. If the video runs at about 15 fps, the data for a 92x110 video frame must fit within roughly 9/15 seconds of tape at regular speed (the tape runs about 9x faster in the camcorder)... there is just not enough room for any fancy encoding. Note that the long pulse is as long as two AM packets. Note also that the sync signals are proportional to the surrounding amplitude (they seem exactly 5x larger than the regular signal... they may carry usable video data, but I think that's unlikely).

4. Looking at the medium zoom, the small pulse signal probably demarcates a row of pixels (there look to be 110 oscillations in between). Amplitude modulation between these sync signals probably describes the brightness/darkness of 110 pixels, likely brightness(i) = posterize(amplitude(i), 8). These are probably all painted/recorded in real time, as opposed to buffering the pixels for a single time-slice. If you look closely, there are occasionally sharp changes in the signal from one row to the next.

5. It is possible the rows are interlaced (note that some pixel rows appear to repeat a pattern... the halves sometimes look like they could align). The 110-length packet could be split in the middle, with each half describing even and odd rows. Although: the image frames transition into the next very smoothly, which would suggest interlacing (perhaps an s-shaped path down and up?).
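The posterize guess in item 4 can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the decoder: the function names, the `max_amplitude` normalisation, and the clamping are all assumptions of mine, and 8 levels is just the rough guess from the list above.

```python
def posterize(amplitude, levels=8, max_amplitude=1.0):
    """Quantize an amplitude in [0, max_amplitude] to a gray step in 0..levels-1."""
    if max_amplitude <= 0:
        raise ValueError("max_amplitude must be positive")
    # Clamp so clipped or noisy samples still map to a valid level.
    ratio = min(max(amplitude / max_amplitude, 0.0), 1.0)
    # Scale to 0..levels, keeping the top edge inside the last bucket.
    return min(int(ratio * levels), levels - 1)


def row_to_levels(amplitudes, levels=8, max_amplitude=1.0):
    """Posterize one row of per-oscillation amplitudes (hypothetical helper)."""
    return [posterize(a, levels, max_amplitude) for a in amplitudes]
```

For a real row, `amplitudes` would hold one value per oscillation between two row sync pulses, so about 110 values.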

The video signal does not look exactly like NTSC to me...although it seems similar. The signal looks roughly like:

[long pulse signal about 230 oscillations long]
[
 [AM signal 110 oscillations] [5 small pulses] 
 [AM signal 110 oscillations] [5 small pulses] 
 ... 92 total AM packets... 
]
[long pulse signal about 230 oscillations long] 
[
 [AM signal 110 oscillations] [5 small pulses] 
 [AM signal 110 oscillations] [5 small pulses] 
 ... 92 total AM packets... 
]
...repeats....

So the video signal probably maps to:

[image frame sync signal]
[
 [row of 110 pixels] [sync signal] 
 [row of 110 pixels] [sync signal]
 ... 92 rows total ... 
]
[next image frame sync signal] 
...and so on...
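As a sanity check, the counts above can be turned into a rough carrier frequency. The sketch below only uses numbers from this post (92 rows, 110 oscillations, 5 small pulses, a 230-oscillation long pulse, about 15 fps, about 9x tape speed). It also assumes each small pulse lasts about one oscillation, which I have not verified, so treat the result as a ballpark figure.

```python
ROWS_PER_FRAME = 92        # AM packets between long pulses
OSC_PER_ROW = 110          # oscillations per AM packet
OSC_PER_ROW_SYNC = 5       # small pulses after each packet (assumed ~1 oscillation each)
OSC_PER_FRAME_SYNC = 230   # long pulse length in oscillations
FRAME_RATE = 15.0          # approximate fps in the camcorder
TAPE_SPEEDUP = 9.0         # approximate camcorder tape speed vs. a normal deck


def oscillations_per_frame():
    """Total oscillations in one frame, including row and frame sync."""
    return ROWS_PER_FRAME * (OSC_PER_ROW + OSC_PER_ROW_SYNC) + OSC_PER_FRAME_SYNC


def frame_seconds_at_playback():
    """Duration of one frame when the tape is played in a normal cassette deck."""
    return TAPE_SPEEDUP / FRAME_RATE


def carrier_hz_at_playback():
    """Rough oscillation frequency seen when digitizing at normal tape speed."""
    return oscillations_per_frame() / frame_seconds_at_playback()


def samples_per_oscillation(sample_rate_hz):
    """How many sample points land on each oscillation at a given sample rate."""
    return sample_rate_hz / carrier_hz_at_playback()


if __name__ == "__main__":
    print(oscillations_per_frame(), carrier_hz_at_playback())
    print(samples_per_oscillation(44100), samples_per_oscillation(192000))
```

Under these assumptions a frame is 10810 oscillations in 0.6 seconds, or roughly 18 kHz. That leaves fewer than 3 sample points per oscillation at 44.1 kHz and more than 10 at 192 kHz, which would fit with the lower sample rate not being usable for reading peak heights.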

Although I was puzzled why there are only 110 oscillations, as several people have reported 90x120 video. If that is true, I'd expect 90 packets of 120 oscillations -- unless I just can't count :).

Also, I looked closer at the sampled wave (@ 44.1 kHz) and noticed an odd pattern. The first two packets and last packet of the frame have a regular wave pattern (which is more easily seen at the lower sample rate).

(close up...regular patterns unlikely to hold interesting data. Or black due to frame edge bleed.)

If this is significant, it only leaves 89 regular packets for data. This is odd, since it would be hard to explain where the 90th pixel's data is stored (if there are 90 pixels).

It looks like the long pulse might hold data, but that would be kinda silly (in my opinion). Maybe what is happening is that the tape speed changes slightly as the circuit prepares to ramp up for the large signal. Or, an edge of the image may always be dark, due to the camera.

In some cases the signal is so weak that no peaks are present (perhaps just my recording is bad, but I've tried to boost the signal as much as possible). So, only the large sync peaks can be reliably detected. It will be necessary to keep an average time between sync peaks for when the signal vanishes, or always divide the packet into 110 parts. See Figure:

It appears that some signal is bleeding over from the other track. Here is a clip of the audio and video data. You can see the video appears to have bled over into the audio track and vice-versa. Plus, as a tape sits for a long time, each layer is sandwiched in a roll of tape that may transfer a magnetic signal to the next loop. This makes me think the DC offset can be ignored... there doesn't seem to be any pattern to it.
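For the weak-signal case, where only the sync peaks can be found, the two fallbacks (estimating missing sync positions from the typical spacing, and always dividing a packet into 110 parts) could look something like the sketch below. The function names are made up for illustration, and I use the median gap as a stand-in for the "average time between sync peaks" because it is less disturbed by the long gaps it is trying to repair.

```python
def fill_missing_syncs(sync_positions):
    """Insert estimated sync positions where a gap spans more than one row.

    `sync_positions` is a sorted list of sample indexes of detected row syncs.
    """
    if len(sync_positions) < 3:
        return list(sync_positions)
    gaps = [b - a for a, b in zip(sync_positions, sync_positions[1:])]
    # Median gap approximates the normal row length even if a few syncs are missing.
    typical = sorted(gaps)[len(gaps) // 2]
    filled = [sync_positions[0]]
    for prev, pos in zip(sync_positions, sync_positions[1:]):
        rows_spanned = max(1, round((pos - prev) / typical))
        step = (pos - prev) / rows_spanned
        # Add evenly spaced estimates for each sync that was not detected.
        for k in range(1, rows_spanned):
            filled.append(prev + k * step)
        filled.append(pos)
    return filled


def split_row(start, end, pixels=110):
    """Return the centre sample position of each of `pixels` equal slices.

    `start` and `end` are assumed to be the edges of the data region of one row.
    """
    width = (end - start) / pixels
    return [start + (i + 0.5) * width for i in range(pixels)]
```

For example, `fill_missing_syncs([0, 100, 200, 400, 500])` adds an estimated sync at 300, and `split_row` then gives 110 positions at which to read the amplitude even when no individual peaks are visible.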

Generally the audio/video signal makes sense, though oddly the data part seems slightly smaller than it should be. However, the pixels on a TV aren't square, and it would be difficult to count them on a TV (as it is also hard to count them on a tape signal).

Building a decoder:

1. The hardest part of building a software converter would be parsing data from the slightly damaged analog signal. The parser would need to be able to:

a. detect relative peaks (primary AM signal)
b. detect relative sync regions (regions louder than relative data)
c. extract wave audio on second track
d. handle damaged audio/video signal (missing signal, DC offset, clipping, etc)

Though, once the peaks/inflection points are extracted, I'd expect putting those back into an image would be much more straightforward.

I did test out the Java Sound API a while back, but didn't think it was stable enough to build an analog parser with (at the time).

--

Update 2014-09-08

I ran a quick test using Java to decode the video, testing with a few random (sequential) frames. This was a bit easier than I expected... I think I see my old drum set (the drums had clear heads with o-rings. I think the dark spot is the 'tone control' or whatever it's called). :) This seems to confirm the basic video format, though it needs quite a bit of tuning to clean up the sync:

I used the high and low points of the wave to construct the row, effectively doubling the number of pixels per row. So, to fix the aspect ratio, it displays each row twice. The signal was *not* interlaced (it was just coincidental that my first batch of wave samples were symmetrical).

--
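The row construction described in this update (one pixel per high or low point of the wave, each row shown twice) might look roughly like this in Python. It is a simplified sketch rather than the decoder's actual Java code: the helper names are hypothetical, and a real signal would need some noise filtering before picking extrema.

```python
def find_extrema(samples):
    """Return (index, value) for each local high and low point of the waveform."""
    extrema = []
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        # Strict on one side so a flat top or bottom is only counted once.
        is_peak = cur > prev and cur >= nxt
        is_trough = cur < prev and cur <= nxt
        if is_peak or is_trough:
            extrema.append((i, cur))
    return extrema


def extrema_to_row(extrema):
    """One pixel per high or low point, using distance from zero as the raw value."""
    return [abs(value) for _, value in extrema]


def double_rows(rows):
    """Repeat each row so the doubled pixel count per row keeps the aspect ratio."""
    doubled = []
    for row in rows:
        doubled.append(list(row))
        doubled.append(list(row))
    return doubled
```

With 110 oscillations per packet this gives about 220 raw values per row, and 92 rows become 184 displayed rows.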

Sample Decoded Video

Update 2014-09-10

I decoded a small sample of signal, and stitched the frames back together with avconv:

avconv -r 15 -i frame_%05d.png movie.flv

It is definitely my old drum set:

The black/white values are inverted from what I initially thought. A high signal is black, a low signal is white. Which I suppose makes more sense from a storage perspective... you generally won't film the sun; filming a black or dark image is more common.

Aside from tuning, now the decoder needs to parse audio (left track) and merge it with data (right track) at 15 frames/second.

--

Update 2014-09-29

As suggested by T.Ishimuni, using the first derivative of the AM signal looks better than using the straight AM signal. The straight AM signal looks a bit grainy to me, and I think is likely more distorted by DC offset. I included a patch so that the decoder can find either the first derivative (default) or direct AM signal.

--
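To show why a derivative would be less sensitive to DC offset, here is a small Python sketch. It takes one plausible reading of "first derivative", namely the swing between consecutive high and low points, which may not match exactly how the decoder computes it. The function names and the 0..255 gray scale are assumptions for illustration; the inversion (high signal is black) follows the 2014-09-10 update.

```python
def direct_am(extrema_values):
    """Straight AM: distance of each high/low point from zero."""
    return [abs(v) for v in extrema_values]


def derivative_am(extrema_values):
    """Half the swing between consecutive extrema; a constant offset cancels out."""
    return [abs(b - a) / 2.0 for a, b in zip(extrema_values, extrema_values[1:])]


def to_gray(amplitude, max_amplitude):
    """Map amplitude to 0..255 with a high signal as black and a low signal as white."""
    ratio = min(max(amplitude / max_amplitude, 0.0), 1.0)
    return int(round(255 * (1.0 - ratio)))


if __name__ == "__main__":
    clean = [4, -4, 4, -4]
    shifted = [v + 1 for v in clean]   # same wave with a DC offset of +1
    print(direct_am(shifted))          # alternates 5, 3: looks grainy
    print(derivative_am(shifted))      # steady 4.0: offset removed
```

In this toy case the straight AM values alternate between 5 and 3 once the offset is added, which would show up as graininess, while the swing stays at 4.0.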

PXL 2000 Decoder Software

I published all the code on github (GPL open source). Code and documentation is here:

https://github.com/sevkeifert/pxl-2000-decoder

This decoder can convert a PXL 2000 video signal from either a wav file or line-in to digital video. In theory, you may be able to recover signal from tapes that no longer play in a PXL 2000 camcorder (with proper boost/compression).

Screenshot:

Features:

can decode from line-in or wav file
shows preview of decoded video
brightness/contrast control
speed control
sync controls tab (allow fine-grain tuning for your specific signal)
converts video signal to png frames
resamples audio to normal speed
creates a sample avconv script (with calculated fps) that will create a video file
saves time code of each frame
offers both GUI and command line modes

Requirements:

Java JDK 6+ to compile
avconv or ffmpeg (or something similar) to merge the decoded PNGs and audio into a video format
If you use a wav file, the decoder is currently tuned for stereo 16-bit audio sampled at 192 kHz.
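If you want to inspect a capture outside the decoder, a stereo 16-bit wav file can be split into its two tracks with Python's standard `wave` module. This is a standalone sketch and not part of the decoder; `read_stereo_16bit` is a name I made up. Per the notes above, audio is expected on the left track and video on the right.

```python
import struct
import wave


def read_stereo_16bit(path):
    """Return (sample_rate, left, right) from a stereo 16-bit PCM wav file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected stereo 16-bit PCM")
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    # Samples are little-endian signed 16-bit, interleaved L, R, L, R, ...
    count = len(frames) // 2
    samples = struct.unpack("<%dh" % count, frames)
    return rate, list(samples[0::2]), list(samples[1::2])
```

This reads the whole file into memory, which is fine for a short test clip but not for a full tape side at 192 kHz.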

The stable code is all on the default "master" branch. Any other branch should be functional but is more experimental.

--

Update 10/3/2018

Michael Turvey started a new project on github, using an FFT for sync detection... a great idea! The project goal is to get the highest quality image from the analog signals. Project details are here:

https://github.com/mwturvey/pxl2000_Magic

Random Thoughts

Tuesday, August 12, 2014

PXL 2000 video signal format
