Saturday, August 30, 2014

Ubuntu 14.04 - Unity usability issues

I've used the Unity interface for a couple years now.  Initially I wasn't a fan, though I did see potential.  It's very forward-thinking, and geared for a time when people may have thousands of software applications installed.   And, half of these applications might not be installed locally at all, but instead reside in "the cloud" (which is where a lot of data and software is going).  In this light, the interface makes a lot of sense. 

At the same time, Unity is a fairly experimental (and somewhat unpopular) interface.  Just from a marketing perspective, I think Canonical should present the Xubuntu version of Ubuntu as the "flagship" workhorse desktop, with Unity presented as the touch-friendly and futuristic/experimental interface. 

Xubuntu uses XFCE and the Whisker menu (which is simple and extremely usable).  Also, Xubuntu runs well on older machines, with a nice blend of features, design, and speed.  The Whisker menu looks like:


 (courtesy of http://gottcode.org)


In comparison, Unity's menu by default looks like:



My first impression is "that looks polished." In using it, however, there are several usability issues.
Instead of text or small icons, Unity uses large icons with large spacing.  That is fine, but very little information fits in this small pop-up window.  Notice how little is actually shown, and how much of the screen is now wasted space:



Instead, with such large icons, this screen really should be maximized by default:


However, the second problem here is that Unity uses transparency -- by default -- for the background of the menu.

The problem is that so much information bleeds through the background that I can't even tell what I'm looking at, and I can barely read the words on the right.  To me, transparency doesn't look visually appealing.  It looks like the video card has a problem and is now rendering random junk to the screen.

Transparency is not eye candy... it's really just clutter and bombards the user with millions of points of data that they have no need to see.  It honestly looks quite terrible and is not useful.  :)

The third problem is that the interface violates the concept of "put commonly used items close together."  An interface should be like a supermarket... you put the toothbrushes and toothpaste next to each other.  But instead, suppose I open the Dash and I just want to see all my multimedia applications.  Look at the path my mouse (or touch) will travel:



These buttons couldn't be placed in a worse location relative to each other.  One of the goals of Unity, I thought, was to reduce mouse/touch travel; here, however, it clearly fails.  Granted, you can use keyboard shortcuts on a desktop, though (a) it's still visually disorganized, (b) as things move towards tablets, there might not be an external keyboard, and (c) not everyone knows or wants to use keyboard shortcuts. 

Here are a couple of ideas for how some of these UX issues could be fixed.

For starters, the menu screen should have a solid background color by default, and it should be maximized by default.  

Of course, if someone wants transparency, they should have the option...

But it's not even obvious how to change the background to a solid color; you need to install the Compiz settings manager (CCSM).  Most users would expect to change something like this with a right click or a long touch.

Compare how much simpler this looks:


By comparison, the default transparent interface makes me feel like pulling the power cable out of the wall to reset the video card :)

As for the location of the buttons (bottom) and category filters (right), these should be grouped together.  

An interface should have a primary and secondary flow.  For English speakers, the page can read like a page of text, from top to bottom, left to right, mirroring the order of actions someone will perform while using the interface.

So, the bottom buttons would make more sense at top, with the filters on the left, for example: 



Here's a mockup with the filters expanded:


And for a side-by-side comparison :)



The controls that perform actions and the targets of those actions should be separated visually.  For example, in the updated interface, search controls are on the left and search results are on the right.  Information that is not needed can be hidden by default.  The whole interface has a flow: top-left to bottom-right.

I also think that, once you do the work of filtering down to a set of applications or files, there should be a way to save that set for later use, since I usually have groups of applications that I use together.  For example: a group of applications for creating audio, developing software, creating video, network tools, etc.    


Overall, there needs to be some easy way to save a set of applications as a new lens, and to assign an icon to it. 

For example:





As it is, I have never used several of the lenses that are installed by default, and I've had Unity installed for a couple of years. 

I tried using the global menus in Unity for a year or so, though ultimately they don't really save space.  What they really do is force a lot of extra mouse travel between the application and the top of the screen.  I'm glad that Unity allows the menus in the window again.

Though I also see a lot of wasted space.  For example, the title bars waste the majority of their space:


Generally I don't look at a title bar to tell the difference between a terminal, OpenOffice, or a browser.  Amazing as it sounds, I think most people can actually tell the difference between apps just by looking at the application :)

So, it makes little sense to show the title by default, only to reveal the menu options on mouse-over.  If I want to open the Edit menu, the current interface forces me to guess where "Edit" might be, wait for the real location to reveal itself, visually re-process the changed interface, and finally move to the correct location.  In other words, it's designed like a "Whack-a-mole" game.  

What might make much more sense would be to always show the title in the giant area of wasted space -- on the right.  The menu can then always be visible and functional:




That, or allow an option to show the menu by default.  I'm not really interested in seeing the titles.

Also, one other feature that I think would improve the user experience is a launch button in the Software Center.  If someone installs a piece of software, there's a 90% chance they are going to run it immediately after it installs.  Again, minimize the mouse travel by adding a launch button:



Also, if you place the menu options in the title bar, the top bar is largely just wasted space.

Really it would make more sense if the lenses (the horizontal row of buttons in Dash) were always visible on the desktop.  That way, the left column of buttons would show a list of favorite applications, whereas the top row would show groups of applications/files by type.
 
Overall, the Unity interface is usable, though the user experience is a bit rough around the edges in spots.  It violates numerous interface design principles:

* an interface should have a primary and secondary flow that mirrors the order of operations.  For example, top to bottom, left to right. 

* group commonly used items together closely

* don't waste space

* hide information that is not needed. 

* only show information when it is needed; don't keep showing it after it is needed.

* important items should be larger

* allow a user to save their work


 

Tuesday, August 12, 2014

PXL 2000 video signal format

For this post, we will look at the analog audio/video signals for the PXL 2000 camcorder, reverse engineer the signal formats, and build a working decoder.

Decoding the PXL 2000 Audio/Video Signal

I don't know if you remember these, but the PXL 2000 is a handheld camcorder that was unique in that it recorded onto standard audio cassette tapes.



My camcorder no longer works, so I thought it would be nice if there were some software that could convert or decode the analog signal on the tapes to a modern movie file.  Granted, I could just fix my PXL 2000 camcorder, but I was curious how the PXL worked. :)

Since I couldn't find any existing software, I decided to put a little research into creating a decoder.

The first step is to reverse-engineer the raw analog signal format on my cassette tapes.

Fortunately I still have a working cassette player.   I just plugged this into my computer and digitized a section of tape.

I sampled the PXL 2000 video data at 192 kHz and viewed it in a wave editor (44.1 kHz resolution will not work).  The signal was VERY high pitched.
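For anyone who wants to poke at their own tapes, reading the raw samples in Java is straightforward with javax.sound.sampled.  This is just a minimal sketch, assuming a 16-bit, stereo, 192 kHz PCM wav capture; the file name is made up:

    import java.io.File;
    import javax.sound.sampled.AudioInputStream;
    import javax.sound.sampled.AudioSystem;

    public class WavDump {
        public static void main(String[] args) throws Exception {
            // Hypothetical file name: a 192 kHz, 16-bit, stereo PCM capture of the tape.
            AudioInputStream in = AudioSystem.getAudioInputStream(new File("pxl_capture.wav"));
            System.out.println(in.getFormat());  // expect PCM_SIGNED, 192000 Hz, 16-bit, 2 channels

            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) > 0) {
                // 16-bit little-endian samples, interleaved left/right.
                for (int i = 0; i + 3 < n; i += 4) {
                    int left  = (short) ((buf[i + 1] << 8) | (buf[i] & 0xFF));
                    int right = (short) ((buf[i + 3] << 8) | (buf[i + 2] & 0xFF));
                    // ...hand the two channels (one audio, one video) to the signal parser here...
                }
            }
            in.close();
        }
    }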

I boosted the video channel as much as possible without clipping. Here are screen shots of the wave in a wave editor:



(wide zoom)


(medium zoom)




(close zoom)





(192 kHz sample points in relation to signal size)
This is good.  There is a definite repeating pattern in the signal.  I was a little bit familiar with NTSC (which is what I expected), but the signal didn't look like anything I had seen before.  It looked like the PXL used its own proprietary video signal.

Summary of signal format (rough guesses):

1. One of the stereo channels is used for video, and one is used for audio (which looks like a standard analog audio waveform).

2. Looking at a wide zoom, it looks like amplitude is used to store the video data.  The entire video signal stays at a roughly constant frequency.  (On my sample, there are occasionally small DC offsets in the AM... maybe because the tape is so old and the audio channel is bleeding over?)

3. A long pulse every 92 packets probably demarcates an image frame (roughly every 0.5 sec at regular tape speed).  This matches what is known about the frame rate: if the video runs at about 15 fps, the data for a 92x110 video frame must fit within roughly 9/15 of a second of tape at normal speed (the tape runs about 9x faster in the camcorder)... just not enough room for any fancy encoding.  Note that the long pulse is equal in length to two AM packets, and that the sync signals are proportional to the surrounding amplitude (they seem to be exactly 5x larger than the regular signal... they may hold usable video data, but I'd think it's unlikely).

4. Looking at the medium zoom, the small pulse signal probably demarcates a row of pixels (there look to be 110 oscillations in between).  The amplitude modulation between these sync signals probably describes the brightness/darkness of 110 pixels, likely brightness(i) = posterize(amplitude(i), 8) (see the sketch after this list).  These are probably all painted/recorded in real time, as opposed to buffering the pixels for a single time slice.  If you look closely, there are occasionally sharp changes in the signal from one row to the next.

5. It is possible the rows are interlaced (note that some pixel rows appear to repeat a pattern... the halves sometimes look like they could align).  The 110-oscillation packet could be split in the middle, each half describing even and odd rows.  Also, each image frame transitions into the next very smoothly, which would suggest interlacing (perhaps an s-shaped path down and up?).
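As a quick illustration of the posterize() guess in item 4, quantizing a peak amplitude into 8 gray levels might look like this (just a sketch; the method and parameter names are mine):

    // Rough sketch of the brightness(i) = posterize(amplitude(i), 8) guess.
    // amplitude is one pixel's peak height; maxAmplitude is the largest peak
    // seen in the surrounding rows (both names are hypothetical).
    static int posterize(double amplitude, double maxAmplitude, int levels) {
        double normalized = Math.max(0.0, Math.min(1.0, amplitude / maxAmplitude));
        int level = (int) Math.round(normalized * (levels - 1));  // 0 .. levels-1
        return level * 255 / (levels - 1);                        // back to a 0..255 gray value
    }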


The video signal does not look exactly like NTSC to me...although it seems similar. The signal looks roughly like:









[long pulse signal about 230 oscillations long]
[
 [AM signal 110 oscillations] [5 small pulses] 
 [AM signal 110 oscillations] [5 small pulses] 
 ... 92 total AM packets... 
]
[long pulse signal about 230 oscillations long] 
[
 [AM signal 110 oscillations] [5 small pulses] 
 [AM signal 110 oscillations] [5 small pulses] 
 ... 92 total AM packets... 
]
...repeats....

So the video signal probably maps to:



[image frame sync signal]
[
 [row of 110 pixels] [sync signal] 
 [row of 110 pixels] [sync signal]
 ... 92 rows total ... 
]
[next image frame sync signal] 
...and so on...
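Assuming that layout is right, the inner loop of a decoder would just walk the row packets and slice each one into 110 pixels.  A minimal sketch (the 92/110 counts are my estimates, and rowSync would come from a separate sync-detection pass):

    // Sketch of turning the guessed layout into pixels.  rowSync[r] and
    // rowSync[r + 1] bracket row r's AM packet in the sample array; each packet
    // is split evenly into 110 slots and the strongest peak in a slot becomes a pixel.
    static final int ROWS = 92;
    static final int COLS = 110;

    static int[][] decodeFrame(double[] samples, int[] rowSync) {
        int[][] frame = new int[ROWS][COLS];
        for (int row = 0; row < ROWS; row++) {
            int start = rowSync[row], end = rowSync[row + 1];
            for (int col = 0; col < COLS; col++) {
                int a = start + (end - start) * col / COLS;
                int b = start + (end - start) * (col + 1) / COLS;
                double peak = 0;
                for (int i = a; i < b; i++) {
                    peak = Math.max(peak, Math.abs(samples[i]));  // envelope of the AM packet
                }
                frame[row][col] = (int) Math.round(peak * 255);   // assumes samples normalized to [-1, 1]
            }
        }
        return frame;
    }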

I was puzzled, though, about why there are only 110 oscillations, as several people have reported 90x120 video.  If that is true, I'd expect 90 packets of 120 oscillations -- unless I just can't count :).


I also looked closer at the sampled wave (at 44.1 kHz) and noticed an odd pattern.  The first two packets and the last packet of each frame have a regular wave pattern (which is more easily seen at the lower sample rate).




(close up...regular patterns unlikely to hold interesting data.  Or black due to frame edge bleed.)

If this is significant, it only leaves 89 regular packets for data.  This is odd, since it would be hard to explain where the 90th row's data is stored (if there really are 90 rows).

It looks like the long pulse might hold data, but that would be kinda silly (in my opinion). Maybe what is happening is that the tape speed changes slightly as the circuit prepares to ramp up for the large signal. Or, an edge of the image may always be dark, due to the camera.




In some cases the signal is so weak that no peaks are present (perhaps my recording is just bad, but I've tried to boost the signal as much as possible).  So, only the large sync peaks can be reliably detected.  It will be necessary to keep a running average of the time between sync peaks for when the signal vanishes, or to always divide each packet into 110 parts.  See the figure:
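A sketch of that bookkeeping: keep a smoothed running average of the gap between sync peaks, and when a sync can't be found, assume the next row starts one average gap later.  The names, the "-1 means not found" convention, and the smoothing factor are made up:

    // Sketch: track the average spacing between sync peaks so a missing sync
    // can be interpolated when the signal fades out.
    static double averageGap = 0;

    static int nextRowStart(int lastSyncPos, int detectedSyncPos) {
        if (detectedSyncPos >= 0) {  // a sync peak was found
            double gap = detectedSyncPos - lastSyncPos;
            averageGap = (averageGap == 0) ? gap : 0.9 * averageGap + 0.1 * gap;  // smoothed running average
            return detectedSyncPos;
        }
        // No sync detected (caller passes -1): assume the row starts one average gap later.
        return lastSyncPos + (int) Math.round(averageGap);
    }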

 
It appears some signal is bleeding over from the other track.  Here is a clip of the audio and video data.  You can see the video appears to have bled over into the audio track and vice-versa.  Plus, as a tape sits for a long time, each loop is sandwiched in the roll and may transfer a magnetic signal to the next loop.  This makes me think the DC offset can be ignored... there doesn't seem to be any pattern to it.

Generally the audio/video signal makes sense, though oddly the data part seems slightly smaller than it should be.  However, the pixels on a TV aren't square, and it would be difficult to count them on a TV (just as it is hard to count them on the tape signal).

Building a decoder:

1. The hardest part of building a software converter will be parsing data from the slightly damaged analog signal.  The parser will need to be able to
    a. detect relative peaks (the primary AM signal) -- sketched below
    b. detect relative sync regions (regions louder than the surrounding data)
    c. extract the wave audio on the second track
    d. handle a damaged audio/video signal (missing signal, DC offset, clipping, etc.)

Though, once the peaks/inflection points are extracted, I'd expect putting them back into an image to be much more straightforward.
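For (a), a simple relative peak finder is probably enough to start with: a sample that is larger than both neighbours and above some threshold counts as a peak.  Sync regions (b) could then be found the same way with a threshold roughly 5x the typical data peak.  A sketch, with names of my own choosing:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of (a): find local peaks of the rectified signal that rise above
    // a relative threshold.  The caller picks the threshold (e.g. a fraction of
    // the recent average peak for data, or ~5x that for sync regions).
    static List<Integer> findPeaks(double[] samples, int from, int to, double threshold) {
        List<Integer> peaks = new ArrayList<Integer>();
        for (int i = from + 1; i < to - 1; i++) {
            double v = Math.abs(samples[i]);
            if (v > threshold
                    && v >= Math.abs(samples[i - 1])
                    && v >= Math.abs(samples[i + 1])) {
                peaks.add(i);  // local maximum
            }
        }
        return peaks;
    }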
I did test out the Java Sound API a while back, but didn't think it was stable enough to build an analog parser with (at the time).
--
Update 2014-09-08
I ran a quick test using Java to decode the video, testing with a few random (sequential) frames.  This was a bit easier than I expected... I think I see my old drum set (the drums had clear heads with o-rings; I think the dark spot is the 'tone control', or whatever it's called).  :)
This seems to confirm the basic video format, though it needs quite a bit of tuning to clean up the sync:




 
I used the high and low points of the wave to construct each row, effectively doubling the number of pixels per row.  So, to fix the aspect ratio, the decoder displays each row twice.  The signal was *not* interlaced (it was just coincidental that my first batch of wave samples was symmetrical).
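Roughly, the row construction looks like this: take the positive peak and the negative trough in each pixel slot as two adjacent pixels, and let the caller paint the finished row twice.  A sketch (the names are mine, not the actual decoder's):

    // Sketch: use both the high and the low point of each slot as pixels,
    // doubling the horizontal resolution; the caller paints the row twice
    // so the aspect ratio comes out roughly right.
    static int[] decodeRow(double[] samples, int start, int end, int cols) {
        int[] row = new int[cols * 2];
        for (int col = 0; col < cols; col++) {
            int a = start + (end - start) * col / cols;
            int b = start + (end - start) * (col + 1) / cols;
            double high = 0, low = 0;
            for (int i = a; i < b; i++) {
                high = Math.max(high, samples[i]);
                low  = Math.min(low, samples[i]);
            }
            row[col * 2]     = (int) Math.round(high * 255);   // assumes samples in [-1, 1]
            row[col * 2 + 1] = (int) Math.round(-low * 255);
        }
        return row;
    }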
--

Sample Decoded Video

Update 2014-09-10
 
I decoded a small sample of signal, and stitched the frames back together with avconv:
    
        avconv -r 15 -i frame_%05d.png movie.flv
It is definitely my old drum set:



The black/white values are inverted from what I initially thought: a high signal is black, and a low signal is white.  I suppose that makes more sense from a storage perspective... you generally won't film the sun; filming a black or dark image is more common.
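So the amplitude-to-gray mapping simply flips (a one-method sketch, assuming the amplitude has already been normalized to [0, 1]):

    // High signal = black, low signal = white: invert the gray value.
    static int toGray(double normalizedAmplitude) {
        return 255 - (int) Math.round(normalizedAmplitude * 255);
    }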
Aside from tuning, the decoder now needs to parse the audio (left track) and merge it with the video data (right track) at 15 frames/second.  
--
Update 2014-09-29
As suggested by T.Ishimuni, using the first derivative of the AM signal looks better than using the straight AM signal.  The straight AM signal looks a bit grainy to me, and I think it is more distorted by DC offset.  I included a patch so that the decoder can use either the first derivative (the default) or the direct AM signal.
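For reference, the first-derivative option amounts to differencing adjacent samples before measuring peaks, which throws away slow DC drift.  A sketch (not the decoder's actual code):

    // Sketch: the first derivative (difference of adjacent samples) removes
    // slow DC drift before the envelope/peaks are measured.
    static double[] firstDerivative(double[] samples) {
        double[] d = new double[samples.length];
        for (int i = 1; i < samples.length; i++) {
            d[i] = samples[i] - samples[i - 1];
        }
        return d;
    }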
--

PXL 2000 Decoder Software

I published all the code on GitHub (GPL open source).  Code and documentation are here:


This decoder can convert a PXL 2000 video signal, from either a wav file or line-in, to digital video.  In theory, you may be able to recover the signal from tapes that no longer play in a PXL 2000 camcorder (with proper boost/compression).

Screenshot:



Features:

  • can decode from line-in or wav file
  • shows preview of decoded video
  • brightness/contrast control
  • speed control
  • sync controls tab (allows fine-grained tuning for your specific signal)
  • converts video signal to png frames
  • resamples audio to normal speed
  • creates a sample avconv script (with the calculated fps) that will produce a video file
  • saves time code of each frame
  • offers both GUI and command line modes

Requirements:

  • Java JDK 6+ to compile
  • You'll need something like avconv or ffmpeg to merge the decoded PNGs and audio into a video format. 
  • If you use a wav file, the decoder is currently tuned for stereo 16-bit audio sampled at 192 kHz.
The stable code is all in the default "master" branch.  Any other branch should be functional but is more experimental.
--
Update 2018-10-03

Michael Turvey started a new project on GitHub, using an FFT for sync detection... a great idea!  The project goal is to get the highest quality image from the analog signals.  Project details are here:

https://github.com/mwturvey/pxl2000_Magic