Lookin' Sharp
"Sharpness" isn't just lines of resolution, it's total system MTF. Confused? Read on...
Written April 2000. A version of this article appeared in DV Magazine, August 2000. © 2000-2004 Adam J. Wilt
Why does the Sony TRV900 camcorder (380,000 pixels per CCD) look sharper than the VX1000 (410,000 pixels per CCD)? Sharpness is more than pixel counts or resolution; let's take a whirlwind tour through some engineering geekspeak and see what the factors really are.
Depth of modulation measures the contrast reproduced by an imaging system, typically at a specified spatial frequency. If a set of black and white lines is reproduced with black at 0% and white at 100%, the depth of modulation is 100%. If instead the black comes out at 25%, and the white is at 75%, the depth of modulation is only 50% - half the original contrast has been lost. The modulation transfer function (MTF) is a measure of the depth of modulation over a range of spatial frequencies, conceptually similar to the frequency response plot for a bit of audio gear.
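If you prefer code to prose, here's a minimal Python sketch of that calculation (the function name and percent-video conventions are mine, purely for illustration):

```python
def depth_of_modulation(black_out, white_out, black_in=0.0, white_in=100.0):
    """Reproduced contrast as a fraction of the original contrast.
    All levels are in percent video; a perfect system returns 1.0."""
    return (white_out - black_out) / (white_in - black_in)

# The example from the text: black comes out at 25%, white at 75%.
print(depth_of_modulation(25, 75))  # 0.5, i.e. 50% depth of modulation
```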
Remember that high frequencies exist not only in fine details, but also in any sharp transitions in the picture: any place things go between light and dark in a very short distance.
The scanning aperture or imaging aperture is the spot or "peephole" through which an image is read, such as the spot size of the scanning electron beam in a tube camera or the light-gathering area of a pixel in a CCD camera. (Don't confuse this with the lens aperture, a very different thing.) When the aperture is small compared to the detail being rendered, the depth of modulation will be high. If you're shooting a series of broad black and white lines, each of which spans multiple CCD pixels, most of the CCD pixels "see" the solid interior of a white or black area, and output 100% video or 0% video respectively. As detail increases in frequency and decreases in size, the lines become small enough that they don't completely cover a pixel; each pixel will see some mixture of black and white lines, and the output will no longer be pure black and white but varying shades of gray. Thus as frequency increases, depth of modulation decreases, and the MTF or aperture response curve starts sloping downwards.
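You can watch that falloff happen numerically. The sketch below (Python; the bar pattern, aperture width, and sample count are all invented for illustration) blurs a black-and-white bar pattern through a box-shaped aperture and reports how much modulation survives at each frequency:

```python
import numpy as np

def bar_modulation(line_pairs_per_ph, aperture_width, samples=50000):
    """Modulation depth left in a 0/1 bar pattern after averaging through
    a box aperture `aperture_width` picture-heights wide."""
    x = np.linspace(0.0, 1.0, samples, endpoint=False)
    bars = np.floor(2 * line_pairs_per_ph * x) % 2           # square wave
    k = max(1, int(aperture_width * samples))
    blurred = np.convolve(bars, np.ones(k) / k, mode='valid')
    return blurred.max() - blurred.min()

# With a 1/500th-picture-height aperture, coarse bars survive intact
# while fine ones fade toward gray:
for lp in (50, 100, 200, 400):
    print(lp, "lp/ph:", round(bar_modulation(lp, 1 / 500), 3))
```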
TV lines of resolution is one of the trickier numbers: it indicates a limiting horizontal resolution of the system, but it's odd in two ways.
First, a "TV line" consists of a single distinguishable detail. In our hypothetical case of alternating black and white vertical lines, each black line and each white line is a "TV line", in contrast with the more normal measurement of "line pairs" or "cycles" used in film, lens measurement, and audio worlds. Thus a resolution figure of "500 TV lines" means that 250 black and 250 white lines could be resolved: 250 line pairs, but 500 TV lines total.
Second, resolution is always "normalized" to a square screen: it's specified in TV lines per picture height (TVL/ph). This normalized measurement lets one compare the horizontal resolution of a TV system (the figure normally quoted) with the vertical resolution, which is fixed by the number of scan lines used and the kind of scanning performed (interlaced or progressive). It also eliminates any dependency on aspect ratio (4:3 or 16:9). A camera resolving 600 TV lines resolves those lines across a width of the image equal to the picture height. If the camera shoots 4:3 images, the camera can actually resolve 800 TV lines across the entire picture (4:3 times 600); if the camera shoots true 16:9 images, it resolves 1067 TV lines across its entire picture width.
This explains why true 16:9 switchable cameras list the same resolution in both 4:3 and 16:9 modes: the figures are normalized to picture height, even though more pixels per line are used in 16:9 than in 4:3. You'll sometimes see 16:9 camera specs that say things like "700 TVL in 16:9 mode, the equivalent of 930 TVL in 4:3 mode" or the like. What these chaps are saying is "look, if you took all the pixels available on the 16:9 chip and squeezed them in to make a 4:3 picture, that picture would have a resolution of 930 TVL", because it sounds more impressive that way!
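The arithmetic is easy to get wrong, so here it is in a few lines of Python (a sketch; the numbers are the ones from the text):

```python
def tvl_across_full_width(tvl_per_ph, aspect_w, aspect_h):
    """Convert TVL per picture height to TV lines across the whole picture."""
    return tvl_per_ph * aspect_w / aspect_h

print(tvl_across_full_width(600, 4, 3))    # 800.0   TVL across a 4:3 frame
print(tvl_across_full_width(600, 16, 9))   # ~1066.7 TVL across a 16:9 frame

# The marketing conversion: squeezing a 16:9 chip's pixels into a 4:3
# frame multiplies TVL/ph by (16/9) / (4/3) = 4/3.
print(700 * (16 / 9) / (4 / 3))            # ~933, quoted as "930 TVL"
```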
If that weren't bad enough, there's also the question of what sort of picture you'll really get as you approach the specified resolution figure. Remember, MTF tends to decrease as frequency increases. At least one manufacturer specifies resolution at the point where the MTF is only 5%; others may measure the point where the MTF intersects the noise floor and no detail can be seen. Limiting resolution is exactly that: the limit, not necessarily the usable limit.
Pixel counts for CCDs can be given as total or effective pixels. The total number includes the scan lines and line-end pixels that are masked from light or are used for digital image stabilization, while the effective number counts those that actually participate in forming an image. All else being equal, more is better for resolution. The number of pixels on a scan line sets an upper limit on resolution; a chip with 700 pixels in each picture-height's width of a scan line can resolve no more than 700 TVL/ph.
Many 3-chip cameras boost resolution by offsetting the green CCD half a pixel horizontally from the red and blue CCDs. This increases resolution most of the time, since most natural image elements contain significant amounts of both green and red/blue light. However, shoot something that's only green or something without any green in it, and the deprived chip(s) can't contribute any luma samples to increase resolution. The affected areas of the picture look unaccountably soft compared with the rest of the image (although you have to look closely to see the difference most of the time). Single-chip cameras suffer similarly when one or more of the colors matching the mosaic filters on the chip is missing from scene detail.
Don't confuse or compare camera pixel counts with the 720 samples/line used in most SDTV digital recording formats: there's no 1:1 relationship between them (but that's a topic for a different day).
Aliasing is an artifact in which high-frequency picture detail is erroneously shown as a lower frequency detail; you'll sometimes see this as a moiré pattern shimmering on top of fine, repetitive image details, like screen doors or picket fences running into the distance. It's similar to the shimmering colors you see on broadcast TV when someone is wearing a herringbone jacket or finely-patterned shirt, although that's really a cross-color artifact of composite video - an aliasing of luma detail as chroma information.
Aliasing is a problem whenever picture detail in TV lines exceeds the sampling frequency (as defined, for example, by the number of pixels on the CCD). [Those of you shouting, "hey! The Nyquist limit is half the sampling frequency," relax. You're right: a 700-pixel-per-picture-height camera is limited to 350 line pairs or cycles per picture height - but remember, we're talking TV lines, not line pairs; there are two TV lines per line pair. So that 700-pixel camera is limited to resolving 700 or fewer TV lines before aliasing sets in.] It's more of a problem with imagers where the aperture is small compared to the pixel size. If the aperture is infinitesimally small, it samples only a point of light falling on the pixel instead of averaging all of the light; MTF is very good, but aliasing is high. On the other hand, a large active aperture has a comparatively lousy MTF curve, but since any high-frequency detail gets averaged out to a flat gray, aliasing tends towards zero.
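Here's that trade-off in miniature (Python; the 100-pixel sensor and 70-cycle test pattern are invented for illustration). A pattern above the Nyquist limit comes back at nearly full strength when each pixel reads a single point, but gets knocked down when each pixel averages all the light falling on it:

```python
import numpy as np

pixels = 100                                  # 100 pixels per picture height
x = np.linspace(0.0, 1.0, pixels * 1000, endpoint=False)
detail = np.sin(2 * np.pi * 70 * x)           # 70 cycles/ph; Nyquist is 50

point_sampled = detail[::1000]                # tiny aperture: one point/pixel
area_sampled = detail.reshape(pixels, 1000).mean(axis=1)  # 100% aperture

# Both alias to a 30-cycle pattern, but at very different strengths:
print(round(np.abs(point_sampled).max(), 2))  # ~0.95: near full amplitude
print(round(np.abs(area_sampled).max(), 2))   # ~0.37: averaged toward gray
```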
The active aperture of a CCD pixel is always smaller than the total pixel area, and even with added microlenses the effective aperture is always less than 100% of the pixel area. More must be done to reduce aliasing: enter the optical low-pass filter. Think of it as layers of lightly scratched or ground glass mounted above the CCD so that fine detail is fuzzed out just a bit. Ideally, you'd want a "brick-wall" filter - one with a 100% MTF all the way out to the limiting frequency where aliasing begins, and with a 0% MTF above that. Unfortunately, that's not possible; filters can have steep, but not infinite, cutoffs. Designers of low-pass filters must choose between squashing aliasing at the expense of diminished high-frequency detail, and letting more detail through while tolerating a bit of aliasing. Not surprisingly, different manufacturers choose different solutions; those with higher-resolution chips cut detail a bit more, while those with softer chips typically allow more high-frequency information through and suffer from more aliasing as a result.
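In practice the optical low-pass filter is usually a birefringent plate that splits each ray into two (or four) slightly separated spots rather than literal ground glass. A quick sketch of why that works (Python; the one-pixel split distance and 100-pixel sensor are assumptions for illustration): a two-spot split of distance d has an MTF of |cos(pi * f * d)|, which hits zero right at the frequency where aliasing would begin when d equals the pixel pitch - but notice what it costs in detail below that frequency:

```python
import numpy as np

def two_spot_mtf(f_cycles_per_ph, split):
    """MTF of an idealized two-spot birefringent low-pass filter whose
    two spots sit `split` picture-heights apart."""
    return np.abs(np.cos(np.pi * f_cycles_per_ph * split))

pitch = 1 / 100                    # pixel pitch of a 100-pixel/ph sensor
for f in (10, 25, 40, 50):         # Nyquist for this sensor is 50 cycles/ph
    print(f, "cycles/ph:", round(two_spot_mtf(f, pitch), 3))
# 10 -> 0.951, 25 -> 0.707, 40 -> 0.309, 50 -> 0.0 (the null at Nyquist)
```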
Aperture correction is an electronic emphasis of higher frequencies, to compensate for reduced depth of modulation at those frequencies. Aperture correction, especially when overdone, is also known as edge enhancement. On cameras and monitors, you can often tweak it with a "sharpness" or "detail" setting. Whatever you call it, what it does is pull up the system MTF where it sags from less-than-ideal aperture response, restoring contrast and "snap" to the finer details of the image. When you overdo it, you get that harsh, "electric" or "cartoony" edging of details so characteristic of live sports or Sunday-morning talk shows; the "TV look" that's one of the prime clues that something was shot on video, not film. But still, it looks sharp, doesn't it?
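A one-dimensional caricature of aperture correction (Python; the 3-tap kernel and gain value are invented, and real cameras use fancier filters): run a peaking filter over a soft edge and watch the undershoot and overshoot appear:

```python
import numpy as np

# A soft edge: black, a gentle ramp, then white (percent video).
edge = np.concatenate([np.zeros(6), np.linspace(0, 100, 5), np.full(6, 100.0)])

k = 0.5                                    # the "detail" or "sharpness" knob
peaking = np.array([-k, 1 + 2 * k, -k])    # boosts high frequencies, sums to 1

padded = np.pad(edge, 1, mode='edge')      # avoid artifacts at the ends
sharpened = np.convolve(padded, peaking, mode='valid')

print(edge)        # ... 0, 0, 25, 50, 75, 100, 100 ...
print(sharpened)   # ... 0, -12.5, 25, 50, 75, 112.5, 100 ...  over/undershoot
```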
Which brings us to the main question: what's "sharpness"? Simply put, it's a function of the area under the system MTF curve: not just the limiting resolution, but how you get there and what the system does in the meantime. In 1948, Otto Schade defined the "equivalent line number," NE, based on the area under the square of the MTF curve. NE is not a number you'll see in specifications, but it does seem to quantify perceived sharpness fairly well. (And no, "Otto Schade" did not invent the video AGC selection on a studio camera's CCU!)
Now we can see why the TRV900 seems sharper than the VX1000: the TRV900 has fewer pixels, thus a lower limiting resolution, but more edge enhancement (a lot more). Its MTF curve doesn't go quite so far out as the VX1000's, but it's boosted in the high-frequency detail it does have, so the area under its MTF curve is greater.
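To make that concrete, here's Schade's measure applied to two made-up MTF curves (Python; the curves are invented to mimic the situation, not measured from either camera):

```python
import numpy as np

f = np.arange(0, 601)                           # TVL per picture height

mtf_more_pixels = np.exp(-f / 250)              # extends further out
mtf_boosted = np.exp(-f / 160) * (1 + f / 300)  # softer chip + detail boost
mtf_boosted[f > 500] = 0                        # hits its limit sooner

# Schade's equivalent line number: area under the *squared* MTF curve.
ne_more_pixels = (mtf_more_pixels ** 2).sum()   # ~124
ne_boosted = (mtf_boosted ** 2).sum()           # ~133: "sharper" despite less reach
print(ne_more_pixels, ne_boosted)
```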
For more information, try a good TV engineering book, like "Video Engineering" by Andrew F. Inglis or "Television Engineering Handbook" by K. Blair Benson, both from McGraw-Hill.
Puzzled by the "video AGC selection" comment? Studio engineers refer to exposure control as "shading" the camera.
Last updated 2004.01.24 - the 20th Anniversary of the Macintosh