HD Cinematography

Cinematographer Iris Ng

Video sensors

The video sensor is the element that picks up the image so that it can be converted to data. There are two types of sensors in general use: CCD (charge-coupled device) and CMOS (complementary metal oxide semiconductor). A CCD is a surface containing millions of capacitors that convert light into an electrical charge (electrons); CCDs are measured in megapixels, which refers to the number of capacitors they have. CMOS sensors use transistors at the level of each pixel to amplify the electrical charge. CMOS sensors use less power than CCD sensors, while CCD sensors tend to produce higher-quality pixels.

Three-chip cameras (using three CCDs) have a separate chip for each of the video primary colors. They take separate readings of the red, green, and blue values for each pixel and produce a higher-resolution image.

Bayer filter systems are single-sensor systems that capture only one-third of the color information for each pixel. The other two-thirds must be estimated using an algorithm to "fill in the gaps," resulting in a lower effective resolution. One-chip cameras use Bayer filters in their sensors.
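To make the one-third figure concrete, here is a minimal Python sketch (not any camera's actual firmware) of a standard RGGB Bayer layout, where each photosite records a single color channel and the other two must be interpolated from neighboring photosites:

def bayer_channel(row, col):
    """Return which single color channel the photosite at (row, col) samples."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

# For a 4x4 patch, each pixel has one measured channel; the other two
# must be estimated ("demosaiced") from neighboring photosites.
for r in range(4):
    print(" ".join(bayer_channel(r, c) for c in range(4)))
# R G R G
# G B G B
# R G R G
# G B G B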

Interlaced vs. Progressive Video Signals

Video signals can be interlaced or progressive. Interlacing was created as a way to reduce flicker on cathode ray tube displays. The horizontal scan lines were divided into two fields, the odd-numbered lines and the even-numbered lines. All of the odd-numbered lines were scanned first, and then the beam returned to the top of the frame and scanned all of the even-numbered lines. This double-scanning of the image occurred 30 times per second, or 60 interlaced half-frames (fields) per second. This resulted in a smoother, crisper image, but that image can reveal the interlacing if you put your video online. Interlaced video produces a freeze frame with interlacing artifacts, called "combing."

Progressive video is scanned line by line in order. The frame is not broken up into separate fields. It creates one complete uninterrupted image. The image is recorded more slowly than an interlaced image and has a higher vertical resolution than interlaced video. Slightly slower scanning results in a softer image that many feel more closely resembles a film image. Progressive video produces a clean freeze frame.
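As a rough illustration of the difference, the following Python sketch (using NumPy, with a made-up frame size and placeholder field data) weaves two interlaced fields back into one full frame. Because the two fields are captured a fraction of a second apart, motion between them is what produces the combing artifact described above:

import numpy as np

height, width = 480, 640
field_odd = np.zeros((height // 2, width))   # the odd-numbered scan lines (1, 3, 5, ...)
field_even = np.ones((height // 2, width))   # the even-numbered scan lines (2, 4, 6, ...)

frame = np.empty((height, width))
frame[0::2] = field_odd    # rows 0, 2, 4, ... carry the odd-numbered lines
frame[1::2] = field_even   # rows 1, 3, 5, ... carry the even-numbered lines

# Any subject movement during the 1/60 second between the two fields shows
# up in this woven frame as the "combing" artifact.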


Video Formats

The term "format" can refer to several things: the type of camera (SD or HD), the aspect ratio (4:3 or 16:9), the compression used, the file type created (.mov, .avi, etc.), and the type of data container or codec.

Here’s a link to good information about file formats: http://www.dpbestflow.org/Video_Format_Overview


NTSC, PAL, and SECAM

NTSC is the television system used in North America, Japan, and the western countries of South America, with 525 horizontal scan lines. PAL is used in Brazil and other eastern countries of South America, most of Europe, Iceland, Australia, India, and western African countries, with 625 horizontal scan lines. SECAM is another system used by France, eastern African countries, and Russia. SECAM processes color information differently from PAL and NTSC. Many countries are migrating their formats from SECAM to PAL.

Professional digital editing software accepts either PAL or NTSC formatted footage for editing. You cannot mix these formats in the same project.

Standard Definition and High Definition

Standard definition video has 525 scanned lines from top to bottom of frame (625 for PAL video).

High definition video has at least 720 scanned lines from top to bottom of frame.

Digital formats are defined by how many scan lines there are in a frame and how many pixels there are in each line. "1080 x 1920" refers to 1080 scan lines of 1920 pixels each per frame. Resolution is commonly expressed by the number of scan lines alone: "720" or "1080."

Formats such as 2K, 4K, 8K, etc. refer to formats higher in resolution than 1080. The “K” refers to 1024 pixels across the frame horizontally. A 2K image has 2048 pixels across the frame. 4K has 4096 pixels across the frame, etc.
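As a quick back-of-the-envelope illustration, this Python sketch totals the pixels per frame for some of the formats named above (the 2K and 4K sizes here assume the full DCI frame dimensions):

formats = {
    "720 (HD)":  (1280, 720),
    "1080 (HD)": (1920, 1080),
    "2K (DCI)":  (2048, 1080),
    "4K (DCI)":  (4096, 2160),
}

for name, (width, height) in formats.items():
    pixels = width * height
    print(f"{name}: {width} x {height} = {pixels:,} pixels (~{pixels / 1e6:.1f} MP)")
# e.g. 1080 (HD): 1920 x 1080 = 2,073,600 pixels (~2.1 MP)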


Digital File Types

Container files: QuickTime/MOV, MPEG-4, AVI, AVCHD, DivX. Each of these container files holds the video and audio data.

Codecs: Codecs create the data files, and different containers are compatible with different codecs. The cameras we use in the class are AVCHD cameras, which are compatible with the H.264 codec.


Video Compression

When you shoot video, the data is compressed so that it travels quickly. Video compression involves reducing identical information within a sequence of frames and storing only the information that is different. Lossless compression enables the original image to be restored. Lossy compression does just that: it loses some of the information in the image. The image quality difference is often hard to see.

In order to create smaller files of your footage, the camera will do “chroma sub-sampling,” which means that it gets rid of some of your color information to allow for more efficient file storage.

RAW video is a sequence of files captured at whatever frame rate the camera is set to and stored as raw data. The files are uncompressed and very large: about 7GB per minute of footage, or over 400GB per hour. The files need to be processed before you can view them on the computer and edit with them. Most RAW cameras have built-in processing so you can view your images in the camera.
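For planning storage, here is a rough Python sketch using the approximate 7GB-per-minute figure above (the 90-minute shoot length is just an example):

gb_per_minute = 7            # approximate uncompressed RAW data rate quoted above
shoot_minutes = 90           # hypothetical: a 90-minute shooting day

total_gb = gb_per_minute * shoot_minutes
print(f"{shoot_minutes} min of RAW is roughly {total_gb} GB ({total_gb / 1000:.2f} TB)")
# 90 min of RAW is roughly 630 GB (0.63 TB)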


Bitrate

Bitrate is the rate of the stream of computer bits representing one second of media. Bitrate matters most at output, when you have finished editing and want to export at a high bitrate. The higher the bitrate, the less compressed the file is, because more information is streamed per second. Computer systems as well as compression codecs may determine the limit on bitrate.
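The relationship between bitrate, duration, and file size is simple arithmetic. Here is a small Python sketch using an example rate of 50 megabits per second:

bitrate_mbps = 50            # example rate: megabits per second
duration_minutes = 10        # example clip length

total_megabits = bitrate_mbps * duration_minutes * 60
size_gb = total_megabits / 8 / 1000   # 8 bits per byte, 1000 MB per GB
print(f"{duration_minutes} min at {bitrate_mbps} Mbps is about {size_gb:.2f} GB")
# 10 min at 50 Mbps is about 3.75 GB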

Waveform Monitor and Vectorscope

The waveform monitor and vectorscope are two other ways of looking at your image: the waveform monitor represents it in terms of brightness, and the vectorscope in terms of hue and saturation. They look complicated but are surprisingly easy to understand.

Waveform Monitor

A waveform monitor is often part of the camcorder's software and is always part of digital editing software. It allows you to monitor your scene for exposure levels, including black level, white level, and the amount of contrast in your image. In video, a waveform monitor is more accurate than a light meter for determining how much light you need in your shot. The waveform monitor measures only luminance (brightness), not chrominance (color). It will measure the luminance of any colored, black, grey, or white pixels in your frame.

The X-axis on the monitor represents the entire frame from left to right. The Y-axis represents the exposure levels from black (0 on the monitor) to white (100+), measured in IRE (Institute of Radio Engineers) values, representing the intensity of the signal. “Legal” levels, in terms of broadcasting levels, are 7.5 for black, and 100 for white. Your entire image signal should fall between 7.5 and 100.
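As an illustration of what the waveform is plotting, here is a minimal Python sketch that converts a pixel's RGB values (scaled 0.0 to 1.0) to an approximate 0-100 level, assuming the standard Rec. 709 luminance weights:

def luma_level(r, g, b):
    """Approximate the 0-100 level a pixel would register on a waveform monitor."""
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b   # Rec. 709 luma weights
    return 100 * y

print(luma_level(1.0, 1.0, 1.0))   # pure white -> 100
print(luma_level(0.0, 0.0, 0.0))   # pure black -> 0
print(luma_level(0.0, 1.0, 0.0))   # pure green -> about 71.5
print(luma_level(0.0, 0.0, 1.0))   # pure blue  -> about 7.2, much darker than green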

Crushing Blacks and Clipping Whites

If your waveform shows that a lot of your signal is at 0, this means you are crushing the black values in your image. “Crushing blacks” means that all of the detail in dark parts of your image is being lost. The correct value for black in a broadcast-legal image is 7.5 rather than 0. Any part of your image that is black should appear at the 7.5 point on your waveform monitor.

If your waveform shows a concentration of signal at or above 100, your whites are being clipped. "Clipping whites" means that all of the detailed information is lost in the over-exposure. Only the darkest parts of your image, whatever is actually black in the image, should sit down at the 7.5 black level. Only the brightest parts of the image should be at 100.

Try to get the best exposure while shooting rather than depending on correcting your image in post-production. If you have difficulty with exposure, it is better for your image to be under-exposed than over-exposed. An over-exposed image loses information that cannot be recovered in post-production by merely toning down the white values in the image. But in an under-exposed image, some of the information in the darker areas is still there, in the pixels. Working with luminance values and contrast/black values in post-production via color correction may bring some of those details back into the image.


Vectorscope

The vectorscope measures chrominance, or the color values in the image, and shows you where each pixel in your image lands on the color spectrum. This includes hue as well as saturation. The vectorscope appears as a circle, marked with boxes indicating the correct values for the video primary and secondary colors, as represented by color bars.

When each color is correct, it appears on the vectorscope as a dot of light in the center of the box. If one color is not correct, it is said to be out of phase. If one color is out of phase, all of the colors are out of phase. White is represented on the vectorscope as a dot of light in the center of the circle. If the center dot is off-center, then your camera needs to be white-balanced.

Saturation is represented by how far the signal extends from the center of the circle toward its edge. The more saturated an image is, the more the signal will reach towards the edge of the circle. A less saturated image will show the signal closer to the center of the circle.
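As a rough illustration, this Python sketch shows one way to think about a pixel's position on the scope: the two color-difference values give an angle (hue) and a distance from center (saturation). The axis assignment follows the usual convention (B-Y horizontal, R-Y vertical), and the sample values are made up:

import math

def vectorscope_position(b_minus_y, r_minus_y):
    """Return (hue angle in degrees, saturation as distance from the center)."""
    hue = math.degrees(math.atan2(r_minus_y, b_minus_y))
    saturation = math.hypot(b_minus_y, r_minus_y)
    return hue, saturation

print(vectorscope_position(0.0, 0.0))    # neutral white/grey: dead center, zero saturation
print(vectorscope_position(0.3, -0.1))   # a colored pixel: some angle, some distance from center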

Video Latitude, or Too Much Brightness

Video that is too bright loses most of the detail information in a shot. This information cannot be recovered in post-production because it is simply not there. In video that is too bright, the brightness of the image exceeds the maximum brightness level that the camera can handle, and the information available stops there.

Digital Video Encoding

4:2:2: This ratio represents the relative sampling of the luminance and color components of the video signal.

The Y channel (the "4") of the signal carries the luminance information. Y is derived from a weighted combination of the red, green, and blue signals. To encode digital video for television or monitoring, it is necessary to separate the Y (luminance) signal from the RGB (chroma) information.

The 4 in the 4:2:2 ratio represents the sampling frequency of the Y channel. The 2s represent the sampling frequencies of the B-Y (blue minus luminance) and R-Y (red minus luminance) color-difference channels. The RGB signal is being sampled to convert it to a television/video signal, allowing for chroma sub-sampling for output to a monitor. For every 4 samples of luminance, there are 2 samples each of B-Y and R-Y color difference. This is the sampling standard for digital video. The G-Y (green minus luminance) signal does not need to be measured; its value can be deduced from the B-Y and R-Y signals.
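A small Python sketch of the arithmetic: per group of four pixels, 4:2:2 stores four luminance samples plus two samples each of B-Y and R-Y, which is two-thirds of the data that carrying full color for every pixel (4:4:4) would require:

y_samples, by_samples, ry_samples = 4, 2, 2            # the 4:2:2 pattern

total_422 = y_samples + by_samples + ry_samples        # 8 samples per 4 pixels
total_444 = 4 + 4 + 4                                  # full color: 12 samples per 4 pixels

print(f"4:2:2 stores {total_422}/{total_444} = {total_422 / total_444:.0%} of the full-color data")
# 4:2:2 stores 8/12 = 67% of the full-color data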

What this means for you is that when exporting your finished film from digital editing software, you will get the highest-resolution file from an Apple ProRes 422 codec. A PC-equivalent codec is DNxHD. If you are not editing on a Mac, you should do a little further research online. Different editing platforms will have different high-res output codecs available for non-Mac-based projects.

Frame Rates

24fps is really 23.98fps, and 30fps is really 29.97fps. If you are given a choice between 23.98 and 24fps for shooting or editing, choose 23.98fps. If you are given a choice between 30fps and 29.97fps, choose 29.97fps. This is because the tiny fraction of a second between frame rates may eventually result in your audio going out of sync if you are using 24 or 30fps rather than the actual video frame rates of 23.98 and 29.97fps.
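A rough Python sketch of that drift, using the precise 23.976 value behind the 23.98 shorthand and a hypothetical one-hour take:

nominal_fps = 24
actual_fps = 23.976              # the precise rate usually written as 23.98
shoot_seconds = 60 * 60          # a hypothetical one-hour take

frames = nominal_fps * shoot_seconds
playback_seconds = frames / actual_fps
print(f"Drift after one hour: {playback_seconds - shoot_seconds:.1f} seconds")
# Drift after one hour: 3.6 seconds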


Timecode

Timecode is an eight-digit number representing hours:minutes:seconds:frames of recorded video. Timecode values range from 00:00:00:00 to 23:59:59:59. If you shoot continuously for longer than 24 hours, your timecode will begin again at 00:00:00:00. Timecode is recorded onto a separate track or digital space. Each number represents a single frame of recorded video, giving each frame of video a unique number, or "address." Because of the unique address given to each frame of video, timecode is also referred to as the "address track."
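A minimal Python sketch of the "address" idea, converting a running frame count into hours:minutes:seconds:frames (assuming a 30fps non-drop count):

def frames_to_timecode(frame_count, fps=30):
    """Map a frame count to a unique hh:mm:ss:ff address, wrapping at 24 hours."""
    frames = frame_count % fps
    seconds = (frame_count // fps) % 60
    minutes = (frame_count // (fps * 60)) % 60
    hours = (frame_count // (fps * 3600)) % 24
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"

print(frames_to_timecode(0))           # 00:00:00:00
print(frames_to_timecode(107999))      # 00:59:59:29, the last frame of the first hour
print(frames_to_timecode(30 * 86400))  # a full 24 hours wraps back to 00:00:00:00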

Drop-Frame vs. Non-Drop-Frame Timecode

In drop-frame timecode, the frame count is adjusted to keep the timecode in step with real clock time at the actual NTSC frame rate of 29.97fps: two frame numbers are skipped at the start of each minute, except every tenth minute. No frames of video are dropped, only numbers in the count. Drop-frame timecode is important where you need exact timing, such as matching program length to clock time for broadcast. Otherwise, the minute differences are neither visible nor audible. In 24P HD video, there is no drop-frame timecode.
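For the curious, here is a minimal Python sketch of the standard 29.97fps drop-frame numbering described above, where frame numbers 00 and 01 are skipped at the start of every minute except each tenth minute:

def frames_to_dropframe_timecode(frame_count):
    """Convert a frame count to 29.97fps drop-frame timecode (hh;mm;ss;ff)."""
    frames_per_10min = 17982                   # 1800 + 9 * 1798 actual frames per 10 minutes
    d, m = divmod(frame_count, frames_per_10min)
    skipped = 18 * d + (2 * ((m - 2) // 1798) if m >= 2 else 0)
    fc = frame_count + skipped                 # re-insert the skipped frame numbers
    return f"{(fc // 108000) % 24:02d};{(fc // 1800) % 60:02d};{(fc // 30) % 60:02d};{fc % 30:02d}"

print(frames_to_dropframe_timecode(1799))   # 00;00;59;29
print(frames_to_dropframe_timecode(1800))   # 00;01;00;02  (numbers 00 and 01 were skipped)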