Sue Ellen Reager, CEO
@International Services
February 12, 2009


The entire world of captioning and its sister, subtitling, is evolving at breathtaking speed because of HD and the impending tsunami of thousands of internet channels.

For the world of captioning, HD is like bringing DVD to television. The quality is extraordinary, HD television sets accept both HD and SD, and, for the first time in history, HD enables captioning in up to eight different languages with more to come.

With the increasing adoption of HD begins the metamorphosis of captioning into subtitling. Until HD, captioning was identified by the turn-on/turn-off factor and the black boxes behind the text, which always used the same font. Meanwhile, subtitling could not be turned off, did not use the black boxes, and featured generally attractive text with attributes such as color and outline.

With the advent of HD and modern techniques for the web, captions are beginning to include attributes that can be changed by users, who can select the font, style, color, and size of their choice. The black boxes can be removed or retained. The time in history has arrived when captioning and subtitling are beginning to blend into one. Programs will become available in two, five, 20, or 40 languages, most particularly with on-demand on the web, and revenues for even small-content producers will go global.

The Way We Were
In 1980, the first set-top closed-caption decoder box with its own antenna was made available for $200. Users could flip a switch to see subtitles (called captions to differentiate their new on/off capability from the burn-in variety), which could be annoying to those who weren't hearing-impaired. About 400,000 set-top boxes were sold over 25 years, an unimpressive figure next to the 24 million people in North America who are deaf or hard of hearing and needed the feature.

In 1990, the Television Decoder Circuitry Act mandated that all televisions sold in the U.S. with screens 13" or larger must contain a built-in caption-decoder chip. That law signaled the death knell of the set-top box manufacturers. Captioning became available on 10 million new televisions per year, and the FCC set rules governing captioning’s attributes and technical aspects. The advent of the decoder chip enabled all 24 million hearing-impaired people to be assimilated into our lives—to feel they were part of, not apart from, society.

Another effect of the Decoder Circuitry Act was the ability to control captioning with a remote control, enabling television viewing in public buildings and sports bars with muted sound. This also enabled a user to follow the news with the TV on mute while stuck on the phone with a dull conversationalist.

Today, captioning is mandatory for all broadcast media with a few exceptions, such as new companies, companies with less than $3 million in revenue, and internet channels. There are certain character and font limitations, such as no more than 32 characters per line, equidistant lettering in which an “I” must be as wide as a “W,” and a number of other rules to ensure that the captioning produced is compatible with all chips and players. The cost of captioning is somewhat heavy on the budget, as companies are often charged about $600 per hour depending on the vendor and the care devoted to breakdown and aesthetics. Then, the network or content manager either accepts responsibility for the encoding or outsources the task.
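The 32-characters-per-line rule is simple enough to illustrate in a few lines of Python. This is only a sketch of the constraint, not any vendor's actual tool, and the function name is my own:

```python
import textwrap

MAX_CHARS = 32  # CEA-608 captions allow at most 32 characters per row


def wrap_caption(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Wrap caption text into rows that respect the 32-character limit."""
    return textwrap.wrap(text, width=max_chars)


rows = wrap_caption("The quick brown fox jumps over the lazy dog near the river.")
for row in rows:
    print(f"{len(row):2d} | {row}")
```

The equidistant-lettering rule means the chip renders every character at the same width, so counting characters is all it takes to know whether a row fits.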

But in the near future, the cost of captioning, which has been a valuable investment in a needy segment of our society, could become an investment to attract both the viewing public and new sources of global revenue for many of the world’s media-content distributors, from cable to the web.

I suppose it is a ridiculous notion that the world should share a commonality, such as being able to watch the same television set or use the same caption-decoder chip. The Americas, even with the new digital changeover, will still use the NTSC standard (as digital, not analog), while most of Europe still uses the PAL standard, and the French—always striving to be different—use SECAM. So the captioning system used in the Americas does not work on European or Asian televisions, and the teletext system of Europe does not work in the Americas.

Live Captioning
Live captioning is the real-time transcription of a broadcast as it happens, such as the news, sports, and other live events. Live captioning is generally performed by court reporters using 10-key machines in which the keys are a form of shorthand. The 10-key output is translated via software into English words.

VITAC Corp. is responsible for captioning 160,000 hours of programming each year for major networks, ESPN, Fox Sports, and various feature films. VITAC had to heavily invest in new infrastructure and reprogram portions of its internal software to meet the new challenges in captioning. Tim Taylor, vice president of engineering and facility operations at VITAC, says, “For offline, we had to make changes for captioning in HD by modifying our software for a new aspect ratio—16x9. Then, the next hurdle with HD was that the frame rate for SD (standard definition) is at 29.97 frames per second, PAL is at 25 per second, and now HD has two different frame rates: 29.97 and 24 frames per second. Moreover, everything we caption ‘live’ is pulled off of satellite. So, in order for us to be able to see all of the HD feeds, we now have 120 satellite receivers, with more on the way. To caption for the HD networks meant that we had to make significant investments just so that we could see all of the HD feeds simultaneously. This investment in infrastructure was gradual over a period of time as HD began to phase in. IRD satellite receivers cost $5,000 to $9,000 each. This represents hundreds of thousands of dollars in investment in captioning and subtitling capabilities for both Pre-Recorded and Real-Time Captioning.”

The ‘How’ of Live Captioning
“For live webcast captioning, video streaming servers are used,” says Dilip K. Som, president of Computer Prompting & Captioning Co. (CPC), a company that produces live and offline captioning and subtitling software for Windows and Mac. “From the computer running our WebCaption software that encodes captions to the videos, a link is made to a streaming server that is usually equipped with large bandwidth to accommodate hundreds to thousands of web viewers. The customer sends a stream to one of the companies that uses our software, such as Caption Technologies. That company (Caption Technologies) uses a steno system or a speech recognition software to create the captions in real time and embed the captions to the video. We embed the captions inside the video instead of sending the captions as a separate stream to achieve full synchronization of captions to audio. At the present moment, we only work with Windows Media Video. We investigated other players—including Real and QuickTime, but there were some technical difficulties to achieve the same solution. We’re still working on it.”

Live captioning from an originally English feed directly into other languages is rare, and VITAC is one of the few companies that offer such a service. Very few people can use a 10-key machine or type in shorthand while translating correctly at the same time, and these translating captionists should be valued. Live translation is particularly difficult because of the wide variety of subject matter that appears on broadcasts, from financial news to international politics to science and technology programs.

Software and Offline Captioning
For offline captioning and subtitling, some of the big players in software packages include CPC, Adobe, and Sonic Scenarist. At the click of a button, CPC’s software breaks a transcribed script instantaneously into the raw captions required, splitting each sentence into the proper two or three subtitles, all of correct length as specified by the FCC. Then, the software guides the user to assign timecodes to each entry point with mouse clicks.

For translation of captioning files or subtitling for the web and DVD, International Services offers a web-based, do-it-yourself Subtitler designed to control translation into other languages with glossaries and other features for linguists. Subtitle text can be translated by a professional translator (if you don’t have one, links are provided to qualified translators), or users can send the subtitles for automated translation by language software.

Automatic Sync Technologies (AST) has created web-based software to provide a “digital drive-thru,” a quick-turnaround version of captioning in which users drop off via the web and pick up via email. To provide this quick turnaround, AST runs each transcription through parsing software. Then, the company’s algorithms synchronize the script words with the corresponding audio in a text-to-audio match pattern.
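AST's algorithms are proprietary, but the general text-to-audio matching idea can be sketched as a sequence alignment: line up the clean script against time-stamped words from a speech recognizer and borrow timestamps wherever the two token streams agree. Everything here (names, data) is my own illustrative assumption:

```python
from difflib import SequenceMatcher


def align_script(script_words, asr_words):
    """Map script words to recognizer timestamps where the two token
    sequences agree. asr_words is a list of (word, seconds) pairs.
    Returns {script_index: seconds} for every matched word."""
    script_tokens = [w.lower() for w in script_words]
    asr_tokens = [w.lower() for w, _ in asr_words]
    matcher = SequenceMatcher(a=script_tokens, b=asr_tokens, autojunk=False)
    timings = {}
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            timings[a + k] = asr_words[b + k][1]
    return timings


script = ["closed", "captions", "help", "everyone"]
asr = [("closed", 0.0), ("caption", 0.4), ("help", 0.9), ("everyone", 1.2)]
print(align_script(script, asr))  # → {0: 0.0, 2: 0.9, 3: 1.2}
```

Note that the recognizer's misheard "caption" simply goes unmatched; a production system would interpolate timestamps for such gaps.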

Where Does the Money Go?
An hour of outsourced captioning varies widely depending on the vendor, with $600 being fairly normal, but this is not the only price around. The bean counter in you may respond, “But transcription sounds so easy. Where does the money go?” First, transcription is a long and precise process. There can be 10,000 words in 1 hour of content. A typist at 100 words per minute could type those words in well under 2 hours, but the constant pausing, rewinding, and correcting of real transcription can stretch the job to 10 hours; even a fast typist at 200 words per minute might need 5 hours. Then, the transcription must be proofread before being broken down into captions and timecode assignments.
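The arithmetic above can be put in one small function. The overhead multiplier for pausing, rewinding, and correcting is my own illustrative assumption, chosen to match the 10-hour figure:

```python
def transcription_hours(words: int, typing_wpm: float,
                        overhead: float = 6.0) -> float:
    """Estimate transcription time in hours: raw typing time multiplied
    by an overhead factor for pausing, rewinding, and correcting.
    The default overhead of 6x is an assumption, not an industry figure."""
    return words / typing_wpm / 60 * overhead


print(round(transcription_hours(10_000, 100), 1))  # → 10.0 hours at 100 wpm
print(round(transcription_hours(10_000, 200), 1))  # → 5.0 hours at 200 wpm
```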

A professional transcriptionist transcribes much faster, and more thoroughly, than a typist. Professionals in transcription often use 10-key court-reporting equipment that speeds up the transcription turnaround significantly. In addition, they are usually excellent proofreaders.

VITAC has invented a type of “shorthand for keyboards” for use with its own internal software that is a 21st-century version of Gregg Shorthand, which was popular in the 1950s. According to Yelena Makarczyk, director of subtitling operations for VITAC, both captioning and subtitling are arts that may not be readily apparent to nondeaf viewers, but poorly split sentence text is like a punch in the gut to people exposed to captioning and subtitling on a daily basis. “So much goes into one caption. The average reading speed is 15 characters per second, but cartoons for kids must be slowed, with [fewer] words showing on screen for a much longer period of time,” Makarczyk says. “For other languages, we also perform cultural assessment checks. Every single caption is treated as a unit—dividing the timing and process of captioning and subtitling into age group, accuracy, flavor, consistency, [and] verification. And there is research involved. We once spent 4 days in our off-hours looking for the exact spelling of an African god, until someone from Sudan finally found it in a book.”
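Makarczyk's reading-speed figure translates directly into display time: at 15 characters per second, a caption's duration is just its length divided by the rate, with a slower rate for children's programming. A minimal sketch (the kids' rate of 8 characters per second is my own assumption, not VITAC's number):

```python
def caption_duration(text: str, chars_per_second: float = 15.0) -> float:
    """Seconds a caption should stay on screen at a given reading speed."""
    return len(text) / chars_per_second


line = "So much goes into one caption."  # 30 characters
print(round(caption_duration(line), 1))       # → 2.0 s for adult viewers
print(round(caption_duration(line, 8.0), 1))  # → 3.8 s at an assumed kids' rate
```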

Captioning and Subtitling on DVD
Captioning on DVD is similar to subtitling in appearance. The captioning approach adds the ability to turn the text on and off at any time desired, as opposed to only selecting a language at the beginning of the disc. With subtitles, TIFF images containing subtitle text are keyed into video by the DVD player. For captioning, the DVD player actually inserts the closed-captioning data into line 21 of the SD video output. The viewer’s television then decodes and displays the captions. For HD, unfortunately, Blu-ray did not make accommodations for closed captioning, but it does carry the DVD-standard capacity for subtitling in eight languages.

As 50,000 channels flood the web, the internet is revolutionizing the way people feel about captioning, languages, and subtitling. In 5 years, at any time and on any day, billions of people around the world will be watching global internet channels from nearly 200 countries, many subtitled and/or captioned in dozens of languages, and a new, global marketplace will explode before our eyes. As for those who cannot see the explosion, audio description technology for the blind, perhaps delivered as text-to-speech, is in the works.



Sue Ellen Reager is CEO and founder of @International Services, a global translation services company and developer of localization software, and a preferred vendor to Microsoft, IBM, Intel, Cisco Systems, NCR, Home Depot, CNN, and others. She oversees a virtual company of 300 people in 90 countries.