Facial recognition systems are built on computer programs that analyze images of human faces in order to identify the people in them. The programs take a facial image, measure characteristics such as the distance between the eyes, the length of the nose, and the angle of the jaw, and distill them into a unique file called a "template." The software then compares one template against another and produces a score that measures how similar the two images are. Typical sources of images for use in facial recognition include video camera feeds and pre-existing photos such as those in driver's license databases.
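A minimal sketch of that measure-and-score pipeline might look like the following; the specific measurements, the normalization, and the scoring formula here are illustrative assumptions, not any vendor's actual format.

```python
# Illustrative template-and-score pipeline (hypothetical measurements and
# scoring formula, not any vendor's actual format).
from dataclasses import dataclass
import math

@dataclass
class FaceTemplate:
    eye_distance: float  # measurements, e.g. normalized to overall face width
    nose_length: float
    jaw_angle: float     # degrees

def similarity(a: FaceTemplate, b: FaceTemplate) -> float:
    """Score in (0, 1]; 1.0 means the measured characteristics are identical."""
    diff = math.sqrt(
        (a.eye_distance - b.eye_distance) ** 2
        + (a.nose_length - b.nose_length) ** 2
        + (a.jaw_angle - b.jaw_angle) ** 2
    )
    return 1.0 / (1.0 + diff)

probe = FaceTemplate(eye_distance=0.42, nose_length=0.31, jaw_angle=118.0)
enrolled = FaceTemplate(eye_distance=0.44, nose_length=0.30, jaw_angle=117.0)
print(f"similarity score: {similarity(probe, enrolled):.3f}")
```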
A second use of the technology came at the 2001 Super Bowl in Tampa, where pictures were taken of every attendee entering the stadium through the turnstiles and compared against an undisclosed database. The authorities would not say who was in that database, but the software did flag 19 individuals. The police indicated that some of those were false alarms, and no one flagged by the system was anything more than a petty criminal such as a ticket scalper. Press reports indicate that New Orleans authorities are considering using it again at the 2002 Super Bowl.
The technology has also been deployed in Ybor City, a district of Tampa, where cameras have been trained on busy public sidewalks in the hope of spotting criminals. As with the Super Bowl, it is unclear what criteria were used for including photos in the database, and the operators have not yet caught any criminals. In England, where public, police-operated video cameras are widespread, the town of Newham has also experimented with the technology.
Not surprisingly, government studies of face-recognition software have found high rates of both "false positives" (wrongly matching innocent people with photos in the database) and "false negatives" (not catching people even when their photo is in the database). One problem is that unlike our fingerprints or irises, our faces do not stay the same over time. These systems are easily tripped up by changes in hairstyle, facial hair, or body weight, by simple disguises, and by the effects of aging.
A study by the government's National Institute of Standards and Technology (NIST), for example, found false-negative rates for face-recognition verification of 43 percent using photos of subjects taken just 18 months earlier. And those photos were taken under ideal conditions, a significant caveat because facial recognition software handles changes in lighting, changes in camera angle, and busy backgrounds poorly. The NIST study also found that a change of 45 degrees in the camera angle rendered the software useless. The technology works best under tightly controlled conditions, when the subject is staring directly into the camera under bright lights - although another study by the Department of Defense found high error rates even in those ideal conditions. Grainy, dated video surveillance photographs of the type likely to be on file for suspected terrorists would be of very little use.
In addition, questions have been raised about how well the software works on dark-skinned people, whose features may not register clearly in images captured by cameras optimized for light-skinned subjects.
Samir Nanavati of the International Biometric Group, a consulting firm, sums it up: "You could expect a surveillance system using biometrics to capture a very, very small percentage of known criminals in a given database."
However, the government also possesses a huge, ready-made facial image database - driver's license photos - and is looking into how it can be used. By law, the government cannot sell those photos to private companies, but there are no prohibitions on their use for surveillance by the government itself. The federal government has begun to fund pilot projects on using driver's license photos as facial recognition databases.
Another problem is the threat of abuse. The use of facial recognition in public places like airports depends on widespread video monitoring, an intrusive form of surveillance that can record in graphic detail personal and private behavior. And experience tells us that video monitoring will be misused. Video camera systems are operated by humans, after all, who bring to the job all their existing prejudices and biases. In Great Britain, for example, which has experimented with the widespread installation of closed circuit video cameras in public places, camera operators have been found to focus disproportionately on people of color, and the mostly male operators frequently focus voyeuristically on women.
While video surveillance by the police is not as widespread in the U.S., an investigation by the Detroit Free Press (and a follow-up report) shows the kind of abuses that can happen. Looking at how a database available to Michigan law enforcement was used, the newspaper found that officers had used it to help their friends or themselves stalk women, threaten motorists, track estranged spouses - even to intimidate political opponents. The unavoidable truth is that the more people who have access to a database, the more likely it is that abuse will occur.
Facial recognition is especially subject to abuse because it can be used in a passive way that doesn't require the knowledge, consent, or participation of the subject. It's possible to put a camera up anywhere and train it on people; modern cameras can easily view faces from over 100 yards away. People act differently when they are being watched, and have the right to know if their movements and identities are being captured.
While appropriate for bank transactions and entry into secure areas, biometrics such as fingerprint and iris recognition have the disadvantage that they are intrusive both physically and socially. They require the user to position their body relative to the sensor and then pause for a second to `declare' themselves. This `pause and declare' interaction is unlikely to change, because of the fine-grain spatial sensing required. Moreover, there is an `oracle-like' aspect to the interaction: since people cannot recognize other people using this sort of data, these types of identification have no place in normal human interactions and social structures.
While the `pause and declare' interaction and the oracle-like perception are useful in high-security applications (they make the systems appear more accurate), they are exactly the opposite of what is required when building a store that recognizes its best customers, an information kiosk that remembers you, or a house that knows the people who live there.
Face recognition from video and voice recognition have a natural place in these next-generation smart environments -- they are unobtrusive (able to recognize at a distance without requiring a `pause and declare' interaction), are usually passive (they do not require generating special electro-magnetic illumination), do not restrict user movement, and are now both low-power and inexpensive. Perhaps most important, however, is that humans identify other people by their face and voice, and are therefore likely to be comfortable with systems that use face and voice recognition.
Verification and identification follow the same steps. Assuming a cooperative audience (as opposed to an uncooperative or non-cooperative one), the user 'claims' an identity through a login name or a token, stands or sits in front of the camera for a few seconds, and is either matched or not matched. This comparison is based on the similarity of the newly created match template to the reference template or templates on file. The point at which two templates are similar enough to match, known as the threshold, can be adjusted for different personnel, PCs, times of day, and other factors.
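As a hypothetical illustration of that adjustable threshold, the same match score can produce opposite decisions under different settings; the scores and threshold values below are invented for the example.

```python
# Hypothetical thresholds showing how one score can match at one terminal
# and fail at another.
def verify(score: float, threshold: float) -> bool:
    """Accept the identity claim only if the match score clears the threshold."""
    return score >= threshold

score = 0.87  # similarity between the live template and the one on file
for setting, threshold in [("high-security door", 0.95),
                           ("office PC at midday", 0.80)]:
    decision = "match" if verify(score, threshold) else "no match"
    print(f"{setting} (threshold {threshold}): {decision}")
```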
A second variable in identification is the dynamic between the target subjects and the capture device. In verification, one assumes a cooperative audience, one comprised of subjects who are motivated to use the system correctly. Facial scan systems, depending on the exact type of implementation, may also have to be optimized for non-cooperative and uncooperative subjects. Non-cooperative subjects are unaware that a biometric system is in place, or do not care, and make no effort either to be recognized or to avoid recognition. Uncooperative subjects actively avoid recognition, and may use disguises or take evasive measures. Facial scan technologies are much more capable of identifying cooperative subjects, and are almost entirely incapable of identifying uncooperative subjects.
"Eigenface," roughly translated as "one's own face," is a technology patented at MIT which utilizes two dimensional, global grayscale images representing distinctive characteristics of a facial image. Variations of eigenface are frequently used as the basis of other face recognition methods.

Distinctive characteristics of the entire face are highlighted for use in future authentication. The vast majority of faces can be reconstructed by combining approximately 100-125 eigenfaces. Upon enrollment, the subject's face is mapped to a series of numbers (coefficients). For 1-to-1 authentication, in which the image is used to verify a claimed identity, the "live" template is compared against the enrolled template to determine coefficient variation; the degree of variance from the template determines acceptance or rejection. For 1-to-many identification, the same principle applies, but with a much larger comparison set. Like all facial recognition technology, eigenface technology is best utilized in well-lit, frontal image capture situations.
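A compact sketch of the eigenface approach follows, using principal component analysis over a gallery of flattened grayscale images; the random stand-in gallery, the 100-component cut-off, and the tolerance are illustrative assumptions.

```python
import numpy as np

# Stand-in gallery: 200 enrolled faces as flattened 64x64 grayscale vectors.
rng = np.random.default_rng(0)
gallery = rng.random((200, 64 * 64))

# The principal components of the mean-centered gallery are the "eigenfaces".
mean_face = gallery.mean(axis=0)
_, _, vt = np.linalg.svd(gallery - mean_face, full_matrices=False)
eigenfaces = vt[:100]  # keep ~100 components, per the figure quoted above

def enroll(image: np.ndarray) -> np.ndarray:
    """Project a face onto the eigenfaces, yielding its coefficient vector."""
    return eigenfaces @ (image - mean_face)

def verify(live: np.ndarray, enrolled: np.ndarray, tol: float) -> bool:
    """1-to-1: accept only if the coefficient variation is within tolerance."""
    return float(np.linalg.norm(enroll(live) - enrolled)) <= tol

def identify(live: np.ndarray, enrolled_db: dict) -> str:
    """1-to-many: return the enrolled name with the closest coefficients."""
    coeffs = enroll(live)
    return min(enrolled_db,
               key=lambda name: float(np.linalg.norm(enrolled_db[name] - coeffs)))

db = {"subject-1": enroll(gallery[0])}
print(verify(gallery[0], db["subject-1"], tol=1e-6))  # True: image matches itself
```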
Feature analysis is perhaps the most widely utilized facial recognition technology. It is related to eigenface, but is more capable of accommodating changes in appearance or facial aspect (smiling vs. frowning, for example). Visionics, a prominent facial recognition company, uses Local Feature Analysis (LFA), which can be summarized as an "irreducible set of building elements." LFA utilizes dozens of features from different regions of the face, and also incorporates the relative location of these features. The extracted (very small) features are building blocks, and both the type of blocks and their arrangement are used to identify or verify. The technique anticipates that slight movement of a feature located near the mouth will be accompanied by relatively similar movement of adjacent features. Because feature analysis is not a global representation of the face, it can accommodate angles of up to approximately 25° in the horizontal plane and approximately 15° in the vertical plane. Of course, a straight-ahead video image from a distance of three feet will be the most accurate. Feature analysis is robust enough to perform 1-to-1 or 1-to-many searches.
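One hedged reading of the "building blocks plus arrangement" idea is sketched below; the feature descriptors, positions, and tolerances are hypothetical, and this is not Visionics' actual LFA algorithm.

```python
import numpy as np

def lfa_style_score(probe, enrolled, desc_tol=0.5, pos_tol=0.05):
    """Count a local feature as matched only if its descriptor is similar AND
    it stayed near its enrolled position relative to the face."""
    matched = sum(
        1
        for (p_desc, p_pos), (e_desc, e_pos) in zip(probe, enrolled)
        if np.linalg.norm(p_desc - e_desc) < desc_tol
        and np.linalg.norm(p_pos - e_pos) < pos_tol
    )
    return matched / len(enrolled)

rng = np.random.default_rng(1)
enrolled = [(rng.random(16), rng.random(2)) for _ in range(32)]  # (descriptor, position)
probe = [(d + 0.01, p + 0.01) for d, p in enrolled]  # small, correlated shift
print(f"score: {lfa_style_score(probe, enrolled):.2f}")  # near 1.0
```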
In Neural Network Mapping technology, features from both faces - the enrollment face and the verification face - vote on whether there is a match. Neural networks employ an algorithm to determine the similarity of the unique global features of the live face versus the enrolled or reference face, using as much of the facial image as possible. An incorrect vote, i.e. a false match, prompts the matching algorithm to modify the weight it gives to certain facial features. This method, theoretically, leads to an increased ability to identify faces in difficult conditions. As with all the primary technologies, neural network facial recognition can perform 1-to-1 or 1-to-many matching.
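A toy version of the voting-and-reweighting scheme described above might look like this; the feature count, agreement tolerance, and penalty factor are invented for illustration.

```python
import numpy as np

weights = np.ones(8) / 8  # equal trust in eight global features to start

def vote(live: np.ndarray, enrolled: np.ndarray, threshold: float = 0.6) -> bool:
    """Features whose live and enrolled values agree cast weighted 'match' votes."""
    agrees = np.abs(live - enrolled) < 0.1
    return float(weights[agrees].sum()) >= threshold

def penalize_false_match(live: np.ndarray, enrolled: np.ndarray) -> None:
    """After a confirmed false match, down-weight the features that voted for it."""
    global weights
    agrees = np.abs(live - enrolled) < 0.1
    weights[agrees] *= 0.9
    weights /= weights.sum()  # renormalize so future votes stay comparable

enrolled = np.linspace(0.1, 0.8, 8)
live = enrolled + 0.05  # same person, slight variation in every feature
print("match" if vote(live, enrolled) else "no match")  # match
```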
Automatic Face Processing (AFP) is a more rudimentary technology, using distances and distance ratios between easily acquired features such as the eyes, the end of the nose, and the corners of the mouth. Though overall not as robust as eigenface, feature analysis, or neural network mapping, AFP may be more effective in dimly lit, frontal image capture situations.
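Because AFP rests on distances and distance ratios between a handful of landmarks, a sketch is straightforward; the landmark names and coordinates below are illustrative, and a real system would obtain them from a detector.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def afp_template(lm):
    """Build a template from distance ratios between easily acquired features.
    Ratios are scale-invariant, so distance from the camera matters less."""
    eye_gap = dist(lm["left_eye"], lm["right_eye"])
    return (dist(lm["left_eye"], lm["nose_tip"]) / eye_gap,
            dist(lm["mouth_left"], lm["mouth_right"]) / eye_gap)

landmarks = {"left_eye": (110, 120), "right_eye": (190, 120),
             "nose_tip": (150, 170),
             "mouth_left": (125, 210), "mouth_right": (175, 210)}
print(afp_template(landmarks))  # e.g. (0.800..., 0.625)
```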
| Deployment | Location | Vendor | Market | Application | Sub-Application | Additional Description |
| --- | --- | --- | --- | --- | --- | --- |
| Manchester, NH Viisage | US-NH | Viisage | Travel and Transportation | Surveillance/Screening | Screening | 4th US airport to adopt solution |
| Cognitec 'SmartGate' Sydney Airport | Australia | Cognitec | Travel and Transportation | Phys Acc/T&A | Physical Access | 6k Qantas aircrew, based on passport read |
| Virginia Beach Surveillance | US-VA | Identix | Law Enforcement | Criminal ID | Surveillance | 600 image database, 10 subjects, alarm rate met with deployer approval |
| Berlin Airport | Germany | ZN | Travel and Transportation | Phys Acc/T&A | Physical Access | Face recognition terminal; template stored on smart card |
| Diversity Visa Program | US-MA | Viisage | Government | Civil ID | Immig ID | Image first entered into system at time of green card registration to prevent duplicate apps, later used for security screening |
| CO DL | US-CO | Identix | Government | Civil ID | DL | Duplicate enrollment detection |
| Zurich Airport Face | Switzerland | C-VIS | Travel and Transportation | Surveillance/Screening | Screening | Zurich Airport Police running system; targeting illegal immigrants from West Africa, Middle East and Asia |
| City of Brentwood Police Dept. | US-CA | Imagis | Law Enforcement | Criminal ID | Forensic | ID-2000 and CABS system integrated into the Records Management System (RMS) of Data911 |