Voice Stress Analysis Challenges
Voice Stress Analysis (VSA) rests on the following premise: liars experience more psychological stress when lying than truth tellers do when telling the truth. The notion is that this psychological stress produces minor changes in blood circulation, which in turn alter various characteristics of the voice. To detect those changes, practitioners employ an instrument known as a Psychological Stress Evaluator: microphones attached to a computer record the voice and display its intensity, frequency, pitch, harmonics, or (my favorite) microtremors.
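To make the claim concrete: the quantities a VSA device purports to display are ordinary acoustic features. Here is a minimal sketch of measuring two of them, intensity and dominant frequency, from a digitized signal. This is generic signal processing, not the proprietary PSE algorithm, and the 220 Hz test tone is a stand-in for a voice recording:

```python
import numpy as np

def voice_features(signal, sample_rate):
    """Estimate basic acoustic features of the kind VSA devices claim
    to display. Generic DSP sketch, not the proprietary PSE algorithm.

    Returns (rms_intensity, dominant_frequency_hz).
    """
    rms = float(np.sqrt(np.mean(signal ** 2)))            # intensity (RMS)
    spectrum = np.abs(np.fft.rfft(signal))                # magnitude spectrum
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    dominant = float(freqs[np.argmax(spectrum)])          # strongest component
    return rms, dominant

# Synthetic "voice": a 220 Hz tone, one second at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 220 * t)
rms, f0 = voice_features(tone, sr)
```

Extracting such features is trivial; the falsified part of the premise is the claim that deception produces a distinctive signature in them.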
It is used in the workplace and even in some police investigations today, including the recent high-profile case of George Zimmerman’s shooting of Trayvon Martin.
There is absolutely no empirical validity to any of the above assertions by the VSA community. None. In fact, the core notions have been falsified repeatedly, from Horvath (1978, 1979) through Gamer (2006).
The Comparison Question Test (CQT), which the polygraph industry lauds as the gold standard among protocols, depends entirely on the examiner not being blinded to the inquiry. The questions must be discussed with the examinee prior to the examination, and background information must be gathered from the examinee. This means the CQT protocol cannot be carried out covertly.
Even the voice analysis community admits that the Relevant-Irrelevant Test (RIT) protocol cannot be performed on people using voice stress analysis.
There are large intrapersonal differences in vocal characteristics, and no research has isolated that variable.
There are also enormous problems with interrater agreement: different examiners interpreting the same data reach different conclusions.
The National Research Council in 2003 concluded that “although proponents of voice stress analysis claim high levels of accuracy, empirical research on the validity of the technique has been far from encouraging.”
The US Department of Defense Polygraph Institute conducted a controlled study and found VSA inferior to the polygraph and only slightly above random chance in the detection of lies.
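To put "slightly above chance" in perspective: with a modest sample, an accuracy a few points above 50% is statistically indistinguishable from coin-flipping. A quick sketch using a one-sided exact binomial test, with hypothetical numbers (not the DoDPI data):

```python
from math import comb

def binom_p_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value for
    observing k or more correct calls if judgments were pure guesses."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical: an examiner gets 56 of 100 truth/lie calls right.
p_value = binom_p_at_least(56, 100)
# A p-value well above 0.05 means this "above chance" accuracy
# cannot be distinguished from guessing.
```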
So, it is scientific junk.
Why does it persist as an investigative tool? Beats me.
Consider the following survey of the empirical research, as compiled by the American Polygraph Association:
- Brenner, M., Branscomb, H., & Schwartz, G. E. (1979). Psychological stress evaluator: Two tests of a vocal measure. Psychophysiology, 16(4), 351-357.
Conclusion: “Validity of the analysis for practical lie detection is questionable.”
- Cestaro, V.L. (1995). A Comparison Between Decision Accuracy Rates Obtained Using the Polygraph Instrument and the Computer Voice Stress Analyzer (CVSA) in the Absence of Jeopardy. (DoDPI95-R-0002). Fort McClellan, AL: Department of Defense Polygraph Institute.
Conclusion: Accuracy was not significantly greater than chance for the CVSA.
- DoDPI Research Division Staff, Meyerhoff, J.L., Saviolakis, G.A., Koenig M.L., & Yourick, D.L. (In press). Physiological and Biochemical Measures of Stress Compared to Voice Stress Analysis Using the Computer Voice Stress Analyzer (CVSA). (DoDPI01-R-0001). Department of Defense Polygraph Institute.
Conclusion: Direct test of the CVSA against medical markers for stress (blood pressure, plasma ACTH, salivary cortisol) found that CVSA examiners could not detect known stress. This project was a collaborative effort with Walter Reed Army Institute of Research.
- Fuller, B.F. (1984). Reliability and validity of an interval measure of vocal stress. Psychological Medicine, 14(1), 159-166.
Conclusion: Validity of voice stress measures was poor.
- Janniro, M. J., & Cestaro, V. L. (1996). Effectiveness of Detection of Deception Examinations Using the Computer Voice Stress Analyzer. (DoDPI95-P-0016). Fort McClellan, AL: Department of Defense Polygraph Institute.
DTIC AD Number A318986.
Conclusion: Chance-level detection of deception using the CVSA as a voice stress device.
- Hollien, H., Geison, L., & Hicks, J. W., Jr. (1987). Voice stress analysis and lie detection. Journal of Forensic Sciences, 32(2), 405-418.
Conclusions: Chance-level detection of stress. Chance-level detection of lies.
- Horvath, F. S. (1978). An experimental comparison of the psychological stress evaluator and the galvanic skin response in detection of deception. Journal of Applied Psychology, 63(3), 338-344.
Conclusion: Chance-level detection of deception.
- Horvath, F. S. (1979). Effect of different motivational instructions on detection of deception with the psychological stress evaluator and the galvanic skin response. Journal of Applied Psychology, 64(3, June), 323-330.
Conclusion: Voice stress did not detect deception greater than chance.
- Kubis, J. F. (1973). Comparison of Voice Analysis and Polygraph As Lie Detection Procedures. (Technical Report No. LWL-CR-03B70, Contract DAAD05-72-C-0217). Aberdeen Proving Ground, MD: U.S. Army Land Warfare Laboratory.
Conclusion: Chance-level detection of deception for voice analysis.
- Lynch, B. E., & Henry, D. R. (1979). A validity study of the psychological stress evaluator. Canadian Journal of Behavioural Science, 11(1), 89-94.
Conclusion: Chance-level detection of stress using the voice.
- O’Hair, D., Cody, M. J., & Behnke, R. R. (1985). Communication apprehension and vocal stress as indices of deception. The Western Journal of Speech Communication, 49, 286-300.
Conclusions: Only one subgroup showed a detection rate significantly better than chance, and it did so by the thinnest of margins. Use of questionable statistical methods in this study suggests the modest positive findings would not be replicated in other research. See next citation.
- O’Hair, D., Cody, M. J., Wang, S., & Chao, E. Y. (1990). Vocal stress and deception detection among Chinese. Communication Quarterly, 38(2, Spring), 158ff.
Conclusion: Partial replication of above study. Vocal scores were not related to deception.
- Suzuki, A., Watanabe, S., Takeno, Y., Kosugi, T., & Kasuya, T. (1973). Possibility of detecting deception by voice analysis. Reports of the National Research Institute of Police Science, 26(1, February), 62-66.
Conclusion: Voice measures were not reliable or useful.
- Timm, H. W. (1983). The efficacy of the psychological stress evaluator in detecting deception. Journal of Police Science and Administration, 11(1), 62-68.
Conclusion: Chance-level detection of deception.
- Waln, R. F., & Downey, R. G. (1987). Voice stress analysis: Use of telephone recordings. Journal of Business and Psychology, 1(4), 379-389.
Conclusions: Voice stress methodology did not show sufficient reliability to warrant its use as a selection procedure for employment.
There is also Assessing the Validity of Voice Stress Analysis Tools in a Jail Setting by Kelly R. Damphousse and Laura Pointon (University of Oklahoma), Deidra Upchurch (KayTen Research and Development), and Rebecca K. Moore (Oklahoma Department of Mental Health and Substance Abuse Services). It can be downloaded here: https://www.ncjrs.gov/pdffiles1/nij/grants/219031.pdf
Manufacturers of Voice Stress Analysis (VSA) devices have suggested that their devices are able to measure deception with great accuracy, low cost, and little training. As a result, police departments across the country have purchased costly VSA computer programs with the intention of supplementing (or supplanting) the use of the polygraph at an estimated cost of more than $16,000,000. Previous VSA studies have been conducted using simulated deception in laboratory conditions. These earlier research projects suggest that VSA programs have the capacity to detect changes in vocal patterns as a result of induced stress. To date, however, no published research studies have demonstrated that VSA programs can distinguish between “general” stress and the stress related to being deceptive. The goal of this study was to test the validity and reliability of two popular VSA programs (LVA and CVSA) in a “real world” setting. Questions about recent drug use were asked of a random sample of arrestees in a county jail. Their responses and the VSA output were compared to a subsequent urinalysis to determine if the VSA programs could detect deception. Both VSA programs show poor validity: neither program efficiently determined who was being deceptive about recent drug use. The programs were not able to detect deception at a rate any better than chance. The data also suggest poor reliability for both VSA products when we compared expert and novice interpretations of the output. Correlations between novices and experts ranged from 0.11 to 0.52 (depending on the drug in question). Finally, we found that arrestees in this VSA study were much less likely to be deceptive about recent drug use than arrestees in a non-VSA research project that used the same protocol (i.e., the ADAM project). This finding provides support for the “bogus pipeline” effect.
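The novice/expert figures quoted above are plain interrater correlations. A minimal sketch of how such a reliability number is computed, using made-up chart scores rather than the study's data:

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two raters' scores for the same charts."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical expert vs. novice scores for ten examinees (1-5 scale).
expert = [3, 1, 4, 2, 5, 2, 3, 4, 1, 5]
novice = [2, 4, 3, 5, 1, 3, 2, 1, 5, 2]
r = pearson(expert, novice)
```

Correlations in the 0.11 to 0.52 range mean an expert and a novice looking at the same VSA output routinely disagree, which is the reliability problem in a nutshell.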
The study itself contains a great deal of solid research and analysis. It is well constructed and very well executed, with proper statistical discussion, and it was funded by a US DOJ NCJRS grant.
There is also the Virginia Department of Professional and Occupational Regulation’s Study of the Utility and Validity of Voice Stress Analyzers, which can be downloaded here: http://www.dpor.virginia.gov/uploadedFiles/MainSite/Content/News/BPOR03%20%28Voice%20Stress%20Analyzers%29.pdf