IAFTC Newsletter

The Official Newsletter of IAFTC

IAFTC Newsletter Archives

  • 7 Jun 2026 11:31 AM | Aaron Olson (Administrator)

    The International Association of Forensic Toxicology Consultants is excited to announce the launch of the Journal of Applied Forensic Toxicology, a new open-access journal published by IAFTC.

    The journal was created to provide a practical, accessible publication venue focused on the real-world application of forensic toxicology to casework, litigation, laboratory practice, expert testimony, impaired driving investigations, and the interpretation of toxicological evidence.

    The inaugural issue, Volume 1, Issue 1, is now open for submissions. We are excited to get this rolling and begin building a journal that reflects the practical needs and expertise of forensic toxicology professionals.

    The journal welcomes original research, case reports and case series, review articles, short communications, perspectives, and technical notes. Topics may include alcohol and drug toxicology, breath alcohol testing, blood and urine drug testing, pharmacokinetics, laboratory quality assurance, cannabis and impaired driving, DRE evaluations, SFSTs, expert testimony, casework issues, and litigation-focused forensic toxicology.

    The journal is fully open access, has no article processing charges, and all published articles will receive a DOI. Submissions are subject to peer review, with a focus on clarity, accuracy, usefulness, and practical application.

    IAFTC members are encouraged to consider contributing to the inaugural issue and helping establish the Journal of Applied Forensic Toxicology as a practical resource for the forensic toxicology field.

    Submissions for Volume 1, Issue 1 can be made here:

    https://jaft.pubpub.org/volume-1-issue-1


  • 2 Jun 2026 8:50 AM | Aaron Olson (Administrator)

    The International Association of Forensic Toxicology Consultants held its first annual conference on May 29, 2026, and we are pleased to share that the event was a success. The conference brought together forensic toxicology professionals, expert witnesses, attorneys, and others interested in the scientific and legal issues that arise in toxicology casework.

    The conference focused on current challenges in forensic toxicology, including testing errors, alcohol interpretation, drug impairment evaluations, accreditation, cognitive bias, sweat patch testing, autobrewery syndrome, and the presentation of toxicology evidence in court.

    Presentation recordings are now available to IAFTC members in the members-only section of the IAFTC website. 

    Members can access the recordings by logging into the IAFTC website and visiting:

    https://iaftc.org/IAFTC-Conference-2026

    Conference Presentations

    Aaron Olson presented on Errors in Forensic Toxicology Testing: Patterns, Persistence, and the Case for Reform. His presentation discussed recurring problems in forensic toxicology testing, how errors can persist before they are identified, and why full discovery and transparency are necessary to evaluate the reliability of toxicology evidence.

    Chad Snyder presented on Case of Defendant vs. Commonwealth of VA: Presumptuousness SFST and DRE Analysis. His presentation examined issues involving Standardized Field Sobriety Testing and Drug Recognition Expert evaluations, with attention to how assumptions and interpretation errors can affect case outcomes.

    Deandra Grant presented on Intoxicated Without Drinking: Legal Case Reports Involving Autobrewery Syndrome. Her presentation reviewed legal cases involving endogenous alcohol production and discussed how autobrewery syndrome may be evaluated and challenged in court.

    Joshua Ott presented on DRE and the Drug Influence Evaluation. His presentation reviewed the Drug Recognition Expert protocol, the strengths and limitations of the drug influence evaluation, and how DRE findings are used in impaired driving investigations.

    Sol Bobst presented on Sweat Patch Testing Reliability: Case Reports. Her presentation examined the reliability of sweat patch drug testing through case examples, including issues involving contamination, environmental exposure, and interpretation challenges.

    Bethany Pridgen presented on Why Accreditation Does Not Mean Scientific Validity in Testing. Her presentation addressed the distinction between laboratory accreditation and scientific validity, including why compliance with accreditation standards does not automatically mean that a test method or result is scientifically reliable.

    Matthew E. Malhiot presented on Breath Alcohol Testing in the 50 States. His presentation provided a comparative look at breath alcohol testing practices across the country, including differences in regulations, instrumentation, and legal standards.

    Andy Ewens presented on Bias Caused by How Drug and Alcohol Test Results Are Reported. His presentation discussed how the reporting and presentation of drug and alcohol test results can affect interpretation and potentially influence legal decision-making.

    Bethany Pridgen also presented on How Cognitive Bias Can Affect the Quality and Reliability of Method Validation in Forensic Testing. This presentation explored how expectations, assumptions, and contextual information can influence method validation and scientific decision-making in forensic laboratories.

    Andy Ewens also presented on Differentiating Medical and Psychological Causes from Drug-Induced Impairment: Case Reports. This presentation examined case examples involving medical and psychological conditions that may mimic drug-induced impairment and emphasized the importance of considering alternative explanations.

    How Members Can Access the Recordings

    IAFTC members can access the available presentation recordings by logging into the IAFTC website and visiting the members-only conference page:

    https://iaftc.org/IAFTC-Conference-2026

    After logging in, members will be able to view the approved conference recordings that have been posted to the page.

    Thank you to all of the presenters, attendees, and IAFTC members who helped make the first annual conference a success. We look forward to continuing to build this event as a valuable resource for forensic toxicology professionals.


  • 21 May 2026 9:36 AM | Joshua Ott

    IAFTC Newsletter. Volume 2. Issue 2. May 21, 2026.

    Joshua Ott¹

    ¹ Caselock, Inc., P.O. Box 285, Lebanon, GA 30146

    This is an open-access article under the CC BY-NC-ND license.

    Download PDF.

    Abstract

    The Seated Battery of Standardized Field Sobriety Tests (SFSTs) was developed for Boating Under the Influence (BUI) investigations, where balance-dependent roadside field sobriety tests are impractical. This article analyzes the two foundational studies supporting the Seated Battery: the laboratory development study and the field validation study. Reported accuracy, sensitivity, specificity, and false positive and false negative rates are examined for each individual test and for the combined battery. While some measures, most notably Horizontal Gaze Nystagmus and Finger to Nose, demonstrated improved specificity in the field, the combined battery showed a substantial decrease in sensitivity in the field, resulting in a false negative rate of 72%. This analysis raises significant concerns regarding officer selection, unexplained performance shifts between studies, and the suitability of the Seated Battery as a screening tool for BUI investigations. The findings suggest that additional research and refinement are necessary before the Seated Battery of SFSTs can be relied upon as a scientifically valid method for identifying boaters with a blood alcohol concentration at or above 0.08%.

    Introduction

    Standardized Field Sobriety Tests (SFSTs) were originally developed for roadside enforcement to assist officers in identifying drivers whose blood alcohol concentration (BAC) is likely at or above a defined threshold.

    The Walk and Turn and One Leg Stand tests rely heavily on balance, making them poorly suited for use in marine environments, where boat motion and residual “sea legs” can affect performance independent of intoxication.


    To address these limitations, researchers developed the Seated Battery of Standardized Field Sobriety Tests for use in Boating Under the Influence (BUI) investigations. The Seated Battery consists of four tests administered while the subject is seated: Horizontal Gaze Nystagmus (HGN), Finger to Nose (FTN), Palm Pat (PP), and Hand Coordination (HC). Like the roadside SFSTs, the Seated Battery was designed and validated to indicate whether a subject’s BAC is likely at or above 0.08%, not to measure impairment.


    The scientific foundation for the Seated Battery rests on two studies: a laboratory development study and a field validation study. These studies reported varying levels of accuracy, sensitivity, and specificity across individual tests and the combined battery. While the authors concluded that the Seated Battery demonstrated sufficient reliability for operational use, closer examination of the data reveals substantial limitations, including high false positive rates in the laboratory, substantial shifts in performance between the laboratory study and the field study, unexplained and problematic reductions in sensitivity, and concerns related to officer selection.


    This article critically examines both foundational studies, compares their reported performance metrics, and questions whether the Seated Battery meets the sensitivity that should be expected of a screening tool intended to identify operators who are at or above a BAC of 0.08%. 

    Overview of the Tests

    The Seated Battery of Standardized Field Sobriety Tests (SFSTs) was developed for Boating Under the Influence investigations because of the multiple problems with using balance-dependent tests on someone who is on a boat or has recently been on one (“sea legs”) (1).


    Four tests make up the Seated Battery of SFSTs: Horizontal Gaze Nystagmus (HGN), Finger to Nose, Palm Pat, and Hand Coordination. Just like the standing SFSTs, these tests were validated only to indicate whether a person’s Blood Alcohol Concentration (BAC) was likely at or above 0.08%.

    Development of Sobriety Tests for the Marine Environment study (2)

    This laboratory study involved 157 paid volunteers who were randomly assigned to one of four alcohol dosing groups (0.00, 0.04, 0.08, and 0.12%). Twenty-four officers, with an average of 9.7 years’ experience administering the roadside SFSTs, participated. Six tests were evaluated: Finger to Nose (FTN), Time Estimation (TE), Finger Count (FC), Hand Coordination (HC), Palm Pat (PP), and Horizontal Gaze Nystagmus (HGN). The Intox EC/IR and Alco Sensor FST were used to measure the volunteers’ BAC.


    Results (BAC ≥ 0.08 vs < 0.08):

    • HGN: Accuracy 67.4%, Sensitivity 86.8%, False Positive Rate 44.7%

    • FTN: Accuracy 59.9%, Sensitivity 58.5%, False Positive Rate 39.3%

    • PP: Accuracy 57.2%, Sensitivity 66.0%, False Positive Rate 48.2%

    • HC: Accuracy 57.2%, Sensitivity 64.2%, False Positive Rate 47.1%

    • Combination: Accuracy 72.3%, Sensitivity 81.1%, False Positive Rate 33.3%

    • Time Estimation and Finger Count did not correlate with BAC and were excluded from the final battery.
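
    Specificity and the false negative rate are not listed above, but they follow directly from the reported figures: specificity is 100% minus the false positive rate, and the false negative rate is 100% minus sensitivity. The following is a minimal Python sketch of that arithmetic using the laboratory values reported above. The dictionary and function names are illustrative and do not come from the study.

```python
# Laboratory-study figures as reported above: (sensitivity %, false positive rate %).
LAB_RESULTS = {
    "HGN": (86.8, 44.7),
    "FTN": (58.5, 39.3),
    "PP": (66.0, 48.2),
    "HC": (64.2, 47.1),
    "Combination": (81.1, 33.3),
}


def complements(sensitivity, false_positive_rate):
    """Return (specificity, false negative rate), both in percent."""
    specificity = round(100.0 - false_positive_rate, 1)
    false_negative_rate = round(100.0 - sensitivity, 1)
    return specificity, false_negative_rate


for test, (sens, fpr) in LAB_RESULTS.items():
    spec, fnr = complements(sens, fpr)
    print(f"{test}: specificity {spec}%, false negative rate {fnr}%")
```

    Rounding to one decimal place matches the precision of the reported figures and avoids floating-point noise in the subtraction.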

    The average BAC of the volunteers at or above 0.08% was 0.102%. The average BAC of those below 0.08% was only 0.023%, yet false positive rates were high across all tests. The authors attributed the lower accuracy to lower overall BACs compared to prior SFST studies. While this explanation may account for reduced sensitivity, it does not adequately explain the high false positive rates at low BACs. Notably, raw data were not published, preventing analysis of false positives in the placebo group.


    The authors noted, “the overall correct percentages, sensitivity, and specificity of the tests were below what is typically reported in the literature on the roadside SFSTs. Comparison with prior studies, however, should be made with caution. First, in this study, the average BACs were considerably lower than in previous studies. In the Burns and Moskowitz study, for example, 48 participants were tested at a mean BAC of 0.120%, and 16 participants were tested at a mean BAC of 0.156%. In comparison, in the current study, the highest BAC group was tested at a mean BAC of 0.110%. The wider distribution of BACs in the previous studies may have made the impairment or no impairment decision less difficult than in the current study.”


    Despite officers’ experience, a large variability in performance was observed; 20% of officers were less than 50% accurate on HGN. 


    The authors ultimately concluded the tests warranted field validation without any reported modifications to the tests.


    Validation of Sobriety Tests for the Marine Environment (3)

    This field study was conducted on the Lake of the Ozarks in Missouri. It used four marine officers from the Missouri State Water Patrol who had prior experience administering HGN. The Development Study, which used twenty-four officers, noted large differences in accuracy among officers. Cutting that number to four raises questions about how the officers were selected and why so few were used.


    The officers in this study received an eight-hour class followed by three ten-hour shifts in patrol boats on the water to become proficient with the tests. By comparison, an inexperienced officer attending the National Association of State Boating Law Administrators’ (NASBLA) course receives only twenty-four hours of training in administering and interpreting these tests, and none of those hours are spent in patrol boats on the water.


    During the study, officers stopped boaters suspected of BUI and asked them to come aboard the patrol boat. The Seated Battery of SFSTs was then administered. There were observers present during 76% of the study cases. The majority of the stops were Probable Cause stops involving a boater suspected of BUI by the officer, and the other stops were checkpoint stops. The study noted that some passengers were administered the tests, and their data were included in the analysis.


    Average BACs:

    • < 0.08%: 0.028%

    • ≥ 0.08%: 0.133%

    There were 331 cases in total, and only one person refused to provide a blood or breath specimen, a refusal rate of 0.3%. This is an extremely low rate of refusals and matches the rate seen in the San Diego study (4).


    Key Results:

    • HGN: Accuracy 84.8%, Sensitivity 86%, False Positive Rate 16%

    • FTN: Accuracy 67.3%, Sensitivity 49%, False Positive Rate 19%

    • PP: Accuracy 65.2%, Sensitivity 76%, False Positive Rate 43%

    • HC: Accuracy 59.4%, Sensitivity 62%, False Positive Rate 43%

    • Combination: Accuracy 68.1%, Sensitivity 28%, False Positive Rate 2%

    The false positive rate of the combination of all four tests dropped by 31.3 percentage points from the laboratory study, from 33.3% to 2%. The false negative rate, however, was 72%, which is alarmingly high. It means that well over half of the BUI suspects who should be charged may be incorrectly released by officers. This is the opposite of what should be desired and accepted from field sobriety tests.
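
    The laboratory-to-field comparison for the combined battery is a difference in percentage points, and the false negative rate is simply the complement of sensitivity. A short Python sketch of that comparison follows, using only the combination figures reported in the two studies. The variable and function names are illustrative.

```python
# Combined-battery figures as reported in the two studies (percent).
LAB = {"sensitivity": 81.1, "false_positive_rate": 33.3}
FIELD = {"sensitivity": 28.0, "false_positive_rate": 2.0}


def shift_in_points(lab, field):
    """Field value minus laboratory value for each metric, in percentage points."""
    return {name: round(field[name] - lab[name], 1) for name in lab}


shift = shift_in_points(LAB, FIELD)
field_false_negative_rate = round(100.0 - FIELD["sensitivity"], 1)

print(shift)                      # negative values mean the field figure is lower
print(field_false_negative_rate)  # share of subjects at or above 0.08% who were missed
```

    Seen side by side, the drop in sensitivity is larger than the drop in the false positive rate, which is the trade-off the paragraph above describes.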


    The author concluded, “It is proposed that marine officers administer HGN, FTN, PP, and HC to all BUI suspects, and then, for each suspect, use the pattern of test results to estimate the probability of BAC ≥ .08%.”

    Read the rest here.

  • 29 Mar 2026 3:55 PM | Aaron Olson (Administrator)

    On March 27, 2026, the IAFTC hosted a webinar presented by Dr. Joseph Anderson titled Airway Gas Exchange: The Foundation of the Alcohol Breath Test. In this session, Dr. Anderson provided a detailed and research-driven examination of lung physiology as it relates to breath alcohol testing, with a focus on airway gas exchange. 

    Drawing from decades of experimental data and mathematical modeling, he explained that highly soluble compounds such as ethanol do not behave like traditional alveolar gases (such as carbon dioxide), but instead undergo significant exchange within the airways. He described how the airway lining acts as a reservoir for alcohol, leading to continuous absorption and reabsorption during the breathing cycle, which challenges the assumption that breath alcohol reflects a simple alveolar equilibrium.

    The presentation also explored how multiple physiological and procedural factors can influence breath alcohol results, including breathing pattern, breath temperature, exhaled volume, and lung size. Dr. Anderson emphasized that these variables can meaningfully affect measured concentrations, underscoring the importance of understanding underlying physiology when interpreting breath test data. 

    The webinar provided attendees with a deeper scientific framework for evaluating breath alcohol testing and highlighted the complexity of interpreting results in forensic contexts. The webinar recording is available here: Airway Gas Exchange - The Foundation of the Alcohol Breath Test - Joe Anderson Webinar.mp4
  • 24 Feb 2026 5:39 PM | Guy Oldaker III

    IAFTC Newsletter. Volume 2. Issue 1. February 24, 2026

    Guy Oldaker III, J.D., Ph.D.¹

    ¹ guyoldaker3@yahoo.com, 467 Heritage Drive, Lewisville, NC 27023

    This is an open-access article under the CC BY-NC-ND license.

    Download PDF.

    Abstract

    In North Carolina, the admissibility of results from testing for breath-alcohol in driving while impaired (“DWI”) cases depends upon results from checks of the accuracy of Intox EC/IR-II evidentiary analyzers.  Accuracy checks use dry-gas standards of known alcohol (technically, ethanol) concentration.  By regulation, an analyzer is deemed accurate if the result of the accuracy check either agrees with the expected result or is 0.01 less than the expected result.  Expected results depend upon atmospheric pressure, which varies with altitude above sea level.  Because of this, expected results must reflect adjustment for altitude when accuracy checks are done at elevations above sea level.  Defendants receive test tickets that document the results of accuracy checks.  However, test tickets do not report the expected results that are needed for comparison to the accuracy check results printed on test tickets.  Consequently, defendants cannot independently assess whether accuracy has been demonstrated.  This article shows how analyzer accuracy can be assessed in spite of the lack of information on test tickets.  Actual data from a test ticket issued in Yancey County, North Carolina, is used.  The assessment demonstrates that, contrary to what normally would be claimed, the analyzer is not accurate.  Therefore, the breath alcohol result for this defendant should not be admissible in evidence.  In addition, accuracy check records from preventive maintenance of the analyzers used in Yancey County were reviewed.  None of the test results for 2023 and 2024 should have been admissible.  The forensic reliability of past and present breath-alcohol test results from analyzers used in the Mountain and Piedmont Regions of North Carolina requires scrutiny.

    Introduction

    The subject of this article is the accuracy check of the Intox EC/IR-II analyzer [1] when used in North Carolina for DWI prosecutions.  Forensic scientists dealing with breath-alcohol measurements in North Carolina need to be aware of a potential question of forensic reliability when measurements are done at elevations above sea level in the Piedmont and Mountain Regions.  Here, forensic reliability means whether measurements satisfy legal rules dealing with evidentiary admissibility.


    This article identifies the statute and regulations defining forensic reliability; the data supplied to defendants; the application of the Ideal Gas Law and the Barometric Formula; and procedures for assessing whether reported results are, in fact, forensically reliable.  The article is written for a broad audience.  Forensic scientists are the main group.  However, forensic scientists often will be serving as expert consultants or witnesses in assisting attorneys.  Because of this, the article tries to bridge the gap between science and law.  At times, concepts are presented in a manner reflecting direct examination as well as legal analysis.  In addition, an attempt has been made, to the extent practicable, to simplify so as to make the concepts easier to understand by factfinders, who, in North Carolina, can be either judges or laypersons serving on a jury.  

    Brief Description of the Analyzer

    In North Carolina, the Intox EC/IR-II is the evidentiary analyzer used for DWI prosecutions [2].  In certain respects, this analyzer is unique to North Carolina.  This uniqueness is because Intoximeters, the manufacturer of the Intox EC/IR-II, configures the analyzer to an individual state’s requested specifications.  Consequently, although many of the basic functions of the Intox EC/IR-II remain the same from state to state, the operation and outputs can differ substantially.  To take one example, in Arkansas, the Intox EC/IR-II prints results to three decimal places [3].  By contrast, for North Carolina, the analyzer reports all alcohol concentration measurements truncated to two decimal places.  For example, 0.079 truncates to 0.07.
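
    Truncation differs from ordinary rounding, which matters for values just below a threshold. The following Python sketch illustrates the difference using the example above. It is only an illustration of the arithmetic, not a description of the analyzer's firmware, and the function name is hypothetical.

```python
from decimal import Decimal, ROUND_DOWN


def truncate_to_two_places(value):
    """Drop, rather than round, every digit after the second decimal place."""
    return Decimal(value).quantize(Decimal("0.01"), rounding=ROUND_DOWN)


print(truncate_to_two_places("0.079"))  # 0.07
print(round(Decimal("0.079"), 2))       # 0.08, shown for comparison with ordinary rounding
```

    Decimal strings are used instead of binary floats so that a value such as 0.079 is represented exactly before it is truncated.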


    The main components of North Carolina’s Intox EC/IR-II analyzer are an infrared sensor, an electrochemical sensor, a flow sensor, and an atmospheric pressure sensor (a barometer) [4].  The infrared sensor and flow sensor operate together.  Their two functions are to alert to mouth alcohol and to determine when an appropriate sample has been obtained for analysis.  With an appropriate sample acquired, a small portion (about 1 mL) is directed to the electrochemical sensor.  The analysis of alcohol takes place at the electrochemical sensor.


    A key function of the barometer is to account for the effect of atmospheric pressure on the alcohol concentration of the dry-gas standard when the analyzer’s calibration is checked [3, 5].  The check uses (as a reference material) a mixture of alcohol (specifically ethanol) with a known concentration in nitrogen gas.  This mixture is contained and compressed in a small tank.  North Carolina calls the tank a “gas canister”[6].  The gas canister is located in a locked compartment within the analyzer and cannot be seen during normal operation.  North Carolina refers to these calibration checks as “accuracy checks” [7].

    North Carolina’s Rules for the Accuracy Check

    One statute and two regulations govern forensic reliability and acceptance criteria for the accuracy check, respectively.  North Carolina General Statutes § 20-139.1(b)(1) addresses forensic reliability, that is, whether results are admissible in evidence:  “A chemical analysis of the breath administered pursuant to the implied-consent law is admissible in any court or administrative hearing or proceeding if . . . [i]t is performed in accordance with the rules of the Department of Health and Human Services.”


    The pertinent rules are contained in two regulations.  One regulation supplies the criterion for forensic accuracy:  the analyzer “shall be deemed accurate” if the result of the accuracy check is either “obtaining the expected result or 0.01 less than the expected result as specified in Item (10) of this Rule” [7].  The other regulation, Item (10), stipulates that the sample provided by the gas canister “corresponds to the equivalent concentration of 0.08” [6].  (Units of g/210 L are implied for the 0.08 value.  This article follows this convention.)  No statute or regulation supplies guidance on the interpretations of the terms “expected result” and “equivalent concentration.”
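
    Read literally, the quoted criterion is a two-case comparison: the check result either equals the expected result or is 0.01 below it. The Python sketch below encodes that literal reading. Because the regulations do not define “expected result,” the sketch takes it as an input supplied by the caller; the function name is hypothetical.

```python
from decimal import Decimal


def deemed_accurate(check_result, expected_result):
    """Apply the quoted criterion: the check result equals the expected result
    or is exactly 0.01 less than it. Both arguments are two-decimal strings."""
    difference = Decimal(expected_result) - Decimal(check_result)
    return difference in (Decimal("0.00"), Decimal("0.01"))


print(deemed_accurate("0.08", "0.08"))  # True
print(deemed_accurate("0.07", "0.08"))  # True
print(deemed_accurate("0.09", "0.08"))  # False
print(deemed_accurate("0.06", "0.08"))  # False
```

    Note that the criterion is one-sided: a check result that is 0.01 above the expected result does not satisfy it.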


    Critically, in court proceedings throughout North Carolina, an accuracy check result of either 0.08 or 0.07 is presently assumed to establish that an analyzer is accurate, and therefore, its results are forensically reliable and admissible in evidence.

    Operation of the Intox EC/IR-II

    The Forensic Tests for Alcohol Branch (“FTAB”) of the North Carolina Department of Health and Human Services has general responsibility for the analyzers.  Based upon the absence of published documentation, it appears that FTAB uses the analyzers “as received.”  With the sole exception of the accuracy check, there is no indication that FTAB checks the calibration of any of the analyzer’s sensors, including the barometer.  FTAB reports no quality assurance plan that would include checking the output of any sensor over its expected measurement range [8].


    Ordinarily, law enforcement officers, who, by regulation, are termed “chemical analysts,” [9] operate the analyzers when a subject’s breath-alcohol is tested.


    Figure 1 is a copy of an actual test ticket that a defendant received when breath alcohol was tested.  With the exception of identifying information, which has been redacted, this test ticket is representative of all test tickets currently issued in North Carolina.  The test ticket is the sole source of information about the breath-alcohol analysis and accuracy check, which is abbreviated ACCY CHK on the test ticket.  This defendant received nothing else.  Noteworthy is the fact that Yancey County is located in North Carolina’s Mountain Region.   Burnsville, the county seat, is at an elevation of 2,749 feet above sea level.  


    The result of the accuracy check was 0.08.  Based upon custom and a facial interpretation of the regulation, law enforcement, defendants, and their counsel would all reasonably conclude that this result demonstrates that the analyzer “shall be deemed accurate.”  As will be explained below, this conclusion is wrong.  Indeed, quite the opposite is true:  The analyzer should be deemed not accurate.  The result for this defendant’s breath-alcohol concentration is not forensically reliable.  The result should not be admissible in evidence.
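
    To give a rough sense of the kind of altitude adjustment at issue, the Python sketch below estimates an expected result at Burnsville's elevation. It rests on assumptions that are not taken from the article: International Standard Atmosphere constants in the Barometric Formula, a dry-gas value that scales in direct proportion to ambient pressure (per the Ideal Gas Law), a nominal 0.08 defined at sea-level pressure, and truncation to two decimal places. The full article's own calculation and inputs may differ.

```python
from decimal import Decimal, ROUND_DOWN

# Assumptions (not from the article): International Standard Atmosphere constants.
SEA_LEVEL_TEMPERATURE_K = 288.15
LAPSE_RATE_K_PER_M = 0.0065
GRAVITY_M_PER_S2 = 9.80665
MOLAR_MASS_AIR_KG_PER_MOL = 0.0289644
GAS_CONSTANT_J_PER_MOL_K = 8.31446
METERS_PER_FOOT = 0.3048


def pressure_ratio(elevation_ft):
    """Barometric formula: ambient pressure as a fraction of sea-level pressure."""
    height_m = elevation_ft * METERS_PER_FOOT
    exponent = (GRAVITY_M_PER_S2 * MOLAR_MASS_AIR_KG_PER_MOL) / (
        GAS_CONSTANT_J_PER_MOL_K * LAPSE_RATE_K_PER_M
    )
    return (1 - LAPSE_RATE_K_PER_M * height_m / SEA_LEVEL_TEMPERATURE_K) ** exponent


def expected_result(nominal, elevation_ft):
    """Scale the nominal dry-gas value by the pressure ratio, then truncate."""
    adjusted = Decimal(str(nominal * pressure_ratio(elevation_ft)))
    return adjusted.quantize(Decimal("0.01"), rounding=ROUND_DOWN)


print(pressure_ratio(2749))         # roughly 0.90 of sea-level pressure
print(expected_result(0.08, 2749))  # 0.07 under these assumptions
```

    Under these assumptions the expected result at 2,749 feet falls below 0.08 once truncated, so a check result of 0.08 would be higher than expected and not equal to or 0.01 less than it. Actual barometric pressure also varies with weather, so in practice the measured pressure at the time of the check, not a standard-atmosphere estimate, would govern.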

    Read the rest in PDF.


  • 23 Feb 2026 10:14 AM | Aaron Olson (Administrator)

    On February 20, 2026, at an IAFTC webinar, Bethany Pridgen delivered a presentation on forensic accreditation and what it does, and does not, guarantee.

    Her central point was straightforward: ISO 17025 accreditation is important, but it is not proof that a laboratory’s methods are scientifically optimal. Accreditation confirms that procedures are documented, followed, and supported by required records. It does not certify that every methodological choice is the best available or free from weakness.

    Pridgen highlighted how scientifically questionable practices, such as single-point calibration in quantitative blood alcohol testing, persisted for years within accredited systems before being formally cited. The issue was not the absence of paperwork. The issue was whether the underlying science was sufficiently scrutinized.

    She also raised concerns about the concentration of forensic accreditation and the insularity of assessor pools. When assessors are drawn largely from within the same forensic culture, widely accepted practices may go unchallenged. Science benefits from cross-disciplinary input and independent scrutiny.

    Pridgen closed with a constructive message. Accreditation should be viewed as a minimum requirement for quality, not a substitute for critical thinking. Forensic toxicology earns trust not by invoking accreditation alone, but by continuously examining and strengthening its scientific practices.

    Access the webinar recording and transcript here.


  • 2 Feb 2026 10:21 AM | Joshua Ott

    IAFTC Newsletter. Volume 2. Issue 1. February 02, 2026.

    Joshua Ott¹

    ¹ Caselock, Inc., P.O. Box 285, Lebanon, GA 30146

    This is an open-access article under the CC BY-NC-ND license.


    Download PDF.

    Abstract

    This article examines the findings of a randomized clinical trial published in JAMA Psychiatry in 2023 evaluating the classification accuracy of Field Sobriety Tests (FSTs) with respect to cannabis exposure and driving impairment (as determined via a driving simulation). The study involved 184 adult cannabis users who were randomly assigned to placebo (0.02% THC), 5.9% THC, or 13.4% THC groups. 

    Certified Drug Recognition Expert (DRE) Instructors administered a battery of field sobriety tests following dosing to the participants. Driving simulations were also performed by the participants. The participants who received a placebo dose are the primary emphasis of this article. 49.2% of the placebo-dosed participants were classified as FST impaired, despite the pretreatment simulator performance showing no evidence of residual effects. The individual FSTs, which included the Walk and Turn, One Leg Stand, Modified Romberg Balance Test, Finger to Nose, and Lack of Convergence tests, demonstrated high false-positive rates. 

    These findings raise significant concerns regarding the accuracy of these FSTs in distinguishing impaired from unimpaired individuals. The results highlight the need for reassessment of current roadside testing practices and further research into more accurate and reliable evaluations.

    Introduction

    Field Sobriety Tests (FSTs) are widely used by law enforcement officers to assist in determining whether a driver should be arrested for driving under the influence (DUI). These tests play a central role in DUI arrest and charging decisions, including cases involving suspected drug impairment. While multiple studies [2,3,4,5,6,7,8] have evaluated the Standardized Field Sobriety Tests (SFSTs), namely Horizontal Gaze Nystagmus, Walk and Turn, and One Leg Stand, in relation to Blood Alcohol Concentration (BAC), these studies validated classification accuracy relative to BAC thresholds rather than alcohol-related impairment as measured by independent performance outcomes. There is no known research validating these or any other FSTs for identifying drug and/or alcohol impairment. Despite this limitation, officers frequently rely on both standardized and non-standardized FSTs when evaluating suspected drug-impaired drivers.


    A randomized clinical trial published in JAMA Psychiatry sought to examine the classification accuracy of FSTs with respect to cannabis exposure and driving impairment (as determined via a driving simulation). The study enrolled 184 adult participants between the ages of 21 and 55 who were active cannabis users and met the inclusion and exclusion criteria. Participants were required to abstain from cannabis use for at least two days prior to the experiment. A pretreatment driving simulation test demonstrated no residual effects of prior cannabis use. 


    Participants were randomly assigned to receive one of three different doses: a placebo dose (0.02% THC), a 5.9% THC dose, or a 13.4% THC dose. Following dosing, participants completed driving simulations and underwent a battery of FSTs administered by eleven certified DRE Instructors. The officers were blinded to the dosing condition and were asked to indicate whether they believed participants had received active THC or a placebo. Horizontal Gaze Nystagmus was not administered, as it is not expected to be present from cannabis use.


    This article focuses on the results from the placebo-dosed group to examine the false-positive rates associated with commonly used FSTs. The findings raise important questions regarding the accuracy of the FSTs at correctly identifying sober subjects as not impaired and raise concerns for all DUI investigations, including cases involving only alcohol.

    Overview of the Study

    There were 184 participants ranging in age from 21 to 55 years old who were cannabis users. 

    Inclusion and Exclusion Criteria

    Inclusion criteria required participants to:

    • Be 21–55 years of age

    • Have used cannabis four or more times in the past month

    • Hold a valid driver’s license

    • Have driven at least 1,000 miles in the previous year

    Exclusion criteria included:

    • History of traumatic brain injury

    • Significant medical or psychiatric conditions

    • Positive pregnancy test result

    • Positive urine screen for nonprescription amphetamines, benzodiazepines, barbiturates, opiates, oxycodone, cocaine, methamphetamine, or phencyclidine

    • Past-year substance use disorder

    • Oral fluid THC concentration greater than 5 ng/mL on the testing day

    Study Design

    The participants were to abstain from cannabis use for at least 2 days prior to training and experiment days. On the experiment day, the participants were tested for drugs and alcohol. Prior to dosing, they performed a driving simulation. 

    Participants were randomly assigned to one of three groups:

    • Placebo group (0.02% THC)

    • Low-dose THC group (5.9% THC)

    • High-dose THC group (13.4% THC)

    Sixty-three (63) participants received the placebo dose. For purposes of this paper, the primary focus is on the placebo-dosed group.

    After the dosing, the participants performed driving simulations followed by Field Sobriety Tests (FSTs) four times during the remainder of the day. (The results provided were from the first driving simulation and FST administration.)

    A question that must be asked is whether the participants in the placebo group were under the influence of, or experiencing residual effects from, marijuana use prior to the experiment. The authors addressed this concern, stating, “we found no differences in use intensity or time since use and no evidence of residual effects on pretreatment simulator performance.”

    Field Sobriety Test Administration

    Eleven certified DRE Instructors administered the Field Sobriety Tests (FSTs). DRE Instructor is the highest level of certification that a law enforcement officer can obtain for DUI enforcement. The officers were asked, “Which treatment do you think the participant received?” Answers were given using a 5-point scale (from “strongly believed…real marijuana” [1] to “strongly believed…placebo” [5]). The officers did not observe the driving simulation.

    Tests Administered

    The following tests were used:

    • Walk and Turn (WAT)

    • One Leg Stand (OLS)

    • Modified Romberg Balance Test

    • Finger to Nose

    • Lack of Convergence

    Horizontal Gaze Nystagmus (HGN) was not administered because it is not expected to be present from cannabis use.

    FST Results

    Overall, officers classified 49.2% of participants in the placebo-dosed group as “FST impaired,” using the authors’ terminology. The authors noted, “of participants classified as FST impaired, officers strongly or somewhat believed that 99.2% of participants had received THC, suggesting they suspected all poorly performing participants to be under the influence.” 

    Field Sobriety Test outcomes in this study function as a binary classification, in which participants are categorized as either “FST impaired” or “unimpaired.” In a binary classification framework where all subjects are known to be unimpaired, an ideal test would approach a false-positive rate near zero; a rate approaching 50% indicates the test performs no better than random assignment. Among placebo-dosed participants, 49.2% were classified as “FST impaired,” a rate consistent with chance-level classification rather than meaningful discriminatory accuracy.
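    The false-positive rate for a group in which every subject is known to be unimpaired can be sketched as follows. This is only an illustration: the count of 31 participants is an assumption back-calculated from the reported 49.2% of 63 placebo-dosed participants, not a figure quoted from the study, and the function name is hypothetical.

```python
# Minimal sketch: false-positive rate when every subject is truly unimpaired.
# ASSUMPTION: 31 of 63 is back-calculated from the reported 49.2%; it is not
# a count quoted from the study.

def false_positive_rate(false_positives: int, true_negatives: int) -> float:
    """FPR = FP / (FP + TN), computed over subjects who are truly negative."""
    total_negatives = false_positives + true_negatives
    if total_negatives == 0:
        raise ValueError("no truly negative subjects to evaluate")
    return false_positives / total_negatives

placebo_total = 63
classified_impaired = 31  # assumed count, see note above
classified_unimpaired = placebo_total - classified_impaired

placebo_fpr = false_positive_rate(classified_impaired, classified_unimpaired)
print(f"Placebo-group false-positive rate: {placebo_fpr:.1%}")
```

    Because the placebo group contains no true positives, every “FST impaired” classification in that group counts as a false positive, which is why the rate can be read directly from the group.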

    These numbers have important implications. Put into a real-world context, officers usually suspect that a driver may be under the influence prior to administering the FSTs based on the driving and initial interaction with the driver. If the FSTs then identify the driver as “FST impaired,” it is highly likely that the officer will charge them with DUI. 

    The results for each of the tests individually are provided in the table below. 

    Comparison to Prior Research

    The San Diego Study [5], which law enforcement currently relies upon to cite the accuracy of the SFSTs (HGN, Walk and Turn, and One Leg Stand), reported similar false-positive rates: 52% for the Walk and Turn and 41% for the One Leg Stand. In that study, the false positives had alcohol in their systems but were below the legal limit. It has been argued that these individuals may still have been impaired. The present study counters that argument by demonstrating comparable false-positive rates among placebo-dosed participants.

    There are no known studies that have been conducted to validate the Lack of Convergence, Modified Romberg Balance, or Finger to Nose tests.

    Driving Simulation Results

    The authors reported that “FST impairment had a sensitivity of 80.9% and specificity of 35.7% relative to driving simulator impairment.” When driving simulator impairment is used as the reference standard, the reported specificity corresponds to a false-positive rate of 64.3% (1 − specificity) for the FSTs.
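    The conversion from specificity to a false-positive rate is simple arithmetic and can be sketched as below. The sensitivity and specificity values are the ones quoted above; the false-negative rate is derived here for illustration only and is not a figure reported in this article.

```python
# Minimal sketch: converting reported sensitivity/specificity into error rates.
sensitivity = 0.809  # reported: FST impairment relative to simulator impairment
specificity = 0.357  # reported

# Share of simulator-unimpaired participants the FSTs flagged as impaired.
fpr_from_specificity = 1.0 - specificity
# Share of simulator-impaired participants the FSTs did not flag (derived here).
fnr_from_sensitivity = 1.0 - sensitivity

print(f"False-positive rate: {fpr_from_specificity:.1%}")
print(f"False-negative rate: {fnr_from_sensitivity:.1%}")
```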

    The study did not document the results for each of the groups, but it provided a table showing the overall results. The study reported overall simulator results showing 112 participants classified as “not impaired” and 68 classified as “impaired,” totaling 180 participants. The reason four participants were not included in this summary was not explained.

    The reported simulator data allow a bounding calculation. If all 63 placebo participants were classified as “not impaired” on the driving simulator, then 49 THC-dosed participants (40.4%) were also classified as “not impaired” during simulated driving. If any of the placebo-dosed participants were classified as “impaired” on the driving simulator, then the percentage of THC-dosed participants classified as “not impaired” would increase. These findings suggest that recent cannabis use does not necessarily correspond to driving impairment.
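    The bounding arithmetic above can be sketched as follows. This is a rough illustration under stated assumptions: the THC-dosed total of 121 is taken as 184 enrolled minus 63 placebo, the four participants missing from the simulator summary are not assigned to any group, and the function name is hypothetical.

```python
# Sketch of the bounding arithmetic on the reported simulator results.
# ASSUMPTION: 121 THC-dosed participants (184 enrolled - 63 placebo); the four
# participants missing from the simulator summary are not assigned to a group.
ENROLLED = 184
PLACEBO = 63
SIM_NOT_IMPAIRED = 112  # reported count classified "not impaired" on the simulator

def thc_not_impaired(placebo_impaired: int) -> int:
    """THC-dosed participants classified 'not impaired', given how many
    placebo participants were classified 'impaired' on the simulator."""
    if not 0 <= placebo_impaired <= PLACEBO:
        raise ValueError("placebo_impaired must be between 0 and 63")
    placebo_not_impaired = PLACEBO - placebo_impaired
    return SIM_NOT_IMPAIRED - placebo_not_impaired

thc_total = ENROLLED - PLACEBO
for k in (0, 5, 10):  # hypothetical numbers of placebo participants rated impaired
    n = thc_not_impaired(k)
    print(f"placebo impaired = {k:2d} -> THC not impaired = {n} ({n / thc_total:.0%})")
```

    Each placebo participant rated “impaired” on the simulator moves one “not impaired” classification into the THC-dosed groups, so the figure of 49 is a lower bound.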

    Discussion

    The results of this study show high false-positive rates for individual FSTs and when officers used all the tests combined to classify someone as “FST impaired.” The implications are significant, particularly given that the study was conducted in a controlled environment free from roadside factors such as weather, uneven surfaces, traffic, and distractions that could affect a person’s performance. Additionally, officers in this study relied solely on FST performance, without behavioral or driving indicators, which differs from real-world conditions. Even so, placebo-dosed participants were frequently incorrectly classified as “FST impaired.”

    The authors stated: “Officers also knew that many participants would receive placebo, and it is surprising that THC exposure was assumed in almost all individuals who performed poorly on the FSTs. Officers may encounter situations in which they suspect recent cannabis use (eg, noticing cannabis paraphernalia or odors, drivers stating that they use cannabis); such information could potentially influence the belief that poor FST performance may be causally related to cannabis use.”

    Law enforcement training places substantial emphasis on FST performance in arrest decisions. Given this reliance, the high false-positive rates observed in this study warrant serious scrutiny into the current tests used by law enforcement.

    “Cognitive, or confirmation, bias refers to seeking or interpreting evidence that supports existing beliefs or hypotheses often outside of awareness. Law enforcement in state legal jurisdictions emphasizes that THC-related impairment, and not just exposure, is the question of interest. Confirmation bias, common in the general population, can remain despite advanced training (including in law enforcement and forensic sciences), and it may have been a factor in this study.”

    Conclusions

    The findings of this study raise substantial concerns regarding the accuracy and reliability of FSTs. False-positive rates approaching or exceeding 50% suggest that FST outcomes alone may offer little discriminatory value between impaired and unimpaired individuals. In some cases, decisions may be no more accurate than chance.

    There is a clear need to either improve the specificity of existing tests or develop new methods capable of accurately and reliably distinguishing impairment from non-impairment. Until such measures are implemented, both law enforcement and the legal community should be informed of the limitations and error rates of these tests.

    Future research should utilize sober participants to examine the effects of environmental conditions, footwear, age, and physical injuries on FST performance.

    Acknowledgements

    The author used ChatGPT to assist in drafting the abstract and introduction and to improve clarity and conciseness of the manuscript text, including the conclusion. The AI tool was used solely for language refinement and organizational assistance. All substantive content, data interpretation, conclusions, and final editorial decisions were made by the author, who takes full responsibility for the accuracy and integrity of the work.

    Conflict of Interest Disclosures

    The author is a consultant and expert witness for DUI and BUI cases, but has received no funding or compensation for the preparation of this article.

    References

    [1] Marcotte TD, Umlauf A, Grelotti DJ, Sones EG, Mastropietro KF, Suhandynata RT, et al. Evaluation of Field Sobriety Tests for Identifying Drivers Under the Influence of Cannabis: A Randomized Clinical Trial. JAMA Psychiatry; 2023.

    [2] Burns M, Herbert M. Psychophysical Tests for DWI Arrest. U.S. Department of Transportation National Highway Traffic Safety Administration; 1977.

    [3] Tharp V, Burns M, Moskowitz H. Development and Field Test of Psychophysical Tests for DWI Arrest. Southern California Research Institute; 1981.

    [4] Anderson T, Schweitz R, Snyder M. Field Evaluation of a Behavioral Test Battery for DWI. U.S. Department of Transportation National Highway Traffic Safety Administration; 1983.

    [5] Stuster J, Burns M. Validation of the Standardized Field Sobriety Test Battery at BACs Below 0.10 Percent. United States. National Highway Traffic Safety Administration; 1998.

    [6] Burns M, Dioquino T. A Florida Validation Study of the Standardized Field Sobriety Test (S.F.S.T.) Battery. United States. National Highway Traffic Safety Administration; 1997.

    [7] Burns M, Anderson E. A Colorado Validation Study of the Standardized Field Sobriety Test (SFST) Battery. U.S. Department of Transportation National Highway Traffic Safety Administration; 1995.

    [8] Burns M. The Robustness of the Horizontal Gaze Nystagmus Test. Southern California Research Institute; 2007.



  • 5 Jan 2026 2:26 PM | ​Darcy Richardson

    IAFTC Newsletter. Volume 2. Issue 1. January 5, 2026.

    Darcy Richardson¹

    ¹Vermont Forensic Services, 146 River Street, Milton, VT 05468. Darcy.Richardson@vtforensicservices.com

    This is an open-access article under the CC BY-NC-ND license.

    A Different Scientific Upbringing

    My scientific upbringing was in an environmental laboratory, but not just any environmental lab, and not just at any time. It was 1999, and the lab had been bought out after the infamous and massive Intertek fraud of the late 1980s and 1990s had come to light [1]. The criminal trials, and thus the public eye on laboratory fraud, were contemporaneous with the start of my scientific career. To say that my lab was serious about quality control is putting it lightly.

    The warnings from supervisors were clear: follow proper laboratory practice or go to prison. The people from the Intertek Texas Lab were a stark warning of what happens when laboratory staff don’t take their roles seriously.

    Our Standard Operating Procedures were uniform, established, and found throughout the country. Audits were monthly, sometimes even more frequently, due to the various organizations, states, and federal entities involved. We were used to being evaluated and critiqued, and while we would sometimes roll our eyes at the quality control department insisting that something be rerun because it was out by 0.01%, we did it anyway.

    From Uniform Standards to the “Wild West” 

    Imagine my surprise when I moved into forensics in late 2002.

    Ask a forensic scientist about the state of things, and you’d hear the term “The Wild West.” Gone were the uniform SOPs that the entire country followed. Accreditation? Never heard of it. Audits? Yeah, pretty sure clinical did that kind of thing.


    Early Lessons in Risk and Responsibility

    Forensic Labs were flying blind and making it up as they went along. I’ll never forget being in a classroom when an attendee admitted they were using urine for DUI cases, and the entire room turned around to stare in disbelief.

    Urine, of course, is a perfectly reasonable matrix to analyze to demonstrate exposure or past use, but it is not appropriate to determine a concentration in effect at a certain point in time or to support impairment, as one needs to do in a DUI case.

    My upbringing in environmental testing meant that in my new forensics lab, I insisted on quality control, following Good Laboratory Practice [2], erring on the side of caution. It was a position that often led to head-butting and ultimately being called a “whistleblower” [3,4].

    Over the twenty-plus years I’ve been involved in forensics, there have been ongoing conversations about the need to tame the Wild West. The need for standards and uniformity, proficiency testing, and accreditation. To bring forensics in line with other scientific areas, and to demand that science used in the courts meet the requirements mandated everywhere else.

    And while things are improving, we’re not quite there yet.

    Why Accreditation Alone Is Not Enough

    Accreditation now exists, but it’s still overbroad, mandating only minimal best practices. When it began, it was taken from manufacturing standards and is still largely based on ISO. States still have varied requirements from practically none at all to more specific rules regarding instrumentation, accuracy, or precision. From there, programs and laboratories vary in what is required. Some are seeking to be in line with the published literature and going above and beyond, and others are falling far short of Good Laboratory Practice. Accreditation can’t go as far as it needs to without uniform standards.

    The Role of the Academy Standards Board

    Enter the Academy Standards Board (ASB) under the American Academy of Forensic Sciences [5].

    The ASB consists of multiple Consensus Bodies covering all areas of forensics from Anthropology to Wildlife, which work in conjunction with OSAC to draft standards, guidelines, and best practices for forensic work. It seeks to establish that uniformity that is already so standard in environmental testing and to improve the quality of forensic work performed nationwide.

    Quality Assurance may be a weird thing to be passionate about, but when a colleague recommended I apply to the ASB Toxicology Subcommittee that was looking for new members, I jumped at the opportunity.

    I’ve been honored to work with the Toxicology Consensus Body and various working groups over the years to help adjudicate comments and finalize documents for approval. This work is all volunteer, and I’ve had the pleasure of working with dedicated individuals from manufacturing, independent and public labs, and private practice in an attempt to bring forensics to where it should be. The end goal is that throughout the country, we can be assured that the science entering the courtroom meets the level of performance and acceptability in other areas of laboratory work.

    Changing the Culture Is the Hardest Part

    It is not only a long process, standardizing an entire discipline, but it also requires a change in culture [6]. Not everyone grew up in an environmental lab with the threat of prison held over their heads, after all.

    For each new standard, a slew of questions must be answered, and education is needed on why “but we’ve always done it this way” is not a reason to skip good quality assurance. There is always some resistance. Always some argument that the work is good enough. Some of that comes from always being a cowboy in that Wild West and chafing at authority. Part of it comes from being so entrenched with police departments that the focus is not on science but on being a partner for prosecution.

    Independence, Bias, and the Courtroom

    This issue of independence has long been discussed in the forensic world [7]. My own forensic lab was maintained in the Health Department in some attempt to stay separate. However, regardless of knowing about cognitive bias, when you see police officers or attorneys on a regular basis in your job, you can’t help but view them as your colleagues. And that can be okay as long as the science always comes first.

    If science is to be used in the courtroom, it must meet scientific standards. That is paramount. Being accredited isn’t a free pass.

    When the Science Speaks for Itself

    A question that often comes up when discussing standards is “What will an attorney do with this document?” I have never found that to be a compelling concern. 

    Why? 

    Because you don’t have to worry about being confronted in court if the work stands on its own. It’s easy to explain that science is constantly changing and improving, and that we are always striving to meet that standard.

    One day, while I was still working for the state lab and testifying mostly for the prosecution, a defense attorney told me that I was the only one in the room telling the truth. “The cops are lying, I’m lying, my clients are lying, but not you.” I took that as the highest compliment. I still work to that standard.

    Where Forensics Still Needs to Go

    Is forensics where it needs to be? No, but there are people working to get it there. It will take time, effort, and a change in culture, but eventually it will get there.

    Conflicts of Interest

    Darcy Richardson, MS, is a forensic toxicology consultant and provides expert testimony in civil and criminal cases. She is a member of the Academy Standards Board Toxicology Consensus Body, where her participation is voluntary and unpaid. She has no financial interest in the topics discussed in this paper.

    References

    [1] Texas lab techs allegedly altered data. UPI 2000. https://www.upi.com/Archives/2000/09/21/Texas-lab-techs-allegedly-altered-data/4774969508800/ (accessed December 29, 2025).

    [2] Jena GB, Chavan S. Implementation of Good Laboratory Practices (GLP) in basic scientific research: Translating the concept beyond regulatory compliance. Regul Toxicol Pharmacol 2017;89:20–5. https://doi.org/10.1016/j.yrtph.2017.07.010.

    [3] Bromage A. DUI Chemists Blow the Whistle on Vermont’s Breath-Testing Program. Seven Days: Vermont’s Independent Voice 2011. https://www.sevendaysvt.com/news/dui-chemists-blow-the-whistle-on-vermonts-breath-testing-program-2143006 (accessed November 19, 2023).

    [4] Olson A, Ramsay C. Errors in toxicology testing and the need for full discovery. Forensic Sci Int Synerg 2025;11:100629. https://doi.org/10.1016/j.fsisyn.2025.100629.

    [5] Academy Standards Board. American Academy of Forensic Sciences 2025. https://www.aafs.org/academy-standards-board (accessed December 29, 2025).

    [6] Mnookin JL, Cole SA, Dror IE, Fisher BAJ, Houck MM. The need for a research culture in the forensic sciences. UCLA L Rev 2010.

    [7] Olson A. Truth, power, and the crisis of forensic independence. Forensic Sci Int Synerg 2025;11:100647. https://doi.org/10.1016/j.fsisyn.2025.100647.



  • 24 Dec 2025 9:14 AM | ​Joshua Ott

    IAFTC Newsletter. Volume 1. Issue 1. December 24, 2025.

    Joshua Ott¹

    ¹Caselock, Inc., P.O. Box 285, Lebanon, GA 30146

    This is an open-access article under the CC BY-NC-ND license.

    Abstract

    The Horizontal Gaze Nystagmus (HGN) test is widely presented in courtrooms as an accurate and valid component of the Standardized Field Sobriety Test (SFST) battery. However, the 2007 Robustness of the Horizontal Gaze Nystagmus Test study, authored by Dr. Marceline Burns and sponsored by the National Highway Traffic Safety Administration (NHTSA), raises significant concerns regarding the test’s accuracy, validity, and false positive rates. This paper critically analyzes the raw data from the study, specifically the stimulus variation experiment, and compares those findings to the conclusions reported by the study’s author. When evaluated using the HGN criterion established in the San Diego Study and still taught in the 2025 edition of the SFST Manual (four or more clues indicating a BAC of 0.08 g/dL or more), the overall false positive rate was 67% when administered correctly, and false positive rates ranged from 79% to 92% when the stimulus position was altered. Despite these findings, the study’s published conclusions assert that HGN is “robust” and unaffected by minor procedural deviations. This paper demonstrates that the reported conclusions were achieved only after the study’s author altered the criterion of a false positive, lowering the BAC threshold from 0.08 g/dL to 0.03 g/dL. The analysis presented here reveals substantial issues with the study and the rate of false positives for HGN, even when administered in accordance with the NHTSA standard.

    Introduction

    The Horizontal Gaze Nystagmus (HGN) test has long been portrayed in courtrooms as a highly accurate and valid indicator of a person’s BAC being at or above the legal limit and remains a central component of the Standardized Field Sobriety Test (SFST) battery. Yet the scientific foundation for this confidence warrants renewed scrutiny. The 2007 study, The Robustness of the Horizontal Gaze Nystagmus Test, was authored by Dr. Marceline Burns and funded by the National Highway Traffic Safety Administration (NHTSA). Because Dr. Burns played a central role in developing the SFSTs, including authoring or co-authoring five of the six studies foundational to their use, her conclusions carry significant influence.

    However, a close examination of the raw data from the Robustness Study demonstrates a substantial discrepancy between the data and the conclusions published in the report. When the test is administered and interpreted correctly, as established in the San Diego Study, a score of four or more clues indicates a blood alcohol concentration (BAC) of 0.08 g/dL or more. Under this established interpretation criterion, the raw data exhibit an alarmingly high false positive rate. The false positive rates increase further when the stimulus position deviates from NHTSA’s standardized procedures. Despite these findings, the study characterizes HGN as “robust,” a conclusion reached only after redefining a false positive by lowering the threshold from 0.08 g/dL to 0.03 g/dL.

    This paper provides a critical analysis of the study’s methodology, its data, and its conclusions. By comparing the study’s raw data to the criterion governing HGN interpretation, this analysis demonstrates that the claimed robustness of HGN is not supported by the underlying data. In doing so, it illuminates significant implications for the admissibility, accuracy, validity, and weight of HGN evidence in impaired-driving cases.

    Study Overview

    The Robustness Study (Horizontal Gaze Nystagmus Test Study)(1) was published in 2007 and was sponsored by the National Highway Traffic Safety Administration (NHTSA). It was authored by Dr. Marceline Burns. Dr. Burns was one of the investigators who developed the SFSTs, and she was an author or co-author of five out of the six studies (1977(2), 1981(3), Colorado(4), Florida(5), and San Diego(6)) used to develop and validate the SFSTs, which include the HGN test. When analyzing the Robustness Study, it is important to know and understand that Dr. Burns was intimately familiar with HGN and its scoring criterion.

    The study was designed to address defense attorney arguments that deviations from the standardized procedures in HGN administration invalidate the test. Three experiments were conducted. The first experiment examined variables in the stimulus: stimulus speed during Lack of Smooth Pursuit, elevation of the stimulus throughout the HGN test, and distance of the stimulus from the subject’s face. The second experiment examined the participants’ posture (standing, sitting, or lying down). The third experiment examined the participants’ vision (monocular vs. binocular vision). For this paper, the first experiment will be the primary focus. The raw data for the second experiment were not published, so that experiment cannot be analyzed with the same level of scrutiny. The third experiment will also be briefly discussed.

    Stimulus Variation Experiment

    This was a laboratory experiment that involved volunteers who were dosed to different Blood Alcohol Concentrations (BACs) that were measured using an “AlcoholSensor IV.” Seven experienced officers administered the HGN Test to the participants. 

    “A Video/HGN System (EyeDynamics, Inc) was used to make video records of participants’ eyes during examinations. The apparatus uses a small adjustable camera mounted in the right side of goggles that are worn by the participant. The camera transmits an image of the participant’s right eye to a television monitor and VCR which the examiner used to view the right eye. The open left side of the goggles allows the participant’s left eye to be viewed by the examiner.” 

    When analyzing the data, it must be understood that the criterion for HGN is that four or more clues indicate a BAC of 0.08 g/dL or more. This standard was established in the San Diego Study and remains in effect as of the 2025 edition of the NHTSA SFST Manual(7).

    The first variation tested was the speed of the stimulus. This involved moving the stimulus at both the “standard” speed (moving from the center of the face to one side as far as the eye can go in 2 seconds and 2 seconds back to the center) and faster than the “standard” (1 second out to the side and 1 second back to the center) when checking for Lack of Smooth Pursuit. One officer administered the test correctly, and the other moved the stimulus faster than the standard. During this variation, the false positive rate of the HGN test was 76% with an overall correct rate of 44% when the test was administered correctly. (Appendix 1)

    The second variation was the elevation of the stimulus. This involved holding the stimulus at the “standard” height (2” above eye level), lower than the “standard” (0” / at eye level), and higher than the “standard” (4” above eye level). During this variation, the false positive rate of the HGN test was 54% with an overall correct rate of 61% when the test was administered correctly. (Appendix 2)

    The last variation was the distance of the stimulus from the participant’s face. This involved holding the stimulus at the “standard” distance (12-15”), closer than the “standard” (10”), and further than the “standard” (20”). During this variation, the false positive rate of the HGN test was 69% with an overall correct rate of 47% when the test was administered correctly. (Appendix 3)

    Overall, for the entire experiment (all 3 variations combined), the false positive rate of the HGN Test was 67% with an overall correct rate of 50% when the test was administered correctly. (Appendix 4)

    HGN Test Accuracy by Stimulus Variation

    | Stimulus Variation       | Standard Condition Tested      | False Positive Rate (%) | Overall Correct Rate (%) | Appendix   |
    |--------------------------|--------------------------------|-------------------------|--------------------------|------------|
    | Speed of Stimulus        | 2 seconds out / 2 seconds back | 76%                     | 44%                      | Appendix 1 |
    | Elevation of Stimulus    | 2 inches above eye level       | 54%                     | 61%                      | Appendix 2 |
    | Distance of Stimulus     | 12–15 inches from face         | 69%                     | 47%                      | Appendix 3 |
    | Overall (All Variations) | Standardized administration    | 67%                     | 50%                      | Appendix 4 |
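    How rates like these are derived from exam records can be sketched as below. This is a hedged illustration, not the study's or the author's procedure: the exam records are invented, the names are hypothetical, and the rate definitions (false positive rate over participants below 0.08 g/dL; overall correct rate over all participants) are one plausible reading of the figures reported here.

```python
# Sketch: scoring HGN results against the criterion "4+ clues => BAC >= 0.08 g/dL".
# ASSUMPTIONS: the records below are invented for illustration, and the rate
# definitions are one plausible reading of the article's figures.
from typing import NamedTuple

class Exam(NamedTuple):
    clues: int   # HGN clues reported by the officer (0-6)
    bac: float   # measured BAC in g/dL

CLUE_CUTOFF = 4
BAC_LIMIT = 0.08

def score(exams: list[Exam]) -> tuple[float, float]:
    """Return (false_positive_rate, overall_correct_rate)."""
    if not exams:
        raise ValueError("no exams to score")
    # False positive: officer reports 4+ clues, but BAC is below the limit.
    false_pos = sum(e.clues >= CLUE_CUTOFF and e.bac < BAC_LIMIT for e in exams)
    below_limit = sum(e.bac < BAC_LIMIT for e in exams)
    # Correct: the clue-based call matches the BAC-based truth.
    correct = sum((e.clues >= CLUE_CUTOFF) == (e.bac >= BAC_LIMIT) for e in exams)
    fpr = false_pos / below_limit if below_limit else float("nan")
    return fpr, correct / len(exams)

hypothetical = [
    Exam(6, 0.10), Exam(4, 0.05), Exam(2, 0.03),
    Exam(4, 0.07), Exam(0, 0.00), Exam(3, 0.09),
]
hgn_fpr, hgn_correct = score(hypothetical)
print(f"False positive rate: {hgn_fpr:.0%}, overall correct: {hgn_correct:.0%}")
```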


    What were the results when the test was not administered in accordance with the “standard”?

    Stimulus Speed

    • (1 Second) Faster than the “standard” - Overall correct 58% with a false positive rate of 50%. 

    • This is the only variation tested that increased accuracy and decreased false positives.

    Stimulus Elevation

    • (0”) Lower than the “Standard” - Overall correct 44% with a false positive rate of 79%.

    • (4”) Higher than the “Standard” - Overall correct 38% with a false positive rate of 91%.

    Stimulus Distance

    • (10”) Closer than the “Standard” - Overall correct 29% with a false positive rate of 92%.

    • (20”) Further than the “Standard” - Overall correct 35% with a false positive rate of 84%.

    The false positive rates of HGN were very high when the test was administered correctly, but increased notably when the stimulus was not positioned in accordance with the standardized guidelines. What was Dr. Burns’ conclusion, and how did she address the false positives? 

    “In conclusion, HGN as used by law enforcement is a robust procedure. The study findings provide no basis for concluding that the validity of HGN is compromised by minor procedural variations.” 

    How did Dr. Burns come to this conclusion? By changing the criterion for what would be considered a false positive. Image 1 below is a screenshot from page 15 of the study.


    Image 1. Criterion for a “Hit” in the HGN.


    The highlighted area shows that four clues were considered a “hit” if the participant’s BAC was 0.03 or higher. Lowering the criterion drastically reduced the number of false positives. As can be seen in each of the tables, very few of the false positives that occurred under the established criterion were noted as false positives (denoted by **) by Dr. Burns. It is important to remember that Dr. Burns was the person who trained the officers in the San Diego Study on the updated criterion for HGN scoring. Her statement (from the box above), “the criteria by which scores have been classified as correct, false negative, or false positive as defined in the SFST curriculum appear below,” is not accurate.

    It appears that instead of using the correct criterion and applying it to the data to form her opinions, Dr. Burns altered it to make the data fit her opinions. 
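    The effect of changing the BAC threshold on the false positive count can be illustrated with a short sketch. The (clues, BAC) pairs are hypothetical and are not the study's data; the function name and its parameters are likewise illustrative.

```python
# Sketch: the same exam results counted under two false-positive definitions.
# ASSUMPTION: the (clues, BAC in g/dL) pairs are hypothetical, not the study's data.
exams = [(4, 0.02), (4, 0.04), (5, 0.05), (6, 0.07), (4, 0.09), (2, 0.01)]

def count_false_positives(results, bac_threshold, clue_cutoff=4):
    """Count exams scoring at or above the clue cutoff while the measured
    BAC is below the threshold used to define a 'hit'."""
    return sum(
        1 for clues, bac in results
        if clues >= clue_cutoff and bac < bac_threshold
    )

fp_at_008 = count_false_positives(exams, 0.08)  # established criterion
fp_at_003 = count_false_positives(exams, 0.03)  # lowered criterion
print(f"False positives at 0.08 g/dL: {fp_at_008}")
print(f"False positives at 0.03 g/dL: {fp_at_003}")
```

    With identical officer observations, every exam with four or more clues and a BAC between 0.03 and 0.08 g/dL is reclassified from a false positive to a correct result once the threshold is lowered.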

    Monocular Vision Experiment

    This was also a laboratory experiment and was listed as a preliminary analysis due to the limited number of participants. The participants were required to be functionally one-eyed, so data were obtained from only seven (7) individuals. The participants were dosed with alcohol and their BACs were measured with an AlcoSensor IV. Two certified DREs independently examined the participants. The false positive rate was 68%. (Appendix 5)

    What did Dr. Burns state? 

    “Because HGN appears to be reduced in a non-functioning eye, if officers were to rely solely on eye signs, they would only increase their false-negative rates, and they might improperly release one-eyed individuals. There is no evidence that HGN signs in such individuals will lead to false arrests.”

    NHTSA Training Manuals

    All references to the Robustness Study were removed from the 2018 SFST, ARIDE, and DRE curricula.(8) (At the time of this writing, the study is still available on the NHTSA website, but is still absent from the NHTSA curricula.) This removal occurred due to a concern that part of the study was conducted in a manner that substantially deviated from the normal protocol for administering and interpreting HGN. (The purpose of the study was to examine deviations from the standardized protocol.) A formal retraction of the study was not recommended. There was no additional information provided as to what the specific issues were, or which experiments of the study were the problem. 

    There were no concerns raised with Dr. Burns changing the criterion to alter the number of false positives that were reported in the study. 

    The data speaks for itself. These were experienced officers; their correct or incorrect administration of the HGN test was known, the participants’ BACs were known, and the number of clues reported by the officers was known.

    Conclusion

    The analysis of the Robustness Study reveals a critical issue that has substantial legal and scientific implications: the study’s conclusions are based on an altered definition of a false positive that does not align with the established NHTSA criterion. This change dramatically reduced the number of reported false positives and enabled the author to conclude that HGN was “robust,” despite data showing false-positive rates ranging from 67% when administered correctly to 92% when the stimulus position was altered.

    This alteration was not a trivial mistake. Dr. Marceline Burns was the principal or co-author of five of the six foundational SFST development and validation studies that courts repeatedly rely on. If the same researcher who authored the core validation studies subsequently alters the definition of a false positive to align outcomes with a predetermined conclusion, it raises legitimate concerns about the integrity of that researcher’s prior SFST validation research.

    The implications for legal proceedings are significant. Courts routinely rely on the SFST validation studies to support the admissibility and scientific accuracy and validity of HGN evidence. Given these issues, the weight afforded to HGN, and by extension the SFST battery, should be carefully reevaluated.

    Acknowledgements

    The author acknowledges the use of ChatGPT to assist in drafting and refining the abstract, introduction, and conclusion by improving wording, organization, and clarity based solely on the author’s original manuscript text. All substantive content, analysis, and conclusions are entirely the author’s own.

    Conflict of Interest Disclosures

    The author is a consultant and expert witness for DUI cases, but has received no funding or compensation for the preparation of this article.

    References

    [1] Burns M. The Robustness of the Horizontal Gaze Nystagmus Test. Southern California Research Institute; 2007.

    [2] Burns M, Herbert M. Psychophysical Tests for DWI (Driving While Intoxicated) Arrest. U.S. Department of Transportation National Highway Traffic Safety Administration; 1977.

    [3] Tharp V, Burns M, Moskowitz H. Development and Field Test of Psychophysical Tests for DWI Arrest. Southern California Research Institute; 1981.

    [4] Burns M, Anderson E. A Colorado Validation Study of the Standardized Field Sobriety Test (SFST) Battery. U.S. Department of Transportation National Highway Traffic Safety Administration; 1995.

    [5] Burns M, Dioquino T. A Florida Validation Study of the Standardized Field Sobriety Test (SFST) Battery. United States. National Highway Traffic Safety Administration; 1997.

    [6] Stuster J, Burns M. Validation of the Standardized Field Sobriety Test Battery at BACs Below 0.10 Percent. United States. National Highway Traffic Safety Administration; 1998.

    [7] NHTSA. SFST DWI Detection and Standardized Field Sobriety Test (SFST) Participant and Instructor Manuals. NHTSA; 2025.

    [8] DRE Technical Advisory Panel Mid-Year Meeting Minutes March 27, 2018



  • 19 Nov 2025 10:38 AM | ​Jay Gehlhausen

    IAFTC Newsletter. Volume 1. Issue 1. November 19, 2025.

    Jay M. Gehlhausen, Ph.D., DABFT-FD¹

    ¹Forensic Toxicologist and Expert Witness, JG Tox LLC, Apex, NC 27539


    This is an open-access article under the CC BY-NC-ND license.

    Download PDF.

    Abstract

    Kratom (Mitragyna speciosa) is a Southeast Asian botanical product that has gained increasing prominence in forensic toxicology casework throughout North America. The leaves of this tropical tree contain numerous indole alkaloids, most notably mitragynine and 7-hydroxymitragynine, which exhibit complex pharmacological activity at opioid and adrenergic receptors. At low doses, kratom produces stimulant effects, while higher doses result in sedation and euphoria similar to opioid intoxication. The legal status of kratom remains controversial; while it is classified by the U.S. Drug Enforcement Administration as a Drug and Chemical of Concern, federal scheduling efforts have stalled, leading to a patchwork of state-level regulations. Forensic laboratories have developed robust LC-MS/MS methods for detecting kratom alkaloids and their metabolites in biological specimens, though therapeutic and toxic concentration ranges remain poorly defined. Published case reports document mitragynine blood concentrations ranging from 10-970 ng/mL in driving under the influence investigations and 10-4,310 ng/mL in fatal cases, though polydrug use is common. With over 1,200 adverse events reported to the FDA between 2008 and 2024, including 637 fatalities, and increasing prevalence in impaired driving cases, forensic toxicologists require familiarity with kratom's chemistry, pharmacology, and analytical detection. This review synthesizes current knowledge regarding kratom's chemical composition, metabolism, toxicological effects, and legal status to assist toxicologists and legal professionals in evaluating kratom-related cases.

    Introduction

    Mitragyna speciosa, or kratom, is a tropical tree native to Southeast Asia used historically as a natural stimulant and analgesic. Local residents in the region also refer to the tree as thang, kakuam, ketum, or biak. Early purveyors of Mitragyna speciosa would chew or smoke the leaves as a respite from demanding physical labor. Cultural acceptance developed over time in Thailand and Malaysia, and kratom would later become a global commodity. The chemical composition is not fully characterized, but fifty-four known alkaloids have been identified, two of which, mitragynine and 7-hydroxymitragynine, exhibit significant neurological activity (DEA, 2024). Plant varieties differ in composition and potency based on regional soil and climatic conditions; Thai kratom, for example, is the most potent due to a more favorable climate. Another variety of kratom found in Malaysia, referred to as ketum, has a lower prevalence of the psychoactive drug, mitragynine.

    In recent years, products manufactured with concentrated levels of mitragynine have been marketed as health and medicinal products. In the United States, for example, the plant leaves are sold as a powder available through internet sites and herbal shops. Newer formulations, such as brewed tea or concentrated drinks, are also prepared from the crushed leaves. Kratom is available from numerous vendors with names like Kona and Star (Kratom, 2025). Some of these products are highly potent, causing state governments to take notice following anecdotal reports of life-threatening intoxication and dependency.

    The Legal Status of Kratom

    Despite evidence of misuse and opiate-like pharmacology, the legal status of kratom remains in limbo. In September of 2016, the Drug Enforcement Administration (DEA) announced plans to classify mitragynine and 7-hydroxymitragynine as Schedule I narcotics using the administration’s emergency classification authority (Erickson, 2016). The pushback from Congress came immediately. With opposition coming from fifty-one members of the House of Representatives, the DEA was forced to reconsider the kratom ban. The Drug Enforcement Administration remains skeptical of the drug’s efficacy and considers kratom a Drug and Chemical of Concern (DEA, 2024). Advocates and several research centers, on the other hand, have pointed to evidence of relief from anxiety, management of opioid dependence, and limited abuse potential as justification for legal status. Even so, only limited legislative progress has been made at the federal level.

    In the absence of federal regulation, fifteen states have addressed concerns from the legal and medical communities by passing their own laws. Alabama, Arkansas, Indiana, Rhode Island, Vermont, and Wisconsin have banned mitragynine and 7-hydroxymitragynine-containing products. Other states have attempted to limit abuse by enacting age requirements. Tennessee has restrictions on the synthetic products but maintains legal status for the plant material (CRS, 2023). Kratom laws vary significantly across the fifty states. Issues relating to impairment and Driving While Impaired (DWI) can only be managed on a case-by-case basis. The public debate over the efficacy of kratom will continue because few controlled studies have been performed to understand the acute and long-term effects of kratom on human health. The actual risk of psychological and physical addiction has not been elucidated through scientific study.

    The Chemistry of Kratom

    The psychoactive constituents of kratom are characterized as alkaloids. This broad class of naturally occurring organic compounds, which includes caffeine and nicotine, contains at least one nitrogen atom and exhibits weakly basic chemistry. Other alkaloids, such as theobromine and theophylline, which are closely related to caffeine, are amphoteric. Alkaloids dissolve poorly in water but readily dissolve in organic solvents such as diethyl ether, an important consideration in the preparation of concentrated drinks. Alkaloids can also form salts that are freely soluble in water and ethanol. The alkaloids in kratom are classified as indole alkaloids: they contain the indole structural moiety and are structurally related to the pentacyclic indole alkaloids yohimbine and voacangine (Basiliere, 2020).

    Indole is an organic compound classified as an aromatic heterocycle with a bicyclic structure, consisting of a six-membered benzene ring fused to a five-membered pyrrole ring. Indoles are widely distributed in nature, most notably as the amino acid tryptophan and the neurotransmitter serotonin. There are more than 4,100 known indole alkaloids, which often exhibit significant physiological activity. The indole structure is the backbone of the kratom alkaloids. Mitragynine is the most abundant active ingredient. In one study, extracts of the tree leaves contained 66% and 12% mitragynine by weight for the Thai and Malaysian varieties, respectively (Karunakaran, 2022). The plant contains fifty additional alkaloids that are present at much lower concentrations and have not been fully investigated (Karunakaran, 2022).

    Pharmacology of Mitragynine and 7-hydroxymitragynine

    At low dose levels, mitragynine exhibits mild stimulant effects, but as the dosage increases, an individual can experience sedation and euphoria similar to opioid use; this duality relates to concurrent activity at α-adrenergic and opioid receptors. Notably, mitragynine has a long half-life in blood, estimated at 23 hours, which increases the risk of drug toxicity during binge use (Trakulsrichai, 2015). Doses of 2 to 10 grams of leaf material are more typical of the casual user. Physiological effects begin within 5 to 10 minutes after ingestion and last for 2 to 5 hours. Pharmacological investigations demonstrate that mitragynine and 7-hydroxymitragynine have µ-opioid receptor agonist activity, but mixed δ- and κ-opioid activity has also been observed.
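
    The accumulation concern can be sketched with simple arithmetic. The example assumes one-compartment, first-order elimination, instantaneous absorption, and equal doses at fixed intervals; these are simplifying assumptions of the sketch, not findings of the cited study, and the function names are hypothetical. Only the 23-hour half-life comes from the text above.

```python
HALF_LIFE_H = 23.0  # estimated blood half-life cited in the text (Trakulsrichai, 2015)


def fraction_remaining(hours, half_life=HALF_LIFE_H):
    """Fraction of a single dose still present after `hours` (first-order decay)."""
    return 0.5 ** (hours / half_life)


def relative_amount_after_doses(n_doses, interval_h, half_life=HALF_LIFE_H):
    """Amount in the body right after the n-th dose, in units of one dose.

    Assumes equal doses, instantaneous absorption, and first-order elimination.
    """
    return sum(fraction_remaining(k * interval_h, half_life) for k in range(n_doses))


if __name__ == "__main__":
    # Hypothetical binge pattern: repeated doses spaced 6 hours apart.
    for n in (1, 2, 4, 8):
        amount = relative_amount_after_doses(n, interval_h=6.0)
        print(f"after dose {n}: {amount:.2f} x a single dose")
```

    Because the assumed dosing interval is much shorter than the half-life, each new dose lands on top of most of the previous ones, which is the mechanism behind the elevated toxicity risk described for binge use. Actual mitragynine kinetics in casework may differ substantially from this simplified model.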

    The major kratom alkaloids mitragynine, paynantheine, speciogynine, and speciociliatine, in addition to several metabolites, have been detected in the urine of rats and humans following ingestion of kratom. Few studies describe the metabolic pathways of mitragynine, although there has recently been renewed interest in this area of research. An early study by Zarembo et al. reported that oxidation and hydroxylation were the primary metabolic routes. Mitragynine is known to undergo hepatic metabolism (Zarembo, 1974). Philipp et al. conducted the first comprehensive in vivo study: rats were administered a single 40 mg/kg dose of mitragynine by gastric intubation (Philipp, 2009). The authors reported that phase I metabolism involved hydrolysis and demethylation, followed by oxidative and reductive transformations to produce carboxylic acid and alcohol derivatives. Additionally, mitragynine undergoes extensive phase II metabolism, producing both glucuronide and sulfate conjugates (Basiliere, 2020).

    Kratom Toxicology

    There is no in-depth understanding of kratom toxicology or how the drug influences the central nervous system (CNS). There have been recent human and animal studies involving kratom alkaloids, but at present, no authors have proposed therapeutic and toxic concentration ranges (Maxwell, 2020). A review of the kratom literature has found blood concentrations of 10–970 ng/mL and 10–4,310 ng/mL for DUID and death investigation cases, respectively (Society of Forensic Toxicologists, 2020). However, these wide ranges preclude the development of a practical safety scale. Similarly, there are no comprehensive studies on the interaction of mitragynine with other drugs, despite postmortem case reports of mixed-drug fatalities involving kratom (McIntyre, 2015).

    Acute side effects observed during emergency room presentations include nausea, itching, sweating, dry mouth, constipation, and loss of appetite. More severe toxic effects, like psychosis and hallucinations, have also been reported. As kratom products have become readily available in North America, reports of adverse medical events and emergency room visits have also increased. The American Association of Poison Control Centers’ 2022 report indicates that kratom accounted for 1,278 case mentions, 794 single exposures, and 586 cases that involved treatment in a healthcare facility (Gummin, 2023). According to data from the Food and Drug Administration’s adverse event reporting system, mitragynine was identified in 1,255 cases from 2008 to 2024. Of these cases, 1,171 were classified as serious, and 637 reports involved fatalities (DEA, 2025).

    The analysis of biological specimens for mitragynine and other kratom alkaloids has become routine since the development of robust LC-MS/MS methods. The first reported analytical method for the quantitation of mitragynine was a high-performance liquid chromatography–ultraviolet detection (HPLC–UV) method measuring mitragynine in the serum of dosed rats, with a limit of quantitation (LOQ) of 100 ng/mL (Janchawee, 2007). More recently, a study by Le et al. reported a quantitative liquid chromatography–tandem mass spectrometry (LC-MS/MS) procedure for the identification of mitragynine and other kratom alkaloids in human urine, including the metabolites 5-desmethylmitragynine and 17-desmethyldihydromitragynine (Le, 2012).

    Conclusion

    Kratom, with its wide array of commercial products, challenges a simple characterization as a drug-of-abuse or natural medicine. Kratom has vocal advocates and detractors. The active constituents, mitragynine and 7-hydroxymitragynine, have demonstrated complex pharmacology and myriad psychophysical effects. To date, there is no clear understanding of the toxicology, and claims of medical efficacy are unproven. Despite political support for legal status at the federal level, state governments have moved forward with regulations restricting the sale of kratom. Scientific research on kratom predominantly supports the Drug Enforcement Administration’s classification of kratom as a Drug and Chemical of Concern. The intent of this article was to inform toxicologists and attorneys about the properties of kratom since it is highly likely that cases of overdose and motor vehicle accidents will increase in the coming years.

    Declaration of competing interest

    The author serves as an expert witness in forensic toxicology cases and receives compensation for testimony and consultation services.

    AI Disclaimer 

    Artificial intelligence tools were used to assist in the preparation of this manuscript, including reviewing, editing, and formatting. All scientific content has been verified by the author, who takes full responsibility for the accuracy and integrity of the work.

    References

    CRS. (2023). Kratom Regulation: Federal Status and State Approaches. Retrieved from Congress.gov: https://crsreports.congress.gov/

    Le, David, et al. (2012). Analysis of Mitragynine and Metabolites in Human Urine for Detecting the Use of the Psychoactive Plant Kratom. Journal of Analytical Toxicology, 36, 616–625.

    DEA. (2024). Kratom. Retrieved from Drug Fact Sheet: https://www.dea.gov/factsheets/kratom/

    DEA. (2025). Kratom. Retrieved from DEA: https://deadiversion.usdoj.gov/drug_chem_info/kratom.pdf/

    Gummin, D. D., Mowry, J. B., Beuhler, M. C., Spyker, D. A., Rivers, L. J., Feldman, R., … DesLauriers, C. (2023). 2022 Annual Report of the National Poison Data System® (NPDS) from America’s Poison Centers®: 40th Annual Report. Clinical Toxicology, 61(10), 717–939. https://doi.org/10.1080/15563650.2023.2268981 

    Maxwell, Elizabeth A., et al. (2020). Pharmacokinetics and Safety of Mitragynine in Beagle Dogs. Planta Med., 86(17), 1278–1285.

    Erickson, B. (2016). Congress pushes to delay kratom ban. C&EN, October 10, 2016.

    McIntyre, Iain M., et al. (2015). Mitragynine ‘Kratom’ Related Fatality: A Case Report with Postmortem Concentrations. Journal of Analytical Toxicology, 39, 152–155.

    Zarembo, John E., et al. (1974). Metabolites of Mitragynine. Journal of Pharmaceutical Sciences, 63, 1407-1415.

    Kratom. (2025). Top 13 Ultimate Kratom Vendors Online – Verified Reviews (2025). Retrieved from Kratom.org: https://kratom.org/vendors/

    Philipp, A. A., et al. (2009). Studies on the metabolism of mitragynine, the main alkaloid of the herbal drug Kratom, in rat and human urine using liquid chromatography-linear ion trap mass spectrometry. J Mass Spectrom., 44(8), 1249–1261.

    Society of Forensic Toxicologists. (2020). Retrieved from Short Communication for the Analysis of Mitragynine: https://www.soft-tox.org/assets/NPSLiterature/mitragynine.pdf 

    Basiliere, Stephanie, et al. (2020). CYP450-Mediated Metabolism of Mitragynine. Journal of Analytical Toxicology, 44, 301–313.

    Karunakaran, Thiruventhan, et al. (2022). The Chemical and Pharmacological Properties of Mitragynine and Its Diastereomers: An Insight Review. Frontiers in Pharmacology, vol. 13.

    Trakulsrichai, Satariya, et al. (2015). Pharmacokinetics of mitragynine in man. Drug Design, Development and Therapy, 9, 2421–2429.

    Janchawee, B., et al. (2007). A high-performance liquid chromatographic method for determination of mitragynine in serum and its application to a pharmacokinetic study in rats. Biomedical Chromatography, 21, 176–183.





