A question frequently asked on the listservs that I belong to is basically this:

In GC-FID used to quantify Blood Alcohol Content, where EtOH is the target analyte, how does the machine arrive at the reported number?

The machine is called a Gas Chro­mato­graph with a Flame Ion­iza­tion Detec­tor (GC-FID). Typ­i­cally the sam­ple is intro­duced by way of head­space. The most fre­quently used tech­nique is tech­ni­cally called Sta­tic Head­space Isother­mal Wall Coated Open Tubu­lar Gas Chro­matog­ra­phy with Flame Ion­iza­tion Detector.

First a crash course on the Flame Ion­iza­tion Detec­tor (FID).

The FID is the part of the appa­ra­tus that quan­ti­tates the result of the chro­mato­graphic efflu­ent. Remem­ber that the num­ber one rule of GC-FID is that you must demon­strate with data (actu­ally prove) proper res­o­lu­tion first. The mea­sure of qual­i­ta­tive selec­tiv­ity (sep­a­ra­tion) must be proven before we can validly quantitate.

The FID is a destructive, mass-counting device. It literally burns everything that comes off the column, breaking the carbon-hydrogen bonds (C-H bonds) and producing ions in the flame. A polarizing voltage attracts these ions to a collector located near the flame, so the detector in effect counts the increase in the number of ions between a cathode and an anode. The resulting current is sensed by an electrometer (an electrical instrument for measuring electric charge or electrical potential difference; here, a high-impedance picoammeter), converted to digital form and sent to an output device that gives us the peak. The response is proportional to the number of C-H bonds. Therefore, identical amounts of methanol, ethanol, butanol and hexanol would not give equal area count responses. The FID itself emits an analog (continuous) signal; yet it is reported through a computer system that is digital in nature.
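To make that rule of thumb concrete, here is a minimal Python sketch that takes the C-H bond claim at face value. The function names are hypothetical, the comparison is per mole, and real detectors deviate from this simple count, so treat it as an illustration only.

```python
# Simplified model taken from the text: FID response scales with the
# number of C-H bonds. Real detectors deviate from this simple count.

def ch_bonds_primary_alcohol(n_carbons: int) -> int:
    """C-H bonds in a straight-chain alcohol CnH(2n+1)OH (the O-H bond is not counted)."""
    return 2 * n_carbons + 1

# Number of carbon atoms in each alcohol named in the text.
ALCOHOLS = {"methanol": 1, "ethanol": 2, "butanol": 4, "hexanol": 6}

def relative_response(name: str, reference: str = "ethanol") -> float:
    """Per-mole response relative to the reference compound, under the simplified model."""
    return (ch_bonds_primary_alcohol(ALCOHOLS[name])
            / ch_bonds_primary_alcohol(ALCOHOLS[reference]))

for name in ALCOHOLS:
    print(f"{name}: {ch_bonds_primary_alcohol(ALCOHOLS[name])} C-H bonds, "
          f"{relative_response(name):.2f}x the ethanol response per mole")
```

The point is only that equal amounts of different compounds do not give equal area counts, which is why each analyte needs its own calibration.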

There are two main methods of quantitation when it comes to GC-FID: (1) peak height, and (2) peak area.

Peak Height versus Peak Area


Peak height is an antiquated method of quantification that is not used a lot. It harkens back to a time when there were no computer programs. LabCorp and certain labs in California still use the peak height method. Both methods are based upon the standard dose response calibration curve.

In order to deter­mine either peak height or peak area, the cru­cial bound­ary is the baseline.

The baseline and its determination is a crucial bound


It is essen­tial to remem­ber that all detec­tors have ana­lyt­i­cal noise. There is no way of record­ing zero. No machine can mea­sure true zero. There is always going to be an off­set. As an aside, this is why on a cal­i­bra­tion curve zero (the ori­gin) can­not be used as a legit­i­mate data point in deter­min­ing the slope which gives us the quantitation.
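As a rough illustration of that aside, here is a minimal Python sketch of a calibration line fitted by ordinary least squares to the calibrators only, with no point at the origin. The calibrator concentrations, the area counts and the function names are all hypothetical.

```python
def fit_line(concentrations, responses):
    """Ordinary least-squares line: response = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(responses) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, responses))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def concentration_from_response(response, slope, intercept):
    """Read an unknown off the calibration line."""
    return (response - intercept) / slope

# Hypothetical calibrators (g/100 mL) and area counts; the origin is NOT included.
calibrator_conc = [0.050, 0.100, 0.200, 0.400]
calibrator_area = [5120.0, 10090.0, 20150.0, 40110.0]

slope, intercept = fit_line(calibrator_conc, calibrator_area)
unknown = concentration_from_response(9000.0, slope, intercept)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, unknown={unknown:.4f} g/100 mL")
```

The intercept is fitted from the data rather than forced to zero, which is how the offset in the detector shows up in the curve.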

While there is a peak, there is, of course, signal. As a result, during the act of detection there is no baseline per se. Instead, the computer uses an algorithm based upon prior input before the peak and “guesses” what the baseline would be if there were no signal. As this is the computer’s guess, it is subject to interpretation and error.

When we use the peak area method of quantitation, the question becomes not only where the baseline is, but also where the computer places the beginning and ending of a peak. This is so very important.

As you can see in the above, there are vertical hash marks that are labeled “start” and “stop.” These are called “tick marks.” They are the other boundary that determines peak area.

So, in sum, the determination of the peak area is a function of 3 bounds: (1) the baseline, (2) the starting tick mark, and (3) the ending tick mark. Think of them as boundary marks.
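The three bounds can be sketched in a few lines of Python. This is a simplified illustration using trapezoidal integration over a flat baseline, not any vendor's integration algorithm; the trace values and the function name are hypothetical.

```python
def peak_area(times, signal, start, stop, baseline):
    """Trapezoidal area of (signal - baseline) between the start and stop tick marks."""
    points = [(t, s - baseline) for t, s in zip(times, signal) if start <= t <= stop]
    area = 0.0
    for (t0, h0), (t1, h1) in zip(points, points[1:]):
        area += 0.5 * (h0 + h1) * (t1 - t0)
    return area

# Hypothetical detector trace: a small triangular peak on a flat signal of 1.0.
times = [0, 1, 2, 3, 4, 5, 6]
signal = [1.0, 1.0, 2.0, 3.0, 2.0, 1.0, 1.0]

reference = peak_area(times, signal, start=1, stop=5, baseline=1.0)
lower_baseline = peak_area(times, signal, start=1, stop=5, baseline=0.5)
wider_ticks = peak_area(times, signal, start=0, stop=6, baseline=0.5)
print(reference, lower_baseline, wider_ticks)
```

With the same trace, dropping the baseline or widening the tick marks yields a larger area, so the reported number depends on where those three boundary marks are placed.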

There are two types of inte­gra­tion: (1) auto-integration and (2) man­ual integration.

If the button pusher uses auto-integration, then the baseline will be calculated the same way throughout the run. This is what should be done (provided that the use of the auto-integration events is static, applies equally to the calibrators and the unknowns that are tested, and is part of a truly validated method).

On the other hand, one can use manual integration events to manipulate the data. This is a function of the end user. This is typically a one-time event in that only one chromatogram is manipulated. It may be something that can be discovered by looking at the chromatogram, or it may be utterly undiscoverable unless you have the raw data. (Here is a video that shows this concept: “The case for raw data: ‘Integration’ in Gas Chromatography: How to make an innocent person guilty in a DUI case by manipulating the software.” Actually, at Axion we discovered another 3 ways of manipulating the raw data so it would be undetectable.)

The key point of all of this is that everything must remain the same in order for the result to be valid. The integration events must be static in order for the quantification to be valid.

When look­ing at the EtOH peak:

The unknown that is your client’s sample must use the same integration parameters as the calibrators. The boundary marks used to determine the peak area for the calibrators in their individual chromatograms must match those of the unknown. If the baseline remains the same, but the tick marks are wider for the calculation of the unknown than for the calibrators, then this will over-report the Blood Alcohol Content (BAC). The converse is true as well. If the tick marks for the calibrator and the unknown remain the same, but the baseline for the unknown is lower than for the calibrator, it will over-report the BAC. The converse is true as well.

Auto-integration versus manual integration


Above is an example of how the auto-integration leads to less area and how the manual integration leads to more area. More area equates to a higher BAC if this is performed on the EtOH peak.

What further complicates this is the use of the Internal Standard (ISTD). The integration events for the ISTD must also remain static. A peak is a peak is a peak. The same boundary marks apply. However, the ISTD amount is inversely related to the EtOH result. If too much ISTD is calculated, then the BAC will be reduced in proportion, because the result is based on the ratio of the two. The converse is true as well.
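Here is a minimal Python sketch of that inverse relationship, assuming the result is read from a calibration line built on the EtOH/ISTD area ratio. The slope, intercept, area counts and function name are all hypothetical.

```python
def bac_from_areas(etoh_area, istd_area, slope, intercept):
    """Convert the EtOH/ISTD area ratio to a concentration using a ratio calibration line."""
    ratio = etoh_area / istd_area
    return (ratio - intercept) / slope

# Hypothetical ratio calibration: area ratio = 10.0 * concentration (g/100 mL) + 0.01
SLOPE = 10.0
INTERCEPT = 0.01

as_integrated = bac_from_areas(8000.0, 10000.0, SLOPE, INTERCEPT)
inflated_istd = bac_from_areas(8000.0, 11000.0, SLOPE, INTERCEPT)  # more ISTD area
inflated_etoh = bac_from_areas(8800.0, 10000.0, SLOPE, INTERCEPT)  # more EtOH area

print(f"as integrated: {as_integrated:.4f}")
print(f"ISTD area inflated: {inflated_istd:.4f}")
print(f"EtOH area inflated: {inflated_etoh:.4f}")
```

Inflating the ISTD area pulls the reported number down, and inflating the EtOH area pushes it up, which is why the integration events for both peaks have to stay static.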

How can you tell if any of this is done? Well, maybe you can and maybe you can’t as dis­cussed above and in the video. If you trust the lab­o­ra­tory, and if they report the base­line and the tick marks on all of the chro­matograms, you can sim­ply use a light­box or even hold it up to a light with the cal­i­bra­tors and the unknowns on top of one another and see if the bound­ary marks are all treated the same.

Bot­tom line take away: Con­sis­tent and appro­pri­ate inte­gra­tions are sci­en­tif­i­cally defen­si­ble. Incon­sis­tent or inap­pro­pri­ate inte­gra­tions are dif­fi­cult to defend. All QC, Ver­i­fiers and Unknowns must be treated in a con­sis­tent (same) man­ner and per a val­i­dated method. Oth­er­wise, you have a non-validated result.

The take away question: What does your laboratory do when it reports out its BAC?

More on metrology in the courtroom

A good colleague and friend of mine, Ted Vosk, introduced me to the world of metrology in the courtroom in 2008. He has introduced a lot of us to it. Since then I have learned more about metrology, statistics and validation, having taken a lot of courses and having read a lot of materials on the subjects. To me it is fascinating. But like most things of science and like most things that are interesting, they can become too complex and intimidating to folks who have little or no exposure.

Metrol­ogy has been a fre­quent topic on this blog and also our sis­ter blog www.PADUIBlog.com

The following is a really super over-simplified explanation that I like to use to illustrate the very basic principles of metrology and to answer the question of why it matters. The practical question is how we quickly and meaningfully explain this so that octogenarian judges who have no interest, and simple folks on juries who are worrying about picking Suzie Peesherpants up from daycare, understand it all. I offer this explanation in that context alone, not in a rigorous scientific context.

I am a Bayesian and not a frequentist.

A fre­quen­tist believes (in a nut­shell) that repeated test­ing will lead to an accept­able expres­sion of uncer­tainty of the mea­sure­ment. This is really prob­lem­atic in that this approach ignores cer­tain types of error.

A Bayesian believes in the use of mathematical tools like the Propagation of Errors and the Monte Carlo method to arrive at a value expressed as a range with a confidence/predictive interval (e.g., Tube 1 has a BAC of 0.091 g/100 mL +/- 0.005 g/100 mL to 2 standard deviations or 95% confidence), based upon identifying and evaluating the different parts and components of error that comprise the expanded uncertainty budget.
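A minimal Python sketch of what such a budget can look like follows. It assumes the components are independent standard uncertainties combined by root-sum-of-squares (first-order propagation of errors) and expanded with a coverage factor of k = 2. The component names and sizes are invented for illustration, and the units assume the conventional g/100 mL.

```python
import math

def combined_standard_uncertainty(components):
    """Root-sum-of-squares of independent standard uncertainties."""
    return math.sqrt(sum(u ** 2 for u in components.values()))

def expanded_uncertainty(components, k=2.0):
    """Expanded uncertainty: coverage factor k times the combined standard uncertainty."""
    return k * combined_standard_uncertainty(components)

# Hypothetical budget, all in g/100 mL; names and sizes are illustrative only.
budget = {
    "calibrator_reference_value": 0.0015,
    "repeatability": 0.0015,
    "pipetting_and_dilution": 0.0010,
    "calibration_curve_fit": 0.0010,
}

U = expanded_uncertainty(budget, k=2.0)
print(f"0.091 g/100 mL +/- {U:.3f} g/100 mL (k=2, approximately 95%)")
```

A Monte Carlo approach would instead draw each component at random many times and look at the spread of the simulated results; the sketch above covers only the propagation route.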

This is a debate in metrol­ogy with the bet­ter part of the evi­dence in favor of the Bayesians, in my opinion.

What everyone in metrology can agree on is that the simple expressions we typically see in a courtroom, where a measurement is expressed as an absolute, are wholly and totally wrong. Equally wrong is when the defense bar pushes for the laboratory that is doing the testing to express some sort of uncertainty in its measurement, and that laboratory simply expresses a “stated” (as opposed to proven) error. Both are unscientific.

What the hell did I just write?

Let me make it too simple.

So simple that metrologists will likely respond that it is ridiculous. However, I suggest that we have to understand it in a simplistic way to get the proverbial foothold on the large mountain that is Bayesian-based metrology first before we can all stand on the top like my great friend and colleague Ted.

First, I sug­gest that we under­stand and agree on the def­i­n­i­tions of accu­racy (bias) and pre­ci­sion (calibration).

calibration and bias

Ran­dom error:

The pre­ci­sion of a mea­sure­ment is how close a num­ber of mea­sure­ments of the same quan­tity agree with each other. The pre­ci­sion is largely lim­ited by the ran­dom errors. It may usu­ally be deter­mined by repeat­ing the measurements.

Think about some­thing that is more or less inher­ently unpredictable.

An exam­ple would be: fluc­tu­a­tions in the air pres­sure when we go to weigh some­thing on a 7 point bal­ance (scale).

Systematic error:

The accu­racy of a mea­sure­ment is how close the mea­sure­ment is to the true value of the quan­tity being mea­sured. The accu­racy of mea­sure­ments is often reduced by sys­tem­atic errors, which under cer­tain cir­cum­stances are dif­fi­cult to detect even for expe­ri­enced research workers.

Think about some­thing that is iden­ti­fi­able and correctable.

An exam­ple would be: the cal­i­bra­tors are all run­ning high. We can iden­tify it and cor­rect for it (shift the cal­i­bra­tion curve to cor­rect for the bias).

It is this combination of systematic error (correctable) and random error (knowable, but inherently difficult to predict and correct for) that gets us to a properly metrologically responsible expression such as Tube 1 has a BAC of 0.091 g/100 mL +/- 0.005 g/100 mL to 2 standard deviations or 95% confidence.

Source: http://www.physics.umd.edu/courses/Phys276/Hill/Information/Notes/ErrorAnalysis.html
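The two kinds of error described above can be sketched in Python. This is a simplified illustration, assuming we have replicate measurements of a reference material with a known certified value; the numbers and the function name are hypothetical.

```python
import statistics

def bias_and_precision(replicates, certified_value):
    """Return (bias, standard deviation) for repeated measurements of a known reference."""
    bias = statistics.mean(replicates) - certified_value   # systematic error
    spread = statistics.stdev(replicates)                  # random error
    return bias, spread

# Hypothetical replicate results for a reference material certified at 0.100 g/100 mL.
replicates = [0.103, 0.104, 0.102, 0.103, 0.103]
bias, spread = bias_and_precision(replicates, certified_value=0.100)

# Systematic error is identifiable, so a later result can be corrected for it.
measured_unknown = 0.094
corrected_unknown = measured_unknown - bias

print(f"bias={bias:+.4f}, spread={spread:.4f}, corrected={corrected_unknown:.4f}")
```

The bias can be subtracted out once it is identified; the spread cannot, and it has to be carried along as part of the stated uncertainty.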

 Why is all of this sci­en­tif­i­cally important?

A measurement is only relevant and useful if we can assess the risk of being wrong. Each measurement is a unique event that will never be exactly repeated again. We always have the risk of being wrong in expressing it.

An exam­ple

Suppose that you and I are friends, but you have never met me in person and instead have just “met” me over this blog. You are at a concert where my favorite band is playing, but I could not go because I am stuck here in Harrisburg. You, being a good friend, want to buy me a t-shirt at the concert. So, you want to know how tall I am so you can buy the right t-shirt. You ask me.

If I were to tell you that I took a measurement of my height with a measuring stick and I was 7’2”, an immediate mental image comes to your mind: He’s very tall. I need a very big shirt. Those of you who have met me and have seen me know this to be totally false (I am about 5’5”). But if you had never seen me before, so that my true height was unknown to you, you would make certain decisions based upon the information I gave you (I am 7’2”). You would buy the wrong shirt.

How­ever, if I were to tell you all of the proper infor­ma­tion of that mea­sure­ment that I did for my height that resulted in my con­clu­sion as stated to you that I am 7’2″, it would be fully expressed as fol­lows: I mea­sured myself with a result of 7’2” using a mea­sur­ing stick that was +/- 2 feet with a 99% con­fi­dence level.

If I told you this full expres­sion (I mea­sured myself with a result of 7’2” using a mea­sur­ing stick that was +/- 2 feet with a 99% con­fi­dence level), you would have no idea of my height. That mea­sure­ment that was sim­ply expressed as 7’2” is mean­ing­less for the deci­sion you have to make. You wouldn’t know what T-shirt to buy and would get me some other trinket.

As we can see, the measurement is totally dependent upon the measuring instrument (the measuring stick), whose uncertainty is largely comprised of systematic error and, to a degree, random error (e.g., how much I slouch versus standing straight up).

You need all of the infor­ma­tion to buy the right T-shirt.

 
Lions and Tigers and Bears... Verifiers, Calibrators and Controls... Oh my!

A criminal defense attorney can at times feel like Dorothy in the Wizard of Oz, in that we are transported from the relative safety of home (the courtroom) to the weird world of Oz (the laboratory). There are unusual and oftentimes conflicting phrases and words that seem to defy common sense. Sometimes, words and phrases are used interchangeably and with little apparent distinction. In this post we will examine and deconvolute some of these important terms.

Quality Control- We covered the concept of Quality Control (QC) before. QC is, strictly speaking, a process that is used to construct the calibration curve against which our knowns, and then our unknowns, are tested. This is typically performed in the beginning of the run.

Standards/Controls- A standard or control is a reference solution or test solution used for assessment of the performance of an analytical procedure. A rigorously tested and high quality known analyte at a certain concentration is known as a Certified Reference Material (CRM). CRMs are generally governed by ISO Guide 34:2009. NIST makes its own service marked brand of CRMs named Standard Reference Materials (SRMs). CRMs and SRMs should have statements of calibration (precision) and bias (accuracy).

Calibrator- Often it is in the beginning of a run. What makes a standard or control a calibrator is that it is placed before the unknowns are tested. The calibrators are run in a series over the hoped-for linear dynamic range. The response from the testing of the calibrators is plotted on a graph of signal (y-axis) versus concentration (x-axis). A line is drawn along the data points, with the R² calculated to determine if the response is linear. Calibrators are used to construct the calibration curve. It is the QC of a quantitative process. A calibrator is a solution, hopefully from a traceable source and hopefully a CRM/SRM, with a known amount (concentration) of the analyte of interest that is hopefully pure and only contains that analyte of interest. It is placed within the batch of the run as part of the QC procedures to ensure that the analytical instrument is detecting the known within an established, stated and oftentimes arbitrary range of values.

Verifier- Often it is in the middle of a run or at the end of the run. What makes a standard or control a verifier is that it is placed among the unknowns or tested after them. It too contains a known analyte at a known concentration. It is placed within or at the end of the run to ensure that the analytical instrument is detecting the known within an established tolerance throughout the testing of unknowns. Think of it as a check. If the testing method has a scheme where the laboratory places a verifier amongst the unknowns tested, then this is not a function of quality control, but rather an act of verification.
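A verifier check of this kind can be sketched in a few lines of Python. The tolerance, the concentrations and the function name are hypothetical; the point is only that the check reports pass or fail and leaves the calibration curve as it was.

```python
def check_verifier(measured, known, tolerance):
    """Return (passed, bias). The calibration curve is left untouched either way."""
    bias = measured - known
    return abs(bias) <= tolerance, bias

# Hypothetical verifier: known 0.080 g/100 mL, acceptance window of +/- 0.005 (arbitrary).
passed, bias = check_verifier(measured=0.084, known=0.080, tolerance=0.005)
print(f"passed={passed}, bias={bias:+.3f} g/100 mL")
```

In this sketch the verifier passes even though it reads high, and nothing about the calibration is changed as a result.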

There is a big difference. QC data (where a series of calibrators are used over a range of concentrations) is used to establish a calibration curve. The data is inputted, the R² value is calculated, and then the data is adjusted to make it fit. Verification data simply tests one point on the calibration curve already established. That data is then evaluated by a human, not a machine, and no adjustment is made to correct for bias if the verifier result does not perfectly fit the result expected from the calibration curve. The calibration curve is not altered based upon this new data point. Testing continues if the verification data is within an arbitrary range. Again, even if the verification data shows appreciable bias, nothing is done about it.

Unfortunately, a lot of crime laboratories, in an effort to save money (which really is not that much), make their own in-house verifier solutions. I call this home brew or laboratory moon-shining. The difficulty with making it on your own is twofold: (1) you may make it wrong (impurities or imprecise or inaccurate concentration levels), and (2) it’s like the fox guarding the hen house in that you are trusting the laboratory to guard itself.

Home brew is not a good idea


You must under­stand the dif­fer­ence between cal­i­bra­tors and verifiers.

 

There is a large dif­fer­ence between a sin­gle col­umn analy­sis and a dual col­umn analy­sis when it comes to the abil­ity to most cor­rectly iden­tify and quan­ti­tate an unknown in the sci­en­tific world.

In foren­sic sci­ence, we are con­stantly test­ing unknowns. What is meant by this is that we have a sam­ple that is seized from a crime scene or from a per­son, but we don’t know what it con­tains. For exam­ple, in blood analy­sis for EtOH in an alleged DUI case, we have a sam­ple of blood that is taken from the accused, but just by look­ing at it, we can­not know if there is even ethanol in it, and even if present, how much there is. We need to ana­lyze it using instru­men­ta­tion in a sci­en­tific manner.

We can't tell if there is EtOH in this sample just by looking at it

When we are look­ing to be sci­en­tific about our analy­sis we are look­ing to be as spe­cific as pos­si­ble, and try­ing not to be merely selec­tive. There is a large and impor­tant sci­en­tific dif­fer­ence between being selec­tive and spe­cific. As we wrote before on this blog: Metrol­ogy in Quan­ta­tive Mea­sure: Is it Spe­cific or Selec­tive or Neither…

The Inter­na­tional Union of Pure and Applied Chem­istry (IUPAC), which is the world author­ity on chem­i­cal nomen­cla­ture, ter­mi­nol­ogy, stan­dard­ized meth­ods for mea­sure­ment, atomic weights and other crit­i­cally eval­u­ated data and oth­ers have defined the dif­fer­ence between these often con­fused terms as follows:

A spe­cific reac­tion or test is one that occurs only with the sub­stance of inter­est, while a selec­tive reac­tion or test is one that can occur with other sub­stances but exhibits a degree of pref­er­ence for the sub­stance of inter­est.  Few reac­tions are spe­cific, but many “exhibit selectivity”.

Other com­mon def­i­n­i­tions include:

Selec­tiv­ity gives an indi­ca­tion of how strongly the result is affected by other com­po­nents in the sample.

and also

Selec­tiv­ity refers to the extent to which the method can be used to deter­mine par­tic­u­lar ana­lytes in mix­tures or matri­ces with­out inter­fer­ences from other com­po­nents of sim­i­lar behavior.

A selective test may not be a specific test due to cross-reactivity, interference, or codetermination.

So, we search for the most specific form of analysis. In the world of DUI for EtOH, the government typically settles for Headspace Gas Chromatography with Flame Ionization Detector (HS-GC-FID). GC-FID is not the most specific test available for EtOH examination; Gas Chromatography with Mass Spectrometry (GC-MS), for example, is much more selective and borders on specific when it comes to EtOH analysis. But for whatever policy reason, the government chooses not to do the most scientific thing, which is to use the most specific assay available. There is no scientific reason not to test for EtOH on the most specific assay available. In fact, it could be legitimately argued that relying on GC-FID instead of GC-MS for EtOH determination and quantification is not scientific, as GC-MS exists and is readily available. However, that is a post for another day.

As we are seemingly inexplicably stuck with the scientific step-sister of analysis in GC-FID as opposed to GC-MS, we must look at ways that the government chooses to employ GC-FID to see whether or not as an assay it is valid. As our last series of posts “Method Validation for Lawyers” revealed, there is power in the words “valid” and “validity.” Without having a truly valid method that has been proven to be suitable for its intended purpose, we cannot have a valid result.

Some forensic laboratories choose to use a configuration in GC-FID that is known as a single column, single injection setup. In this setup there is one installed column and the analyst (or the autosampler) makes one injection to test the sample.

An installed single column GC-FID setup

With­out any sci­en­tific doubt, a sin­gle col­umn method of analy­sis is not foren­si­cally or sci­en­tif­i­cally defen­si­ble or acceptable.

Remember that when we use GC-FID, we can never achieve true specificity. The most we can hope for is the possibility of being merely selective, as demonstrated and proven through the resolution standard (separation matrix/standard mix). The qualitative result is based upon only one criterion, which is the retention time. Retention times through any given column are not unique to one specific volatile organic compound (VOC) to the exclusion of everything else in the universe. Hence, we have the often repeated phrase that all legitimate technically trained chromatographers know and can recite in their sleep: the limitation of GC-FID is that the retention time is merely characteristic of a compound and certainly not adjudicative or confirmatory of the specificity of that compound. In other words, a peak at a given retention time is not a unique qualitative measure (to the exclusion of every other compound in the universe).

Don’t just take my word for it; consider the following:

As ven­er­ated Pro­fes­sor Harold McNair, PhD writes in his book, Basic Gas Chro­matog­ra­phy,

Reten­tion times are char­ac­ter­is­tic of a GC sys­tem, but they are not unique, so GC reten­tion times can­not be used for qual­i­ta­tive confirmation.

He fur­ther writes:

Iden­ti­fi­ca­tion of an unknown by com­par­i­son to reten­tion times using stan­dards that forms the basis of the qual­i­ta­tive analy­sis [in GC-FID analysis].

He con­cludes:

Unfor­tu­nately, GC sys­tems can­not con­firm the iden­tity or struc­ture of any peak. Reten­tion times are related to par­ti­tion coef­fi­cients (Chap­ter 3); and while they are char­ac­ter­is­tic of a well-defined sys­tem, they are not unique.

How do we acknowl­edge this lim­i­ta­tion in the lack of speci­ficity and try to mit­i­gate it?

We can add a dif­fer­ent col­umn and ana­lyze the sam­ple con­cur­rently on both columns. As we learned in our post What is a Gas Chro­matog­ra­phy col­umn and why should I care?, the col­umn, if prop­erly selected and prop­erly installed, is what pri­mar­ily causes the sep­a­ra­tion of the var­i­ous VOCs to occur. What we do is select two dif­fer­ent columns with two dif­fer­ent sta­tion­ary phases. We attach a y-splitter that will take the sin­gle injec­tion made into the injec­tor port and divide the sam­ple into two dif­fer­ent path­ways with one part of the sam­ple going to one col­umn for analy­sis and the sec­ond part going to another col­umn for analysis.

A y-splitter splits the same sam­ple injec­tion and sends the parts to two dif­fer­ent columns for analysis

The strength of a well-designed dual col­umn analy­sis method where the sta­tion­ary phase is dif­fer­ent between the two columns is that this dif­fer­ence in the sta­tion­ary phase will cause dif­fer­ent sep­a­ra­tion of the ana­lytes (both in terms of reten­tion time and pos­si­bly even elut­ing order) as proven by the chro­matograms of the analy­sis of the res­o­lu­tion stan­dard (sep­a­ra­tion matrix/standard mix). This dif­fer­ent elut­ing order and dif­fer­ent reten­tion times only min­i­mizes, but does not entirely elim­i­nate the pos­si­bil­ity of co-elution as again, the resolv­ing (sep­a­rat­ing) power of the method is only deter­mined by one non-unique mea­sure which is the two reten­tion times. Even though there is a change in the elut­ing order poten­tially and the reten­tion times are dif­fer­ent based upon the sta­tion­ary phase com­po­si­tion, again, it must be empha­sized that the basic lim­i­ta­tion of GC-FID remains in that a reten­tion time is merely char­ac­ter­is­tic of the ana­lyte, but is cer­tainly not confirmatory.

A schematic of a dual column GC-FID. Note the change in the eluting order and retention time among the columns

This is why dual col­umn is referred to as the poor man’s Mass Spec as it has a more orthog­o­nal approach towards the qual­i­ta­tive mea­sure than does a sin­gle column.

How the cur­rent trend of the crime lab­o­ra­tory using a dual col­umn GC-FID is alarm­ingly unscientific.

What is most alarm­ing to me is the trend that is devel­op­ing across the United States where the sec­ond col­umn is not being used for quan­tifi­ca­tion at all. In this trend that I see sweep­ing all across the US, the sec­ond col­umn is merely being used as a “con­fir­ma­tory” col­umn in that if the reten­tion time matches with the standards/controls in the Qual­ity Con­trol sam­ples, then it is pre­sented as “ver­i­fied” in terms of the qual­i­ta­tive mea­sure by sim­ply that sec­ond col­umn match­ing reten­tion time with the knowns that act as the standards/controls. As we explain above, that is a dan­ger­ous and unsci­en­tific approach.

Installed dual column GC-FID


The reason that this is so important given the above context (there are other reasons that it is alarming, but I want to stick to this reference) is that without the second column giving a quantitative measure, we cannot fairly eliminate co-elution (where two compounds elute at the same time, but only get identified and quantitated as one compound). If the second column is also used to quantify, it serves as an indirect determination of whether or not there is co-elution. We would examine the precision of the quantitative results (how closely the numbers given by column A agree with those of column B: do the A’s match the B’s?). If there is co-elution that was “discovered” by the dual column approach, then we would expect to see imprecision between these numbers (the A’s don’t match the B’s).
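The “do the A’s match the B’s?” comparison can be sketched in Python. The 5% agreement criterion and the function name are assumptions made for illustration; a laboratory’s validated method would have to define its own acceptance limit.

```python
def columns_agree(result_a, result_b, max_relative_difference=0.05):
    """Compare quantitative results from column A and column B.

    Returns (agree, relative_difference), where the difference is taken
    relative to the mean of the two results.
    """
    mean = (result_a + result_b) / 2.0
    relative_difference = abs(result_a - result_b) / mean
    return relative_difference <= max_relative_difference, relative_difference

# Hypothetical results in g/100 mL. The 5% criterion is an assumption for illustration.
print(columns_agree(0.091, 0.092))   # close agreement between the two columns
print(columns_agree(0.091, 0.112))   # disagreement that would call for a closer look
```

A disagreement does not prove co-elution, but it is the kind of flag that is simply unavailable when the second column is never quantitated.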

Now that they have decided as an orga­ni­za­tion to not quan­ti­tate on the sec­ond col­umn, we have lost a vital part of qual­ity con­trol and the lab super­vi­sor has lost a pow­er­ful tool of qual­ity assur­ance. It is just bad science.

Further, there is no legitimate scientific reason for not quantitating on the second column. If you are doing good quality work and your methods and instruments are in control, then your numbers should agree.

It’s not a time thing.

As it is a sin­gle injec­tion y-splitter dual col­umn analy­sis any­way, the amount of time it takes to make a cal­i­bra­tion curve, eval­u­ate it and then incor­po­rate it into the soft­ware for one col­umn is vir­tu­ally the same amount of time to do the same on col­umn 2. There is vir­tu­ally no added time. It makes no sense from a sci­ence point-of-view.

So in con­clu­sion, we can fairly con­clude the following:

  • Sci­en­tif­i­cally, we always want to test our unknowns on the most spe­cific assay available.
  • GC-FID is not the most spe­cific assay available.
  • Lab­o­ra­to­ries that use sin­gle col­umn GC-FID as their method of analy­sis do not pro­duce foren­si­cally or sci­en­tif­i­cally defen­si­ble or accept­able results.
  • Laboratories that use dual column GC-FID but do not quantitate on both of their columns do not produce forensically or scientifically defensible or acceptable results.
  • Good sci­ence is not always prac­ticed in the mod­ern state crime laboratory.
 

One of my favorite movies is The Matrix. There is one scene that really jumps out at me: when Morpheus and Neo first meet, Morpheus shows Neo two pills, a blue one and a red one. If Neo chooses the blue pill, he will wake up in his bed and forget about everything that happened to him up to that point. He will blissfully continue on in the matrix, free of any awareness that his life and everyone’s life is but an illusion. If he takes the red one he will see “how far the rabbit hole goes.” Remember?

Do you want the Redpill or the Bluepill?


We are at that point, I suggest, with Evidential Breath Testing for Blood Alcohol Content in the United States.

I have blogged here before about the sci­en­tific farce that is Evi­den­tial Breath Test­ing as prac­ticed today in the United States:

Can some­one hon­estly answer why there is still Breath test­ing for EtOH in America?

Breath test­ing the­ory for ETOH is wrong and unscientific

Why do instru­ments need to be calibrated?

Now, I would like to build on our previous posts, and I hope that you are willing to take the red pill with me here to see that the idea that Evidential Breath Test machines are regularly calibrated in a true scientific sense is an illusion.

First, some definitions:

  • Cal­i­bra­tion is the imper­fect act of testing
  1. a series of known and adjudicated materials (Certified Reference Materials traceable to NIST, Sigma-Aldrich or acceptably made up from USP grade materials) at different concentrations (hopefully more than once), which are assumed to be true values within some proven and expressed Measurement Uncertainty,
  2. in your device under the exact environmental and instrumentation conditions under which you will be testing the unknowns, and
  3. determining whether or not the resulting data points fit along a line as evaluated by some measure such as Ordinary Least Squares, regression analysis or, even better, adjusted r² or Weighted Least Squares with a Lack-Of-Fit examination with p values expressed.
  4. We accept this as being the TRUE VALUE if the criteria are met, and we examine unknowns by correlating their response to the calibration curve to arrive at a result.
  5. PROPER SCIENTIFIC CALIBRATION REQUIRES RE-EXAMINING THE MACHINE OVER TIME USING THE SAME SERIES OF TESTS BUT THEN ACTING ON THE DATA to try to eliminate or correct for bias and error in terms of precision. It is this “acting on the data” element that differentiates this from our next concept.
  • Ver­i­fi­ca­tion is the act of test­ing a known stan­dard and com­par­ing the response on the detec­tor when it is tested to dis­cover whether or not it is within an arbi­trary “zone” of accept­able response from the pre­vi­ous calibration.

The data:

(You may want to watch this one on YouTube in full screen so you can see the detail and the legend that explains the data.)

  • Assume the blue line and the blue data points are the machine’s ini­tial act of cal­i­bra­tion as defined above.
  • Now sup­pose the red line is indica­tive of the lower and upper bounds of the arbi­trary accep­tance cri­te­ria (oth­er­wise thought of as a “zone” of proper responses) in terms of the orig­i­nal blue line.
  • Finally, sup­pose the green line and the green data points are the sub­se­quent “cal­i­bra­tion test” results as per­formed all across the United States.

Dis­cus­sion of the data:

  • It is clear that the y-intercept and the slope of the two lines (blue and green) are different.
  • But both meet the traditional r² = 0.999 measure, and the green line and data points clearly lie within the red lines.
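For readers who cannot view the video, here is a minimal sketch of the same situation with made-up numbers. Everything in it is an assumption for illustration only: the concentrations, the responses, the acceptance zone (±0.005 or ±5%, whichever is greater) and the 0.082 response of an unknown are hypothetical and are not taken from any real device.

```python
def fit_line(x, y):
    """Ordinary least squares fit; returns (slope, intercept, r_squared)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


def to_concentration(response, slope, intercept):
    """Read an unknown off a calibration line (inverse prediction)."""
    return (response - intercept) / slope


# Hypothetical known standards and the responses they produced.
conc = [0.02, 0.05, 0.10, 0.20, 0.30]
blue = [0.020, 0.050, 0.101, 0.199, 0.300]    # initial calibration
green = [0.022, 0.053, 0.104, 0.208, 0.311]   # later "calibration test"

slope_blue, icpt_blue, r2_blue = fit_line(conc, blue)
slope_green, icpt_green, r2_green = fit_line(conc, green)

# The "red lines": an assumed acceptance zone around the known values.
within_zone = all(abs(g - c) <= max(0.005, 0.05 * c) for g, c in zip(green, conc))

# The same raw response from an unknown, read against each line.
response = 0.082
est_blue = to_concentration(response, slope_blue, icpt_blue)
est_green = to_concentration(response, slope_green, icpt_green)

print(f"blue:  slope={slope_blue:.4f} intercept={icpt_blue:.5f} r2={r2_blue:.6f}")
print(f"green: slope={slope_green:.4f} intercept={icpt_green:.5f} r2={r2_green:.6f}")
print(f"green inside acceptance zone: {within_zone}")
print(f"unknown read against blue:  {est_blue:.4f}")
print(f"unknown read against green: {est_green:.4f}")
```

With these invented numbers both lines clear r² = 0.999 and every green point sits inside the zone, yet the same response reads above 0.080 against the blue line and below 0.080 against the green line. That gap is the bias that goes uncorrected when the green data are not acted upon.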

Sig­nif­i­cance of the data and our procedure:

If what we were per­form­ing to get the green line is a true cal­i­brat­ing act, then when we get the green data points and estab­lish the green line and then from that point for­ward, we would use that data and the resul­tant green line as our pre­sumed true val­ues and eval­u­ate all future data of the unknowns based upon that.

However, as far as I can see from the data, this is not what happens with the green line event (the “calibration test”). There is no acting on the data.

Instead, the green line is ignored. No adjustment is made on these devices to correct for what we see above, which is the clear bias of the device.

Con­clu­sion:

So, by def­i­n­i­tion the green line event is sim­ply not a true sci­en­tific cal­i­bra­tion, but rather a ver­i­fi­ca­tion over the same data points.

Now my questions:

Where and when is the true cal­i­bra­tion of these machines per­formed and by whom?

Stated dif­fer­ently where is the data actu­ally exam­ined and a true sci­en­tific cal­i­bra­tion curve established?

Most impor­tantly, how long ago was this done?

So, I sup­pose the answer is that once it leaves the man­u­fac­turer it never really and truly is cal­i­brated (in the sci­en­tific sense) ever again. At most, it is verified.

When the gov­ern­ment comes in and claims the machine “had just been cal­i­brated” the day before (i.e., a “cal­i­bra­tion test” was per­formed the day before) this is not true.

So I guess the secret is that there is no spoon. Is there?

 

World Metrology Day

Today is one of the most awesome days to celebrate in the world: it’s World Metrology Day!

Happy World Metrol­ogy Day

Accord­ing to their web­site: http://www.metrologyinfo.org/worldmetrologyday/

World Metrol­ogy Day cel­e­brates the sig­na­ture by rep­re­sen­ta­tives of sev­en­teen nations of The Metre Con­ven­tion on 20 May 1875. The Con­ven­tion set the frame­work for global col­lab­o­ra­tion in the sci­ence of mea­sure­ment and in its indus­trial, com­mer­cial and soci­etal appli­ca­tion. The orig­i­nal aim of the Metre Con­ven­tion — the world­wide uni­for­mity of mea­sure­ment — remains as impor­tant today as it was in 1875.

The World Metrol­ogy Day project is cur­rently real­ized jointly by the BIPM and the OIML together with PTB Inter­na­tional Tech­ni­cal Coöperation.

World Metrol­ogy Day has become an estab­lished annual event dur­ing which more than eighty States cel­e­brate the impact of mea­sure­ment on our daily lives, no part of which is untouched by this essen­tial, and largely hid­den, aspect of mod­ern soci­ety. Pre­vi­ous themes have included top­ics such as mea­sure­ments for inno­va­tion, and mea­sure­ments in sport, the envi­ron­ment, med­i­cine, and trade.

UNESCO and IUPAC have decided to des­ig­nate 2011 as The Inter­na­tional Year of Chem­istry (IYC 2011), a world­wide cel­e­bra­tion of the achieve­ments of chem­istry and its con­tri­bu­tions to the well-being of humankind. Under the uni­fy­ing theme “Chem­istry — our life, our future,” IYC 2011 will offer a range of inter­ac­tive, enter­tain­ing, and edu­ca­tional activ­i­ties for all ages. The year 2011 also coin­cides with the cen­te­nary of the Nobel Prize in Chem­istry awarded to Madame Marie Curie — an oppor­tu­nity to cel­e­brate the con­tri­bu­tions of women to science.

Chem­istry is a cre­ative sci­ence that is essen­tial for sus­tain­abil­ity and improve­ments to our way of life. All known mat­ter is com­posed of pure chem­i­cal ele­ments or of com­pounds made from those ele­ments. Humankind’s under­stand­ing of the mate­r­ial nature of our world is grounded in our knowl­edge of chem­istry. Mol­e­c­u­lar trans­for­ma­tions are cen­tral to the pro­duc­tion of food­stuffs, med­i­cines, fuels, and met­als — i.e. vir­tu­ally all man­u­fac­tured and extracted products.

The World Metrol­ogy Day 2011 mes­sage Chem­i­cal mea­sure­ments for our life, our future builds upon the IYC 2011 theme. Chem­istry and chem­i­cals pose par­tic­u­larly inter­est­ing chal­lenges to the mea­sure­ment com­mu­nity: thou­sands of com­pounds must be mea­sured, and the range of con­cen­tra­tions at which some com­pounds must be reli­ably detected, quan­ti­fied, and in some cases reg­u­lated can nowa­days extend down to parts per bil­lion (or even tril­lion). Yet the abil­ity to make appro­pri­ately accu­rate and reli­able chem­i­cal mea­sure­ments is cru­cial to our econ­omy, our envi­ron­ment and our per­sonal well being; in short we must not under­es­ti­mate the impor­tance of Chem­i­cal mea­sure­ments for our life, our future.

National mea­sure­ment sys­tems must rely on agreed stan­dards, units, and tech­niques to make con­sis­tent, repro­ducible and accu­rate mea­sure­ments. Each sys­tem of national mea­sure­ment stan­dards and lab­o­ra­to­ries is then linked into a world-wide net­work coör­di­nated by the Inter­na­tional Bureau of Weights and Mea­sures (BIPM). This net­work gives soci­ety access to accu­rate mea­sure­ments in order to meet today’s chal­lenges in health­care, within the envi­ron­ment and in all the new tech­nolo­gies and processes. In indus­try and com­merce, it helps ensure prod­uct qual­ity and inter­op­er­abil­ity, elim­i­nates waste, raises pro­duc­tiv­ity, and facil­i­tates trade based on agreed mea­sure­ments and tests. It also enables sci­en­tists to use a com­mon lan­guage to under­pin their col­lab­o­ra­tion across the world and ensure that their exploits can be taken up and accu­rately repro­duced by com­pa­nies wher­ever they operate.

National and regional metro­log­i­cal reg­u­la­tions must be based on agreed tech­ni­cal require­ments in order to help avoid or elim­i­nate tech­ni­cal bar­ri­ers to trade, ensure fair trade prac­tice, care for the envi­ron­ment and main­tain a sat­is­fac­tory health­care sys­tem. The Inter­na­tional Orga­ni­za­tion of Legal Metrol­ogy (OIML) has devel­oped a world­wide tech­ni­cal struc­ture by means of which it pro­vides its Mem­bers with tech­ni­cal Rec­om­men­da­tions and Doc­u­ments as well as Guides, Vocab­u­lar­ies and other pub­li­ca­tions. When devel­op­ing their metro­log­i­cal leg­is­la­tion and reg­u­la­tions, OIML Mem­bers can ensure they meet these objec­tives by includ­ing the require­ments con­tained in the rel­e­vant OIML publications.

This year, in their mes­sages to the world of metrol­ogy, Gov­ern­ments, com­pa­nies, aca­d­e­mics, and indeed to the man or woman in the street, the Direc­tors of the Inter­na­tional Bureau of Weights and Mea­sures and of the Inter­na­tional Bureau of Legal Metrol­ogy both high­light the impor­tance of accu­rate, reli­able and inter­na­tion­ally accepted chem­i­cal mea­sure­ments in the mod­ern world as it deals with today’s grand challenges.

Here are some of our posts on Metrol­ogy (in no par­tic­u­lar order):

HERE’S TO METROLOGY! Look for­ward to us giv­ing you many more posts on it.

 

Cali­bra­tion (and bias) schema is a pro­ce­dure that imper­fectly trans­forms a response into a use­ful measure.

Some crime laboratories have no method or manner as to how, why or when they should calibrate their instruments. Other laboratories calibrate their instruments at truly arbitrary intervals, but then declare, without data to support such a declaration, that this arbitrary interval is sufficient to insure against lack of precision or accuracy or analytical drift.

First, who really cares what some crime laboratory thinks it should be doing? If we let them run themselves, we get tragedies like San Francisco or Colorado or Nassau or Houston. While it is useful to discover their own published internal “this is the way we do it” ideas, and these may prove useful to throw in their face if they don’t follow their own established procedure, their method of “we do it every month or every week” is not true science!

Remember, calibration and bias have to do with precision and accuracy respectively. The issue surrounds analytical drift and other basics of metrology. Over time, regardless of use (although heavy throughput does exacerbate the problem, and so too does lack of use), all analytical devices lose their sensitivity, their precision and their accuracy.

Multiple measure explanation of metrology
This graph­i­cal rep­re­sen­ta­tion is the best one to show the inter­sect of these depen­dent fea­tures over mul­ti­ple mea­sures. As we can see from this depic­tion the goal of min­i­miz­ing risk of bias and cal­i­bra­tion error is a mov­ing tar­get. You adjust one and the other may suf­fer. It is also quite costly to min­i­mize both simul­ta­ne­ously. It is expo­nen­tially eas­ier and cheaper to cor­rect for cal­i­bra­tion error than bias error.

For more on the nomenclature involved, please visit “A rose by any other name??? More on Metrology and its nomenclature.”

To have a valid result that is as close as scientifically possible to a true result, those seeking to measure must prove and demonstrate that they are robust and stable in their approach to measurement. At a minimum, this is:

  • why they must estab­lish exter­nal cal­i­bra­tion curves in a metro­log­i­cally respon­si­ble way using CRMs in at least a 5x5 method with con­cen­tra­tions involv­ing ana­lytes of inter­est and inter­nal stan­dards within the range of pre­dicted response (demon­strated lin­ear dynamic range), and
  • why they must insert controls and verifiers (at the least a high and a low concentration involving analytes of interest and internal standards within the range of predicted response) within a run, and
  • why con­trol charts must be maintained.

While having a “we do it every Monday” plan is swell, it does not necessarily equate to demonstrating that they have a robust method and stable instruments that result in good analysis that is valid and as close to a true result as scientifically possible. They must have and use control charts and other data to justify the established interval of their calibration and accuracy schema. Calibration and its interval must be a data-driven decision and not an act of faith or guesswork.
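To make the control chart idea concrete, here is a minimal sketch of a Shewhart-style chart. The control values are invented, and the 2-sigma warning and 3-sigma action limits are a common convention that I am assuming here, not a rule taken from any particular laboratory's procedure.

```python
from statistics import mean, stdev

# Hypothetical historical results for a 0.080 control run with each batch.
baseline = [0.0801, 0.0798, 0.0803, 0.0799, 0.0802,
            0.0797, 0.0800, 0.0804, 0.0799, 0.0801]

center = mean(baseline)   # center line of the chart
sigma = stdev(baseline)   # sample standard deviation of the baseline


def classify(value):
    """Label a new control result against the baseline limits."""
    z = abs(value - center) / sigma
    if z > 3:
        return "out of control"   # beyond the assumed action limit
    if z > 2:
        return "warning"          # beyond the assumed warning limit
    return "in control"


# Hypothetical new control results from three later runs.
new_results = [0.0800, 0.0806, 0.0812]
labels = [classify(v) for v in new_results]

for value, label in zip(new_results, labels):
    print(f"{value:.4f}: {label}")
```

A chart like this is the kind of data that can justify a calibration interval: when controls start to drift toward or past the limits, the data, not the calendar, say it is time to recalibrate.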

We would do well to remember that simply because linearity has been demonstrated within a certain dynamic range [they use Ordinary Least Squares (coefficient of determination from basic regression analysis) = 0.999, when they should be using Weighted Least Squares], it does not necessarily equate to a pronouncement that the measure itself is sound (meaning, I suppose, valid and true). All that it means, if properly established by a validated calibration and bias testing schema such as I describe below in points 1.1 through 1.7, is that the measure is within some sort of prediction interval within some level of statistical tolerance under the given conditions and variables that gave rise to the calibration and bias attempt, in the exact matrix of the CRMs.

The basic seven steps to a robust and valid cal­i­bra­tion schema using CRMs include:

1.1   Plot response ver­sus true con­cen­tra­tion using the 5-by-5 method,

1.2   Deter­mine the behav­ior of the stan­dard devi­a­tion of the response,

1.3   Fit the proposed model and evaluate R²adj,

1.4  Exam­ine the resid­u­als for non-randomness,

1.5   Eval­u­ate the p-value for the slope (and any higher-order terms),

1.6  Per­form a lack-of-fit eval­u­a­tion, and

1.7  Plot and eval­u­ate the pre­dic­tion interval.
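The seven steps can be sketched in a few lines of code. This is only an illustration under stated assumptions: the 5-by-5 data are invented, the fit is plain Ordinary Least Squares (a Weighted Least Squares fit would be the better choice when the standard deviation grows with concentration, as it does in this made-up data), and the two critical values are the usual 5% table values for 23 and for 3 and 20 degrees of freedom rather than computed p-values.

```python
import math
from statistics import mean, stdev

# Hypothetical 5-by-5 data: five CRM concentrations, five replicate
# responses each (for example, analyte/internal-standard area ratios).
data = {
    0.02: [0.201, 0.198, 0.203, 0.199, 0.200],
    0.05: [0.502, 0.497, 0.505, 0.499, 0.501],
    0.10: [1.004, 0.995, 1.008, 0.998, 1.001],
    0.20: [2.010, 1.990, 2.012, 1.996, 2.003],
    0.40: [4.015, 3.985, 4.020, 3.990, 4.006],
}

# 1.1 Response versus true concentration, flattened into paired lists.
x = [c for c, reps in data.items() for _ in reps]
y = [r for reps in data.values() for r in reps]
n, k = len(x), len(data)

# 1.2 Behavior of the standard deviation of the response at each level.
sd_by_level = {c: stdev(reps) for c, reps in data.items()}

# 1.3 Fit a straight line by OLS and evaluate adjusted R-squared.
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((yi - y_bar) ** 2 for yi in y)
r2_adj = 1 - (ss_res / (n - 2)) / (ss_tot / (n - 1))

# 1.4 Residuals: the mean residual per level; a trend suggests curvature.
mean_resid = {c: mean(reps) - (intercept + slope * c) for c, reps in data.items()}

# 1.5 Significance of the slope via its t statistic.
s = math.sqrt(ss_res / (n - 2))       # residual standard deviation
t_slope = slope / (s / math.sqrt(sxx))
T_CRIT = 2.069                        # two-sided 95% t value, 23 df (table)
slope_significant = abs(t_slope) > T_CRIT

# 1.6 Lack-of-fit F test: lack-of-fit variance versus pure replicate error.
ss_pe = sum((r - mean(reps)) ** 2 for reps in data.values() for r in reps)
ss_lof = ss_res - ss_pe
f_lof = (ss_lof / (k - 2)) / (ss_pe / (n - k))
F_CRIT = 3.10                         # upper 5% point of F(3, 20) (table)
lack_of_fit = f_lof > F_CRIT


# 1.7 Half-width of the 95% prediction interval for a new response at x0.
def prediction_half_width(x0):
    return T_CRIT * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)


print(f"SD by level: { {c: round(v, 4) for c, v in sd_by_level.items()} }")
print(f"slope={slope:.4f} intercept={intercept:.5f} R2adj={r2_adj:.6f}")
print(f"t(slope)={t_slope:.1f} significant={slope_significant}")
print(f"F(lack of fit)={f_lof:.4f} lack_of_fit={lack_of_fit}")
print(f"prediction half-width at 0.10: {prediction_half_width(0.10):.4f}")
```

Note that a model can pass every one of these checks and still only describe the instrument on the day, in the matrix, and under the conditions in which the CRMs were run.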

Note that there are no universal standards in terms of instructions for calibration and bias schemas in forensic science. In industry (meaning Guideline for Good Clinical Practice, US Food and Drug Administration Good Laboratory Practices methods, Good Manufacturing Practices methods, US Environmental Protection Agency methods, and International Conference on Harmonization publications), a method needs to be re-validated (e.g., at a minimum a new calibration curve established):

  • When­ever there is a change in the method,
  • When­ever there is change in instrumentation,
  • When­ever there is a change in the software,
  • When­ever there is change in the envi­ron­men­tal con­di­tions of the laboratory,
  • Whenever there is change in the consumables (e.g., new septa, new injector port liner, new o-ring, column is removed and installed, column is clipped, column is adjusted, new make-up gas cylinder is installed, new carrier gas cylinder is installed, new gold seal is installed, new FID is installed, etc.),
  • When­ever there is a recorded appar­ent aber­rant result,
  • When­ever the machine appears to be out­side of val­i­dated con­trol chart pre-established cal­i­bra­tion and bias areas, and
  • When­ever the ana­lyst has rea­son to believe or sus­pect that there has been ana­lyt­i­cal drift or loss of cal­i­bra­tion and increase in bias.

Some folks have estab­lished in their GLP that when exe­cut­ing a full shut down and re-powering up of the instru­ment there is a sig­nif­i­cant enough change to require full cal­i­bra­tion and bias deter­mi­na­tion to occur again. Oth­ers say that that is not a sig­nif­i­cant enough event, but they have data to sup­port that decision.

In foren­sic sci­ence, where is the data?

 

A rose by any other name would smell as sweet…

is a quo­ta­tion by William Shake­speare from his play Romeo and Juliet meant to say that the names of things do not mat­ter, only what things are. In the play Romeo and Juliet, the line is said by Juliet in ref­er­ence to Romeo’s house, Mon­tague which would imply that his name means noth­ing and they should be together.

Does nomen­cla­ture really matter?

Why do we lawyers have prob­lems con­nect­ing and talk­ing with you scientists?

Why do you sci­en­tists seem so obtuse and need­lessly pedan­tic to us?

In part, it may be a nomenclature issue. I suggest that if we are going to try to open laboratories, make them transparent, and have lawyers and the judiciary examine their processes to ensure against unjust convictions, then we need to mind our nomenclature. I suggest that we use either the International Conference on Harmonization (ICH) or the International Union of Pure and Applied Chemistry (IUPAC) Gold Book definitions.

Nowhere in forensic testing is this need to use the correct nomenclature more important than when we discuss the validity of a given method of testing. A small component of validity is metrology. We have discussed metrology before here at the www.TheTruthAboutForensicScience.com blog as well as at www.PADUIBlog.com.

The goal of all mea­sure­ment is to try to cap­ture the true value or the actual value of that which we are mea­sur­ing. How­ever, we can never, never do so. We can only attempt to design a method of mea­sure­ment where we have set up a process where we have deter­mined what level of risk we are will­ing to accept that we are wrong. Mea­sure­ment is the study of accept­able risk. What level of risk is accept­able that we could be wrong in our mea­sure? You see we are always wrong. It is a ques­tion of how much are we will­ing to risk that we are wrong and how wrong are we will­ing to be. Uncer­tainty Mea­sure­ment (UM), if prop­erly done, is the imper­fect embod­i­ment of the expres­sion of that risk.

You write as to “accuracy.” Accuracy (strictly in an ICH and IUPAC way) is a particular type of assessment of a measurement. Accuracy is more properly known as “bias.” Bias is the measure of how close the results are to the true value. It is characterized by a low average deviation from the true (actual) value, but may or may not have a low Standard Deviation.

Then there is precision. Precision is an entirely different type of animal. They are inter-related and dependent variables, but they are entirely different concepts. Precision is more properly known as “calibration.” Precision is best defined as a measure of how close the results are to one another. It is characterized by a low Standard Deviation, but may or may not have a high average deviation from the true (actual) value. Precision is made up of repeatability, intermediate precision, and reproducibility. Repeatability is the ability, day-in and day-out, of the same method on the same instrumentation to arrive at the same result on the same unknown. Intermediate precision is an expression of within-laboratory variation: different days, different analysts, etc. Reproducibility is defined as the ability of a test or experiment to be accurately reproduced, or replicated, by someone else working independently. Precision should be investigated using homogeneous, authentic samples over the long term.
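A short sketch may help keep the two concepts apart. It takes bias as the average deviation from the true value and precision as the Standard Deviation of repeated results, which is how the two are described above. The two sets of replicate results and the true value of 0.100 are invented for illustration.

```python
from statistics import mean, stdev


def bias_and_precision(results, true_value):
    """Return (average deviation from the true value, standard deviation)."""
    return mean(results) - true_value, stdev(results)


true_value = 0.100  # hypothetical certified value of the sample

# Hypothetical replicate results from two instruments.
tight_but_off = [0.110, 0.111, 0.109, 0.110, 0.110]        # precise, biased
scattered_on_target = [0.090, 0.112, 0.098, 0.105, 0.095]  # unbiased, imprecise

bias_a, sd_a = bias_and_precision(tight_but_off, true_value)
bias_b, sd_b = bias_and_precision(scattered_on_target, true_value)

print(f"tight but off:       bias={bias_a:+.4f}  SD={sd_a:.4f}")
print(f"scattered on target: bias={bias_b:+.4f}  SD={sd_b:.4f}")
```

The first set agrees with itself beautifully and is wrong by about 0.010 every time; the second averages out to the true value while any single result may be far off. Neither number alone tells you whether a measurement can be trusted.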

There are three graph­i­cal rep­re­sen­ta­tions that best and most sim­ply show these concepts.

(This graph­i­cal rep­re­sen­ta­tion is the best one to describe these inter­sect­ing and depen­dent fea­tures as to a sin­gle mea­sur­ing event)

Multiple measure explanation of metrology

(This graph­i­cal rep­re­sen­ta­tion is the best one to show the inter­sect of these depen­dent fea­tures over mul­ti­ple mea­sures. As we can see from this depic­tion the goal of min­i­miz­ing risk of bias and cal­i­bra­tion error is a mov­ing tar­get. You adjust one and the other may suf­fer. It is also quite costly to min­i­mize both simul­ta­ne­ously. It is expo­nen­tially eas­ier and cheaper to cor­rect for cal­i­bra­tion error than bias error.)

(This graph­i­cal rep­re­sen­ta­tion is from Ted Vosk’s pre­sen­ta­tion at the AAFS meet­ing. I am unsure as to where he got it. This graph­i­cal rep­re­sen­ta­tion again looks at an indi­vid­ual mea­sure and shows the dif­fer­ence between Type I error and Type II error. Type I error can be termed, by and large, as a func­tion of bias; whereas, Type II error is, by and large, a func­tion of calibration.)

 

This post is inspired by a combination of two events: first, a comment by opposing counsel, and second, several comments by “old school” forensic science practitioners at this year’s American Academy of Forensic Sciences (AAFS) annual meeting, which I attended. In a contested hearing, opposing counsel argued that there was “no such thing as metrology.” He said this despite the fact that he heard from two metrologists about the well-established science and its application. The second set of comments came at the AAFS meeting when Ted Vosk, Esquire, who is a friend and colleague of mine, was presenting. He was lecturing as part of a presentation on Uncertainty Measurement (UM) reporting in the forensic arena. In reaction to his words, some people commented that UM reporting was a “waste of time” or a “useless exercise.” One person commented that if it were to be done, “where would I stop the figuring of UM.”

I actu­ally think it is a sim­ple case.

Quite frankly, I don’t understand what all of the hubbub is about in not reporting UM.

What is the con­fi­dence of your mea­sure? Are you a house cat who thinks he is a lion?

Whereas my good friend Ted Vosk, Esquire made a very good, very con­vinc­ing and very impas­sioned plea to the analyst’s sense of jus­tice and sci­ence, I am going to try to be more prac­ti­cal. I am going to make an appeal to your log­i­cal bias.

Here is my open address to all of those involved in foren­sic sci­ence (regard­less of whether you are employed by a pros­e­cu­tor or a defender) in terms of UM reporting:

An opin­ion let­ter to all those in foren­sic sci­ence lab­o­ra­to­ries today

Dear Foren­sic Scientist,

I know you are not the robot that you claim to be when you perform some form of science. I understand that you are a real life human being. As such, you have bias. And you know what? Here’s a dirty little secret: it is totally acceptable that you do. You cannot not have bias. You have no choice in the matter. It is failing to acknowledge your bias that is dangerous. If you acknowledge you have bias, then you can take steps to mitigate it and try your best not to allow it to inappropriately influence your process, your procedure, your performance, your interpretation, your opinion and your conclusion.

Your bias could be as extreme as that you want one side to win. Your bias could be that you want to defend your inter­pre­ta­tion or your opin­ion. Your bias could be that you want to defend your data. Your bias could be that you want to defend your pro­fes­sion. Your bias could be that you want to defend what you do or did.

To that end, I appeal to your bias with this. Logically.

1. You are not the finder of facts.

2. You are not sup­posed to be an advocate.

3. To do oth­er­wise, you are an edi­tor of facts.

4. When you present your measure as an absolute value, the old adage of “A half truth + A half truth = A full lie” applies.

But you still say “Why present UM at all or why should I present it unless it is near a crit­i­cal value?”

Well, it’s simple. These days, criminal defense lawyers win by exposing the whole truth when you choose not to present the whole truth. When you present a measure, whether it is a qualitative or quantitative measure, as an absolute and therefore free of any sort of doubt or error, you know that scientifically this is wholly wrong.

A half truth + A half truth = A full lie.

My colleagues are slowly learning the whole scientific truth. When you show half the truth, we show the whole truth. We show the truth in the limitation of the assay performed, the truth about the limitation of your knowledge and experience, and the truth that you made assumptions or interpretations or judgment calls along the way.

No matter how much you try on re-direct to justify this initial lack of full disclosure of the whole truth, you will likely lose. Also, it frequently doesn’t matter if you have the UM ready to report on re-direct examination. You have been exposed. There is doubt.

The simple truth is that in some tightly controlled and truly validated methods, the demand for honest and complete reporting of the expanded UM in both the quantitative measure and the qualitative measure (using metrologically acceptable methods such as the propagation of errors method or Monte Carlo analysis) may actually show that there is no possible way that the value could be below the critical measure. In my view, I say “Good,” and “So be it.” If you can legitimately and statistically prove (not just with a simply stated value) that it takes 6, 7, 8 or 100 sigma to get below the critical value, then you have nothing to fear, do you? But do you know you are in control?

If it is true, then that is what belongs in the court­room and noth­ing else.

To do oth­er­wise is a sci­en­tific sin (Vosk’s point) and will make you seem decep­tive because you know what? You are (my point).

True science is not your private parochial sandbox that you need to “protect” us from, but rather it is for all of us to share in the joy of unbiased discovery of the truth at the temple of empiricism.

With true sin­cer­ity as a true admirer of val­i­dated science,

Justin J. McShane, Esquire
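As a postscript to the letter, here is a rough sketch of what “how many sigma to get below the critical value” looks like in numbers. It uses both approaches the letter names: propagation of errors and a Monte Carlo check. Every figure is hypothetical, and the sketch assumes independent, normally distributed uncertainty components that simply add to the result, which a real uncertainty budget would have to justify.

```python
import math
import random
from statistics import NormalDist

measured = 0.085   # hypothetical reported result
critical = 0.080   # hypothetical critical (per se) value

# Hypothetical standard uncertainty components, same units as the result.
components = {"repeatability": 0.0020, "calibrator": 0.0015, "method bias": 0.0017}

# Propagation of errors for independent additive components:
# combined standard uncertainty is the root sum of squares.
u_c = math.sqrt(sum(u ** 2 for u in components.values()))
expanded_U = 2 * u_c                              # coverage factor k = 2
sigmas_to_critical = (measured - critical) / u_c
p_below = NormalDist(measured, u_c).cdf(critical)

# Monte Carlo check: perturb the result by each component and count how
# often the simulated value falls below the critical value.
rng = random.Random(0)   # fixed seed so the run is repeatable
draws = 100_000
hits = 0
for _ in range(draws):
    value = measured + sum(rng.gauss(0.0, u) for u in components.values())
    if value < critical:
        hits += 1
mc_p_below = hits / draws

print(f"result: {measured:.3f} +/- {expanded_U:.4f} (k = 2)")
print(f"sigma between result and critical value: {sigmas_to_critical:.2f}")
print(f"chance the value is below {critical:.3f}: "
      f"{p_below:.3f} (analytic), {mc_p_below:.3f} (Monte Carlo)")
```

With these invented numbers the result sits less than two sigma above the critical value, so reporting a bare 0.085 would hide a real chance that the true value is below 0.080. Had the distance been 6 or 8 sigma, the same arithmetic would have shown there is nothing to fear from reporting it.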

 

Conclusion to the twelve part ISO 17025 introduction

Scan this QR mark into your cell phone to get bonus infor­ma­tion on Lord Kelvin

In a series of posts, I am going to intro­duce the reader to the exis­tence of ISO 17025 and its impor­tance.  I am going to intro­duce it in bite-sized bits for easy diges­tion.  Just like all mat­ters of learn­ing, knowl­edge is incre­men­tal over time and builds upon pre­vi­ous exposure.

So far we have answered the fol­low­ing questions:

In today’s post we seek to tie all of the other 11 posts together into some­thing mean­ing­ful to the Practitioner.

While it is important for the criminal law Practitioner to note that ISO 17025 is coming to a laboratory near you, it provides only a useful framework in which minimum standards of scientifically acceptable policy, procedure, and instructions result in the overarching Quality Management System. These are minimum safeguards that are recognized per ISO, and they certainly do not guarantee that a laboratory will produce a forensically acceptable result at the end of its implementation of the ISO 17025 quality management system.

(Con­sider this: Who wants a doc­tor who only meets “min­i­mum stan­dards” or a lawyer who only meets “min­i­mum standards.”)

Some of the most striking inadequacies of ISO 17025 and ASCLD/LAB’s interpretation of it concern the definition of “customer,” as described in our earlier post, as well as when and if Uncertainty Measurement (UM) should be reported, as discussed previously.

With the “customer” being interpreted as the prosecuting authority, UM will not likely be reported

Another systemic shortcoming of ISO 17025 and ASCLD/LAB’s interpretation of it comes in the basics of true Bayesian-based expanded UM reporting. While they address for the first time the need to be uniform and consistent in the approach towards testing and calibration among laboratories, within a laboratory, and even down to the analyst, they focus exclusively on the quantification of an unknown. They leave untouched and unaddressed, except indirectly in the method validation requirements, the importance of the qualitative measurement and its validity. Nowhere in ISO 17025 is the need to be selective and specific in a reported qualitative measurement directly addressed. This rush to be able to express UM in terms of quantification seems to be placing the cart before the proverbial horse. If one focuses on the quantitative measure but cannot first be certain that the method employed results in a qualitative measure that is both selective and specific to the exclusion of all other possible conclusions in terms of a qualitative result, then there can be no real value in the measure itself. In other words, the key question of “Does the method exclusively and uniquely measure what we need measured?” remains unaddressed.

William Thomson, 1st Baron Kelvin (26 June 1824 – 17 December 1907) once wrote, “When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of Science, whatever the matter may be.”

The days of ISO 17025 are shortly coming upon us. If the defense bar is properly prepared, then we can provide to the trier of fact and the citizen among us who has entrusted us with his liberty that final and all-important last check on the unfettered power of the great Leviathan that is the government. It is up to us, armed with knowledge, to defend the citizen among us who has been accused of a crime. It is only through our own ignorance that we ensure that personal tragedy and injustice result. As Albert Einstein once penned, “As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality.”