Author: Richard Johnston

  • 2017 ATS PFT Reporting Standardization

    The ATS has released its first standard for reporting pulmonary function results. This report is in the December 1, 2017 issue of the American Journal of Respiratory and Critical Care Medicine. At present, however, despite its importance, it is not an open access article and you must either be a member of the ATS or pay a fee ($25) to access it. Hopefully, it will soon be included with the other open access ATS/ERS standards.

    There are a number of interesting recommendations made in the standard that supersede or refine recommendations made in prior ATS/ERS standards, or are otherwise presented for the first time. Specific recommendations include (although not necessarily in the order they were discussed within the standard):

    • The lower limit of normal, where available, should be reported for all test results.
    • The Z-score, where available, should be reported for all test results. A linear graphical display for this is recommended for spirometry and DLCO results.
    • Results should be reported in tables, with individual results in rows. The result’s numerical value, LLN, Z-score and percent predicted are reported in columns, in that recommended order. Reporting the predicted value is discouraged.
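    As a rough illustration of how the numerical value, LLN, Z-score and percent predicted in such a table relate to one another, here is a minimal sketch. It assumes a simple reference equation that provides a predicted value and a residual standard deviation (RSD), with the LLN placed at the 5th percentile (Z = -1.645). GLI-style equations use a more involved approach, and the function names and example numbers below are hypothetical.

```python
def z_score(measured, predicted, rsd):
    """Number of residual standard deviations the result lies from predicted."""
    return (measured - predicted) / rsd


def lower_limit_of_normal(predicted, rsd, z_cutoff=-1.645):
    """LLN taken as the 5th percentile of a normally distributed reference."""
    return predicted + z_cutoff * rsd


def percent_predicted(measured, predicted):
    """Result as a percent of the predicted value."""
    return 100.0 * measured / predicted


# Hypothetical FVC result (liters); not taken from the standard.
measured, predicted, rsd = 3.20, 4.00, 0.50
row = {
    "value": measured,
    "LLN": lower_limit_of_normal(predicted, rsd),
    "Z-score": z_score(measured, predicted, rsd),
    "% predicted": percent_predicted(measured, predicted),
}
print(row)
```

    The dictionary keys follow the column order recommended above (value, LLN, Z-score, percent predicted).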

    Part of Figure 1 from page 1466 of the ATS Recommendations for a Standardized Pulmonary Function Report.

    • All tests should be reported as separate sections. Spirometry, slow vital capacity, lung volumes and DLCO should be reported in that order, with other tests following.
    • The reference source for normal values should be included with each section of the report.
    • Although not specifically disallowed, the use of bold face or colored fonts when displaying results below the LLN is discouraged.
    • Only FVC, FEV1 and the FEV1/FVC ratio should be routinely reported for spirometry. Reporting FEF25-75 and instantaneous flows (FEF75% for example) is discouraged. Expiratory time (FET) should be reported for quality assessment purposes.
    • The FEV1/FVC ratio should be reported only as a decimal fraction, along with its LLN and Z-score. Reporting it as a percent of predicted is specifically discouraged.
    • The use of the GLI spirometry reference equations is recommended, although the NHANES III spirometry reference equations are considered acceptable if continuity is important. No recommendations were made for lung volume or DLCO reference equations.
    • Spirometry test quality should be graded using an A-F scale. Although it is recommended that results with a quality score of F should not be reported, the standard notes that acceptable test quality is not possible in some subjects and that it is up to the interpreting reviewer to decide whether poor quality results are adequate for interpretation or not.
    • For spirometry both the flow-volume loop and volume-time curve should be included. Peak flow should be read from the flow-volume loop graph and not reported in the table of spirometry results.
    • The minimum sizes of the flow-volume loop and volume-time curve are mandated. Flow-volume loops are to be no smaller than 5 mm per L/sec and 10 mm per L. Volume-time curves are to be no smaller than 10 mm per L and 20 mm per second. These specifications do not appear to include any numerical scale.

      Note: For my lab’s reports a flow-volume loop would be at least 16 cm x 8 cm (+/- 16 L/sec x 8 L) and a volume-time curve would be at least 8 cm x 32 cm (8 L x -1/+15 sec). This is a bit problematic since 32 cm is 12.6 inches and at this scale a volume-time curve would not fit a standard sheet of paper and we’d have to use a smaller length of time.

    • The SVC should be reported in a separate section and, if available, should include the FEV1/VC ratio. Although an example in the standard included IC as a reported value, no recommendations concerning the reporting of IC and ERV were made within the text of the report.
    • The volume-time curve for the SVC should be reported with enough of the volume-time curve to show the end-exhalation baseline and the SVC maneuver.
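    The graph sizes quoted in the note above for my lab's flow-volume loop and volume-time curve follow directly from the minimum scale factors. A minimal sketch of that arithmetic, using the axis ranges from the note (the function name is just for illustration):

```python
MM_PER_INCH = 25.4


def axis_length_mm(axis_span, mm_per_unit):
    """Minimum printed length of an axis that covers axis_span units."""
    return axis_span * mm_per_unit


# Flow-volume loop: flow axis spans -16 to +16 L/sec at 5 mm per L/sec,
# volume axis spans 8 L at 10 mm per L.
flow_volume_mm = (axis_length_mm(32, 5), axis_length_mm(8, 10))

# Volume-time curve: volume axis spans 8 L at 10 mm per L,
# time axis spans -1 to +15 sec at 20 mm per second.
volume_time_mm = (axis_length_mm(8, 10), axis_length_mm(16, 20))

time_axis_inches = volume_time_mm[1] / MM_PER_INCH
print(flow_volume_mm, volume_time_mm, round(time_axis_inches, 1))
```

    The time axis alone works out to 320 mm, which is where the 12.6 inch figure comes from.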

    Part of Figure 1 from page 1466 of the ATS Recommendations for a Standardized Pulmonary Function Report.

    • Reporting KCO from the DLCO is optional but reporting it as DL/VA is specifically discouraged. (Interestingly, the example included in the standard notes KCO to be reported as ml/min/mmHg/L which is essentially incorrect since there is no volume, i.e. ml or L, in KCO which to me indicates a continued and pervasive lack of understanding of the nature of KCO).
    • It is recommended that DLCO and DLCO corrected for barometric pressure be reported as separate values. An additional row that takes into consideration the predicted DLCO adjusted for hemoglobin is also recommended.
    • Although the DLCO’s VA, TLCsb and Vi/Vc were included in an example within the standard, reporting these results was not specifically addressed.

    Figure 3 from page 1467 of the ATS Recommendations for a Standardized Pulmonary Function Report.

    • Both date of birth and age should be reported. Age should include a decimal point for children and adolescents.
    • Barometric pressure should be reported (units are not specified but mm Hg were implied).
    • Oxygen saturation is suggested as a demographic item (although how and why is left open to question).
    • Whether or not reference values are adjusted for ethnicity or specific to an ethnicity must be included, but this is specified to be in the technician notes, not the tabular results. Interestingly, ethnicity/race is not included among the recommended demographic items.

    There are, however, a number of reporting and interpretation problems that were not addressed or were only partially addressed, with their solutions, if any, left to individual labs and manufacturers.

    • It’s possible for multiple post-baseline spirometry tests to be performed, such as post-BD, post-exercise and supine, but there is no easy way to report more than one pair in any report.
    • The reported FVC and FEV1 can be taken from different tests but there is no suggested way to indicate that this has occurred. In addition, only one flow-volume loop and volume-time curve can be reported and there are no suggestions whether either or both should be linked to the FEV1 or to the FVC.
    • Multiple N2 washout and plethysmographic lung volume tests can be, and often are, performed and their averaged results reported, but graphs can only be taken from a single test and there were no recommendations as to how this single graph is selected.

    Figure 2 from page 1467 of the ATS Recommendations for a Standardized Pulmonary Function Report

    • It appears to be specifically recommended that the helium curve for helium dilution lung volumes not be added to the report. Why this information is considered unimportant, while the N2 washout curve and plethysmographic pressure-volume tracing are considered worth including, was not explained.
    • It was noted that spirometry efforts with low peak flows can have a higher FEV1 due to reduced negative effort dependence. Although the standard recommends that a technician should coach for maximal effort there is no explicit recommendation that peak flow be used when selecting spirometry efforts.
    • The quality grading system for spirometry is based solely on the reproducibility of acceptable tests (both the number of tests and their closeness) but at the same time the standard notes that acceptability (back-extrapolation, expiratory time, EOT flow rate) needs to be assessed based on the limitations or abilities of the subject. However, almost none of these acceptability factors (other than what can be determined from flow-volume loops and volume-time curves) are specifically recommended to be reported and, in fact, reporting of them appears to be discouraged.
    • A graph showing the washout phase of the DLCO test with the alveolar sample was shown and suggested for inclusion in reports. However, this graph does not show exhaled volume, and the 2017 DLCO standard recommends that graphs of both the “full manoeuvre and exhaled gas concentration versus volume with sample collection” be reported.
    • Although additional tests such as Oscillometry, MIP/MEP and Exhaled NO were mentioned in passing (and I’ll add 6MWT and HAST testing), other than giving very general guidelines, there were no examples and no particular suggestions about reporting additional tests.
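    To make the point about the spirometry grading system concrete, here is a minimal sketch of a grade that depends only on the number of acceptable efforts and the closeness of the best two. The thresholds (150, 200 and 250 ml) and the handling of two or more efforts that are further apart are my assumptions for illustration and should be checked against the table in the standard. The function name is hypothetical.

```python
def spirometry_grade(acceptable_values_ml):
    """Grade from acceptable FEV1 (or FVC) values given in milliliters.

    Assumed thresholds, for illustration only:
      A: 3 or more acceptable efforts, best two within 150 ml
      B: 2 acceptable efforts, best two within 150 ml
      C: best two within 200 ml
      D: best two within 250 ml
      E: one acceptable effort (or best two further apart, an assumption)
      F: no acceptable efforts
    """
    count = len(acceptable_values_ml)
    if count == 0:
        return "F"
    if count == 1:
        return "E"
    ordered = sorted(acceptable_values_ml, reverse=True)
    spread_ml = ordered[0] - ordered[1]
    if spread_ml <= 150:
        return "A" if count >= 3 else "B"
    if spread_ml <= 200:
        return "C"
    if spread_ml <= 250:
        return "D"
    return "E"


print(spirometry_grade([3100, 3000, 2900]))
```

    Note that nothing about back-extrapolation, expiratory time or end-of-test flow enters the grade itself. Those factors only decide which efforts count as acceptable in the first place.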

    The ATS standards on reporting are welcome. I’ve seen reports from many different PFT labs and the lack of standardization is notable. In addition, many reports are missing important information that the new ATS reporting standard more or less mandates. Having said all this, the standard is a first effort and there are a number of issues, some of which I consider to be relatively basic, that were not addressed. In addition, a careful reading of the reporting standard shows that although some factors are mandated, many are not and much of the decision about what should and should not be included in a report is still up to individual labs and manufacturers. Although there were some specific recommendations there were no overall guidelines for which results must be reported; which results are optional; and which results must be excluded for spirometry, lung volumes and DLCO.

    A large part of the standard was devoted to quality assessment, but there were some inconsistencies, the most notable of which was the lack of any quality assessment for lung volume measurements. Moreover, much of what was said about quality assessment is to one extent or another derived from previous ATS/ERS standards and not original. Relatively strict grading systems for spirometry and diffusing capacity were recommended but the limitations of these systems were also extensively noted and this leaves their inherent value open to question. My criticism of grading systems like these is that they are focused on factors that are relatively easy to measure but leave the more difficult factors to the technician or interpreting physician. Even so, what’s important is that the standard does mandate the reporting of test quality.

    One very important issue that should have been included was standardized test nomenclature. This is implied to some extent in the included examples, but there are discrepancies, particularly in the use of acronyms, capitalization, subscripts and superscripts, between this standard and prior ATS/ERS standards. Because there is no recommended nomenclature, individual labs will continue to use the nomenclature they are comfortable with.

    Another issue that should have been discussed was trend reporting. As important as current test results may be, comparing them over time is at least equally important. Trend reports should be mandated and not left to the discretion of individual PFT labs and manufacturers.

    Finally, although not directly related to the reporting standard, equipment manufacturers need to develop friendlier and more responsive report management software. My lab’s reporting software has remained essentially unchanged for at least the last 15 years. Creating, modifying, selecting and printing reports continues to be a time-intensive process with much of the “heavy lifting” done by us and not the software.

    In addition, there is a reporting issue that appears to be more or less universal and that is the use of fixed-format pages where the positions of all the report elements are decided in advance. I think we can all agree that in order to make reports readable, tests that aren’t performed (i.e. empty tables and graphs) shouldn’t be on the report. But since all report elements are fixed in advance this means that my lab has to maintain over a half dozen different report formats for our most common combinations of tests. Even so, we still perform unusual combinations of tests on occasion and when this happens our only recourse (other than creating a new report format on the spot) is to use our “kitchen sink” report that includes all of the tests we are able to perform and is 5 pages long. As importantly, any changes made to a single report often have to be manually duplicated in the other half dozen report formats.

    Over 20 years ago, the DOS version of our lab software (during the pre-Windows stone age for all the youngsters out there) allowed us to format the individual sections of a report and when printed, only the sections that had test results were printed. This meant there was really only one report format no matter what combination of tests we performed, but here we are 20 years later with far more advanced computers and software and we’re stuck having to manually maintain and select a bunch of different report formats. Not exactly what I’d call progress.

    References:

    Culver BH, Graham BL, Coates AL, et al. Recommendations for a standardized pulmonary function report. Am J Respir Crit Care Med 2017; 196(11): 1463-1472.

    Graham BL, Brusasco V, Burgos F, et al. 2017 ERS/ATS standard for single-breath carbon monoxide uptake in the lung. Eur Respir J 2017; 49: 1600016.

     

    Creative Commons License
    PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

  • Making Assumptions about TGV and FRC

    When lung volumes are measured in a plethysmograph the actual measurement is called the Thoracic Gas Volume (TGV). This is the volume of air in the lung at the time the shutter closes and the subject performs a panting maneuver. Ideally, the TGV measurement should be made at end-exhalation and should be approximately equal to the Functional Residual Capacity (FRC). For any number of reasons in both manual and automated systems this doesn’t happen and the point at which the TGV is measured is either above or below the FRC.

    Testing software usually corrects for the difference in TGV and FRC by determining the end-exhalation baseline that is present during the tidal breathing at the beginning of the test. Using this value the software can determine where the TGV was measured relative to the tidal breathing FRC and then either subtracts or adds a correction factor to derive the actual FRC volume.
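    The correction described above can be sketched as follows. This is only a minimal illustration of the idea, not how any particular manufacturer's software does it. It assumes that the shutter-closure volume and the tidal end-exhalation baseline are read from the same volume tracing, with larger values meaning more air in the lungs. The function name and numbers are hypothetical.

```python
def frc_from_tgv(tgv_l, shutter_volume_l, baseline_l):
    """Correct a TGV measurement to the tidal end-exhalation level (FRC).

    shutter_volume_l: volume on the tracing when the shutter closed
    baseline_l: end-exhalation level of the tidal breathing on the same tracing
    """
    offset_l = shutter_volume_l - baseline_l  # positive: shutter closed above FRC
    return tgv_l - offset_l


# Shutter closed 0.25 L above the end-exhalation baseline: FRC is below TGV.
print(round(frc_from_tgv(3.40, 0.25, 0.00), 2))

# Shutter closed 0.10 L below the baseline: FRC is above TGV.
print(round(frc_from_tgv(3.40, -0.10, 0.00), 2))
```

    Whatever the details, the result depends entirely on which end-exhalation baseline is passed in.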

    One problem with this is that leaks in either the subject or the mouthpiece and valve manifold can occur during the panting maneuver and the end-exhalation baseline can shift and this will affect the calculation of RV and TLC. I’ve discussed this previously and as a reminder, RV is calculated from:

    RV = [average FRC] – [average ERV]

    where the FRC is determined from the corrected TGV and ERV is determined from SVC maneuvers. TLC is then calculated from:

    TLC = RV + [largest SVC]

    When the post-shutter FRC baseline shifts upwards (higher lung volumes relative to the pre-shutter FRC), ERV is underestimated, which in turn causes both RV and TLC to be overestimated.

    When the post-shutter FRC baseline shifts downwards (lower lung volumes relative to the pre-shutter FRC), ERV is overestimated, which in turn causes both RV and TLC to be underestimated.
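    A small numerical sketch of these two cases, using the RV and TLC formulas given earlier. It assumes that ERV is measured against the pre-shutter baseline, so that an upward shift of the tracing reduces the measured ERV by the size of the shift. All values are hypothetical.

```python
def lung_volumes(frc_l, erv_l, svc_l):
    """RV = FRC - ERV and TLC = RV + SVC, all in liters."""
    rv_l = frc_l - erv_l
    tlc_l = rv_l + svc_l
    return rv_l, tlc_l


def measured_erv(true_erv_l, baseline_shift_l):
    """ERV as measured when the post-shutter baseline has shifted.

    Assumption: a positive (upward) shift reduces the measured ERV by the
    same amount, and a negative (downward) shift increases it.
    """
    return true_erv_l - baseline_shift_l


frc_l, true_erv_l, svc_l = 3.00, 1.00, 4.00  # hypothetical subject

no_shift = lung_volumes(frc_l, measured_erv(true_erv_l, 0.00), svc_l)
upward = lung_volumes(frc_l, measured_erv(true_erv_l, 0.30), svc_l)
downward = lung_volumes(frc_l, measured_erv(true_erv_l, -0.30), svc_l)

for label, (rv_l, tlc_l) in (("none", no_shift), ("up", upward), ("down", downward)):
    print(label, round(rv_l, 2), round(tlc_l, 2))
```

    With these numbers a 0.30 L shift moves both RV and TLC by 0.30 L in the direction described above.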

    I’ve been aware of this problem for quite a while and use this as a guideline when selecting the FRCs and SVCs from specific plethysmograph tests. All of these assumptions are based on the fact that FRC is derived from the pre-shutter end-exhalation tidal breathing. Well, you know what they say about assuming…

    Our lab software was updated a while back, ostensibly with only the changes we needed to perform electronic signing by the pulmonary physicians. I’ve reviewed the information we got concerning the update and there is nothing in it that indicates that any changes were made in any of the testing modules. Recently I was reviewing the plethysmographic lung volumes for a patient and suddenly realized that the FRC baseline was being determined by the post-shutter end-exhalation level.

    In this particular instance this means that even though the shutter closed near the end-exhalation level of the pre-shutter tidal breathing, the reported FRC was being corrected to a much lower lung volume. This was causing both RV and TLC to be underestimated. Interestingly, for this patient the TGV measurements were well within +/- 5% (the 2005 ATS/ERS standard for plethysmographic FRC measurements), but the FRCs were not and this was all due to post-shutter baseline shifts.

    Fortunately, our software lets me mix and match SVC and TGV results, and to manually correct the FRC. This allowed me to take the highest quality TGV measurements and correct the FRC and then to match them with the highest quality SVC measurements (also corrected for FRC). So for this patient I was able to report what I believe to be reasonably accurate lung volume measurements.

    But just to keep me confused, a couple of reports later another patient had also had their lung volumes measured plethysmographically, and also had a post-shutter baseline shift in FRC, but this time the software was using the pre-shutter end-exhalation level to determine FRC (and yes, there was a negative ERV).

    Looking carefully at all the plethysmographic lung volumes I’ve reviewed since then I’ve seen that FRC is being determined occasionally by the post-shutter end-exhalation level and more often by the pre-shutter end-exhalation level. I haven’t been able to determine why one or the other is being used so I have no idea what criteria the software is using.

    Other than mentioning that the SVC maneuver should be linked to the FRC measurements, there are no guidelines in the ATS/ERS 2005 lung volume standards as to how this linking should be done. Although it would seem to be more logical for the pre-shutter end-exhalation level to be used to determine FRC (since it appears to be more closely related to the actual TGV measurement), this is left to individual manufacturers.

    Note: I will also mention that there are no guidelines for the number of tidal breaths and the repeatability of the end-exhalation volumes that are used to determine FRC and this is also left to the individual manufacturers.

    In a sense, the real problem is that leaks are occurring while the shutter is closed. I suspect that it’s the patients that are leaking rather than the valve manifold because leaks don’t occur with every patient or with every test (although that doesn’t rule out an intermittent valve problem). It’s hard to know how to address this since we routinely instruct the patients to keep their lips tight and we already use the largest mouthpiece the patient is comfortable using.

    Baseline shifts are more often downwards than they are upwards. This implies that leaks are more likely during the compression (exhalation) part of the panting maneuver but I suspect that this is because patients probably don’t generate as much negative pressure as they do positive pressure during panting.

    Regardless of where or when the leak is occurring it’s also unclear to me what effect a leak has on the actual TGV measurement. A leak will cause lung volume to change during the panting period and may also dampen the amplitude of the mouth pressure signal but I suspect that these effects are small. Our plethysmographs accumulate between 1 and 4 pants for a TGV measurement but any difference between individual pants (other than the normal pant-to-pant variations) in patients with baseline shifts is not readily apparent.

    We maintain our plethysmographs well and try to instruct our patients properly but leaks and baseline shifts during plethysmographic lung volume measurements still happen. For these reasons I suspect that this is not an uncommon problem regardless of which manufacturer’s equipment is being used. How evident this problem is, however, depends on the way each manufacturer’s software displays the TGV results graphically, and I’ve seen examples from some systems where a baseline shift would be difficult to detect.

    As importantly, our ability to detect a baseline shift also depends on the procedure we follow. Specifically, for my lab a TGV test consists of:

    • tidal breathing
    • shutter closes
    • panting
    • shutter opens
    • tidal breathing
    • SVC maneuver

    I’ve seen results from labs where the subject goes directly into an SVC maneuver as soon as the shutter opens and without a second period of tidal breathing a baseline shift would probably be hard to detect.

    Strictly speaking, the periods of tidal breathing pre-shutter and post-shutter serve different purposes. Pre-shutter, the end-exhalation level of tidal breathing is used to determine where FRC is in relation to the TGV. Post-shutter, the end-exhalation level is used to determine IC and ERV from the SVC maneuver. I would like to see this addressed the next time the ATS/ERS updates the standards for lung volume measurements, with pre-shutter and post-shutter tidal breathing periods mandated and FRC determined independently for both.
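    Treating the two tidal breathing periods separately also makes a baseline shift easy to flag. A minimal sketch, assuming the end-exhalation volumes of the tidal breaths before and after the shutter are available from the same volume tracing. The 0.10 L tolerance and the function names are hypothetical and not taken from any standard.

```python
from statistics import mean


def baseline_shift_l(pre_end_exhalation_l, post_end_exhalation_l):
    """Post-shutter minus pre-shutter mean end-exhalation level (liters)."""
    return mean(post_end_exhalation_l) - mean(pre_end_exhalation_l)


def has_baseline_shift(pre_end_exhalation_l, post_end_exhalation_l, tolerance_l=0.10):
    """Flag a shift larger than an (assumed) tolerance, in either direction."""
    shift_l = baseline_shift_l(pre_end_exhalation_l, post_end_exhalation_l)
    return abs(shift_l) > tolerance_l, shift_l


# Hypothetical end-exhalation levels for three breaths before and after the shutter.
pre = [0.02, 0.00, -0.01]
post = [-0.24, -0.26, -0.25]
flagged, shift_l = has_baseline_shift(pre, post)
print(flagged, round(shift_l, 2))
```

    A negative shift here corresponds to the downward shifts described earlier. This check is only possible when the procedure includes a second period of tidal breathing.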

    Our software (and I suspect that from many other manufacturers as well) assumes that the pre-shutter tidal end-exhalation level is the same as post-shutter (or vice-versa). It’s clear that this isn’t always true and that when it isn’t it can affect the reported RV, FRC and TLC. So you know what they say about assuming…

    References:

    Brusasco V, Crapo R, Viegi G. ATS/ERS task force: Standardisation of lung function testing. Standardisation of the measurement of lung volumes. Eur Respir J 2005; 26: 511-522.


  • The effect of errors in Inspiratory Volume on DLCO.

    Yesterday while reviewing reports I ran across an interesting error in the Inspiratory Volume (VI) from a DLCO test. I’ve probably seen this before but this time I realized what effect it could have on DLCO. Specifically, what I saw was that at the start of the DLCO test the subject had not finished exhaling and although the technician had started the test, the subject continued to exhale.

    What makes this interesting is that the software used the subject’s volume at the start of the test as the initial volume. This means that the software measured the VI from the initial volume to the end of inspiration, not from the point at which the subject stopped exhaling to the end of inspiration. This also means that the VI was underestimated by 0.20 L and this affects both VA and the calculated DLCO.

    VA is calculated by:

    $$\mathrm{VA} = (\mathrm{VI} - \mathrm{Vd}) \times \frac{\mathrm{FITrace}}{\mathrm{FATrace}}$$

    Where:

    VA = alveolar volume (ml)

    VI = inspired volume (ml)

    Vd = machine + anatomical deadspace

    Plugging in the original test values this is:

    When the VI is adjusted for missing volume, it becomes:

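    which is a difference of 0.45 L and a 12% increase. As a reminder, the DLCO calculation formula is:

    $$\mathrm{DLCO} = \frac{\mathrm{VA}}{\frac{\mathrm{BHT}}{60} \times (\mathrm{Pb} - \mathrm{PH_2O})} \times \ln\left[\frac{\mathrm{FATrace} \times \mathrm{FICO}}{\mathrm{FITrace} \times \mathrm{FACO}}\right]$$

    Given that BHT, Pb, PH2O, FATrace, FITrace, FICO and FACO are unchanged, the increase in VA would in turn cause an increase in calculated DLCO of 12% as well.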

    For this particular subject, this error did not make a significant change in their reported DLCO since it increased from 50% of predicted to only 56% of predicted. For a subject that was at 75% of predicted however, the adjusted DLCO would have been 84% of predicted. By my lab’s criteria this would be a change from being mildly reduced to being within normal limits and that would have been a significant difference.
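    The numbers quoted above hang together, and a short sketch shows how. It assumes the usual single-breath relationship in which VA is (VI minus deadspace) multiplied by the ratio of inspired to alveolar tracer gas, so that a change in VI changes VA by that same ratio. The dilution ratio and VA values are back-calculated from the rounded figures in the text (0.20 L, 0.45 L and 12%), so they are approximate and are not the actual test values.

```python
delta_vi_l = 0.20           # volume missing from the measured VI
delta_va_l = 0.45           # resulting change in VA
fractional_increase = 0.12  # quoted increase in VA (rounded)

# With the tracer gas concentrations unchanged, delta VA = delta VI x ratio.
dilution_ratio = delta_va_l / delta_vi_l

# Approximate VA before and after the correction.
original_va_l = delta_va_l / fractional_increase
adjusted_va_l = original_va_l + delta_va_l


def adjusted_percent_predicted(percent_predicted, increase):
    """Scale a percent-predicted DLCO by the same fraction as VA."""
    return percent_predicted * (1.0 + increase)


print(round(dilution_ratio, 2), round(original_va_l, 2), round(adjusted_va_l, 2))
print(round(adjusted_percent_predicted(50, fractional_increase)))
print(round(adjusted_percent_predicted(75, fractional_increase)))
```

    The last two lines reproduce the 50% to 56% and 75% to 84% of predicted changes mentioned above.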

    There is an opposite error in inspiratory volume that unfortunately we see more frequently than we’d like. Specifically, the expiratory valve in our test system’s manifolds often closes too slowly or leaks slightly and when this happens subjects often leak at the beginning of the breath-holding period.

    The whole point of an adequate VI is to ensure the subject’s lungs are near TLC and at their maximum surface area during the breath-holding period. DLCO normally tends to decrease when the maximum inspiration is below 90% of TLC so an inadequate VI can cause DLCO to be underestimated. Interestingly however, even though most of the breath-holding period is at a lower lung volume which would normally result in a lower DLCO it will instead be higher than would be expected.

    VA is a critical part of the DLCO calculation and as already noted is calculated from the VI and the ratio of the inhaled and exhaled tracer gas concentrations. To some extent the exhaled tracer gas concentration requires a certain amount of breath-holding time in order for it to be distributed homogeneously. The actual time needed for this to occur differs from subject to subject but individuals with normal lungs or restrictive disorders like Pulmonary Fibrosis will probably achieve a homogeneous concentration of tracer gas relatively quickly. For these subjects the reported VA will likely be whatever the lung volume was at the end of the initial inspiration.

    For individuals with airway obstruction however, even the standard 10 seconds of breath-holding time is not sufficient for equilibration and when this occurs, the exhaled tracer gas concentration will probably be more in line with the lung volume that was present during most of the breath-holding period. For this reason, VA will still be overestimated due to the falsely elevated VI but not so much due to the dilution of the tracer gas.

    Either way, VA will be overestimated but the subject actually spent most of their time at a lower lung volume. When breath-holding occurs at volumes below TLC, KCO (the rate at which CO disappears from the lung), tends to increase. This increase in KCO generally correlates with decreases in lung volume but not necessarily predictably or even linearly. Even so, given the combination of falsely elevated VA and a possible increase in KCO, DLCO will probably be overestimated in these circumstances.

    So errors in test performance can cause the Inspiratory Volume to be under- or overestimated. Because of the relationship between VI and VA, and between VA and DLCO, an underestimated VI will cause DLCO to be underestimated and an overestimated VI will cause DLCO to be overestimated. This problem is exacerbated to some extent by the inability of our software to recognize any problems with the VI measurement in the first place and by the fact that our software doesn’t let us manually correct any of these errors.

    Whether the Inspiratory Volume was underestimated or overestimated, it wouldn’t have been evident if I hadn’t taken a close look at the volume graph from the DLCO test. I review the raw data for all DLCO tests performed in my lab so I routinely see these kinds of problems. Many reviewers, however, only see the final report and either don’t or can’t see the raw test data. I’ve seen reports from dozens of labs around the United States and only a small number had a DLCO graph on the report and even then, it was almost always very small and any problems with the VI probably wouldn’t have been noticeable. The 2017 ERS/ATS DLCO standard mandates the presence of a DLCO graph on DLCO reports and if you’re going to implement this it makes sense to make sure it’s large enough for the details to be evident.

    As importantly, although we report the VI along with the DLCO results, not all labs do. I think this is a mistake because it helps a reviewer assess DLCO test quality. More than that though, the ATS/ERS spirometry interpretation guidelines say that the FEV1/VC ratio is supposed to use the largest VC from any test. Occasionally I find that the VI (which is an IVC) is larger than the FVC or SVC and knowing this can make a difference in the interpretation of spirometry results.
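    A minimal sketch of why the VI matters here: the FEV1/VC ratio is calculated from the largest available vital capacity, whichever test it came from. The function name and values are hypothetical.

```python
def fev1_vc_ratio(fev1_l, fvc_l, svc_l=None, ivc_l=None):
    """FEV1 divided by the largest vital capacity from any available test."""
    candidates = [vc for vc in (fvc_l, svc_l, ivc_l) if vc is not None]
    largest_vc_l = max(candidates)
    return fev1_l / largest_vc_l, largest_vc_l


# Hypothetical subject whose DLCO inspired volume exceeds both the FVC and SVC.
ratio_fvc_only, _ = fev1_vc_ratio(2.10, 2.90)
ratio_all, largest_vc_l = fev1_vc_ratio(2.10, 2.90, svc_l=3.00, ivc_l=3.20)
print(round(ratio_fvc_only, 2), round(ratio_all, 2), largest_vc_l)
```

    In this made-up case the ratio drops from about 0.72 to about 0.66 once the VI is taken into account, which could be enough to change an interpretation.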


  • DLCO, de-constructed

    My wife watches the Food Network a lot and I occasionally watch it with her but I can only take so much of it before I go off and read or work on one of my projects. I’ve noticed however in the various cooking contests that sometimes a chef will deconstruct a familiar recipe. This more or less means they break the recipe down into its components and present them as separate pieces or perhaps by putting what goes inside on the outside instead.

    I’ve discussed the DLCO test with numerous people and have found that many know and understand (or at least remember) the ATS/ERS criteria for test quality. At the same time, however, there seem to be very few people who understand the formula used to calculate the single-breath DLCO and I suspect this is probably because most of us didn’t like the mathematics classes we had to attend in high school or college (and tried to forget what we learned as quickly as we could afterwards).

    The DLCO formula isn’t that complicated, however, and more importantly all the components of the DLCO test and the reasons for the ATS/ERS quality criteria are embedded within it. All this seems to be a good reason to de-construct the DLCO “recipe” and try to explain its various pieces.

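    As a reminder, the single-breath DLCO formula is:

    $$\mathrm{DLCO} = \frac{\mathrm{VA}}{\frac{\mathrm{BHT}}{60} \times (\mathrm{Pb} - \mathrm{PH_2O})} \times \ln\left[\frac{\mathrm{FATrace} \times \mathrm{FICO}}{\mathrm{FITrace} \times \mathrm{FACO}}\right]$$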

    Where:

    VA = alveolar volume in ml

    BHT = breath holding time in seconds

    Pb = barometric pressure

    PH2O = partial pressure of water vapor in the lung

    FITrace = fractional concentration of tracer gas in the inspired DLCO mixture

    FATrace = fractional concentration of tracer gas in the alveolar sample

    FICO = fractional concentration of CO in the inspired DLCO mixture

    FACO = fractional concentration of CO in the alveolar sample

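    I think the part that bothers everybody the most is:

    $$\ln\left[\frac{\mathrm{FATrace} \times \mathrm{FICO}}{\mathrm{FITrace} \times \mathrm{FACO}}\right]$$

    and that’s because there are two different things going on here. First, the part within the brackets:

    $$\frac{\mathrm{FATrace} \times \mathrm{FICO}}{\mathrm{FITrace} \times \mathrm{FACO}}$$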

    is intended to correct the initial CO concentration for the dilution that occurs when the DLCO test gas mixture is inhaled and mixes with the gas that was within the lung at the start of the inhalation. The whole point of the DLCO test is to measure CO uptake but the initial concentration for this measurement is not what’s in the tank, it’s what’s in the lungs after it has been diluted by the lung’s residual volume and deadspace gas.

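    This might be easier to understand if it was re-stated as:

    $$\frac{\mathrm{FATrace}}{\mathrm{FITrace}} \times \frac{\mathrm{FICO}}{\mathrm{FACO}}$$

    where the relationship between the exhaled and inhaled concentrations of the tracer gas:

    $$\frac{\mathrm{FATrace}}{\mathrm{FITrace}}$$

    is what says how much the DLCO gas mixture was diluted, and

    $$\frac{\mathrm{FICO}}{\mathrm{FACO}}$$

    is the part that says how much CO was taken up.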

    FITrace and FICO are more or less constants. For this reason it’s easy to see that the lower FATrace is, the more the inhaled gas was diluted. FACO, however, is affected both by dilution and CO uptake and should therefore always be proportionally lower than FATrace.
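    A small sketch of the dilution correction, using the symbols defined above. The gas concentrations are hypothetical round numbers chosen for illustration (they are not from any actual test), and the function name is mine.

```python
def initial_alveolar_co(fico, fitrace, fatrace):
    """CO fraction in the lung at the start of breath-holding, after dilution.

    The inspired CO is assumed to be diluted by the same factor as the
    tracer gas, i.e. FATrace / FITrace.
    """
    return fico * (fatrace / fitrace)


# Hypothetical values: 0.3% CO and 10% tracer inspired, 7% tracer and 0.12% CO exhaled.
fico, fitrace, fatrace, faco = 0.003, 0.10, 0.07, 0.0012

dilution = fatrace / fitrace                 # fraction of tracer remaining after dilution
start_co = initial_alveolar_co(fico, fitrace, fatrace)
co_remaining = faco / fico                   # reflects dilution and uptake together

print(round(dilution, 2), round(start_co, 4), round(co_remaining, 2))
```

    With these numbers dilution alone would leave 70% of the inspired CO concentration, but only 40% remains in the alveolar sample. The difference between the two is the CO that was taken up.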

    So, what’s a tracer gas? In this particular instance it’s any gas that is highly insoluble (being inert helps too but it’s not a requirement). There’s nothing in the alveolar-capillary membrane that acts as a barrier to the movement of gases and in fact that’s what it’s optimized for. The amount of any gas that can be absorbed by blood and tissue, however, is based on how soluble it is. Gases like helium, argon, neon and methane are highly insoluble so when they are inhaled only a tiny amount is absorbed by blood and tissue. This means that when a gas mixture containing an insoluble gas is inhaled and exhaled, the ratio between the inhaled and exhaled concentrations indicates how much it was diluted. This is the basis for the Helium Dilution FRC test and for measuring VA.

    One of the interesting things about combining the tracer gas and CO in the same equation:

    $$\frac{\mathrm{FATrace}}{\mathrm{FITrace}} \times \frac{\mathrm{FICO}}{\mathrm{FACO}}$$

    is that the accuracy of the CO and tracer gas analyzers becomes unimportant as long as the analyzers are linear and their zero level is accurate. This is because it is the ratio of concentrations that matters, not what the concentrations actually are. For this reason, the actual concentration of CO and tracer gas that comes from a tank is not terribly important. The choice of 0.3% CO was fairly arbitrary and the accuracy of single-breath DLCO tests using CO concentrations from 3.0% to 0.03% is not significantly different from that of tests performed at 0.3%. What matters is patient safety and the ability of the gas analyzer to provide a signal that’s adequate to differentiate between adjacent concentrations across its entire range.
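    This is easy to check numerically. The sketch below assumes that each analyzer reads both its inspired and its alveolar concentration, so a gain error multiplies both readings by the same factor. The concentrations and error sizes are hypothetical.

```python
import math


def log_term(fico, faco, fitrace, fatrace):
    """Natural log of the dilution-corrected ratio of initial to final CO."""
    return math.log((fatrace / fitrace) * (fico / faco))


fico, faco, fitrace, fatrace = 0.003, 0.0012, 0.10, 0.07

exact = log_term(fico, faco, fitrace, fatrace)

# Linear analyzers with gain errors: CO reads 5% high, tracer reads 3% low.
gain_error = log_term(fico * 1.05, faco * 1.05, fitrace * 0.97, fatrace * 0.97)

# A zero (offset) error on the CO analyzer does not cancel out.
offset = 0.0002
zero_error = log_term(fico + offset, faco + offset, fitrace, fatrace)

print(round(exact, 4), round(gain_error, 4), round(zero_error, 4))
```

    The gain errors cancel because they appear in both the numerator and the denominator of each ratio, whereas the offset changes the result. This is why linearity and an accurate zero matter more than absolute accuracy.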

    The Ln (natural logarithm) part of the equation outside the brackets

    is there because the rate at which CO is taken up by the lung is not linear and in fact decreases over time. Once CO uptake starts (which is as soon as the DLCO gas mixture clears the deadspace of the airways and reaches the alveoli) the concentration of CO decreases. As the CO concentration in the lung decreases, the rate at which it crosses into the blood stream also decreases which looks something like this:

    This decline follows an exponential curve and when it is re-stated as a natural logarithm it becomes a straight line.
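
    As a rough illustration of why the logarithm helps, here is a minimal Python sketch. The starting CO fraction and the rate constant `k` are made-up numbers chosen only for illustration, not measured values. It shows that the drop in concentration shrinks with each time step while the drop in the natural log of the concentration stays constant (i.e. a straight line).

```python
import math

# Hypothetical values for illustration only (not measured data).
fa_co_start = 0.0021   # assumed alveolar CO fraction at the start of breath-holding
k = 0.06               # assumed fractional CO uptake per second
step = 2               # seconds between samples

times = list(range(0, 12, step))  # 0, 2, 4, 6, 8, 10 seconds
fractions = [fa_co_start * math.exp(-k * t) for t in times]
log_fractions = [math.log(f) for f in fractions]

# Change in concentration between successive samples (gets smaller over time).
drops = [a - b for a, b in zip(fractions, fractions[1:])]

# Change in ln(concentration) between successive samples (stays the same).
log_steps = [a - b for a, b in zip(log_fractions, log_fractions[1:])]

for t, f, lf in zip(times, fractions, log_fractions):
    print(f"t={t:2d}s  FACO={f:.6f}  ln(FACO)={lf:.4f}")
```

    Each value in `log_steps` works out to `k * step`, which is the constant slope of the straight line.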

    So basically, this section of the equation:

    determines the rate at which CO declines in the lung during breath-holding, with an adjustment for the dilution that occurs when the gas mixture is inhaled.

    On the left upper side of the equation, VA is the lung volume “seen” by the DLCO gas mixture. The 2017 ERS/ATS DLCO standard recommends that VA be calculated using a mass balance equation over the entire exhalation, but since this has yet to be implemented on any test systems I will ignore it for the moment. VA is instead calculated separately as:

    Where:

    VA = alveolar volume (ml)

    VI = inspired volume (ml)

    Vd = machine + anatomical deadspace.
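
    A minimal sketch of this calculation in Python, assuming the commonly cited single-breath form in which the inspired volume (less deadspace) is scaled by the tracer gas dilution ratio FITrace / FATrace. The function name and the example numbers are hypothetical, and the BTPS/STPD conversions a real test system applies are left out.

```python
def alveolar_volume(vi_ml, vd_ml, fi_trace, fa_trace):
    """Estimate VA (ml) from tracer gas dilution.

    Assumed form: VA = (VI - Vd) * (FITrace / FATrace).
    Gas condition corrections (BTPS/STPD) are deliberately omitted.
    """
    if fa_trace <= 0:
        raise ValueError("FATrace must be greater than zero")
    return (vi_ml - vd_ml) * (fi_trace / fa_trace)


# Hypothetical example: 4000 ml inspired, 200 ml total deadspace,
# tracer gas diluted from 10% inhaled to 7% exhaled.
va = alveolar_volume(vi_ml=4000.0, vd_ml=200.0, fi_trace=0.10, fa_trace=0.07)
print(f"VA = {va:.0f} ml")
```

    Note that the lower FATrace is for the same VI, the larger the calculated VA becomes.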

    The purpose of VA is to convert the rate at which CO disappears given by the last section of the formula into a flow rate. Remember that DLCO is expressed (at least in part) as a flow rate, i.e. ml/min. The rate at which CO disappears in any given lung during breath-holding (given by the right side of the equation) can be the same for large, small or even diseased lungs so by itself it is not terribly informative. When it is expressed as a flow rate however, it can be related to normal values for a given gender, height and age.

    On the lower left side of the equation, BHT, Breath-Holding Time, which is in seconds, is divided by 60:

    in order to convert BHT to fractional minutes. This wouldn’t be necessary if DLCO was expressed as ml/sec but it’s not, so the conversion from seconds to fractional minutes is needed.

    BHT however, is always going to be something of an approximation. This is because the DLCO formula basically assumes that inspiration and expiration are instantaneous. Because they aren’t, there are several ways in which BHT is routinely measured.

    The current 2017 and the 2005 ERS/ATS DLCO standards recommend using the Jones-Meade approach to measuring BHT, mostly because J-M makes some allowance for the prolonged expiratory times (and alveolar sample collection times) for individuals with COPD.

    Finally,

    obviously corrects for the partial pressure of water vapor in the lung but less obviously, Pb essentially converts the fractional concentrations of CO and the tracer gas into their corresponding partial pressures. Remember that the rate at which CO disappears in the lung:

    is based on fractional concentrations, not partial pressures. DLCO is expressed as ml per minute per mmHg and the pressure component comes from the barometric pressure.
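
    Putting the pieces together, the whole calculation can be sketched in a few lines of Python. This is a sketch under stated assumptions rather than any test system's implementation: it assumes the commonly cited single-breath form DLCO = VA / ((BHT / 60) x (Pb - PH2O)) x Ln((FICO x FATrace) / (FACO x FITrace)), with VA already in ml STPD and PH2O taken as 47 mmHg. The function name and example values are hypothetical.

```python
import math


def dlco_single_breath(va_ml_stpd, bht_sec, pb_mmhg,
                       fi_co, fa_co, fi_trace, fa_trace,
                       ph2o_mmhg=47.0):
    """Sketch of the single-breath DLCO calculation (ml/min/mmHg).

    Assumes VA is already expressed in ml STPD and that water vapor
    pressure in the lung is 47 mmHg.
    """
    # How much the inhaled gas mixture was diluted.
    dilution = fi_trace / fa_trace
    # CO fraction in the alveoli at the start of breath-holding,
    # i.e. after dilution but before any uptake.
    fa_co_initial = fi_co / dilution
    # Rate of CO disappearance, linearized with the natural log.
    ln_term = math.log(fa_co_initial / fa_co)
    # Seconds to fractional minutes.
    bht_min = bht_sec / 60.0
    return va_ml_stpd / (bht_min * (pb_mmhg - ph2o_mmhg)) * ln_term


# Hypothetical example values.
result = dlco_single_breath(va_ml_stpd=5000.0, bht_sec=10.0, pb_mmhg=760.0,
                            fi_co=0.003, fa_co=0.001,
                            fi_trace=0.10, fa_trace=0.07)
print(f"DLCO = {result:.1f} ml/min/mmHg")

# Scaling both CO readings by the same analyzer gain leaves DLCO unchanged,
# because only the ratio of concentrations enters the calculation.
scaled = dlco_single_breath(5000.0, 10.0, 760.0,
                            0.003 * 1.1, 0.001 * 1.1, 0.10, 0.07)
```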

    When looking at the raw data for a DLCO test it may be helpful to remember that FICO, FITrace, Pb and PH2O are essentially constants. In addition, even though BHT is not a constant, it’s usually within a very narrow range and given that it’s divided by 60 it doesn’t have a large effect on test-to-test changes. This basically means that everything on the lower left side of the DLCO equation changes very little from test-to-test.

    VA, on the other hand is a function of VI and FATrace and these can and do change a fair amount between tests. As long as the VI ends more or less at TLC however, these differences tend to cancel out and VA should be fairly constant between tests. A large difference in VA from one test to another indicates a large difference in test quality. In general the DLCO test with a larger VA should probably be considered more accurate than one with a smaller VA, however VA can be falsely skewed both upwards and downwards when VI is suboptimal (due to inhomogeneous distribution of the inhaled gas mixture).

    It’s hardest to characterize test-to-test changes in FACO since it is a function of both FATrace and the actual CO transfer rate in the lung during the breath-holding period. Although significant changes in FACO can point to test quality issues, determining the actual cause is more difficult.

    When a chef deconstructs a recipe it’s successful sometimes and sometimes it isn’t but this is often due to how well the individual components were treated. Deconstructing the single-breath DLCO calculation not only helps to highlight its components but also shows that understanding it isn’t as formidable as it first appears. We may not feel comfortable with the math we routinely use (whether we realize it or not since pretty much everything is done automatically by our computer systems) but deconstructing equations is a good way to understand what’s important in the tests we perform.

    Creative Commons License
    PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

  • COHb and Pulse Oximetry

    I was reviewing a report recently that included the results for walking oximetry. These showed that the individual had a resting SaO2 of 97% and desaturated significantly to 86% after walking a couple hundred yards. This was curious since a DLCO had also been performed and the results for that test were 94% of predicted. It’s unusual for somebody with a normal DLCO to have that low of an SaO2 but I have seen it before in individuals who were unable to ventilate adequately because of a paralyzed diaphragm. I’ve also seen it happen sometimes when somebody has a peripheral vascular disease like Raynaud’s that produces a poor quality oximeter signal. Buried in the technician’s notes however, was an additional piece of information that called into question both the resting and the exercise SaO2 readings. Specifically, the notes mentioned that an ABG had been performed and that the subject’s COHb was 9%.

    Oxygen saturation is measured spectrophotometrically. The different forms of hemoglobin, i.e. oxyhemoglobin (O2Hb), deoxyhemoglobin, methemoglobin (MetHb) and carboxyhemoglobin (COHb) absorb the frequencies of red and infrared light differently.

    from Hampson NH. Pulse oximetry in severe carbon monoxide poisoning. Chest 1998; 114: 1036-1041

    Although non-invasive oximetry was first developed during the 1930’s and 1940’s (in 1935 by K. Matthes in Germany and independently in 1942 by G. Millikan in the USA), current pulse oximeter technology dates from 1972 (by Takuo Aoyagi, a researcher for Nihon Kohden in Japan). The original pulse oximeters were large, bulky and generally stationary pieces of equipment. Oximeters underwent progressive miniaturization during the 1980’s and 1990’s and rapidly evolved into the handheld and fingertip units we see today. The only “stationary” oximeters that remain are those used in ICU-type monitoring systems.

    Modern laboratory CO-oximeters measure the absorption of light in a blood sample at up to 128 wavelengths, spread across the entire hemoglobin absorption spectrum. Using mathematical analysis they can report total hemoglobin concentration and oxygen saturation in addition to fractional deoxyhemoglobin, COHb, and MetHb.

    A pulse oximeter however, transmits only two wavelengths of light through tissue, typically 660 and 940 nm. Body tissue contains both arterial and venous blood and a pulse oximeter differentiates between these by using the cyclical rise and fall of overall light absorption that occurs because of the heart beat. The majority of light absorbed during pulse oximetry is due to skin, connective tissue, bone and venous blood and this does not change during the cardiac cycle. Arterioles however, contain more blood during systole than they do during diastole and by comparing the peak to the trough the light absorption by non-arterial sources becomes irrelevant.

    Although the mathematics are more complex than this, it is primarily the ratio of light absorbed at 660 nm (red) to that at 940 nm that in turn is used to perform a look-up from a table of empirically derived values in order to calculate SaO2.
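
    To make the idea concrete, here is a small Python sketch of the “ratio of ratios” approach. The pulsatile (peak minus trough) and non-pulsatile signal values are hypothetical, and the linear formula `110 - 25 * R` is only a commonly quoted textbook approximation standing in for a manufacturer’s empirically derived look-up table. It is not any real device’s calibration.

```python
def ratio_of_ratios(ac_red, dc_red, ac_ir, dc_ir):
    """Normalized pulsatile absorbance at 660 nm divided by that at 940 nm.

    ac_* = pulsatile component (peak minus trough of the signal)
    dc_* = non-pulsatile component (tissue, bone, venous blood)
    Dividing AC by DC removes the contribution of non-arterial absorbers.
    """
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def approximate_saturation(r):
    """Textbook linear approximation (an assumption, not a device table)."""
    estimate = 110.0 - 25.0 * r
    return max(0.0, min(100.0, estimate))


# Hypothetical signal values for illustration only.
r = ratio_of_ratios(ac_red=0.01, dc_red=1.0, ac_ir=0.02, dc_ir=1.0)
print(f"R = {r:.2f}, estimated saturation = {approximate_saturation(r):.1f}%")
```

    The point of the sketch is that the reading depends only on the ratio of the two pulsatile signals, which is why anything that changes absorbance at 660 nm the way O2Hb does will be counted as O2Hb.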

    It has long been known that pulse oximeters are insensitive to carbon monoxide. The reason for this is that COHb and O2Hb have similar absorbances at 660 nm and this results in pulse oximeters measuring COHb and O2Hb similarly. Although the absorbances of COHb and O2Hb somewhat differ at 940 nm this does not become apparent until COHb is approximately 40% or greater. Up until the higher levels of COHb are reached the SaO2 reported by a pulse oximeter is usually equal to SaO2 + COHb.

    A rule of thumb for oximeters (whether non-invasive or not) is that the number of forms of hemoglobin that can be measured is equal to the number of wavelengths that are measured. This means that oximeters that measure light at two wavelengths can only measure O2Hb and de-oxygenated Hb. Pulse oximeters that measure additional light wavelengths and are considered capable of measuring COHb (as well as hemoglobin and methemoglobin) were first developed and manufactured around 2005. Unfortunately, as useful as pulse CO-oximeters might be, their accuracy and reliability remain questionable. Verification of this type of pulse oximeter has shown equivocal results. For every research study that has shown good agreement with a laboratory-based CO-oximeter there is another that showed poor correlation instead. On a positive note however, one study showed that although there were significant differences in measured COHb between pulse oximetry and laboratory CO-oximetry, the initial finding of elevated COHb levels in the emergency room led to significantly quicker treatment.

    What the COHb of 9% told me was that the baseline SaO2 of 97% was inaccurate since the highest that the SaO2 could really have been was 91%. It’s also unclear how accurate the exercise SaO2 of 86% was. When I looked up this individual’s ABG results their resting SaO2 measured on a laboratory CO-oximeter was actually 81%, which means that all of the oximeter readings were incorrect.
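
    The arithmetic behind this is simple enough to sketch in Python. It assumes only the rule of thumb given above (the pulse oximeter reading is roughly O2Hb + COHb at COHb levels below about 40%); the function names are mine, and MetHb is assumed to be zero unless stated.

```python
def max_possible_saturation(cohb_percent, methb_percent=0.0):
    """Upper bound on fractional O2Hb once COHb (and MetHb) are accounted for."""
    return 100.0 - cohb_percent - methb_percent


def estimated_saturation(oximeter_percent, cohb_percent):
    """Rule-of-thumb estimate: the oximeter reads roughly O2Hb + COHb."""
    return oximeter_percent - cohb_percent


cohb = 9.0
print(max_possible_saturation(cohb))     # ceiling for the true resting value
print(estimated_saturation(97.0, cohb))  # rule-of-thumb resting estimate
print(estimated_saturation(86.0, cohb))  # rule-of-thumb exercise estimate
```

    For this individual the rule of thumb gives a resting estimate of 88%, which is still well above the 81% measured on the laboratory CO-oximeter, so it should be treated as a rough bound and not as a correction.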

    Although the oximeter readings were incorrect, does that mean the significant desaturation during exercise did not actually occur? There’s no way to be sure but given how a pulse oximeter works I’m going to say that it probably did occur.

    Pulse oximetry has been called the “fifth vital sign” (although pain has been called that as well) and seems to be responsible for a significant decrease in clinical use of ABGs. As an example my lab (which sees outpatients almost exclusively) is currently performing about half the number of ABG’s per year that we did in the 1990’s and the total number of patients seen in my lab is about double what it was then. In one sense this is a good thing, but a problem with it is that the other ABG values such as PaCO2, pH and COHb, which are often more clinically relevant to patients with COPD, aren’t being measured as often as they should be.

    Oximetry is frequently used during CPETs, 6-minute walk studies and walking oximetry. A number of studies have shown that pulse oximetry during exercise both over- and underestimates oxygen saturation when compared to arterial samples measured on a CO-oximeter, but not predictably so. In addition, oximetry during field testing (6MWT, walking oximetry) is subject to motion artifact and operator errors.

    A prime example of an operator error that we saw a couple of years ago was a young woman who had recently delivered a baby. Her obstetrician had made an “emergency” PFT appointment for her because despite having no symptoms she desaturated into the upper 70’s while doing a walking oximetry test in the physician’s office. At our lab she had normal PFTs and normal walking oximetry and while talking to her, I found that she had been carrying her baby while walking and that the oximeter probe had been placed on a finger of the hand she was using to carry her baby.

    Along the same lines, I’ve seen nursing assistants place the oximeter probe on a finger of the same arm they’re using to take blood pressures and then document the low SaO2 that occurred while the blood pressure cuff was pumped up (this happened to my wife during a routine clinic visit). I’d like to say that my lab’s technicians wouldn’t make a similar error but I’ve seen our staff place the oximeter probe on a finger of the hand that a mobility-impaired patient was using to hold their cane.

    But more importantly I think that we seem to have forgotten about oximetry’s inherent limitations. Numerous studies have shown that pulse oximetry correlates well with SaO2 measured by CO-oximetry, but correlation doesn’t necessarily mean accuracy. The accuracy of pulse oximeters has been shown to vary somewhat by manufacturer and ranges from +/- 3% to +/- 5% (which is somewhat in contrast to general manufacturer claims of +/-2% accuracy). In addition, pulse oximetry accuracy is highest at SaO2s between 82% and 94%, and much less so outside of that range. Interestingly, despite anecdotal evidence to the contrary, a couple of studies have shown that skin pigmentation and nail polish do not affect overall oximeter accuracy.

    There are still a large number of smokers around, and pulmonary labs probably see a higher percentage of them than what’s in the general population. The average COHb level of smokers is 3% to 5% but as a general rule COHb rises by approximately 5% for every pack of cigarettes smoked per day so higher levels are not only possible but in many cases highly probable. Elevated COHb levels are important for numerous clinical (and technical) reasons but in most cases we don’t know what an individual’s COHb level is because (at least for my lab) it requires an arterial puncture. Pulse oximetry has shown itself to be incredibly useful but in order to use it properly we have to remember its limitations.

    References:

    Adler JN, Hughes LA, Vivilecchia R, Camargo CA. Effect of skin pigmentation of pulse oximetry accuracy in the emergency department. Acad Emerg Med 1998; 5: 965-970.

    Amalakanti S, Pentakota MR. Pulse oximetry overestimates oxygen saturation in COPD. Resp Care 2016; 61(4): 423-427.

    Barker SJ, Shah NK. The effects of motion on the performance of pulse oximeters in volunteers. Anesthesiology 1997; 86(1): 101-108.

    Barker SJ, Curry J, Redford D, Morgan S. Measurement of carboxyhemoglobin and methemoglobin by pulse oximetry. Anesthesiology 2006; 105: 892-897.

    Bozeman WP, Myers RAM, Barish RA. Confirmation of the pulse oximetry gap in carbon monoxide poisoning. Ann Emerg Med 1997; 30(5): 608-611.

    Chan ED, Chan MM, Chan MM. Pulse oximetry: Understanding its basic principles facilitates appreciation of its limitations. Resp Med 2013; 107: 789-799.

    Chiappini F, Fuso L, Pistelli R. Accuracy of a pulse oximeter in the measurement of oxyhaemoglobin saturation. Eur Respir J 1998; 11: 716-719.

    Coulange M, Barthelemy A, Hug F, Thierry AL, De Haro L. Reliability of new pulse CO-oximeter in victims of carbon monoxide poisoning. Undersea Hyperbaric Medicine 2008; 35(2): 107-111.

    Diaz-Gonzalez CDLM, Hormiga MDLR, Lopez JMR, Rivero YD, Morales MSM. Concordance among measurements obtained by three pulse oximeters currently used by health professionals. J Clin Diag Res 2014; 8(8): MC09-MC12.

    Hampson NH. Pulse oximetry in severe carbon monoxide poisoning. Chest 1998; 114: 1036-1041.

    Hampson NH, Ecker ED, Scott KL. Use of a noninvasive pulse CO-oximeter to measure blood carboxyhemoglobin levels in bingo players. Resp Care 2006; 51(7): 758-760.

    Hampson NH. Noninvasive pulse CO-oximetry expedites evaluation and management of patients with carbon monoxide poisoning. J Emerg Med 2012; 30: 2021-2024.

    Hampson NH, Piantodosi CA, Thom SR, Weaver LK. Concise Clinical Review. Practice recommendations in the diagnosis, management and prevention of carbon monoxide poisoning. Amer J Respir Crit Care Med 2012; 186(11): 1095-1101.

    Hannhart B, Michalski H, Delorme N, Chapparo G, Polu JM. Reliability of six pulse oximeters in chronic obstructive pulmonary disease. Chest 1991; 99: 842-846.

    Hansen JE, Casaburi R. Validity of ear oximetry in clinical exercise. Chest 1987; 91(3): 333-337.

    Hess DR. Pulse oximetry: Beyond SpO2. Resp Care 2016; 61(12): 1671-1680.

    Kohyama T, Moriyama K, Kanai R, Kotani R, Kotani M, Uzawa K, Satoh T, Yorozu T. Accuracy of pulse oximeters in detecting hypoxemia in patients with chronic thromboembolic pulmonary hypertension. PLOS One 2015; 10(5): e0126979.

    Luks AM, Swenson ER. Pulse oximetry at high altitude. High Altitude Medicine and Biology 2011; 12(2): 109-119.

    McGovern JP, Sasse SA, Stansbury DW, Causing LA, Light RW. Comparison of oxygen saturation by pulse oximetry and CO-oximetry during exercise testing in patients with COPD. Chest 1996; 109: 1151-1155.

    Nitzan M, Romem A, Koppel R. Pulse oximetry: fundamentals and technology update. Medical Devices: Evidence and Research 2014; 7: 231-239.

    Ortiz FO, Aldrich TK, Nagel RL, Benjamin LJ. Accuracy of pulse oximetry in sickle cell disease. Am J Respir Crit Care Med 1999; 159: 447-451.

    Piatkowski A, Ulrich D, Grieb G, Pallua N. A new tool for the early diagnosis of carbon monoxide intoxication. Inhalation Toxicology 2009; 21(13): 1144-1147.

    Schnapp LM, Cohen NH. Pulse oximetry. Uses and abuses. Chest 1990; 98(5): 1244-1250.

    Tremper KK. Pulse oximetry. Chest 1989; 95(4): 713-715.

    Touger M, Birnbaum A, Wang J, Chou K, Pearson D. Performance of the RAD-57 pulse CO-oximeter compared with standard laboratory carboxyhemoglobin measurement. Ann Emerg Med 2010; 56: 382-388.

    Weaver LK, Churchill SK, Deru K, Cooney D. False positive rate of carbon monoxide saturation by pulse oximetry of emergency department patients. Resp Care 2013; 58(2): 232-240.

    Yamamoto LG, Yamamato JA, Yamamoto JB, Yamamoto BE, Yamamoto PP. Nail polish does not significantly alter oximetry measurement in mildly hypoxic subjects. Resp Care 2008; 53(11): 1470-1474.

    Yamaya Y, Bogaard HJ, Wagner PD, Niizeki K, Hopkins SR. Validity of pulse oximetry during maximal exercise in normoxia, hypoxia and hyperoxia. J Appl Physiol 2002; 92: 162-168.


  • Why the FEV1/FVC ratio LLN as a percent of the predicted FEV1/FVC ratio is important

    My medical director and I had a discussion today about where the cutoff for a normal FEV1/FVC ratio would be for a 93 year old patient of his. Part of the problem is that there are almost no reference equations for patients this age and the best you can usually do is to extrapolate. Another part is that anybody in their 90’s is a survivor and must have had good lung function throughout their life to reach that age, which means that they aren’t average so it’s not clear how well extrapolation actually works in this population. The final part is that the guidelines for PFT interpretation that are used by my lab were put into place about 40 years ago and reflect the thoughts at that time. I updated part of the guidelines with the 2005 ATS/ERS interpretation algorithm about 10 years ago, but the thresholds for normalcy (as well as the reference equations we use) still haven’t changed all that much. I’ve brought this issue up a number of times over the years (usually every time I get a new medical director) but haven’t gotten a consensus from the pulmonary physicians on either the need for change or for what threshold values should be used.

    Anyway, both my medical director and I felt that the LLN for the FEV1/FVC ratio (when viewed as a percent of the predicted FEV1/FVC ratio) is probably lower for a 75 year old (and certainly for a 93 year old) than it is for a 25 year old, and that the current lab guidelines for interpretation were probably diagnosing airway obstruction in the elderly more often than they should. My lab currently uses the NHANESIII reference equations for spirometry however, and I wasn’t sure they showed this particularly well since the equations for the FEV1/FVC ratio and its LLN are quite simplistic compared to those for FVC and FEV1.

    The NHANESIII reference equations were published in 1999 and at that time they were derived from the largest population that had ever been studied (7428 subjects, 40.9% male, 59.1% female) and with the most sophisticated statistical analysis that had been used up until that time. In 2012 however, the Global Lung Function Initiative (GLI) released a set of reference equations using data obtained from 73 centers world-wide on 97,759 subjects (44.7% male, 55.3% female). Statistical analysis of the GLI data was performed using the Lambda, Mu, Sigma (LMS) approach and a set of equations was derived that covered ages 3 to 95.

    I have some reservations about how well the GLI equations match the population served by my lab but it’s a moot point whether I like them or not since even now, 5 years after the GLI equations were published, my lab’s software has not been updated to include them. The reason for this is that the GLI spirometry equations use what are called “splines” to generate the spirometry reference values and these are taken from a look-up table. My lab’s software does have an equation editor but it will not accommodate lookup tables so the GLI equations can’t be added. I’m sure our equipment manufacturer could get around this if they really wanted to, but so far it hasn’t happened.

    I do have a lot of respect for the GLI equations however, and think that the overall view they give of the normal distribution of FVC, FEV1 and the FEV1/FVC ratio is far more correct than those of any prior studies. Using a spreadsheet tool downloaded from the GLI that lets me generate the GLI spirometry predicted values and the NHANESIII reference equations I decided to take a closer look at their predicted FEV1/FVC ratios and their LLNs.

    The GLI FEV1/FVC ratios and their LLNs trace out curves that generally decline with age while the NHANESIII FEV1/FVC ratio and LLN trace out straight lines. A quick look at the NHANESIII FEV1/FVC ratio reference equations (this example is for Caucasian males):

    would give you the impression that the LLN is always going to be a factor of roughly 0.89 (i.e. 78.388 / 88.066) of the FEV1/FVC ratio and while that’s roughly true:

    when you look at the LLN as a percent of the predicted FEV1/FVC ratio (LLN PP) for the NHANESIII equations it actually declines slightly with age.
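
    The reason for the decline can be seen with a few lines of Python. The two intercepts (88.066 and 78.388) are the ones quoted above; the age coefficient of -0.2066 is my recollection of the published NHANESIII value for Caucasian males and should be treated as an assumption to be checked against Hankinson et al. Because the predicted value and the LLN share the same slope, the gap between them is a fixed number of percentage points, and a fixed gap is a growing fraction of a predicted value that shrinks with age.

```python
# NHANESIII FEV1/FVC% for Caucasian males, linear in age.
PRED_INTERCEPT = 88.066   # quoted in the text
LLN_INTERCEPT = 78.388    # quoted in the text
AGE_COEFF = -0.2066       # assumption: verify against the original publication


def predicted_ratio(age):
    """Predicted FEV1/FVC (%) for a given age in years."""
    return PRED_INTERCEPT + AGE_COEFF * age


def lln_ratio(age):
    """Lower limit of normal for FEV1/FVC (%) for a given age in years."""
    return LLN_INTERCEPT + AGE_COEFF * age


def lln_percent_predicted(age):
    """LLN expressed as a percent of the predicted FEV1/FVC ratio (LLN PP)."""
    return 100.0 * lln_ratio(age) / predicted_ratio(age)


for age in (25, 50, 75, 93):
    print(f"age {age}: predicted {predicted_ratio(age):.1f}%, "
          f"LLN {lln_ratio(age):.1f}%, LLN PP {lln_percent_predicted(age):.1f}%")
```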

    The behavior of the GLI LLN PP is more complex (and far more interesting). The LLN PP is negatively curved up to about age 50 (although any change is less than 1% over that range) and then starts decreasing with increasing age, and this decrease is reasonably linear.

    I think that the curves described by the GLI equations are probably far more correct than the straight lines described by the NHANESIII equations. This is partly because the GLI study population is significantly larger (and more diverse) than the NHANESIII study population but also because the GLI statistical analysis is significantly more sophisticated than that of NHANESIII.

    It should be remembered that the LLN is based on the variability of the underlying data so in a sense what the GLI curve is saying is that above age 50 the FEV1/FVC ratio becomes more variable. That’s okay and it’s probably a reflection of how variable the effects of aging are on us. Not everybody ages at the same rate and not all of the organs in our body age at the same rate either.

    Regardless of whether you look at the NHANESIII equations or the GLI equations what is abundantly clear is that using a fixed percent of predicted as the threshold for normalcy (which is what my lab uses) is not correct at any age since both NHANESIII and GLI show that the LLN PP changes with age. What’s also clear is that probably too many of my lab’s elderly patients are being diagnosed with airway obstruction they probably don’t have and this needs to change.

    The FEV1/FVC ratio and its LLN are the starting point of the ATS/ERS interpretation algorithm so defining the LLN correctly is critical. The GLI reference equations make it quite clear that the LLN is not linear but viewing the FEV1/FVC ratio LLN by itself does not make its complexity as apparent as when it is viewed as a percent of the predicted FEV1/FVC ratio. I still have reservations about whether the GLI predicted FVC and FEV1 are a good match to my lab’s patient population but I think the overall behavior of the FEV1/FVC ratio’s LLN as described by its percent predicted is not only reasonably accurate, but that it is more accurate than the LLN from the NHANESIII or any other spirometry reference equations. For this reason I have no particular problem combining the GLI FEV1/FVC ratio’s percent predicted with the NHANESIII reference equations (not that I have a real choice since my lab’s software does not support the GLI reference equations).

    I’m hoping that sharing the GLI FEV1/FVC ratio LLN percent predicted graphs with the department’s physicians will help to move us forward towards updating our interpretation guidelines. One probable sticking point is that many of the department’s physicians were taught relatively simplistic approaches towards interpreting spirometry using percent predicteds and rules like 80-60-40 (weren’t we all?). A variable FEV1/FVC ratio LLN percent predicted makes it clear that simple rules are no longer possible (but then anybody who’s been paying attention to the literature knows that simple rules were put to rest a while ago).

    An additional problem is that a change in our interpretation guidelines means that going forward there will be a significant discontinuity in our interpretations, more so than for just changing reference equations. This should never be a reason not to go forward however, since improving the accuracy of interpretations is more important than continuity.

    There’s no straightforward way to use the GLI FEV1/FVC ratio LLN percent predicted in our interpretations. I haven’t explored the differences in the LLN PP across different heights and ethnicities so it’s not clear how universal the graphs I’ve included here really are and that will have to be investigated at some point. At the moment the only workable idea I’ve come up with is to use either graphs or tables of the LLN PP as part of the interpretation process and to manually insert the LLN PP into the report notes. Not elegant and a bit of extra effort, but it should work.

    References:

    Hankinson JL, Odencrantz JR, Fedan KB. Spirometric reference values from a sample of the general U.S. population. Am J Respir Crit Care Med 1999; 159: 179-187.

    Quanjer PH, Stanojevic S, Cole TJ, Baur X, Hall GL, Culver BH, Enright PL, Hankinson JL, Ip MSM, Zheng J, Stocks J. ERS Task Force. Multi-ethnic reference values for spirometry for the 3-95 age range: the global lung function 2012 equations. Eur Respir J 2012; 40: 1324-1343.


  • What’s normal about the GLI DLCO reference values?

    The Global Lung Function Initiative (GLI) has been working for several years to develop a universal reference equation for DLCO. Although this endeavor is not necessarily complete, an article describing the GLI DLCO reference equation for Caucasians was published in the September issue of the European Respiratory Journal as an open access article and can be downloaded by anyone. The GLI in general and the authors of the article more particularly are to be commended for this monumental work and for the insight it brings to understanding the normal distribution of DLCO.

    The data used to develop the GLI reference equations was originally derived from 19 studies the GLI identified to have been performed on lifetime nonsmoking populations. 85% of the results came from Caucasian populations and the remainder from two Asian sources. The authors felt that there weren’t a sufficient number of non-Caucasians to accurately describe any ethnicity-based differences in DLCO and for this reason only the Caucasian data was used.

    From this data some results were excluded because of:

    • FEV1 > 5 Z-scores or < -5 Z-scores
    • Height (children only, > 5 or < -5 Z-scores)
    • VA less than VC
    • Elevated BMI (>30 kg/m2 in adults, >85th centile in children)
    • Missing demographic information

    After these exclusions 9710 results remained of which 4859 were male and 4851 were female. DLCO values were corrected for altitude and FiO2 and uncorrected for hemoglobin. Reference equations were derived using the LMS (Lambda, Mu, Sigma) method.

    Note: The study population consisted of individuals from 4.5 to 91 years of age and GLI reference equations are valid across this entire span. The majority of the existing DLCO reference equations available to me are for an adult population and for this reason this discussion of the GLI DLCO reference equations will be limited to this portion of the age range. The GLI article also includes reference values for KCO and VA but these subjects will also be saved for a separate discussion.

    Not surprisingly, DLCO is highest in tall and young individuals, and lower in short and elderly ones.

    Over the years many researchers have contended that a statistical analysis of most pulmonary function results shows a homoscedastic distribution of the results. Strictly speaking this means that the lower (and upper) limits of normal parallel the mean value and this has been the basis for using the SEE (Standard Error of the Estimate) x 1.645 as the LLN for DLCO reference equations that were originally published without an LLN. This notion has bothered me but I have neither the statistical background nor access to the original data in order to be able to test it. The GLI DLCO reference equations do show that as far as age is concerned, for the same height the data looks somewhat homoscedastic.

    But for a range of heights with the same age, it does not look that way at all.

    This calls into question the use of SEE x 1.645 as a valid way of estimating the LLN. At the same time however, I found that for the same age, the LLN as a percent of predicted was the same for all heights but that it did vary by age:

    which in turn calls into question the use of 80% of predicted as a stand-in for the LLN, particularly for the elderly. Interestingly, similar to the GLI spirometry reference equations, the GLI DLCO reference equations show that DLCO peaks in the early 20’s, which is different from all existing DLCO reference equations. Also of interest, DLCO appears to decline in a reasonably linear fashion with increasing age.
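
    The finding that the LLN as a percent of predicted is the same for all heights at a given age is what you would expect from the LMS method, and a short Python sketch shows why. It uses the general LMS relationship (value at a Z-score = M x (1 + L x S x Z)^(1/L)) and assumes, as an assumption on my part, that L and S depend on age but not on height. The L, S, SEE and predicted values below are invented for illustration and are not GLI coefficients.

```python
import math

Z_LLN = -1.645  # 5th percentile


def lms_value(m, s, l, z):
    """Value at a given Z-score for LMS parameters M (median), S and L."""
    if abs(l) < 1e-12:
        return m * math.exp(s * z)
    return m * (1.0 + l * s * z) ** (1.0 / l)


def lms_lln_percent_predicted(s, l):
    """LLN as a percent of predicted. M cancels out, so height drops out too."""
    return 100.0 * lms_value(1.0, s, l, Z_LLN)


def see_lln_percent_predicted(predicted, see):
    """Homoscedastic alternative: LLN = predicted - 1.645 x SEE."""
    return 100.0 * (predicted - 1.645 * see) / predicted


# Hypothetical parameters: same age (same L and S), two different heights
# (two different predicted values, M).
for m in (20.0, 35.0):
    lln = lms_value(m, s=0.14, l=0.4, z=Z_LLN)
    print(f"LMS: predicted {m:.1f}, LLN {lln:.1f}, LLN PP {100 * lln / m:.1f}%")

# With a fixed SEE the LLN as a percent of predicted changes with the predicted value.
for predicted in (20.0, 35.0):
    print(f"SEE: predicted {predicted:.1f}, "
          f"LLN PP {see_lln_percent_predicted(predicted, see=4.0):.1f}%")
```

    A larger S (more variability, as might be expected at older ages) lowers the LLN percent predicted, which is why a single fixed cutoff like 80% cannot match the LLN across the age range.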

    The big question has to be how the GLI DLCO reference equations compare to the existing DLCO reference equations in common use:

    In all instances, with the sole exception of taller females, the GLI predicted DLCO is lower than the average of all reference equations. Depending on gender, age and height there are some reference equations that produce similar values to the GLI reference equation but realistically there is no single existing reference equation that remains anywhere close to the GLI values across both genders and across all heights and ages.

    In the USA, the most commonly used DLCO reference equations are probably Crapo [A], Knudson [F] and Miller [H]. With the sole exception of taller females the GLI predicted DLCO is noticeably lower than all of these, although in most instances the GLI is closest to Miller and furthest away from Knudson.

    The authors noted the fact that the GLI predicted DLCO tends to be lower than those of previous studies from the 1970’s through 1990’s but also indicated that it is reasonably in line with more recently published studies. The authors did not speculate as to the cause for this but I will note that rapid response gas analyzer technology was not available until the 1990’s and the studies performed prior to that time were done with alveolar sampling bags. The difference between DLCO measured by rapid response gas analyzer and by alveolar sampling bag has not, to my knowledge, ever been studied. This is unfortunate given that our field has moved almost entirely (and unquestioningly) over the last two decades to rapid response gas analyzer systems for measuring DLCO.

    Note: I can see a number of reasons why there could be minor differences in DLCO measured by these different techniques and these include the change from helium to methane as the insoluble gas component; better definition of the alveolar plateau; the change from volume displacement spirometers to flow sensors; gas and flow integration algorithm accuracy; and differences in system dead space. But whether or not any of these are a factor is likely going to remain unknown, particularly since there are almost no alveolar sampling bag systems left to compare the current rapid gas analyzer systems to.

    Most of the DLCO data was measured using commercially available test systems, primarily Collins, Jaeger and Sensormedics. Interestingly there were some small systematic differences in results based on which test system was used, with Collins producing results that were slightly above the mean, Sensormedics slightly below the mean and Jaeger being closest to the mean.

    One issue that should have been discussed more fully was that “based on the observed variability in TLCO”…“a physiologically relevant difference” in DLCO was “0.5 z-scores” or a “10% relative change”. At the moment I’m taking this to mean that a change in DLCO from one visit to another of 10% is significant and should be noted but I may well be misunderstanding the intent of this statement.

    Note: In fact, one of the few criticisms I have of the GLI DLCO article is that there are a number of issues discussed where an understanding requires the reader to be reasonably well-versed in statistics. To some extent I’d say that I understand the issues and their conclusions but the logic in between those two points is often opaque.

    Any implementation of the GLI reference equations in our present systems is going to have to wait until our equipment manufacturers update our software, however. What the article doesn’t make as clear as I’d like (although it is discussed somewhat vaguely in the supplementary material) is that the reference equations are dependent on something called “splines” (Mspline and Sspline). Spline values are not generated from an equation and instead come from look-up tables. This means that even if you have a reference equation editor (my lab equipment does) there’s no way to insert a look-up table into the process. Even though the use of splines may be familiar to statisticians I was disappointed there was never any explanation about how they were generated.

    I am also disappointed that GLI was unable to include reference values or correction factors for ethnicities other than Caucasian. This is not the fault of the authors since they were dependent on studies performed by other researchers. There is however, nothing to be proud of in the fact that AFTER ALL THIS TIME there are no DLCO reference equations for blacks and that there has not been a single study that did more than compare a small population of blacks to Caucasian DLCO reference values. Ditto lung volumes, and in both instances we’re still guessing. Is it really that hard?

    The GLI DLCO data set is the largest that has ever been analyzed and it was analyzed with the most sophisticated statistical analysis currently available. For these reasons I think that some of the general trends it shows should be taken to heart. In particular the fact that the LLN as a percent of predicted declines with increasing age means that the use of 80% of predicted as a cutoff for normalcy should end. In addition even though the GLI data is somewhat homoscedastic, it is not completely so, and this also means that the practice of generating an LLN from SEE x 1.645 should also end. Finally, it’s also clear that whether it’s due to a change in measurement technology or not, normal values have changed over time and the older DLCO reference equations no longer represent the populations we serve.

    References:

    [Q] Ayers LN, Ginsberg ML, Fein J, Wasserman K. Diffusing capacity, specific diffusing capacity and interpretation of diffusion defects. West J Med 1975; 123: 255-264

    [O] Burrows B, Kasik JE, Niden AH, Barclay WR. Clinical usefulness of the single-breath diffusing capacity test. Am Rev Respir Dis 1961; 84: 789-806

    [P] Chhabra SK, Kumar R, Gupta AU. Prediction equations for diffusing capacity (transfer factor) of lung for North Indians. Lung India, 2016; 33: 479-486

    [A] Crapo RO, Morris AH. Standardized single-breath normal values for carbon monoxide diffusing capacity. Am Rev Resp Dis 1981; 123: 185-189.

    [B] Gaensler EA, Smith AA. Attachment for automated single-breath diffusing capacity measurement. Chest 1973; 63: 136-145.

    [C] Gulsvik A, Bakke P, Humerfelt S, Omenaas E, Tostenson T, Weiss ST, Speizer FE. Single breath transfer factor for carbon monoxide in an asymptomatic population of never smokers. Thorax 1992; 47: 167-173.

    [D] Gutierrez C, Ghezzo RH, Abboud RT, Cosio MG, Dill JR, Martin RR, McCarthy DS, Moorse JLC, Zamel N. Reference values of pulmonary function tests for Canadian Caucasians. Can Respir J 2004; 11(6): 414-424.

    [E] Ip MSM, Lai AYK, Ko FWS, Lau ACW, Ling SO, Chan JWM, Chan-Yeung MMW. Reference values of diffusing capacity on non-smoking Chinese in Hong Kong. Respirology 2007; 12: 599-606

    [F] Knudsen RJ, Kaltenborn WT, Knudsen DE, Burrows B. The single-breath carbon monoxide diffusing capacity. Am Rev Resp Dis 1987; 135: 805-811.

    [G] Marsh S, Aldington S, Williams M, Weatherall M, Shirtcliffe P, McNaughton A, Pritchard A, Beasley R. Complete reference ranges for pulmonary function tests from a single New Zealand population. New Zealand Med J 2006; 119: N1244.

    [H] Miller A, Thornton JC, Warshaw R, Anderson H, Teirstein AS, Selikoff IJ. Single breath diffusing capacity in a representative sample of the population of Michigan, a large industrial state. Am Rev Resp Dis 1983; 127: 270-277.

    [I] Neder JA, Andreoni S, Peres C, Nery IE. Reference values for lung function. III. Carbon monoxide diffusing capacity (transfer factor). Braz J Med Biol Res 1999; 32: 729-737.

    [J] Paoletti P, et al. Reference equations for the single-breath diffusing capacity. Am Rev Resp Dis 1985; 132: 806-813.

    [K] Roberts CM, MacRae KD, Winning AJ, Adams L, Seed WA. Reference values and prediction equations for normal lung function in a non-smoking white urban population. Thorax 1991; 46: 643-650.

    [L] Roca J, Rodriguez-Roisin R, Cobo E, Burgos F, Perez J, Clausen JL. Single-breath carbon monoxide diffusing capacity prediction equations from a Mediterranean population. Am Rev Resp Dis 1990; 141: 1025-1032

    [R] Stanojevic S, Graham BL, Cooper BG, et al. Official ERS technical standards: Global Lung Function Initiative reference values for the carbon monoxide transfer factor for Caucasians. Eur Respir J 2017; 50: 1700010

    [M] Vijayan VK, Kuppurao KV, Venkatesan P, Sankaran K, Prabhakar. Pulmonary function in healthy young adult Indians in Madras. Thorax 1990; 45: 611-615.

    [N] Yang SC, Yang SP, Lin PJ. Prediction equations for single-breath carbon monoxide diffusing capacity from a Chinese population. Am Rev Resp Dis 1993; 147: 599-606.

    Creative Commons License
    PFT Blog by Richard Johnston is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

  • What’s wrong with an elevated DLCO?

    Well, not necessarily anything, although as usual that depends on the circumstances. Recently I was contacted by an individual who was concerned that their DLCO had decreased from 120% of predicted to 99% of predicted. They also mentioned that their DLCO results have normally ranged from 117% to 140% of predicted over the last 9 months.

    More interestingly however, they said that

    “the technician told me before I even took the test that anything over 100% for DLCO is essentially a testing error.”

    Wow. That statement is wrong on so many levels it’s hard to know where to start but I’ll give it a shot anyway.

    First, there are a variety of DLCO reference equations. The ATS/ERS guidelines recommend that PFT Labs pick the reference values that most closely match their patient population but how this is done is left to individual labs. There are at least a couple dozen DLCO reference equations to choose from and probably about a half dozen of these are in common use in PFT labs around the world.

    Because no patient population is ever going to precisely match those of a study this means that DLCO results are going to tend to be above or below 100% of predicted depending on which reference equation the lab is actually using. This also means that if results from otherwise normal subjects are mostly above or mostly below 100% of predicted then the wrong reference equations are being used.

    Moreover, if a lab selects a DLCO reference equation based on the expectation that everybody will be either 100% of predicted or less, that is also wrong. DLCO, like all other pulmonary function results, follows a normal, bell-shaped distribution.

    Reference equations are used to calculate the mean value and this means that approximately half the study population is going to be above the mean and half is going to be below it. Admittedly most study subjects aren’t going to be too far away from the mean, but as an example Miller et al (H in the above graph) is one of the more commonly used DLCO reference equations. They did not include a lower limit of normal (LLN) in their original paper but did include extensive statistics on the study population. Although I have some significant disagreements with this approach many labs calculate the LLN for DLCO as:

    LLN = Predicted DLCO - (1.645 x SEE)

    For Miller et al the SEE (Standard Error of the Estimate) x 1.645 is 7.99 ml/min/mmHg. The upper limit of normal (ULN) can be calculated the same way:

    ULN = Predicted DLCO + (1.645 x SEE)

    Since the predicted DLCO for a 50 year old, 175 cm male is 30.77 ml/min/mmHg, the ULN for the same patient would be 38.76 ml/min/mmHg, but that value is 126% of predicted. So it’s easily possible to be within normal limits, but above 120% of predicted.
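
    As a rough sketch of the arithmetic above, the SEE-based limits can be worked out in a few lines of Python. The function names are my own for illustration, and the SEE is back-calculated from the 7.99 ml/min/mmHg figure quoted above rather than taken directly from the Miller paper.

```python
# Sketch only: SEE-based normal limits, assuming a constant (homoscedastic)
# scatter around the predicted value. Function names are hypothetical.
Z_ONE_SIDED_5_PERCENT = 1.645  # multiplier applied to the SEE


def normal_limits(predicted, see):
    """Return (LLN, ULN) as predicted -/+ 1.645 x SEE."""
    margin = Z_ONE_SIDED_5_PERCENT * see
    return predicted - margin, predicted + margin


def percent_predicted(value, predicted):
    """Express a value as a percentage of the predicted value."""
    return 100.0 * value / predicted


# Figures quoted in the text: predicted DLCO for a 50 year old, 175 cm male
# is 30.77 ml/min/mmHg and SEE x 1.645 is 7.99 ml/min/mmHg.
predicted = 30.77
see = 7.99 / Z_ONE_SIDED_5_PERCENT  # back-calculated, not from the paper

lln, uln = normal_limits(predicted, see)
print(f"LLN {lln:.2f} ml/min/mmHg = {percent_predicted(lln, predicted):.0f}% of predicted")
print(f"ULN {uln:.2f} ml/min/mmHg = {percent_predicted(uln, predicted):.0f}% of predicted")
```

    With these inputs the LLN works out to about 74% of predicted and the ULN to about 126% of predicted, which is why a fixed 80% to 120% window does not line up with limits calculated this way.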

    Probably far more pertinent to the person who contacted me about this issue (who has a tentative diagnosis of reactive airways disease and a BMI > 30), is that DLCO is frequently elevated in both asthma and in obesity. In a study of 245 patients with a DLCO >140% of predicted, Saydain et al found that most of the patients had a diagnosis of either asthma, obesity, or both. The reasons for this are unclear but subjects with asthma have been noted to have increased airway vascularization, and therefore an elevated capillary surface area in the lung. Other researchers have noted that asthma is associated with an elevated perfusion of the lung apices. And numerous other researchers have shown that obese individuals have an elevated capillary blood volume and an elevated cardiac output.

    So, is an elevated DLCO abnormal in these individuals? Probably not, since no one has shown that an elevated DLCO is an adverse clinical finding in either asthma or obesity, but in particular it also means that an elevated DLCO should not be dismissed as a testing error in these patients.

    A much smaller minority in Saydain et al’s study had pulmonary hemorrhage, polycythemia or left-to-right shunting. These are adverse clinical findings and for this reason, even more so than asthma or obesity, an elevated DLCO should not be ignored as a testing error.

    This is not to say that there aren’t DLCO testing errors that lead to an elevated DLCO and I’d be interested in what that lab’s policy was on checking test systems when a patient has an elevated DLCO.

    The DLCO test is sensitive to the pulmonary capillary blood volume and this can be affected by excess resistance during inspiration and by a Müller maneuver (inspiratory effort against a closed mouthpiece), both of which cause a negative intrapulmonary pressure and act to increase the capillary blood volume. The 2017 ERS/ATS DLCO standard explicitly mentions the need to avoid a Müller maneuver and the standard’s equipment specifications include the statement that:

    “Circuit resistance must be <1.5 cmH2O·L−1·s−1 up to 6 L·s−1 flow. If a demand-flow regulator is used on a compressed test gas cylinder, the maximal inspiratory pressure required for 6 L·s−1 inspiratory flow through both the circuit and the valve must be <9 cmH2O.”

    The demand valve specifications in the ERS/ATS standard are not based on any particular evidence however, and are essentially a way to grandfather-in existing test systems. To my knowledge the effect that an inspiratory pressure of -9 cm H2O has on DLCO has not been studied (although at least one equipment manufacturer I’m familiar with has designed their DLCO systems without demand valves specifically because of their concerns with this issue). Regardless of whether or not an inspiratory pressure of -9 cm H2O affects DLCO, a sticking demand valve will significantly increase a subject’s inspiratory pressure. This will probably increase the measured DLCO as well and should be considered a testing error.

    Interestingly, although the ERS/ATS standard acknowledges that exercise affects pulmonary capillary blood volume (due to an elevated cardiac output), other than stating that:

    “…the subject must be seated comfortably throughout the test procedure…”

    it does not specify any pre-test waiting period although a minimum 4 minute between-test waiting period is recommended. This waiting period however, is intended to allow for adequate elimination of test gas from the lungs and not as a way to ensure the patient is in a resting state.

    Note: I’ve used the effects of exercise as a talking point when teaching students about DLCO testing. To do this I perform a DLCO on a student after they’ve sat quietly for several minutes. Next I get the test system set up for another DLCO test but first I have them walk briskly up and down the hallway for several minutes, then sit down and immediately perform another DLCO. Their second, post-exercise, DLCO is almost always at least 15% to 30% higher than their resting DLCO.

    Most of the time however, testing errors such as valve and tubing failures, a reduced inspiratory volume, a reduced breath-holding time or an inadequate or mistimed alveolar sample tend to cause a decrease in DLCO, not an increase. Having said that, I’ve run across problems with gas analyzers that can cause an elevated DLCO, and these can be hard to detect because they are often transient. The same applies to computer glitches of one kind or another.

    I remain somewhat baffled about the statement that any DLCO over 100% of predicted is a testing error. What is it based on and is it blaming the patient or the test system? If the technician and the lab actually believe it, then why were the elevated DLCO’s reported anyway?

    Unfortunately, I think that it’s more likely the technician was pontificating about a subject they did not understand. It’s worse if they were parroting a statement made by a co-worker, their manager or their medical director since that points to a systemic lack of understanding about testing issues and idiosyncrasies which is (also unfortunately) more common than it should be.

    In my lab we get DLCO’s >140% of predicted several dozen times a year and >120% of predicted far more frequently. I always review the raw data for these tests but then I review the raw data for all DLCO tests so they are not singled out in any particular way. I don’t find any apparent testing errors with elevated DLCO results any more often than I do with regular DLCO results. I’ve also been aware of the association between asthma and elevated DLCO’s for more than 30 years so I’ve never seen any reason not to report them.

    The original question however, was whether a decrease in DLCO from 120% of predicted to 99% of predicted was abnormal and I don’t have a good answer for that. Partly this is because the intrasession reproducibility of elevated DLCOs has not been studied. Partly it’s because the specific causes of an elevated DLCO in a given individual are poorly understood and for this reason it’s not possible to say whether changes like this are clinically significant or not.

    It’s also unclear why this individual is having their DLCO tested so frequently. The only working diagnosis they have is mild shortness of breath and possible reactive airways disease, and those are not good reasons to repeat DLCO tests so frequently. Does their physician suspect something they haven’t yet shared or is it just the policy that DLCO is tested with each patient visit? I’m left with a number of questions but at least I was able to let them know that an elevated DLCO is actually normal for many individuals and not a testing error.

    References:

    Graham BL, Brusasco V, Burgos F, et al. 2017 ERS/ATS standards for single-breath carbon monoxide uptake in the lung. Eur Respir J 2017; 49: 1600016.

    Miller A, Thornton JC, Warshaw R, Anderson H, Teirstein AS, Selikoff IJ. Single breath diffusing capacity in a representative sample of the population of Michigan, a large industrial state. Am Rev Resp Dis 1983; 127: 270-277.

    Saydain G, Beck KC, Decker PA, Cowl CT, Scanlon PD. Clinical significance of elevated diffusing capacity. Chest 2004; 125: 446-452.

    Tanaka H, Yamada G, Saikai T, Hashimoto M, Tanaka S, Suzuki K, Fujii M, Takahashi H, Abe S. Increased airway vascularity in newly diagnosed asthma using a high-magnification bronchovideoscope. Am J Respir Crit Care Med 2003; 168: 1495-1499.

  • Should biological quality control be replaced?

    I’ve been thinking about quality control and quality improvement lately. Mostly this has been about how to go about determining whether the lab has a quality problem with testing and what statistics should be used for this purpose but I was reminded recently about an issue concerning biological quality control that came up a couple months ago on the AARC diagnostics forum. Specifically, one of the participants noted that some of their technicians had refused to perform biological QC on the basis that it violated their HIPAA rights to the privacy of their medical information. Further discussion noted that this was actually a correct interpretation of the HIPAA regulations and that no PFT lab can “force” its technicians to perform biological QC.

    I will be the first to admit that I’d never thought about it this way, and I’ve been mulling it over ever since. I’ve performed PFT testing on myself both for formal biological QC and as a quick way to check the operation of a test system for decades but I never thought of my PFT results as being part of my medical information. That’s probably an indication of my own short-sightedness however, and I also realize that over the years I’ve run across a number of testing issues I’d taken for granted up until somebody pointed out a problem with them.

    My attitude towards my PFT results may also be due to the fact that I don’t have any notable lung disease. My lab has had technicians who have been asthmatic however, and this has never been a factor in whether they were hired or not (other than not letting them perform methacholine challenges). They’ve usually performed bio-QC on themselves and at the time they seemed to regard it as a way to check on the status of their asthma. In retrospect however, I have to wonder if they were ever concerned that I would use their health status and test information against them in their annual evaluation, or even that the hospital would re-consider their employment because the costs of their health insurance might be higher. Although I don’t think the hospitals I’ve worked for ever thought along these lines, like it or not there are many businesses where this is a factor.

    Yesterday I asked myself what would happen if all PFT labs were required to completely end biological quality control because of HIPAA requirements? It didn’t take a lot of thought to realize that there are a number of mechanical test simulators in the marketplace that could do quite well at replacing the biological part of quality control. As importantly, the more I’ve thought about it the more I’ve come to think that biological QC probably isn’t the right way to go about QC in the first place.

    First, no matter how experienced we may be there’s still a lot of inherent variability in PFT results, even when we perform the tests on ourselves. This is the nature of any biological system and it’s why bio-QC results are plotted and analyzed using standard deviations on a Levey-Jennings chart.

    From Wikipedia

    The problem with this is that the “normal” range is based on the standard deviation of the test results themselves, and that the greater the variability these test results have, the larger the “normal” range is going to be as well. This means that a given test system may be considered to be out of range for one individual with a naturally low variability in test results but still within range for another individual who has a naturally high level of variability.
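
    To illustrate the point, here is a minimal sketch of how Levey-Jennings style limits are derived from an individual’s own baseline results. The FVC values are made up for the example, the mean ± 2 SD rule is just one common choice, and the function names are hypothetical.

```python
# Sketch only: control limits derived from each person's own baseline results.
# All FVC values below are invented for illustration.
from statistics import mean, stdev


def control_limits(baseline, n_sd=2.0):
    """Return (low, high) limits as mean -/+ n_sd sample standard deviations."""
    centre = mean(baseline)
    spread = stdev(baseline)
    return centre - n_sd * spread, centre + n_sd * spread


def in_control(value, baseline, n_sd=2.0):
    """True when a new QC result falls inside that person's own limits."""
    low, high = control_limits(baseline, n_sd)
    return low <= value <= high


# Two hypothetical technicians with the same mean FVC (5.00 L)
# but different natural variability.
steady_tech = [4.98, 5.00, 5.02, 5.01, 4.99, 5.00]
variable_tech = [4.80, 5.20, 4.90, 5.10, 4.85, 5.15]

new_fvc = 5.15  # the same 3% shift measured on the same test system

print("steady technician in control:  ", in_control(new_fvc, steady_tech))
print("variable technician in control:", in_control(new_fvc, variable_tech))
```

    The same 3% shift is flagged as out of range for the technician with tightly grouped results but passes for the technician with more scattered results, even though the test system is identical in both cases.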

    Another problem with this approach is that it only considers whether changes over time are within a specified range of the average, not whether the results are actually accurate. Say, for example, the calibration syringe used by a lab was significantly inaccurate for some reason (the collar on the stem was loose and slipped, for example). Hopefully this is unlikely but nevertheless it would mean that the test equipment would be mis-calibrated. Biological QC might pick up on the fact that a calibration syringe had changed but not if the calibration syringe was already inaccurate when an individual began their QC.

    Finally, most individuals working in PFT Labs are likely in reasonably good health with reasonably normal PFT results. Most patients referred to a PFT lab probably have a lung disorder of one kind or another and their results will often be significantly outside the normal range. So, does QC performed on normal(-ish) individuals mean that the test equipment accurately tests patients with abnormal lung function? Maybe yes, maybe no, but think about patients with severe airway obstruction who have markedly reduced expiratory flow rates and require a prolonged exhalation time in order to get anywhere near a normal FVC. Is the test system really measuring those low flow rates accurately and, since most systems tend to be flow-based, is the exhaled volume being accurately integrated from their expiratory flow? Again, maybe yes, maybe no, but biological QC isn’t very likely to tell you this.

    So, what can take the place of the biological QC?

    Just as automated systems are replacing humans across the job marketplace there are a number of mechanical simulators that can replace the biological component in QC. Realistically however, because the current demand for these devices is presently low, there isn’t a great deal of selection available.

    There are a couple of flow-volume simulators (the 691227 from Hans-Rudolph and the PWG-33 from Piston Medical). These can produce any of the 50 different ATS spirometry waveforms which cover the entire range of pulmonary diseases and are used by manufacturers to test their spirometers. Additional waveforms can be imported and edited as well.

    There is a DLCO simulator (the 691044 from Hans-Rudolph). I’ve discussed this previously and it can, depending on the gases used, simulate low, medium or high DLCO results. My lab has been involved in several research studies where we were required to regularly test the equipment used on the study’s subjects with a DLCO simulator and its results were uploaded along with the patient test results.

    Finally there is a metabolic (i.e., VO2, VCO2, Ve) simulator (the 17050 series from Vacumed). This is able to simulate a wide variety of tidal volumes and respiratory rates as well as VO2 and VCO2.

    At the present time there are no lung volume simulators. Despite this, lung volume measurements using helium dilution or nitrogen washout can be simulated (with a bit of care) using a calibration syringe. Morgan Scientific used to sell a simulator for plethysmographs; however, this product has since been discontinued (but may return depending on customer interest).

    Because the market for simulators is so small, they also tend to be built only when an order has been placed. This means there are no economies of scale, which in turn means that they tend to be more expensive than they “should” be and this has discouraged their more widespread use, particularly since they are not required by any regulatory agencies.

    But since simulator technology is well understood and their design is a straightforward engineering issue there’s no particular reason that an automated simulator couldn’t be developed that performed spirometry, lung volumes and DLCO. There’s also no particular reason that an automated QC system couldn’t simulate subjects with mild, moderate, severe and very severe airway obstruction as well as subjects with mild, moderate and severe restriction.

    At the present time however, there is no demand for an automated QC system and it is therefore unlikely to be built, despite the fact that it could test PFT systems across their entire operational range with a high degree of accuracy and repeatability. But it’s possible for this situation to change, and perhaps to do so unexpectedly. As much as I’d like to see our field decide to police itself and mandate lab certifications, inspections and QC reporting, it’s more likely that this will be imposed from outside.

    One possibility is that a technician could be fired from a PFT lab position, in part allegedly for refusing to perform biological QC. Another is that a physiology student could refuse a spirometry (or some other medical-ish) test and then receive a poor grade, allegedly because of this. If these, or similar, cases were brought to court, it’s possible the court would rule in favor of the plaintiff on HIPAA grounds and this interpretation of the law could quickly mean the end of biological QC everywhere.

    For that matter CMS (Medicare) or SSA (Social Security) could suddenly decide (perhaps in order to limit payments or because they’re tired of questionable PFT results) to require that any PFT results they pay for or that are used in a payment decision could only be performed on test equipment independently verified by QC simulators.

    This issue could also slowly come to a head as the word gets around (mea culpa), and more and more technicians decide they don’t have to do biological QC any more. As much as this may appear to cause problems for some lab managers I have a hard time blaming anybody that doesn’t want to perform biological QC because they want to keep any and all of their medical information private. I don’t think of my own PFT results as being particularly concerning but where do you draw the line? If you worked in a chemistry or hematology lab would you feel the same if you had to regularly provide blood or urine samples for QC? Probably not.

    The fact is that biological QC is at best a limited way to monitor test equipment. In my experience equipment problems are far more often found during routine calibration and patient testing than they are because of biological QC. It’s a lot better than not performing any quality control at all but it doesn’t (and can’t) ensure equipment accuracy and at best it can only detect change over time that is statistically significant. Finally, I’ve never met a technician (or a manager for that matter) that really liked performing biological QC and documenting the results.

    Maybe it’s time to look past biological QC to other ways of ensuring the proper operation of our test equipment. An automated simulator would make it a lot easier to routinely monitor test system accuracy and performance and does not violate any HIPAA regulations. Current simulators have a limited range of abilities, are relatively expensive and are available from only a small number of manufacturers but if simulator QC became a requirement this would likely change very quickly.

  • Getting more out of the LCI with Scond and Sacin

    The Lung Clearance Index (LCI) is a relatively simple test that provides a measure of ventilation inhomogeneity within the lung. This can be clinically useful information since several studies have shown that increases in LCI often precede decreases in FEV1 in cystic fibrosis and post-lung transplant. The LCI is only a general index of ventilation inhomogeneity however, and other than showing its presence, it does not give any further information about its cause or location.

    There is additional information that can be derived from an LCI test that can indicate the general anatomic location where ventilation inhomogeneity (or alternatively, ventilation heterogeneity) is occurring, specifically the conducting or acinar airways. This can be done because changes in the slope of the tidal N2 washout waveform during an LCI test are sensitive to the convection-diffusion wavefront in the terminal bronchioles. Careful analysis of these slopes permits the derivation of two indexes: Scond, an index of ventilation heterogeneity in the conducting airways; and Sacin, an index of ventilation heterogeneity in the acinar airways.

    To review, an LCI test is a multi-breath nitrogen washout test. An individual is switched into a breathing circuit with 100% O2.

    Once this happens tidal volume is measured continuously and used to determine the cumulative exhaled volume. Exhaled nitrogen is also measured continuously and is used to determine the cumulative exhaled nitrogen volume. The LCI test continues until the end-tidal N2 concentration is 1/40th of what it was initially (nominally 2%). At that point FRC is calculated using the cumulative exhaled nitrogen volume:

    FRC (L) = Exhaled N2 Volume / (Initial N2 Concentration – Final N2 concentration)

    LCI is calculated by:

    LCI = Cumulative Exhaled Volume (L) / FRC (L)

    and is essentially a measure of how much ventilation is required to clear the FRC. When an individual tidal breath from the LCI test is graphed, it looks similar to a standard single-breath N2 washout:

    and can be similarly subdivided into phase I (dead space washout), phase II (transition) and phase III (alveolar gas).
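
    The FRC and LCI formulas above translate directly into code. This is a minimal sketch with made-up numbers, not a description of how any particular test system does it; the function names are hypothetical and N2 concentrations are expressed as fractions (0.80 rather than 80%).

```python
# Sketch only: FRC and LCI from a multi-breath N2 washout, using the two
# formulas given in the text. All input values are invented for illustration.

def frc_litres(exhaled_n2_volume, initial_n2_fraction, final_n2_fraction):
    """FRC (L) = exhaled N2 volume / (initial N2 - final N2 concentration)."""
    return exhaled_n2_volume / (initial_n2_fraction - final_n2_fraction)


def lung_clearance_index(cumulative_exhaled_volume, frc):
    """LCI = cumulative exhaled volume (L) / FRC (L)."""
    return cumulative_exhaled_volume / frc


initial_n2 = 0.80             # nominal starting end-tidal N2 fraction
final_n2 = initial_n2 / 40    # end-of-test criterion: 1/40th of initial (2%)
exhaled_n2_volume = 2.34      # L of N2 washed out (hypothetical)
cumulative_exhaled = 21.0     # L of total exhaled volume (hypothetical)

frc = frc_litres(exhaled_n2_volume, initial_n2, final_n2)
lci = lung_clearance_index(cumulative_exhaled, frc)
print(f"FRC {frc:.2f} L, LCI {lci:.1f} turnovers")
```

    In this example it takes 21 L of ventilation to wash out a 3 L FRC, so the LCI is 7 lung turnovers.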

    The interesting thing is that the phase III slopes steadily increase as the LCI test progresses.

    This occurs in subjects with normal lungs but the rate and the degree to which the slope increases is elevated in both COPD and asthma. The phase III slopes are normalized by dividing by the average N2 concentration occurring under phase III and when this is plotted against lung turnover (which is the amount of ventilation equal to the subject’s FRC) the differences between normal and obstructive lung disease can be seen.

    Modified from Verbanck S, Schuermans D, Van Muylem A, Paiva M, Noppen M, Vincken W. Ventilation distribution during histamine provocation. J Appl Physiol 1997; 83(6): Figure 3, page 1910.
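
    The normalization step can be sketched in a few lines of Python. This is only an illustration: the window used to define phase III (50% to 95% of the expired volume) is an assumption chosen for the example, not a standard, and the function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def normalized_phase3_slope(volumes, n2, window=(0.50, 0.95)):
    """Normalized phase III slope (per liter) for one expiration.

    volumes: expired volume at each sample (L), increasing from 0
    n2:      N2 fraction at each sample
    window:  ASSUMED fractions of expired volume treated as phase III
    """
    total = volumes[-1]
    low, high = window[0] * total, window[1] * total
    points = [(v, c) for v, c in zip(volumes, n2) if low <= v <= high]
    if len(points) < 2:
        raise ValueError("not enough samples inside the phase III window")
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    slope, _ = linear_fit(xs, ys)
    mean_n2 = sum(ys) / len(ys)
    # Dividing by the mean N2 over phase III makes slopes from early
    # (high N2) and late (low N2) breaths comparable.
    return slope / mean_n2
```

    Each breath’s normalized slope would then be paired with that breath’s lung turnover (cumulative exhaled volume divided by FRC) to produce the kind of plot shown above.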

    By definition, Scond is the slope of the normalized phase III slope versus ventilation (measured in turnovers, or TO, to normalize for different sized lungs), derived by linear regression over the TO range 1.5 to 6. Sacin is the normalized phase III slope of the very first expiration minus Scond times the TO of the first breath, and is basically the offset of the Scond slope.
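
    Taken literally, those two definitions can be written as a short Python sketch. The function name and the input layout (one turnover value and one normalized phase III slope per breath, in breath order) are assumptions for the example.

```python
def scond_sacin(turnovers, sn_values, to_range=(1.5, 6.0)):
    """Return (scond, sacin) from per-breath turnover and normalized slope.

    turnovers: lung turnover (TO) at each breath, in breath order
    sn_values: normalized phase III slope of each breath
    """
    # Scond: regression slope of Sn versus TO, using only breaths whose
    # turnover falls inside the 1.5 to 6 range.
    pairs = [(t, s) for t, s in zip(turnovers, sn_values)
             if to_range[0] <= t <= to_range[1]]
    if len(pairs) < 2:
        raise ValueError("need at least two breaths between TO 1.5 and 6")
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_s = sum(s for _, s in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in pairs)
    scond = sxy / sxx

    # Sacin: first-breath Sn minus the conductive contribution at that TO.
    sacin = sn_values[0] - scond * turnovers[0]
    return scond, sacin
```

    For example, if every breath’s normalized slope followed Sn = 0.08 + 0.03 x TO, the sketch would return an Scond of 0.03 and an Sacin of 0.08.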

    Normal values for Scond are around 0.033 and tend to increase with age. Normal values for Sacin are around 0.075 and the change with age tends to be minimal. Physiologically, Scond is thought to reflect ventilation heterogeneity occurring in the peripheral conducting airways, where gas transport is driven primarily by convection. Sacin is thought to reflect ventilation heterogeneity occurring within the acinar region of the lung, where gas transport is driven primarily by diffusion.

    One study showed that the LCI itself tended to correlate more with Scond than it did with Sacin, but that LCI was also affected by ventilation inhomogeneities in the larger airways. A study of adult patients with cystic fibrosis showed somewhat different results in that following treatment for an exacerbation their LCI and Sacin improved significantly but Scond did not change.

    In particular, Scond and Sacin have provided a lot of information about asthma and COPD. At the simplest level (and not surprisingly) there is a strong association between the degree of abnormality in Scond and Sacin and the severity of both asthma and COPD.

    For adult asthmatics, researchers have noted the presence of significant ventilation inhomogeneity in their conducting airways (Scond) that is only partially reversible with inhaled bronchodilators. Interestingly, Sacin often showed larger changes and often normalized after the use of a bronchodilator, and this was seen in both smokers and subjects with normal lungs. During asthma exacerbations both Scond and Sacin worsened, but Sacin tended to correlate with decreases in FEV1 while Scond did not.

    The picture is more mixed during bronchoprovocation. Normal subjects who did not have a significant FEV1 decrease following histamine inhalation showed increases in both Scond and Sacin during bronchoprovocation that returned to baseline following a bronchodilator. Subjects that did have a significant FEV1 decrease, however, had an elevated Sacin at baseline that did not change significantly during bronchoprovocation or bronchodilation, while at the same time their Scond both increased and improved significantly.

    Inhaled steroids have been shown to improve Sacin while Scond has tended to show little or no change, and researchers also saw that the greatest benefit of steroid therapy tended to be seen in subjects with the most abnormal Sacin values at baseline. Interestingly, despite the role that steroids play in controlling inflammation, another study showed that the relationship between exhaled NO and Sacin was not significant, while Scond showed a small but statistically significant correlation with exhaled NO.

    An editorial by DA Kaminsky summarized the current clinical findings of Scond and Sacin with asthma:

    Clinical Feature: Abnormal Scond: Abnormal Sacin:
    Stable asthma (adults) +
    Stable asthma (children) +
    Exhaled NO (inflammation) +
    Airway hyperreactivity +
    Methacholine response + +
    Response to BD + ++
    Response to inhaled Steroids + (large particles) ++ (small particles)
    Asthma severity (FEV1) +
    Asthma control + +
    Acute exacerbation ++ +

    where:

    + associated with feature
    ++ strongly associated with feature
    (blank) not associated with feature

    In a study of smokers with varying smoking histories, Scond and Sacin showed abnormalities in smokers with a 10 pack-year history, while spirometric abnormalities usually did not occur until after 20 pack-years and DLCO abnormalities usually did not occur until after 30 pack-years. Another study showed that smokers who quit showed an average improvement in Scond of 42% after 1 year, but any abnormalities in Sacin tended not to show any improvement at all.

    Patients with COPD all tended to have an elevated Sacin, while Scond tended to increase according to the level of COPD severity. Interestingly, patients with COPD that showed a significant improvement in gas trapping with the use of tiotropium bromide (Spiriva) did not have significant changes in either Sacin or Scond, so it appears that any improvements in lung mechanics did not change the underlying ventilation inhomogeneity.

    There is still a lot that needs to be worked out before Scond and Sacin become routine clinical tests. One of the major factors that has kept Scond and Sacin measurements from routine clinical testing (and for that matter from more widespread research) is that measurement of the phase III slope is a time-consuming manual process. Defining the phase III slope in a multi-breath N2 washout also has the same problems as the Fowler dead space measurement, and there is no particular consensus on how it should be performed. A relatively recent paper showed that a computer algorithm could determine the phase III slopes as well as experienced researchers, which opens the possibility that Scond and Sacin could be determined automatically. Unfortunately, in the 5 years since its publication this algorithm doesn’t appear to have been put into use by any other researchers.

    There are also questions about how Scond and Sacin testing should be standardized. As one example, the Scond and Sacin version of an LCI test is performed with the subject being asked to maintain a 1 liter tidal volume. The reason for this has never been explained except that it was part of the protocol for one of the original studies on Scond and Sacin in the 1980’s. The effect of a 1 liter tidal volume versus the subject’s “normal” tidal breath was studied in children and it was shown that LCI was larger and FRC and Scond were smaller with 1 liter tidal breaths than they were with their normal tidal breaths. Interestingly, Sacin increased at the 1 liter tidal volume in children with cystic fibrosis but decreased in those with normal lungs. The reason for these changes, however, may be that for many of the children in the study a 1 liter tidal breath caused them to exhale below FRC (hence the reduced FRC measured during the LCI). At the present time there doesn’t appear to have been any research done on the effect that tidal volume has on LCI, Scond and Sacin in adults.

    Finally, there are a number of technical issues that need to be addressed and these apply to all forms of the multi-breath nitrogen washout. In this kind of testing the signals from the flow sensor and the gas analyzer(s) occur in different time frames and need to be very precisely matched. A recent study showed that a +/- 40 millisecond mismatch in flow and analyzer signals led to significant decreases or increases in LCI and FRC depending on whether the change was + or -. Moreover, equipment deadspace, room temperature, barometric pressure and the location of the gas analyzer tap relative to the subject’s mouth also had measurable (and often significant) effects on LCI, FRC, Scond and Sacin.
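
    To show why the alignment matters, here is a minimal Python sketch that integrates expired N2 volume from a flow signal and a delayed gas signal. It is not the method used in the study mentioned above; the function name, the fixed sample rate and the single constant delay are all simplifying assumptions.

```python
def expired_n2_volume(flow, n2, sample_rate_hz, gas_delay_s):
    """Integrate expired N2 volume (L) from sampled flow and N2 signals.

    flow:        flow at each sample (L/s), positive during expiration
    n2:          N2 fraction recorded at each sample (lags the flow)
    gas_delay_s: ASSUMED constant delay of the gas signal (seconds)
    """
    shift = round(gas_delay_s * sample_rate_hz)
    dt = 1.0 / sample_rate_hz
    total = 0.0
    for i, f in enumerate(flow):
        # The gas recorded `shift` samples later is the gas that was at
        # the mouth when flow sample i was taken.
        j = i + shift
        if f > 0 and 0 <= j < len(n2):
            total += f * n2[j] * dt
    return total


# Synthetic breath at 100 Hz: 2 s of expiration at 0.5 L/s, then a pause.
rate = 100
flow = [0.5] * 200 + [0.0] * 100
true_n2 = [0.0] * 30 + [0.6] * 270      # N2 appears after 0.3 s
recorded_n2 = [0.0] * 20 + true_n2[:-20]  # analyzer lags by 0.20 s

for delay in (0.16, 0.20, 0.24):
    volume = expired_n2_volume(flow, recorded_n2, rate, delay)
    print(f"assumed delay {delay:.2f} s -> N2 volume {volume:.3f} L")
```

    In this synthetic breath a 40 millisecond error in the assumed delay changes the integrated N2 volume by a little over 2% in either direction, and that error carries through to FRC and LCI.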

    The LCI test is relatively safe, simple and easy to perform, and because it requires only tidal breathing it is suitable for children and for patients who are otherwise unable to perform, or have contraindications to, spirometry and other pulmonary function tests. Scond and Sacin can be derived from phase III of the exhaled N2 waveform that occurs during an LCI test and provide significantly more information about the potential sites of ventilation inhomogeneities within the lung than does the LCI alone. In particular, Scond and Sacin also show significant promise for guiding and monitoring asthma therapies. Although the 2013 ERS/ATS guidelines for multi-breath washouts provide an overall framework for LCI, Scond and Sacin measurements, more research and more specific guidelines are needed before these measurements enter the clinical arena.

    Note: One interesting point I ran across while researching this particular post was that there are no fast-responding nitrogen analyzers being manufactured anymore, regardless of whether it is for medical or even industrial use. All present-day PFT systems that perform N2 washout testing, which includes lung volumes, closing volume and the LCI, actually measure O2 (and sometimes CO2 as well). The thought behind this is basically that anything in air that isn’t O2 (or CO2 and, if they’re paying attention, argon) has to be nitrogen. That’s reasonably correct but at the same time all of these test systems use a flow sensor of one kind or another and the “indirect N2” signal has to be aligned and integrated with the flow signal using a high degree of precision in order to accurately measure the volume of nitrogen that’s being exhaled. The Lilly-type nitrogen analyzer (developed in the mid-1940’s by polymath John C. Lilly) that I worked with in the 1970’s and 1980’s was relatively large and mechanically complex (it required a vacuum pump) but the sample chamber was actually placed right next to the mouthpiece so the sample transport time was very low, and the analyzer itself had a response time on the order of 40 milliseconds. Most “high-speed” O2 and CO2 analyzers however, have response times that are usually >120 milliseconds (and also are usually different from each other which means that they have to be aligned separately) and the gas sample is usually transported over catheters that are several feet long (with transport times in the hundreds of milliseconds). In order to make up for the relatively slow response of the gas analyzers and the “smearing” that occurs in transport most test systems use “predictive” algorithms of one kind or another.
    I’m not going to say that present-day N2 washout testing isn’t accurate, but I will say that when it comes to N2 gas analysis (and probably other things as well) we’ve accepted downgraded hardware and complex software as a substitute for far more capable (but admittedly more complex) hardware and I’m not sure that this has been an improvement.
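
    The “anything that isn’t O2 or CO2 has to be nitrogen” idea can be sketched as follows. This is only an illustration under stated assumptions: the gas fractions are treated as dry (water vapor is ignored), argon is assumed to stay in a fixed ratio to N2 throughout the washout, and the dry-air values and function name are mine rather than any manufacturer’s.

```python
# ASSUMED dry-air fractions used only to fix the argon-to-nitrogen ratio.
DRY_AIR_N2 = 0.7808
DRY_AIR_AR = 0.0093
AR_TO_N2_RATIO = DRY_AIR_AR / DRY_AIR_N2


def indirect_n2_fraction(o2, co2, correct_for_argon=True):
    """Estimate the N2 fraction from measured O2 and CO2 fractions."""
    remainder = 1.0 - o2 - co2
    if not correct_for_argon:
        # Simplest version: everything that is not O2 or CO2 is called N2.
        return remainder
    # The remainder is N2 plus argon; with argon assumed to track N2 at a
    # fixed ratio, remainder = N2 * (1 + ratio).
    return remainder / (1.0 + AR_TO_N2_RATIO)


print(indirect_n2_fraction(0.2095, 0.0004))         # room air, argon handled
print(indirect_n2_fraction(0.2095, 0.0004, False))  # argon counted as N2
```

    For room air the uncorrected version overstates N2 by roughly one percentage point (0.7901 versus 0.7808), which is the argon that was counted as nitrogen.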

    References:

    Crawford ABH, Makowska M, Paiva M, Engel LA. Convection- and diffusion-dependent ventilation maldistribution in normal subjects. J Appl Physiol 1985; 59(3): 838-846.

    Downie SR, Salome CM, Verbanck S, Thompson B, Berend N, King GG. Ventilation heterogeneity is a major determinant of airway hyperresponsiveness in asthma, independent of airway inflammation. Thorax 2007; 62: 684-689.

    Houltz B, Lindblad A, Singer F, Robinson P, Nielsen K, Gustafsson P. Tidal N2 washout ventilation inhomogeneity indices in a reference population aged 7-70 years. Eur Respir J 2012; 40: P3797.

    Jetmalani K, Chapman DG, Thamrin C, Farah CS, Berend N, Salome CM, King GG. Bronchodilator responsiveness of peripheral airways in smokers with normal spirometry. Respirology 2016; 21: 1270-1276.

    Kaminsky DA. Multiple breath nitrogen washout profiles in asthmatic patients: What do they mean clinically? J Allergy Clin Immunol 2013; 131: 1329-1330.

    Macleod KA, Horsley AR, Bell NJ, Greening AP, Innes JA, Cunningham C. Ventilation heterogeneity in children with well controlled asthma with normal spirometry indicates residual airways disease. Thorax 2009; 64: 33-37.

    Robinson PD, Latzin P, Verbanck S, et al. ERS/ATS Consensus statement. Consensus statement for inert gas washout measurement using multiple- and single-breath tests. Eur Respir J 2013; 41: 507-522.

    Stuart-Andrews CR, Kelly VJ, Sands SA, Lewis AJ, Ellis MJ, Thompson BR. Automated detection of the phase III slope during inert gas washout testing. J Appl Physiol 2012; 112: 1073-1081.

    Summermatter S, Singer F, Latzin P, Yammine S. Impact of software settings on multiple-breath washout outcomes. PLOS One 2015; 10(7): e0132250.

    Thompson BR, Douglass JA, Ellis MJ, Kelly VJ, O’Hehir RE, King GG, Verbanck S. Peripheral lung function in patients with stable and unstable asthma. J Allergy Clin Immunol 2013; 131: 1322-1328.

    Vanderhelst E, De Meirleir L, Schuermans D, Malfroot A, Vincken W, Verbanck S. Evidence of an acinar response following treatment for exacerbations in adult patients with cystic fibrosis. Respiration 2014; 87: 492-498.

    Verbanck S, Schuermans D, Van Muylem A, Paiva M, Noppen M, Vincken W. Ventilation distribution during histamine provocation. J Appl Physiol 1997; 83(6): 1907-1916.

    Verbanck S, Schuermans D, Van Muylem A, Melot C, Noppen M, Vincken W, Paiva M. Conductive and acinar lung-zone contributions to ventilation inhomogeneity in COPD. Am J Respir Crit Care Med 1998; 157: 1573-1577.

    Verbanck S, Schuermans D, Paiva M, Vincken W. Nonreversible conductive airway ventilation heterogeneity in mild asthma. J Appl Physiol 2003; 94: 1380-1386.

    Verbanck S, Schuermans D, Meysman M, Paiva M, Vincken W. Noninvasive assessment of airway alteration in smokers. The small airways revisited. Am J Respir Crit Care Med 2004; 170: 414-419.

    Verbanck S, Schuermans D, Paiva M, Vincken W. The functional benefit of anti-inflammatory aerosols in the lung periphery. J Allergy Clin Immunol 2006; 118: 340-346.

    Verbanck S, Schuermans D, Paiva M, Meysman M, Vincken W. Small airways function improvement after smoking cessation in smokers without airway obstruction. Am J Respir Crit Care Med 2006; 174: 853-857.

    Verbanck S, Schuermans D, Vincken W. Small airway heterogeneity and hyperinflation in COPD: response to tiotropium bromide. Int J COPD 2007; 2(4): 625-634.

    Verbanck S, Schuermans D, Vincken W. Inflammation and airway function in the lung periphery of patients with stable asthma. J Allergy Clin Immunol 2010; 125: 611-616.

    Verbanck S, Paiva M, Schuermans D, Hanon S, Vincken W, Van Muylem A. Relationships between the lung clearance index and conductive and acinar ventilation heterogeneity. J Appl Physiol 2012; 112: 782-790.

    Yammine S, Singer F, Gustafsson P, Latzin P. Impact of different breathing protocols on multiple-breath washout outcomes in children. J Cystic Fibrosis 2014; 13: 190-197.