The American Society for Nondestructive Testing   

Back to Basics


How Well Does Your NDT Work?

by J. Steve Cargill*

 

The old and the new questions about finding or missing discontinuities are presented here by the author with good clarity as to how they all fit together for NDT today. I think this article may well become a collector's item - worthy of being kept where you can regularly refer to it when needed. I know I will tab this page to make it easy to find.

Frank Iddings
Tutorial Projects Editor

 

Figure 1

Introduction
Historically, when an NDT engineer was asked, "How well does your testing work?" the reply began, "Well, I can find…" That scenario is rapidly changing in the aerospace and nuclear industries as customer-driven requirements for test reliability demonstrations are implemented. As a result, emphasis has shifted from the size of discontinuity that can be found to the size of discontinuity that can be missed. While the difference between the two questions sounds subtle, answering each one requires a radically different approach.

 

Test Reliability
A relatively simple capability study may be performed on a few realistic specimens to determine what size discontinuity can be found. However, a more sophisticated experiment using a larger quantity of specimens, tested over a range of realistic production conditions, must be conducted to estimate test reliability. This estimate is generally presented in terms of probability of detection (POD) versus discontinuity size, thereby quantifying test effectiveness. A confidence level is applied to the estimate of POD to provide additional insight into the potential variability of the test. In simple terms, a POD estimate of 90 percent would indicate that nine out of ten discontinuities of a particular characteristic would be found during testing. A confidence level estimate of 95 percent would indicate that, if the entire experiment were repeated over and over, 95 out of 100 repeat experiments would perform at least as well as the quoted estimate.
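As a rough illustration of how a POD estimate and a confidence level combine, the sketch below computes a one-sided lower confidence bound on POD from simple hit/miss counts at a single discontinuity size. It assumes independent trials and uses the Clopper-Pearson (exact binomial) bound, which is one common choice and not necessarily the method a given customer requires; the function names are hypothetical.

```python
from math import comb

def binom_tail_ge(trials, hits, p):
    """P(X >= hits) for X ~ Binomial(trials, p)."""
    return sum(comb(trials, i) * p**i * (1 - p)**(trials - i)
               for i in range(hits, trials + 1))

def pod_lower_bound(hits, trials, confidence=0.95):
    """One-sided Clopper-Pearson lower bound on POD (assumed method)."""
    if hits == 0:
        return 0.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    # The tail probability rises with p, so bisect for the p where it equals alpha.
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_tail_ge(trials, hits, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo

print(round(pod_lower_bound(29, 29), 3))  # 0.902
print(round(pod_lower_bound(28, 28), 3))  # 0.899
```

Under these assumptions, 29 detections in 29 opportunities is the smallest perfect score that supports a claim of 90 percent POD at 95 percent confidence; 28 of 28 falls just short.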



 


 

In the aerospace industry, this concept of test reliability gained attention with the introduction of the United States Air Force Airframe Structural Integrity Program and the Engine Structural Integrity Program (ENSIP), as well as NASA's standard, MSFC-STD-1249 (1985). All of these comprehensive life management programs insist upon a direct relationship between the guaranteed life of fracture critical hardware, calculated from assumed rogue discontinuities of certain sizes, and the contractor's ability to reliably detect and reject those assumed discontinuities. Many arguments have taken place between contractor and customer regarding the validity of these required demonstrations, with most of the arguments focusing on the use of fatigue cracks to assess manufacturing tests. Obviously, in most cases, virgin parts contain no fatigue cracks. However, it has been the position of the customers that some cracklike discontinuities do occur in new parts and they prefer to work with a worst case scenario. As the US Air Force and NASA look to their respective overhaul requirements, they both want to be assured that all critical hardware will be testable to the required crack sizes at depot. The best way to assure that, barring inservice damage, is to require the same critical tests to be performed by the contractor at production.

A team was formed by the Air Force to create a document that would provide guidance for all aspects of reliability demonstrations. The document was recently published as MIL-HDBK-1823 (1999). The document is in wide circulation, primarily because no similar set of detailed instructions exists. MIL-HDBK-1823 (1999) includes detailed guidance pertaining to:

  • specimen designs, quantities and special care
  • design of experiments and test matrices
  • potential test variables
  • discontinuity size distributions
  • test reports
  • data analysis and presentation
  • treatment of false calls.

Professional societies have initiated efforts to create POD related standards but no private industry documents are close to publication. At this point it makes sense for ASNT to take a leadership role. An ad hoc committee has been assembled to produce a position paper for the Society so that this leadership role may be embraced with a full understanding.

 

Demonstration Design
Design of a reliability demonstration generally flows from a customer requirement to quantify the effectiveness of special tests that are performed on critical, life limiting hardware. For the aerospace community, those tests may be ultrasonic testing, eddy current testing, fluorescent penetrant testing, magnetic particle testing, radiographic testing and other testing methods. Obviously, these various techniques are used for different test scenarios but the worst case discontinuity of concern is normally a tight, cracklike discontinuity in a stress concentrated location. This is the condition that can quickly lead to premature inservice failure. Therefore, most aerospace demonstration specimens are manufactured to contain cracks, while only a few sets contain discontinuities such as voids and inclusions. In other industries, weld discontinuities are a much greater concern, so specimens can be tailored to address those conditions.

The various tests of critical hardware are grouped according to testing technique, surface versus subsurface discontinuity, discontinuity size range, local testing geometry and material. Specimens, in groups of 30 to 60 crack opportunities, are then generated to simulate each scenario. This is the most expensive aspect of the experiment; therefore, strong emphasis is placed upon grouping tests, minimizing numbers of specimens and selecting an optimal range of discontinuity sizes.

The ideal discontinuity distribution will have a few specimens on both horizontal sections of the anticipated POD curve, with the bulk of the discontinuity sizes selected to be in the transition region. Obviously, this can be a difficult guessing game, so prior history from previous similar demonstrations is valuable.
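The size selection described above can be sketched in a few lines. The example places a small number of target sizes on each flat section of the anticipated POD curve and spreads the rest across the guessed transition region. Everything numeric here is a hypothetical placeholder (the size limits, the transition bounds, three sizes per flat section), and the even spacing in log size is an assumption made for illustration, not a requirement taken from any standard.

```python
from math import exp, log

def log_spaced(lo, hi, count):
    """Return `count` sizes spaced evenly in log(size) from lo to hi."""
    if count == 1:
        return [exp((log(lo) + log(hi)) / 2)]
    step = (log(hi) - log(lo)) / (count - 1)
    return [exp(log(lo) + i * step) for i in range(count)]

def plan_sizes(total, size_min, trans_lo, trans_hi, size_max, tail_each=3):
    """Put `tail_each` sizes on each flat section, the rest in the transition."""
    bulk = total - 2 * tail_each
    if bulk < 1:
        raise ValueError("not enough specimens for the transition region")
    lower = log_spaced(size_min, trans_lo, tail_each + 1)[:-1]  # below transition
    upper = log_spaced(trans_hi, size_max, tail_each + 1)[1:]   # above transition
    middle = log_spaced(trans_lo, trans_hi, bulk)
    return lower + middle + upper

# Hypothetical plan: 40 crack opportunities, sizes in millimeters,
# with the transition region guessed to lie between 0.3 and 1.0 mm.
sizes = plan_sizes(40, 0.1, 0.3, 1.0, 3.0)
print(len(sizes), [round(s, 2) for s in sizes[:5]])
```

In practice the guessed transition bounds would come from prior demonstrations on similar tests, which is why that history is so valuable.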

Using statistical techniques, test matrices are developed that exercise all potential variables of each test. These may include redundant test systems, probes, dwell times, concentrations of liquids and many other factors, including test personnel.
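A minimal sketch of such a test matrix follows, assuming a full factorial design in which every combination of factor levels is run once and the run order is randomized. The factor names echo the variables listed above, but the levels are hypothetical, and a real program might instead use a fractional or otherwise reduced design to limit cost.

```python
import itertools
import random

# Hypothetical factors and levels for a penetrant test demonstration.
factors = {
    "system": ["A", "B"],
    "dwell_time_min": [10, 20],
    "concentration": ["low", "high"],
    "inspector": ["I1", "I2", "I3"],
}

def full_factorial(factors):
    """Return one run (a dict of factor -> level) per combination of levels."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in itertools.product(*factors.values())]

def randomized(runs, seed=0):
    """Return a shuffled copy so run order is not confounded with any factor."""
    order = runs[:]
    random.Random(seed).shuffle(order)
    return order

matrix = full_factorial(factors)
run_order = randomized(matrix)
print(len(matrix))    # 24
print(run_order[0])
```

Each run in the matrix would then be applied to the full specimen set, so the number of runs multiplies the testing effort directly.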

Honest interaction of testing engineers with inspectors, including development of experimental procedures and data recording devices, is critical in obtaining realistic results. An estimate of NDT capability could be obtained by an engineer in the laboratory, but an accurate estimate of NDT reliability can be obtained only by evaluating the inspectors in a comfortable, familiar environment.

 

Test Anxiety
Although the stigma of testing may make the testing staff acutely alert, a nervous tester is more likely to make a mistake. If the testing engineer does not normally work with the inspectors at that facility, it is advisable for the engineer to arrive early enough to spend some time getting to know the inspectors and discussing the program objectives and procedures before starting the tests. In addition, the test procedure and data recording device must closely simulate normal daily activity to assure validity of results.

It is also important to provide feedback to the testing staff when the data have been reduced and POD curves generated. In most cases, anonymity of inspectors is assured, making the reliability demonstration a system test, rather than a test of individuals. However, an exception should be made in the case of techniques that are extremely tester dependent. Although this has been a topic of much debate, it makes no sense to ignore the most critical variable of the test.

MIL-HDBK-1823 (1999) describes two optional data analysis and presentation techniques: hit/miss and a-hat versus a analysis. Hit/miss analysis relies simply upon pass/fail information to provide an estimate of POD. It is generally applied to tests whose outputs are nonquantitative and it generally requires more data than the a-hat versus a process to produce a statistically valid answer. An a-hat versus a analysis makes use of the additional information that can be provided in a quantitative output. The symbol a represents the true discontinuity size, while the symbol a-hat represents the indicated discontinuity size. The indicated discontinuity size may simply be the signal amplitude, rather than the result of special processing that attempts to provide a discontinuity dimension. In general, a-hat versus a analysis provides a more optimistic estimate of NDT reliability. Data analysis software may be obtained from Air Force personnel at Wright-Patterson Air Force Base, Ohio.
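To make the a-hat versus a idea concrete, the sketch below fits a straight line to log(a-hat) against log(a) by ordinary least squares and converts it to a mean POD curve, assuming normally distributed scatter about the line and a fixed decision threshold on the signal. This log-log form is a commonly used model, but it is only one option; the data values and the threshold are invented for illustration, and the sketch ignores censored (saturated or below-noise) signals and does not compute the 95 percent confidence bound.

```python
from math import exp, log, sqrt
from statistics import NormalDist, mean

def fit_ahat_vs_a(sizes, responses):
    """Least-squares fit of log(a-hat) = b0 + b1*log(a); returns b0, b1, sigma."""
    x = [log(a) for a in sizes]
    y = [log(r) for r in responses]
    xm, ym = mean(x), mean(y)
    sxx = sum((xi - xm) ** 2 for xi in x)
    b1 = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y)) / sxx
    b0 = ym - b1 * xm
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma = sqrt(sum(r * r for r in resid) / (len(x) - 2))
    return b0, b1, sigma

def pod(a, b0, b1, sigma, threshold):
    """Probability that the signal from a discontinuity of size a exceeds threshold."""
    return NormalDist().cdf((b0 + b1 * log(a) - log(threshold)) / sigma)

def size_at_pod(p, b0, b1, sigma, threshold):
    """Discontinuity size at which the mean POD curve reaches probability p."""
    z = NormalDist().inv_cdf(p)
    return exp((log(threshold) + z * sigma - b0) / b1)

# Hypothetical data: true crack sizes (mm) and signal amplitudes (arbitrary units).
sizes = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0, 1.2]
responses = [0.9, 1.6, 1.9, 2.8, 3.0, 4.4, 4.9, 6.5]
threshold = 2.0  # assumed reject level on the signal

b0, b1, sigma = fit_ahat_vs_a(sizes, responses)
a90 = size_at_pod(0.90, b0, b1, sigma, threshold)
print(round(a90, 3), round(pod(a90, b0, b1, sigma, threshold), 3))
```

A hit/miss analysis would replace the regression with a model fitted directly to the pass/fail outcomes, which is why it typically needs more specimens to reach a comparably stable curve.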

Confidence levels of 50 percent (mean curve) and 95 percent (two sigma lower bound) are most often quoted when POD information is presented. NASA requires the use of the 95 percent confidence level at all times when NDT reliability for fracture critical components is discussed. However, Air Force standards such as the ENSIP standard, MIL-STD-1783 (1997), allow the use of the 50 percent confidence level curve for tests that are automated or semiautomated. Obviously, these categories leave much room for debate, but the fact is that any well controlled test produces a 95 percent confidence level curve that is relatively close to the mean curve. Air Force program managers often ask to review lower bound curves for automated systems just to have another test for system stability.

 

Conclusion
In summary, the primary intent for an NDT reliability demonstration is to produce a superior quantification of the effectiveness of a test system, including basic system functions, environmental influences and human factors. Compared to the more conventional practice of demonstrating basic capability of an NDT method on a few artificial discontinuities, this is important information for those of us who rely upon sophisticated NDT methods to provide a final level of quality assurance for critical hardware. Many preliminary demonstrations have produced results that have led to modified test practices. Through those modifications and appropriate controls, the NDT industry now has the opportunity to step up to superior practices.

 

References
National Aeronautics and Space Administration, Marshall Space Flight Center, MSFC-STD-1249, NDE Guidelines and Requirements for Fracture Control Programs, 1985.

US Air Force Aeronautical Systems Center, Military Standard, MIL-STD-1783, Engine Structural Integrity Program, 1999.

US Air Force Aeronautical Systems Center, Military Handbook, MIL-HDBK-1823, Nondestructive Evaluation System, Reliability Assessment, 1999.

 

 

* Aerospace Structural Integrity, Inc., 8637 S. E. Sharon Street, Hobe Sound, FL 33455; (561) 546-7718; fax (561) 881-4675; e-mail <cargill2@juno.com>.

 

Copyright © 2001 by the American Society for Nondestructive Testing, Inc. All rights reserved.

[ Materials Evaluation ]

 

 