Comment: Expert Elicitation for Reliable System Design

Statistical Science 2006, Vol. 21, No. 4, 451–453. DOI: 10.1214/088342306000000529. Main article DOI: 10.1214/088342306000000510. © Institute of Mathematical Statistics, 2006

Norman Fenton and Martin Neil

Norman Fenton is Professor of Computer Science and Head of Risk Assessment and Decision Analysis Research, Computer Science Department, Queen Mary College, University of London, London E1 4NS, United Kingdom (e-mail: norman@dcs.qmul.ac.uk). Martin Neil is Reader in Systems Risk at Queen Mary College, University of London and CTO of Agena Ltd (e-mail: martin@dcs.qmul.ac.uk).

This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in Statistical Science, 2006, Vol. 21, No. 4, 451–453. This reprint differs from the original in pagination and typographic detail.

The paper “Expert Elicitation for Reliable System Design” by Bedford, Quigley and Walls is timely and significant for three reasons:

1. It addresses the importance of expert elicitation in systems design and the statistical and practical challenges faced when trying to use expert judgements in a way that is consistent with established approaches based on statistical reliability testing.
2. It rightly focuses our attention on the need for a holistic approach to reliability evaluation that goes beyond analysis of single projects to also include information from “softer” sources such as design and operational use.
3. It recognizes the emerging importance of Bayesian methods in providing the “uncertainty calculus” to combine evidence from experts with statistical reliability data in such a way that system reliability assessments and forecasts can grow and evolve as a system changes throughout its life.

Our own research and experience support many of the key thrusts of the authors’ ideas. For the last ten years we have been applying Bayesian methods, and more specifically Bayesian networks (which the authors refer to in Section 4.2.3), to a wide variety of problem areas (see, e.g., Neil, Malcolm and Shaw, 2003, and Fenton et al., 2004). This includes system dependability evaluation, of which the best known example is the Transport Reliability Assessment Calculation System (TRACS) (Neil, Fenton, Forey and Harris, 2001); this is an early exemplar of the meta modeling frameworks cited by the authors in Section 4.1. We have found Bayesian methods to be most beneficial to the types of problems mentioned by the authors, including the issue of making trade-offs between reliability and other system objectives like functionality and cost (something we examined in detail for software systems in Fenton et al., 2004).

We have a number of additional observations to make about the paper:

Very often reliability assessments are carried out by a client (rather than the design authority) or by a procurement agency on behalf of the client. In this case, the expert is not the designer but a customer, and the impact of this is more general than the authors appear to suggest in Table 1. Such customers may have relevant operational reliability experience gained from use of similar products from this or different suppliers and will, quite correctly, want to use this experience to best effect either to reduce testing effort or to select suppliers at the procurement stage. Other situations spring to mind where a different perspective would give rise to additional problems and challenges, such as COTS (commercial off the shelf systems).

There can be a paucity of empirical data for mission and safety critical systems simply because the systems may be novel or the top events may be rare.
Probabilistic risk assessment methods aside, this problem often forces practitioners to borrow or adopt data from different sources, some of uncertain provenance, to help make a reliability claim based on some structured (or often unstructured) argument.

Where data do exist, they may only be partially relevant for a number of reasons. For example, the data may be sourced from heterogeneous systems or may have been collected under different or uncontrolled conditions. Detailed statistical modeling is practically and economically infeasible in such “messy” situations, but nevertheless judgements have to be made. In practice these decisions can be a black art, involving opaque assumptions and unchecked subjectivity, but in our experience Bayesian methods can help bring some rigor and structure. More importantly, they also encourage transparency and allow uncertainties and assumptions to be modeled explicitly.

In TRACS (Neil, Fenton, Forey and Harris, 2001) we built a system that partially or wholly addresses some of the authors’ aims with some success. Indeed the system remains in routine use by QinetiQ to assess the reliability of military vehicles throughout procurement, design, test and operational use. One of the original key motivations for TRACS was exactly the problem identified in Section 4.1 that traditional approaches to reliability prediction tend to be overly optimistic because they fail to take into account design and process factors. The TRACS architecture allows estimation of failure rates from families of components using a Bayesian hierarchical model and then aggregates these into a system level reliability distribution, which can then be updated, using Bayes’ rule and likelihood data gathered at prototype test, system trial and preproduction stages.
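
The idea of updating a failure rate stage by stage with Bayes’ rule can be illustrated with a minimal sketch. This is not the TRACS model: it assumes a single constant failure rate with a gamma prior and a Poisson likelihood for the number of failures observed in a given number of test hours, which is a standard conjugate pair. The class name `GammaRate`, the prior parameters and the stage data are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaRate:
    """Gamma(shape, rate) belief about a constant failure rate (failures per hour).

    Hypothetical helper for illustration; not part of TRACS or any library.
    """
    shape: float  # behaves like a count of prior "pseudo-failures"
    rate: float   # behaves like prior "pseudo-hours" of exposure

    @property
    def mean(self) -> float:
        # Mean of a gamma distribution parameterized by shape and rate
        return self.shape / self.rate

    def update(self, failures: int, hours: float) -> "GammaRate":
        # Conjugate update: a Poisson count of failures over `hours` of
        # exposure adds to the shape and the exposure adds to the rate.
        return GammaRate(self.shape + failures, self.rate + hours)


# Assumed prior: roughly 2 failures per 1000 hours (illustrative only)
prior = GammaRate(shape=2.0, rate=1000.0)

# Assumed test data per stage: (stage name, failures observed, hours of exposure)
stages = [
    ("prototype test", 3, 500.0),
    ("system trial", 1, 1500.0),
    ("preproduction", 0, 2000.0),
]

belief = prior
history = [("prior", belief)]
for name, failures, hours in stages:
    belief = belief.update(failures, hours)
    history.append((name, belief))

for name, b in history:
    print(f"{name:>15}: mean failure rate = {b.mean:.5f} per hour")
```

In this sketch the prior plays the role of the pre-test prediction, and each stage’s data pull the estimate toward the observed failure frequency. A hierarchical model over families of components and the aggregation to a system level distribution, as described above, would need considerably more machinery than this single-rate example.
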
Crucially, at each stage a number of expert-based assessments are made to adjust the failure rate predictions based on qualitative estimates of design and manufacturing factors, including subcontractor competence, risk analysis quality, design documentation quality, staff reputation and skills. A hybrid Bayesian network is then used to fuse all of the information to provide a family of estimates and predictions throughout system life.

The state of the art has moved on considerably since TRACS, and the Bayesian algorithms used in TRACS are now available commercially (AgenaRisk, 2006). As a result, model construction is now considerably faster and easier than it was when TRACS was first implemented in 1999.

The issue of expert elicitation is becoming increasingly relevant to extend and supplement six sigma approaches. For example, we have recently been working with Motorola to help complement their six sigma program by using Bayesian methods to represent expert judgements about the impact of fundamental organizational and process factors on downstream product reliability. This is commercially important because reliability problems often occur as a result of sources of systematic design variability, often itself caused by the ineffective management of outsourced suppliers and problems with communicating and implementing system requirements. These are issues that are not easily addressed by statistical process control techniques, nor are such techniques designed to address them, despite their importance.
Based on this experience, a number of interesting research issues relevant to the paper spring to mind:

• Cultural conflict; that is, how do we persuade engineering experts to express Bayesian priors when the dominant culture of statistical process control is almost entirely data driven [which can lead to what Chapman calls a syndrome of objective irrationality (Chapman and Ward, 2000)]?
• What universal organizational and process drivers affect what industries and in what way?
• Can we assess the effects of process factors in quantitative terms or encourage the adoption of methodical collection and sharing of the necessary data?

The authors implicitly assume that the benefits of probability elicitation will only accrue in situations where there is already a highly developed reliability methodology to which new techniques can be added. In these situations there are already structure, methods and data, but what of those who need to assess reliability of products sourced from less mature organizations or where data collection by empirical means is economically infeasible? Here elicitation could, perhaps controversially, be used instead of traditional reliability methods. In this situation decisions would turn on “softer” issues, but would nevertheless be quantified and the prediction ultimately would be verifiable, at least in principle.

An additional key benefit of probability elicitation that was not covered in the paper is that it helps codify knowledge, making it available in the future for other projects or for other systems. This is important because reliability assessment is not just a one-off activity undertaken on a single system or project or even over the lifetime of such systems; it also addresses families of systems that change within a changing design organization or usage environment.
From this perspective, elicitation should be seen as a knowledge management opportunity rather than as a technical problem to be solved in isolation. Such knowledge, if codified and trusted, could be reused at reduced cost on future projects and used to help communicate engineering judgement from engineering experts to novices.

The issue of bias in subjective probability elicitation (which the authors address in Section 3.2) has too often been used as an easy excuse not to do Bayesian modeling. We feel strongly that this issue has been overplayed; a good discussion of this can be found in Ayton and Pascoe (1996). Moreover, in our own work building Bayesian net models with domain experts, we have developed a range of techniques that minimize the effort required for probability elicitation. An example is the use of simple predefined distributions that cover most common situations that involve ordinal scale variables that are conditioned on other ordinal scale variables (Fenton and Neil, 2006).

Finally, we would like to congratulate the authors on writing such an interesting, wide ranging and thought provoking paper.

REFERENCES

AgenaRisk (2006). Available at www.agenarisk.com.
Ayton, P. and Pascoe, E. (1996). Bias in cognitive judgements? Knowledge Engineering Review 10 21–41.
Chapman, C. B. and Ward, S. C. (2000). Estimation and evaluation of uncertainty: A minimalist first pass approach. International J. Project Management 18 369–383.
Fenton, N. E., Marsh, W., Neil, M., Cates, P., Forey, S. and Tailor, M. (2004). Making resource decisions for software projects. In Proc. 26th International Conference on Software Engineering 2004 397–406. IEEE Computer Society, Washington.
Fenton, N. E. and Neil, M. (2006). Using ranked nodes to model qualitative judgements in Bayesian networks. Available at www.dcs.qmw.ac.uk/~norman/papers/ranked_nodes%20v01.004.pdf.
Neil, M., Fenton, N., Forey, S. and Harris, R. (2001). Using Bayesian belief networks to predict the reliability of military vehicles. IEE Computing and Control Engineering J. 12 11–20.
Neil, M., Malcolm, B. and Shaw, R. (2003). Modeling an air traffic control environment using Bayesian belief networks. Presented at 21st International System Safety Conference, Ottawa, ON, Canada.
