Optimizing the Efficiency of Accelerated Reliability Testing for the Internet Router Motherboard
With the rapid development of the internet router, the complexity of its mainboard has been growing dramatically. The high reliability requirement causes the number of test cases to increase exponentially, which becomes the bottleneck that prevents fu…
Authors: Hanxiao Zhang, Shouzhou Liu, Yan-Fu Li
Hanxiao Zhang, Department of Industrial Engineering, Tsinghua University, Beijing, China
Shouzhou Liu, Department of Industrial Engineering, Tsinghua University, Beijing, China
Yan-Fu Li, PhD, Department of Industrial Engineering, Tsinghua University, Beijing, China

Key Words: accelerated testing, test case selection, test case sequencing

SUMMARY & CONCLUSIONS

With the rapid development of the internet router, the complexity of its mainboard has been growing dramatically. The high reliability requirement causes the number of test cases to increase exponentially, which becomes the bottleneck that prevents further elevation of the production efficiency. In this work, we develop a novel two-step optimization method that substantially reduces the testing time and increases the testing efficiency. The first step optimizes the selection of test cases given the required amount of testing time reduction while ensuring the coverage of failures. In the second step, the selected test cases are optimally scheduled to maximize the efficiency of mainboard testing. A numerical experiment illustrates the effectiveness of the proposed methods. The results show that an optimal subset of the test cases can be selected that satisfies the 10% testing time reduction requirement, and that the effectiveness index of the scheduled sequence can be improved by more than 75% with test case sequencing. Moreover, our method can self-adjust to new failure data, which realizes the automatic configuration of board test cases.

1 INTRODUCTION

The internet router is a critical device of the internet infrastructure that forwards data packets between computer networks. It is often programmed to filter packets, translate addresses, make routing decisions, broker quality of service reservations, etc.
The processing speed of the router is one of the major constraints on internet speed, and its reliability directly affects the quality of network service. In addition, with the rapid development and proliferation of the internet router, its main component, the motherboard, has become increasingly complex. Evidently, reliability testing of the motherboard is becoming ever more necessary to ensure its quality and thus to protect the reputation of the manufacturer. The high-reliability requirement causes the number of test cases in reliability testing to increase dramatically, and under the current testing scheme all test cases must be executed at each testing period, which significantly prolongs the testing process and thus creates a bottleneck that prevents further elevation of motherboard production. On the other hand, according to our investigation, a large number of test cases have never exposed any fault in history. These observations indicate great potential for improvement and therefore motivate our research.

© 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Accelerated Life Testing (ALT) [1, 2] is one of the principal practical methods used by manufacturers of electronic products to estimate the reliability of their products. Statistical methodology and applications for estimating the failure time have been researched extensively.
By shortening the lifetime of the products or accelerating the degradation of their performance, manufacturers can provoke faults through ALT, find their causes, and correct them before production. In addition, the weakest products can be detected with ALT to reduce the rejection rate. To guarantee that the reliability of the products can be estimated accurately and quickly, the design of the test plan is critical for ALT. An ALT plan consists of the chosen test conditions, the number of specimens at each test condition, the prechosen censoring time of each specimen, and so on [3, 4]. Some ALT plans consider Type II failure censoring times and periodic inspection for failure [5]. Step-stress tests can be used to yield failures quickly and thus obtain the information in a short time [6]. The number of test specimens can be determined to get more accurate results in a given time [7].

In this work, however, we focus on reducing the testing time by decreasing the number of test cases, where the ALT conditions are fixed. The selection and sequencing of the test cases are performed to meet the time reduction. These operations do not affect the lifetime of the products, but they influence the efficiency and effectiveness of product quality inspection. Test case prioritization techniques, namely test case selection and test case sequencing, have been widely researched in software regression testing [8, 9, 11]. They schedule test cases in an order that increases their effectiveness at meeting a performance goal, such as code coverage or rate of fault detection. In this paper, we draw on these principles and techniques and modify them to adapt to our problem.

The contributions of this work are:
1. We model the problem of reducing the ALT testing time and define the notations in this model.
2. We formulate this problem into a two-step method: test case selection and test case sequencing.
3. We propose exact algorithms to solve Test Case Selection and Test Case Sequencing respectively.

The rest of this paper is organized as follows. Section 2 describes the problem. Sections 3 and 4 present the models and the exact algorithms for Test Case Selection and Test Case Sequencing. The numerical experiment and results are presented in Section 5. In Section 6, we conclude this work and discuss the advantages and drawbacks of the proposed model and methods.

2 THE STATEMENT OF THE PROBLEM

2.1 Notation

Table 1 shows the notations of the parameters we use in the following model.

Table 1 – Notations of parameters in our problem
a test case in a given period
decision on the number of executions of the test case
all periods in the ALT
all test cases in each period
the number of test cases in each period
all effective test cases in each period
three given testing sets of test cases in each period
the priority of each given testing set in the selection procedure
the set of periods with the same accelerated conditions
the expected time limit for each period
the number of failures in history exposed by the test case
the stop time of the test case in the original scheduling of the testing procedure
the running time of the test case
the start time of each period
the precedence relation among test cases in a period

2.2 Problem Description

The type of accelerated reliability test considered in this study is the temperature and voltage cycle test. It determines the ability of the motherboard to withstand the mechanical stresses induced by the temperature and the voltage each alternating between two extremes. The weakest members are exposed and the data are collected for analysis. In our testing scheme, each motherboard undergoes a few cycles to expose the potential faults (Fig. 1).
Each cycle consists of four different testing periods. During each period, a list of test cases is executed sequentially. A test case is a small program that diagnoses whether certain functionalities of the motherboard work correctly. Thus, the failures of the motherboard should be exposed by certain test case(s). All historical failures are recorded, including the time of failure and the corresponding test cases. An effective test case is a test case that has exposed at least one fault in history.

Fig. 1 – An example of the testing conditions and the failure number of each test case in the cycle of ALT

Due to the sparsity of the failure numbers in the ALT, we can select a subset of test cases to execute within the given time limit and include all effective test cases to guarantee the testing efficiency. The lifetime of a product varies with the testing condition. Therefore, the selected test cases are sequenced as close to their originally scheduled times as possible to increase the rate of fault detection and maintain the testing effectiveness. In this work, we aim to develop a novel two-step optimization method, test case selection followed by test case sequencing, that substantially improves the testing efficiency while maintaining the testing effectiveness.

3 TEST CASE SELECTION

In the first step, the selection optimization is formulated as a modified assignment problem and solved by integer linear programming. The selection of the test cases is coded as indicator decision variables. The objective function contains two additive parts: the first is the sum of the selected test cases, each weighted by the number of failures it has exposed in history, and the second is the sum of the test cases selected in each given testing set, weighted by the priority factor of that set.
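As a rough illustration of this kind of selection, the sketch below enumerates all subsets of test cases for a single period, keeps every effective test case, respects a time limit, and maximizes a failure-weighted count plus a small bonus per selected test case. It is a simplification under stated assumptions and not the paper's model: the function name `select_test_cases`, the uniform `priority_weight`, and the brute-force enumeration are hypothetical, and repeated executions and cross-period rules are left out, so its output need not match the results reported in Table 3. The data are the Period 1 values from Tables 2.1 and 2.2 with the 18 min (1080 s) limit used in the experiment.

```python
from itertools import product


def select_test_cases(failures, run_time, time_limit, priority_weight=0.01):
    """Brute-force single-period selection (illustrative assumption only).

    Every test case with at least one historical failure is mandatory.
    Among feasible subsets, maximize the failure-weighted sum plus a small
    uniform bonus per selected test case (a stand-in for priority factors).
    Returns (choice, value), or (None, None) if no subset is feasible.
    """
    n = len(failures)
    effective = [i for i in range(n) if failures[i] > 0]
    best_choice, best_value = None, None
    for choice in product((0, 1), repeat=n):
        # Mandatory coverage of all effective test cases.
        if any(choice[i] == 0 for i in effective):
            continue
        # Time limit for the period.
        total_time = sum(run_time[i] for i in range(n) if choice[i])
        if total_time > time_limit:
            continue
        value = sum(failures[i] * choice[i] for i in range(n))
        value += priority_weight * sum(choice)
        if best_value is None or value > best_value:
            best_choice, best_value = choice, value
    return best_choice, best_value


# Period 1 data from Tables 2.1 (failures) and 2.2 (running time, seconds).
failures = [20, 0, 0, 0, 1, 1, 0, 1, 0, 2]
run_time = [200, 25, 100, 55, 70, 150, 125, 10, 60, 400]

choice, value = select_test_cases(failures, run_time, time_limit=1080)
selected = [f"TC{i + 1}" for i, x in enumerate(choice) if x]
used_time = sum(t for t, x in zip(run_time, choice) if x)
print(selected, used_time)
```

Enumeration is only practical for a handful of test cases; for realistic sizes an integer programming solver would take its place.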
The model searches for the optimal subset of the test cases that maximizes the objective function under a time limit constraint. The time limit for each period is predetermined to achieve the goal of overall testing time reduction. In addition, there are four hierarchical constraints: a constraint can be included in the optimization model only if there are still empty time slots to be filled after all the previous constraints are satisfied. These constraints are as follows:

Constraint 1. The test case subset for each period must include all effective test cases, i.e. those which have exposed any fault in history. This ensures that all historical failures are covered.
Constraint 2. All test cases must be executed under the four different periods (not necessarily in the same cycle) to ensure that each test case has experienced the four environmental conditions at least once.
Constraint 3. The effective test cases mentioned in the first constraint are allowed to appear twice in the subset for each period, in order to improve the probability of failure detection.
Constraint 4. Other unselected test cases in each period can be assigned to the subset according to a predefined priority.

The second part of the objective function in (1) guarantees the selection priority through different priority factors. The integer programming problem (1)–(6) can be solved with any standard integer optimization technique.

4 TEST CASE SEQUENCING

After the first step, we obtain an optimal subset of test cases for each testing period. In the second step, the test case sequence is determined such that it optimizes a specific goal and simultaneously ensures that the precedence relations between certain test cases are retained. To address our research problem, we need a measure that can assess and compare the effectiveness of different test case sequences.
The cost-cognizant weighted Average Percentage of Faults Detected measure (APFD_C) used in test case prioritization for software fault detection [8, 9] can be used for reference. According to the failure physics of the motherboard, the same type of failure is more likely to occur near the same point on the timeline, measured from the beginning of the cycle testing. Thus, the objective is to minimize the total deviation between the historical failure times and the scheduled times of the effective test cases. Moreover, the failure number of each test case based on historical records is used as the coefficient of the deviation, which can improve the effectiveness of the decision. A lower objective value for each period in our problem means better performance. The model for each period is constructed as (7)–(14).

The time at which a test case finishes is the decision variable in the optimization. In (7), the objective aims at decreasing the difference between the scheduled time and the historical failure time of each effective test case. A binary incidence variable denotes whether one test case is scheduled before another in the period. Constraint (8) means that the order between any two test cases is unidirectional. Constraint (9) guarantees transitivity, i.e. if a first test case precedes a second and the second precedes a third, then the first precedes the third. Constraint (10) is known as a precedence constraint [10] in scheduling problems and serves to restrict the order of certain test cases in a sequence. A precedence constraint defines a partial ordering between the test cases: a test case cannot start before the completion of each test case that is required to precede it. Constraints (8)–(10) and (14) construct a sequence of the selected test cases. After a sequence is determined, the scheduled time of each test case is calculated by constraints (11)–(13).
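For a handful of test cases, the sequencing idea can be illustrated by plain enumeration instead of an integer program. The sketch below is a minimal illustration under stated assumptions: an absolute-deviation objective weighted by failure counts, back-to-back execution from the period start, and precedence pairs `(a, b)` meaning that test case `a` must run before test case `b`. The function name `best_sequence` and the four-test-case data are hypothetical, and the objective is not expressed in percent like the ObjVal figures in Tables 4 and 5.

```python
from itertools import permutations


def best_sequence(run_time, failures, hist_stop, precedence, start=0):
    """Enumerate all orders and return (order, cost) with minimal cost.

    cost = sum over test cases of failures * |finish time - historical stop time|.
    Test cases with zero failures contribute nothing to the cost.
    Returns (None, None) if the precedence pairs admit no order.
    """
    n = len(run_time)
    best_order, best_cost = None, None
    for order in permutations(range(n)):
        position = {tc: k for k, tc in enumerate(order)}
        # Skip orders that violate a precedence pair (a before b).
        if any(position[a] > position[b] for a, b in precedence):
            continue
        # No idle time: each test case starts when the previous one ends.
        t, cost = start, 0
        for tc in order:
            t += run_time[tc]
            cost += failures[tc] * abs(t - hist_stop[tc])
        if best_cost is None or cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost


# Hypothetical data for four test cases (times in seconds).
run_time = [200, 25, 70, 10]
failures = [20, 0, 1, 1]
hist_stop = [200, 225, 450, 735]
precedence = [(1, 3)]  # test case 1 must run before test case 3

order, cost = best_sequence(run_time, failures, hist_stop, precedence)
print(order, cost)
```

The number of orders grows factorially, so this only serves to make the objective and the precedence check concrete for very small instances.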
If the incidence variable indicates that one test case is scheduled before another, the constraint reduces to a direct relation between the scheduled times of the two test cases. Otherwise, the constraint is always satisfied and places no restriction on their scheduled times. Constraint (12) makes sure that the sequence is continuous and the testing process is never idle. To make the feasible region more compact, the constant used in this constraint is fixed to a specific value in each period.

5 COMPUTATIONAL EXPERIMENT

To analyze the method proposed in this paper, we use a randomly generated instance. The studied ALT has two cycles and eight periods, and the accelerated conditions are the same as in the example shown in Fig. 1: LTLV, HTLV, LTHV, HTHV, LTLV, HTLV, LTHV, HTHV. There are 10 test cases during ALT. The total time is 160 min (20 min for each period) and it is required to be reduced by 10%. The time limit for each period therefore equals 18 min, and each period has a specified start time. Tables 2.1–2.3 present the data used for the experiment: failure numbers, test case running times, and originally scheduled times. Precedence constraints are imposed between certain test cases. The given testing sets are the same for all periods, and each set is assigned its own priority factor.

5.1 Results

The test case selection results obtained by integer linear programming are presented in Table 3. They show that the time of each period is reduced by at least 10% with the test case selection method. The covering constraints for the four accelerated conditions are also satisfied to ensure the testing effectiveness. Table 4 presents the test case sequencing results for each period.

5.2 Comparisons

To illustrate the effectiveness of the proposed method in test case sequencing, we compare the results in Table 4 with the effectiveness index (7) of the sequence without optimization. The sequence without optimization uses the subset of the test cases in Table 3, with the repeated test cases scheduled at the end of the sequence in order. A lower objective value means better performance.
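The comparison in this subsection comes down to a relative difference between two objective values. A minimal sketch follows, using the GAP definition given with Table 5 and the Period 1 objective values reported there; the function name `relative_gap` is hypothetical.

```python
def relative_gap(obj_without, obj_with):
    """Relative improvement in percent; lower objective values are better."""
    if obj_without <= 0:
        raise ValueError("baseline objective must be positive")
    return 100.0 * (obj_without - obj_with) / obj_without


# Period 1 objective values from Table 5 (without / with optimization).
gap_period1 = relative_gap(5.9415, 1.3897)
print(round(gap_period1, 2))
```

Because the tabulated objective values are rounded, a recomputed gap can differ from the tabulated one in the last digits.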
Table 5 shows that the objective values obtained with test case sequencing are lower than those of the sequence without optimization by more than 75%, where GAP is defined as (ObjVal without optimization – ObjVal with optimization) / ObjVal without optimization.

Table 2.1 – Data of the failure number for each test case
Test Case  Period1  Period2  Period3  Period4  Period5  Period6  Period7  Period8
TC1        20       20       15       14       5        3        1        2
TC2        0        0        0        0        0        0        0        0
TC3        0        0        1        0        2        0        1        0
TC4        0        0        0        0        0        0        0        0
TC5        1        0        0        0        1        0        0        0
TC6        1        0        0        0        0        0        0        0
TC7        0        0        0        0        0        0        0        0
TC8        1        0        0        0        0        0        0        1
TC9        0        0        0        0        0        0        0        0
TC10       2        0        1        0        2        0        0        1

Table 2.2 – Data of the running time (seconds) for each test case
Test Case  Period1  Period2  Period3  Period4  Period5  Period6  Period7  Period8
TC1        200      190      200      190      200      190      200      190
TC2        25       20       25       20       25       20       25       20
TC3        100      150      200      210      100      150      200      210
TC4        55       65       65       65       65       65       55       65
TC5        70       70       60       70       70       70       70       60
TC6        150      140      150      150      145      140      170      150
TC7        125      120      120      110      115      120      115      110
TC8        10       10       10       10       10       10       10       10
TC9        60       40       45       50       60       40       45       50
TC10       400      350      300      320      400      350      300      300

Table 2.3 – Data of the original time (seconds) for each test case
Test Case  Period1  Period2  Period3  Period4  Period5  Period6  Period7  Period8
TC1        200      1390     2600     3790     5000     6190     7400     8590
TC2        225      1410     2625     3810     5025     6210     7425     8610
TC3        325      1560     2825     4020     5125     6360     7625     8820
TC4        380      1625     2890     4085     5190     6425     7680     8885
TC5        450      1695     2950     4155     5260     6495     7750     8945
TC6        600      1835     3100     4305     5405     6635     7920     9095
TC7        725      1955     3220     4415     5520     6755     8035     9205
TC8        735      1965     3230     4425     5530     6765     8045     9215
TC9        795      2005     3275     4475     5590     6805     8090     9265
TC10       1195     2355     3575     4795     5990     7155     8390     9565

Table 3 – Test case selection results
Test Case   Period1  Period2  Period3  Period4  Period5  Period6  Period7  Period8
TC1         2        2        2        2        1        2        1        2
TC2         1        1        1        1        0        1        1        1
TC3         0        1        1        1        1        0        2        0
TC4         0        1        1        1        1        1        1        0
TC5         1        1        1        1        2        1        0        1
TC6         1        1        0        1        0        0        1        0
TC7         0        1        0        1        1        1        1        0
TC8         2        1        1        1        0        1        1        2
TC9         0        0        0        1        1        1        1        0
TC10        1        0        1        0        1        1        0        2
Total time  1065     955      1060     1065     1080     1055     1020     1080

Table 4 – Test case sequencing results (“TC1-” denotes the same test case as TC1)
Order   Period1  Period2  Period3  Period4  Period5  Period6  Period7  Period8
1       TC1      TC1      TC1      TC1      TC1      TC1      TC1      TC1
2       TC1-     TC1-     TC1-     TC4      TC7      TC7      TC2      TC2
3       TC5      TC6      TC4      TC3      TC9      TC4      TC7      TC10-
4       TC6      TC3      TC3      TC7      TC5-     TC10     TC4      TC8-
5       TC8-     TC4      TC5      TC8      TC4      TC1-     TC9      TC5
6       TC2      TC5      TC8      TC6      TC5      TC9      TC3-     TC1-
7       TC8      TC8      TC2      TC9      TC3      TC5      TC6      TC10
8       TC10     TC2      TC10     TC1-     TC10     TC8      TC3      TC8
9       --       TC7      --       TC5      --       TC2      TC8      --
10      --       --       --       TC2      --       --       --       --
ObjVal  1.3897   12.5654  21.9478  33.8028  38.7963  56.8720  41.4216  73.8426

Table 5 – The comparison results: scheduled sequence and original sequence
ObjVal (%)            Period1  Period2  Period3  Period4   Period5   Period6   Period7   Period8
without optimization  5.9415   54.9020  95.5198  144.0000  162.0297  242.7746  307.3171  458.6207
with optimization     1.3897   12.5654  21.9478  33.8028   38.7963   56.8720   41.4216   73.8426
Gap (%)               76.6108  77.1129  77.0227  76.5258   76.0561   76.5741   86.5216   83.8990

6 CONCLUSIONS

In conclusion, our two-step method produces optimally selected and optimally ordered subsets of test cases, which maximize the effectiveness of the cycle testing and reduce the testing time to the desired level. Our algorithm for test case sequencing is intended for small numbers of test cases, for which it obtains the exact solution. In an industrial process, evolutionary algorithms can be developed to optimize the test case sequence [11, 12, 13, 14]. Our method has been successfully applied to the router motherboard production of a major Chinese telecommunication manufacturer, satisfying the 20% testing time reduction requirement while scheduling the effective test cases around their historical failure times. Moreover, our methodology can self-adjust to new failure data and eventually realizes the automation of the optimal selection and sequencing of the motherboard reliability testing.
The promising results indicate its applicability to other similar reliability testing processes.

REFERENCES

1. Nelson, W. B., “Accelerated testing: statistical models, test plans, and data analysis”, John Wiley & Sons, 2009, Vol. 344.
2. Meeker, W. Q., & Escobar, L. A., “Statistical methods for reliability data”, John Wiley & Sons, 2014.
3. Nelson, W. B., “A bibliography of accelerated test plans”, IEEE Transactions on Reliability, 2005, 54(2), 194-197.
4. Nelson, W. B., “An updated bibliography of accelerated test plans”, In 2015 Annual Reliability and Maintainability Symposium (RAMS), IEEE, 2015, pp. 1-6.
5. Abushal, T. A., & Soliman, A. A., “Estimating the Pareto parameters under progressive censoring data for constant-partially accelerated life tests”, Journal of Statistical Computation and Simulation, 2015, 85(5), 917-934.
6. Hu, C. H., Plante, R. D., & Tang, J., “Equivalent step-stress accelerated life tests with log-location-scale lifetime distributions under Type-I censoring”, IIE Transactions, 2015, 47(3), 245-257.
7. Ahmad, N., Islam, A., & Salam, A., “Analysis of optimal accelerated life test plans for periodic inspection: the case of exponentiated Weibull failure model”, International Journal of Quality & Reliability Management, 2016, 23(8), 1019-1046.
8. Rothermel, G., Untch, R. H., Chu, C., & Harrold, M. J., “Prioritizing test cases for regression testing”, IEEE Transactions on Software Engineering, 2001, 27(10), 929-948.
9. Elbaum, S., Malishevsky, A., & Rothermel, G., “Incorporating varying test costs and fault severities into test case prioritization”, In Proceedings of the 23rd International Conference on Software Engineering, IEEE Computer Society, 2001, July, pp. 329-338.
10. Baker, K. R., & Schrage, L. E., “Finding an optimal sequence by dynamic programming: an extension to precedence-related tasks”, Operations Research, 1978, 26(1), 111-120.
11. Li, Z., Harman, M., & Hierons, R. M., “Search algorithms for regression test case prioritization”, IEEE Transactions on Software Engineering, 2007, 33(4), 225-237.
12. Kaur, A., & Goyal, S., “A genetic algorithm for regression test case prioritization using code coverage”, International Journal on Computer Science and Engineering, 2011, 3(5), 1839-1847.
13. Ahmed, A. A., Shaheen, M., & Kosba, E., “Software testing suite prioritization using multi-criteria fitness function”, In 2012 22nd International Conference on Computer Theory and Applications (ICCTA), IEEE, 2012, October, pp. 160-166.
14. Yadav, D. K., & Dutta, S., “Regression test case prioritization technique using genetic algorithm”, In: Advances in Computational Intelligence, Springer, Singapore, 2017, pp. 33-140.

BIOGRAPHIES

Hanxiao Zhang
Department of Industrial Engineering
Tsinghua University
30 Shuangqing Road
Beijing, China

e-mail: zhx17@mails.tshinghua.edu.cn

Hanxiao Zhang received a B.S. degree in Applied Mathematics from Wuhan University, China, in 2017. She is currently a Ph.D. candidate in the Department of Industrial Engineering, Tsinghua University. Her research interests include the Redundancy Allocation Problem, dynamic programming and optimization methods.

Shouzhou Liu
Department of Industrial Engineering
Tsinghua University
30 Shuangqing Road
Beijing, China

e-mail: lsz16@mails.tsinghua.edu.cn

Shouzhou Liu received an M.S. degree in Industrial Engineering from Tsinghua University, China, in 2019. His research interests include smart grid, cyber security and game theory.
Yan-Fu Li, PhD
Department of Industrial Engineering
Tsinghua University
30 Shuangqing Road
Beijing, China

e-mail: liyanfu@tsinghua.edu.cn

Yan-Fu Li is currently a professor at the Department of Industrial Engineering (IE), Tsinghua University. He is the director of the Reliability & Risk Management Laboratory at the Institute of Quality and Reliability in Tsinghua University. He obtained his B.Eng. degree in software engineering from Wuhan University in 2005 and a Ph.D. in industrial engineering from the National University of Singapore in 2010. He was a faculty member at the Laboratory of Industrial Engineering at CentraleSupélec, France, from 2011 to 2016. His current research areas include RAMS (reliability, availability, maintainability, safety and security) assessment and optimization with applications to energy systems, transportation systems, computing systems, etc. He is the Principal Investigator on several government projects, including one key project funded by the National Natural Science Foundation of China, one project in the National Key R&D Program of China, and projects supported by EU and French funding bodies. He is also experienced in industrial research: partners include EDF, ALSTOM, China Southern Grid, etc. Dr. Li has published more than 90 research papers, including more than 40 peer-reviewed international journal papers. Dr. Li is currently an associate editor of IEEE Transactions on Reliability, a senior member of IEEE and a member of INFORMS. He is a member of the Executive Committee of the Reliability Chapter of the Chinese Operations Research Society; the Executive Committee of the Industrial Engineering Chapter of the Chinese Society of Optimization, Overall Planning and Economic Mathematics; and the Committee of the Uncertainty Chapter of the Chinese Artificial Intelligence Society.