Measuring Password Strength: An Empirical Analysis

Matteo Dell'Amico, Pietro Michiardi and Yves Roudier
Institut Eurecom
{matteo.dell-amico, pietro.michiardi, yves.roudier}@eurecom.fr

October 23, 2018

Abstract

We present an in-depth analysis of the strength of the almost 10,000 passwords from users of an instant messaging server in Italy. We estimate the strength of those passwords, and compare the effectiveness of state-of-the-art attack methods such as dictionaries and Markov chain-based techniques.

We show that the strength of passwords chosen by users varies enormously, and that the cost of attacks based on password strength grows very quickly when the attacker wants to obtain a higher success percentage. In accordance with existing studies we observe that, in the absence of measures for enforcing password strength, weak passwords are common. On the other hand, we discover that there will always be a subset of users with extremely strong passwords that are very unlikely to be broken.

The results of our study will help in evaluating the security of password-based authentication means, and they provide important insights for inspiring new and better proactive password checkers and password recovery tools.

1 Introduction

Even though much has been said about their weaknesses, passwords still are, and will be in the foreseeable future, ubiquitous in computer authentication systems. A peculiar characteristic of passwords is that they inherently carry a trade-off between usability and security: while strong passwords are hard for attackers to guess, they are also difficult for the user to remember. As Richard Smith paradoxically notes, password best practices imply that "the password must be impossible to remember and never written down" [17].
In light of this, it is not very surprising that users often knowingly choose to use weak passwords or circumvent security best practices, since they perceive that following them would get in the way of doing their work [1, 15].

To think sensibly about the security of systems that use passwords, it is therefore essential to analyze the characteristics of passwords chosen by users. In this work, we analyze a large dataset containing all user passwords from an instant messaging server located in Italy. Unlike previous empirical studies on passwords [10, 7, 18, 11, 3, 4, 9], this paper evaluates the strength of passwords against a variety of state-of-the-art techniques for candidate generation. The analysis we conducted benefited from having access to the passwords in unencrypted form; this made it possible to measure the strength of all of them, including those that would hardly be cracked even by extremely powerful attackers.

We evaluate the strength of a password in terms of its associated search space size, that is, the number of attempts that an attacker would need to correctly guess it. This measure does not depend on the particular nature of the authentication system nor on the attacker capabilities: it is only related to the attack technique and to the way users choose their password. The attack model and the characteristics of the system will instead define the cost that the attacker has to pay for each single guess. By combining this cost with our measures of password strength, it becomes possible to obtain a sound cost-benefit analysis for attacks based on password guessing on an authentication system.

As we will show, different attack techniques are advisable depending on the search space size that the attacker can afford to explore.
This has to be taken into account when proposing and evaluating new techniques for reducing the search space: they may be effective only if the strength of the attack falls within a given interval.

We show that password strength has an extremely wide variance: as a first approximation, the probability of guessing a password at each attempt decreases roughly exponentially as the size of the explored search space grows. These diminishing returns imply that, in most cases, an attacker would eventually find a point where the cost of continuing the attack would not be justified by the probability of success. This study provides figures that can help designers and administrators in assessing the security of their systems by evaluating where that point resides.

2 Related Work

In this section we provide a short review of studies about password security, and make the case for the importance of measuring password strength. Attacks such as phishing or social engineering, where the user is misled into communicating the password to the attacker, are unrelated to password strength and therefore outside the scope of this work.

Pricing Via Processing. To defend against intruders who repeatedly try password after password until they obtain access to the system, it is possible to limit the rate at which the attacker is allowed to try new passwords by requiring the user to perform an action with a moderate cost. While legitimate users would need to perform this action only once every time they try to log on, an attacker would need to repeat this process many times, resulting in a disproportionate cost that renders the attack worthless.
The following measures belong to this category:

• CAPTCHAs [19], which require solving puzzles that are difficult without human intervention;

• key strengthening techniques, which require a few seconds of computation to derive a key from the password; this idea first appeared in the design of the UNIX system in the late '70s [10]. A modern key strengthening algorithm, where the computation length is configurable via the choice of a tunable parameter, is PBKDF2 [6].

It is important to note that these techniques impose a trade-off on legitimate users: if an honest user has to pay a cost $c$, the attacker must pay at most $c \cdot s$, where $s$ is the strength of the password in terms of the number of attempts needed to guess it. The measures obtained in this paper can be used to estimate costs and benefits of these systems, and thus to properly tune this $c$ parameter.

An alternative approach blocks accounts after a given number of failed attempts. This response, however, opens the door to denial of service attacks on user accounts and is ineffective unless the attack is specifically targeted towards a single user [13].

Offline Attacks. In most cases, the authentication server does not store passwords in plain text. Instead, it keeps an encrypted version of them which is conceptually analogous to a hash: when a user attempts to log on, the password they provide is encrypted and compared to the stored value. In this way, even if an attacker obtains the encrypted passwords, these cannot be used right away to log on to the system. To make it costly for the attacker to guess the password by encrypting lots of password candidates, key strengthening techniques are again applied.
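The key strengthening trade-off described above can be illustrated with a short Python sketch. It uses the standard library's `hashlib.pbkdf2_hmac`; the hash function (SHA-256), the iteration count, the sample password and the helper names `strengthen` and `attacker_cost` are illustrative assumptions, not choices made in this paper.

```python
import hashlib
import os


def strengthen(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256.

    `iterations` plays the role of the tunable parameter: raising it
    raises the per-guess cost c for honest users and attackers alike.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def attacker_cost(cost_per_guess: float, strength: int) -> float:
    """Upper bound c * s on the attacker's total cost, where s is the
    password strength measured as a number of guesses."""
    return cost_per_guess * strength


# Hypothetical values, for illustration only.
salt = os.urandom(16)
key = strengthen("example password", salt, iterations=10_000)
print(len(key), attacker_cost(cost_per_guess=0.5, strength=1_000))
```

With an estimate of $s$ such as the ones measured in this paper, `attacker_cost` gives a rough way to choose an iteration count that keeps $c$ tolerable for users while making $c \cdot s$ prohibitive.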
A tta ks based on pre-omputing the enrypted v ersion of the most lik ely passw ords [11 , 12 ℄ are defeated with the simple te hnique of salting, also kno wn sine the early da ys of UNIX: that te hnique w orks b y app ending a random n um- b er to the passw ord b efore enrypting it, and then storing this n um b er along with the enrypted pass- w ord. Sine these te hniques are based on the idea of making guessing atta ks ostly , the passw ord strength that w e are measuring is also a k ey pa- rameter when ev aluating the resiliene of a pass- w ord system to oine atta ks. P assw ord Reo v ery W e measure passw ord strength b y taking in to aoun t attempts to break them with state of the art te hniques. The free passw ord reo v ery soft w are John the R ipp er 1 iden- ties passw ords b y  he king them against a large- sized ditionary , plus a xed set of mangling rules, 1 http://www.openwall.om/j ohn / 2 su h as app ending or prep ending digits to ditio- nary w ords. A ording to Brue S hneier's de- sription [16 ℄, A essData's proprietary P assw ord Reo v ery T o olkit omplemen ts this approa h with a phoneti pattern set generated via a Mark o v  hain routine to generate meaningless but pro- nouneable passw ords. In Setion 5, w e formal- ize a metho d based on the same idea and ev aluate its merits in reduing the sear h spae for ra king passw ords. Proativ e P assw ord Che king A proativ e passw ord  he k er is a system that fores (or ad- vises) the user to  ho ose omplex enough pass- w ords. The impat of these  he k ers on atual passw ord seurit y is debatable: as W u [20 ℄ notes,  [users are℄ v ery go o d at seleting passw ords that are just `go o d enough' to pass whatev er  he king is in plae. 
The MySpace social network requires users to have at least one non-alphabetic character in their password; in a set of leaked passwords, 86% of the users complied with this requirement by appending a number at the end of their password; for 20% of them that number was a 1 [14]. Furthermore, a proactive password checker could encourage users to use non-dictionary passwords that are related to their personal life such as dates, telephone numbers or license plate numbers [1]. For a motivated attacker, these passwords are even easier to guess than dictionary words. Moreover, a strong password in the abstract could force the user to write it down and leave it in a place where an attacker can easily find it. For example, many employees hide passwords under their mouse pads at their companies [17]. In general, it seems that password strength checkers actually increase system security only if they are seen by users as a tool that helps them and not just as an additional hoop they have to jump through to get their job done.

Existing password checkers are based on quite naive metrics [2, 21]: they check password length, or resilience to brute force and dictionary-based attacks; still, they do not take into account advanced cracking techniques. Our measure of strength as search space size can be used as the basis for more effective password checkers.

Empirical Studies. It is a well-known fact that many users almost invariably choose easy-to-guess passwords; current empirical studies, however, generally focus on a single kind of attack and neglect to quantify how strong the remaining share of passwords is with respect to more general attacks. To the best of our knowledge, no other work evaluates the strength of passwords over their whole strength spectrum and against all state-of-the-art techniques.
Analyses on dictionary attacks report a percentage of broken passwords varying between 17% and 24% [10, 7, 18]. In Section 4, before investigating the remaining stronger passwords, we obtain results of similar magnitude, varying with the type and size of dictionary used.

Some studies are based on a dataset of encrypted passwords, and only report on the ones that have actually been cracked [7, 11, 3, 9]; in comparison, we had access to the plain text, which gave us information on the passwords that would be computationally impractical to break.

In a 2007 study [4], Florêncio and Herley obtained data about the passwords of about 500,000 users. That work provides interesting insights about user habits, but only quantifies password strength with a simple bit strength measure based on their length and on the use of uppercase, numeric, and non-alphanumeric characters; resilience against advanced password-cracking techniques is not taken into consideration.

3 Our Dataset

Our dataset contains the unencrypted passwords for the 9,317 registered users of an Italian instant messaging server. Storing passwords in plain text on the server is required by authentication algorithms such as CRAM-MD5². User registration is free and no policy for password strength is enforced: even the empty password is allowed. The absence of strength enforcement allows us to investigate the behavior of users when choosing their password in the absence of external requirements.

² http://tools.ietf.org/html/rfc2195

Users are free to choose any unused username when registering. A total of 269 users (2.89% of the total) use the same string as both username and password. The single most effective attempt to guess a given user's password would therefore be their own username.

Figure 1: Password length distribution.
Some users share the same password, and this results in 7,848 unique passwords. While in some cases this may be due to coincidences and use of too frequent passwords, other cases may be the consequence of the same people registering under different usernames at the same server.

The average password length is 7.86. Figure 1 shows the length distribution. Even though the full Unicode character set is usable for the passwords, only 124 different characters had been used. Frequencies of characters have very uneven distributions (see Table 1): while one character out of 11 is an 'a', the most frequent uppercase character ('A') has a frequency of approximately 1 in 500.

In Table 2, we show the matching ratio of various simple regular expressions. More than 50% of the passwords contain only lowercase characters, and less than 7% contain non-alphanumeric characters. Around 15% of them consist of a string of lowercase characters followed by a numeric appendage.

We also analyzed a set of 33,671 leaked MySpace passwords [5, 14]. Since these passwords have been obtained through a phishing attack, they include those of less security-conscious users who fell for the attack. Moreover, MySpace requires users to insert non-alphabetic characters in their passwords, and this imposes an artificial impact on passwords that users, left alone, would choose. For these reasons, we consider this dataset less representative of actual user passwords than our primary one; we however use it in this work to corroborate some of our findings by validating them on another dataset.
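The matching ratios reported in Table 2 can be computed with a few lines of Python. The sketch below is only an illustration: the expressions are the ones listed in Table 2, while the `SAMPLE` list and the function name `match_ratios` are hypothetical stand-ins (the real dataset is not reproduced here). Each expression is required to match the whole password, which is an assumption about how the ratios were obtained.

```python
import re

# Expressions from Table 2; each one must match the entire password.
PATTERNS = [
    r"[a-z]+", r"[A-Z]+", r"[A-Za-z]+", r"[0-9]+", r"[a-zA-Z0-9]+",
    r"[a-z]+[0-9]+", r"[a-zA-Z]+[0-9]+", r"[0-9]+[a-zA-Z]+", r"[0-9]+[a-z]+",
]


def match_ratios(passwords, patterns=PATTERNS):
    """Return, for each pattern, the fraction of passwords it fully matches."""
    if not passwords:
        raise ValueError("need at least one password")
    compiled = {rx: re.compile(rx) for rx in patterns}
    return {
        rx: sum(1 for p in passwords if matcher.fullmatch(p)) / len(passwords)
        for rx, matcher in compiled.items()
    }


# Hypothetical toy sample, not taken from the dataset.
SAMPLE = ["abcdef", "hello", "abc123", "123456", "AbCdEf", "ABCDEF", "123ab", "p@ss!"]

for rx, ratio in match_ratios(SAMPLE).items():
    print(f"{rx:20s} {ratio:.2%}")
```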
Character   Count    Percentage
a           6,681    9.12%
e           4,520    6.17%
o           4,484    6.12%
i           4,388    5.99%
r           3,628    4.95%
n           3,310    4.52%
l           3,095    4.23%
s           2,895    3.95%
t           2,853    3.90%
1           2,518    3.44%
c           2,367    3.23%
m           2,137    2.92%
0           1,990    2.72%
p           1,945    2.66%
d           1,813    2.48%
2           1,692    2.31%
u           1,640    2.40%
b           1,624    2.22%
3           1,487    2.03%
g           1,334    1.82%
other       16,832   22.98%

Table 1: Character distribution.

Expression         Example   Matches
[a-z]+             abcdef    51.20%
[A-Z]+             ABCDEF    0.29%
[A-Za-z]+          AbCdEf    53.74%
[0-9]+             123456    9.10%
[a-zA-Z0-9]+       A1b2C3    93.43%
[a-z]+[0-9]+       ab123     14.51%
[a-zA-Z]+[0-9]+    aB123     16.30%
[0-9]+[a-zA-Z]+    123aB     1.80%
[0-9]+[a-z]+       123ab     1.65%

Table 2: Percentage of passwords matching various regular expressions.

4 Dictionary Attack

Dictionary attack is the most effective technique to guess the weakest passwords. We evaluated password strength by using the dictionaries available in the already mentioned John the Ripper (JtR) password recovery tool. The extended dictionaries that we used are available for paid download from the program website³.

4.1 The Dictionaries

The JtR dictionaries contain words from 21 different human languages, plus a list of frequently used passwords. For some languages (like English and Italian), various dictionaries of different sizes are available: the smaller ones contain only the most frequently used words while the bigger ones also contain more obscure words, the rationale being that more common words are more likely to be chosen as passwords. Taken together, all dictionaries account for almost 4 million words. A bigger dictionary containing more than 40 million words is obtained using mangling rules that attempt to create more complex passwords by altering dictionary words, for example by juxtaposition of dictionary words or by appending a number at the end of the word.
An often-advised technique to create strong but easy-to-remember passwords is to turn phrases into passwords by extracting an acronym, possibly also using punctuation. For example, the phrase "Alas, poor Yorick! I knew him, Horatio" becomes "A,pY!Ikh,H". We also evaluated such acronyms with a dictionary created by Kuo et al. [8] that was put together by scraping websites displaying memorable phrases, such as citations and music lyrics.

4.2 Experimental Results

We simulated dictionary attacks with all the JtR dictionaries. Table 3 shows the results for the most representative instances.

The "found" column lists the percentage of passwords that appear in that dictionary; the "guess probability" column reflects the probability that a random word from that dictionary matches a random password: a rational attacker would try a word from that dictionary only if the benefit of cracking the password exceeds the inverse of that probability times the cost of the effort for trying that password.

³ http://www.openwall.com/wordlists/

Dictionary               Size          Found     Guess prob.
Frequent passwords       3,114         7.25%     2.33 · 10^-5
English 1 (l)            27,424        4.91%     1.79 · 10^-6
English 2 (l)            296,809       9.42%     3.17 · 10^-7
English 3 (l)            390,532       11.59%    2.97 · 10^-7
English extra (l)        444,678       8.03%     1.81 · 10^-7
Italian 1 (l)            63,041        3.71%     5.89 · 10^-7
Italian 2 (l)            344,074       14.89%    4.33 · 10^-7
All above                1,117,767     25.51%    2.28 · 10^-7
All JtR dictionaries     3,917,193     25.94%    6.62 · 10^-8
All JtR + mangling       40,532,676    30.12%    7.43 · 10^-9
Mnemonics [8]            406,430       1.27%     3.12 · 10^-8

Table 3: Dictionary attacks. The (l) marker stands for all-lowercase dictionaries: those containing uppercase letters match very few passwords in our dataset. The English 1, English 2 and English 3 dictionaries, like Italian 1 and Italian 2, are listed in growing size; each word belonging to a smaller dictionary is also contained in the bigger versions.

The "English extra" dictionary has a slightly misleading name: it contains words that don't appear in a regular dictionary but that users are likely to use, such as proper nouns, common misspellings or alterations of words. Many of them are language-agnostic (e.g., "Aldebaran") or come from non-English languages ("Mariela").

As the server is in Italy, most users are Italian. The amount of English words found in passwords is not particularly surprising for those who know the tendency that natives have towards the heavy use (and abuse) of English. An interesting feature is the noticeably higher density of common English words (those present in the small English 1 dictionary); that phenomenon is much less relevant with respect to Italian. We think that this is caused by the fact that most users know English as a second language, and thus are less inclined to use an obscure word as their password. This suggests that it might be good practice to use one's native language to create stronger passwords.

The most important lesson drawn from this data is the principle of diminishing returns: the probability of guessing a word sharply decreases as the dictionary grows. The 3,100-word dictionary of frequent passwords cracks 7% of those in our dataset; by increasing roughly 300 times the size of the dictionary up to more than one million and including all Italian and English words, the number of cracked passwords rises to 25%. When the number of attempts grows beyond 40 million by including other languages and mangling, only 5% more of the passwords are found.

To put it in another way, the probability of guessing a given password by trying an element of the frequent passwords dictionary is one in 43,000.
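The guess probabilities in Table 3 are consistent with dividing the fraction of passwords found by the dictionary size. The following sketch reproduces that arithmetic and the rational-attacker condition stated above. The function names are hypothetical, and the marginal computation assumes that the smaller dictionary is entirely contained in the larger one, which Table 3 states only for the English and Italian series.

```python
def guess_probability(found_fraction: float, dictionary_size: int) -> float:
    """Chance that one random dictionary word matches one random password."""
    return found_fraction / dictionary_size


def marginal_guess_probability(found_small: float, size_small: int,
                               found_big: float, size_big: int) -> float:
    """Per-guess success chance for the words that the bigger dictionary adds,
    assuming the smaller dictionary is a subset of the bigger one."""
    return (found_big - found_small) / (size_big - size_small)


def worth_trying(benefit: float, cost_per_guess: float, guess_prob: float) -> bool:
    """A guess pays off only if the benefit exceeds cost / probability."""
    return benefit > cost_per_guess / guess_prob


# Figures taken from Table 3.
frequent = guess_probability(0.0725, 3_114)
extra = marginal_guess_probability(0.2551, 1_117_767, 0.3012, 40_532_676)
print(f"frequent passwords: one in {1 / frequent:,.0f}")
print(f"beyond Italian and English: one in {1 / extra:,.0f}")
```

Under these assumptions, the benefit needed to justify each additional guess grows by several orders of magnitude between the smallest and the largest dictionary.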
On the other hand, after having tried all the frequent passwords and the Italian and English dictionaries, the probability of guessing by using another dictionary word is less than one in 500 million! Since the guessing probability decreases so sharply, it is conceivable that in many cases it won't be worth trying a bigger dictionary for the attacker.

We also observe that the mnemonic dictionary is quite ineffective. This may be due to several reasons: first, few users actually use mnemonics for their passwords; second, they are actually much harder to break with dictionary attacks. Moreover, we are not able to ascertain whether the habit of choosing English passwords for Italian users would carry over to the use of mnemonics. Our data is, at the moment, insufficient to point towards one reason or the other.

5 Markov Chain-Based Attack

The fact that dictionaries fall short does not mean that an attacker would need to resort to an exhaustive brute-force attack: some passwords are much more likely to be chosen than others. As seen in Section 3, there is a very uneven distribution of character choice. Moreover, other regularities exist: passwords are usually made of pronounceable sub-strings and/or sequences of keys that are close on the keyboard.

In this section, we describe and validate an attack based on Markov chain-based modeling of the frequencies of sub-strings with parametric length $k$, or $k$-graphs. This allows us to label candidate passwords with variable probabilities, where strings that are labeled as more likely are checked first.

Some password generating utilities actually use this kind of modeling to obtain meaningless but pronounceable passwords on the grounds that they're easier to remember, thus sacrificing some strength for usability⁴.
5.1 The Technique

We base our formalization on the techniques shown in [11], extending the model so that it applies to sub-strings of length 3 and more. This model represents a password choice as a sequence of random events: first, the length of the password is chosen according to a given probability distribution; then, each character of the string gets extracted according to a conditional probability depending on the previous $k - 1$ characters.

We encode the characteristics of passwords via two functions, $\lambda$ and $\nu$. $\lambda$ represents the length distribution of passwords so that, for example, $\lambda(8)$ is the probability that the password has length 8. $\nu$, instead, represents the conditional probability of each $k$-graph with respect to the corresponding $(k-1)$-graph: $\nu(c_1 \ldots c_k \mid c_1 \ldots c_{k-1})$ is the probability that the character $c_k$ follows the sub-string $c_1 \ldots c_{k-1}$. For $k = 1$, $\nu(c)$ expresses the frequency of $c$, that is, the probability that a random character in a password coincides with $c$.

By choosing $k = 1$, thus focusing on character frequency, the probability $P_1(\alpha)$ that our model will generate a string $\alpha$ (where its length is $|\alpha|$ and its $i$-th character is $\alpha_i$) is

$$P_1(\alpha) = \lambda(|\alpha|) \prod_{1 \le i \le |\alpha|} \nu(\alpha_i).$$

To derive $P_k$ with $k \ge 2$, we will adopt the convention that $\alpha_i = \bot$ whenever $i \le 0$, where "$\bot$" is a special character not allowed to appear in passwords. For example, we write the probability that a password starts with the "a" character as $\nu(\text{"}\bot a\text{"} \mid \text{"}\bot\text{"})$; the probability that a "b" follows an initial "a" is instead $\nu(\text{"}\bot ab\text{"} \mid \text{"}\bot a\text{"})$. Given this, we can formalize the digraph-based probability $P_2$ as

$$P_2(\alpha) = \lambda(|\alpha|) \prod_{1 \le i \le |\alpha|} \nu(\alpha_{i-1} \alpha_i \mid \alpha_{i-1})$$

and, in general,

$$P_k(\alpha) = \lambda(|\alpha|) \prod_{1 \le i \le |\alpha|} \nu(\alpha_{i-k+1} \ldots \alpha_i \mid \alpha_{i-k+1} \ldots \alpha_{i-1}).$$

⁴ See for example gpw (http://www.multicians.org/thvv/tvvtools.html#gpw), apg (http://www.adel.nursat.kz/apg/), otp (http://www.fourmilab.ch/onetime/).

5.1.1 Estimating $\nu$ and $\lambda$

It is obviously important that the probabilities encoded in the $\lambda$ and $\nu$ functions are representative of the real characteristics of passwords. We do this by adopting a set of strings as a training set and setting $\lambda(x)$ as the fraction of strings of length $x$. Denoting $C$ as the character set and $\sigma(c_1 \ldots c_k)$ as the number of occurrences of the sub-string $c_1 \ldots c_k$ in the whole training set, we set

$$\nu(c_1 \ldots c_k \mid c_1 \ldots c_{k-1}) = \frac{\sigma(c_1 \ldots c_k)}{\sum_{c \in C} \sigma(c_1 \ldots c_{k-1} c)}.$$

In the absence of a representative training set of passwords, a dictionary can be used as in [11]. As we will experimentally show in Section 5.2, using passwords themselves as training set ultimately results in a better model. In this case, when computing $P_k(\alpha)$ in our experiments, $\alpha$ itself must be removed from the training set and should not be taken into account when computing the values of $\lambda$ and $\sigma$.

As mentioned in Section 3, some users share the same password. This might be due to chance and to the fact that those passwords are quite trivial; another possibility is that they come from the same user registering many accounts and using the same passwords for all of them. In the latter case, an attacker would not have access to the password in a representative training set, and it would be correct for our purposes to remove all copies of the password from the training set. Since we cannot discriminate between the two cases, we will adopt a conservative approach that may result in overestimating the capabilities of the attacker, therefore discarding only a single copy of the password from the training set.
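The estimation of $\lambda$ and $\nu$ and the evaluation of $P_k(\alpha)$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used for the experiments: the names `train` and `probability` are hypothetical, $\sigma$ is counted on passwords padded with $k - 1$ copies of $\bot$, and the removal of $\alpha$ from the training set is left to the caller.

```python
from collections import Counter

PAD = "\u22a5"  # the special character ⊥, assumed absent from all passwords


def train(passwords, k):
    """Count password lengths, k-graphs (sigma) and their (k-1)-graph contexts."""
    lengths = Counter(len(p) for p in passwords)
    kgraphs = Counter()   # sigma(c_1 ... c_k)
    contexts = Counter()  # sum over c of sigma(c_1 ... c_{k-1} c)
    for p in passwords:
        padded = PAD * (k - 1) + p
        for i in range(len(p)):
            gram = padded[i:i + k]
            kgraphs[gram] += 1
            contexts[gram[:-1]] += 1
    return lengths, kgraphs, contexts, len(passwords)


def probability(alpha, k, model):
    """P_k(alpha) = lambda(|alpha|) times the product of conditional nu values."""
    lengths, kgraphs, contexts, n = model
    p = lengths[len(alpha)] / n  # lambda(|alpha|)
    padded = PAD * (k - 1) + alpha
    for i in range(len(alpha)):
        gram = padded[i:i + k]
        if kgraphs[gram] == 0:  # unseen k-graph: the model assigns probability 0
            return 0.0
        p *= kgraphs[gram] / contexts[gram[:-1]]
    return p


# Hypothetical toy training set.
model = train(["ab", "ab", "ac", "b"], k=2)
print(probability("ab", 2, model), probability("ba", 2, model))
```

To follow the leave-one-out procedure described above, the caller would train on the dataset minus one copy of the password being evaluated.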
A model with higher values of $k$ should be more accurate, but the process of creating it is more difficult and expensive. In the extreme, a model with $k$ exceeding the maximum password length would explicitly list the probability of occurrence of each possible password: this would require prohibitive training set size and storage capabilities (the required space is of the order of $|C|^k$, where $|C|$ is the size of the character set). With limited resources, when a $k$-graph does not appear in the training set due to under-sampling, then the probability of a password containing that $k$-graph is computed as 0. Such a model would therefore never generate the required password.

Algorithm 1 Explicit counting of search space size.

    function size(c_1 … c_{k−1}, l, t)    ▷ c_1 … c_{k−1}: state, l: string length, t: threshold
        if l = 0 then return 1
        s ← 0
        for all c ∈ C do
            p ← ν(c_1 … c_{k−1} c | c_1 … c_{k−1})
            if p ≥ t then
                s ← s + size(c_2 … c_{k−1} c, l − 1, t / p)
        return s

    function total_size(t)    ▷ t: threshold
        return Σ_i size(⊥ … ⊥, i, t / λ(i))

5.1.2 Computing The Search Space Size

So far, we have described a model that assigns probabilities to passwords, with the aim of measuring how likely it is that a user would actually select a given password. A rational attacker would use this model by enumerating candidate passwords starting with the most likely ones and continuing in decreasing order of probability. In order to measure the search space size that such a strategy would need to explore before finding a given password, we have to find out how many unsuccessful candidates would be generated before the correct one: if the Markovian model labels the probability of a password as $p$, its associated search space size would therefore be the number of strings with probability of occurrence higher than or equal to $p$.
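The search space size defined above can be computed by direct enumeration. The sketch below is one possible Python rendering of that recursive counting, under a few assumptions: $\nu$ is given as a dictionary mapping each $(k-1)$-character state to the conditional probabilities of the next character, $\lambda$ as a dictionary from length to probability, and the remaining threshold is divided by the probability already spent at each step. The names `size` and `total_size` and the toy model are illustrative.

```python
PAD = "\u22a5"  # the special character ⊥ used to pad the initial state


def size(nu, state, length, threshold):
    """Count the strings of `length` characters that can follow `state`
    with conditional probability greater than or equal to `threshold`."""
    if length == 0:
        return 1 if threshold <= 1 else 0
    total = 0
    for char, p in nu.get(state, {}).items():
        if p >= threshold:
            next_state = (state + char)[1:] if state else ""
            total += size(nu, next_state, length - 1, threshold / p)
    return total


def total_size(nu, lam, k, threshold):
    """Number of strings, over all lengths, with probability >= threshold."""
    start = PAD * (k - 1)
    return sum(size(nu, start, n, threshold / p_len)
               for n, p_len in lam.items() if p_len > 0)


# Hypothetical toy model with k = 1 (character frequencies only).
nu = {"": {"a": 0.5, "b": 0.3, "c": 0.2}}
lam = {1: 0.5, 2: 0.5}
print(total_size(nu, lam, 1, 0.07))
```

Because the threshold is a floating point value that changes at almost every call, this version cannot usefully cache intermediate results, and its running time grows with the number of strings counted.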
Explicit Counting. The most obvious system for computing the search space size up to a given threshold is to plainly enumerate it. In Algorithm 1, we show how this can be implemented with a simple recursive algorithm.

Algorithm 2 Approximation of search space size.

    function approx_size(c_1 … c_{k−1}, l, t)    ▷ c_1 … c_{k−1}: state, l: string length, t: log-threshold
        if l = 0 then return 1
        s ← 0
        for all c ∈ C do
            t′ ← t − ⌈−log_b ν(c_1 … c_{k−1} c | c_1 … c_{k−1})⌉
            if t′ ≥ 0 then
                s ← s + cache_size(c_2 … c_{k−1} c, l − 1, t′)
        return s

    function cache_size(c_1 … c_{k−1}, l, t)    ▷ We store results from approx_size in a cache K
        if (c_1 … c_{k−1}, l, t) ∉ K then
            K(c_1 … c_{k−1}, l, t) ← approx_size(c_1 … c_{k−1}, l, t)
        return K(c_1 … c_{k−1}, l, t)

    function total_size(t)    ▷ t: threshold
        return Σ_i cache_size(⊥ … ⊥, i, ⌊−log_b (t / λ(i))⌋)

Approximate Estimation. As the search space grows, the above approach becomes extremely expensive and should be replaced with an approximate estimation method [11]. By fixing a base $b > 1$, any probability $p$ can be approximated as $b^{-l}$ for an integer value $l \ge 0$. Choosing $l = \lfloor -\log_b p \rfloor$ approximates $p$ by excess, while $l = \lceil -\log_b p \rceil$ approximates by defect. To help intuition, $l$ can be seen as a discrete password strength value, which can be computed as the sum of strengths for each $k$-graph contained in the password. Values of $b$ closer to 1 result in a finer granularity for our approximation, at the cost of an increase in computation.

By adopting such an alteration, the computation gets a big speedup by memoizing the parameters and results of each approx_size call, and returning them when the function is called again with the same parameters.
This couldn't be done with the former version, since the $t$ threshold parameter of the size function is a floating point number which is very likely to be different at each function call.

Since we are aiming for a conservative estimate for the search space that approximates by excess the capabilities of the attacker, we use approximations to obtain a lower limit for the search space size. To do this, working on the discrete strength values, we round the starting log-threshold down (by defect) and the strength $-\log_b \nu$ of every $k$-graph up (by excess). The result of these modifications is the approximate function defined in Algorithm 2.

Figure 2: Search space size versus probability threshold for the $k$-graph Markovian model (log-log plot; x-axis: search space size, y-axis: threshold; one curve for each of $k = 1, \ldots, 5$). The plotted curves show the result of the exact computation of Algorithm 1, while the points marked by crosses are the result of the approximation of Algorithm 2.

5.2 Experimental Results

This section describes the results of the experiments described above when applied to our password dataset. Unless otherwise specified, we use the passwords themselves as training set.

Search Space Size Versus Probability Threshold. In Figure 2 we show the size of the search space containing strings labeled with a probability greater than or equal to a given probability threshold. This is computed for different values of $k$ and using both the exact count and the approximate measure from Algorithm 2. We used a parameter $b = 1.01$; with that choice, we obtained a relative error of the order of 5% (not noticeable in the figure due to the log-log scale).
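The memoized approximation of Algorithm 2 can be sketched in Python with `functools.lru_cache` acting as the cache. This is an illustrative rendering under assumptions, not the code used for the measurements above: $\nu$ and $\lambda$ are passed as dictionaries, the factory name `make_counter` is hypothetical, strengths are rounded up and the starting budget is rounded down so that the result is a lower bound on the exact count.

```python
import math
from functools import lru_cache

PAD = "\u22a5"  # the special character ⊥ used to pad the initial state


def make_counter(nu, lam, k, base=1.01):
    """Build an approximate total_size(threshold) for the model (nu, lam)."""
    # Discrete strength of each k-graph, rounded up.
    strength = {
        state: {c: math.ceil(-math.log(p, base)) for c, p in probs.items() if p > 0}
        for state, probs in nu.items()
    }

    @lru_cache(maxsize=None)
    def approx_size(state, length, budget):
        # budget is an integer log-threshold, so equal calls recur and are cached
        if length == 0:
            return 1
        total = 0
        for char, cost in strength.get(state, {}).items():
            remaining = budget - cost
            if remaining >= 0:
                next_state = (state + char)[1:] if state else ""
                total += approx_size(next_state, length - 1, remaining)
        return total

    def total_size(threshold):
        start = PAD * (k - 1)
        total = 0
        for n, p_len in lam.items():
            if p_len <= 0:
                continue
            budget = math.floor(-math.log(threshold / p_len, base))
            if budget >= 0:
                total += approx_size(start, n, budget)
        return total

    return total_size


# Hypothetical toy model with k = 1.
count = make_counter({"": {"a": 0.5, "b": 0.3, "c": 0.2}}, {1: 0.5, 2: 0.5}, k=1)
print(count(0.07))
```

Because every argument of `approx_size` is now a string or an integer, repeated sub-problems hit the cache, which is what makes the approximate count tractable for large search spaces.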
By choosing $1 \le k \le 3$ (i.e., basing the model on sub-strings of lengths 1 to 3), the probabilities of strings generated by the model roughly follow a power law. It is interesting to note that this mirrors frequencies of words in human natural languages, which obey the power law as well [22]. For $k \ge 4$, the number of candidate strings grows markedly more slowly as the probability threshold decreases; this is due to the fact that each $k$-graph is represented by a low number of strings in the training set, and the number of strings that can be obtained by combining $k$-graphs that are present in the dataset is limited. We conjecture that, with a bigger training set, we would obtain a power-law distribution also in this case.

In the following, we use the approximate approach to estimate the search space size where the exact value becomes either impractical or impossible to compute. We compute data points for each $p = 10^{-i}$ threshold ($i$ being an integer) and interpolate with the power law that connects the points (a straight line in the log-log plot).

Figure 3: Search space size versus fraction of guessed passwords (x-axis: search space size, logarithmic; y-axis: fraction of cracked passwords; one curve for each of $k = 1, \ldots, 5$).

Password Strength. In Figure 3, we plot the fraction of passwords guessed as a function of the search space size.

With higher values of $k$, we obtain better results for the weaker passwords due to the more precise modeling obtained in this case. However, the passwords that include $k$-graphs not represented in the training set cannot be guessed. For stronger passwords, methods based on smaller $k$ values become more effective because they generalize better.
In pratie, the opti- 1 0 0 1 0 2 1 0 4 1 0 6 1 0 8 1 0 1 0 1 0 1 2 1 0 1 4 1 0 1 6 1 0 1 8 1 0 2 0 1 0 2 2 1 0 2 4 Search space size 0 .0 0 .2 0 .4 0 . 6 0 .8 1 . 0 Fraction of cracked passwords k = 1 k = 2 k = 3 k = 4 k = 5 Figure 4: Sear h spae size v ersus fration of guessed passw ords on the MySpae dataset. mal strategy dep ends on the resoures of the at- ta k er, measured b y the n um b er of attempts that an b e tried. The diminishing returns eet that w e diso v- ered for ditionary atta ks also applies to this te h- nique: ev en when  ho osing the b est v alue of k for ea h ase, around 100,000 andidates need to b e tried in order to guess 20% of the passw ords ( k = 5 ); this n um b er rises to roughly 1.1 billions andidates for a suess rate of 40% ( k = 3 ); the sear h spae needed to break 90% of the passw ords gro ws to appro ximately 3 · 10 17 ( k = 2) . With su h a h uge v ariane in the size of the sear h spae, it seems that no reasonable atta k based on passw ord guessing w ould sueed in guessing all passw ords  exepting those ases where users are artiially fored to limit passw ord strength, for example b y imp osing a maxim um length. MySpae P assw ords In Figure 4, w e rep eat our measuremen ts using MySpae passw ords in the plae of our main dataset b oth as training set and as guessed passw ords. W e obtain qualitativ ely sim- ilar results  in partiular, higher v alues of k are more appropriate as training sets for w eak er pass- w ords, and the diminishing returns priniple holds. F rom a quan titativ e p oin t of view, the sear h spae for w eak passw ord is bigger, while it is smaller for stronger passw ords. 
We think that this is mainly due to the particularities of the dataset: weak passwords are made stronger by the requirement of non-alphabetic characters; strong passwords created by security-conscious users, on the other hand, are under-represented since such users are not likely to fall victim to a phishing attack.

Figure 5: Comparison of brute force and Markov-model based attacks (curves: brute force, k = 1, k = 2; x axis: search space size; y axis: fraction of missed passwords).

Brute Force. In Figure 5, we compare the brute force approach with our Markovian modeling. The brute force approach starts by trying the empty password, then proceeds with enumerating all possible passwords with increasing length. The full Unicode character set currently has more than 99,000 characters (see footnote 5), but many of them are very rare and definitely unlikely in a password; to account for this, we again took a conservative approach overestimating the attacker capabilities, and took into account only the 124 characters that we have found in our dataset. In all but the most extreme cases, the Markovian model proves more efficient by orders of magnitude. It is not before $10^{40}$ candidates (and having found 99.7% of the passwords) that a brute force approach becomes more effective than the Markovian model with k = 1 (character frequencies).
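To give a feeling for the size of the brute force search space described above, the following sketch counts the candidates enumerated by increasing length over an alphabet of a given size. The function names are hypothetical; the only inputs taken from the text are the 124-character alphabet and the figure of $10^{40}$ candidates.

```python
def brute_force_candidates(alphabet_size, max_length):
    """Number of strings of length 0..max_length (empty password included)."""
    return sum(alphabet_size ** n for n in range(max_length + 1))


def length_reached(alphabet_size, attempts):
    """Longest length L such that every string of length <= L fits within
    `attempts` guesses, enumerating by increasing length (attempts >= 1)."""
    total, length = 1, 0  # the empty password is the first guess
    while total + alphabet_size ** (length + 1) <= attempts:
        length += 1
        total += alphabet_size ** length
    return length
```

Under these assumptions, `length_reached(124, 10**40)` evaluates to 19: a budget of $10^{40}$ candidates exhausts all passwords of up to 19 characters over the 124 symbols observed in the dataset.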
A search space of $10^{40}$ candidates is well beyond the capabilities of any realistic attacker: to put this in context, a cluster of a thousand 10 GHz machines would need more than $3 \cdot 10^{19}$ years to reach that number of iterations, even assuming that they are able to try a password for each clock cycle.

Footnote 5: http://www.unicode.org/press/pr-ucd5.0.html

Figure 6: Scatter plots of password length (Y axis) versus strength (associated search space size on X axis), with one panel for each of k = 1, 2, 3 and 4.

Strength and Password Length. In Figure 6, we highlight the relationship between a password's length and its strength. As the graphs show, the assumption that longer passwords are stronger can only be regarded as a rule of thumb: a short password containing infrequent characters and/or sequences thereof can actually be stronger than a noticeably longer one. The correlation between length and strength becomes weaker as the k parameter grows: long but weak passwords may be based on predictable long patterns that are less efficiently predicted by models based on lower k values. For example, it is quite likely that the "abcd" sequence is followed by an "e"; a model based on digraphs, though, cannot capture this and can only model which character is more likely to follow a "d".

Training Sets. Figure 7 illustrates how the choice of training sets affects the attack performance.
The training sets used are our sets of usernames and passwords, the MySpace leaked passwords, the JtR common password dictionary, and Italian and English dictionaries.

The most effective training set is the real password set. The common passwords dictionary from JtR is more representative of real passwords than standard dictionaries, since it contains combinations of characters, such as punctuation and digits, that don't appear in standard dictionaries. Still, it appears that average passwords do not closely resemble the most common ones.

Figure 7: Comparison of various training sets for guessing passwords in our dataset (k = 2). Curves: Passwords, MySpace, Usernames, JtR common, English, Italian; x axis: search space size, logarithmic; y axis: fraction of cracked passwords.

The case of MySpace passwords as training set is interesting: they are close to the performance of our password dataset for strong passwords, but they do not represent weak ones well. We believe this is due to the over-representation of non-alphabetic characters, which are required to be present in MySpace passwords. The difference in coverage on strong passwords (around 5% with equivalent search space size) can also be attributed to this feature, as well as to the following factors:

• Difference in computer literacy: the MySpace sample contains only the victims of a phishing attack;
• Difference in language: MySpace users are distributed worldwide.

If a representative training set of real passwords is not available to the attacker, usernames are by far the most effective training set. It appears that, when users are asked to provide a username and a password, they employ similar criteria.
This is quite surprising since the two strings need to satisfy very different, and arguably conflicting, criteria: good usernames are easily memorable, while a strong password has to be as difficult to guess as possible.

Figure 8: Comparison of complexity between passwords and usernames. Curves: usernames and passwords, each for k = 2 and k = 3; x axis: search space size, logarithmic; y axis: fraction of cracked passwords.

Usernames. The previous result suggests a consideration: usernames and passwords are chosen simultaneously, when registering a new account. A user wants both strings to be memorable, since the two are needed in order to log on successfully. However, while there is no incentive in choosing complex usernames, a security-conscious user will commit some effort to make his password more complex. The difference in complexity between usernames and passwords is therefore a way to measure the effort that users willingly put into making their passwords more complex: while usernames can be very long or difficult to guess, this is not likely to happen as the result of a conscious attempt to do so.

In Figure 8, we compare the search space size associated with usernames and passwords. Matching what we have done with passwords, the training set used to guess a given username consists of all the usernames except the one under scrutiny. It turns out that the effort that users put into creating complex passwords is measurable, but it is overall quite weak: given a choice for k and a search space size, the percentage of cracked usernames never exceeds that of cracked passwords by more than 15%.
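The leave-one-out training described above (guessing each username with a model trained on all the other usernames) can be sketched as follows. This is only one possible way to realize it, assuming the model is built from k-graph counts; the padding markers and function names are hypothetical.

```python
from collections import Counter

START, END = "\x02", "\x03"  # hypothetical padding markers


def kgraph_counts(word, k):
    """Count the k-graphs of `word`, padded at both ends."""
    padded = START * (k - 1) + word + END
    return Counter(padded[i:i + k] for i in range(len(padded) - k + 1))


def leave_one_out_counts(corpus, index, k):
    """k-graph counts over the whole corpus except the item at `index`."""
    total = Counter()
    for word in corpus:
        total.update(kgraph_counts(word, k))
    # Counter subtraction drops entries whose count falls to zero, so
    # k-graphs seen only in the held-out item disappear from the model.
    return total - kgraph_counts(corpus[index], k)
```

For the corpus `["alice", "bob", "alina"]` and k = 2, holding out `"alice"` leaves the digraph `"al"` with a count of 1 and removes `"ic"` entirely, so the held-out string cannot contribute to its own guessing model.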
6 Combined Strategy

Our results confirm that no single strategy or technique is most effective in reducing the search space: dictionaries are most effective in discovering the weakest passwords; the coverage (fraction of passwords that are in the dictionary) grows as the dictionary size grows, but this entails a loss in precision (fraction of dictionary items that are actual passwords). The Markov-chain based technique should be used when dictionaries are exhausted. Higher values of k obtain better results at first, but after a number of attempts they become quite ineffective.

No single strategy is the best one for all cases; this, in fact, validates the approach taken by password recovery systems that adopt bigger and bigger dictionaries in cascade, and resort afterwards to Markov-based techniques. In this section, we summarize our results by presenting the results that an attacker would be able to obtain by using such a technique.

Consistently with our approach of estimating the capabilities of the attacker by excess in the face of uncertainty, we assume that the attacker has access to a password training set which is as effective as the one we obtain from the clear text. Furthermore, we also assume that the attacker is able to predict the effectiveness of techniques that we measured in Sections 4 and 5. Based on this knowledge, using the training set as a dictionary, the strategy for the dictionary-based first part of the attack is as follows:

1. Try the username;
2. Try the common passwords dictionary;
3. Try all passwords in the training set;
4. Try the English 1 dictionary;
5. Try the Italian 1 and 2 dictionaries;
6. Try the English 2, 3 and extra dictionaries;
7. Try all remaining JtR dictionaries;
8. Try mangling rules.

If this approach is not sufficient, one should resort to the Markovian model.
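The cascade of dictionaries listed above can be sketched as a simple loop that tracks cumulative attempts and coverage. The sketch is a simplification under stated assumptions: it leaves out the per-user username guess and the mangling rules, counts a candidate shared by several dictionaries only once, and uses a hypothetical function name and toy data rather than the dictionaries used in the paper.

```python
def cascade_attack(dictionaries, passwords):
    """Try dictionaries in order and report cumulative results.

    `dictionaries` is an ordered list of (name, candidates) pairs.
    Returns a list of (name, cumulative_attempts, fraction_cracked).
    """
    tried = set()  # candidates shared by several dictionaries count once
    targets = list(passwords)
    results = []
    for name, candidates in dictionaries:
        tried.update(candidates)
        cracked = sum(1 for password in targets if password in tried)
        results.append((name, len(tried), cracked / len(targets)))
    return results
```

Each row of the result has the same shape as a row of a cumulative summary: the step name, the number of distinct candidates tried so far, and the fraction of target passwords found so far.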
Figure 9: Sear h spae size for passw ords that are not found in an y ditionary . In the inner frame, detail on the rst iterations. In Figure 9, w e sho w the sear h spae for the passw ords that ha v e not b een diso v ered within an y ditionary . With resp et to gure 3, there is a sharp derease in the suess rate un til the sear h spae size rea hes appro ximately 10 8 . In partiular, te hniques with k = 5 and k = 4 are unsuessful to break more than, resp etiv ely , roughly 1% and 4% of the passw ords. This mat hes with the in tu- ition that ditionary-based atta ks are more useful against the less omplex passw ords. Based on the data represen ted in Figure 9, an eien t strategy for the atta k w ould b e as follo ws: 1. T ry 500,000 andidates with the mo del based on k = 5 ; 2. T ry 7,000,000 andidates with k = 4 ; 3. T ry 700,000,000 andidates with k = 3 ; 4. T ry 7 · 10 16 andidates with k = 2 ; 5. Con tin ue with k = 1 . In T able 4, w e summarize the sear h spae size and p eren tage of ra k ed passw ords for ea h of these steps. This is the answ er to our original ques- tion: ho w man y attempts an atta k er w ould need in order to guess a giv en p eren tage of the passw ords. 
By integrating these figures with system-specific knowledge such as the computational cost needed to perform a single guess and the amount of resources that the attacker has access to, it is possible to estimate the percentage of passwords that are vulnerable to a given attack.

Step                       #attempts              Cracked
Username                   1                      2.88%
Common passwords           3,115                  9.95%
Training set               10,431                 28.83%
English 1                  36,574                 30.51%
Italian 1                  98,511                 32.25%
Italian 2                  373,834                36.31%
English 2                  632,613                37.18%
English 3                  722,215                37.69%
English extra              1,123,841              40.07%
JtR - all dictionaries     3,923,660              41.14%
Mangling                   40,538,747             44.26%
Markov chain - k = 5       41,070,093             45.05%
Markov chain - k = 4       48,051,199             46.76%
Markov chain - k = 3       ~750,000,000           58.10%
Markov chain - k = 2       ~$7 \cdot 10^{16}$     91.06%
Markov chain - k = 1       ~$10^{40}$             99.71%

Table 4: Cumulative number of attempts and of guessed passwords for the multi-step approach. Candidates that would be checked in more than one dictionary are counted only once. For the Markov chain technique with k ≤ 3, the search space has not been generated explicitly and its size has been approximated with Algorithm 2.

7 Conclusion

As the bibliography of this work witnesses, the first studies on password cracking date back almost 30 years. Still, the techniques that are used in state of the art password-cracking applications are quite simple: decades of research suggest that it is possible to do better than applying simple Markov chain-based modeling techniques.

The results of our measurement study may provide an explanation as to why not much has been done in this direction: the diminishing returns effect implies that, even if the size of the search space decreases by orders of magnitude, the percentage of passwords that an attacker would be able to crack in a given number of attempts would increase only by a non-impressive percentage.
In addition, it is likely that an innovative strategy for exploring the search space would improve over the state of the art only for a given interval of search space sizes; the low-cost/high-reward part of the search space is already easily covered by dictionaries of frequent passwords. When such an attack proves ineffective, an attacker could change target to find an easier prey, or use other means of attack which are not based on the password strength, such as social engineering, phishing, or exploitation of vulnerabilities in software or in the protocol: as the effort invested in an unsuccessful attack grows, the attack is more and more likely to be unsuccessful in the future as well.

We focused on the strength of passwords chosen by users in the absence of password strength enforcement. As pointed out in Section 2, it is debatable whether systems enforcing password complexity actually increase security: they may instead lead users to circumvent the enforcement techniques by adopting insecure behavior. To assess this, measuring password complexity with and without enforcement should be coupled with an analysis of user behavior.

Another interesting question yet to be addressed regards the correlation between password strength and the domain the passwords are related to. In particular, how will the password strength of a user vary if getting an account compromised would result in a noticeable loss? In [4], some evidence that users actually choose better passwords for accounts related to valuable assets (e.g., PayPal) is reported. Unfortunately, the bit-strength measure adopted is quite simple. Further investigations would be required to obtain actual figures in terms of attacker costs in order to break an account.
8 Acknowledgements

The authors would like to thank Sebastian Porst and Roger Grimes for having shared the set of MySpace passwords, and Cynthia Kuo, Sasha Romanosky, and Lorrie F. Cranor for having shared their mnemonics dictionary.

References

[1] A. Adams and M. A. Sasse. Users are not the enemy. Commun. ACM, 42(12):40-46, December 1999.
[2] M. Bishop. Improving system security via proactive password checking. Computers & Security, 14(3):233-249, 1995.
[3] J. A. Cazier and D. B. Medlin. Password security: An empirical investigation into e-commerce passwords and their crack times. Information Security Journal: A Global Perspective, 15(6):45-55, 2006.
[4] D. Florencio and C. Herley. A large-scale study of web password habits. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 657-666, New York, NY, USA, 2007. ACM.
[5] R. A. Grimes. MySpace password exploit: Crunching the numbers (and letters). InfoWorld online article, November 2006.
[6] B. Kaliski. RFC 2898 - PKCS #5: Password-based cryptography specification version 2.0. Technical report, IETF, September 2000.
[7] D. V. Klein. Foiling the cracker: A survey of, and improvements to, password security. In Proceedings of the 2nd USENIX UNIX Security Workshop, 1990.
[8] C. Kuo, S. Romanosky, and L. F. Cranor. Human selection of mnemonic phrase-based passwords. In SOUPS '06: Proceedings of the second symposium on Usable privacy and security, pages 67-78, New York, NY, USA, 2006. ACM Press.
[9] S. Marechal. Advances in password cracking. Journal in Computer Virology, 4(1):73-81, February 2008.
[10] R. Morris and K. Thompson. Password security: a case history. Commun. ACM, 22(11):594-597, November 1979.
[11] A. Narayanan and V. Shmatikov. Fast dictionary attacks on passwords using time-space tradeoff. In CCS '05: Proceedings of the 12th ACM conference on Computer and communications security, pages 364-372, New York, NY, USA, 2005. ACM Press.
[12] P. Oechslin. Making a faster cryptanalytic time-memory trade-off. In Advances in Cryptology - CRYPTO 2003, pages 617-630, 2003.
[13] B. Pinkas and T. Sander. Securing passwords against dictionary attacks. In CCS '02: Proceedings of the 9th ACM conference on Computer and communications security, pages 161-170, New York, NY, USA, 2002. ACM Press.
[14] S. Porst. A brief analysis of 40,000 leaked MySpace passwords. Blog post at http://www.the-interweb.com/serendipity/index.php?/archives/94-A-brief-analysis-of-40,000-leaked-MySpace-passwords.html, November 2007.
[15] S. Riley. Password security: What users know and what they actually do. Usability News, 8(1), February 2006.
[16] B. Schneier. Schneier on security: Choosing secure passwords. Blog post at http://www.schneier.com/blog/archives/2007/01/choosing_secure.html, January 2007.
[17] R. E. Smith. The Strong Password Dilemma, chapter 6. Addison-Wesley, 2002.
[18] E. H. Spafford. Observing reusable password choices. In Proceedings of the 3rd Security Symposium. Usenix, pages 299-312, 1992.
[19] L. von Ahn, M. Blum, and J. Langford. Telling humans and computers apart automatically. Commun. ACM, 47(2):56-60, February 2004.
[20] T. Wu. A real-world analysis of Kerberos password security. In Proceedings of 1999 Network and Distributed System Security Symposium, February 1999.
[21] J. J. Yan. A note on proactive password checking. In NSPW '01: Proceedings of the 2001 workshop on New security paradigms, pages 127-135, New York, NY, USA, 2001. ACM.
[22] G. K. Zipf. Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley, 1949.
