Measuring Password Strength: An Empirical Analysis
We present an in-depth analysis on the strength of the almost 10,000 passwords from users of an instant messaging server in Italy. We estimate the strength of those passwords, and compare the effectiveness of state-of-the-art attack methods such as d…
Authors: Matteo Dell'Amico, Pietro Michiardi, Yves Roudier
Measuring Password Strength: An Empirical Analysis

Matteo Dell'Amico, Pietro Michiardi and Yves Roudier
Institut Eurecom
{matteo.dell-amico, pietro.michiardi, yves.roudier}@eurecom.fr

October 23, 2018

Abstract

We present an in-depth analysis of the strength of the almost 10,000 passwords from users of an instant messaging server in Italy. We estimate the strength of those passwords, and compare the effectiveness of state-of-the-art attack methods such as dictionaries and Markov chain-based techniques.

We show that the strength of passwords chosen by users varies enormously, and that the cost of attacks based on password strength grows very quickly when the attacker wants to obtain a higher success percentage. In accordance with existing studies we observe that, in the absence of measures for enforcing password strength, weak passwords are common. On the other hand, we discover that there will always be a subset of users with extremely strong passwords that are very unlikely to be broken.

The results of our study will help in evaluating the security of password-based authentication means, and they provide important insights for inspiring new and better proactive password checkers and password recovery tools.

1 Introduction

Even though much has been said about their weaknesses, passwords still are, and will be in the foreseeable future, ubiquitous in computer authentication systems. A peculiar characteristic of passwords is that they inherently carry a trade-off between usability and security: while strong passwords are hard for attackers to guess, they are also difficult for the user to remember. As Richard Smith paradoxically notes, password best practices imply that "the password must be impossible to remember and never written down" [17].
In light of this, it is not very surprising that users often knowingly choose to use weak passwords or circumvent security best practices, since they perceive that following them would get in the way of doing their work [1, 15].

To think sensibly about the security of systems that use passwords, it is therefore essential to analyze the characteristics of passwords chosen by users. In this work, we analyze a large dataset containing all user passwords from an instant messaging server located in Italy. Unlike previous empirical studies on passwords [10, 7, 18, 11, 3, 4, 9], this paper evaluates the strength of passwords against a variety of state-of-the-art techniques for candidate generation. The analysis we conducted benefited from having access to the passwords in unencrypted form; this made it possible to measure the strength of all of them, including those that would hardly be cracked even by extremely powerful attackers.

We evaluate the strength of a password in terms of its associated search space size, that is, the number of attempts that an attacker would need to correctly guess it. This measure does not depend on the particular nature of the authentication system nor on the attacker capabilities: it is only related to the attack technique and to the way users choose their password. The attack model and the characteristics of the system will instead define the cost that the attacker has to pay for each single guess. By combining this cost with our measures of password strength, it becomes possible to obtain a sound cost-benefit analysis for attacks based on password guessing on an authentication system.

As we will show, different attack techniques are advisable depending on the search space size that the attacker can afford to explore.
This has to be taken into account when proposing and evaluating new techniques for reducing the search space: they may be effective only if the strength of the attack falls within a given interval.

We show that password strength has an extremely wide variance: as a first approximation, the probability of guessing a password at each attempt decreases roughly exponentially as the size of the explored search space grows. These diminishing returns imply that, in most cases, an attacker would eventually find a point where the cost of continuing the attack would not be justified by the probability of success. This study provides figures that can help designers and administrators in assessing the security of their systems by evaluating where that point resides.

2 Related Work

In this section we provide a short review of studies about password security, and make the case for the importance of measuring password strength. Attacks such as phishing or social engineering, where the user is misled into communicating the password to the attacker, are unrelated to password strength and therefore outside the scope of this work.

Pricing Via Processing. To defend against intruders who repeatedly try password after password until they obtain access to the system, it is possible to limit the rate at which the attacker is allowed to try new passwords by requiring the user to perform an action with a moderate cost. While legitimate users would need to perform this action only once every time they try to log on, an attacker would need to repeat this process many times, resulting in a disproportionate cost that renders the attack worthless.
The following measures belong to this category:

• CAPTCHAs [19], which require solving puzzles that are difficult without human intervention;

• key strengthening techniques, which require a few seconds of computation to derive a key from the passwords; this idea first appeared in the design of the UNIX system in the late '70s [10]. A modern key strengthening algorithm, where the computation length is configurable via the choice of a tunable parameter, is PBKDF2 [6].

It is important to note that these techniques impose a trade-off on legitimate users: if an honest user has to pay a cost $c$, the attacker must pay at most $c \cdot s$, where $s$ is the strength of the password in terms of the number of attempts needed to guess it. The measures obtained in this paper can be used to estimate costs and benefits of these systems, and thus to properly tune this $c$ parameter.

An alternative approach blocks accounts after a given number of failed attempts. This response, however, opens the door to denial of service attacks on user accounts and is ineffective unless the attack is specifically targeted towards a single user [13].

Offline Attacks. In most cases, the authentication server does not store passwords in plain text. Instead, it keeps an encrypted version of them which is conceptually analogous to a hash: when a user attempts to log on, the password they provide is encrypted and compared to the stored value. In this way, even if an attacker obtains the encrypted passwords, these cannot be used right away to log on to the system. To make it costly for the attacker to guess the password by encrypting lots of password candidates, key strengthening techniques are again applied.
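The cost trade-off described above can be illustrated with a minimal sketch based on PBKDF2 as exposed by Python's standard `hashlib` module. The function names, the choice of SHA-256, the random 16-byte salt and the iteration count are assumptions made for illustration only; they are not taken from the systems studied in this paper.

```python
import hashlib
import hmac
import os
import time


def derive_key(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a 32-byte key from a password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def verify(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the key for a login attempt and compare in constant time."""
    return hmac.compare_digest(derive_key(password, salt, iterations), expected)


def attack_cost(cost_per_guess: float, strength: float) -> float:
    """Upper bound c * s on the attacker's cost: per-guess cost times search space size."""
    return cost_per_guess * strength


if __name__ == "__main__":
    salt = os.urandom(16)      # hypothetical per-user random value
    iterations = 100_000       # tunable parameter that sets the cost c
    start = time.perf_counter()
    stored = derive_key("correct horse", salt, iterations)
    c = time.perf_counter() - start
    print("login accepted:", verify("correct horse", salt, iterations, stored))
    # An attacker who needs s = 10**9 attempts pays at most c * s seconds.
    print("attack cost bound (seconds):", attack_cost(c, 1e9))
```

Raising `iterations` increases the cost `c` paid once by a legitimate user and up to `s` times by the attacker, which is why an estimate of `s` is needed to tune it.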
Attacks based on pre-computing the encrypted version of the most likely passwords [11, 12] are defeated with the simple technique of salting, also known since the early days of UNIX: that technique works by appending a random number to the password before encrypting it, and then storing this number along with the encrypted password.

Since these techniques are based on the idea of making guessing attacks costly, the password strength that we are measuring is also a key parameter when evaluating the resilience of a password system to offline attacks.

Password Recovery. We measure password strength by taking into account attempts to break them with state-of-the-art techniques. The free password recovery software John the Ripper^1 identifies passwords by checking them against a large-sized dictionary, plus a fixed set of mangling rules, such as appending or prepending digits to dictionary words. According to Bruce Schneier's description [16], AccessData's proprietary Password Recovery Toolkit complements this approach with a phonetic pattern set generated via a Markov chain routine to generate meaningless but pronounceable passwords. In Section 5, we formalize a method based on the same idea and evaluate its merits in reducing the search space for cracking passwords.

^1 http://www.openwall.com/john/

Proactive Password Checking. A proactive password checker is a system that forces (or advises) the user to choose complex enough passwords. The impact of these checkers on actual password security is debatable: as Wu [20] notes, "[users are] very good at selecting passwords that are just 'good enough' to pass whatever checking is in place."
The MySpace social network requires users to have at least one non-alphabetic character in their password; in a set of leaked passwords, 86% of the users complied with this requirement by appending a number at the end of their password; for 20% of them that number was a 1 [14]. Furthermore, a proactive password checker could encourage users to use non-dictionary passwords that are related to their personal life, such as dates, telephone numbers or license plate numbers [1]. For a motivated attacker, these passwords are even easier to guess than dictionary words. Moreover, a password that is strong in the abstract could force the user to write it down and leave it in a place where an attacker can easily find it. For example, many employees hide passwords under their mouse pads at their companies [17]. In general, it seems that password strength checkers actually increase system security only if they are seen by users as a tool that helps them and not just as an additional hoop they have to jump through to get their job done.

Existing password checkers are based on quite naive metrics [2, 21]: they check password length, or resilience to brute force and dictionary-based attacks; still, they do not take into account advanced cracking techniques. Our measure of strength as search space size can be used as the basis for more effective password checkers.

Empirical Studies. It is a well-known fact that many users almost invariably choose easy-to-guess passwords; current empirical studies, however, generally focus on a single kind of attack and neglect to quantify how strong the remaining share of passwords is with respect to more general attacks. To the best of our knowledge, no other work evaluates the strength of passwords over their whole strength spectrum and against all state-of-the-art techniques.
Analyses on dictionary attacks report a percentage of broken passwords varying between 17% and 24% [10, 7, 18]. In Section 4, before investigating the remaining stronger passwords, we obtain results of similar magnitude, varying with the type and size of dictionary used.

Some studies are based on a dataset of encrypted passwords, and only report on the ones that have actually been cracked [7, 11, 3, 9]; in comparison, we had access to the plain text, which gave us information on the passwords that would be computationally impractical to break.

In a 2007 study [4], Florencio and Herley obtained data about the passwords of about 500,000 users. That work provides interesting insights about user habits, but only quantifies password strength with a simple bit strength measure based on their length and on the use of uppercase, numeric, and non-alphanumeric characters; resilience against advanced password-cracking techniques is not taken into consideration.

3 Our Dataset

Our dataset contains the unencrypted passwords for the 9,317 registered users of an Italian instant messaging server. Storing passwords in plain text on the server is required by authentication algorithms such as CRAM-MD5^2. User registration is free and no policy for password strength is enforced: even the empty password is allowed. The absence of strength enforcement allows us to investigate the behavior of users when choosing their password in the absence of external requirements.

Users are free to choose any unused username when registering. A total of 269 users (2.89% of the total) use the same string as both username and password. The single most effective attempt to guess a given user's password would therefore be that user's own username.

^2 http://tools.ietf.org/html/rfc2195

[Figure 1: Password length distribution.]

Some users share the same password, and this results in 7,848 unique passwords.
While in some cases this may be due to coincidences and the use of very frequent passwords, other cases may be the consequence of the same people registering under different usernames at the same server.

The average password length is 7.86. Figure 1 shows the length distribution. Even though the full Unicode character set is usable for the passwords, only 124 different characters are used. Frequencies of characters have very uneven distributions (see Table 1): while one character out of 11 is an 'a', the most frequent uppercase character ('A') has a frequency of approximately 1 in 500.

In Table 2, we show the matching ratio of various simple regular expressions. More than 50% of the passwords contain only lowercase characters, and less than 7% contain non-alphanumeric characters. Around 15% of them consist of a string of lowercase characters followed by a numeric appendage.

We also analyzed a set of 33,671 leaked MySpace passwords [5, 14]. Since these passwords have been obtained through a phishing attack, they include those of less security-conscious users who fell for the attack. Moreover, MySpace requires users to insert non-alphabetic characters in their passwords, and this imposes an artificial impact on the passwords that users, left alone, would choose. For these reasons, we consider this dataset less representative of actual user passwords than our primary one; we however use it in this work to corroborate some of our findings by validating them on another dataset.

| Character | Count | Percentage |
|---|---|---|
| a | 6,681 | 9.12% |
| e | 4,520 | 6.17% |
| o | 4,484 | 6.12% |
| i | 4,388 | 5.99% |
| r | 3,628 | 4.95% |
| n | 3,310 | 4.52% |
| l | 3,095 | 4.23% |
| s | 2,895 | 3.95% |
| t | 2,853 | 3.90% |
| 1 | 2,518 | 3.44% |
| c | 2,367 | 3.23% |
| m | 2,137 | 2.92% |
| 0 | 1,990 | 2.72% |
| p | 1,945 | 2.66% |
| d | 1,813 | 2.48% |
| 2 | 1,692 | 2.31% |
| u | 1,640 | 2.24% |
| b | 1,624 | 2.22% |
| 3 | 1,487 | 2.03% |
| g | 1,334 | 1.82% |
| other | 16,832 | 22.98% |

Table 1: Character distribution.
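The character statistics and regular-expression ratios discussed above could be computed along the following lines. This is only a sketch: the function names and the five sample strings are hypothetical stand-ins, not data from our dataset, and a pattern is assumed to "match" a password only when it covers the whole string.

```python
import re
from collections import Counter


def char_frequencies(passwords):
    """Relative frequency of each character over all passwords, most frequent first."""
    counts = Counter(ch for pw in passwords for ch in pw)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {ch: n / total for ch, n in counts.most_common()}


def match_ratio(passwords, pattern):
    """Fraction of passwords that are matched in full by the regular expression."""
    regex = re.compile(pattern)
    hits = sum(1 for pw in passwords if regex.fullmatch(pw))
    return hits / len(passwords)


if __name__ == "__main__":
    # Hypothetical sample, for illustration only.
    sample = ["amore", "juventus1", "123456", "Roma2009", "p@ssw0rd"]
    for ch, freq in list(char_frequencies(sample).items())[:3]:
        print(repr(ch), f"{freq:.2%}")
    for pattern in (r"[a-z]+", r"[a-z]+[0-9]+", r"[a-zA-Z0-9]+"):
        print(pattern, f"{match_ratio(sample, pattern):.0%}")
```

Using `fullmatch` rather than `search` matters here: a pattern such as `[a-z]+` would otherwise match any password containing a single lowercase letter.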
| Expression | Example | Matches |
|---|---|---|
| [a-z]+ | abcdef | 51.20% |
| [A-Z]+ | ABCDEF | 0.29% |
| [A-Za-z]+ | AbCdEf | 53.74% |
| [0-9]+ | 123456 | 9.10% |
| [a-zA-Z0-9]+ | A1b2C3 | 93.43% |
| [a-z]+[0-9]+ | ab123 | 14.51% |
| [a-zA-Z]+[0-9]+ | aB123 | 16.30% |
| [0-9]+[a-zA-Z]+ | 123aB | 1.80% |
| [0-9]+[a-z]+ | 123ab | 1.65% |

Table 2: Percentage of passwords matching various regular expressions.

4 Dictionary Attack

The dictionary attack is the most effective technique to guess the weakest passwords. We evaluated password strength by using the dictionaries available in the already mentioned John the Ripper (JtR) password recovery tool. The extended dictionaries that we used are available for paid download from the program website^3.

4.1 The Dictionaries

The JtR dictionaries contain words from 21 different human languages, plus a list of frequently used passwords. For some languages (like English and Italian), various dictionaries of different sizes are available: the smaller ones contain only the most frequently used words while the bigger ones also contain more obscure words, the rationale being that more common words are more likely to be chosen as passwords. Taken together, all dictionaries account for almost 4 million words. A bigger dictionary containing more than 40 million words is obtained using mangling rules that attempt to create more complex passwords by altering dictionary words, for example by juxtaposition of dictionary words or by appending a number at the end of the word.

An often-advised technique to create strong but easy-to-remember passwords is to turn phrases into passwords by extracting an acronym, possibly also using punctuation. For example, the phrase "Alas, poor Yorick! I knew him, Horatio" becomes "A,pY!Ikh,H". We also evaluated such acronyms with a dictionary created by Kuo et al. [8] that was put together by scraping websites displaying memorable phrases, such as citations and music lyrics.
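The phrase-to-acronym transformation described above can be sketched as follows. The rule used here (first character of each word, plus any punctuation trailing the word) is an assumption chosen to reproduce the example in the text; it is not necessarily the rule used by Kuo et al. [8], and the function name is hypothetical.

```python
import string


def phrase_to_acronym(phrase: str) -> str:
    """Keep the first character of each word and any punctuation that trails it."""
    parts = []
    for word in phrase.split():
        core = word.strip(string.punctuation)
        if core:
            parts.append(core[0])
        # Whatever rstrip would remove is the punctuation trailing the word.
        trailing = word[len(word.rstrip(string.punctuation)):]
        parts.append(trailing)
    return "".join(parts)


if __name__ == "__main__":
    print(phrase_to_acronym("Alas, poor Yorick! I knew him, Horatio"))  # A,pY!Ikh,H
```

A dictionary of mnemonic candidates could then be built by applying such a function to every phrase in a collection of memorable sentences.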
4.2 Experimental Results

We simulated dictionary attacks with all the JtR dictionaries. Table 3 shows the results for the most representative instances.

The "found" column lists the percentage of passwords that appear in that dictionary; the "guess probability" column reflects the probability that a random word from that dictionary matches a random password: a rational attacker would try a word from that dictionary only if the benefit of cracking the password exceeds the inverse of that probability times the cost of the effort for trying that password.

^3 http://www.openwall.com/wordlists/

| Dictionary | Size | Found | Guess prob. |
|---|---|---|---|
| Frequent passwords | 3,114 | 7.25% | $2.33 \cdot 10^{-5}$ |
| English 1 lc | 27,424 | 4.91% | $1.79 \cdot 10^{-6}$ |
| English 2 lc | 296,809 | 9.42% | $3.17 \cdot 10^{-7}$ |
| English 3 lc | 390,532 | 11.59% | $2.97 \cdot 10^{-7}$ |
| English extra lc | 444,678 | 8.03% | $1.81 \cdot 10^{-7}$ |
| Italian 1 lc | 63,041 | 3.71% | $5.89 \cdot 10^{-7}$ |
| Italian 2 lc | 344,074 | 14.89% | $4.33 \cdot 10^{-7}$ |
| All above | 1,117,767 | 25.51% | $2.28 \cdot 10^{-7}$ |
| All JtR dictionaries | 3,917,193 | 25.94% | $6.62 \cdot 10^{-8}$ |
| All JtR + mangling | 40,532,676 | 30.12% | $7.43 \cdot 10^{-9}$ |
| Mnemonics [8] | 406,430 | 1.27% | $3.12 \cdot 10^{-8}$ |

Table 3: Dictionary attacks. The "lc" acronym stands for all-lowercase dictionaries: those containing uppercase letters are matched by very few words in our dataset. The English 1, English 2 and English 3 dictionaries, like Italian 1 and Italian 2, are listed in growing size; each word belonging to a smaller dictionary is also contained in the bigger versions.

The "English extra" dictionary has a slightly misleading name: it contains words that don't appear in a regular dictionary but that users are likely to use, such as proper nouns, common misspellings or alterations of words. Many of them are language-agnostic (e.g., "Aldebaran") or come from non-English languages ("Mariela").

As the server is in Italy, most users are Italian.
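As a worked example of how the guess probability column relates to the other two, the following sketch recomputes it from the size and found percentage reported in Table 3, and applies the rational-attacker criterion stated above. The function names, and the marginal-probability helper for nested dictionaries, are illustrative assumptions rather than part of our tooling.

```python
def guess_probability(found_fraction: float, dictionary_size: int) -> float:
    """Probability that one random dictionary word matches one random password."""
    return found_fraction / dictionary_size


def marginal_guess_probability(found_small, size_small, found_large, size_large):
    """Guess probability of the words added when a dictionary is extended to a superset."""
    return (found_large - found_small) / (size_large - size_small)


def worth_trying(benefit: float, cost_per_guess: float, probability: float) -> bool:
    """Rational-attacker rule: benefit must exceed (1 / probability) * cost per guess."""
    return benefit > cost_per_guess / probability


if __name__ == "__main__":
    # Figures taken from Table 3.
    frequent = guess_probability(0.0725, 3_114)
    print(f"frequent passwords: {frequent:.3g} (about 1 in {1 / frequent:,.0f})")
    extra = marginal_guess_probability(0.2551, 1_117_767, 0.3012, 40_532_676)
    print(f"words beyond 'All above': {extra:.3g} (about 1 in {1 / extra:,.0f})")
    # Hypothetical attacker: a cracked password is worth 1,000 cost units, a guess costs 0.01.
    print(worth_trying(1_000, 0.01, frequent), worth_trying(1_000, 0.01, extra))
```

With these hypothetical costs the frequent-password dictionary is worth trying while the extended one is not, which is the kind of cost-benefit comparison the guess probability is meant to support.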
The amount of English words found in passwords is not particularly surprising for those who know the tendency that natives have towards the heavy use (and abuse) of English. An interesting feature is the noticeably higher density of common English words (those present in the small "English 1" dictionary); that phenomenon is much less relevant with respect to Italian. We think that this is caused by the fact that most users know English as a second language, and thus are less inclined to use an obscure word as their password. This suggests that it might be good practice to use one's native language to create stronger passwords.

The most important lesson drawn from this data is the principle of diminishing returns: the probability of guessing a word sharply decreases as the dictionary grows. The 3,100-word dictionary of frequent passwords cracks 7% of those in our dataset; by increasing the size of the dictionary roughly 300 times, up to more than one million words, and including all Italian and English words, the number of cracked passwords rises to 25%. When the number of attempts grows beyond 40 million by including other languages and mangling, only 5% more of the passwords are found. To put it another way, the probability of guessing a given password by trying an element of the frequent passwords dictionary is one in 43,000. On the other hand, after having tried all the frequent passwords and the Italian and English dictionaries, the probability of guessing by using another dictionary word is less than one in 500 million! Since the guessing probability decreases so sharply, it is conceivable that in many cases it won't be worth trying a bigger dictionary for the attacker.

We also observe that the mnemonic dictionary is quite ineffective. This may be due to several reasons: first, few users actually use mnemonics for their passwords; second, they are actually much harder to break with dictionary attacks.
Moreover, we are not able to ascertain whether the habit of choosing English passwords for Italian users would carry over to the use of mnemonics. Our data is, at the moment, insufficient to point towards one reason or the other.

5 Markov Chain-Based Attack

The fact that dictionaries fall short does not mean that an attacker would need to resort to an exhaustive brute-force attack: some passwords are much more likely to be chosen than others. As seen in Section 3, there is a very uneven distribution of character choice. Moreover, other regularities exist: passwords are usually made of pronounceable sub-strings and/or sequences of keys that are close on the keyboard.

In this section, we describe and validate an attack based on Markov chain-based modeling of the frequencies of sub-strings with parametric length $k$, or $k$-graphs. This allows us to label candidate passwords with variable probabilities, where strings that are labeled as more likely are checked first.

Some password generating utilities actually use this kind of modeling to obtain meaningless but pronounceable passwords on the grounds that they're easier to remember, thus sacrificing some strength for usability^4.

5.1 The Technique

We base our formalization on the techniques shown in [11], extending the model so that it applies to sub-strings of length 3 and more. This model represents a password choice as a sequence of random events: first, the length of the password is chosen according to a given probability distribution; then, each character of the string gets extracted according to a conditional probability depending on the previous $k - 1$ characters.

We encode the characteristics of passwords via two functions, $\lambda$ and $\nu$. $\lambda$ represents the length distribution of passwords so that, for example, $\lambda(8)$ is the probability that the password has length 8.
$\nu$, instead, represents the conditional probability of each $k$-graph with respect to the corresponding $(k-1)$-graph: $\nu(c_1 \ldots c_k \mid c_1 \ldots c_{k-1})$ is the probability that the character $c_k$ follows the sub-string $c_1 \ldots c_{k-1}$. For $k = 1$, $\nu(c)$ expresses the frequency of $c$, that is, the probability that a random character in a password coincides with $c$.

By choosing $k = 1$, thus focusing on character frequency, the probability $P_1(\alpha)$ that our model will generate a string $\alpha$ (where its length is $|\alpha|$ and its $i$-th character is $\alpha_i$) is

$$P_1(\alpha) = \lambda(|\alpha|) \prod_{1 \le i \le |\alpha|} \nu(\alpha_i).$$

To derive $P_k$ with $k \ge 2$, we will adopt the convention that $\alpha_i = \bot$ whenever $i \le 0$, where $\bot$ is a special character not allowed to appear in passwords. For example, we write the probability that a password starts with the "a" character as $\nu(\text{"}\bot a\text{"} \mid \text{"}\bot\text{"})$; the probability that a "b" follows an initial "a" is instead $\nu(\text{"}\bot ab\text{"} \mid \text{"}\bot a\text{"})$. Given this, we can formalize the digraph-based probability $P_2$ as

$$P_2(\alpha) = \lambda(|\alpha|) \prod_{1 \le i \le |\alpha|} \nu(\alpha_{i-1}\alpha_i \mid \alpha_{i-1})$$

and, in general,

$$P_k(\alpha) = \lambda(|\alpha|) \prod_{1 \le i \le |\alpha|} \nu(\alpha_{i-k+1} \ldots \alpha_i \mid \alpha_{i-k+1} \ldots \alpha_{i-1}).$$

^4 See for example gpw (http://www.multicians.org/thvv/tvvtools.html#gpw), apg (http://www.adel.nursat.kz/apg/), otp (http://www.fourmilab.ch/onetime/).

5.1.1 Estimating $\nu$ and $\lambda$

It is obviously important that the probabilities encoded in the $\lambda$ and $\nu$ functions are representative of the real characteristics of passwords. We do this by adopting a set of strings as a training set and setting $\lambda(x)$ as the fraction of strings of length $x$. Denoting $C$ as the character set and $\sigma(c_1 \ldots c_k)$ as the number of occurrences of the sub-string $c_1 \ldots c_k$ in the whole training set, we set

$$\nu(c_1 \ldots c_k \mid c_1 \ldots c_{k-1}) = \frac{\sigma(c_1 \ldots c_k)}{\sum_{c \in C} \sigma(c_1 \ldots c_{k-1} c)}.$$
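The estimation of $\lambda$ and $\nu$ and the computation of $P_k(\alpha)$ defined above can be sketched in a few lines. The class name, the use of the NUL character as a stand-in for $\bot$, and the tiny training set are assumptions made for illustration; this is not the code used for our experiments.

```python
from collections import Counter

BOTTOM = "\x00"  # stand-in for the special symbol ⊥, assumed never to appear in passwords


class KGraphModel:
    """k-graph Markov model: lambda from length counts, nu from k-graph counts."""

    def __init__(self, training, k):
        self.k = k
        self.total = len(training)
        self.length_counts = Counter(len(s) for s in training)
        self.kgraph = Counter()  # sigma(c_1 ... c_k)
        self.prefix = Counter()  # sum over c of sigma(c_1 ... c_{k-1} c)
        for s in training:
            padded = BOTTOM * (k - 1) + s
            for i in range(len(s)):
                gram = padded[i:i + k]
                self.kgraph[gram] += 1
                self.prefix[gram[:-1]] += 1

    def lam(self, length):
        """Fraction of training strings with the given length."""
        return self.length_counts[length] / self.total if self.total else 0.0

    def nu(self, context, c):
        """Probability that c follows the (k-1)-character context."""
        denom = self.prefix[context]
        return self.kgraph[context + c] / denom if denom else 0.0

    def probability(self, s):
        """P_k(s) = lambda(|s|) times the product of the conditional probabilities."""
        p = self.lam(len(s))
        padded = BOTTOM * (self.k - 1) + s
        for i in range(len(s)):
            p *= self.nu(padded[i:i + self.k - 1], padded[i + self.k - 1])
        return p


if __name__ == "__main__":
    model = KGraphModel(["ab", "ab", "ac", "b"], k=2)  # hypothetical training set
    for candidate in ("ab", "ac", "b", "zz"):
        print(candidate, model.probability(candidate))
```

A string containing a $k$-graph that never occurs in the training set, such as "zz" here, receives probability 0.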
In the absence of a representative training set of passwords, a dictionary can be used as in [11]. As we will experimentally show in Section 5.2, using passwords themselves as training set ultimately results in a better model. In this case, when computing $P_k(\alpha)$ in our experiments, $\alpha$ itself must be removed from the training set and should not be taken into account when computing the values of $\lambda$ and $\sigma$.

As mentioned in Section 3, some users share the same password. This might be due to chance and to the fact that those passwords are quite trivial; another possibility is that they come from the same user registering many accounts and using the same password for all of them. In the latter case, an attacker would not have access to the password in a representative training set, and it would be correct for our purposes to remove all copies of the password from the training set. Since we cannot discriminate between the two cases, we will adopt a conservative approach that may result in overestimating the capabilities of the attacker, therefore discarding only a single copy of the password from the training set.

A model with higher values of $k$ should be more accurate, but the process of creating it is more difficult and expensive. In the extreme, a model with $k$ exceeding the maximum password length would explicitly list the probability of occurrence of each possible password: this would require prohibitive training set size and storage capabilities (the required space is of the order of $|C|^k$, where $|C|$ is the size of the character set). With limited resources, when a $k$-graph does not appear in the training set due to under-sampling, the probability of a password containing that $k$-graph is computed as 0. Such a model would therefore never generate the required password.

5.1.2 Computing The Search Space Size

So far, we have described a model that assigns probabilities to passwords, with the aim of measuring how likely it is that a user would actually select a given password. A rational attacker would use this model by enumerating candidate passwords starting with the most likely ones and continuing in decreasing order of probability. In order to measure the search space size that such a strategy would need to explore before finding a given password, we have to find out how many unsuccessful candidates would be generated before the correct one: if the Markovian model labels the probability of a password as $p$, its associated search space size would therefore be the number of strings with probability of occurrence higher than or equal to $p$.

Explicit Counting. The most obvious system for computing the search space size up to a given threshold is to plainly enumerate it. In Algorithm 1, we show how this can be implemented with a simple recursive algorithm. Each recursive call divides the threshold by the probability of the character just chosen, so that the remaining characters only need to reach the residual threshold.

Algorithm 1: Explicit counting of search space size.

    function size(c_1 … c_{k−1}, l, t)        ▷ c_1 … c_{k−1}: state, l: string length, t: threshold
        if l = 0 then return 1
        s ← 0
        for all c ∈ C do
            p ← ν(c_1 … c_{k−1} c | c_1 … c_{k−1})
            if p ≥ t then
                s ← s + size(c_2 … c_{k−1} c, l − 1, t / p)
        return s

    function total_size(t)                    ▷ t: threshold
        return Σ_i size(⊥ … ⊥, i, t / λ(i))

Approximate Estimation. As the search space grows, the above approach becomes extremely expensive and should be replaced with an approximate estimation method [11]. By fixing a base $b > 1$, any probability $p$ can be approximated as $b^{-l}$ for an integer value $l \ge 0$. Choosing $l = \lfloor -\log_b p \rfloor$ approximates $p$ by excess, while $l = \lceil -\log_b p \rceil$ approximates by defect. To help intuition, $l$ can be seen as a discrete password strength value, which can be computed as the sum of strengths for each $k$-graph contained in the password. Values of $b$ closer to 1 result in a finer granularity for our approximation, at the cost of an increase in computation.

By adopting such an alteration, the computation gets a big speedup by memoizing the parameters and results of each approx_size call, and returning them when the function is called again with the same parameters. This couldn't be done with the former version, since the $t$ threshold parameter of the size function is a floating point number which is very likely to be different at each function call.

Algorithm 2: Approximation of search space size.

    function approx_size(c_1 … c_{k−1}, l, t)  ▷ c_1 … c_{k−1}: state, l: string length, t: log-threshold
        if l = 0 then return 1
        s ← 0
        for all c ∈ C do
            t′ ← t − ⌈−log_b ν(c_1 … c_{k−1} c | c_1 … c_{k−1})⌉
            if t′ ≥ 0 then
                s ← s + cache_size(c_2 … c_{k−1} c, l − 1, t′)
        return s

    function cache_size(c_1 … c_{k−1}, l, t)   ▷ we store results from approx_size in a cache K
        if (c_1 … c_{k−1}, l, t) ∉ K then
            K(c_1 … c_{k−1}, l, t) ← approx_size(c_1 … c_{k−1}, l, t)
        return K(c_1 … c_{k−1}, l, t)

    function total_size(t)                     ▷ t: threshold
        return Σ_i cache_size(⊥ … ⊥, i, ⌊−log_b (t / λ(i))⌋)

[Figure 2: Search space size versus probability threshold for the $k$-graph Markovian model, for $k = 1$ to $5$ (log-log axes: search space size, threshold). The plotted curves show the result of the exact computation of Algorithm 1, while the points marked by crosses are the result of the approximation of Algorithm 2.]

Since we are aiming for a conservative estimate for the search space that approximates by excess the capabilities of the attacker, we use approximations to obtain a lower limit for the search space size. To do this, we approximate the starting threshold by excess and all the $\nu$ probabilities by defect.
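The two counting strategies described above can be sketched in Python as follows. The model is passed in as two callables, `nu(state, c)` and `lam(n)`; these names, the `max_len` bound on the summation, the guard that skips lengths whose $\lambda$ is already below the threshold, and the toy model at the end are assumptions made for illustration. The threshold is assumed to be strictly positive.

```python
import math
from functools import lru_cache

BOTTOM = "\x00"  # stand-in for the special start symbol ⊥


def next_state(state, c, k):
    """Slide the (k-1)-character context window forward by one character."""
    joined = state + c
    return joined[len(joined) - (k - 1):]


def exact_total_size(nu, lam, alphabet, k, threshold, max_len):
    """Count strings whose model probability is >= threshold, by plain enumeration."""

    def size(state, length, t):
        if length == 0:
            return 1
        total = 0
        for c in alphabet:
            p = nu(state, c)
            if p > 0 and p >= t:
                # The remaining characters must still reach the residual threshold t / p.
                total += size(next_state(state, c, k), length - 1, t / p)
        return total

    start = BOTTOM * (k - 1)
    return sum(size(start, n, threshold / lam(n))
               for n in range(max_len + 1)
               if lam(n) >= threshold)  # lengths below the threshold contribute nothing


def approx_total_size(nu, lam, alphabet, k, threshold, max_len, base=1.01):
    """Lower bound on the same count, using integer log-strengths and memoization."""

    def strength(p):
        return math.ceil(-math.log(p) / math.log(base))

    @lru_cache(maxsize=None)  # plays the role of the cache K
    def approx_size(state, length, budget):
        if length == 0:
            return 1
        total = 0
        for c in alphabet:
            p = nu(state, c)
            if p <= 0:
                continue
            remaining = budget - strength(p)
            if remaining >= 0:
                total += approx_size(next_state(state, c, k), length - 1, remaining)
        return total

    start = BOTTOM * (k - 1)
    total = 0
    for n in range(max_len + 1):
        if lam(n) <= 0:
            continue
        budget = math.floor(-math.log(threshold / lam(n)) / math.log(base))
        if budget >= 0:
            total += approx_size(start, n, budget)
    return total


def toy_nu(state, c):
    """Hypothetical model: two equiprobable characters, no context (k = 1)."""
    return 0.5


def toy_lam(n):
    """Hypothetical length distribution: lengths 1 and 2 are equally likely."""
    return 0.5 if n in (1, 2) else 0.0


if __name__ == "__main__":
    print(exact_total_size(toy_nu, toy_lam, "ab", 1, 0.1, 3))   # 6
    print(approx_total_size(toy_nu, toy_lam, "ab", 1, 0.1, 3))  # 6
```

Memoization works in the approximate version because the budget is an integer, so many different prefixes lead to the same `(state, length, budget)` triple; in the exact version the floating-point threshold almost never repeats.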
The result of these modifications is the approximate function defined in Algorithm 2.

5.2 Experimental Results

This section describes the results of the experiments described above when applied to our password dataset. Unless otherwise specified, we use the passwords themselves as training set.

Search Space Size Versus Probability Threshold. In Figure 2 we show the size of the search space containing strings labeled with a probability greater than or equal to a given probability threshold. This is computed for different values of $k$ and using both the exact count and the approximate measure from Algorithm 2. We used a parameter $b = 1.01$; with that choice, we obtained a relative error of the order of 5% (not noticeable in the figure due to the log-log scale).

By choosing $1 \le k \le 3$ (i.e., basing the model on sub-strings of lengths 1 to 3), the probabilities of strings generated by the model roughly follow a power law. It is interesting to note that this mirrors frequencies of words in human natural languages, which obey the power law as well [22]. For $k \ge 4$, the number of candidate strings grows markedly more slowly as the probability threshold decreases; this is due to the fact that each $k$-graph is represented by a low number of strings in the training set, and the number of strings that can be obtained by combining $k$-graphs that are present in the dataset is limited. We conjecture that, with a bigger training set, we would obtain a power-law distribution also in this case.

[Figure 3: Search space size versus fraction of guessed passwords, for $k = 1$ to $5$ (axes: search space size, fraction of cracked passwords).]

In the following, we use the approximate approach to estimate the search space size where the exact value becomes either impractical or impossible to compute.
We compute data points for each threshold $p = 10^{-i}$ ($i$ being an integer) and interpolate with the power law that connects the points (a straight line in the log-log plot).

Password Strength. In Figure 3, we plot the fraction of passwords guessed as a function of the search space size.

With higher values of $k$, we obtain better results for the weaker passwords due to the more precise modeling obtained in this case. However, the passwords that include $k$-graphs not represented in the training set cannot be guessed. As the search space grows, methods based on smaller values of $k$ become more effective because they generalize better. In practice, the optimal strategy depends on the resources of the attacker, measured by the number of attempts that can be tried.

The diminishing returns effect that we discovered for dictionary attacks also applies to this technique: even when choosing the best value of $k$ for each case, around 100,000 candidates need to be tried in order to guess 20% of the passwords ($k = 5$); this number rises to roughly 1.1 billion candidates for a success rate of 40% ($k = 3$); the search space needed to break 90% of the passwords grows to approximately $3 \cdot 10^{17}$ ($k = 2$). With such a huge variance in the size of the search space, it seems that no reasonable attack based on password guessing would succeed in guessing all passwords, except in those cases where users are artificially forced to limit password strength, for example by imposing a maximum length.

Figure 4: Search space size versus fraction of guessed passwords on the MySpace dataset. [Plot; X axis: search space size, $10^0$ to $10^{24}$ (log scale); Y axis: fraction of cracked passwords, 0.0 to 1.0; one curve per $k = 1, \ldots, 5$.]

MySpace Passwords. In Figure 4, we repeat our measurements using MySpace passwords in the place of our main dataset both as training set and as guessed passwords.
We obtain qualitatively similar results: in particular, higher values of $k$ are more appropriate for weaker passwords, and the diminishing returns principle holds. From a quantitative point of view, the search space for weak passwords is bigger, while it is smaller for stronger passwords. We think that this is mainly due to the particularities of the dataset: weak passwords are made stronger by the requirement of non-alphabetic characters; strong passwords created by security-conscious users, on the other hand, are under-represented since such users are not likely to fall victim to a phishing attack.

Figure 5: Comparison of brute force and Markov-model based attacks. [Plot; X axis: search space size, $10^4$ to $10^{48}$ (log scale); Y axis: fraction of missed passwords, $10^{-3}$ to $10^0$ (log scale); curves: brute force, $k = 1$, $k = 2$.]

Brute Force. In Figure 5, we compare the brute force approach with our Markovian modeling. The brute force approach starts by trying the empty password, then proceeds with enumerating all possible passwords with increasing length. The full Unicode character set currently has more than 99,000 characters (footnote 5), but many of them are very rare and definitely unlikely in a password; to account for this, we again took a conservative approach that overestimates the attacker's capabilities, and took into account only the 124 characters that we have found in our dataset.

In all but the most extreme cases, the Markovian model proves more efficient by orders of magnitude. It is not before $10^{40}$ candidates (and having found 99.7% of the passwords) that a brute force approach becomes more effective than the Markovian model with $k = 1$ (character frequencies).
This number is well beyond the capabilities of any realistic attacker: to put this in context, a cluster of a thousand 10 GHz machines would need more than $3 \cdot 10^{19}$ years to reach that number of iterations, even assuming that they are able to try a password for each clock cycle.

Footnote 5: http://www.unicode.org/press/pr-ucd5.0.html

Figure 6: Scatter plots of password length (Y axis) versus strength (associated search space size on X axis). [Four panels, $k = 1$ to $k = 4$; X axis: $10^0$ to $10^{39}$ (log scale); Y axis: 0 to 20.]

Strength and Password Length. In Figure 6, we highlight the relationship between a password's length and its strength. As the graphs show, the assumption that longer passwords are stronger can only be regarded as a rule of thumb: a short password containing infrequent characters and/or sequences thereof can actually be stronger than a noticeably longer one. The correlation between length and strength becomes weaker as the $k$ parameter grows: long but weak passwords may be based on predictable long patterns that are less efficiently predicted by models based on lower $k$ values. For example, it is quite likely that the abcd sequence is followed by an e; a model based on digraphs, though, cannot capture this and can only model which character is more likely to follow a d.

Training Sets. Figure 7 illustrates how the choice of training sets affects the attack performance. The training sets used are our sets of usernames and passwords, the MySpace leaked passwords, the JtR common password dictionary, and Italian and English dictionaries.
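To make the role of the training set and of $k$ more concrete, the following minimal sketch trains a $k$-graph model by counting which character follows each $(k-1)$-character context. It is an illustration under stated assumptions, not the implementation used in our experiments: the padding symbols `START` and `END`, the use of an end marker to model string termination, the tiny training list, and the function names `train`, `next_prob` and `probability` are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical padding symbols, assumed never to occur inside a password.
START, END = "\x02", "\x03"


def train(words, k):
    """Count, for every (k-1)-character context, how often each next character follows."""
    counts = defaultdict(Counter)
    for word in words:
        padded = START * (k - 1) + word + END
        for i in range(k - 1, len(padded)):
            counts[padded[i - (k - 1):i]][padded[i]] += 1
    return counts


def next_prob(counts, context, ch):
    """Estimated probability that `ch` follows `context`; 0.0 for unseen contexts."""
    row = counts.get(context)
    if not row:
        return 0.0
    return row[ch] / sum(row.values())


def probability(counts, k, word):
    """Probability assigned to `word`; 0.0 if it contains a k-graph absent from training."""
    padded = START * (k - 1) + word + END
    p = 1.0
    for i in range(k - 1, len(padded)):
        p *= next_prob(counts, padded[i - (k - 1):i], padded[i])
    return p


# Toy training set, for illustration only.
words = ["abcde", "abcde", "bad", "dad"]
digraph_model = train(words, 2)
pentagraph_model = train(words, 5)
```

On this toy training set, the digraph model only knows what tends to follow a d, whereas the model with $k = 5$ is certain that abcd is followed by e. Conversely, a string such as bade, which recombines digraphs seen in training, receives a non-zero probability with $k = 2$ but a probability of zero with $k = 5$, which is the generalization effect discussed above.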
The most effective training set is the real password set. The common passwords dictionary from JtR is more representative of real passwords than standard dictionaries, since it contains combinations of characters, such as punctuation and digits, that do not appear in standard dictionaries. Still, it appears that average passwords do not closely resemble the most common ones.

Figure 7: Comparison of various training sets for guessing passwords in our dataset ($k = 2$). [Plot; X axis: search space size, $10^0$ to $10^{24}$ (log scale); Y axis: fraction of cracked passwords, 0.0 to 1.0; curves: Passwords, MySpace, Usernames, JtR common, English, Italian.]

The case of MySpace passwords as training set is interesting: they are close to the performance of our password dataset for strong passwords, but they do not represent weak ones well. We believe this is due to the over-representation of non-alphabetic characters, which are required to be present in MySpace passwords. The difference in coverage on strong passwords (around 5% with equivalent search space size) can also be attributed to this feature, as well as to the following factors:

• Difference in computer literacy: the MySpace sample contains only the victims of a phishing attack;
• Difference in language: MySpace users are distributed worldwide.

If a representative training set of real passwords is not available to the attacker, usernames are by far the most effective training set. It appears that, when users are asked to provide a username and a password, they employ similar criteria. This is quite surprising since the two strings need to satisfy very different, and arguably conflicting, criteria: good usernames are easily memorable, while a strong password has to be as difficult to guess as possible.

Figure 8: Comparison of complexity between passwords and usernames. [Plot; X axis: search space size, $10^0$ to $10^{24}$ (log scale); Y axis: fraction of cracked passwords, 0.0 to 1.0; curves: Usernames $k = 2$, Passwords $k = 2$, Usernames $k = 3$, Passwords $k = 3$.]

Usernames. The former result suggests a consideration: usernames and passwords are chosen simultaneously, when registering a new account. A user wants both strings to be memorable, since the two are needed in order to log on successfully. However, while there is no incentive to choose complex usernames, a security-conscious user will commit some effort to make his password more complex. The difference in complexity between usernames and passwords is therefore a way to measure the effort that users willingly put into making their passwords more complex: while usernames can be very long or difficult to guess, this is not likely to happen as the result of a conscious attempt to do so.

In Figure 8, we compare the search space size associated with usernames and passwords. Matching what we have done with passwords, the training set used to guess a given username consists of all the usernames except the one under scrutiny. It turns out that the effort that users put into creating complex passwords is measurable, but it is overall quite weak: given a choice for $k$ and a search space size, the percentage of cracked usernames never exceeds that of cracked passwords by more than 15%.

6 Combined Strategy

Our results confirm that no single strategy or technique is always the most effective in reducing the search space: dictionaries are most effective in discovering the weakest passwords; the coverage (fraction of passwords that are in the dictionary) grows as the dictionary size grows, but this entails a loss in precision (fraction of dictionary items that are actual passwords). The Markov-chain based technique should be used when dictionaries are exhausted.
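The two measures defined above, coverage and precision, can be computed as in the following minimal sketch. The function name `coverage_and_precision` and the sample data are hypothetical; the sketch assumes that `passwords` holds one entry per user account and that both inputs are non-empty.

```python
def coverage_and_precision(dictionary, passwords):
    """Return (coverage, precision) of a candidate dictionary against a password list.

    coverage:  fraction of the passwords (one per account) found in the dictionary.
    precision: fraction of distinct dictionary items that are actual passwords.
    """
    candidates = set(dictionary)
    found = sum(1 for p in passwords if p in candidates)
    coverage = found / len(passwords)
    precision = len(candidates & set(passwords)) / len(candidates)
    return coverage, precision


# Illustrative data only: a larger dictionary can only keep or raise coverage,
# while precision typically drops as rarely used candidates are added.
small = ["123456", "password"]
large = small + ["qwerty", "letmein", "dragon", "monkey"]
accounts = ["123456", "123456", "hunter2", "password"]
```

Here both dictionaries reach the same coverage on `accounts`, but the larger one pays for its extra candidates with a lower precision, which is the trade-off that motivates trying small, high-precision dictionaries first.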
With the Markovian model, higher values of $k$ obtain better results at first, but after a number of attempts they become quite ineffective.

No single strategy is the best one for all cases; this, in fact, validates the approach taken by password recovery systems that adopt bigger and bigger dictionaries in cascade, and resort afterwards to Markov-based techniques. In this section, we summarize our findings by presenting the results that an attacker would be able to obtain by using such a technique.

Consistently with our approach of estimating the capabilities of the attacker by excess in the face of uncertainty, we assume that the attacker has access to a password training set which is as effective as the one we obtain from the clear text. Furthermore, we also assume that the attacker is able to predict the effectiveness of the techniques that we measured in Sections 4 and 5.

Based on this knowledge, using the training set as a dictionary, the strategy for the dictionary-based first part of the attack is as follows:

1. Try the username;
2. Try the common passwords dictionary;
3. Try all passwords in the training set;
4. Try the English 1 dictionary;
5. Try the Italian 1 and 2 dictionaries;
6. Try the English 2, 3 and extra dictionaries;
7. Try all remaining JtR dictionaries;
8. Try mangling rules.

If this approach is not sufficient, one should resort to the Markovian model.

Figure 9: Search space size for passwords that are not found in any dictionary. In the inner frame, detail on the first iterations.

In Figure 9, we show the search space for the passwords that have not been discovered within any dictionary. With respect to Figure 3, there is a sharp decrease in the success rate until the search space size reaches approximately $10^8$. In particular, techniques with $k = 5$ and $k = 4$ are unable to break more than, respectively, roughly 1% and 4% of the passwords.
This matches with the intuition that dictionary-based attacks are more useful against the less complex passwords.

Based on the data represented in Figure 9, an efficient strategy for the attack would be as follows:

1. Try 500,000 candidates with the model based on $k = 5$;
2. Try 7,000,000 candidates with $k = 4$;
3. Try 700,000,000 candidates with $k = 3$;
4. Try $7 \cdot 10^{16}$ candidates with $k = 2$;
5. Continue with $k = 1$.

In Table 4, we summarize the search space size and percentage of cracked passwords for each of these steps. This is the answer to our original question: how many attempts an attacker would need in order to guess a given percentage of the passwords. By integrating this with system-specific knowledge such as the computational cost needed to perform a single guess and the amount of resources that the attacker has access to, it is possible to estimate the percentage of passwords that are vulnerable to a given attack.

Step                        #attempts               Cracked
Username                    1                       2.88%
Common passwords            3,115                   9.95%
Training set                10,431                  28.83%
English 1                   36,574                  30.51%
Italian 1                   98,511                  32.25%
Italian 2                   373,834                 36.31%
English 2                   632,613                 37.18%
English 3                   722,215                 37.69%
English extra               1,123,841               40.07%
JtR - all dictionaries      3,923,660               41.14%
Mangling                    40,538,747              44.26%
Markov chain - k = 5        41,070,093              45.05%
Markov chain - k = 4        48,051,199              46.76%
Markov chain - k = 3        ~750,000,000            58.10%
Markov chain - k = 2        ~$7 \cdot 10^{16}$      91.06%
Markov chain - k = 1        ~$10^{40}$              99.71%

Table 4: Cumulative number of attempts and of guessed passwords for the multi-step approach. Candidates that would be checked in more than one dictionary are counted only once. For the Markov chain technique with $k \leq 3$, the search space has not been generated explicitly and its size has been approximated with Algorithm 2.

7 Conclusion

As the bibliography of this work witnesses, the first studies on password cracking date back to almost 30 years ago.
Still, the techniques that are used in state-of-the-art password-cracking applications are quite simple: decades of research suggest that it is possible to do better than applying simple Markov chain-based modeling techniques.

The results of our measurement study may provide an explanation as to why not much has been done in this direction: the diminishing returns effect implies that, even if the size of the search space decreases by orders of magnitude, the percentage of passwords that an attacker would be able to crack in a given number of attempts would increase only by an unimpressive amount. In addition, it is likely that an innovative strategy for exploring the search space would improve over the state of the art only for a given interval of search space sizes; the low-cost/high-reward part of the search space is already easily covered by dictionaries of frequent passwords. When such an attack proves ineffective, an attacker could change target to find an easier prey, or use other means of attack which are not based on password strength, such as social engineering, phishing, or exploitation of vulnerabilities in software or in the protocol: as the effort invested in an unsuccessful attack grows, the attack is more and more likely to be unsuccessful in the future as well.

We focused on the strength of passwords chosen by users in the absence of password strength enforcement. As pointed out in Section 2, it is debatable whether systems enforcing password complexity actually increase security: they may instead lead users to circumvent the enforcement techniques by adopting insecure behavior. To assess this, measuring password complexity with and without enforcement should be coupled with an analysis of user behavior.

Another interesting question yet to be addressed regards the correlation between the strength of passwords and the domain they are related to.
In particular, how will the password strength of a user vary if getting an account compromised would result in a noticeable loss? In [4], some evidence that users actually choose better passwords for accounts related to valuable assets (e.g., PayPal) is reported. Unfortunately, the bit-strength measure adopted is quite simple. Further investigations would be required to obtain actual figures in terms of attacker costs in order to break an account.

8 Acknowledgements

The authors would like to thank Sebastian Porst and Roger Grimes for having shared the set of MySpace passwords, and Cynthia Kuo, Sasha Romanosky, and Lorrie F. Cranor for having shared their mnemonics dictionary.

References

[1] A. Adams and M. A. Sasse. Users are not the enemy. Commun. ACM, 42(12):40-46, December 1999.
[2] M. Bishop. Improving system security via proactive password checking. Computers & Security, 14(3):233-249, 1995.
[3] J. A. Cazier and D. B. Medlin. Password security: An empirical investigation into e-commerce passwords and their crack times. Information Security Journal: A Global Perspective, 15(6):45-55, 2006.
[4] D. Florencio and C. Herley. A large-scale study of web password habits. In WWW '07: Proceedings of the 16th international conference on World Wide Web, pages 657-666, New York, NY, USA, 2007. ACM.
[5] R. A. Grimes. MySpace password exploit: Crunching the numbers (and letters). InfoWorld online article, November 2006.
[6] B. Kaliski. RFC 2898 - PKCS #5: Password-based cryptography specification version 2.0. Technical report, IETF, September 2000.
[7] D. V. Klein. Foiling the cracker: A survey of, and improvements to, password security. In Proceedings of the 2nd USENIX UNIX Security Workshop, 1990.
[8] C. Kuo, S. Romanosky, and L. F. Cranor. Human selection of mnemonic phrase-based passwords. In SOUPS '06: Proceedings of the second symposium on Usable privacy and security, pages 67-78, New York, NY, USA, 2006. ACM Press.
[9] S. Marechal. Advances in password cracking. Journal in Computer Virology, 4(1):73-81, February 2008.
[10] R. Morris and K. Thompson. Password security: a case history. Commun. ACM, 22(11):594-597, November 1979.
[11] A. Narayanan and V. Shmatikov. Fast dictionary attacks on passwords using time-space tradeoff. In CCS '05: Proceedings of the 12th ACM conference on Computer and communications security, pages 364-372, New York, NY, USA, 2005. ACM Press.
[12] P. Oechslin. Making a faster cryptanalytic time-memory trade-off. In Advances in Cryptology - CRYPTO 2003, pages 617-630, 2003.
[13] B. Pinkas and T. Sander. Securing passwords against dictionary attacks. In CCS '02: Proceedings of the 9th ACM conference on Computer and communications security, pages 161-170, New York, NY, USA, 2002. ACM Press.
[14] S. Porst. A brief analysis of 40,000 leaked MySpace passwords. Blog post at http://www.the-interweb.com/serendipity/index.php?/archives/94-A-brief-analysis-of-40,000-leaked-MySpace-passwords.html, November 2007.
[15] S. Riley. Password security: What users know and what they actually do. Usability News, 8(1), February 2006.
[16] B. Schneier. Schneier on security: Choosing secure passwords. Blog post at http://www.schneier.com/blog/archives/2007/01/choosing_secure.html, January 2007.
[17] R. E. Smith. The Strong Password Dilemma, chapter 6. Addison-Wesley, 2002.
[18] E. H. Spafford. Observing reusable password choices. In Proceedings of the 3rd Security Symposium. Usenix, pages 299-312, 1992.
[19] L. von Ahn, M. Blum, and J. Langford. Telling humans and computers apart automatically. Commun. ACM, 47(2):56-60, February 2004.
[20] T. Wu. A real-world analysis of Kerberos password security. In Proceedings of 1999 Network and Distributed System Security Symposium, February 1999.
[21] J. J. Yan. A note on proactive password checking. In NSPW '01: Proceedings of the 2001 workshop on New security paradigms, pages 127-135, New York, NY, USA, 2001. ACM.
[22] G. K. Zipf. Human Behaviour and the Principle of Least Effort: an Introduction to Human Ecology. Addison-Wesley, 1949.